Skip to content

feat: Phase D.9: Numbering style control (restart, custom lists, nested lists)#74

Closed
citconv-agents[bot] wants to merge 3 commits intomasterfrom
agent/issue-22
Closed

feat: Phase D.9: Numbering style control (restart, custom lists, nested lists)#74
citconv-agents[bot] wants to merge 3 commits intomasterfrom
agent/issue-22

Conversation

@citconv-agents
Copy link
Copy Markdown

@citconv-agents citconv-agents bot commented Apr 5, 2026

Summary

Implements #22

This PR was automatically generated by the Developer Agent.

Original Issue

Add comprehensive numbering/list control. Most requested upstream feature (python-openxml#582: 83 comments, #25: 37 comments, python-openxml#122: 27 comments).

API Design

Restart numbering:

  • paragraph.restart_numbering() — restarts the numbered list counter at 1
  • Sets w:numPr > w:numId to a new abstract num with restart override

Custom list styles:

  • document.numbering.add_numbering_definition(levels) — create a custom multi-level list
  • Each level: number format (decimal, lowerAlpha, upperRoman, bullet), text pattern ("%1.", "%1.%2"), indent, font
  • Returns a NumberingDefinition that can be applied to paragraphs

Nested lists:

  • paragraph.list_level — get/set the indent level (0-8)
  • paragraph.list_format — get/set the numbering definition and level
  • Support multi-level lists: applying level 1 under a level 0 creates nesting

Access existing numbering:

  • document.numbering.definitions — iterate all numbering definitions
  • paragraph.numbering_format — read the current list style

XML

  • w:numPr > w:numId (which numbering definition), w:ilvl (indent level)
  • w:numbering > w:abstractNum (definition), w:num (instance with overrides)
  • w:lvlOverride > w:startOverride for restart

Upstream issues: python-openxml#582, #25, python-openxml#122


Generated by Developer Agent using Claude Code

Implement comprehensive numbering/list control addressing upstream
issues python-openxml#582, #25, and python-openxml#122. Adds three-layer support following existing
codebase patterns:

oxml layer:
- CT_AbstractNum for abstract numbering definitions with multi-level support
- CT_Lvl for level format (numFmt, lvlText, start value)
- Extended CT_Numbering with abstractNum management
- CT_NumPr with ilvl_val/numId_val property accessors

Parts layer:
- Extended NumberingPart with proper new() factory and numbering_element

Proxy layer:
- Numbering: document.numbering with definitions list and
  add_numbering_definition() for custom multi-level lists
- NumberingDefinition: num_id, abstract_num_id, levels, restart()
- LevelFormat: level, number_format, text_pattern, start
- ListFormat: paragraph.list_format with level, num_id, apply()
- Paragraph: list_level, numbering_format, restart_numbering()

Closes #22

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@citconv-agents
Copy link
Copy Markdown
Author

citconv-agents bot commented Apr 5, 2026

Security Agent Report

SECURITY_PASS

Security Review — PR #74 (Numbering Style Control)

Reviewed: 2026-04-05
Branch: agent/issue-22
Scope: All changed files vs origin/master

Summary

No security issues found. The PR adds numbering/list formatting support (new CT_AbstractNum, CT_Lvl, Numbering, NumberingDefinition, LevelFormat, ListFormat classes) and is clean across all checked categories.

Files Reviewed

File Finding
src/docx/oxml/numbering.py Clean
src/docx/numbering.py Clean
src/docx/parts/numbering.py Clean
src/docx/text/listformat.py Clean
src/docx/text/paragraph.py Clean
src/docx/parts/document.py Clean
src/docx/document.py Clean
src/docx/oxml/__init__.py Clean
tests/oxml/test_numbering.py Clean
tests/test_numbering.py Clean
tests/text/test_listformat.py Clean

Detailed Findings

Injection Risks — NONE

XPath construction (oxml/numbering.py lines 335, 458):

xpath = './w:lvl[@w:ilvl="%d"]' % ilvl           # integer-only format
xpath = './w:abstractNum[@w:abstractNumId="%d"]' % abstractNumId  # integer-only format

Both use %d with Python int values. No string interpolation; XPath injection is not possible.

XML attribute setting (e.g. CT_Lvl.lvlText_val, CT_Lvl.numFmt_val setters):

lvlText.set(qn("w:val"), value)
numFmt.set(qn("w:val"), value)

Uses lxml's element.set() which automatically escapes <, >, &, " in attribute values. No XML injection possible even with adversarial string input.

Hardcoded XML template (parts/numbering.py NumberingPart.new()): Static byte literal with no user-controlled content.

Dependency Risks — NONE

No new third-party packages introduced. All imports are from the existing docx, lxml, and stdlib.

Data Exposure — NONE

No file I/O operations that accept user-controlled paths. No logging of sensitive data.

Secrets — NONE

No API keys, tokens, passwords, or credentials anywhere in the diff.

Conclusion

The PR is safe to merge from a security standpoint.

@citconv-agents citconv-agents bot added the security-passed Security agent passed label Apr 5, 2026
@citconv-agents
Copy link
Copy Markdown
Author

citconv-agents bot commented Apr 5, 2026

Review Agent

REVIEW_CHANGES_NEEDED

PR #74 — Numbering Style Control (restart, custom lists, nested lists)

Overall this is well-structured work that follows the codebase's three-layer pattern. Tests for the new oxml and proxy classes are thorough. A couple of correctness issues need fixing before merge.


Bug: NumberingPart.new() passes package=None (potential save failure)

src/docx/parts/numbering.py:595

return cls(partname, content_type, element, None)  # None = package

Every other Part in this codebase passes a real Package reference (e.g. FootnotesPart.default(cls, package) per CLAUDE.md). The XmlPart base stores the package for later use (e.g. resolving relationships, saving). Passing None will silently work until something inside NumberingPart needs self.package, at which point it raises AttributeError or NullPointerEquivalent — most likely during Document.save() on a document that triggered lazy creation of the numbering part.

Fix: NumberingPart.new() should accept a package arg and propagate it, and DocumentPart.numbering_part should pass self.package:

# parts/numbering.py
@classmethod
def new(cls, package: Package) -> NumberingPart:
    ...
    return cls(partname, content_type, element, package)

# parts/document.py
numbering_part = NumberingPart.new(self.package)

Bug: False get_or_add_lvl annotation on CT_AbstractNum

src/docx/oxml/numbering.py:311

get_or_add_lvl: Callable[[], CT_Lvl]

ZeroOrMore descriptors generate add_*() and _insert_*() — they do not generate get_or_add_*() (that's only produced by ZeroOrOne). This annotation falsely advertises a method that doesn't exist. Any code that calls abstract_num.get_or_add_lvl() will raise AttributeError at runtime.

Since no current code calls this, remove the annotation:

# delete this line:
get_or_add_lvl: Callable[[], CT_Lvl]

Missing tests for new Paragraph properties

src/docx/text/paragraph.py:709–772

Four new public properties/methods were added to Paragraph:

  • list_format
  • list_level (getter + setter)
  • numbering_format
  • restart_numbering()

None of them have tests in tests/text/test_paragraph.py. The ListFormat class itself is well-tested, but numbering_format has non-trivial traversal logic (lookup through numbering part → abstractNum → lvl) and restart_numbering() modifies the numbering part — both warrant direct unit tests.


Minor: Unreachable code in _next_abstractNumId

src/docx/oxml/numbering.py:480

for n in range(0, len(ids) + 1):
    if n not in ids:
        return n
return len(ids)  # unreachable — loop always finds a gap

range(0, len(ids)+1) produces len(ids)+1 candidates; ids has at most len(ids) elements, so by pigeonhole the loop always returns. The trailing return is dead code. Remove it to avoid confusion.


Minor: Redundant element aliases in proxy classes

src/docx/numbering.py:50–51, 105–106, 144–145

Each proxy class stores both self._element (set by ElementProxy.__init__) and a named alias (self._numbering, self._num, self._lvl) pointing to the same object. Other proxy classes in this codebase (e.g. Footnote, Comment) use self._element directly. The aliases add noise without adding clarity. Use self._element throughout, or if you prefer the named form, don't call super().__init__() separately.


Minor: Unused import under TYPE_CHECKING

src/docx/numbering.py:41

if TYPE_CHECKING:
    from docx.oxml.ns import qn

qn is never referenced in any type annotation in this file. Remove the import.

@citconv-agents citconv-agents bot added the review-changes-needed Review agent requested changes label Apr 5, 2026
- NumberingPart.new() now accepts and propagates package arg (fixes save failure)
- Remove false get_or_add_lvl annotation on CT_AbstractNum (ZeroOrMore doesn't generate it)
- Remove unreachable return in _next_abstractNumId
- Remove redundant element aliases in proxy classes, use self._element
- Remove unused TYPE_CHECKING import of qn in numbering.py
- Add tests for Paragraph.list_format, list_level, numbering_format, restart_numbering()

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@citconv-agents citconv-agents bot removed security-passed Security agent passed review-changes-needed Review agent requested changes labels Apr 5, 2026
@citconv-agents
Copy link
Copy Markdown
Author

citconv-agents bot commented Apr 5, 2026

Security Agent Report

SECURITY_PASS

Security Review: PR #74 — Numbering Style Control

Summary

Reviewed all changed files for injection risks, dependency risks, data exposure, and secrets. No security issues found.

Files Reviewed

  • src/docx/document.py — added numbering property
  • src/docx/numbering.py — new proxy objects (Numbering, NumberingDefinition, LevelFormat)
  • src/docx/oxml/__init__.py — new element class registrations
  • src/docx/oxml/numbering.py — new CT_AbstractNum, CT_Lvl, extended CT_Num, CT_NumPr, CT_Numbering
  • src/docx/oxml/text/parfmt.py — type annotation updates
  • src/docx/parts/document.py — added numbering property
  • src/docx/parts/numbering.py — implemented NumberingPart.new()
  • src/docx/text/listformat.py — new ListFormat proxy
  • src/docx/text/paragraph.py — added list-related properties/methods
  • Test files — unit tests for new functionality

Findings

XML Injection / XPath Injection

Not a risk. Two XPath expressions use dynamic values:

  • src/docx/oxml/numbering.py:324: './w:lvl[@w:ilvl="%d"]' % ilvl
  • src/docx/oxml/numbering.py:447: './w:abstractNum[@w:abstractNumId="%d"]' % abstractNumId

Both use %d format specifier, which enforces integer conversion. Non-integer input raises a TypeError before the XPath is evaluated — no injection possible.

XML Attribute Value Injection

Not a risk. User-supplied strings (format, text from level specs) are set via lxml's element.set(qn("w:val"), value) method (e.g., src/docx/oxml/numbering.py:277, src/docx/oxml/numbering.py:293). lxml properly escapes XML special characters in attribute values, preventing XML injection.

XXE (XML External Entity)

Not a risk. The hardcoded XML template in src/docx/parts/numbering.py:576-591 is a static byte string with no entity declarations or external references. The parse_xml() function used throughout this codebase already has XXE protections in place (consistent with the rest of the library).

New Dependencies

None. No new packages added to requirements or setup files.

Secrets / Credentials

None found. No API keys, tokens, passwords, or credentials anywhere in the diff.

Path Traversal

Not applicable. No file path operations in the changed code.

Data Exposure

No issue. The numbering_element property exposing the internal CT_Numbering element is consistent with existing patterns throughout the codebase (e.g., footnotes, comments) and is an intentional part of the API design.

Conclusion

All changes are limited to OOXML numbering definition management within a document processing library. The code follows established safe patterns. No security concerns identified.

@citconv-agents citconv-agents bot added the security-passed Security agent passed label Apr 5, 2026
@citconv-agents
Copy link
Copy Markdown
Author

citconv-agents bot commented Apr 5, 2026

Review Agent

REVIEW_CHANGES_NEEDED

PR #74 — Add numbering style control (restart, custom lists, nested lists)

Overall the implementation is solid: the three-layer architecture is respected, proxy objects are clean, and test coverage is good. There are a few bugs that need fixing before merge.


Bug 1 (High): Off-by-one in CT_Lvl successors — src/docx/oxml/numbering.py:45-47

_tag_seq indices:

9: "w:lvlJc"
10: "w:pPr"
11: "w:rPr"

The pattern in this codebase (and in CT_PPr) is that an element at index N uses successors=_tag_seq[N+1:]. Lines 45-47 break this pattern — each element includes itself in its own successors tuple:

lvlJc = ZeroOrOne("w:lvlJc", successors=_tag_seq[9:])   # 9 should be 10
pPr   = ZeroOrOne("w:pPr",   successors=_tag_seq[10:])  # 10 should be 11
rPr   = ZeroOrOne("w:rPr",   successors=_tag_seq[11:])  # 11 should be ()

This will cause wrong XML element ordering when pPr or rPr is inserted inside a <w:lvl> via get_or_add_*. Fix:

lvlJc = ZeroOrOne("w:lvlJc", successors=_tag_seq[10:])
pPr   = ZeroOrOne("w:pPr",   successors=_tag_seq[11:])
rPr   = ZeroOrOne("w:rPr",   successors=())

Bug 2 (Medium): restart_numbering() and NumberingDefinition.restart() hardcode ilvl=0src/docx/text/paragraph.py:765 and src/docx/numbering.py:129

Both methods always create the <w:lvlOverride> with ilvl=0. For a paragraph at list level 2, this creates a restart override on level 0, not level 2 — leaving the paragraph's actual level unaffected.

In paragraph.restart_numbering():

# current code
lvl_override = new_num.add_lvlOverride(ilvl=0)

# fix — use the paragraph's current level
current_level = list_fmt.level or 0
lvl_override = new_num.add_lvlOverride(ilvl=current_level)

In NumberingDefinition.restart(), the method doesn't have access to a target level. Consider either:

  • Adding a level: int = 0 parameter so callers can specify which level to restart, or
  • Documenting that restart() always restarts level 0 (which is the most common case)

At minimum, the docstring should mention the ilvl=0 limitation.


Convention Violation: XML template should be a file, not an inline bytes literal — src/docx/parts/numbering.py:576-591

CLAUDE.md explicitly states: "XML templates go in src/docx/templates/"

The existing parts (FootnotesPart, CommentsPart, etc.) all load their XML from src/docx/templates/default-*.xml files. The 14-line inline bytes literal in NumberingPart.new() should be extracted to src/docx/templates/default-numbering.xml and loaded the same way other parts do it (e.g. via pkg_resources or importlib.resources).


Minor: Missing blank lines before class definitions in src/docx/numbering.py:40

from docx.shared import ElementProxy

class Numbering(ElementProxy):   # ← missing two blank lines before class

All other modules in the codebase use two blank lines before top-level class definitions (PEP 8). Lines 40, 96, and 134 each need a blank line added above them.


Not a bug, but worth noting: DocumentPart.numbering creates a new proxy on each access

src/docx/parts/document.py:120-125 — the numbering property constructs a new Numbering wrapper on every call, unlike numbering_part which uses @lazyproperty. This is currently harmless because Numbering holds no mutable state of its own, but it's inconsistent with how other parts (e.g. footnotes_part) are cached. Consider @lazyproperty if performance becomes a concern.

@citconv-agents citconv-agents bot added the review-changes-needed Review agent requested changes label Apr 5, 2026
- Fix off-by-one in CT_Lvl successors (lvlJc, pPr, rPr included
  themselves in their own successors tuples)
- Fix restart_numbering() to use paragraph's current level instead of
  hardcoding ilvl=0
- Add level parameter to NumberingDefinition.restart() with default 0
- Extract inline XML bytes literal to default-numbering.xml template
- Add missing PEP 8 blank line before Numbering class

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@citconv-agents citconv-agents bot removed security-passed Security agent passed review-changes-needed Review agent requested changes labels Apr 5, 2026
@citconv-agents
Copy link
Copy Markdown
Author

citconv-agents bot commented Apr 5, 2026

Security Agent Report

SECURITY_PASS

Security Review — PR #74 (Numbering Style Control)

Reviewed: 2026-04-05
Branch: agent/issue-22 vs origin/master
Scope: All changed files


Files Reviewed

File Change
src/docx/document.py Added numbering property
src/docx/numbering.py New proxy classes (Numbering, NumberingDefinition, LevelFormat)
src/docx/oxml/__init__.py Registered new element classes
src/docx/oxml/numbering.py New CT_AbstractNum, CT_Lvl; extended CT_Num, CT_Numbering, CT_NumPr
src/docx/oxml/text/parfmt.py Added type annotations for numPr
src/docx/parts/document.py Added numbering property; fixed NumberingPart.new() call
src/docx/parts/numbering.py Implemented new() and _default_numbering_xml()
src/docx/templates/default-numbering.xml New static XML template
src/docx/text/listformat.py New ListFormat proxy
src/docx/text/paragraph.py Added list-format properties and restart_numbering()
tests/ New and updated unit tests

Findings

XPath Construction — CLEAN

Locations:

  • src/docx/oxml/numbering.py:327'./w:lvl[@w:ilvl="%d"]' % ilvl
  • src/docx/oxml/numbering.py:451'./w:abstractNum[@w:abstractNumId="%d"]' % abstractNumId

Both XPath expressions use %d integer formatting. Integer values cannot contain XPath metacharacters, so there is no XPath injection risk. Types are enforced via ST_DecimalNumber attribute validators upstream.

XML Attribute Values from User Input — CLEAN

Locations:

  • CT_Lvl.numFmt_val setter (oxml/numbering.py:275–280) — calls numFmt.set(qn("w:val"), value)
  • CT_Lvl.lvlText_val setter (oxml/numbering.py:291–296) — calls lvlText.set(qn("w:val"), value)

Values are set via lxml's element.set(), which automatically XML-escapes attribute values. No raw string interpolation into XML text. No injection risk.

File Path Construction — CLEAN

Location: src/docx/parts/numbering.py:586–589

path = os.path.join(os.path.split(__file__)[0], "..", "templates", "default-numbering.xml")

Path is derived entirely from __file__ (module location), not from any user-supplied input. The .. component traverses from parts/ to docx/ — a fixed, deterministic path to a bundled template. No path traversal risk.

Static XML Template — CLEAN

Location: src/docx/templates/default-numbering.xml

No <!DOCTYPE> declaration and no external entity references. Not susceptible to XXE. Loaded once as a bundled resource via parse_xml(), which is an existing library function with no user-controlled content.

New Dependencies — NONE

No new packages added to pyproject.toml or any requirements file. No dependency risk.

Secrets in Code — NONE

No API keys, tokens, passwords, or credentials found in any changed file.

Data Exposure — NONE

No sensitive data is read, logged, or returned in an unsafe manner. All accessed data is OOXML numbering metadata from the document being processed.


Summary

No security issues found. The PR follows safe patterns throughout:

  • Integer-constrained XPath queries prevent injection
  • lxml attribute setters handle XML escaping
  • File paths are package-relative and not user-controlled
  • No new dependencies introduced
  • No secrets or sensitive data handling

@citconv-agents citconv-agents bot added the security-passed Security agent passed label Apr 5, 2026
@citconv-agents
Copy link
Copy Markdown
Author

citconv-agents bot commented Apr 5, 2026

Review Agent

REVIEW_CHANGES_NEEDED

PR #74 — Numbering Style Control (restart, custom lists, nested lists)

The PR is well-structured and introduces useful functionality with good test coverage. Three
issues need to be addressed before merging.


Issue 1 — Layering violation: proxy layer accesses private oxml internals

File: src/docx/numbering.py:75

abstract_num_id = numbering._next_abstractNumId
abstract_num = CT_AbstractNum.new(abstract_num_id)

Numbering.add_numbering_definition() reaches into a private CT_Numbering property to
allocate an abstractNumId, then passes the pre-allocated ID into CT_AbstractNum.new().
This exposes an implementation detail of the oxml layer to the proxy layer, and is
inconsistent with how add_num() works — add_num handles numId allocation entirely
internally.

The fix is to add a factory method to CT_Numbering that creates and inserts a new
abstractNum with the next available ID, analogous to add_num:

# in CT_Numbering
def new_abstractNum(self) -> CT_AbstractNum:
    """Create, insert, and return a new <w:abstractNum> with next available ID."""
    abstract_num_id = self._next_abstractNumId
    abstract_num = cast(
        CT_AbstractNum,
        OxmlElement("w:abstractNum", attrs={qn("w:abstractNumId"): str(abstract_num_id)}),
    )
    return self._insert_abstractNum(abstract_num)

Then Numbering.add_numbering_definition() becomes:

abstract_num = numbering.new_abstractNum()
abstract_num_id = abstract_num.abstractNumId
# ... add levels ...
num = numbering.add_num(abstract_num_id)

add_abstractNum(element) and _next_abstractNumId can then remain internal to the oxml
layer (or be removed entirely).


Issue 2 — Code duplication: restart logic repeated in two places

Files: src/docx/text/paragraph.py:768–789 and src/docx/numbering.py:122–134

Paragraph.restart_numbering() duplicates the oxml operations in NumberingDefinition.restart() almost exactly. The proxy layer already has the right abstraction — Paragraph.restart_numbering() should delegate to it:

def restart_numbering(self) -> None:
    list_fmt = self.list_format
    num_id = list_fmt.num_id
    if num_id is None or num_id == 0:
        return
    numbering = self.part.numbering
    num_def = next((d for d in numbering.definitions if d.num_id == num_id), None)
    if num_def is None:
        return
    current_level = list_fmt.level or 0
    new_defn = num_def.restart(level=current_level)
    list_fmt.num_id = new_defn.num_id

This keeps the oxml mutation logic in one place and makes Paragraph.restart_numbering()
a thin orchestration layer.


Issue 3 — _next_abstractNumId return type annotation vs. implicit None return

File: src/docx/oxml/numbering.py:282–289

@property
def _next_abstractNumId(self) -> int:
    id_strs = self.xpath("./w:abstractNum/@w:abstractNumId")
    ids = [int(id_str) for id_str in id_strs]
    for n in range(0, len(ids) + 1):
        if n not in ids:
            return n
    # implicit return None — type annotation says int

The loop is guaranteed by the pigeonhole principle to always return n, but Python type
checkers (and pyright) will flag the missing return. The sibling method _next_numId uses
break + a post-loop return num, which is the established pattern here. Use the same
pattern:

@property
def _next_abstractNumId(self) -> int:
    id_strs = self.xpath("./w:abstractNum/@w:abstractNumId")
    ids = [int(id_str) for id_str in id_strs]
    for n in range(0, len(ids) + 1):
        if n not in ids:
            break
    return n

Minor notes (no change required)

  • from typing import List throughout new files: since from __future__ import annotations
    is present, list[X] is usable everywhere and is the more modern style. Fine to leave as-is
    if the project wants to remain compatible with older type checkers.
  • The default-numbering.xml template is missing a trailing newline (cosmetic only).

@citconv-agents citconv-agents bot added the review-changes-needed Review agent requested changes label Apr 5, 2026
@loadfix
Copy link
Copy Markdown
Owner

loadfix commented Apr 5, 2026

Closing to re-run through the fixed pipeline (Product → Developer → Security → Review → Merge).

@loadfix loadfix closed this Apr 5, 2026
@loadfix loadfix deleted the agent/issue-22 branch April 5, 2026 07:03
@citconv-agents
Copy link
Copy Markdown
Author

citconv-agents bot commented Apr 5, 2026

Product Agent — APPROVED

APPROVED

The issue is well-defined with a concrete API design, clear XML mapping, and strong upstream demand (multiple high-comment issues). The scope fits squarely within python-docx's purpose of OOXML manipulation. The previous review cycles have already surfaced specific bugs (package=None, off-by-one successors, layering violation) that the developer will need to address, but those are implementation-level concerns, not gaps in the issue specification itself.

Handing off to the Developer Agent.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agent-pr PR created by an agent review-changes-needed Review agent requested changes security-passed Security agent passed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant