feat: Phase D.9: Numbering style control (restart, custom lists, nested lists) by citconv-agents[bot] · Pull Request #108 · loadfix/python-docx

citconv-agents · 2026-04-05T09:07:09Z

Summary

Implements #22

This PR was automatically generated by the Developer Agent.

Original Issue

Add comprehensive numbering/list control. Most requested upstream feature (python-openxml#582: 83 comments, #25: 37 comments, python-openxml#122: 27 comments).

API Design

Restart numbering:

paragraph.restart_numbering() — restarts the numbered list counter at 1
Sets w:numPr > w:numId to a new abstract num with restart override

Custom list styles:

document.numbering.add_numbering_definition(levels) — create a custom multi-level list
Each level: number format (decimal, lowerAlpha, upperRoman, bullet), text pattern ("%1.", "%1.%2"), indent, font
Returns a NumberingDefinition that can be applied to paragraphs

Nested lists:

paragraph.list_level — get/set the indent level (0-8)
paragraph.list_format — get/set the numbering definition and level
Support multi-level lists: applying level 1 under a level 0 creates nesting

Access existing numbering:

document.numbering.definitions — iterate all numbering definitions
paragraph.numbering_format — read the current list style

XML

w:numPr > w:numId (which numbering definition), w:ilvl (indent level)
w:numbering > w:abstractNum (definition), w:num (instance with overrides)
w:lvlOverride > w:startOverride for restart

Upstream issues: python-openxml#582, #25, python-openxml#122

Generated by Developer Agent using Claude Code

…s) (#22) Add comprehensive numbering/list control addressing the most requested upstream features (python-openxml#582, #25, python-openxml#122). New API: - document.numbering — access numbering definitions - document.numbering.add_numbering_definition(levels) — create custom multi-level lists with format, text pattern, indent, and font options - document.numbering.definitions — iterate all numbering definitions - paragraph.list_level — get/set indent level (0-8) - paragraph.list_format — get/set numbering definition ID and level - paragraph.numbering_format — read current list format string - paragraph.restart_numbering() — restart counter at 1 via lvlOverride Implementation: - CT_AbstractNum, CT_Lvl oxml element classes for abstract numbering - CT_NumPr ilvl_val/numId_val properties - CT_PPr numPr_ilvl_val/numPr_numId_val convenience properties - NumberingPart.new() with default-numbering.xml template - Numbering, NumberingDefinition, LevelFormat proxy classes - _ListFormat helper on Paragraph Closes #22 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

citconv-agents · 2026-04-05T09:09:25Z

Security Agent Report

SECURITY_PASS

Security Review: PR #108 — Numbering Style Control

Summary

All changed files were reviewed. No security issues found.

Files Reviewed

src/docx/document.py — added numbering property (delegation only)
src/docx/numbering.py — new proxy classes (Numbering, NumberingDefinition, LevelFormat)
src/docx/oxml/numbering.py — new CT_AbstractNum, CT_Lvl element classes; extended CT_Numbering, CT_NumPr
src/docx/oxml/text/parfmt.py — added numPr-related convenience properties to CT_PPr
src/docx/parts/document.py — added numbering property
src/docx/parts/numbering.py — implemented NumberingPart.new() and _default_numbering_xml()
src/docx/templates/default-numbering.xml — new static XML template
tests/oxml/test_numbering.py — unit tests (test files only)
tests/test_numbering.py — unit tests (test files only)
tests/test_numbering_integration.py — unit tests (test files only)

Checks

XML Injection / XXE

User-supplied values (number_format, text, font, indent) are assigned as lxml element attribute values via element.attrib[key] = value. lxml escapes special characters in attribute values, preventing XML injection.
parse_xml() uses an etree.XMLParser configured with resolve_entities=False (line 19 of src/docx/oxml/parser.py), preventing XXE attacks.
The new default-numbering.xml template contains no DOCTYPE declaration or external entity references.

Path Traversal

_default_numbering_xml() constructs its file path using os.path.split(__file__)[0] — derived entirely from the module's own install location, not from user input. No traversal risk.

New Dependencies

No new external packages introduced.

Secrets in Code

No API keys, tokens, passwords, or credentials present.

Data Exposure / Unsafe File Handling

The only file read is the static bundled template at a fixed relative path inside the package. No user-controlled paths are opened.

Result

CLEAN — no security issues identified.

citconv-agents · 2026-04-05T09:12:57Z

Review Agent

REVIEW_CHANGES_NEEDED

PR #108 — Numbering Style Control Review

Overall the implementation is solid: architecture follows project patterns correctly, tests are thorough and well-structured, and the three-layer design (oxml → parts → proxy) is properly observed. A few fixable issues below.

1. Unused imports in `src/docx/numbering.py` (lines 5, 7)

from typing import TYPE_CHECKING, Iterator, List   # Iterator never used
from docx.shared import Twips                       # Twips never used

Iterator and Twips are imported but never referenced anywhere in the file. Remove both.

2. Repeated deferred imports inside a loop in `src/docx/numbering.py` (lines 68–86)

for i, level_spec in enumerate(levels):
    ...
    if indent is not None:
        from docx.oxml.parser import OxmlElement   # ← inside loop
        from docx.oxml.ns import qn                # ← inside loop
        ...
    if font is not None:
        from docx.oxml.parser import OxmlElement   # ← again, same loop
        from docx.oxml.ns import qn                # ← again, same loop

The same two imports appear inside two separate if branches within the same for loop. Move them to module-level (alongside the other if TYPE_CHECKING block or as unconditional imports), or at minimum hoist them to the top of the add_numbering_definition method. Running repeated import statements inside a loop is unnecessary overhead and an unusual pattern in this codebase.

3. Asymmetric orphan-`<w:numPr>` cleanup in `CT_PPr` — `src/docx/oxml/text/parfmt.py` (lines 339–346 vs. 356–366)

The numPr_numId_val setter correctly removes the parent <w:numPr> when both <w:ilvl> and <w:numId> are gone:

# numPr_numId_val setter (lines 360–364)
numPr.numId_val = None
if numPr.ilvl is None and numPr.numId is None:
    self._remove_numPr()

But the numPr_ilvl_val setter (lines 341–345) only clears <w:ilvl> and returns — it never checks whether <w:numPr> is now an empty shell:

# numPr_ilvl_val setter — missing cleanup
numPr = self.numPr
if numPr is not None:
    numPr.ilvl_val = None   # leaves <w:numPr/> if numId is also None
return

A sequence like pPr.numPr_numId_val = 5; pPr.numPr_ilvl_val = 2; pPr.numPr_ilvl_val = None; pPr.numPr_numId_val = None correctly cleans up. But pPr.numPr_ilvl_val = 2; pPr.numPr_ilvl_val = None leaves an empty <w:numPr> in the XML, which Word treats as a no-op but is technically invalid per schema. Mirror the same cleanup in the numPr_ilvl_val setter:

@numPr_ilvl_val.setter
def numPr_ilvl_val(self, value: int | None):
    if value is None:
        numPr = self.numPr
        if numPr is not None:
            numPr.ilvl_val = None
            # -- remove numPr if both children gone --
            if numPr.ilvl is None and numPr.numId is None:
                self._remove_numPr()
        return
    self.get_or_add_numPr().ilvl_val = value

Minor notes (no change required)

CT_AbstractNum.add_lvl uses self.append() rather than _insert_lvl() — correct because lvl has successors=(), so insertion is equivalent to append.
DocumentPart.numbering_part is correctly a lazyproperty with lazy creation; DocumentPart.numbering returning a fresh Numbering wrapper each call is consistent with how comments and footnotes work.
_next_abstractNumId mirrors the O(n²) pattern of the existing _next_numId — acceptable for typical document sizes.

- Remove unused imports (Iterator, Twips) from numbering.py - Hoist repeated in-loop imports (OxmlElement, qn) to module level - Add orphan numPr cleanup to numPr_ilvl_val setter, mirroring numPr_numId_val Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

citconv-agents · 2026-04-05T09:17:17Z

Security Agent Report

SECURITY_PASS

Security Review — PR #108

Branch: agent/issue-22
Reviewed: 2026-04-05
Scope: Numbering style control (restart, custom lists, nested lists)

Files Reviewed

File	Status
`src/docx/document.py`	CLEAN
`src/docx/numbering.py`	CLEAN
`src/docx/oxml/__init__.py`	CLEAN
`src/docx/oxml/numbering.py`	CLEAN
`src/docx/oxml/text/parfmt.py`	CLEAN
`src/docx/parts/document.py`	CLEAN
`src/docx/parts/numbering.py`	CLEAN
`src/docx/templates/default-numbering.xml`	CLEAN
`src/docx/text/paragraph.py`	CLEAN
`tests/oxml/test_numbering.py`	CLEAN
`tests/test_numbering.py`	CLEAN
`tests/test_numbering_integration.py`	CLEAN

Findings

XML Injection / XXE

No issues found.

All user-supplied values (number_format, text, start, indent, font) in add_numbering_definition() are set as XML attribute values via lxml's attribute dict (attrib[qn(...)] = value). lxml automatically escapes special characters in attribute values, preventing XML injection.
The new template default-numbering.xml contains no DOCTYPE declarations, no external entity references, and no processing instructions that could be leveraged for XXE.
The XPath expression in CT_Numbering._next_abstractNumId ("./w:abstractNum/@w:abstractNumId") is a hardcoded string — no user input flows into XPath queries.

Path Traversal

No issues found.

The file read in NumberingPart._default_numbering_xml() constructs its path entirely from __file__ and hardcoded path components:

path = os.path.join(
    os.path.split(__file__)[0], "..", "templates", "default-numbering.xml"
)

No user input influences this path.

New Dependencies

No new dependencies introduced. The only new stdlib import is os (used in parts/numbering.py), which is safe.

Secrets / Credentials

None found. No API keys, tokens, passwords, or other sensitive data in any changed file.

Data Exposure / Unsafe File Handling

No issues found. The template file is opened in binary read mode ("rb") and only used internally to seed a new part's XML. No user-controlled paths or filenames are involved.

Notes (Non-Security)

add_numbering_definition() accepts arbitrary string values for number_format and text without validating against the OOXML enumeration. This is a data-integrity / robustness concern, not a security issue — invalid values would produce a malformed but non-exploitable .docx file. Validation against an allowlist would improve robustness but is out of scope for security review.
indent values are taken as integers; str(indent) is used for the attribute — safe because type coercion occurs before XML insertion.

Conclusion

No security issues identified. The changes follow established patterns in the codebase, rely on lxml for safe XML serialization, introduce no new third-party dependencies, and handle all file I/O using hardcoded internal paths.

citconv-agents · 2026-04-05T10:27:32Z

Review Agent

REVIEW_CHANGES_NEEDED

Summary

The PR adds solid numbering/list control infrastructure with a clean three-layer design and good test coverage. A few issues need addressing before merging.

Issues

1. Dead code: `Numbering._numbering_part` is never used

File: src/docx/numbering.py:21-23

def __init__(self, numbering_elm: CT_Numbering, numbering_part: NumberingPart):
    self._numbering_elm = numbering_elm
    self._numbering_part = numbering_part  # <-- never read

self._numbering_part is stored but no method or property in Numbering ever accesses it. Neither definitions nor add_numbering_definition uses it. The parameter and attribute should be removed, or a clear use case that requires the part reference should be added.

2. Unused import `Callable`

File: src/docx/oxml/numbering.py:5

from typing import Callable, List, cast

Callable is imported but never referenced in the file. Remove it.

3. `Paragraph.list_level` duplicates `Paragraph.list_format.level`

File: src/docx/text/paragraph.py:610-625 and 706-724

Both paragraph.list_level (direct property on Paragraph) and paragraph.list_format.level (via _ListFormat) expose the exact same numPr_ilvl_val getter/setter. This creates two public APIs for the same data, which is confusing and creates a maintenance burden. Either:

Remove list_level and have users go through list_format.level, or
Remove _ListFormat.level and use list_level for that data

The _ListFormat class still has value via .num_id, which isn't exposed directly on Paragraph, so the class itself is useful. But the duplication of level should be resolved.

4. `DocumentPart.numbering` creates a new proxy on every access

File: src/docx/parts/document.py:56-61

@property
def numbering(self) -> Numbering:
    from docx.numbering import Numbering as NumberingCls
    numbering_part = self.numbering_part
    return NumberingCls(numbering_part.numbering_element, numbering_part)

The existing pattern in this codebase (see comments, footnotes) is to delegate to a part-level property that caches the proxy. Every call to document.numbering currently constructs a fresh Numbering object. For the simple proxy it is today this is not a bug, but it's inconsistent with the existing pattern and will be surprising to users who do:

n1 = doc.numbering
n2 = doc.numbering
assert n1 is n2  # fails

Consider making NumberingPart own a numbering property (cached via lazyproperty) that returns the Numbering proxy, then delegate from DocumentPart.numbering to self.numbering_part.numbering, following the exact same pattern as footnotes and comments.

Positive Notes

Three-layer architecture (proxy / part / oxml) is correctly followed throughout.
CT_Lvl, CT_AbstractNum element declarations have correct successors ordering.
_next_abstractNumId algorithm is correct (pigeonhole principle guarantees termination; the return 0 fallback comment is accurate).
restart_numbering() correctly creates a new <w:num> with a <w:lvlOverride>/<w:startOverride> rather than mutating the existing definition.
XPath namespace handling in _next_abstractNumId follows the same pattern as the existing _next_numId.
Test coverage is thorough — unit tests at oxml, proxy, and integration layers.

citconv-agents bot added the agent-pr PR created by an agent label Apr 5, 2026

citconv-agents bot mentioned this pull request Apr 5, 2026

Phase D.9: Numbering style control (restart, custom lists, nested lists) #22

Open

citconv-agents bot added the security-passed Security agent passed label Apr 5, 2026

citconv-agents bot added the review-changes-needed Review agent requested changes label Apr 5, 2026

citconv-agents bot removed security-passed Security agent passed review-changes-needed Review agent requested changes labels Apr 5, 2026

citconv-agents bot added the security-passed Security agent passed label Apr 5, 2026

citconv-agents bot added the review-changes-needed Review agent requested changes label Apr 5, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: Phase D.9: Numbering style control (restart, custom lists, nested lists)#108

feat: Phase D.9: Numbering style control (restart, custom lists, nested lists)#108
citconv-agents[bot] wants to merge 2 commits intomasterfrom
agent/issue-22

citconv-agents bot commented Apr 5, 2026

Uh oh!

citconv-agents bot commented Apr 5, 2026

Uh oh!

citconv-agents bot commented Apr 5, 2026

Uh oh!

citconv-agents bot commented Apr 5, 2026

Uh oh!

citconv-agents bot commented Apr 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

citconv-agents bot commented Apr 5, 2026

Summary

Original Issue

API Design

XML

Upstream issues: python-openxml#582, #25, python-openxml#122

Uh oh!

citconv-agents bot commented Apr 5, 2026

Security Review: PR #108 — Numbering Style Control

Summary

Files Reviewed

Checks

XML Injection / XXE

Path Traversal

New Dependencies

Secrets in Code

Data Exposure / Unsafe File Handling

Result

Uh oh!

citconv-agents bot commented Apr 5, 2026

PR #108 — Numbering Style Control Review

1. Unused imports in src/docx/numbering.py (lines 5, 7)

2. Repeated deferred imports inside a loop in src/docx/numbering.py (lines 68–86)

3. Asymmetric orphan-<w:numPr> cleanup in CT_PPr — src/docx/oxml/text/parfmt.py (lines 339–346 vs. 356–366)

Minor notes (no change required)

Uh oh!

citconv-agents bot commented Apr 5, 2026

Security Review — PR #108

Files Reviewed

Findings

XML Injection / XXE

Path Traversal

New Dependencies

Secrets / Credentials

Data Exposure / Unsafe File Handling

Notes (Non-Security)

Conclusion

Uh oh!

citconv-agents bot commented Apr 5, 2026

Summary

Issues

1. Dead code: Numbering._numbering_part is never used

2. Unused import Callable

3. Paragraph.list_level duplicates Paragraph.list_format.level

4. DocumentPart.numbering creates a new proxy on every access

Positive Notes

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

1. Unused imports in `src/docx/numbering.py` (lines 5, 7)

2. Repeated deferred imports inside a loop in `src/docx/numbering.py` (lines 68–86)

3. Asymmetric orphan-`<w:numPr>` cleanup in `CT_PPr` — `src/docx/oxml/text/parfmt.py` (lines 339–346 vs. 356–366)

1. Dead code: `Numbering._numbering_part` is never used

2. Unused import `Callable`

3. `Paragraph.list_level` duplicates `Paragraph.list_format.level`

4. `DocumentPart.numbering` creates a new proxy on every access