Skip to content

feat: Phase D.13: Insert paragraph/table at arbitrary position#99

Open
citconv-agents[bot] wants to merge 2 commits intomasterfrom
agent/issue-26
Open

feat: Phase D.13: Insert paragraph/table at arbitrary position#99
citconv-agents[bot] wants to merge 2 commits intomasterfrom
agent/issue-26

Conversation

@citconv-agents
Copy link
Copy Markdown

@citconv-agents citconv-agents bot commented Apr 5, 2026

Summary

Implements #26

This PR was automatically generated by the Developer Agent.

Original Issue

Add ability to insert paragraphs and tables at specific positions. Upstream python-openxml#156 (23 comments).

API Design

  • paragraph.insert_paragraph_after(text, style) — insert a new paragraph after this one
  • paragraph.insert_paragraph_before() already exists — verify it works correctly
  • paragraph.insert_table_after(rows, cols) — insert a table after this paragraph
  • table.insert_paragraph_after(text, style) — insert after a table
  • body.insert_element_before(new_element, reference_element) — low-level

Implementation

Uses lxml's addnext()/addprevious() on the underlying elements. The high-level methods create the element and position it.

Upstream issue: python-openxml#156


Generated by Developer Agent using Claude Code

Add ability to insert paragraphs and tables at specific positions:
- Paragraph.insert_paragraph_after(text, style)
- Paragraph.insert_table_after(rows, cols, width)
- Table.insert_paragraph_after(text, style)
- CT_Body.insert_before(new_element, reference_element)
- _Body.insert_element_before(new_element, reference_element)

Uses lxml addnext()/addprevious() on underlying elements. Verified
existing insert_paragraph_before() works correctly.

Closes #26

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@citconv-agents
Copy link
Copy Markdown
Author

citconv-agents bot commented Apr 5, 2026

Security Agent Report

SECURITY_PASS

Security Review Report — PR #99

Branch: agent/issue-26
Date: 2026-04-05
Reviewer: Security Agent

Summary

No security issues found. All changes are internal XML tree manipulation using trusted lxml operations.

Files Reviewed

File Change Type
src/docx/document.py Added insert_element_before() to _Body
src/docx/oxml/document.py Added insert_before() to CT_Body
src/docx/oxml/text/paragraph.py Added add_p_after() to CT_P
src/docx/table.py Added insert_paragraph_after() to Table
src/docx/text/paragraph.py Added insert_paragraph_after() and insert_table_after() to Paragraph
tests/oxml/test_document.py Tests for CT_Body.insert_before()
tests/test_table.py Tests for Table.insert_paragraph_after()
tests/text/test_paragraph.py Tests for new Paragraph insertion methods

Analysis

XML/XXE Injection

CLEAN. New elements are created via OxmlElement("w:p") (hardcoded tag names) and CT_Tbl.new_tbl(), then inserted using lxml's addprevious() / addnext() methods. No raw XML strings from user input are parsed. No fromstring() or parse_xml() calls on untrusted data.

User Text Handling

CLEAN. The optional text parameter in insert_paragraph_after() is passed through paragraph.add_run(text)run.text = textself._r.text = text, which assigns to lxml element's .text property. lxml automatically escapes special characters (<, >, &, etc.) when setting text content, preventing XML injection.

Path Traversal

CLEAN. No file system operations in any changed code.

New Dependencies

CLEAN. No new external dependencies introduced.

Secrets / Sensitive Data

CLEAN. No API keys, tokens, passwords, or credentials present.

Element Insertion Safety

CLEAN. insert_element_before() accepts BaseOxmlElement typed parameters — already-parsed lxml element objects, not raw strings. The caller is responsible for constructing valid elements through the existing API, which is consistent with the existing codebase pattern.

Conclusion

The PR adds insert_paragraph_after(), insert_table_after(), and related low-level helpers for inserting block-level elements at arbitrary positions in a document body. All operations use lxml's safe tree-manipulation APIs. No security concerns identified.

@citconv-agents citconv-agents bot added the security-passed Security agent passed label Apr 5, 2026
@citconv-agents
Copy link
Copy Markdown
Author

citconv-agents bot commented Apr 5, 2026

Review Agent

REVIEW_CHANGES_NEEDED

Summary

The new insert_paragraph_after, insert_table_after (on Paragraph), and insert_paragraph_after (on Table) APIs are useful additions with good test coverage. However, two real issues need addressing before merge.


Issue 1: Dead code — _Body.insert_element_before is never called

File: src/docx/document.py:291–301

_Body.insert_element_before was added but is never called anywhere in the codebase. The new insertion methods (Paragraph.insert_table_after, Table.insert_paragraph_after) bypass _Body entirely and call addnext directly on XML elements in the proxy layer. This means the CT_Body.insert_before oxml method is also only reachable from tests — nothing in the live code path exercises it.

Either:

  • Wire up _Body.insert_element_before into a user-facing API (e.g., expose it on Document for insertions at arbitrary positions), and add a proxy-level test, or
  • Remove the dead _Body.insert_element_before method and the corresponding CT_Body.insert_before if there is no caller.

Issue 2: Architecture inconsistency — proxy layer bypasses the oxml layer

Files: src/docx/text/paragraph.py:261–263, src/docx/table.py:63–65

The codebase architecture requires that XML manipulation happen in CT_* classes (oxml layer), with proxy classes only delegating. This PR is inconsistent in following that pattern:

Correct (Paragraph.insert_paragraph_after, line 230):

p = self._p.add_p_after()   # delegates to oxml layer ✓

Incorrect (Paragraph.insert_table_after, lines 261–263):

tbl = CT_Tbl.new_tbl(rows, cols, width)
self._p.addnext(tbl)         # raw XML manipulation in proxy layer ✗

Incorrect (Table.insert_paragraph_after, lines 63–65):

new_p = OxmlElement("w:p")
self._tbl.addnext(new_p)     # raw XML manipulation in proxy layer ✗

The fix is to add the missing oxml-layer methods and delegate to them:

  1. Add CT_P.add_tbl_after(tbl: CT_Tbl) -> CT_Tbl in src/docx/oxml/text/paragraph.py, mirroring the existing add_p_after:

    def add_tbl_after(self, tbl: CT_Tbl) -> CT_Tbl:
        """Insert `tbl` directly after this element and return it."""
        self.addnext(tbl)
        return tbl

    Then in Paragraph.insert_table_after:

    tbl = CT_Tbl.new_tbl(rows, cols, width)
    self._p.add_tbl_after(tbl)
  2. Add CT_Tbl.add_p_after() -> CT_P in src/docx/oxml/table.py, mirroring CT_P.add_p_after:

    def add_p_after(self) -> CT_P:
        """Return a new `<w:p>` element inserted directly after this one."""
        new_p = cast(CT_P, OxmlElement("w:p"))
        self.addnext(new_p)
        return new_p

    Then in Table.insert_paragraph_after, remove the inline OxmlElement import and use:

    new_p = self._tbl.add_p_after()

The oxml-layer methods also need unit tests in tests/oxml/test_paragraph.py and tests/oxml/test_table.py respectively.

@citconv-agents citconv-agents bot added the review-changes-needed Review agent requested changes label Apr 5, 2026
- Remove dead code: `_Body.insert_element_before` and `CT_Body.insert_before`
  (never called from live code paths)
- Fix architecture inconsistency: move XML manipulation from proxy layer to
  oxml layer by adding `CT_P.add_tbl_after` and `CT_Tbl.add_p_after` methods
- Update `Paragraph.insert_table_after` and `Table.insert_paragraph_after` to
  delegate to the new oxml-layer methods instead of calling `addnext` directly
- Add unit tests for the new oxml-layer methods

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@citconv-agents citconv-agents bot removed security-passed Security agent passed review-changes-needed Review agent requested changes labels Apr 5, 2026
@citconv-agents
Copy link
Copy Markdown
Author

citconv-agents bot commented Apr 5, 2026

Security Agent Report

SECURITY_PASS

Security Review Report — PR #99

Date: 2026-04-05
Branch: agent/issue-26
Reviewer: Security Agent

Summary

No security issues found. The changes are safe to merge.

Changes Reviewed

File Nature of Change
src/docx/document.py Trailing newline only
src/docx/oxml/table.py New add_p_after() method on CT_Tbl
src/docx/oxml/text/paragraph.py New add_p_after() and add_tbl_after() methods on CT_P
src/docx/table.py New insert_paragraph_after() method on Table
src/docx/text/paragraph.py New insert_paragraph_after() and insert_table_after() methods on Paragraph
tests/ (multiple) Test coverage for the above

Security Checks

XML Injection / XXE

PASS. No raw user-supplied XML strings are parsed anywhere in these changes.

  • add_p_after() creates elements via OxmlElement("w:p") — a fixed, trusted tag name, not user-controlled.
  • add_tbl_after(tbl) inserts a pre-constructed element; no XML parsing from external input.
  • CT_Tbl.new_tbl(rows, cols, width) builds elements programmatically from numeric/dimensional arguments.
  • Text passed to insert_paragraph_after(text=...) flows through paragraph.add_run(text), which stores it as XML text content. lxml properly escapes this, preventing XML injection.

Path Traversal

PASS. No file paths are handled in these changes.

Dependency Risk

PASS. No new dependencies are introduced. No changes to setup.cfg, pyproject.toml, or requirements files.

Secrets / Credentials

PASS. No API keys, tokens, passwords, or other credentials present.

Data Exposure

PASS. No file I/O, no network calls, no logging of sensitive data.

Input Validation

PASS.

  • rows and cols are int typed; width is Length typed — numeric parameters with no string-injection surface.
  • style parameter follows the same pattern as the existing insert_paragraph_before() method already in the codebase.

Conclusion

All changes are narrow, internal API additions that follow established patterns in the python-docx codebase. No security concerns were identified.

@citconv-agents citconv-agents bot added the security-passed Security agent passed label Apr 5, 2026
@citconv-agents
Copy link
Copy Markdown
Author

citconv-agents bot commented Apr 5, 2026

Review Agent

REVIEW_CHANGES_NEEDED

Summary

The PR adds insert_paragraph_after to both Paragraph and Table, and insert_table_after to Paragraph. The implementation is clean, follows existing patterns, and is well-tested. One issue found in the new test code.


Issue: Wrong type cast in tests/oxml/test_table.py:28

# tests/oxml/test_table.py:28
body = cast(CT_P, element("w:body/w:tbl"))   # <-- CT_P is wrong
tbl = cast(CT_Tbl, body[0])

element("w:body/w:tbl") returns a w:body element, not a w:p. Casting it to CT_P gives pyright incorrect type information and is misleading to readers. The established pattern in this test suite (see tests/oxml/test_document.py:18) is:

body = cast(CT_Body, element("w:body/w:tbl"))

Fix: change cast(CT_P, ...) to cast(CT_Body, ...) and add CT_Body to the imports from docx.oxml.document.


Notes (no changes required)

  • add_tbl_after(tbl: BaseOxmlElement) -> BaseOxmlElement — the broad type is acceptable; using CT_Tbl would require a TYPE_CHECKING guard to avoid circular imports, and the return value is unused at the call site anyway.
  • Local OxmlElement import inside CT_Tbl.add_p_after — consistent with avoiding any potential circular import at module level; not a problem.
  • if text: (instead of if text is not None:) — intentional and consistent with the existing insert_paragraph_before pattern.

@citconv-agents citconv-agents bot added the review-changes-needed Review agent requested changes label Apr 5, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agent-pr PR created by an agent review-changes-needed Review agent requested changes security-passed Security agent passed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

0 participants