Skip to content

feat: Phase D.17: Floating images (non-inline positioning)#103

Open
citconv-agents[bot] wants to merge 3 commits intomasterfrom
agent/issue-30
Open

feat: Phase D.17: Floating images (non-inline positioning)#103
citconv-agents[bot] wants to merge 3 commits intomasterfrom
agent/issue-30

Conversation

@citconv-agents
Copy link
Copy Markdown

@citconv-agents citconv-agents bot commented Apr 5, 2026

Summary

Implements #30

This PR was automatically generated by the Developer Agent.

Original Issue

Add support for floating (anchored) images, not just inline.

API Design

  • paragraph.add_floating_image(path, width, height, position) — adds a wp:anchor instead of wp:inline
  • FloatingImage: .position (relative to page/margin/column/paragraph), .wrap_type (square, tight, through, behind, in_front), .offset (horizontal, vertical)
  • Read existing floating images from paragraphs

XML

  • wp:anchor (vs wp:inline for current images)
  • wp:positionH, wp:positionV with relative positioning
  • wp:wrapSquare, wp:wrapTight, wp:wrapNone, etc.

Complexity: High — requires understanding DrawingML anchor element structure.

Upstream: python-openxml#159 (14 comments)


Generated by Developer Agent using Claude Code

Add support for floating (anchored) images via wp:anchor elements,
complementing the existing inline image (wp:inline) support.

- Add WD_WRAP_TYPE, WD_RELATIVE_HORZ_POS, WD_RELATIVE_VERT_POS enums
- Extend CT_Anchor with new/new_pic_anchor constructors and position properties
- Add FloatingImage proxy class with wrap_type, position, and offset properties
- Add Paragraph.add_floating_image() and Paragraph.floating_images
- Add StoryPart.new_pic_anchor() for anchor element creation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@citconv-agents citconv-agents bot added the agent-pr PR created by an agent label Apr 5, 2026
@citconv-agents
Copy link
Copy Markdown
Author

citconv-agents bot commented Apr 5, 2026

Security Agent Report

SECURITY_PASS

Security Review — PR #103 (Floating Image / wp:anchor Support)

Reviewed files:

  • src/docx/enum/shape.py
  • src/docx/oxml/shape.py
  • src/docx/parts/story.py
  • src/docx/shape.py
  • src/docx/text/paragraph.py
  • tests/test_floating_image.py

XML Injection / XXE

CT_Anchor._anchor_xml() — The XML template is a fully static string; the only interpolation is % nsdecls("wp", "a", "pic", "r"), which generates namespace declarations from known constant strings. No user-controlled data is embedded. No risk.

positionH.set("relativeFrom", horz_relative_from) — Attribute values are set via lxml's Element.set(), which correctly escapes special characters (<, >, &, "). No injection risk.

offset_elm.text = str(horz_offset)horz_offset is typed as int; str(int) produces only decimal digits. No injection risk.

anchor.set("behindDoc", "1" if behind_doc else "0") — Literal string constants only. No risk.


Path Traversal

image_path_or_stream is passed to the pre-existing get_or_add_image() method, which is unchanged in this PR. filename (extracted as image.filename) is used only as a display label in docPr.name, never for file-system access. No new path traversal surface introduced.


Input Validation

wrap_type strings are validated implicitly through tag_map.get(wrap_type, "wp:wrapNone") — unknown values fall back to a safe default. horz_relative_from / vert_relative_from are stored as XML attribute values (lxml-escaped) and later deserialized via WD_RELATIVE_HORZ_POS(value) enum lookup. An unrecognised attribute value from a malformed document would raise ValueError (a robustness concern, not a security issue).


New Dependencies

None added.


Secrets / Sensitive Data

No API keys, tokens, passwords, or credentials in the diff.


Summary

No security issues found. The changes are confined to XML element construction using lxml's safe attribute/text APIs with static templates, and proxy classes wrapping those elements. The PR is safe to merge from a security perspective.

@citconv-agents citconv-agents bot added the security-passed Security agent passed label Apr 5, 2026
@citconv-agents
Copy link
Copy Markdown
Author

citconv-agents bot commented Apr 5, 2026

Review Agent

REVIEW_CHANGES_NEEDED

PR #103: Floating Image Support (wp:anchor positioning)

The overall structure is solid — CT_Anchor, FloatingImage, StoryPart.new_pic_anchor, and Paragraph.add_floating_image all follow project conventions well. Tests are thorough and cover the main cases. A few real issues need addressing before merge.


1. Round-trip inconsistency: WD_WRAP_TYPE.NONE vs WD_WRAP_TYPE.IN_FRONT

File: src/docx/enum/shape.py, src/docx/text/paragraph.py:86, src/docx/shape.py:140

WD_WRAP_TYPE.NONE and WD_WRAP_TYPE.IN_FRONT produce identical XML (wrapNone + behindDoc="0"), but only IN_FRONT is ever returned when reading back. This breaks round-trip consistency:

# User creates with default wrap_type (NONE)
img = para.add_floating_image("photo.png")
# User reads it back
img.wrap_type  # → WD_WRAP_TYPE.IN_FRONT, not NONE

Additionally, the default parameter in add_floating_image is wrap_type: WD_WRAP_TYPE = WD_WRAP_TYPE.NONE, which misleads callers — they set NONE and get back IN_FRONT.

Fix: Remove WD_WRAP_TYPE.NONE from the enum entirely (or rename it to make clear it's an alias), and change the default in add_floating_image to WD_WRAP_TYPE.IN_FRONT. All usages in the test file (wrap_map in add_floating_image) will need updating.


2. FloatingImage position properties have no setters (asymmetric with width/height)

File: src/docx/shape.py:155-170

FloatingImage exposes width and height with both getter and setter, but horz_offset, vert_offset, horz_pos_relative, and vert_pos_relative are getter-only. There are no corresponding CT_Anchor mutator methods either. A user who creates a floating image and then wants to reposition it cannot do so through the API — the position is frozen at creation time.

This is asymmetric with the dimension properties, which do support mutation. Either add the setters (with corresponding CT_Anchor property setters), or document explicitly in the docstring of each read-only property that mutation is not yet supported.

The CT_Anchor oxml layer would need something like:

@horz_offset.setter
def horz_offset(self, value: int) -> None:
    positionH = self.find(qn("wp:positionH"))
    if positionH is not None:
        offset_elm = positionH.find(qn("wp:posOffset"))
        if offset_elm is not None:
            offset_elm.text = str(value)

Minor Notes (no changes required)

  • super().__init__() in FloatingImage.__init__ (src/docx/shape.py:108): FloatingImage only inherits from object, so this call is harmless but unnecessary. Existing convention in this codebase (e.g. InlineShape) omits it.
  • _set_wrap_type silently swallows unknown wrap_type strings (src/docx/oxml/shape.py:207): tag_map.get(wrap_type, "wp:wrapNone") silently falls back to wrapNone for unknown values like "behind" or "in_front". Given these are internal strings, this is safe but could mask bugs. A ValueError on unrecognised input would be safer.
  • No SVG support in new_pic_anchor: StoryPart.new_pic_inline handles SVG content types, but new_pic_anchor does not. This is a known gap (not introduced here) but worth noting for completeness.
  • CT_Anchor is registered in oxml/__init__.py at line 60 — confirmed, no issue there.

@citconv-agents citconv-agents bot added the review-changes-needed Review agent requested changes label Apr 5, 2026
Remove WD_WRAP_TYPE.NONE to fix round-trip inconsistency where creating
with NONE returned IN_FRONT on read-back. Default wrap_type is now
IN_FRONT. Add setters for horz_offset, vert_offset, horz_pos_relative,
and vert_pos_relative on both CT_Anchor and FloatingImage.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@citconv-agents citconv-agents bot removed security-passed Security agent passed review-changes-needed Review agent requested changes labels Apr 5, 2026
@citconv-agents
Copy link
Copy Markdown
Author

citconv-agents bot commented Apr 5, 2026

Security Agent Report

SECURITY_PASS

Security Review Report — PR #103 (Floating Image Support)

Summary

All changed files reviewed. No security issues found.


Files Reviewed

  • src/docx/enum/shape.py — new enum classes
  • src/docx/oxml/shape.pyCT_Anchor element class and _set_wrap_type helper
  • src/docx/parts/story.pynew_pic_anchor method on StoryPart
  • src/docx/shape.pyFloatingImage proxy class
  • src/docx/text/paragraph.pyadd_floating_image / floating_images
  • tests/test_floating_image.py — test suite (no production code risk)

Checks Performed

XML / XXE Injection

CLEAN.

  • _anchor_xml() uses Python % string formatting only for nsdecls(...), which is an internal function that generates hardcoded namespace declarations from a fixed prefix-to-URI map. No user data enters this template string.
  • positionH.set("relativeFrom", horz_relative_from) and positionV.set("relativeFrom", vert_relative_from) call lxml's element.set(), which properly escapes attribute values. No XML attribute injection is possible.
  • offset_elm.text = str(horz_offset) / str(vert_offset) — both parameters are typed as int; str(int) produces only decimal digits.
  • _set_wrap_type() uses OxmlElement(tag) where tag is always a value from a hardcoded tag_map dict (with a safe fallback of "wp:wrapNone" for unknown inputs). No user-controlled string reaches OxmlElement.
  • anchor.set("behindDoc", "1" if behind_doc else "0") — controlled boolean expression, no injection surface.
  • No external XML entity declarations or resolve_entities=True usage introduced.

Path Traversal / Unsafe File Handling

CLEAN.

The image_path_or_stream parameter follows the same code path as the existing add_picture() API (StoryPart.get_or_add_image). No new file-handling logic was introduced; risk profile is unchanged from the existing API.

New Dependencies

CLEAN.

No changes to pyproject.toml, setup.cfg, or any requirements file. No new packages introduced.

Secrets / Credentials in Code

CLEAN.

No API keys, tokens, passwords, or other credentials present in any changed file.

Input Validation

LOW (informational only — not a vulnerability).

horz_relative_from and vert_relative_from accept arbitrary strings at the CT_Anchor oxml layer. Invalid values (e.g. "invalid") would produce non-conforming OOXML that Word may reject, but this cannot cause a security exploit. The public Paragraph.add_floating_image() API enforces enum types (WD_RELATIVE_HORZ_POS, WD_RELATIVE_VERT_POS), so the raw-string path is only reachable by callers who intentionally bypass the high-level API.


Verdict

SECURITY_PASS — No exploitable vulnerabilities identified.

@citconv-agents citconv-agents bot added the security-passed Security agent passed label Apr 5, 2026
@citconv-agents
Copy link
Copy Markdown
Author

citconv-agents bot commented Apr 5, 2026

Review Agent

REVIEW_CHANGES_NEEDED

PR #103 — Floating Image Support

The overall structure is solid and follows the three-layer architecture correctly. The enum additions, CT_Anchor XML construction, and FloatingImage proxy are all well-designed. A few issues need attention before merging.


1. Missing wrap_type setter on FloatingImage (blocking)

File: src/docx/shape.py

FloatingImage.wrap_type has a getter but no setter, making it inconsistent with every other property on the class (width, height, horz_offset, vert_offset, horz_pos_relative, vert_pos_relative all have setters). A module-level _set_wrap_type() function already exists and could power this setter directly.

@wrap_type.setter
def wrap_type(self, value: WD_WRAP_TYPE) -> None:
    wrap_map = {
        WD_WRAP_TYPE.SQUARE: ("square", False),
        WD_WRAP_TYPE.TIGHT: ("tight", False),
        WD_WRAP_TYPE.THROUGH: ("through", False),
        WD_WRAP_TYPE.TOP_AND_BOTTOM: ("topAndBottom", False),
        WD_WRAP_TYPE.IN_FRONT: ("none", False),
        WD_WRAP_TYPE.BEHIND: ("none", True),
    }
    wrap_str, behind_doc = wrap_map[value]
    _set_wrap_type(self._anchor, wrap_str)
    self._anchor.set("behindDoc", "1" if behind_doc else "0")

2. No tests for Paragraph.add_floating_image or Paragraph.floating_images (blocking)

File: tests/test_floating_image.py

Paragraph.add_floating_image and Paragraph.floating_images are new public API entry points, but neither has any tests. The test file only covers CT_Anchor and FloatingImage in isolation.

At minimum, add tests verifying:

  • add_floating_image returns a FloatingImage and appends a run with a drawing to the paragraph.
  • floating_images returns FloatingImage instances for each wp:anchor in the paragraph, and returns an empty list for paragraphs with no anchors.

3. Silent setter failure when position elements are absent (minor)

File: src/docx/oxml/shape.pyhorz_offset.setter, vert_offset.setter, horz_relative_from.setter, vert_relative_from.setter

All four setters silently do nothing if the expected child element (wp:positionH / wp:positionV) is absent. While the _anchor_xml template always includes these, a document parsed from disk could theoretically omit them. The getter returns a default value, so the caller would have no indication the set was ignored. Consider raising ValueError or ensuring the element is created on demand.


4. WD_WRAP_TYPE enum is missing NONE and maps two values to the same XML state (minor)

File: src/docx/enum/shape.py

The enum exposes IN_FRONT and BEHIND rather than a NONE value, but the underlying XML uses wrapNone for both, differentiated only by behindDoc. The docstring on WD_WRAP_TYPE says it "Maps to the wp:wrapNone, wp:wrapSquare, …" child elements, but IN_FRONT and BEHIND both map to wrapNone. This will confuse readers. Add a clarifying note to the class docstring or comments explaining that IN_FRONT/BEHIND are both wrapNone with behindDoc=0/1.


Summary

# Severity File
1 Blocking src/docx/shape.py — missing wrap_type setter
2 Blocking tests/test_floating_image.py — no Paragraph-level tests
3 Minor src/docx/oxml/shape.py — silent setter no-ops
4 Minor src/docx/enum/shape.py — confusing docstring

@citconv-agents citconv-agents bot added the review-changes-needed Review agent requested changes label Apr 5, 2026
- Add wrap_type setter to FloatingImage using existing _set_wrap_type()
- Add Paragraph-level tests for add_floating_image and floating_images
- Raise ValueError in CT_Anchor setters when position elements are absent
- Clarify WD_WRAP_TYPE docstring re IN_FRONT/BEHIND both mapping to wrapNone

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@citconv-agents citconv-agents bot removed security-passed Security agent passed review-changes-needed Review agent requested changes labels Apr 5, 2026
@citconv-agents
Copy link
Copy Markdown
Author

citconv-agents bot commented Apr 5, 2026

Security Agent Report

SECURITY_PASS

Security Review Report — PR #103 (Floating Image Support)

Summary

All changed files were reviewed. No exploitable security vulnerabilities were found. The PR adds floating image (wp:anchor) support: new enums, a CT_Anchor oxml class, a FloatingImage proxy, StoryPart.new_pic_anchor(), and Paragraph.add_floating_image().


Files Reviewed

File Change Type
src/docx/enum/shape.py New enums: WD_WRAP_TYPE, WD_RELATIVE_HORZ_POS, WD_RELATIVE_VERT_POS
src/docx/oxml/shape.py New CT_Anchor methods, _set_wrap_type() helper
src/docx/parts/story.py New StoryPart.new_pic_anchor() method
src/docx/shape.py New FloatingImage proxy class
src/docx/text/paragraph.py New Paragraph.add_floating_image(), floating_images property
tests/test_floating_image.py New test file

Findings

PASS — XML Injection / XXE

  • _anchor_xml() builds XML via a static format string with % nsdecls("wp", "a", "pic", "r"). The nsdecls() call receives only hardcoded namespace prefixes — no user input reaches the format string.
  • Attribute writes use lxml's .set() method (e.g., positionH.set("relativeFrom", horz_relative_from)), which properly escapes values — XML attribute injection is not possible.
  • No parse_xml() / etree.parse() calls on user-supplied raw XML or external entities.

PASS — Path Traversal / File Handling

  • image_path_or_stream is forwarded through StoryPart.get_or_add_image(), the same path used by the existing new_pic_inline(). No new file-handling code was introduced.

PASS — Dependency Risk

  • No new third-party dependencies were added. All imports are from the standard library or existing internal modules.

PASS — Secrets in Code

  • No API keys, tokens, passwords, or credentials present.

PASS — Data Exposure

  • No logging of sensitive data; no new serialization paths that could leak internal state.

Low-Severity Observation (Not a Vulnerability)

File: src/docx/oxml/shape.py, CT_Anchor.new_pic_anchor() (lines ~92–133)
Severity: LOW (informational only)

The internal CT_Anchor.new_pic_anchor() accepts horz_relative_from, vert_relative_from, and wrap_type as raw strings without validating them against the OOXML-allowed value sets. An unrecognized wrap_type silently defaults to "wp:wrapNone", which is safe behavior. The public Paragraph.add_floating_image() API enforces valid values via enums (WD_RELATIVE_HORZ_POS, WD_RELATIVE_VERT_POS, WD_WRAP_TYPE), which is the correct pattern.

Risk: A developer calling the internal oxml API directly with an invalid string could produce non-compliant OOXML. This is consistent with the rest of the codebase (which also uses unvalidated strings at the oxml layer) and is not exploitable.

Recommended fix (optional): Add runtime validation or type narrowing for horz_relative_from/vert_relative_from at the oxml layer if strict OOXML compliance is desired.


Conclusion

The PR is clean. No security vulnerabilities were identified. The implementation follows established patterns in the codebase and the public API correctly constrains inputs via enumerations.

@citconv-agents citconv-agents bot added the security-passed Security agent passed label Apr 5, 2026
@citconv-agents
Copy link
Copy Markdown
Author

citconv-agents bot commented Apr 5, 2026

Review Agent

REVIEW_CHANGES_NEEDED

Floating Image Support (wp:anchor) — PR Review

The overall design is sound: clean three-layer implementation (enum → oxml → proxy), tests cover the main paths, and the XPath queries correctly rely on BaseOxmlElement.xpath() which injects nsmap automatically (confirmed in xmlchemy.py:692). One real issue and a few minor concerns below.


1. Duplicate wrap_map dict (real issue — DRY violation)

The same WD_WRAP_TYPE → (wrap_str, behind_doc) mapping is defined in two separate places:

  • src/docx/text/paragraph.py:527-535 inside Paragraph.add_floating_image()
  • src/docx/shape.py:422-429 inside FloatingImage.wrap_type.setter

If a new wrap type is added or a value changes, both copies must be updated in sync. Extract to a module-level constant (e.g. in src/docx/shape.py):

_WRAP_TYPE_MAP: dict[WD_WRAP_TYPE, tuple[str, bool]] = {
    WD_WRAP_TYPE.SQUARE: ("square", False),
    WD_WRAP_TYPE.TIGHT: ("tight", False),
    WD_WRAP_TYPE.THROUGH: ("through", False),
    WD_WRAP_TYPE.TOP_AND_BOTTOM: ("topAndBottom", False),
    WD_WRAP_TYPE.IN_FRONT: ("none", False),
    WD_WRAP_TYPE.BEHIND: ("none", True),
}

Then both callers reference _WRAP_TYPE_MAP[wrap_type].


2. Misleading inline comment (minor)

src/docx/oxml/shape.py:218:

    # -- wrapNone with behindDoc differentiates behind/in_front --
    return "none"

This fallback fires when no wrap element is found at all (the for-loop above already returns "none" for a real wp:wrapNone element). The comment describes the wrong case. Either remove it or change it to something like # -- no wrap element found; default to none --.


3. Unnecessary deferred import inside method body (minor)

src/docx/shape.py:420:

@wrap_type.setter
def wrap_type(self, value: WD_WRAP_TYPE) -> None:
    from docx.oxml.shape import _set_wrap_type

There is no circular-import risk here (docx.oxml.shape does not import docx.shape), so this should be a regular top-level import. Deferred imports inside methods are reserved for actual circular-import resolution in this codebase.


Positive notes

  • The XPath .//w:drawing/wp:anchor at paragraph.py:130 is correct: the overridden xpath() in BaseOxmlElement automatically passes nsmap.
  • _set_wrap_type correctly uses findall on direct children (not descendants) before re-inserting the new element before wp:docPr, which matches the OOXML schema ordering.
  • behindDoc is correctly treated as an unprefixed attribute on wp:anchor, so anchor.set("behindDoc", ...) is valid without Clark notation.
  • Test coverage is thorough: all six wrap-type enum values are exercised, offset getters/setters, position reference getters/setters, and the paragraph-level API are all tested.
  • The fake_parent fixture from tests/conftest.py is correctly re-used without redefinition.

@citconv-agents citconv-agents bot added the review-changes-needed Review agent requested changes label Apr 5, 2026
@citconv-agents
Copy link
Copy Markdown
Author

citconv-agents bot commented Apr 5, 2026

Review Agent

REVIEW_CHANGES_NEEDED

PR #103 – Floating Image Support (wp:anchor)

Overall the implementation is well-structured and follows project conventions. The oxml/proxy/part layering is correct, the XML template mirrors _inline_xml, and test coverage is solid. Two real issues and one minor nit need to be addressed.


Issue 1 – Duplicate wrap_map dict (DRY violation)

Files: src/docx/shape.py:422-429 and src/docx/text/paragraph.py:527-534

The identical mapping from WD_WRAP_TYPE to (wrap_str, behind_doc) is copied verbatim in two places:

# shape.py – FloatingImage.wrap_type.setter
wrap_map = {
    WD_WRAP_TYPE.SQUARE: ("square", False),
    WD_WRAP_TYPE.TIGHT: ("tight", False),
    WD_WRAP_TYPE.THROUGH: ("through", False),
    WD_WRAP_TYPE.TOP_AND_BOTTOM: ("topAndBottom", False),
    WD_WRAP_TYPE.IN_FRONT: ("none", False),
    WD_WRAP_TYPE.BEHIND: ("none", True),
}

# paragraph.py – Paragraph.add_floating_image
wrap_map = {   # exact copy
    ...
}

If a new WD_WRAP_TYPE member is ever added it must be updated in both places. This should be factored into a single helper. A clean option is a module-level function in src/docx/shape.py (or src/docx/enum/shape.py) and imported by both callers:

def _wrap_type_to_xml(wrap_type: WD_WRAP_TYPE) -> tuple[str, bool]:
    """Return (wrap_str, behind_doc) pair for the given WD_WRAP_TYPE."""
    _MAP = {
        WD_WRAP_TYPE.SQUARE: ("square", False),
        WD_WRAP_TYPE.TIGHT: ("tight", False),
        WD_WRAP_TYPE.THROUGH: ("through", False),
        WD_WRAP_TYPE.TOP_AND_BOTTOM: ("topAndBottom", False),
        WD_WRAP_TYPE.IN_FRONT: ("none", False),
        WD_WRAP_TYPE.BEHIND: ("none", True),
    }
    return _MAP[wrap_type]

Issue 2 – CT_Anchor.horz_offset/vert_offset setters raise ValueError when element is absent, but getters silently return 0

File: src/docx/oxml/shape.py:145-173

The getter returns 0 when wp:positionH or wp:posOffset is absent, but the setter raises ValueError for the same condition. This asymmetry will surprise callers who read a 0 offset from an existing document and then try to write it back. Since the _anchor_xml template always produces these elements, this only affects wp:anchor elements read from pre-existing documents.

The setter should either create the missing element (consistent with get_or_add_* patterns used elsewhere in the codebase) or the getter and setter should agree on what "absent" means. At minimum, the docstring should document the precondition.


Nit – super().__init__() call in FloatingImage.__init__

File: src/docx/shape.py:378

FloatingImage doesn't inherit from anything (just object), yet calls super().__init__(). InlineShape in the same file does not. The call is harmless but inconsistent; it can be removed.


What looks good

  • The three-layer architecture is correctly followed: CT_Anchor at oxml, StoryPart.new_pic_anchor at parts, FloatingImage/Paragraph.add_floating_image at proxy.
  • The _anchor_xml template follows the _inline_xml pattern exactly.
  • _set_wrap_type correctly handles element insertion position relative to wp:docPr.
  • WD_WRAP_TYPE.IN_FRONT/BEHIND dual-mapping via behindDoc is clearly documented in the enum docstring.
  • Test coverage is comprehensive: CT_Anchor construction, all wrap types, position offsets, reference frames, and paragraph integration are all tested.
  • The fake_parent fixture from tests/conftest.py is correctly relied upon; no fixture is missing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agent-pr PR created by an agent review-changes-needed Review agent requested changes security-passed Security agent passed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant