feat: Phase D.23: Watermark support (text and image) #100
citconv-agents[bot] wants to merge 5 commits into master
Add Section.add_text_watermark(), Section.add_image_watermark(), and Section.remove_watermark() methods. Watermarks are implemented as VML shapes injected into the default header part, matching Word's native watermark format. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Security Agent Report
SECURITY_FAIL
Security Review: PR #100 (Watermark Support)
Summary
One high-severity XML injection vulnerability found in
ISSUE 1: XML Injection via unsanitized
| Parameter | Source | Escaped? | Risk |
|---|---|---|---|
| `text` | caller input | No | HIGH |
| `font` | caller input | No | HIGH |
| `color_hex` | `RGBColor.__str__()` → hex digits only | Yes (implicit) | None |
| `size_pt` | numeric `:.0f` format | Yes (implicit) | None |
| `rotation` | literal "315" or "0" | Yes (implicit) | None |
| `rId` | internal relationship ID | Yes (implicit) | None |
Recommended Fix
Use xml.sax.saxutils.escape() to escape both values before interpolation:

    from xml.sax.saxutils import escape as xml_escape
    safe_text = xml_escape(text)  # escapes & < > by default (quotes are not escaped)
    safe_font = xml_escape(font)

Then use safe_text and safe_font in the f-string. Or better: build the <v:textpath> element programmatically using lxml and set attributes with element.set(attr, value), which always escapes properly:

    from lxml import etree
    textpath = etree.SubElement(shape, "{urn:schemas-microsoft-com:vml}textpath")
    textpath.set("on", "t")
    textpath.set("fitshape", "t")
    textpath.set("style", f"font-family:{font};font-size:{size_pt:.0f}pt")
    textpath.set("string", text)

Clean areas
- src/docx/oxml/ns.py: Namespace additions are static URNs; no injection risk.
- add_image_watermark: rId is generated internally by get_or_add_image(), and width_pt/height_pt are formatted with :.0f (digits only). No injection risk. The image_path parameter is passed directly to get_or_add_image(), which is expected behaviour for a document library (caller is already trusted).
- tests/test_section.py: Test additions only. No security concerns.
- No new dependencies introduced.
- No secrets, tokens, or API keys present.
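The programmatic approach recommended in this report can be illustrated with the standard library alone. The sketch below uses `xml.etree.ElementTree` as a stand-in for lxml (an assumption made to keep the example dependency-free), and `build_textpath` is a hypothetical helper, not a function from this PR. It shows that attribute values set through the element API are escaped on serialization, so a hostile string survives as data rather than as markup.

```python
import xml.etree.ElementTree as ET

VML_NS = "urn:schemas-microsoft-com:vml"
ET.register_namespace("v", VML_NS)


def build_textpath(text: str, font: str, size_pt: float) -> str:
    """Hypothetical helper: serialize a <v:textpath> built via the element API."""
    textpath = ET.Element(f"{{{VML_NS}}}textpath")
    textpath.set("on", "t")
    textpath.set("fitshape", "t")
    textpath.set("style", f'font-family:"{font}";font-size:{size_pt:.0f}pt')
    textpath.set("string", text)
    # The serializer escapes &, <, > and " inside attribute values.
    return ET.tostring(textpath, encoding="unicode")


hostile = 'DRAFT" on="injected'
serialized = build_textpath(hostile, "Arial", 72)

# Parsing the output back yields the original string, not an extra attribute.
parsed = ET.fromstring(serialized)
```

Note that the serializer only protects the XML structure. The CSS inside the `style` value is still assembled by string formatting, so the font name would need its own validation.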
Sanitize user-supplied `text` and `font` values with xml.sax.saxutils.escape() before interpolating into the VML XML string to prevent XML injection. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Security Agent Report
SECURITY_FAIL
Security Review: PR #100 (Watermark Support)
Branch: agent/issue-36

Issues Found

ISSUE-1: XML Attribute Injection via Incomplete Escaping
File:
Description:

    # Line 189-190
    safe_text = xml_escape(text)
    safe_font = xml_escape(font)
    # Line 236-237: both values used inside double-quoted XML attributes
    f'<v:textpath style="font-family:&quot;{safe_font}&quot;;'
    f'font-size:{size_pt:.0f}pt" string="{safe_text}"/>'

Attack vector (font): A font name containing

    section.add_text_watermark("DRAFT", font='Arial" fillcolor="red')

Produces:

    <v:textpath style="font-family:&quot;Arial" fillcolor="red&quot;;font-size:72pt" string="DRAFT"/>

This injects

Attack vector (text): A watermark string containing

    section.add_text_watermark('DRAFT" on="injected')

Produces:

    <v:textpath style="..." string="DRAFT" on="injected"/>

Impact:

Recommended Fix: Use

    from xml.sax.saxutils import escape as xml_escape
    safe_text = xml_escape(text, {'"': '&quot;'})
    safe_font = xml_escape(font, {'"': '&quot;'})

Or use

    from xml.sax.saxutils import quoteattr
    # quoteattr returns the value wrapped in quotes; slice to get the inner string.
    # Caveat: for a value containing " but no ', quoteattr switches to single-quote
    # delimiters and leaves " unescaped, so the slice is not safe inside "...".
    safe_text = quoteattr(text)[1:-1]
    safe_font = quoteattr(font)[1:-1]

Clean Findings
The following were reviewed and found to have no issues:

Summary
One real (MEDIUM severity) XML attribute injection vulnerability was found caused by using
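The attribute breakout described in this report can be reproduced with the standard library. The following is a minimal sketch: `render_textpath` is a hypothetical function that mimics the vulnerable pattern (default `escape()` output placed inside a double-quoted attribute), not the PR's actual code.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

VML_NS = "urn:schemas-microsoft-com:vml"


def render_textpath(text: str) -> str:
    """Hypothetical reproduction of the vulnerable f-string pattern."""
    safe_text = escape(text)  # handles & < > but leaves " untouched
    return f'<v:textpath xmlns:v="{VML_NS}" string="{safe_text}"/>'


payload = 'DRAFT" on="injected'
element = ET.fromstring(render_textpath(payload))

# The parser sees two attributes: the truncated string and the injected one.
attributes = dict(element.attrib)
```

Parsing the result is a convenient way to check for this class of bug in a test: if the attribute set differs from what the code intended to write, a caller-supplied value has escaped its attribute.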
xml.sax.saxutils.escape() does not escape `"` by default, allowing XML attribute injection via crafted text or font parameters. Add `"` to the escape table to prevent attribute breakout. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
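A minimal sketch of the fix this commit describes, using only the standard library. The helper name `escape_attr` is hypothetical, and the `_XML_QUOTE_ENTITIES` definition is an assumption (a later commit in this PR mentions that name, but its definition is not shown here). The sketch also shows why slicing `quoteattr()` output is not a safe substitute.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

# Assumed definition: extra entity map passed to escape().
_XML_QUOTE_ENTITIES = {'"': "&quot;"}


def escape_attr(value: str) -> str:
    """Hypothetical helper: escape a value for use inside a double-quoted attribute."""
    return escape(value, _XML_QUOTE_ENTITIES)


payload = 'DRAFT" on="injected'
escaped = escape_attr(payload)

# Round-trip through a parser: the payload comes back as a single attribute value.
roundtrip = dict(ET.fromstring(f'<textpath string="{escaped}"/>').attrib)

# quoteattr() picks single-quote delimiters when the value has " but no ',
# so slicing off the delimiters returns the double quotes unescaped.
sliced = quoteattr(payload)[1:-1]
```

Because `escape()` replaces `&` before applying the extra entity map, the `&quot;` it emits is not escaped a second time.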
|
Security Agent Report
SECURITY_PASS
Security Review Report: PR #100 (agent/issue-36)
Reviewer: Security Agent
Summary
No exploitable security vulnerabilities found. The PR adds watermark support (text and image) to
Files Reviewed
Findings
PASS: XML Injection via user-controlled
|
|
Review Agent
REVIEW_CHANGES_NEEDED
PR #100: Watermark support for Section
Overall the implementation is solid and follows the project's general approach. The XML escaping and VML structure are correct, and the tests cover the key paths. A few real issues need addressing before merge.
1. Bug: Double-remove crash in
|
- Fix double-remove crash in remove_watermark by adding break flag
- Extract duplicated paragraph-wrapping into _insert_watermark_pict helper
- Move inline imports (xml_escape, parse_xml, qn) to module top level
- Apply xml_escape to rId in add_image_watermark for defensive consistency

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
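The review text describing the double-remove crash is truncated, so the exact failure is not shown. One plausible shape of such a bug, sketched here under stated assumptions, is a paragraph that contains more than one matching shape, so the inner loop tries to remove the same paragraph twice. The element names are simplified (no namespaces), the `WaterMark` id check is an assumption about how watermark shapes are recognized, and `xml.etree.ElementTree` stands in for the library's own element classes.

```python
import xml.etree.ElementTree as ET

# Simplified header: the first paragraph holds two watermark-like shapes.
HEADER_XML = """
<hdr>
  <p><pict><shape id="PowerPlusWaterMarkObject1"/><shape id="PowerPlusWaterMarkObject2"/></pict></p>
  <p><r>Page header text</r></p>
</hdr>
"""


def remove_watermark(header: ET.Element) -> int:
    """Remove each paragraph holding a watermark shape, at most once per paragraph."""
    removed = 0
    for paragraph in list(header.findall("p")):  # copy, since we mutate the parent
        for shape in paragraph.iter("shape"):
            if "WaterMark" in shape.get("id", ""):
                header.remove(paragraph)
                removed += 1
                # Without this break, the second matching shape would trigger a
                # second header.remove(paragraph), which raises ValueError.
                break
    return removed


header = ET.fromstring(HEADER_XML)
removed = remove_watermark(header)
remaining = len(header.findall("p"))
```

Calling `remove_watermark(header)` a second time is a no-op in this sketch, since no matching paragraphs remain.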
|
Security Agent Report
SECURITY_PASS
Security Review: PR #100 (Watermark Support)
Scope
Changes reviewed:
Findings
XML Injection: PASS
Text watermark (
Image watermark (
Numeric values in style attributes (
XXE (XML External Entity): PASS
The constructed
Path Traversal: PASS (by design)
Secrets in Code: PASS
No API keys, tokens, passwords, or credentials found.
New Dependencies: PASS
No new packages introduced. The
Namespace Additions (
|
|
Review Agent
REVIEW_CHANGES_NEEDED
PR #100: Watermark support for Section
Overall the implementation is clean and well-tested. VML XML structure is correct, XML escaping is properly applied, and the test suite covers the key scenarios. There is one correctness bug and a few minor issues.
Issues
1. Correctness:
|
- Break linked headers before writing watermarks to avoid mutating shared inherited header definitions (issue #1)
- Add type annotations to _insert_watermark_pict (issue #2)
- Validate layout parameter, raising ValueError for invalid values (issue #3)
- Use consistent _XML_QUOTE_ENTITIES for xml_escape extra-entities (issue #4)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
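The layout validation mentioned in this commit might look roughly like the sketch below. The layout names `"diagonal"` and `"horizontal"` and the helper `rotation_for_layout` are assumptions for illustration; only the rotation literals "315" and "0" come from the earlier security report.

```python
# Assumed layout names; the rotation values match the literals cited in the review.
_LAYOUT_ROTATION = {"diagonal": "315", "horizontal": "0"}


def rotation_for_layout(layout: str) -> str:
    """Hypothetical helper: map a layout name to a VML rotation, rejecting unknown names."""
    try:
        return _LAYOUT_ROTATION[layout]
    except KeyError:
        valid = ", ".join(sorted(_LAYOUT_ROTATION))
        raise ValueError(f"layout must be one of: {valid}; got {layout!r}") from None


diagonal = rotation_for_layout("diagonal")
horizontal = rotation_for_layout("horizontal")
```

Looking the value up in a fixed table, rather than formatting caller input into the XML, also keeps the rotation attribute free of injection risk.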
|
Security Agent Report
SECURITY_PASS
Security Review: PR #100 (agent/issue-36): Watermark Support
Reviewed files:
Summary
No security vulnerabilities found. The PR adds text and image watermark support via VML shapes
Checks Performed
1. XML Injection (PASS)
The PR builds XML via f-string concatenation, which is a pattern that requires care.
2. XXE (External Entity Injection) (PASS)
3. Path Traversal (PASS)
4. CSS Injection in Style Attributes (PASS: low residual risk, not exploitable)
The font name is double-quote-delimited in CSS. Any
5. New Dependencies (PASS)
No new third-party packages are introduced. The only new import is
6. Secrets / Credential Leakage (PASS)
No API keys, tokens, passwords, or credentials present in any changed file.
7. Sensitive Data Exposure (PASS)
No sensitive data written to logs or disk beyond what the caller explicitly provides
Notes
|
|
Review Agent
REVIEW_CHANGES_NEEDED
PR #100: Watermark support for
|
|
Review Agent
REVIEW_CHANGES_NEEDED
PR #100: Watermark Support for Section
Overall, this is a solid implementation. The VML approach matches how Word itself generates watermarks, XML escaping is handled correctly, and the tests cover the core functionality. A few issues need addressing.
Issue 1 (Correctness):
|
Summary
Implements #36
This PR was automatically generated by the Developer Agent.
Original Issue
Add support for adding text and image watermarks to documents.
API Design
Implementation
Watermarks are implemented as shapes in the header:
Needs to add/modify the header part for each section.
Upstream: python-openxml#845 (8 comments)
Generated by Developer Agent using Claude Code