FIX: Store deferred connect-attribute values in member buffers (#594)#596
FIX: Store deferred connect-attribute values in member buffers (#594)#596saurabh500 wants to merge 3 commits into
Conversation
…soft#594) PR microsoft#568 changed Connection::setAttribute to copy bytes/string attribute values into stack-local buffers before calling SQLSetConnectAttr, on the theory that the released GIL could otherwise expose a race with another thread mutating the member buffer. That premise is unsound for the deferred-attribute case and introduced a use-after-free. SQL_COPT_SS_ACCESS_TOKEN (1256) is a deferred ODBC connect attribute: the MS driver stores the caller's pointer at SQLSetConnectAttr time and dereferences it later, during SQLDriverConnect, to build the FedAuth login packet. With the stack-local copy, by the time the driver reads the pointer the buffer has already been freed. Observed symptoms of the regression (issue microsoft#594): * macOS arm64: Fatal Python error: Bus error (SIGBUS) * Windows x64: ProgrammingError: Authentication token is missing in the federated authentication message * Azure SQL DB: OperationalError: Communication link failure; TCP Provider Error code 0x2746 (server reset) Fix: store the value in the Connection-owned member buffers (strBytesBuffer for bytes/bytearray, wstrStringBuffer for str) so the memory remains valid for the lifetime of the Connection, which covers the deferred dereference during SQLDriverConnect. PR microsoft#568's gil_scoped_release around the SQLSetConnectAttr call is preserved. The 'race with another thread' rationale from PR microsoft#568 does not apply: attrs_before is applied once, sequentially, during connect(), and concurrent setAttribute on the same Connection from different threads is not a supported pattern. 1.6.0 used the persistent member buffer without issue. Verified against saurabhsingh.database.windows.net with an AzureCliCredential token on macOS arm64: * Stock 1.7.1: SIGBUS / TCP reset (use-after-free) * With fix: Clean OperationalError surfaced from the server Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace the two shared Connection-member buffers (wstrStringBuffer, strBytesBuffer) with std::unordered_map<SQLINTEGER, ...> keyed by attribute id. This prevents a second deferred string/bytes attribute from invalidating the pointer the driver stashed for an earlier one. Today only SQL_COPT_SS_ACCESS_TOKEN (1256) is deferred in practice, so the single-buffer form is sufficient to fix microsoft#594 as observed; the map hardens the fix against any future deferred string/bytes attribute. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Pushed |
📊 Code Coverage Report
Diff CoverageDiff: main...HEAD, staged and unstaged changes
Summary
mssql_python/pybind/connection/connection.cppLines 401-412 401 //
402 // Note: attrs_before is applied once, sequentially, during
403 // connect(); concurrent setAttribute() on the same Connection
404 // from different threads is not a supported pattern.
! 405 auto& buf = this->_attrBytesBuffers[attribute];
! 406 buf = value.cast<std::string>();
! 407 SQLPOINTER ptr = const_cast<char*>(buf.data());
! 408 SQLINTEGER length = static_cast<SQLINTEGER>(buf.size());
409
410 SQLRETURN ret;
411 {
412 py::gil_scoped_release release;📋 Files Needing Attention📉 Files with overall lowest coverage (click to expand)mssql_python.pybind.build._deps.simdutf-src.src.haswell.implementation.cpp: 0.4%
mssql_python.pybind.build._deps.simdutf-src.src.implementation.cpp: 6.7%
mssql_python.pybind.build._deps.simdutf-src.include.simdutf.implementation.h: 10.4%
mssql_python.pybind.build._deps.simdutf-src.include.simdutf.scalar.utf16_to_utf8.utf16_to_utf8.h: 25.3%
mssql_python.pybind.logger_bridge.cpp: 59.2%
mssql_python.pybind.ddbc_bindings.h: 59.7%
mssql_python.pybind.build._deps.simdutf-src.include.simdutf.internal.isadetection.h: 65.3%
mssql_python.row.py: 70.5%
mssql_python.pybind.logger_bridge.hpp: 70.8%
mssql_python.pybind.ddbc_bindings.cpp: 74.2%🔗 Quick Links
|
…e buffers
Expands the comments in Connection::setAttribute (str + bytes branches)
and Connection::reset() to explain why _attrStringBuffers /
_attrBytesBuffers must live for the entire Connection object lifetime
and are NOT cleared on SQL_ATTR_RESET_CONNECTION pool checkin:
* SQL_ATTR_RESET_CONNECTION only wipes per-session state; the
driver-side authentication context and stashed deferred-attribute
pointers (notably SQL_COPT_SS_ACCESS_TOKEN for FedAuth) are
intentionally retained.
* Idle Connection Resiliency can re-run Login7 transparently on a
dropped socket and will dereference the same stashed token
pointer.
Clearing the buffers on reset() would reintroduce issue microsoft#594 in a
new form (UAF during silent reconnect).
Comment-only change; no behavior change.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Pushed Why
The driver may still dereference those stashed pointers after the reset — most importantly during Idle Connection Resiliency: if the underlying TCP connection drops while the connection sits idle in the pool, the driver transparently re-establishes it on the next use and re-runs the Login7 / FedAuth handshake, reading the same access-token pointer it stashed at original-connect time. If we cleared the per-attribute buffers on Memory cost is bounded: one entry per unique attribute id ever set on the connection (in practice = 1, for token 1256). Token rotation overwrites the same map slot in place, so it's O(1) memory per connection regardless of how often the token is refreshed. |
Fixes #594.
Problem
PR #568 changed
Connection::setAttributeto copy bytes/string attribute values into stack-local buffers before callingSQLSetConnectAttr, on the theory that the released GIL could otherwise expose a race with another thread mutating the member buffer.That premise is unsound for deferred attributes and introduced a use-after-free.
SQL_COPT_SS_ACCESS_TOKEN(1256) is a deferred ODBC connect attribute: the MS driver stores the caller's pointer atSQLSetConnectAttrtime and dereferences it later, duringSQLDriverConnect, to build the FedAuth login packet. With the stack-local copy, by the time the driver reads the pointer the buffer has already been freed.Observed symptoms (issue #594)
Fatal Python error: Bus error(SIGBUS)ProgrammingError: Authentication token is missing in the federated authentication messageOperationalError: Communication link failure; TCP Provider Error code 0x2746(server reset)All three are the same root cause — the driver dereferences a pointer to a freed stack frame; the symptom depends on what the OS leaves on that page.
Fix
Store the value in the Connection-owned member buffers (
strBytesBufferfor bytes/bytearray,wstrStringBufferfor str) so the memory remains valid for the lifetime of theConnection, which covers the deferred dereference duringSQLDriverConnect.PR #568's
gil_scoped_releasearoundSQLSetConnectAttris preserved.Why the 'race with another thread' rationale from PR #568 doesn't apply
attrs_beforeis applied once, sequentially, duringconnect(). No other thread is mutating attributes at that moment.setAttribute()on the sameConnectionfrom different threads is not a supported pattern — ODBC connection handles aren't designed for concurrent attribute mutation.Comments in the patched code document both the deferred-read requirement and the non-thread-safe contract.
Verification
Built the patched
.sofrom this branch on macOS arm64, overlaid into the installed 1.7.1 wheel, and ran the issue's repro script against an Azure SQL DB with anAzureCliCredentialaccess token:Fatal Python error: Bus error(SIGBUS) /OperationalError: ... TCP Provider Error code 0x2746(run-to-run variance)OperationalError: Login failed for user '<token-identified principal>'surfaced from the server (token transmitted intact; server-side rejection because that test target has no AAD admin configured)Files touched
mssql_python/pybind/connection/connection.cpp— both thepy::bytes/py::bytearraybranch and thepy::strbranch ofConnection::setAttributereverted to use the persistent member buffers.