Skip to content

fix: eliminate test port allocation race by running uvicorn in-thread#2263

Draft
maxisbey wants to merge 1 commit intomainfrom
max-158-port-allocation
Draft

fix: eliminate test port allocation race by running uvicorn in-thread#2263
maxisbey wants to merge 1 commit intomainfrom
max-158-port-allocation

Conversation

@maxisbey
Copy link
Contributor

@maxisbey maxisbey commented Mar 10, 2026

Summary

Replaces the subprocess + port-race fixture pattern in tests/shared/test_streamable_http.py with an in-thread uvicorn helper that pre-binds the listening socket, eliminating the TOCTOU race that caused intermittent CI failures under pytest-xdist.

The Problem

@pytest.fixture
def event_server_port() -> int:
    with socket.socket() as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]  # ← Port released! Race window opens

@pytest.fixture
def event_server(event_server_port, event_store):
    proc = multiprocessing.Process(target=run_server, kwargs={"port": event_server_port, ...})
    proc.start()
    wait_for_server(event_server_port)  # ← May see WRONG server
    ...

Between getsockname() returning and the subprocess rebinding, another pytest-xdist worker can claim the port — causing httpx.ConnectError: All connection attempts failed (job 63508538456) on the SSE auto-reconnect tests.

PR #1528 attempted worker-specific port ranges but was closed as only reducing (not eliminating) collision probability.

The Fix

tests/test_helpers.py now provides run_uvicorn_in_thread(app) which:

  1. Binds a listening socket with port=0 and calls listen() before starting the server thread
  2. Passes that socket to uvicorn via server.run(sockets=[sock])
  3. Yields the URL immediately — the port is known before the server thread even runs, and the kernel's listen queue buffers any connections that arrive during uvicorn startup

No polling, no race window. The socket is held atomically from bind() until sock.close() at fixture teardown.

Scope

Migrated the four test_streamable_http.py fixtures (basic_server, event_server, json_response_server, context_aware_server) that share create_app(). These include the SSE auto-reconnect tests (test_server_close_sse_stream_via_context, test_streamable_http_client_auto_reconnects, test_streamable_http_multiple_reconnections, test_standalone_get_stream_reconnection, test_streamable_http_client_respects_retry_interval) which genuinely need real TCP to exercise connection lifecycle.

Other files with the same pattern (test_sse.py, test_ws.py, test_http_unicode.py, test_integration.py, security tests) are deliberately left alone here — they don't need real TCP and are being converted to in-memory/ASGITransport in sibling PRs. wait_for_server() is kept for them.

Coverage

Running the server in-process means coverage now tracks transport code that was previously subprocess-invisible. Adjusted # pragma: no cover# pragma: lax no cover on a handful of branches in streamable_http.py and transport_security.py that are now partially covered depending on which tests exercise them.

AI Disclaimer

The previous pattern picked a free port via socket.bind(0), released it,
then started a uvicorn subprocess hoping to rebind — a TOCTOU race that
caused intermittent CI failures when pytest-xdist workers stole the port
between release and rebind (ConnectError, 404 against wrong server).

Added run_uvicorn_in_thread() which pre-binds the listening socket with
port=0 and passes it to uvicorn via server.run(sockets=[sock]). The port
is held atomically from bind until shutdown and is known before the
server thread even starts — no polling, no race. The kernel's listen
queue buffers any connections that arrive during uvicorn startup.

Migrated the four test_streamable_http.py fixtures (basic_server,
event_server, json_response_server, context_aware_server) that share
create_app(). These include the SSE auto-reconnect tests that genuinely
need real TCP to exercise connection lifecycle.

Running the server in-process means coverage now tracks transport code
that was previously subprocess-invisible; adjusted pragmas accordingly
(targeted no-cover on unreached error paths, lax no-cover on
timing-dependent branches).

wait_for_server() is kept for files not touched by this PR.
@maxisbey maxisbey force-pushed the max-158-port-allocation branch from 43777d8 to 21a3979 Compare March 13, 2026 11:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant