● PR3 (TLS) ready. CI green. #479
No reviewers
Labels
No labels
bug
dependencies
documentation
duplicate
enhancement
good first issue
help wanted
invalid
question
refactor
rust
technical-debt
wontfix
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
navicore/patch-seq!479
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "net-pr3"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
What landed:
StreamKind { Tcp(TcpStream), Tls(Box<StreamOwned<ClientConnection, TcpStream>>) }. Read and Write traits
implemented on the enum so tcp.read / tcp.write / tcp.close dispatch transparently across both variants.
tcp.accept and tcp.connect wrap their output in StreamKind::Tcp(_).
TcpStream out of STREAMS, frees the old slot, runs ClientConnection::new + complete_io eagerly
(cooperative-yielding via the may stream), wraps in StreamOwned, allocates a fresh STREAMS slot under
StreamKind::Tls(Box::new(_)). Returns the new id; any handshake failure surfaces as (0, false) with the
underlying socket dropped (closed). Trust roots from webpki-roots; ring provider auto-installs via rustls's
crate-features path.
webpki-roots = "0.26" added at workspace level and to seq-runtime as always-on direct deps. Disabling defaults
was necessary so we don't drag in aws_lc_rs (which the default features pull) alongside the ring provider ureq
already enables — that would double-compile the crypto layer.
module lists; compiler/ast/program.rs validation list; runtime/lib.rs re-export. net.tls.client ( a Socket
String -- a Socket Bool ).
any IO and leaves the listener socket usable; (2) TLS handshake against a same-process plain-TCP listener (peer
closes after accept) fails cleanly and consumes the client socket.
v1 limitations (no mTLS, no caller-side ALPN, no peer/cipher inspection, no session-resumption knobs), and a
worked HTTPS example.
Not in PR3, will surface in PR4 or later:
TLS server (no TLS server primitive exposed yet). The PR4 HTTP rewrite naturally exercises real TLS end-to-end
against https:// URLs in its tests, which closes this gap.
What landed: - crates/runtime/src/tcp.rs — STREAMS registry refactored to Mutex<SocketRegistry<StreamKind>> with enum StreamKind { Tcp(TcpStream), Tls(Box<StreamOwned<ClientConnection, TcpStream>>) }. Read and Write traits implemented on the enum so tcp.read / tcp.write / tcp.close dispatch transparently across both variants. tcp.accept and tcp.connect wrap their output in StreamKind::Tcp(_). - crates/runtime/src/tls.rs — new module. patch_seq_tls_client pops hostname + socket id, takes the underlying TcpStream out of STREAMS, frees the old slot, runs ClientConnection::new + complete_io eagerly (cooperative-yielding via the may stream), wraps in StreamOwned, allocates a fresh STREAMS slot under StreamKind::Tls(Box::new(_)). Returns the new id; any handshake failure surfaces as (0, false) with the underlying socket dropped (closed). Trust roots from webpki-roots; ring provider auto-installs via rustls's crate-features path. - Cargo deps — rustls = "0.23" with default-features = false, features = ["ring", "std", "tls12", "logging"] and webpki-roots = "0.26" added at workspace level and to seq-runtime as always-on direct deps. Disabling defaults was necessary so we don't drag in aws_lc_rs (which the default features pull) alongside the ring provider ureq already enables — that would double-compile the crypto layer. - 4-site wiring — compiler/builtins/tls.rs, compiler/codegen/runtime/tls.rs, both registered in their parent module lists; compiler/ast/program.rs validation list; runtime/lib.rs re-export. net.tls.client ( a Socket String -- a Socket Bool ). - crates/compiler/src/error_flag_lint/state.rs — net.tls.client => (2, 1, "TLS handshake failure"). - tests/integration/src/test-net-tls-client.seq — two failure-path tests: (1) empty hostname fast-fails before any IO and leaves the listener socket usable; (2) TLS handshake against a same-process plain-TCP listener (peer closes after accept) fails cleanly and consumes the client socket. - docs/STDLIB_REFERENCE.md — new net.tls.* section with stack effect, eager-handshake semantics, four explicit v1 limitations (no mTLS, no caller-side ALPN, no peer/cipher inspection, no session-resumption knobs), and a worked HTTPS example. Not in PR3, will surface in PR4 or later: - Happy-path TLS roundtrip integration test — needs either a real HTTPS endpoint (CI offline) or a same-process TLS server (no TLS server primitive exposed yet). The PR4 HTTP rewrite naturally exercises real TLS end-to-end against https:// URLs in its tests, which closes this gap. - Per-attempt connect/handshake error introspection. - Client cert auth, ALPN selection, peer-cert inspection.Review — PR3 (net.tls.client)
Reviewed against
origin/net-pr3. The shape is right: aStreamKindenum unifying TCP and TLS in oneSTREAMSregistry,Read/Writeimpls sotcp.read/tcp.write/tcp.closedispatch transparently, eager handshake that surfaces TLS errors as(0, false)like every other fallible word, process-wideArc<ClientConfig>so the cert store isn't rebuilt per call. The rustls feature pinning (default-features = false, onlyring) to avoid double-compiling alongside ureq's provider is well-thought-out — that's exactly the kind of thing that bites later. PR2 IPv6 fix carried in cleanly.One real concern, plus follow-ups and a few nits.
Free-then-allocate race in
tls.clienttls.rs:425-453:Between the
freeand theallocate, the strand yields cooperatively while rustls drives the handshake. Other strands run during that window, and any of them callingnet.tcp.accept/net.tcp.connectpops idNfrom the free list and stores a different stream there.Concrete sequence:
tls.clienton socket id 7.take_tcp(7)→ slot=None.free(7)→free_ids=[…, 7]. Handshake starts; strand parks on socket I/O.tcp.accept→allocate→ pops 7 → strand B now holds a brand-new TCP stream at id 7.allocate(Tls(…))→ id 8 (or whatever's next).Now strand B is using id 7 for its connection while strand A is using id 8 for its TLS session. Internally consistent — different fds, different streams, no memory corruption. But any user code that had captured the integer
7before callingtls.client(e.g. a logging line, a map keyed on socket id, a generic "track all open sockets" structure) is now silently pointing at the wrong stream.The doc compounds this: it says "Returns a fresh Socket id; the original is consumed (freed on failure, replaced on success)." "Replaced" reads as in-place. The code is free-then-allocate-elsewhere.
Fix is small and makes the doc true: don't
freeaftertake. The slot is alreadyNone(which means "in use"), so no other allocator can grab idN. On success, write the TLS variant back into the same slot — same id throughout the upgrade. On handshake failure, then callfreeto release the id.Sketch:
This also avoids a free-list churn entry per upgrade. Worth doing.
Behaviour / API items worth documenting (or fixing in v1)
tcp.closeon a TLS socket is a hard close, not a graceful shutdown. DroppingStreamOwnedcloses the underlying TCP without sendingclose_notify. RFC 5246 expects clients to send the alert before close; modern servers tolerate truncation, but some older stacks (and a few HTTP/1.1 keep-alive intermediaries) log it as a truncation attack indicator. Either sendconn.send_close_notify()+complete_io()in theStreamKind::Tlsclose path, or document that close is a hard shutdown. Sending close_notify can hang on a slow peer, so a configurable timeout (or just a best-effort non-blocking attempt) would be the safer middle ground — but for v1 a one-line doc note is enough."IP literals accepted" in the doc table is misleading. Yes,
ServerName::try_fromparses IP literals without error, so the call succeeds. But cert validation against an IP requires the peer cert to carry that IP in a SAN entry — almost no public cert does. So in practice TLS to an IP literal always fails at validation, not parse. Currently a reader would conclude "I can do TLS to an IP" — they can't, except in tightly controlled internal cases. Suggest either dropping the parenthetical or expanding it: "IP literals are syntactically accepted but cert validation will reject the connection unless the peer cert lists the IP in its SAN."No handshake timeout.
complete_iowill drive reads/writes until the handshake succeeds, fails, or the strand never wakes. Combined with PR2's missing connect timeout, a strand can park for OS-SYN-timeout + indefinite-handshake-time on a partly broken peer. Same shape as the PR2 follow-up — worth grouping into a "deadline" pass once.Code-shape concerns
STREAMSandSocketRegistrymadepub(crate). Now any other runtime module can poke directly at the registry's internals. The existingtake_tcphelper in tls.rs is most of the way to a clean API; moving it (and itsinstall_tlscounterpart) intotcp.rsand keepingSTREAMS/SocketRegistryprivate would re-localise the invariant that "free + allocate is dangerous across a yield" inside the module that owns it. Not a blocker — the current shape works — but the surface area you've exposed is larger than the upgrade path actually needs.build_tls'shost_seqstringparameter is dead.tls.rs:483-488:The comment says "host_seqstring stays in scope until we hand the cloned hostname to ServerName" — but
hostnameis already an owned-from-&strboundary in the caller (host.as_str_or_empty().to_string()attls.rs:417), so&hostnameis&strof the owned String, not borrowing fromhost_seqstring. The SeqString is moved into the function and dropped at the end of scope — the explicitlet _ =makes no difference. Both the param and the discard line look like they can go.While there:
hostname.to_string()allocates again insidebuild_tlseven though the caller already owns aString. Just hand it theStringand consume it.Box-rationale comment is half-right.
tcp.rscomment aroundStreamKind::Tls(Box<…>)says the box prevents every plain-TCP slot from paying the TLS-sized footprint. The enum's discriminant size is the max of its variant sizes — boxing the TLS arm makes that variant ~16 bytes (pointer), butmay::net::TcpStreamitself is ~24-40 bytes, so the enum size is dominated by Tcp, not Tls. The box is still the right call (saves~size_of::<StreamOwned>()per TCP slot in the allocator, just not in the registry vector slot), but the stated reason isn't quite right.Test coverage
test-net-tls-client-empty-host-failsis asserting something trivially true. The test passes a listener id (fromnet.tcp.listen) tonet.tls.clientwith empty hostname, then asserts the listener is still usable. But the empty-hostname check attls.rs:418-420short-circuits beforetake_tcpruns. So the listener's slot (inLISTENERS) is untouched not because of any TLS-layer code path, but because the function returned before touchingSTREAMSat all. If a future refactor moved the empty-host check belowtake_tcp, this test would still pass (the listener lives inLISTENERS, notSTREAMS). The test is exercising the pre-check ordering, not what the name implies.A stronger version: get a real connected stream (e.g. via a same-process plain-TCP listen + connect like the PR2 test does), pass it to
tls.clientwith empty hostname, then assert the stream slot is still alive andtcp.read/tcp.writestill work on it. That would actually catch the regression the current test pretends to catch.Happy-path absence acknowledged. Deferring real-TLS coverage to PR4's HTTPS-against-real-endpoints tests is reasonable, but it means PR3 lands with the success path completely untested at the integration level. If you can stand up a same-process rustls server with a self-signed cert and a custom
RootCertStorein the test (it's ~30 lines of glue), the happy-path coverage closes that gap without external network. Optional — but the longer the success path goes unexercised the more room for handshake-shape regressions to slip in.Stronger handshake-fail isolation. The test uses
"127.0.0.1"as both the connect target and the SNI hostname. rustls'sServerName::try_fromfor"127.0.0.1"succeeds (IP literal), so the failure comes fromcomplete_ioon the peer's FIN. Good — but a DNS-name SNI that doesn't resolve (e.g."no-such-host.invalid") against the same plain-TCP listener would exercise the same transport-error path while also confirming that DNS-form SNI parses and reaches the handshake. Cheap to add.Nits
CryptoProvider::install_defaultto crate-features auto-install. This works as long as exactly one provider is feature-enabled across the entire workspace. If anything (transitively) ever enablesaws_lc_rstoo, you get a runtime panic at firstClientConfig::builder()("multiple default providers available"). A defensivelet _ = rustls::crypto::ring::default_provider().install_default();at startup makes the choice explicit and removes the fragility — costs nothing on the happy path.tls.rs:368—use crate::seqstring::SeqString;is only used by the deadhost_seqstringparam. Goes away with that.tls.rs:392doc comment says "port-of-call is the hostname (String, top)". "Port-of-call" is an odd phrase here — I think you mean "calling convention" or just "top of stack". Cosmetic.STDLIB_REFERENCE.mdhas the same closes-on-success-but-not-on-failure shape as PR2's example.dup net.tcp.read [ io.write-line ] [ drop "read failed" io.write-line ] ifthennet.tcp.close droponly fires the close on the success branch's continuation — actually re-reading, thenet.tcp.closeis outside theif, so it runs in both branches. Good. Disregard.Verdict
The free-then-allocate race is the one I'd want addressed before merge — it's a small fix and the doc already implies the corrected behaviour. Everything else is either deferrable (handshake timeout, happy-path test, mTLS/ALPN/peer-cert) or cosmetic (dead
host_seqstring, the encapsulation question, the doc clarifications, the defensive crypto-provider install). The structural call to put TLS into the same registry behind aStreamKindenum is the right one and will keep paying off as PR4 lands.Done: 1. Free-then-allocate race fixed (tls.rs): no free after take_tcp. The slot stays Some(None) (reserved) across the handshake yield, so no other strand can claim the id. On success the TLS variant is reinstalled into the same slot — the Socket id the caller passed in is the same one returned. Free is now only called on handshake/build failure (where the underlying stream has already been dropped). Doc updated to describe this accurately; take_tcp now documents the must-reinstall-or-free contract. 2. Defensive ring::default_provider().install_default() at first config build. The let _ = swallows the already-installed error so concurrent first-time use is safe. 3. Strengthened empty-host test: now uses a real connected stream from a same-process listen + connect, calls net.tls.client with empty hostname, then completes a ping/pong against the same client socket to prove its STREAMS slot was never touched. The old test passed a listener id (in LISTENERS) which the empty-host short-circuit couldn't reach anyway. 4. Added DNS-form SNI handshake-fail test: same plain-TCP listener pattern but the SNI is "no-such-host.invalid" — exercises the DNS-name branch of ServerName::try_from reaching complete_io before the transport-error path fails it. Complements the IP-SNI test which exercises the IP-literal branch. 5. Dead code/redundancy removal in tls.rs: dropped the host_seqstring: SeqString param and its let _ = discard, dropped the unused use crate::seqstring::SeqString. build_tls now takes hostname: String (owned) and consumes it, so the redundant .to_string() is gone. 6. Doc clarifications: - STDLIB_REFERENCE.md: the net.tls.client row now says "Returns the same Socket id with its registry slot replaced in place"; added an explicit "IP-literal hostnames syntactically work but almost always fail validation" limitation row; added the hard-close + no close_notify limitation row. - tls.rs: replaced "port-of-call is the hostname" with "top of stack"; expanded the race-avoidance comment so a future reader sees why free was deliberately skipped. - tcp.rs: corrected the box-rationale comment — the box isn't shrinking the enum discriminant, it's keeping the StreamOwned payload off the registry's per-allocation footprint. Deferred (per the review's own "not a blocker" markings): - Encapsulation refactor moving take_tcp + the install logic into tcp.rs to keep STREAMS / SocketRegistry private. Worth doing once the surface settles after PR4. - Happy-path TLS test with a same-process rustls server. PR4's HTTPS coverage closes this naturally. - Handshake timeout — folds into the "deadline pass" follow-up alongside PR2's connect-timeout gap.