issue-499 #502

Merged
navicore merged 1 commit from issue-499-b into main 2026-05-20 17:37:36 +00:00
Owner

⏺ Done. The change is exactly what we promised:

#499

Modified (Rust only — no language change):

  • crates/core/src/value.rs — added ChannelMsg::{Value, Closed} (mirrors
    WeaveMessage) and closed: Arc to ChannelData
  • crates/core/src/stack.rs — one test's manual ChannelData { … } constructor
  • crates/runtime/src/channel.rs — chan.make/chan.send/chan.receive/chan.close now
    route through ChannelMsg with CAS-guarded sentinel + Go-style re-broadcast in
    receive
  • crates/runtime/src/channel/tests.rs — 5 new unit tests including MPMC drain and
    close idempotency

Added (test coverage only):

  • tests/integration/src/test-channel-close.seq — 4 end-to-end tests including the
    verbatim #499 cross-strand scenario

Touched docs:

  • docs/STDLIB_REFERENCE.md and docs/language-guide.md — chan.close description now
    accurate
  • docs/design/CHAN_CLOSE_SEMANTICS.md — the rewritten design doc

Results:

  • 547 integration tests pass (was 543 — the 4 new ones)
  • 16 channel unit tests pass (was 11)
  • just ci green: fmt, clippy -D warnings, unit tests, examples, integration, seqc
    lint, no_dead_code
  • Zero .seq source files migrated. Every existing program — including the
    actor_counters example, the stdlib, and the seqlings curriculum — keeps working
    with Channel as a single type and chan.close doing what every doc said it does.

The user-visible API is identical; the bug stayed entirely inside the runtime where
it belonged. The WeaveMessage pattern at crates/core/src/value.rs:43-51 did all
the work — same idea applied to plain channels.

⏺ Done. The change is exactly what we promised: https://git.navicore.tech/navicore/patch-seq/issues/499 Modified (Rust only — no language change): - crates/core/src/value.rs — added ChannelMsg::{Value, Closed} (mirrors WeaveMessage) and closed: Arc<AtomicBool> to ChannelData - crates/core/src/stack.rs — one test's manual ChannelData { … } constructor - crates/runtime/src/channel.rs — chan.make/chan.send/chan.receive/chan.close now route through ChannelMsg with CAS-guarded sentinel + Go-style re-broadcast in receive - crates/runtime/src/channel/tests.rs — 5 new unit tests including MPMC drain and close idempotency Added (test coverage only): - tests/integration/src/test-channel-close.seq — 4 end-to-end tests including the verbatim #499 cross-strand scenario Touched docs: - docs/STDLIB_REFERENCE.md and docs/language-guide.md — chan.close description now accurate - docs/design/CHAN_CLOSE_SEMANTICS.md — the rewritten design doc Results: - 547 integration tests pass (was 543 — the 4 new ones) - 16 channel unit tests pass (was 11) - just ci green: fmt, clippy -D warnings, unit tests, examples, integration, seqc lint, no_dead_code - Zero .seq source files migrated. Every existing program — including the actor_counters example, the stdlib, and the seqlings curriculum — keeps working with Channel as a single type and chan.close doing what every doc said it does. The user-visible API is identical; the bug stayed entirely inside the runtime where it belonged. The WeaveMessage pattern at crates/core/src/value.rs:43-51 did all the work — same idea applied to plain channels.
issue-499
All checks were successful
CI - Linux / CI - Linux x86_64 (pull_request) Successful in 8m33s
4466adf716
⏺ Done. The change is exactly what we promised:

  Modified (Rust only — no language change):
  - crates/core/src/value.rs — added ChannelMsg::{Value, Closed} (mirrors
  WeaveMessage) and closed: Arc<AtomicBool> to ChannelData
  - crates/core/src/stack.rs — one test's manual ChannelData { … } constructor
  - crates/runtime/src/channel.rs — chan.make/chan.send/chan.receive/chan.close now
  route through ChannelMsg with CAS-guarded sentinel + Go-style re-broadcast in
  receive
  - crates/runtime/src/channel/tests.rs — 5 new unit tests including MPMC drain and
  close idempotency

  Added (test coverage only):
  - tests/integration/src/test-channel-close.seq — 4 end-to-end tests including the
  verbatim #499 cross-strand scenario

  Touched docs:
  - docs/STDLIB_REFERENCE.md and docs/language-guide.md — chan.close description now
  accurate
  - docs/design/CHAN_CLOSE_SEMANTICS.md — the rewritten design doc

  Results:
  - 547 integration tests pass (was 543 — the 4 new ones)
  - 16 channel unit tests pass (was 11)
  - just ci green: fmt, clippy -D warnings, unit tests, examples, integration, seqc
  lint, no_dead_code
  - Zero .seq source files migrated. Every existing program — including the
  actor_counters example, the stdlib, and the seqlings curriculum — keeps working
  with Channel as a single type and chan.close doing what every doc said it does.

  The user-visible API is identical; the bug stayed entirely inside the runtime where
   it belonged. The WeaveMessage pattern at crates/core/src/value.rs:43-51 did all
  the work — same idea applied to plain channels.
Author
Owner

Review

Reading the diff + design doc + tests against the issue. Recommend
approve
— the approach is right and the scope is honest. Two
observations worth a response, neither a blocker.

What's good

  • The reuse of the WeaveMessage typed-sentinel pattern is the right
    call. It's already proven in the weave path; applying it to plain
    channels means there's no new abstraction to learn or maintain.
  • The "no language-visible change" claim is upheld in the diff:
    changes are confined to crates/core/src/value.rs,
    crates/runtime/src/channel.rs, doc strings, and tests. Stack
    effects and types are untouched. .seq source counts are unchanged.
  • compare_exchange(false, true, AcqRel, Acquire) is the right
    ordering pair for the closed flag — the AcqRel on success pairs
    with the Acquire load in chan.send, and the failure path's
    Acquire is enough to observe the prior store.
  • The Err(_) => (default, false) fallback in chan.receive for
    "all Arc<ChannelData> clones dropped" preserves the existing
    behaviour and is now correctly the only path you reach when there
    was no explicit chan.closeOk(Closed) handles the closed
    case explicitly. Comment at the call site is helpful.
  • The MPMC drain test (test_mpmc_drain_on_close) is the regression
    bar that matters: one close, N receivers wake. That's what the
    re-broadcast mechanic is buying us, and it's tested.
  • The integration test file mirrors the verbatim reproducer from
    the issue (test-499-cross-strand). Good.

Observations

1. Send-vs-close has a TOCTOU window. chan.send does an
Acquire load on closed and then calls sender.send(...). If a
concurrent chan.close wins between the load and the enqueue, a
Value(v) lands in the queue after Closed. Effect: a receiver
consumes Closed first, re-broadcasts, returns (default, false)
the late Value(v) is then orphaned in the buffer until every
Arc<ChannelData> drops.

This isn't worse than Go (which avoids the race by panicking on
send-after-close, behind a mutex) and is arguably better given
Seq's "errors are values" stance — but the doc-comment in
channel.rs says "user code reliably sees false after close"
and the STDLIB_REFERENCE.md table says "Returns false once the
channel has been closed", both of which slightly overstate the
guarantee. Two options:

  • Tighten the wording — "After chan.close, subsequent sends return
    false; sends concurrent with the close may succeed and their
    values may not be drained."
  • Or accept the gap silently if you think the wording is fine as a
    high-level summary. Either is defensible; just don't want a future
    reader to think the guarantee is stronger than it is.

No code change needed — the design is consistent with the
no-mutex-on-hot-path constraint in the design doc.

2. test_mpmc_drain_on_close uses thread::sleep(50ms) to ensure
the receivers reach recv() before close fires.
On a loaded CI
runner this could flake — receivers might not have parked yet,
in which case they may consume the sentinel + re-broadcast race in a
different order than expected. The current assertion (all four wake
with success=false) is robust to the order, so I think this will
hold in practice — but a std::sync::Barrier shared by the four
receivers (count it down once each reaches a "about to call recv"
checkpoint) would make the test deterministic. Follow-up, not a
blocker.

Minor

  • The "Revert" section in CHAN_CLOSE_SEMANTICS.md describes a
    previous breaking-change branch that needed undoing before this
    approach landed. That's done now (the diff shows no compiler/lsp
    changes), so that section is historical context — fine to leave,
    or trim once the PR merges if you'd rather the doc read as
    forward-looking.
  • The module doc-comment block at the top of channel.rs is a
    nice rewrite — the "Issue #499" anchor + pointer to the design
    doc is exactly the kind of breadcrumb future maintainers want.

Nice cleanup of a real footgun.

## Review Reading the diff + design doc + tests against the issue. **Recommend approve** — the approach is right and the scope is honest. Two observations worth a response, neither a blocker. ### What's good - The reuse of the `WeaveMessage` typed-sentinel pattern is the right call. It's already proven in the weave path; applying it to plain channels means there's no new abstraction to learn or maintain. - The "no language-visible change" claim is upheld in the diff: changes are confined to `crates/core/src/value.rs`, `crates/runtime/src/channel.rs`, doc strings, and tests. Stack effects and types are untouched. `.seq` source counts are unchanged. - `compare_exchange(false, true, AcqRel, Acquire)` is the right ordering pair for the closed flag — the AcqRel on success pairs with the `Acquire` load in `chan.send`, and the failure path's `Acquire` is enough to observe the prior store. - The `Err(_) => (default, false)` fallback in `chan.receive` for "all `Arc<ChannelData>` clones dropped" preserves the existing behaviour and is now correctly the only path you reach when there was no explicit `chan.close` — `Ok(Closed)` handles the closed case explicitly. Comment at the call site is helpful. - The MPMC drain test (`test_mpmc_drain_on_close`) is the regression bar that matters: one close, N receivers wake. That's what the re-broadcast mechanic is buying us, and it's tested. - The integration test file mirrors the verbatim reproducer from the issue (`test-499-cross-strand`). Good. ### Observations **1. Send-vs-close has a TOCTOU window.** `chan.send` does an `Acquire` load on `closed` and then calls `sender.send(...)`. If a concurrent `chan.close` wins between the load and the enqueue, a `Value(v)` lands in the queue *after* `Closed`. Effect: a receiver consumes `Closed` first, re-broadcasts, returns `(default, false)` — the late `Value(v)` is then orphaned in the buffer until every `Arc<ChannelData>` drops. This isn't worse than Go (which avoids the race by panicking on send-after-close, behind a mutex) and is arguably *better* given Seq's "errors are values" stance — but the doc-comment in `channel.rs` says "user code reliably sees `false` after close" and the `STDLIB_REFERENCE.md` table says "Returns `false` once the channel has been closed", both of which slightly overstate the guarantee. Two options: - Tighten the wording — "After `chan.close`, subsequent sends return `false`; sends concurrent with the close may succeed and their values may not be drained." - Or accept the gap silently if you think the wording is fine as a high-level summary. Either is defensible; just don't want a future reader to think the guarantee is stronger than it is. No code change needed — the design is consistent with the no-mutex-on-hot-path constraint in the design doc. **2. `test_mpmc_drain_on_close` uses `thread::sleep(50ms)` to ensure the receivers reach `recv()` before close fires.** On a loaded CI runner this could flake — receivers might not have parked yet, in which case they may consume the sentinel + re-broadcast race in a different order than expected. The current assertion (all four wake with `success=false`) is robust to the order, so I think this will hold in practice — but a `std::sync::Barrier` shared by the four receivers (count it down once each reaches a "about to call recv" checkpoint) would make the test deterministic. Follow-up, not a blocker. ### Minor - The "Revert" section in `CHAN_CLOSE_SEMANTICS.md` describes a previous breaking-change branch that needed undoing before this approach landed. That's done now (the diff shows no compiler/lsp changes), so that section is historical context — fine to leave, or trim once the PR merges if you'd rather the doc read as forward-looking. - The module doc-comment block at the top of `channel.rs` is a nice rewrite — the "Issue #499" anchor + pointer to the design doc is exactly the kind of breadcrumb future maintainers want. Nice cleanup of a real footgun.
navicore deleted branch issue-499-b 2026-05-20 17:37:37 +00:00
Sign in to join this conversation.
No description provided.