bitwise followup #449

Merged
navicore merged 2 commits from bitwise into main 2026-05-03 04:58:19 +00:00
Owner

⏺ Done. Summary:

Three codegen paths to keep in sync — discovered mid-implementation:

  • crates/runtime/src/arithmetic/bitwise.rs — runtime FFI (only int_bits actually goes
    through it; the others are inlined).
  • crates/compiler/src/codegen/inline/ops.rs — standard inline LLVM IR
    (codegen_shift_compute, codegen_inline_int_unary_intrinsic).
  • crates/compiler/src/codegen/specialization/codegen_safe_math.rs and …/codegen_ops.rs
    — register-based specialization path (emit_specialized_safe_shift, popcount/clz/ctz).

All three now apply the same 63-bit clamp on shl/shr (out-of-range → 0) and the same
63-bit-aware adjustments on popcount (mask off bit 63), clz (saturating −1 of the i64
count), ctz (special-case v == 0 → 63), and int_bits (returns 63).

Tests:

  • tests/integration/src/test-bitwise.seq — updated test-clz (62), test-int-bits (63),
    test-clz-ctz-edge-cases (63 instead of 64), test-popcount-edge-cases
    (popcount(-1)=63). Added test-shl-overflow (boundary cases at 1 61 shl, 1 62 shl, -1
    62 shl, -1 63 shl) and test-shr-negative (-1 1 shr=0, -1 2 shr=I63_MAX, -1 62 shr=3).
  • tests/integration/src/test-std-imath.seq — added
    test-int-to-bits-of-negative-terminates (the original infinite-loop regression check).

Docs:

  • docs/language-guide.md — added "Int is 63-bit" callout, corrected the int-bits table
    value, replaced the broken -1 1 shr example with two truthful ones (1 61 shl
    2305843009213693952, 1 62 shl 0, -1 2 shr 2^62-1, -1 1 shr 0), and rewrote the Shift
    Behavior bullets.
  • docs/STDLIB_REFERENCE.md — int-bits is 63 with a pointer to the language guide.
  • examples/projects/hackers-delight/README.md — int-bits is 63. The log2-floor example
    continues to work unchanged because int-bits and clz moved together (verified by
    running it: outputs 0,1,1,2,2,3,6 for inputs 1,2,3,4,7,8,100).
  • crates/compiler/stdlib/imath.seq — header note now says negatives terminate but
    produce lossy output (rather than "will not terminate").

just ci green: fmt, clippy -D warnings, all unit/integration/example tests, seq lint.

⏺ Done. Summary: Three codegen paths to keep in sync — discovered mid-implementation: - crates/runtime/src/arithmetic/bitwise.rs — runtime FFI (only int_bits actually goes through it; the others are inlined). - crates/compiler/src/codegen/inline/ops.rs — standard inline LLVM IR (codegen_shift_compute, codegen_inline_int_unary_intrinsic). - crates/compiler/src/codegen/specialization/codegen_safe_math.rs and …/codegen_ops.rs — register-based specialization path (emit_specialized_safe_shift, popcount/clz/ctz). All three now apply the same 63-bit clamp on shl/shr (out-of-range → 0) and the same 63-bit-aware adjustments on popcount (mask off bit 63), clz (saturating −1 of the i64 count), ctz (special-case v == 0 → 63), and int_bits (returns 63). Tests: - tests/integration/src/test-bitwise.seq — updated test-clz (62), test-int-bits (63), test-clz-ctz-edge-cases (63 instead of 64), test-popcount-edge-cases (popcount(-1)=63). Added test-shl-overflow (boundary cases at 1 61 shl, 1 62 shl, -1 62 shl, -1 63 shl) and test-shr-negative (-1 1 shr=0, -1 2 shr=I63_MAX, -1 62 shr=3). - tests/integration/src/test-std-imath.seq — added test-int-to-bits-of-negative-terminates (the original infinite-loop regression check). Docs: - docs/language-guide.md — added "Int is 63-bit" callout, corrected the int-bits table value, replaced the broken -1 1 shr example with two truthful ones (1 61 shl 2305843009213693952, 1 62 shl 0, -1 2 shr 2^62-1, -1 1 shr 0), and rewrote the Shift Behavior bullets. - docs/STDLIB_REFERENCE.md — int-bits is 63 with a pointer to the language guide. - examples/projects/hackers-delight/README.md — int-bits is 63. The log2-floor example continues to work unchanged because int-bits and clz moved together (verified by running it: outputs 0,1,1,2,2,3,6 for inputs 1,2,3,4,7,8,100). - crates/compiler/stdlib/imath.seq — header note now says negatives terminate but produce lossy output (rather than "will not terminate"). just ci green: fmt, clippy -D warnings, all unit/integration/example tests, seq lint.
bitwise followup
Some checks are pending
CI - Linux / CI - Linux x86_64 (pull_request) Waiting to run
b3b3fe047e
⏺ Done. Summary:

  Three codegen paths to keep in sync — discovered mid-implementation:
  - crates/runtime/src/arithmetic/bitwise.rs — runtime FFI (only int_bits actually goes
  through it; the others are inlined).
  - crates/compiler/src/codegen/inline/ops.rs — standard inline LLVM IR
  (codegen_shift_compute, codegen_inline_int_unary_intrinsic).
  - crates/compiler/src/codegen/specialization/codegen_safe_math.rs and …/codegen_ops.rs
   — register-based specialization path (emit_specialized_safe_shift, popcount/clz/ctz).

  All three now apply the same 63-bit clamp on shl/shr (out-of-range → 0) and the same
  63-bit-aware adjustments on popcount (mask off bit 63), clz (saturating −1 of the i64
  count), ctz (special-case v == 0 → 63), and int_bits (returns 63).

  Tests:
  - tests/integration/src/test-bitwise.seq — updated test-clz (62), test-int-bits (63),
  test-clz-ctz-edge-cases (63 instead of 64), test-popcount-edge-cases
  (popcount(-1)=63). Added test-shl-overflow (boundary cases at 1 61 shl, 1 62 shl, -1
  62 shl, -1 63 shl) and test-shr-negative (-1 1 shr=0, -1 2 shr=I63_MAX, -1 62 shr=3).
  - tests/integration/src/test-std-imath.seq — added
  test-int-to-bits-of-negative-terminates (the original infinite-loop regression check).

  Docs:
  - docs/language-guide.md — added "Int is 63-bit" callout, corrected the int-bits table
   value, replaced the broken -1 1 shr example with two truthful ones (1 61 shl
  2305843009213693952, 1 62 shl 0, -1 2 shr 2^62-1, -1 1 shr 0), and rewrote the Shift
  Behavior bullets.
  - docs/STDLIB_REFERENCE.md — int-bits is 63 with a pointer to the language guide.
  - examples/projects/hackers-delight/README.md — int-bits is 63. The log2-floor example
   continues to work unchanged because int-bits and clz moved together (verified by
  running it: outputs 0,1,1,2,2,3,6 for inputs 1,2,3,4,7,8,100).
  - crates/compiler/stdlib/imath.seq — header note now says negatives terminate but
  produce lossy output (rather than "will not terminate").

  just ci green: fmt, clippy -D warnings, all unit/integration/example tests, seq lint.
Author
Owner

Reviewed against docs/design/TAGGED_INT_BITWISE.md. The 63-bit semantics are implemented correctly across all three paths and the test boundaries (1 61 shl, 1 62 shl, -1 62 shl, -1 63 shl, -1 1 shr, -1 2 shr) all check out against the math. A few notes, none blocking.

1. The 63-bit clamp logic is duplicated between inline/ops.rs and specialization/codegen_safe_math.rs

emit_clamp_to_i63 (crates/compiler/src/codegen/inline/ops.rs:280) is pub(in crate::codegen) fn on impl CodeGen, and emit_specialized_safe_shift (crates/compiler/src/codegen/specialization/codegen_safe_math.rs:150) is also impl CodeGen, so the helper is in scope. The latter inlines an identical 4-instruction fits_min/fits_max/and/select sequence (lines 217–230 in the new file) instead of calling the helper. Same goes for the popcount mask, the clz saturating-minus-1, and the ctz v==0 → 63 shape — each is open-coded in both inline/ops.rs and specialization/codegen_ops.rs. A handful of pub(in crate::codegen) helpers (emit_popcount_i63, emit_clz_i63, emit_ctz_i63) would collapse the duplication and remove the risk of the two paths drifting if the encoding ever changes.

2. The 63-bit constants are repeated as magic numbers in IR-emitting code

-4611686018427387904, 4611686018427387903, and 9223372036854775807 appear unannotated in inline/ops.rs and codegen_safe_math.rs. The runtime gives them names (I63_MIN, I63_MAX, I63_MASK in crates/runtime/src/arithmetic/bitwise.rs:13–18); the codegen does not. If the tag scheme ever moves (the design doc treats 63-bit as load-bearing but not eternal), the IR strings are easy to miss in a search. A named const I63_MIN_LIT: &str = "-4611686018427387904" next to the helpers — or just format!("{}", seqc::tagged::I63_MIN) — would tie all the call sites to one source of truth.

3. Latent (pre-existing) overflow in the runtime shift count cast

crates/runtime/src/arithmetic/bitwise.rs:111 and :138 both do c as u32 after the c < 0 guard. For any c > u32::MAX (allowed by the 63-bit Int range), the cast truncates the high bits, so e.g. 1 4294967296 shl would compute 1.checked_shl(0) = 1 instead of clamping to 0. The PR description notes this FFI is dead for shl/shr ("only int_bits actually goes through it"), so this is unobservable today — but if that ever changes, the contract documented at the top of the file ("counts >= 64 ... return 0") is silently violated. Cheap to fix here while the file is open: if c < 0 || c >= 64 { 0 } else { v.checked_shl(c as u32)... }.

4. (Nit, observation only) panic! catch-all in the inline intrinsic dispatch

crates/compiler/src/codegen/inline/ops.rs:402 panics on an unknown intrinsic name. That's fine for an internal dispatch, but it does turn a previously total if/else into something that can crash codegen if a fourth intrinsic is wired in upstream without updating this match. A CodeGenError would degrade more gracefully; a comment pointing at the caller in codegen_inline_int_unary_intrinsic's sole call site would be enough if you'd rather keep the panic.

Tests, docs, and the cross-path math all line up. Nice cleanup of the lying int-bits and the example in the language guide.

Reviewed against `docs/design/TAGGED_INT_BITWISE.md`. The 63-bit semantics are implemented correctly across all three paths and the test boundaries (`1 61 shl`, `1 62 shl`, `-1 62 shl`, `-1 63 shl`, `-1 1 shr`, `-1 2 shr`) all check out against the math. A few notes, none blocking. **1. The 63-bit clamp logic is duplicated between `inline/ops.rs` and `specialization/codegen_safe_math.rs`** `emit_clamp_to_i63` (`crates/compiler/src/codegen/inline/ops.rs:280`) is `pub(in crate::codegen) fn` on `impl CodeGen`, and `emit_specialized_safe_shift` (`crates/compiler/src/codegen/specialization/codegen_safe_math.rs:150`) is also `impl CodeGen`, so the helper is in scope. The latter inlines an identical 4-instruction `fits_min`/`fits_max`/`and`/`select` sequence (lines 217–230 in the new file) instead of calling the helper. Same goes for the popcount mask, the clz `saturating-minus-1`, and the ctz `v==0 → 63` shape — each is open-coded in both `inline/ops.rs` and `specialization/codegen_ops.rs`. A handful of `pub(in crate::codegen)` helpers (`emit_popcount_i63`, `emit_clz_i63`, `emit_ctz_i63`) would collapse the duplication and remove the risk of the two paths drifting if the encoding ever changes. **2. The 63-bit constants are repeated as magic numbers in IR-emitting code** `-4611686018427387904`, `4611686018427387903`, and `9223372036854775807` appear unannotated in `inline/ops.rs` and `codegen_safe_math.rs`. The runtime gives them names (`I63_MIN`, `I63_MAX`, `I63_MASK` in `crates/runtime/src/arithmetic/bitwise.rs:13–18`); the codegen does not. If the tag scheme ever moves (the design doc treats 63-bit as load-bearing but not eternal), the IR strings are easy to miss in a search. A named `const I63_MIN_LIT: &str = "-4611686018427387904"` next to the helpers — or just `format!("{}", seqc::tagged::I63_MIN)` — would tie all the call sites to one source of truth. **3. Latent (pre-existing) overflow in the runtime shift count cast** `crates/runtime/src/arithmetic/bitwise.rs:111` and `:138` both do `c as u32` after the `c < 0` guard. For any `c > u32::MAX` (allowed by the 63-bit Int range), the cast truncates the high bits, so e.g. `1 4294967296 shl` would compute `1.checked_shl(0) = 1` instead of clamping to 0. The PR description notes this FFI is dead for `shl`/`shr` ("only `int_bits` actually goes through it"), so this is unobservable today — but if that ever changes, the contract documented at the top of the file ("counts >= 64 ... return 0") is silently violated. Cheap to fix here while the file is open: `if c < 0 || c >= 64 { 0 } else { v.checked_shl(c as u32)... }`. **4. (Nit, observation only) `panic!` catch-all in the inline intrinsic dispatch** `crates/compiler/src/codegen/inline/ops.rs:402` panics on an unknown intrinsic name. That's fine for an internal dispatch, but it does turn a previously total `if/else` into something that can crash codegen if a fourth intrinsic is wired in upstream without updating this match. A `CodeGenError` would degrade more gracefully; a comment pointing at the caller in `codegen_inline_int_unary_intrinsic`'s sole call site would be enough if you'd rather keep the panic. Tests, docs, and the cross-path math all line up. Nice cleanup of the lying `int-bits` and the example in the language guide.
pr review response
All checks were successful
CI - Linux / CI - Linux x86_64 (pull_request) Successful in 9m38s
3a6b636c23
navicore deleted branch bitwise 2026-05-03 04:58:20 +00:00
Sign in to join this conversation.
No description provided.