navicore/patch-rexx

Fork 0

⏺ Phase 12A is complete. Here's a summary of all changes: #15

Merged

navicore merged 2 commits from phase12a into main

2026-02-11 05:11:31 +00:00

navicore commented

2026-02-11 04:59:50 +00:00

(Migrated from github.com)

Phase 12A: ANSI Conformance Fixes — Complete

Changes Made

src/parser.rs — Added parsing for NUMERIC, PUSH, QUEUE instructions

parse_numeric(): handles NUMERIC DIGITS [expr], NUMERIC FORM SCIENTIFIC|ENGINEERING|VALUE expr, NUMERIC FUZZ [expr]
parse_push() / parse_queue(): parse optional expression argument

src/eval.rs — Implemented execution

Added VecDeque data queue to Evaluator
exec_numeric(): validates and updates self.settings.digits/form/fuzz
exec_push(): LIFO — adds to front of queue
exec_queue(): FIFO — adds to back of queue
pull_from_queue_or_stdin(): checks queue first, falls back to stdin
Wired PARSE PULL and standalone PULL to use the data queue

src/lexer.rs — Newlines and line continuation

Newlines (\n) now produce Eol tokens (clause terminators), fixing multiline REXX programs
skip_whitespace_and_comments() no longer skips newlines
Trailing-comma line continuation: when last significant token before newline is ,, the comma and newline are consumed, joining with the next line

src/builtins.rs — New BIFs

DIGITS(), FORM(), FUZZ(): return current numeric settings
QUEUED(): returns number of items in the data queue

tests/phase12a.rs — 28 integration tests covering all new features

Test Results

481 tests pass (87 unit + 366 integration + 28 phase12a), including 24 LSP tests
Clippy clean in both default and --features lsp modes
Zero regressions in phases 1–11

Phase 12A: ANSI Conformance Fixes — Complete Changes Made src/parser.rs — Added parsing for NUMERIC, PUSH, QUEUE instructions - parse_numeric(): handles NUMERIC DIGITS [expr], NUMERIC FORM SCIENTIFIC|ENGINEERING|VALUE expr, NUMERIC FUZZ [expr] - parse_push() / parse_queue(): parse optional expression argument src/eval.rs — Implemented execution - Added VecDeque<String> data queue to Evaluator - exec_numeric(): validates and updates self.settings.digits/form/fuzz - exec_push(): LIFO — adds to front of queue - exec_queue(): FIFO — adds to back of queue - pull_from_queue_or_stdin(): checks queue first, falls back to stdin - Wired PARSE PULL and standalone PULL to use the data queue src/lexer.rs — Newlines and line continuation - Newlines (\n) now produce Eol tokens (clause terminators), fixing multiline REXX programs - skip_whitespace_and_comments() no longer skips newlines - Trailing-comma line continuation: when last significant token before newline is ,, the comma and newline are consumed, joining with the next line src/builtins.rs — New BIFs - DIGITS(), FORM(), FUZZ(): return current numeric settings - QUEUED(): returns number of items in the data queue tests/phase12a.rs — 28 integration tests covering all new features Test Results - 481 tests pass (87 unit + 366 integration + 28 phase12a), including 24 LSP tests - Clippy clean in both default and --features lsp modes - Zero regressions in phases 1–11

claude[bot] commented

2026-02-11 05:02:05 +00:00

(Migrated from github.com)

PR Review: Phase 12A ANSI Conformance Fixes

I've reviewed the changes for NUMERIC instructions, PUSH/QUEUE/PULL data queue, new BIFs, and line continuation. Here are my findings:

Critical Issues

1. NUMERIC DIGITS overflow vulnerability (src/eval.rs:1117-1125)

The code converts to_integer result directly to u32 without checking if it exceeds u32::MAX:

let n = self.to_integer(&val)?;
if n < 1 {
    return Err(...);
}
#[allow(clippy::cast_possible_truncation, clippy::cast_sign_loss)]
{
    n as u32  // ⚠️ No upper bound check!
}

Impact: NUMERIC DIGITS 5000000000 (5 billion) would silently truncate, causing incorrect arithmetic precision.

Fix: Add upper bound validation before cast:

if n > u32::MAX as i64 {
    return Err(RexxDiagnostic::new(RexxError::InvalidWholeNumber)
        .with_detail("NUMERIC DIGITS value too large"));
}

ANSI X3.274-1996 §9.3.2 requires implementations to support at least DIGITS 9, but allows implementations to set maximum limits. Per spec, exceeding implementation limits should raise Error 26 (Invalid whole number).

2. NUMERIC FUZZ overflow (src/eval.rs:1161-1169)

Same vulnerability as DIGITS — missing upper bound check before n as u32 cast.

Arithmetic & Error Handling ✓

Division by zero handling is correct for all division operations:

Regular division (src/eval.rs:1829-1831) ✓
Integer division (src/eval.rs:1847-1849) ✓
Remainder (src/eval.rs:1862-1864) ✓

All use Error 42 (ArithmeticOverflow) per ANSI spec §A.4.42 — correct error number for division by zero.

REXX Semantics ✓

PUSH/QUEUE/PULL implementation follows REXX semantics correctly:

PUSH uses LIFO (push_front) — verified by test at line 756
QUEUE uses FIFO (push_back) — verified by test at line 770
PARSE PULL does not uppercase (correct)
Shorthand PULL uppercases (test at line 830)
Queue checked before stdin in pull_from_queue_or_stdin (src/eval.rs:1217)

QUEUED() BIF correctly returns self.queue.len() — no off-by-one errors.

NUMERIC FORM validation is correct (src/eval.rs:1141-1149):

Only accepts "SCIENTIFIC" or "ENGINEERING"
Uses Error 25 (InvalidSubKeyword) — correct per ANSI §A.4.25

Lexer Line Continuation

Potential edge case in is_line_continuation (src/lexer.rs:149-187):

The block comment scanning doesn't handle unterminated comments at end of line:

x = 1 +, /* unterminated comment

Current code would scan past EOF looking for */, then return true (treating EOF as continuation). This is benign — the unterminated comment would be caught by the later comment parsing in skip_whitespace_and_comments, producing Error 6 (UnmatchedComment).

Not a bug, but the control flow is indirect. Consider adding a safety check:

if depth > 0 && i >= self.source.len() {
    // Unterminated comment - not a continuation
    return false;
}

Test Coverage ✓

Phase12a tests are comprehensive:

28 tests covering NUMERIC DIGITS/FORM/FUZZ
PUSH/QUEUE ordering (LIFO vs FIFO)
Line continuation (basic, multiple, with comments)
Edge cases: empty PUSH, FUZZ ≥ DIGITS validation

Missing test: NUMERIC DIGITS with value > u32::MAX (would expose the overflow bug).

Performance

No performance issues detected:

VecDeque for queue is correct choice (O(1) push/pop on both ends)
No unnecessary clones in queue operations
String allocations in PUSH/QUEUE are unavoidable ("everything is a string")

Summary

Must fix before merge:

Add upper bound validation for NUMERIC DIGITS (prevent truncation > u32::MAX)
Add upper bound validation for NUMERIC FUZZ

Recommended:
3. Add test case for large NUMERIC DIGITS values

The implementation is otherwise sound. Division by zero handling is correct, REXX semantics are properly followed, and the queue implementation matches the ANSI spec.

## PR Review: Phase 12A ANSI Conformance Fixes I've reviewed the changes for NUMERIC instructions, PUSH/QUEUE/PULL data queue, new BIFs, and line continuation. Here are my findings: ### Critical Issues **1. NUMERIC DIGITS overflow vulnerability (src/eval.rs:1117-1125)** The code converts `to_integer` result directly to `u32` without checking if it exceeds `u32::MAX`: ```rust let n = self.to_integer(&val)?; if n < 1 { return Err(...); } #[allow(clippy::cast_possible_truncation, clippy::cast_sign_loss)] { n as u32 // ⚠️ No upper bound check! } ``` **Impact**: `NUMERIC DIGITS 5000000000` (5 billion) would silently truncate, causing incorrect arithmetic precision. **Fix**: Add upper bound validation before cast: ```rust if n > u32::MAX as i64 { return Err(RexxDiagnostic::new(RexxError::InvalidWholeNumber) .with_detail("NUMERIC DIGITS value too large")); } ``` ANSI X3.274-1996 §9.3.2 requires implementations to support at least DIGITS 9, but allows implementations to set maximum limits. Per spec, exceeding implementation limits should raise Error 26 (Invalid whole number). --- **2. NUMERIC FUZZ overflow (src/eval.rs:1161-1169)** Same vulnerability as DIGITS — missing upper bound check before `n as u32` cast. --- ### Arithmetic & Error Handling ✓ Division by zero handling is **correct** for all division operations: - Regular division (src/eval.rs:1829-1831) ✓ - Integer division (src/eval.rs:1847-1849) ✓ - Remainder (src/eval.rs:1862-1864) ✓ All use Error 42 (ArithmeticOverflow) per ANSI spec §A.4.42 — correct error number for division by zero. --- ### REXX Semantics ✓ **PUSH/QUEUE/PULL implementation** follows REXX semantics correctly: - PUSH uses LIFO (`push_front`) — verified by test at line 756 - QUEUE uses FIFO (`push_back`) — verified by test at line 770 - PARSE PULL does not uppercase (correct) - Shorthand PULL uppercases (test at line 830) - Queue checked before stdin in `pull_from_queue_or_stdin` (src/eval.rs:1217) **QUEUED() BIF** correctly returns `self.queue.len()` — no off-by-one errors. **NUMERIC FORM** validation is correct (src/eval.rs:1141-1149): - Only accepts "SCIENTIFIC" or "ENGINEERING" - Uses Error 25 (InvalidSubKeyword) — correct per ANSI §A.4.25 --- ### Lexer Line Continuation **Potential edge case** in `is_line_continuation` (src/lexer.rs:149-187): The block comment scanning doesn't handle **unterminated comments** at end of line: ```rexx x = 1 +, /* unterminated comment ``` Current code would scan past EOF looking for `*/`, then return `true` (treating EOF as continuation). This is **benign** — the unterminated comment would be caught by the later comment parsing in `skip_whitespace_and_comments`, producing Error 6 (UnmatchedComment). Not a bug, but the control flow is indirect. Consider adding a safety check: ```rust if depth > 0 && i >= self.source.len() { // Unterminated comment - not a continuation return false; } ``` --- ### Test Coverage ✓ Phase12a tests are comprehensive: - 28 tests covering NUMERIC DIGITS/FORM/FUZZ - PUSH/QUEUE ordering (LIFO vs FIFO) - Line continuation (basic, multiple, with comments) - Edge cases: empty PUSH, FUZZ ≥ DIGITS validation **Missing test**: NUMERIC DIGITS with value > u32::MAX (would expose the overflow bug). --- ### Performance No performance issues detected: - `VecDeque` for queue is correct choice (O(1) push/pop on both ends) - No unnecessary clones in queue operations - String allocations in PUSH/QUEUE are unavoidable ("everything is a string") --- ## Summary **Must fix before merge**: 1. Add upper bound validation for NUMERIC DIGITS (prevent truncation > u32::MAX) 2. Add upper bound validation for NUMERIC FUZZ **Recommended**: 3. Add test case for large NUMERIC DIGITS values The implementation is otherwise sound. Division by zero handling is correct, REXX semantics are properly followed, and the queue implementation matches the ANSI spec.

claude[bot] commented

2026-02-11 05:10:12 +00:00

(Migrated from github.com)

Code Review: Phase 12A ANSI Conformance Fixes

I've reviewed this PR focusing on correctness, REXX semantics, error handling, and potential issues. The implementation is solid overall, but I've identified one critical bug and several improvement areas.

What's Working Well

Correct error numbers: Using Error 25 (InvalidSubKeyword) for NUMERIC FORM and Error 26 (InvalidWholeNumber) for DIGITS/FUZZ validation
NUMERIC DIGITS validation: Properly checks for positive values and rejects >= u32::MAX
FUZZ < DIGITS constraint: Correctly enforced per ANSI spec
Queue semantics: PUSH (LIFO) and QUEUE (FIFO) implemented correctly with VecDeque
Line continuation: Trailing comma handling looks reasonable with nested comment support
BIF implementations: DIGITS(), FORM(), FUZZ(), QUEUED() are straightforward and correct
Test coverage: 28 new tests cover main paths well

Critical Issue

Line continuation with unterminated block comments can cause infinite loop or panic

Location: src/lexer.rs:149-187 (is_line_continuation())

Problem: The block comment handling in is_line_continuation() doesn't check for EOF while skipping nested comments. If a trailing comma is followed by an unterminated /* comment, the while loop can exit with depth > 0, but the function continues instead of returning an error.

Impact: Could cause incorrect parsing or allow unterminated comments to be silently accepted during line continuation detection.

Fix needed: After the while loop exits at line 178, add:

if depth > 0 {
    // Unterminated comment - not a valid continuation
    return false;
}

Test case to add:

x = 1 +, /* unterminated
say x

This should fail with an unterminated comment error, not silently continue.

ANSI Compliance & Semantics Issues

1. PARSE PULL does not uppercase in tests, but spec is unclear

Locations:

tests/phase12a.rs:196, tests/phase12a.rs:226, tests/phase12a.rs:343
src/eval.rs:419-425

Issue: Test comments say "PARSE PULL does not uppercase (only shorthand PULL does)" but the code at src/eval.rs:905 shows that ParseSource::Pull calls pull_from_queue_or_stdin() which returns raw text, and then the PARSE instruction applies uppercasing based on the upper flag.

Questions:

Does PARSE PULL default to uppercase or case-preserving in ANSI REXX?
Should it be PARSE UPPER PULL vs PARSE PULL to control case?

Verify: Check if the parser distinguishes PARSE PULL from PARSE UPPER PULL and if tests cover both cases.

2. NUMERIC FORM bare keyword defaults to SCIENTIFIC

Location: src/parser.rs:1524 (line 522 in diff)

Question: Is this correct per ANSI X3.274-1996? Some REXX implementations treat bare NUMERIC FORM as an error. Please verify this is the intended default behavior.

Potential Performance Issues

String clones in PUSH/QUEUE

Location: src/eval.rs:1200, src/eval.rs:1215

This is called on every PUSH/QUEUE. Since RexxValue already owns the string, you could use into_string() to avoid the clone if the value isn't used afterward.

Test Coverage Gaps

Line continuation edge cases:
- Missing: Unterminated comment after comma (critical bug case)
- Missing: Continuation with line comment after comma
- Missing: Comma at absolute EOF
NUMERIC DIGITS edge cases:
- Missing: NUMERIC DIGITS 0 (should fail with Error 26)
- Missing: NUMERIC DIGITS with non-integer like 5.5 (should fail)
NUMERIC FUZZ edge cases:
- Missing: Negative FUZZ values (has validation but no test)
- Missing: FUZZ affecting comparison operations
Queue edge cases:
- Missing: Multiple PULLs from empty queue (should read stdin each time)
- Missing: QUEUED() during PULL operations

Recommendations

Must fix before merge:

Fix the unterminated comment handling in is_line_continuation() (critical bug)
Add test for unterminated comment edge case
Verify and document PARSE PULL uppercase behavior per ANSI spec

Should consider:

Add tests for NUMERIC DIGITS 0 and non-integer values
Verify NUMERIC FORM bare keyword behavior against spec
Consider micro-optimization for queue string clones

Nice to have:

Add more FUZZ-related tests (negative values, comparison behavior)
Test queue + stdin interaction edge cases

Summary

The implementation correctly handles the ANSI REXX semantics for NUMERIC settings and the data queue. Error handling is proper with correct error numbers. The critical issue is in the lexer's line continuation logic with unterminated comments. Fix that, verify the PARSE PULL uppercase behavior, and this will be ready to merge.

Overall assessment: Solid implementation with one critical fix needed.

File-specific issues cited:

src/lexer.rs:162-178 - unterminated comment handling
src/parser.rs:522 - verify bare NUMERIC FORM default
src/eval.rs:1200,1215 - potential string clone optimization
Test gaps noted above

## Code Review: Phase 12A ANSI Conformance Fixes I've reviewed this PR focusing on correctness, REXX semantics, error handling, and potential issues. The implementation is solid overall, but I've identified **one critical bug** and several improvement areas. --- ### What's Working Well 1. **Correct error numbers**: Using Error 25 (InvalidSubKeyword) for NUMERIC FORM and Error 26 (InvalidWholeNumber) for DIGITS/FUZZ validation 2. **NUMERIC DIGITS validation**: Properly checks for positive values and rejects >= u32::MAX 3. **FUZZ < DIGITS constraint**: Correctly enforced per ANSI spec 4. **Queue semantics**: PUSH (LIFO) and QUEUE (FIFO) implemented correctly with VecDeque 5. **Line continuation**: Trailing comma handling looks reasonable with nested comment support 6. **BIF implementations**: DIGITS(), FORM(), FUZZ(), QUEUED() are straightforward and correct 7. **Test coverage**: 28 new tests cover main paths well --- ### Critical Issue **Line continuation with unterminated block comments can cause infinite loop or panic** **Location**: src/lexer.rs:149-187 (is_line_continuation()) **Problem**: The block comment handling in is_line_continuation() doesn't check for EOF while skipping nested comments. If a trailing comma is followed by an unterminated /* comment, the while loop can exit with depth > 0, but the function continues instead of returning an error. **Impact**: Could cause incorrect parsing or allow unterminated comments to be silently accepted during line continuation detection. **Fix needed**: After the while loop exits at line 178, add: ```rust if depth > 0 { // Unterminated comment - not a valid continuation return false; } ``` **Test case to add**: ```rexx x = 1 +, /* unterminated say x ``` This should fail with an unterminated comment error, not silently continue. --- ### ANSI Compliance & Semantics Issues #### 1. PARSE PULL does not uppercase in tests, but spec is unclear **Locations**: - tests/phase12a.rs:196, tests/phase12a.rs:226, tests/phase12a.rs:343 - src/eval.rs:419-425 **Issue**: Test comments say "PARSE PULL does not uppercase (only shorthand PULL does)" but the code at src/eval.rs:905 shows that ParseSource::Pull calls pull_from_queue_or_stdin() which returns raw text, and then the PARSE instruction applies uppercasing based on the upper flag. **Questions**: 1. Does PARSE PULL default to uppercase or case-preserving in ANSI REXX? 2. Should it be PARSE UPPER PULL vs PARSE PULL to control case? **Verify**: Check if the parser distinguishes PARSE PULL from PARSE UPPER PULL and if tests cover both cases. #### 2. NUMERIC FORM bare keyword defaults to SCIENTIFIC **Location**: src/parser.rs:1524 (line 522 in diff) **Question**: Is this correct per ANSI X3.274-1996? Some REXX implementations treat bare NUMERIC FORM as an error. Please verify this is the intended default behavior. --- ### Potential Performance Issues **String clones in PUSH/QUEUE** **Location**: src/eval.rs:1200, src/eval.rs:1215 This is called on every PUSH/QUEUE. Since RexxValue already owns the string, you could use into_string() to avoid the clone if the value isn't used afterward. --- ### Test Coverage Gaps 1. **Line continuation edge cases**: - Missing: Unterminated comment after comma (critical bug case) - Missing: Continuation with line comment after comma - Missing: Comma at absolute EOF 2. **NUMERIC DIGITS edge cases**: - Missing: NUMERIC DIGITS 0 (should fail with Error 26) - Missing: NUMERIC DIGITS with non-integer like 5.5 (should fail) 3. **NUMERIC FUZZ edge cases**: - Missing: Negative FUZZ values (has validation but no test) - Missing: FUZZ affecting comparison operations 4. **Queue edge cases**: - Missing: Multiple PULLs from empty queue (should read stdin each time) - Missing: QUEUED() during PULL operations --- ### Recommendations **Must fix before merge**: 1. Fix the unterminated comment handling in is_line_continuation() (critical bug) 2. Add test for unterminated comment edge case 3. Verify and document PARSE PULL uppercase behavior per ANSI spec **Should consider**: 1. Add tests for NUMERIC DIGITS 0 and non-integer values 2. Verify NUMERIC FORM bare keyword behavior against spec 3. Consider micro-optimization for queue string clones **Nice to have**: 1. Add more FUZZ-related tests (negative values, comparison behavior) 2. Test queue + stdin interaction edge cases --- ### Summary The implementation correctly handles the ANSI REXX semantics for NUMERIC settings and the data queue. Error handling is proper with correct error numbers. The critical issue is in the lexer's line continuation logic with unterminated comments. Fix that, verify the PARSE PULL uppercase behavior, and this will be ready to merge. **Overall assessment**: Solid implementation with one critical fix needed. File-specific issues cited: - src/lexer.rs:162-178 - unterminated comment handling - src/parser.rs:522 - verify bare NUMERIC FORM default - src/eval.rs:1200,1215 - potential string clone optimization - Test gaps noted above