Pass 1 of src/parser.rs audit complete. #27
No reviewers
Labels
No labels
bug
dependencies
documentation
duplicate
enhancement
good first issue
help wanted
invalid
question
rust
wontfix
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
navicore/patch-rexx!27
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "parser-refactor"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Applied:
Result:
Code Review — PR #27 (parser.rs audit pass 1)
The refactoring is sound: the six helpers (G1–G6) are semantically equivalent to the code they replace, all 500+ tests pass, and the structural improvements are real. One performance regression was introduced by G1, and two pre-existing correctness issues became more visible.
Bug introduced — G1: per-clause
Stringallocation inparse_clauseFile:
src/parser.rs, line 167to_ascii_uppercase()heap-allocates a newStringon everyparse_clausecall, including all the common paths (assignments, command clauses, label definitions) wherenameis not a keyword at all. Before G1, the dispatch usedSelf::check_keyword(name, kw)which resolves toeq_ignore_ascii_case— a zero-allocation in-place comparison.This is a measurable regression for scripts with many clauses. The fix is to either store the uppercase form once before the match, or apply
eq_ignore_ascii_casecomparisons directly:Option A is the smallest mechanical fix and keeps the match readable.
Pre-existing —
parse_do_bodysilently accepts wrong END nameFile:
src/parser.rs, lines 462–467REXX spec §7.4.6: "The name in an END instruction, if specified, must be the same as the name of the control variable in the corresponding DO instruction." The code accepts
end jfor ado i = 1 to 10without error. The loop variable name is available viaDoBlock::name; the fix is to thread it intoparse_do_bodyand compare. Not introduced here, but worth tracking.Pre-existing — dead predicate in
parse_do_bodyFile:
src/parser.rs, line 463is_terminator()returns true only forSemicolon | Eol | Eof. ATokenKind::Symbolcan never be any of those, so!self.is_terminator()is alwaystruewhen the outer pattern matches. The guard is dead and should be removed to avoid misleading future readers. Also not introduced here.Everything else looks correct
finish_do: all six DO variants call it with the sameskip_terminators → parse_do_bodysequence as the original. ✓consume_unique_keyword: duplicate-keyword detection is preserved; error messages are string-identical. ✓is_keyword("THEN"|"ELSE"|"WITH"): identical to the deleted helpers since both callcheck_keyword→eq_ignore_ascii_case. ✓parse_optional_expression: EXIT / RETURN / PUSH / QUEUE / NUMERIC DIGITS / FUZZ all behave identically. ✓do_header_depthis still active whileconsume_unique_keywordparses sub-expressions (TO/BY/FOR/WHILE/UNTIL), matching the original scoping. ✓consume_unique_keywordpropagate after the keyword is consumed, same as the original. ✓Code Review — Pass 1 Audit (PR #27)
This is a clean refactoring pass. The six changes (G1–G6) are mechanically correct, and I found no behavioral regressions in the expression parser or the DO/IF/SELECT parsers. Comments below are issues I'm confident about; the refactoring itself is sound.
1.
finish_doreceivesnamebut doesn't pass it toparse_do_body— Error 21 never fires (CORRECTNESS)src/parser.rs:401src/parser.rs:539–546Per ANSI X3.274-1996 §6.6.6,
end jinsidedo i = 1 to 5must raise Error 21 (InvalidDataOnEnd). TheRexxError::InvalidDataOnEndvariant already exists (error.rs:68, error code 21 aterror.rs:136). The design doc (docs/design/end-name-validation.md) describes the exact fix — threadname.as_deref()intoparse_do_body— but it isn't applied in this PR.The
nameparameter was carefully added tofinish_do's signature, so the scaffolding is in place. The missing step is just wiring it through, and the design doc already shows the code. Risk: this can be inadvertently merged as-is sincejust ciis green and the gap is silent (no test currently catches it).Missing tests (also called out in the design doc's checkpoints):
do i = 1 to 5; nop; end i— should parse OK (regression guard)do i = 1 to 5; nop; end I— should parse OK (case-insensitive)do i = 1 to 5; nop; end j— should fail with Error 21do; nop; end x— should fail (Simple DO has no control variable)do forever; leave; end x— should fail (Forever DO has no control variable)2.
parse_select's END handler doesn't guard against a trailing Symbol (CORRECTNESS)src/parser.rs:573–576Per ANSI, SELECT's closing
ENDtakes no name (unlike DO).select; when 1 then say 'a'; end foosilently leavesfooin the token stream, which is then parsed as a stray command clause rather than raising Error 21. The fix is symmetric withparse_do_body: after advancing past END, if the next token is a Symbol, raiseRexxError::InvalidDataOnEnd.This was pre-existing but worth fixing in the same pass since
parse_do_body's END handling was explicitly redesigned here.3. Test coverage gaps for
consume_unique_keyworderror pathtests/phase2.rs:276,282— duplicate TO and BY are covered. Duplicate FOR (do i = 1 for 3 for 2), duplicate WHILE, and duplicate UNTIL have no tests. Sinceconsume_unique_keywordis the new helper that consolidates this logic, tests for all five keywords would fully exercise it.Minor observation (not a blocker)
src/parser.rs:228—self.peek_kind().clone()allocates aStringfor everyparse_clausecall (cloning the Symbol's innerStringto bindname). The doc comment onKeywordKind::lookupsays "zero-alloc" which is true forlookupitself but the call site still clones. Not introduced by this PR, but the comment is slightly misleading in context.Summary
parser.rs:401,539parser.rs:573tests/phase2.rsThe G1–G6 refactoring is correct and clean. The main blocker is #1:
finish_do'snameparameter needs to be threaded intoparse_do_bodyper the design doc before this lands.PR #27 Review — Pass 1 parser.rs audit
This is a clean, well-executed refactor. All six changes (G1–G6) preserve semantics faithfully. No regressions were found. The notes below are purely about gaps and one subtle pre-existing issue that became more visible through the refactor.
Correctness
No new bugs introduced. Each of the six refactors is a mechanical equivalence transform:
G1 (
KeywordKinddispatch):KeywordKind::lookupuseseq_ignore_ascii_case, matching the oldcheck_keywordcalls exactly. The label (symbol + colon) and assignment (symbol + =) checks still come first, sothen = 5,then:, etc. correctly reach those branches before the keyword table is consulted. REXX's "no reserved words" rule is preserved for assignment and label positions.G6 (removed
!self.is_terminator()guard inparse_do_body): The claim that this was dead code is correct.is_terminatorreturnstrueonly forSemicolon | Eol | Eof. The guardif let TokenKind::Symbol(_) = self.peek_kind() && !self.is_terminator()— sinceSymbolis never any of those three variants, the!is_terminator()sub-expression was alwaystruewhen the outer pattern matched. Removal is safe.finish_do/consume_unique_keyword/parse_optional_expression: All are behaviorally identical to the inlined code they replace.Test Coverage (missing, not introduced by this PR)
src/parser.rsunit test gaps created by the refactor becoming more visible:consume_unique_keyworderror path forFOR,WHILE,UNTIL(lines 501–508):phase2.rshasduplicate_to_keyword_errorandduplicate_by_keyword_error, but no tests fordo i = 1 for 3 for 3(duplicateFOR),do i = 1 while x while x(duplicateWHILE), ordo i = 1 until x until x(duplicateUNTIL). These paths now flow through the same helper asTO/BY, so a single helper test would cover all five, but theFOR/WHILE/UNTILbranches remain untested at the unit level.DO WHILE and DO UNTIL parser unit tests (
src/parser.rs):parse_doalready has unit tests forSimple(line 1811),Count(line 1822),Controlled(line 1833), andForever(line 1850), butWhileandUntilvariants have no unit tests inparser.rs. Integration-level coverage exists in the phase tests, but adding two#[test]entries alongside the existing DO tests would fill the gap cheaply.END name matching — unit tests deferred with the feature (
parse_do_body, lines 540–547): The design doc (docs/design/end-name-validation.md) lists 8 test cases that should accompany the eventual implementation. Since the validation itself is deferred, the tests are too — but it's worth ensuring those 8 cases land together with the implementation in the follow-on PR, not after.Pre-existing issues (not introduced by this PR, flagged for completeness)
do_header_depthnot decremented on early error exit (src/parser.rs, lines 484–513):Same for
if_depth(lines 366–381) andcondition_depth(lines 350–352 inparse_select). In practice this is benign —Parseris single-use perparse()call (INTERPRET creates a freshParser::new()), so leaked counters are discarded with the parser on error. But ifParserwere ever reused or the depth counters werechecked_sub'd, this would surface. Worth a comment or a drop-guard pattern if the parser ever becomes multi-invocable.endused as a variable name inside DO body (parse_do_body, line 540):end = 5inside a DO block hitsis_keyword("END")and is misread as the block terminator. This is unchanged from before the PR and is a known limitation of context-freeis_keywordchecks.Summary
!is_terminator()guard removalFOR/WHILE/UNTILnot unit-testedend = 5inside DO body mis-parsedThe refactor achieves its stated goals cleanly. The three helpers and the
KeywordKindtable are solid additions. Suggest adding the two missing DO WHILE/UNTIL parser unit tests insrc/parser.rsbefore or alongside the follow-on END-name validation PR.