lsp #36

Merged
navicore merged 3 commits from lsp into main 2026-06-04 02:58:26 +00:00
Owner

● All green: 6 runner + 248 integration + 148 core unit + 19 new LSP = 421 tests across the workspace. prlg-lsp

MVP is shipped.

What was built:

crates/lsp/ — new workspace member, crate patch-prolog-lsp, binary prlg-lsp.

  • src/main.rs — tower-lsp scaffold, document store (HashMap<Url, DocumentState> behind RwLock), LanguageServer
    impl covering initialize, initialized, shutdown, did_open, did_change, did_close, completion, hover.
    Capabilities: full sync, completion with : trigger, hover.
  • src/diagnostics.rs — re-parses on every change, maps the parser's existing at line N col M format
    (post-#18, #20) into LSP Diagnostic ranges with severity Error and source prlg-lsp. 1-char range from the
    position; widening to lexeme spans is a future improvement when the parser exposes structured spans.
  • src/completion.rs — union of built-ins (builtin_atom_names() + builtin_functor_names() from core), stdlib
    predicates (parsed once from include_str!("../../../knowledge/stdlib.pl") behind OnceLock), and user-defined
    predicates from the current buffer. Mid-edit safety: if the whole buffer fails to parse, truncate to the last
    clause-ending . and retry — so completion stays useful while the user is typing.
  • src/hover.rs — fixed table of one-liner descriptions for ~50 built-ins; falls back to rendering the
    user-defined clause head for predicates the user defined in the buffer. Same mid-edit truncation as completion.
  • README.md — install, intended consumer (patch-prolog.nvim), manual Neovim wiring snippet, RUST_LOG note.

Release-workflow updates (.forgejo/workflows/release.yml):

  • The sed pin-bump now loops over Cargo.toml crates/*/Cargo.toml so the lsp crate's patch-prolog-core pin moves
    with the workspace.
  • The verify gate now expects ≥3 pins (2 in root, 1 in lsp).
  • New publish step cargo publish -p patch-prolog-lsp between the existing 120s wait and the patch-prolog
    publish.
  • git add extended to crates/*/Cargo.toml.

Verified end-to-end:

  • initialize handshake returns the right capabilities.
  • didOpen on "p(foo).\ngo :- bar(]).\n" publishes a diagnostic at line 1 char 10 with message unexpected ]
    same accuracy as prlg run's error output.
  • 19 unit tests cover prefix detection, completion sources (built-in, stdlib, user-defined-in-buffer), mid-edit
    truncation, hover for built-ins, hover for user-defined predicates, no-hover on unknown words, and a regression
    that internal TokenKind variant names never leak into diagnostic messages.
● All green: 6 runner + 248 integration + 148 core unit + 19 new LSP = 421 tests across the workspace. prlg-lsp MVP is shipped. What was built: crates/lsp/ — new workspace member, crate patch-prolog-lsp, binary prlg-lsp. - src/main.rs — tower-lsp scaffold, document store (HashMap<Url, DocumentState> behind RwLock), LanguageServer impl covering initialize, initialized, shutdown, did_open, did_change, did_close, completion, hover. Capabilities: full sync, completion with : trigger, hover. - src/diagnostics.rs — re-parses on every change, maps the parser's existing <message> at line N col M format (post-#18, #20) into LSP Diagnostic ranges with severity Error and source prlg-lsp. 1-char range from the position; widening to lexeme spans is a future improvement when the parser exposes structured spans. - src/completion.rs — union of built-ins (builtin_atom_names() + builtin_functor_names() from core), stdlib predicates (parsed once from include_str!("../../../knowledge/stdlib.pl") behind OnceLock), and user-defined predicates from the current buffer. Mid-edit safety: if the whole buffer fails to parse, truncate to the last clause-ending . and retry — so completion stays useful while the user is typing. - src/hover.rs — fixed table of one-liner descriptions for ~50 built-ins; falls back to rendering the user-defined clause head for predicates the user defined in the buffer. Same mid-edit truncation as completion. - README.md — install, intended consumer (patch-prolog.nvim), manual Neovim wiring snippet, RUST_LOG note. Release-workflow updates (.forgejo/workflows/release.yml): - The sed pin-bump now loops over Cargo.toml crates/*/Cargo.toml so the lsp crate's patch-prolog-core pin moves with the workspace. - The verify gate now expects ≥3 pins (2 in root, 1 in lsp). - New publish step cargo publish -p patch-prolog-lsp between the existing 120s wait and the patch-prolog publish. - git add extended to crates/*/Cargo.toml. Verified end-to-end: - initialize handshake returns the right capabilities. - didOpen on "p(foo).\ngo :- bar(]).\n" publishes a diagnostic at line 1 char 10 with message unexpected `]` — same accuracy as prlg run's error output. - 19 unit tests cover prefix detection, completion sources (built-in, stdlib, user-defined-in-buffer), mid-edit truncation, hover for built-ins, hover for user-defined predicates, no-hover on unknown words, and a regression that internal TokenKind variant names never leak into diagnostic messages.
● All green: 6 runner + 248 integration + 148 core unit + 19 new LSP = 421 tests across the workspace. prlg-lsp
Some checks failed
CI - Linux / CI - Linux x86_64 (pull_request) Failing after 18s
318b422a0d
MVP is shipped.

  What was built:

  crates/lsp/ — new workspace member, crate patch-prolog-lsp, binary prlg-lsp.

  - src/main.rs — tower-lsp scaffold, document store (HashMap<Url, DocumentState> behind RwLock), LanguageServer
  impl covering initialize, initialized, shutdown, did_open, did_change, did_close, completion, hover.
  Capabilities: full sync, completion with : trigger, hover.
  - src/diagnostics.rs — re-parses on every change, maps the parser's existing <message> at line N col M format
  (post-#18, #20) into LSP Diagnostic ranges with severity Error and source prlg-lsp. 1-char range from the
  position; widening to lexeme spans is a future improvement when the parser exposes structured spans.
  - src/completion.rs — union of built-ins (builtin_atom_names() + builtin_functor_names() from core), stdlib
  predicates (parsed once from include_str!("../../../knowledge/stdlib.pl") behind OnceLock), and user-defined
  predicates from the current buffer. Mid-edit safety: if the whole buffer fails to parse, truncate to the last
  clause-ending . and retry — so completion stays useful while the user is typing.
  - src/hover.rs — fixed table of one-liner descriptions for ~50 built-ins; falls back to rendering the
  user-defined clause head for predicates the user defined in the buffer. Same mid-edit truncation as completion.
  - README.md — install, intended consumer (patch-prolog.nvim), manual Neovim wiring snippet, RUST_LOG note.

  Release-workflow updates (.forgejo/workflows/release.yml):
  - The sed pin-bump now loops over Cargo.toml crates/*/Cargo.toml so the lsp crate's patch-prolog-core pin moves
  with the workspace.
  - The verify gate now expects ≥3 pins (2 in root, 1 in lsp).
  - New publish step cargo publish -p patch-prolog-lsp between the existing 120s wait and the patch-prolog
  publish.
  - git add extended to crates/*/Cargo.toml.

  Verified end-to-end:
  - initialize handshake returns the right capabilities.
  - didOpen on "p(foo).\ngo :- bar(]).\n" publishes a diagnostic at line 1 char 10 with message unexpected `]`  —
  same accuracy as prlg run's error output.
  - 19 unit tests cover prefix detection, completion sources (built-in, stdlib, user-defined-in-buffer), mid-edit
  truncation, hover for built-ins, hover for user-defined predicates, no-hover on unknown words, and a regression
  that internal TokenKind variant names never leak into diagnostic messages.
Author
Owner

Review

What it does: Adds a new workspace crate patch-prolog-lsp (binary prlg-lsp) — a tower-lsp server with diagnostics (re-parse on change), completion (built-ins + stdlib + buffer-defined predicates), and hover (fixed doc table + user clause heads) — and extends the release workflow to bump/verify/publish the new crate.

Overall the design is good: reusing the core parser so squiggles match prlg run byte-for-byte, wholesale diagnostic replacement so stale errors clear, mid-edit truncation fallback, logging to stderr only. The 19 tests pass locally. But there are two real bugs, one of which will break the published crate.

🔴 Blocking

1. The published crate won't compile — include_str! reaches outside the package

crates/lsp/src/completion.rs line 19 does include_str!("../../../knowledge/stdlib.pl"), but knowledge/ belongs to the root patch-prolog package, not this crate. Verified:

$ cargo package -p patch-prolog-lsp --list
Cargo.toml  README.md  src/completion.rs  src/diagnostics.rs  src/hover.rs  src/main.rs

No stdlib.pl. Since the workflow publishes with --no-verify, the publish will succeed, then every cargo install patch-prolog-lsp will fail with "couldn't read ../../../knowledge/stdlib.pl" — and a published version number can't be reclaimed. Fixes, pick one:

  • A build.rs that copies stdlib.pl into OUT_DIR + include_str!(concat!(env!("OUT_DIR"), ...)), or
  • Best: expose STDLIB_PL as a pub const from patch-prolog-core (it already ships the parser; src/runner.rs:19 and src/compiler.rs:5 each include_str! it separately today — this would deduplicate all three).

A cargo package -p patch-prolog-lsp (no --no-verify) step in CI would have caught this and is worth adding regardless.

2. Panic on non-ASCII lines in completion

completion.rs line 164 — &line[..col] uses position.character as a byte index, but LSP positions default to UTF-16 code units. On a line containing any multibyte char before the cursor, the slice lands mid-char and panics (reproduced: end byte index 1 is not a char boundary; it is inside 'é'). Prolog source legitimately contains non-ASCII in quoted atoms/comments. Convert UTF-16 → byte offset (or at minimum clamp to the nearest char boundary). Note hover.rs::word_at_position happens to be panic-safe (it returns None when boundaries don't align) but still computes wrong words on such lines — same conversion applies.

🟡 Worth addressing

  • Duplication between completion.rs and hover.rs: last_complete_clauses_offset/last_clause_offset and predicate_indicator/predicate_pair are byte-identical pairs, and both files implement the same truncate-and-reparse fallback. A small shared module (e.g. src/buffer.rs) would also bring both files back under the ~200-line target (currently 235/266).
  • did_change takes the first change (main.rs:91). With FULL sync clients send exactly one, but if one ever sends several, the last is the authoritative full text. params.content_changes.into_iter().last() is the same cost and strictly safer.
  • Completion duplicates: a user who defines member/2 in-buffer gets it twice (stdlib + user). Dedup by (label, detail-source) or prefer the user entry.
  • trigger_characters: [":"]: only appears in :-/?-, which isn't a predicate-completion position. Harmless but probably not what was intended; identifier typing already triggers completion in most clients.

🟢 Minor / fine as-is

  • tokio with features = ["full"] — only io-std, rt, macros are needed; trims dependency build for installers.
  • Workflow changes are consistent with the existing pattern (sed loop, pin count ≥3, swallowed-publish warning matches the existing patch-prolog step). The =0.5.0 pin + workflow bump logic line up.
  • Diagnostics string-parsing of at line N col M is acknowledged tech debt with a sensible whole-file fallback, and there's a regression test that internal TokenKind names don't leak. Good.
  • Lock-poisoning handled with warn! instead of unwrap — fine for a single-threaded server.

Verdict

Solid MVP architecture and good test coverage of the logic that exists, but hold the merge until №1 is fixed — the next tagged release would publish a crate that can never compile for users. №2 is a one-line crash for any non-ASCII buffer and worth fixing in the same pass.


Review by Claude Code

# Review **What it does:** Adds a new workspace crate `patch-prolog-lsp` (binary `prlg-lsp`) — a tower-lsp server with diagnostics (re-parse on change), completion (built-ins + stdlib + buffer-defined predicates), and hover (fixed doc table + user clause heads) — and extends the release workflow to bump/verify/publish the new crate. Overall the design is good: reusing the core parser so squiggles match `prlg run` byte-for-byte, wholesale diagnostic replacement so stale errors clear, mid-edit truncation fallback, logging to stderr only. The 19 tests pass locally. But there are two real bugs, one of which will break the published crate. ## 🔴 Blocking ### 1. The published crate won't compile — `include_str!` reaches outside the package `crates/lsp/src/completion.rs` line 19 does `include_str!("../../../knowledge/stdlib.pl")`, but `knowledge/` belongs to the root `patch-prolog` package, not this crate. Verified: ``` $ cargo package -p patch-prolog-lsp --list Cargo.toml README.md src/completion.rs src/diagnostics.rs src/hover.rs src/main.rs ``` No `stdlib.pl`. Since the workflow publishes with `--no-verify`, the publish will *succeed*, then every `cargo install patch-prolog-lsp` will fail with "couldn't read ../../../knowledge/stdlib.pl" — and a published version number can't be reclaimed. Fixes, pick one: - A `build.rs` that copies `stdlib.pl` into `OUT_DIR` + `include_str!(concat!(env!("OUT_DIR"), ...))`, or - **Best:** expose `STDLIB_PL` as a `pub const` from `patch-prolog-core` (it already ships the parser; `src/runner.rs:19` and `src/compiler.rs:5` each `include_str!` it separately today — this would deduplicate all three). A `cargo package -p patch-prolog-lsp` (no `--no-verify`) step in CI would have caught this and is worth adding regardless. ### 2. Panic on non-ASCII lines in completion `completion.rs` line 164 — `&line[..col]` uses `position.character` as a byte index, but LSP positions default to **UTF-16 code units**. On a line containing any multibyte char before the cursor, the slice lands mid-char and panics (reproduced: `end byte index 1 is not a char boundary; it is inside 'é'`). Prolog source legitimately contains non-ASCII in quoted atoms/comments. Convert UTF-16 → byte offset (or at minimum clamp to the nearest char boundary). Note `hover.rs::word_at_position` happens to be panic-safe (it returns `None` when boundaries don't align) but still computes wrong words on such lines — same conversion applies. ## 🟡 Worth addressing - **Duplication between `completion.rs` and `hover.rs`:** `last_complete_clauses_offset`/`last_clause_offset` and `predicate_indicator`/`predicate_pair` are byte-identical pairs, and both files implement the same truncate-and-reparse fallback. A small shared module (e.g. `src/buffer.rs`) would also bring both files back under the ~200-line target (currently 235/266). - **`did_change` takes the *first* change** (`main.rs:91`). With FULL sync clients send exactly one, but if one ever sends several, the *last* is the authoritative full text. `params.content_changes.into_iter().last()` is the same cost and strictly safer. - **Completion duplicates:** a user who defines `member/2` in-buffer gets it twice (stdlib + user). Dedup by `(label, detail-source)` or prefer the user entry. - **`trigger_characters: [":"]`** — `:` only appears in `:-`/`?-`, which isn't a predicate-completion position. Harmless but probably not what was intended; identifier typing already triggers completion in most clients. ## 🟢 Minor / fine as-is - `tokio` with `features = ["full"]` — only `io-std`, `rt`, `macros` are needed; trims dependency build for installers. - Workflow changes are consistent with the existing pattern (sed loop, pin count ≥3, swallowed-publish warning matches the existing `patch-prolog` step). The `=0.5.0` pin + workflow bump logic line up. - Diagnostics string-parsing of `at line N col M` is acknowledged tech debt with a sensible whole-file fallback, and there's a regression test that internal `TokenKind` names don't leak. Good. - Lock-poisoning handled with `warn!` instead of unwrap — fine for a single-threaded server. ## Verdict Solid MVP architecture and good test coverage of the logic that exists, but **hold the merge until №1 is fixed** — the next tagged release would publish a crate that can never compile for users. №2 is a one-line crash for any non-ASCII buffer and worth fixing in the same pass. --- *Review by Claude Code*
refactor crates
Some checks failed
CI - Linux / CI - Linux x86_64 (pull_request) Failing after 18s
c34d8fa26d
● All six tasks done. Multibyte buffer with % é\np(foo).\np :- mem returned
All checks were successful
CI - Linux / CI - Linux x86_64 (pull_request) Successful in 46s
ccc3e4af9c
member in completion — the pre-fix panic case now works end-to-end.

  Final state after PR #36 review work:

  🔴 Blocking #1 (publish-breaker) — stdlib.pl moved to
  crates/patch-prolog-core/, exposed as pub const STDLIB_PL: &str. Single
  source of truth across the workspace. Three call sites (cli/runner.rs,
  cli/compiler.rs, lsp/completion.rs) plus three integration tests all
  consume from there. No include_str! in the LSP crate any more — cargo
  package -p patch-prolog-lsp --list confirms only crate-local files are
  bundled.

  🔴 Blocking #2 (UTF-16 panic) — new position_to_byte_offset helper in
  crates/lsp/src/buffer.rs. Walks chars accumulating len_utf16(), stops at
  the target, clamps past-end-of-line and inside surrogate pairs to char
  boundaries. Applied in both completion::current_word_prefix and
  hover::word_at_position. Three regression tests (emoji surrogate pair,
  multibyte before cursor, line past end).

  🟡 Worth addressing — all four done:

  1. Dedup completion/hover — new crates/lsp/src/buffer.rs with
  position_to_byte_offset, last_clause_offset, parse_best_effort,
  predicate_indicator. Removes ~70 lines of duplication. Both consumers now
  ~150 lines each (back under the 200-line target).
  2. did_change takes .last() instead of .next().
  3. Completion dedup — user-defined predicates shadow stdlib (and stdlib
  shadows built-ins) keyed on (name, arity). New test confirms a
  buffer-defined member/2 hides the stdlib copy.
  4. Dropped trigger_characters: [":"] — replaced with
  CompletionOptions::default() and a comment explaining why.

  🟢 Nit — done: tokio features trimmed from "full" to ["io-std", "macros",
  "rt"]. Saves dependency build time for installers.

  CI safety net — new just pkg-check recipe runs cargo package --no-verify on
  all three publishable crates. Added to just ci. Catches metadata +
  bundling issues. Acknowledged limitation: the full --verify would catch the
  original include_str! bug but pulls from crates.io (chicken-and-egg
  pre-release); the release workflow's publish-with-verify-after-core-indexed
  is the final gate.

  Test count: 431 (6 runner + 248 integration + 148 core + 29 LSP = a 10-test
  increase: 3 buffer multibyte tests, 1 completion multibyte regression, 1
  completion dedup, 1 hover multibyte, plus the migrated extraction-coverage
  tests).

  Both binaries installed; prlg --version reports 0.5.0; LSP smoke-test
  responds to completion against a buffer with é in a comment without
  panicking.
Commenting is not possible because the repository is archived.
No description provided.