Furthered decomp work

Marco 2026-04-10 18:14:55 +02:00
commit 28cbbe3470
519 changed files with 1498 additions and 43421 deletions


@ -38,18 +38,47 @@ These functions are now renamed live in the active `REGRET.EXE` database:
- `1398:0086` = `usecode_debugger_open_for_current_unit`
- `1398:020d` = `usecode_debugger_open_modal`
- `1398:0291` = `usecode_debugger_format_expression_to_shared_buffer`
- `1398:088f` = `usecode_debugger_source_pane_create`
- `1398:0ba7` = `usecode_debugger_source_pane_handle_command`
- `1398:0f16` = `usecode_debugger_source_pane_handle_pointer_event`
- `1398:1088` = `usecode_debugger_source_line_copy_for_display`
- `1398:1118` = `usecode_debugger_source_pane_draw_visible_lines`
- `1398:1413` = `usecode_debugger_source_pane_clamp_viewport`
- `1398:15ac` = `usecode_debugger_source_pane_load_file`
- `1398:1791` = `usecode_debugger_source_pane_draw`
- `1398:193f` = `usecode_debugger_source_pane_handle_click`
- `1398:19b1` = `usecode_debugger_gump_create`
- `1398:1c2c` = `usecode_debugger_translate_registered_event`
- `1398:1dc6` = `usecode_debugger_forward_child_event`
- `1398:1df3` = `usecode_debugger_handle_event`
- `13e0:0000` = `usecode_debugger_break_state_create`
- `13e0:0053` = `usecode_debugger_break_state_update_line_and_maybe_break`
- `13e0:00dd` = `usecode_debugger_break_state_add_breakpoint`
- `13e0:01a5` = `usecode_debugger_break_state_remove_breakpoint`
- `13e0:0230` = `usecode_debugger_break_state_find_breakpoint_or_next_index`
- `13e0:029e` = `usecode_debugger_break_state_has_breakpoint`
- `13e0:02f5` = `usecode_debugger_break_state_push_current_entry`
- `13e0:03b0` = `usecode_debugger_break_state_push_current_entry_copy`
- `13e0:03f7` = `usecode_debugger_break_state_pop_current_entry`
- `13e0:0419` = `usecode_debugger_break_state_enable_single_step`
- `13e0:0432` = `usecode_debugger_break_state_clear_runtime_break_flags`
- `13e0:0444` = `usecode_debugger_break_state_get_current_entry`
- `13e0:046f` = `usecode_debugger_break_state_vtable_slot0_noop`
- `13e0:0474` = `usecode_debugger_break_state_vtable_slot1_return_zero`
- `13f0:0000` = `interpreter_push_saved_farptr`
- `13f0:003c` = `interpreter_pop_saved_farptr`
- `13f0:00e8` = `usecode_interpreter_context_init`
- `13f0:0244` = `usecode_interpreter_context_create`
- `13f0:035f` = `usecode_interpreter_context_load_source_cursor_from_global_unit`
- `13f0:038b` = `usecode_debugger_interpreter_hook`
- `13f8:10da` = `usecode_interpreter_run_context_with_debugger_hook`
- `13f8:1d72` = `entity_vm_runtime_get_slot_chunk_ptr_at_offset`
- `1398:2c2e` = `usecode_debugger_source_buffer_create_from_path`
- `1398:2ca0` = `usecode_debugger_source_buffer_destroy`
- `1398:2d14` = `usecode_debugger_source_buffer_open_from_path`
- `1398:2e0a` = `usecode_debugger_source_buffer_load_text`
- `1398:2f4f` = `usecode_debugger_source_buffer_split_lines_in_place`
- `1398:301d` = `usecode_debugger_source_buffer_get_line_ptr`
Why `1398:0086` matches the current-unit wrapper:
@ -63,6 +92,21 @@ Why `1398:020d` matches the modal-open wrapper:
- It is the smaller sibling that creates the same gump and immediately sends it through `Dispatch_ModalGump` without the extra current-unit file-load path.
Newly closed helper roles in this pass:
- `usecode_debugger_source_pane_create` is the source-view child-gump constructor used by `usecode_debugger_gump_create`.
- `usecode_debugger_source_pane_handle_command` owns source-view commands such as line scroll, page navigation, search navigation, and line breakpoint toggles.
- `usecode_debugger_source_pane_handle_pointer_event` converts pointer coordinates into source line/column space and updates viewport or selection state.
- `usecode_debugger_source_line_copy_for_display` expands tabs into fixed-width spaces before a source line is rendered.
- `usecode_debugger_source_pane_draw_visible_lines` is the stronger local draw body for visible source rows, including current-line highlight and breakpoint marks.
- `usecode_debugger_source_pane_clamp_viewport` constrains the source viewport and syncs the pane against child scrollbars.
- `usecode_debugger_source_pane_load_file` is the common file-load path used both by the current-unit opener and by the file-open event lane.
- `usecode_debugger_source_pane_draw` renders the visible source pane from the loaded `.unk` buffer and overlays debugger-side markers.
- `usecode_debugger_source_pane_handle_click` maps mouse position back to a visible line index.
- `usecode_debugger_break_state_add_breakpoint`, `remove_breakpoint`, `find_breakpoint_or_next_index`, and `has_breakpoint` close the breakpoint-table maintenance cluster.
- `usecode_debugger_break_state_push_current_entry`, `push_current_entry_copy`, and `pop_current_entry` close the current-entry stack helper cluster enough to stop treating it as anonymous storage logic.
- `usecode_debugger_source_buffer_create_from_path`, `destroy`, `open_from_path`, `load_text`, `split_lines_in_place`, and `get_line_ptr` now close the line-indexed source-buffer ownership chain instead of leaving the file-loader side anonymous.
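The source-buffer helper names above suggest a load-once, split-into-lines, fetch-by-index design, with tab expansion applied only to the display copy. A minimal Python sketch of that shape follows. The class name, the 8-column tab width, and the choice to pad to tab stops (rather than emit a fixed run of spaces per tab) are assumptions for illustration, not recovered behavior.

```python
TAB_WIDTH = 8  # assumption: the real tab width is not established in these notes


class SourceBuffer:
    """Toy model of a line-indexed source buffer (hypothetical names)."""

    def __init__(self, text: str):
        # Analogue of load_text + split_lines_in_place: split once so that
        # every later lookup is a plain index into the line table.
        self.lines = text.replace("\r\n", "\n").split("\n")

    def get_line(self, index: int):
        # Analogue of get_line_ptr: None stands in for "no such line".
        if 0 <= index < len(self.lines):
            return self.lines[index]
        return None


def expand_tabs_for_display(line: str, tab_width: int = TAB_WIDTH) -> str:
    """Return a display copy of `line` with tabs padded to the next tab stop."""
    out = []
    column = 0
    for ch in line:
        if ch == "\t":
            pad = tab_width - (column % tab_width)
            out.append(" " * pad)
            column += pad
        else:
            out.append(ch)
            column += 1
    return "".join(out)
```

Keeping expansion separate from the stored line means column math for search and breakpoints can stay on the raw text while only the renderer sees the padded copy.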
### 3. The event dispatcher is the real debugger dispatcher, not a generic text UI
Live recovery of `1398:1df3` shows the same usecode-debugger-style command/state machine seen in retail No Remorse, just relocated.
@ -80,6 +124,17 @@ Recovered dispatcher lanes in this pass include:
This is much stronger than a generic console/editor interpretation. It is the same hidden debugger family in functional form.
Wider interaction map from the disassembly cleanup:
- the `RUN` lane clears `+0x74/+0x75` and then resumes through the gump callback rather than interpreting source text directly
- the single-step lane routes through `usecode_debugger_break_state_enable_single_step`
- one branch clears the current line's breakpoint bit in the loaded source-line table
- another clears the full 10-line visible breakpoint bitmap band in the current source view
- one branch prompts for a line number and routes the result through the source-pane selection helper
- another prompts for a search string, then scans forward through `usecode_debugger_source_buffer_get_line_ptr` results until a match is found
That broader event map makes the loaded source buffer and the breakpoint table look like first-class debugger subsystems, not mere UI decoration.
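The search lane described above can be pictured as a forward scan over per-line lookups. The sketch below is a hedged Python model of that loop only: the function name, the case-sensitive substring match, and the caller-supplied start index are assumptions, since the notes do not establish where the original scan starts or how it compares text.

```python
from typing import Optional, Sequence


def search_forward(lines: Sequence[str], needle: str, start: int) -> Optional[int]:
    """Return the index of the first line at or after `start` containing `needle`.

    Returns None when no line matches, which models a search that runs off
    the end of the loaded buffer.
    """
    for index in range(max(start, 0), len(lines)):
        if needle in lines[index]:
            return index
    return None
```

A caller modelling "find next" would pass the current line plus one as `start` and then route the returned index through whatever selection helper the source pane uses.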
### 4. No Regret preserves the missing bootstrap writer
This was the key result of the pass.
@ -148,7 +203,7 @@ Retail No Remorse still has the base break-state object and the inert callback s
### 7. The interpreter-side hook also still appears to be wired
`13f0:038b` is called from `13f8:10fa` inside the newly named `usecode_interpreter_run_context_with_debugger_hook`, and reads the debugger global at `1480:712c/712e` before calling `13e0:0432`.
Recovered disassembly around the key handoff:
@ -159,6 +214,20 @@ Recovered disassembly around the key handoff:
That is the current strongest evidence that No Regret preserves not just the UI/event layer and not just the constructor/store path, but also the interpreter-side consumer/hook path that can actually consult the debugger object during VM execution.
The wrapper at `13f8:10da` now makes that path clearer:
- it marks the live interpreter context active via bytes near `+0x122/+0x123`
- it passes two runtime context pointers into `usecode_debugger_interpreter_hook` from the same parent object (`+0x121` and `+0x36`)
- if the hook returns the continue code, the wrapper falls back into the normal virtual dispatch lane
That is not the shape of a manual debugger launcher. It is the shape of ordinary VM execution with an optional debugger sidecar.
The final direct-xref closure in Regret is now tighter than the earlier pass:
- exhaustive `1480:712c/712e` data-use recovery now shows only the already-named bootstrap/open/format/draw/event functions plus `usecode_debugger_interpreter_hook`; no extra hidden debugger-object consumers surfaced outside that cluster
- direct address searches for the helper entry points also stayed narrow: `usecode_debugger_break_state_clear_runtime_break_flags` is only called from the hook prologue, `usecode_debugger_break_state_pop_current_entry` is only called from the hook unwind, `usecode_debugger_interpreter_hook` is only called from `usecode_interpreter_run_context_with_debugger_hook`, and `usecode_interpreter_context_create` is only called from the upstream usecode-process/context factory path at `13f8:0eec`
- that means the remaining uncertainty is no longer broad subsystem discovery; it is the exact location of the missing current-entry push inside an already-identified interpreter/runtime path
### 8. No Regret appears to keep an auto-open-on-break path that retail No Remorse lacks
This is the practical launch-path conclusion.
@ -186,7 +255,8 @@ Current direct-caller state in the live database:
- no normal direct callers to `usecode_debugger_open_for_current_unit`
- no normal direct callers to `usecode_debugger_open_modal`
- `usecode_debugger_handle_event` is reached only through debugger callback wrappers (`usecode_debugger_translate_registered_event` and `usecode_debugger_forward_child_event`)
- `usecode_debugger_interpreter_hook` is called from `usecode_interpreter_run_context_with_debugger_hook`, which looks like a real runtime/interpreter-side consumer
- no normal direct callers are yet recovered for `usecode_debugger_break_state_push_current_entry` or `usecode_debugger_break_state_push_current_entry_copy`
So the current best distinction is:
@ -195,6 +265,449 @@ So the current best distinction is:
In other words, No Regret looks better wired than retail No Remorse, but it is not yet proven to expose a deliberate user-facing launcher.
### 10. The required seeding currently looks engine-side, not like a compiled-usecode feature
This pass materially narrows the `who seeds the records?` question.
What the new names and caller map now support:
- `usecode_debugger_break_state_pop_current_entry` has a direct caller in `usecode_debugger_interpreter_hook`
- `usecode_interpreter_run_context_with_debugger_hook` wraps an ordinary interpreter execution path, not a debugger-only path
- the hook clears runtime break flags on entry, runs the interpreter loop, and unwinds debugger current-entry depth on exit
- the current-entry push helpers still have no recovered script-visible or UI-visible callers
- the new source-buffer/file-load chain is only consumed by debugger UI and current-unit open paths, not by the interpreter-side seeding path
Caller-recovery status after the wider disassembly pass:
- the direct caller of `usecode_debugger_break_state_push_current_entry` is still not surfaced as a normal xref in the current Regret database
- the strongest current structural candidate remains an unrecovered or still-overlapped interpreter-side producer analogous to retail `Interpreter_NextUsecodeOp`
- the deeper recovered Regret chain is now explicit: `Usecode_ItemCallEvent -> 13f8:0eec -> usecode_interpreter_context_create -> usecode_interpreter_context_init / interpreter_push_saved_farptr -> usecode_debugger_interpreter_hook`
- that outer factory path also explains where the debugger-side source cursor comes from: `13f8:0eec` and `usecode_interpreter_context_load_source_cursor_from_global_unit` both use `entity_vm_runtime_get_slot_chunk_ptr_at_offset` against the current live usecode root at `1480:71a1/71a3`
- the nearby `13f0` context/setup helpers seed the live interpreter object's source-stream and frame-base lanes (`+0xd6/+0xd8` and `+0xda/+0xdc`), but the visible direct call path there still reaches stack/context helpers rather than the debugger push helper itself
- a seemingly promising `1458:` lane that writes fields at `+0x72/+0x74/+0x76/+0x78/+0x7a/+0x7c` turned out to be the RIFF/animation parser family, not the debugger, so it should be treated as a false structural cousin rather than as debugger seeding evidence
What this now rules out with fairly high confidence:
- there is no additional large unnamed Regret-side debugger subsystem still hiding outside the already recovered `1398` / `13e0` / `13f0` / `13f8` lanes
- there is no second recovered writer or alternate global-owner path for the debugger object beyond `usecode_debugger_bootstrap_init`
- there is no recovered direct UI/source-buffer-side path that seeds the current-entry stack before `RUN`
- there is no ordinary standalone caller of `usecode_debugger_break_state_push_current_entry` waiting to be found by one more basic xref sweep
Current best read from that combination:
- current-entry seeding is part of engine-side interpreter bookkeeping around live usecode execution
- it is not currently evidenced as a direct feature toggled by `.unk` contents
- it is not currently evidenced as a distinct compiled-usecode opcode that scripts can invoke to manufacture debugger state
Compiled usecode still matters, but in a narrower way:
- it can carry `LINE_NUMBER` metadata in builds/corpora that preserve it
- it provides the runtime activity that naturally flows through the interpreter wrapper/hook path
- but the actual debugger-object and current-entry stack ownership still appears to live in executable-side VM code
So the current answer to `is there an automated mechanism?` is `yes, probably`, but the mechanism currently looks like ordinary interpreter-side debug bookkeeping rather than a dedicated source-file or script-level launcher feature.
### 11. What the seeding record actually is, and what that implies for simulation
The current-entry push helpers are now explicit enough to answer the practical `what is being seeded?` question.
`usecode_debugger_break_state_push_current_entry` writes one `0x15`-byte inline record into the debugger break-state object:
- `+0x00..+0x08` = inline unit-name string, asserted to fit in `8` bytes plus terminator
- `+0x09..+0x0c` = first runtime far pointer
- `+0x0d..+0x10` = second runtime far pointer
- `+0x11..+0x14` = third runtime far pointer
The important part is not the string. It is what the debugger later does with the payload dwords.
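To make the `0x15`-byte layout concrete, here is a minimal Python sketch that packs and unpacks one record with `struct`. The function names are hypothetical, and the far-pointer encoding (16-bit offset in the low word, 16-bit segment or selector in the high word) is the usual x86 convention rather than something re-verified against this record.

```python
import struct

RECORD_SIZE = 0x15
# 9-byte inline name (8 chars + terminator) followed by three 32-bit far pointers.
_RECORD = struct.Struct("<9sIII")
assert _RECORD.size == RECORD_SIZE


def far_ptr(segment: int, offset: int) -> int:
    """Combine segment:offset into the little-endian dword form."""
    return ((segment & 0xFFFF) << 16) | (offset & 0xFFFF)


def pack_entry(unit_name: str, ptr1: int, ptr2: int, ptr3: int) -> bytes:
    name = unit_name.encode("ascii")
    if len(name) > 8:
        # Mirrors the asserted "8 bytes plus terminator" limit.
        raise ValueError("unit name must fit in 8 bytes")
    return _RECORD.pack(name, ptr1, ptr2, ptr3)  # struct pads the name with NULs


def unpack_entry(record: bytes):
    name, ptr1, ptr2, ptr3 = _RECORD.unpack(record)
    return name.split(b"\x00", 1)[0].decode("ascii"), ptr1, ptr2, ptr3
```

The sketch only fixes the shape of the record. Nothing in it can produce valid pointer values, which is the point the following paragraphs make.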
Known current consumers:
- `usecode_debugger_open_for_current_unit` uses the inline unit name to build the `.unk` path under `s_usecode`
- `usecode_debugger_format_expression_to_shared_buffer` resolves the top current-entry record and passes entry `+0x09` and `+0x0d` into `FUN_1398_045c`
- the formatter helper then dereferences those payload dwords as live source/descriptor and frame/evaluation context, not as inert metadata
That means the debugger seeding issue is **not** `how do we invent filenames or line numbers`. It is `how do we capture a valid live VM snapshot and serialize it into the break-state stack`.
The recovered interpreter-side precursor path now explains where that snapshot data naturally comes from:
- `usecode_interpreter_context_create` seeds the live interpreter context with source-stream cursor at `+0xd6/+0xd8`
- the same helper seeds a second live lane at `+0xda/+0xdc` that the hook immediately dereferences as a frame/stream-adjacent context root
- `+0xe1/+0xe3` is the strongest current candidate for the third current-entry dword, though that trailing entry field still lacks a live consumer in Regret
- `usecode_interpreter_context_load_source_cursor_from_global_unit` and the upstream `13f8:0eec` factory both derive the source cursor from the current usecode root at `1480:71a1/71a3` through `entity_vm_runtime_get_slot_chunk_ptr_at_offset`
So the seeding data is already present in ordinary in-process runtime state. The missing piece is the serializer/call site that pushes it into the debugger break-state object.
### 12. Existing data versus live memory: what has to be true for the debugger to work
Current best answer:
- **existing data is necessary but not sufficient**
- the live interpreter context already contains the source cursor and frame payload the debugger needs
- but those values still have to be copied into the debugger break-state stack at the right moment
In other words, this is not a pure offline data-format problem.
What can be prepared offline:
- a loadable `.unk` for the current unit
- line-number metadata, if a future Regret-side injector is built
- breakpoint/source text that matches the compiled usecode unit name
What cannot be prepared offline with the current evidence:
- the live source-stream cursor far pointer the formatter dereferences
- the live frame/evaluation payload far pointer the formatter dereferences
- the surrounding timing of when the hook expects the current-entry depth to exist and when it later unwinds it
That is why `.unk` export alone can open files, but does not give stable `RUN` / step behavior.
### 13. External-process hypothesis: possible in principle, but weak in the current evidence
The idea that original developers might have used an external helper is worth testing against the recovered structure.
Why the current evidence supports it poorly:
- no recovered IPC, serial, network, or file-polling path tied specifically to the hidden debugger
- no recovered external-loader or sidecar command path that would import a prebuilt current-entry record
- no evidence that the `.unk` file or compiled-usecode stream carries the three live payload dwords directly
What an external helper **would** have to do if it existed:
- locate the debugger object in live DOS memory
- know the exact break-state layout
- locate the active interpreter context for the running unit
- copy the current unit name plus at least the source cursor and frame payload dwords into the debugger stack record
- keep pace with interpreter execution because the hook currently pops one current-entry record on unwind/exit
That is possible in principle for a development-only DOS memory-poking tool, but it would still be a **live memory injector**, not an external source-file preprocessor.
The stronger current explanation is simpler:
- the game already has all the required data in-process
- Regret still has the debugger object, hook, open path, and stack helpers
- the missing visible push is most likely inside a table-driven, overlapped, or otherwise not-cleanly-recovered interpreter path rather than in a separate external system
So the external-process hypothesis is not impossible, but it is not needed to explain the current evidence and currently looks less likely than an internal interpreter-side producer.
### 14. How we can simulate the seeding today
There are three realistic simulation strategies, and they are not equally good.
#### A. Reuse the game's own runtime data with a small in-process patch
This is the most promising route.
The recovered Regret path already creates the live context data we need:
- `Usecode_ItemCallEvent -> 13f8:0eec -> usecode_interpreter_context_create -> usecode_debugger_interpreter_hook`
So the smallest high-confidence simulation is to add or re-enable a push at an existing in-process point where all ingredients are already live:
- current unit name
- source-stream cursor
- frame payload pointer
- debugger object pointer
Conceptually, this means restoring the missing `push_current_entry` behavior near the interpreter-context creation / hook handoff instead of inventing new data.
Why this is strongest:
- it uses the game's own live VM pointers
- it avoids guessing structure contents from outside the process
- it naturally matches the hook's later unwind behavior
#### B. Inject one current-entry record into live memory from outside the running game
This is possible, but it is a second-choice route.
For a minimal one-shot debugger open, the injected state would need at least:
- debugger object at `1480:712c/712e` already valid
- depth `+0x7a >= 1`
- inline unit-name record at `+0x7c`
- entry `+0x09` = valid live source cursor far pointer
- entry `+0x0d` = valid live frame payload far pointer
- entry `+0x11` = probably safe as zero or borrowed from the third interpreter lane unless a new consumer proves otherwise
- current line `+0x72` set coherently enough for the source pane and breakpoint checks
This could probably make `open_for_current_unit` and some expression/inspect flows work if the payload pointers are borrowed from a real live interpreter context.
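The minimal injected state listed above can be modelled against a byte image of the debugger object. The Python sketch below is illustrative only: the function name is hypothetical, the 16-bit widths chosen for the depth at `+0x7a` and the current line at `+0x72` are assumptions inferred from the neighbouring offsets, and the pointer arguments must come from a real live interpreter context for any of this to be meaningful.

```python
import struct

OFF_CURRENT_LINE = 0x72   # assumption: 16-bit field
OFF_DEPTH = 0x7A          # assumption: 16-bit field
OFF_FIRST_ENTRY = 0x7C
ENTRY_SIZE = 0x15


def seed_one_entry(image: bytearray, unit_name: str, source_cursor: int,
                   frame_payload: int, third: int = 0, line: int = 1) -> None:
    """Write one current-entry record plus depth and line into `image`."""
    if len(image) < OFF_FIRST_ENTRY + ENTRY_SIZE:
        raise ValueError("object image too small for one entry")
    if source_cursor == 0 or frame_payload == 0:
        # Guessed or null pointers are exactly the blind-injection case
        # that is expected to crash the formatter.
        raise ValueError("source cursor and frame payload must be live pointers")
    name = unit_name.encode("ascii")
    if len(name) > 8:
        raise ValueError("unit name must fit in 8 bytes")
    entry = struct.pack("<9sIII", name, source_cursor, frame_payload, third)
    image[OFF_FIRST_ENTRY:OFF_FIRST_ENTRY + ENTRY_SIZE] = entry
    struct.pack_into("<H", image, OFF_DEPTH, 1)
    struct.pack_into("<H", image, OFF_CURRENT_LINE, line)
```

An external injector would still need a way to read and write the emulated process memory, which this sketch deliberately leaves out.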
Why this is weaker for stable `RUN` / step:
- the hook explicitly pops one current-entry record on exit
- if the normal push path is still absent, an external injector would need to re-seed repeatedly or patch around the unwind behavior
- blind injection without borrowing real live pointers is likely to crash when the formatter or hook dereferences the payload lanes
So yes, **injecting live data into a running DOSBox game could work**, but only if the injector is aware of the live interpreter context and not just writing guessed constants.
#### C. Offline-only tricks such as `.unk` rewriting or line-number injection
This is useful, but it does not solve the seeding problem by itself.
- better `.unk` files improve source loading
- line-number injection would improve current-line fidelity in Regret
- neither one creates the live source cursor / frame payload snapshot the debugger actually consumes
So offline-only work is supportive, not sufficient.
### 15. What an in-process patch would actually do
The safest current model is **not** `hex-edit the EXE until it behaves`. It is `restore one missing debugger-side serialization step at a point where the game already has all required live data`.
The candidate in-process patch is small in *behavioral* scope even if it still has to be handled carefully at the byte level:
- leave the existing debugger object/bootstrap path alone
- leave the existing UI/event/gump code alone
- leave the existing interpreter context creation path alone
- add or re-enable one call into `usecode_debugger_break_state_push_current_entry` or `...push_current_entry_copy` at a point where the current unit name and the three payload dwords already exist live
In concrete terms, the best current patch window is the already-recovered handoff around:
- `Usecode_ItemCallEvent`
- `13f8:0eec`
- `usecode_interpreter_context_create`
- `usecode_debugger_interpreter_hook`
That is the narrowest place where all of these are already true at once:
- a live interpreter/process object exists
- the source-stream cursor is already computed
- the frame/evaluation payload lane already exists
- the debugger object global can already be tested
So an in-process patch would most likely mean one of these two designs:
1. restore a missing direct push call near context creation or immediately before the debugger hook runs
2. synthesize the equivalent `0x15`-byte record in-place and then increment debugger depth exactly once per entered debugged frame
The first design is strongly preferred because it reuses the shipped helper and keeps the later pop/unwind behavior matched.
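The push/pop balance argument can be checked with a toy model. The Python sketch below is an illustration under stated assumptions: the class and function names are hypothetical, and raising on an empty pop is a modelling choice to make imbalance visible, not a claim about what the shipped pop helper does at depth zero.

```python
class BreakStateModel:
    """Toy stand-in for the current-entry stack in the break-state object."""

    def __init__(self):
        self.entries = []

    def push_current_entry(self, entry):
        self.entries.append(entry)

    def pop_current_entry(self):
        if not self.entries:
            raise IndexError("pop at depth 0: push/pop imbalance")
        return self.entries.pop()


def run_frame(state: BreakStateModel, entry, body, seed: bool = True):
    """Model one wrapped hook invocation: optional push, body, one pop on unwind."""
    if seed:
        state.push_current_entry(entry)
    try:
        body()
    finally:
        # The hook is described as popping exactly one entry on exit.
        state.pop_current_entry()
```

With `seed=True` nested frames return the stack to its starting depth. With `seed=False` the unwind pop has nothing to remove, which is the failure shape the missing push would produce.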
### 16. Why this should be done in Ghidra on a writable patch target, not by blind hex editing
The user's earlier failures with larger EXE edits fit the current structural risks.
What makes blind hex editing risky here:
- 16-bit NE code is tightly laid out and often table-driven
- some candidate lanes already show overlap/thunk-like behavior
- changing instruction length casually can invalidate nearby control flow or jump tables
- editing the main reference executable directly makes it too easy to lose track of which behavior is original versus experimental
What makes a Ghidra-side patch workflow safer:
- patch bytes can be applied at a studied address with nearby disassembly context preserved
- reanalysis can show whether the patched basic block still disassembles coherently
- comments can record intent directly at the patch site
- the experiment can be kept on a dedicated writable copy instead of the reference executable
So yes, Ghidra can be used for this kind of patching, and the MCP layer also exposes byte patching, but the current project rules still matter:
- **do not patch `REGRET.EXE` directly as the reference binary**
- only patch an explicitly writable copy, normally under a dedicated writable target folder
- study the exact callsite and byte budget first
The right mental model is `surgical patch on a writable clone`, not `edit the shipped EXE in place`.
### 17. Which debugger features depend on seeding, and which do not
The recovered event dispatcher at `1398:1df3` now makes the dependency split much clearer.
Features that are mostly just break-state/UI flags:
- `RUN` clears `+0x74/+0x75` and resumes through the existing callback path
- `break next` sets the break-next latch by writing `+0x74 = 1` with `+0x75 = 0`
- `single step` calls `usecode_debugger_break_state_enable_single_step`
- source-file open, goto-line, search, and breakpoint-table editing are mainly source-buffer/UI features once the debugger is open
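The two flag-only commands can be written out as plain byte writes. This Python sketch models only what the list above states for `RUN` and `break next`; the function names are hypothetical, and single-step is left out because the internals of `usecode_debugger_break_state_enable_single_step` are not described here.

```python
OFF_FLAG_74 = 0x74
OFF_FLAG_75 = 0x75


def apply_run(image: bytearray) -> None:
    """RUN: clear both runtime break flag bytes."""
    image[OFF_FLAG_74] = 0
    image[OFF_FLAG_75] = 0


def apply_break_next(image: bytearray) -> None:
    """break next: set +0x74 to 1 and clear +0x75."""
    image[OFF_FLAG_74] = 1
    image[OFF_FLAG_75] = 0
```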
Features that clearly depend on seeded current-entry runtime payload:
- `Inspect what?`
- `Watch what?`
- current-unit open centered on the correct unit/line
- any resume/step flow that expects the debugger to know the active frame context
Why watch/inspect depend on seeding:
- the dispatcher routes both commands through the debugger object's formatter callback
- `usecode_debugger_format_expression_to_shared_buffer` resolves the top current-entry record from the break-state stack
- it passes entry `+0x09` and `+0x0d` into `FUN_1398_045c`
- that helper dereferences those values as live descriptor/source and frame/evaluation context
So watch and inspect are not a workaround for missing seeding. They are actually one of the strongest proofs that seeding must be correct.
The `Global name` command is different.
What case `0x0f` in the dispatcher currently appears to do:
- prompt for a global symbol name
- resolve that symbol through `FUN_13e8_039f`
- copy the current bytes of the resolved global into a temporary buffer
- prompt for replacement bytes
- validate each entered byte is within `0x00..0xff`
- write the replacement bytes back into the live backing buffer
That means `change globals` looks comparatively independent of callstack seeding once the debugger UI itself is alive. It edits resolved global data by symbol name rather than by current frame context.
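The validate-then-write flow for `Global name` can be sketched as follows. The input format (space-separated hexadecimal tokens) and both function names are assumptions for illustration; only the `0x00..0xff` range check and the write-back into the live backing buffer come from the recovered behavior described above.

```python
def parse_replacement_bytes(text: str) -> bytes:
    """Parse replacement bytes, rejecting any value outside 0x00..0xff."""
    values = []
    for token in text.split():
        value = int(token, 16)  # raises ValueError on non-hex input
        if not 0x00 <= value <= 0xFF:
            raise ValueError(f"byte out of range: {token}")
        values.append(value)
    return bytes(values)


def write_global(backing: bytearray, offset: int, replacement: bytes) -> None:
    """Write validated bytes back into the resolved global's backing buffer."""
    if offset < 0 or offset + len(replacement) > len(backing):
        raise ValueError("replacement does not fit in the backing buffer")
    backing[offset:offset + len(replacement)] = replacement
```

Nothing in this flow touches the current-entry stack, which matches the observation that the command is comparatively independent of seeding.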
Practical implication:
- if the debugger can be opened at all, `change globals` may be one of the first useful commands to test
- `watch` and `inspect` are later-stage validation targets because they are stricter consumers of the seeded runtime payload
### 18. Do the other debugger features help us get debug functionality working?
Yes, but mostly as **validation targets**, not as bootstrap mechanisms.
Most useful feature signals after a bring-up patch:
1. `open current unit` works and loads the right `.unk`
2. source navigation, search, goto-line, and breakpoint toggles work without crashing
3. `change globals` can resolve and write a known symbol safely
4. `watch` and `inspect` return coherent values instead of crashing or producing garbage
5. `RUN`, `break next`, and `single step` survive at least one execution/unwind cycle
Those checks tell us progressively more:
- checks 1-3 prove the UI, file loading, symbol lookup, and non-frame-sensitive debugger features are alive
- checks 4-5 prove the current-entry seeding is actually coherent enough for real runtime debugging
So the extra debugger commands help a lot in narrowing validation order, but they do not remove the need to solve seeding first.
### 19. Most likely path to getting working debug functionality in No Regret
Current best path, in order of likelihood and engineering sanity:
1. Create a writable Regret patch target and keep the reference `REGRET.EXE` untouched.
2. Identify the smallest concrete callsite near `13f8:0eec -> usecode_interpreter_context_create -> usecode_debugger_interpreter_hook` where a debugger push can be added with minimal byte disruption.
3. Prefer a patch that calls the shipped `usecode_debugger_break_state_push_current_entry` helper rather than reimplementing the record layout by hand.
4. If a direct call will not fit cleanly, use a tiny detour/trampoline into spare writable code space on the patch target rather than forcing a larger inline rewrite.
5. Validate the patch first by opening the debugger and exercising file open, goto-line, search, and `change globals`.
6. Only then test `watch` and `inspect`, because they are the first strong consumers of the seeded runtime payload.
7. Only after those work, test `RUN`, then `break next`, then `single step`, because the pop-on-unwind behavior means resume features are stricter than one-shot UI success.
What is *less* likely to be the winning path:
- offline `.unk` or line-number work alone
- a large multi-site executable rewrite
- blind direct hex editing of the reference binary
- an external helper unless it is effectively a smart live-memory injector that reads the real interpreter context
So the most likely route to working debug functionality is:
- **small in-process patch on a writable Regret copy**
- **reuse shipped helper(s) and existing VM context data**
- **validate in stages, with `change globals` before `watch/inspect`, and `watch/inspect` before full resume/step**
### 20. First writable-patch prototype applied to `REGRET-PATCHED.EXE`
As of this pass, the writable target `/Writable/REGRET-PATCHED.EXE` has a first proof-of-concept patch in the live Ghidra database.
Applied patch shape:
- patched call site at `13f8:10fa`
- original `CALLF 0x13f0:038b`
- now redirected to local trampoline `13f8:2040`
Local trampoline behavior at `13f8:2040`:
- check debugger object global `1480:712c/712e`
- derive the current process base from the hook context pointer
- resolve the unit-name far pointer from `g_processNames`
- read the live interpreter context lanes at `+0xd6/+0xd8`, `+0xda/+0xdc`, and `+0xe1/+0xe3`
- call `usecode_debugger_break_state_push_current_entry`
- tail into the original `usecode_debugger_interpreter_hook`
This is intentionally narrow. It does **not** try to rewrite the debugger UI, breakpoint tables, source loader, or watch/inspect formatter. It only restores one likely missing seeding step before the original hook runs.
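The trampoline's control flow can be modelled at a high level. The Python sketch below is not the patch bytes: the dataclasses and function are hypothetical stand-ins, the unit name is passed in directly rather than resolved through `g_processNames`, and skipping the push when the debugger global is null while still calling the hook is an assumption about what the check leads to.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Entry = Tuple[str, int, int, int]


@dataclass
class InterpreterContext:
    unit_name: str       # stands in for the g_processNames lookup
    source_cursor: int   # lane at +0xd6/+0xd8
    frame_payload: int   # lane at +0xda/+0xdc
    third_lane: int      # lane at +0xe1/+0xe3


@dataclass
class DebuggerObject:
    entries: List[Entry] = field(default_factory=list)


def trampoline(debugger: Optional[DebuggerObject], ctx: InterpreterContext,
               original_hook: Callable):
    """Seed one current-entry record, then hand off to the original hook."""
    if debugger is not None:
        debugger.entries.append(
            (ctx.unit_name, ctx.source_cursor, ctx.frame_payload, ctx.third_lane)
        )
    # Tail into the unmodified hook so its own unwind pop stays matched.
    return original_hook(debugger, ctx)
```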
Known remaining risk in this prototype:
- it assumes one current-entry push per wrapped hook invocation is the right balance for the later pop-on-unwind path
- it assumes the `g_processNames` entry is the correct unit-name source in this lane
- it assumes the third current-entry dword can be borrowed from `+0xe1/+0xe3` without harming current consumers
So this should be treated as a tightly scoped runtime experiment, not as a final debugger restoration.
### 21. What to test on the patched writable target
Recommended test order:
1. Verify the patched executable still starts at all.
2. Reproduce the same path that previously opened the hidden debugger with a loadable `.unk`.
3. Confirm the debugger window opens without an immediate crash.
4. Confirm the current-unit source file loads and the source pane is populated.
5. Test `Goto Line` and `Search for` first, because they exercise source-buffer/UI logic without depending on deep expression evaluation.
6. Test adding and removing a breakpoint marker in the source pane.
7. Test `Global name` on a harmless known symbol first, because this is the least callstack-sensitive mutation feature.
8. Test `Inspect what?` on a simple symbol/expression.
9. Test `Watch what?` with one watch entry.
10. Only after the above succeed, test `RUN`.
11. If `RUN` survives once, test `break next`.
12. Last, test `single step`.
What to note during testing:
- whether the debugger opens but still crashes only on `RUN`
- whether `Inspect` / `Watch` now produce values instead of crashing or returning obvious garbage
- whether the source view follows the current unit/line more coherently than before
- whether `Global name` is usable before the resume/step features are stable
- whether any crash now happens on unwind/return instead of on initial open, which would point at push/pop imbalance rather than missing seeding
### 22. The line-number injector idea is viable, but the real target is Regret, not retail Remorse
The new evidence makes that split much clearer.
What the current parser/extractor already shows:
- retail Remorse compiled usecode already carries line markers in the present parser pass: `977 / 977` events currently recover `LINE_NUMBER` ops
- the map-viewer decompiler already recognizes the opcode directly as `0x5b = LINE_NUMBER`
- current retail Regret compiled usecode is the stripped side in the present parser pass: `1152 / 1152` events currently recover `0` `LINE_NUMBER` ops
So the immediate correction is:
- a line-number injector is **not** mainly a retail Remorse need
- it is mainly a Regret recovery idea if the goal is to improve debugger/source correlation when loading usecode through `-u`
The idea itself is structurally plausible, but only if it is treated as a bytecode rewriter rather than as a text-side export trick.
Strongest viable shape:
1. decode existing compiled usecode into an IR that preserves exact opcode offsets and branch targets
2. attach desired line numbers to existing instruction boundaries from the pseudocode/IR mapping
3. inject `LINE_NUMBER` opcodes into the existing event streams
4. rewrite event sizes and any offset-based control-flow operands that shift because of the inserted bytes
5. rebuild a patched `EUSECODE.FLX` for `-u`
Why this is more than `just add debug markers`:
- even though the target opcode is simple, inserting bytes changes event layout
- any relative jump, switch, or offset-based control operand inside the event stream has to be rebased correctly
- the hard part is not inventing pseudocode again; it is preserving existing compiled behavior while changing stream length safely
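The rebasing problem can be sketched on a toy instruction stream. This is only an illustration under stated assumptions: `0x5b` is the `LINE_NUMBER` opcode named in these notes, but the 16-bit little-endian line operand, the `TOY_JMP` / `TOY_NOP` opcodes, and the "signed 16-bit displacement from the end of the jump" encoding are hypothetical placeholders, not the recovered Crusader usecode encoding.

```python
import struct
from dataclasses import dataclass
from typing import Optional

LINE_NUMBER = 0x5B  # opcode value taken from the notes
TOY_JMP = 0xF0      # hypothetical placeholder opcode, not a real usecode opcode
TOY_NOP = 0xF1      # hypothetical placeholder opcode, not a real usecode opcode


@dataclass
class Ins:
    offset: int                   # original byte offset inside the event
    opcode: int
    operand: bytes = b""
    target: Optional[int] = None  # original absolute jump target, if any

    @property
    def size(self) -> int:
        # Jumps carry a 2-byte relative displacement in this toy encoding.
        return 1 + (2 if self.target is not None else len(self.operand))


def inject_line_numbers(instrs, lines):
    """Insert LINE_NUMBER markers and rebase every relative jump.

    `lines` maps an original instruction offset to the line number that
    should be announced immediately before that instruction.
    """
    # Pass 1: lay out the new stream and record where each old offset lands.
    new_offset = {}
    layout = []
    pos = 0
    for ins in instrs:
        new_offset[ins.offset] = pos  # jumps land on the marker when one exists
        if ins.offset in lines:
            marker = Ins(ins.offset, LINE_NUMBER,
                         struct.pack("<H", lines[ins.offset]))
            layout.append((pos, marker))
            pos += marker.size
        layout.append((pos, ins))
        pos += ins.size

    # Pass 2: encode, recomputing each displacement against the new layout.
    out = bytearray()
    for pos, ins in layout:
        out.append(ins.opcode)
        if ins.target is not None:
            rel = new_offset[ins.target] - (pos + ins.size)
            out += struct.pack("<h", rel)
        else:
            out += ins.operand
    return bytes(out)


program = [
    Ins(0, TOY_NOP),
    Ins(1, TOY_JMP, target=5),
    Ins(4, TOY_NOP),
    Ins(5, TOY_NOP),
]
patched = inject_line_numbers(program, {0: 10, 4: 20})
```

In this toy case the jump displacement grows from 1 to 4 because a three-byte marker was inserted between the jump and its target. A real rewriter would also have to update event sizes and any switch or table operands, which this sketch does not attempt.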
Current best use for that tool if it is ever built:
- improve Regret line fidelity inside the debugger for `-u`-loaded compiled usecode
- potentially give breakpoints/current-line display a better match than the current synthetic floor
Current limit of that idea:
- even a perfect line-number injector would still not replace the missing interpreter-seeded current-entry stack
- it can improve source correlation, but it does not solve the runtime seeding problem by itself
### 23. Failed raw helper-host experiments to avoid repeating
These are the patch shapes that were tried and should be treated as dead ends for now:
- `13f8:2040..20b9` as a startup-path runtime helper: caused startup aborts and GPFs when booted through the interpreter hook path, including `Load program failed -- Error code 201`, `Abort: IRET: Illegal descriptor type 0x0`, and `General protection fault detected`.
- `13f8:20dd..2157` as a second helper host: allowed the game to reach startup, but then it immediately exited to the `No Pity. No Mercy. No Regret.` line, so it is also not a safe host.
- any variant that detours `13f8:10fa` away from retail at startup: the safe read is now that the startup wrapper should stay retail, because the runtime helper experiments are what destabilized boot.
- any variant that tries to keep the runtime helper logic alive inside segment `13f8` without first proving the target region is truly unused: those edits are too close to live startup code in this build.
Current safe patch shape:
- keep the startup wrapper retail
- keep the `loosecannon` trigger retargets only
- use the debugger bootstrap/open-modal path from the hidden cheat lane only
- do not re-enable the runtime-helper branch until a genuinely unused function-sized host region is identified
## Current Cross-Build Implications
### What this changes
@ -1017,12 +1530,25 @@ Because no sample `.unk` corpus is currently present, the best next move is not
That reconstruction step is now far enough along to try in practice.
The first generator, `tools/generate_usecode_unk.py`, walks an extracted Crusader USECODE root, parses each class body through the existing local usecode decompiler, and writes one synthesized `<unit>.unk` file per class name.
That Python generator established the first manufactured corpus, but the JavaScript exporter in the map-viewer project is now the version that should be used for development and testing going forward.
Current active workflow:
- authoritative development path = `Crusader-Map-Viewer/map_renderer/src/lib/usecode-unk-exporter.js`
- cache pipeline owner = `Crusader-Map-Viewer/map_renderer/src/lib/usecode-decompiler.js`
- JS-built `.unk` outputs now live in the map-viewer cache under `.cache/usecode/<game>/<source>/.data/`, next to the decompiled pseudocode tree
- Python generator remains useful as historical reference and fallback, but ongoing `.unk` iteration should be validated against the JavaScript exporter output
Operational note:
- the map-viewer `build-usecode` command only refreshes `.cache/usecode/...` output
- the intended development/testing corpus is the cache-local `.data` output, not a copy written back into the Crusader repo
- this matters for debugger bring-up because the old cache layout mixed `.unk` files into the pseudocode tree and encouraged the wrong sync workflow
Current generated Regret corpus:
- output root: `.cache/usecode/regret/EUSECODE/.data`
- manifest: `.cache/usecode/regret/EUSECODE/.data/SYNTH_UNK_MANIFEST.tsv`
- synthesized files written: `477`
Current shape of each generated `.unk` file:
@ -1036,12 +1562,31 @@ Current practical limitation:
- in the present Regret extracted corpus, the parsed class bodies examined so far appear to have `0` surviving `line_number` op markers
- that means the generated Regret `.unk` files currently behave mostly as readable reconstructed source appendices, not as truly line-accurate original debug-source replicas
Current loader-side constraint that matters for the crash:
- live decompile of `1398:2f4f` shows the debugger line splitter writes `0` to the byte immediately before each `\n`
- that means a file must not begin with a bare newline, or the splitter writes its `0` one byte before the start of the buffer
- it also strongly indicates the debugger expects DOS-style `CRLF` line endings so the overwritten byte is the `\r`, not the last visible character of the line
- this makes the previous sparse-output shape unsafe for real debugger bring-up whenever the file began with blank padding or with an appendix-only body
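The constraint above can be modelled offline. The following is a minimal sketch, not the exporter's actual code: `simulate_line_splitter` only mirrors the described behaviour of `1398:2f4f` (zero the byte before each `\n`), and `to_debugger_safe` is a hypothetical helper name. The ASCII encoding with replacement is an assumption about what the debugger can display.

```python
def to_debugger_safe(text: str) -> bytes:
    """Normalize any newline style to CRLF and guarantee a final terminator."""
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    if not normalized.endswith("\n"):
        normalized += "\n"
    # Assumption: plain ASCII is the safest payload for the DOS-era viewer.
    return normalized.replace("\n", "\r\n").encode("ascii", errors="replace")


def simulate_line_splitter(buf: bytes) -> list:
    """Model of the described splitter: write 0 to the byte before each LF."""
    data = bytearray(buf)
    for i, value in enumerate(data):
        if value == 0x0A:
            if i == 0:
                # The real code would write one byte before the buffer start.
                raise ValueError("LF at offset 0: write before buffer start")
            data[i - 1] = 0
    chunks = bytes(data).split(b"\n")
    if chunks and chunks[-1] == b"":
        chunks.pop()
    # Each line is what a C-string reader would see up to the first NUL.
    return [chunk.split(b"\0", 1)[0] for chunk in chunks]


safe = to_debugger_safe("\nfoo\nbar")
visible_crlf = simulate_line_splitter(safe)
visible_lf = simulate_line_splitter(b"foo\nbar\n")
```

With CRLF input the zero lands on the `\r`, so a leading blank line survives as an empty line and line numbering is preserved. With bare LF input the model shows each line losing its last visible character.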
Current executable-side `RUN` findings:
- `usecode_debugger_handle_event` case `3` is the real `RUN` path; it does **not** execute or parse source text, it clears the debugger object's runtime break latches and resumes through the gump callback
- the post-resume debugger path still depends on source-line identity: `usecode_debugger_open_for_current_unit` recenters the viewer on debugger-state line `+0x72 - 1`, and the source panel / breakpoint toggles compare against loaded file path plus 1-based line numbers
- shipped retail Remorse usecode is fully line-mapped in the current parser: `977 / 977` events contain `LINE_NUMBER` ops, with a highest recovered line number of `2991`
- current retail Regret `EUSECODE.FLX` does **not** yield any recovered `LINE_NUMBER` ops in the present parser pass: `1152 / 1152` events currently show `0` `LINE_NUMBER` ops total
- because the debugger still drives by runtime line numbers even when the extracted Regret source lacks recovered line markers, a cache-built Regret `.unk` needs a synthetic dense line table floor if it is going to participate in `RUN` / break-next / single-step flows at all
- but the file is only one requirement: `usecode_debugger_open_modal` merely creates the debugger gump, while `usecode_debugger_break_state_create` initializes the current-entry depth at `+0x7a = 0`; a truly runnable session still needs a real current-entry record and current line, not just a loadable `.unk`
- current live result after adding the synthetic line floor: the `.unk` now opens cleanly, but `RUN` still crashes, which is consistent with the remaining missing debugger-state requirement rather than with a remaining text-file parser failure
That limitation matters for breakpoint fidelity, but it does **not** block the basic experiment the user actually needs next:
- the debugger's file loader only needs a line-oriented text file to open and display
- these synthesized `.unk` files now provide that text payload in the exact filename family the debugger expects
- the rebuilt Regret cache corpus currently stays well under the loader's average-line-length rejection threshold, with the highest sampled file at about `50.74` bytes per parsed line versus the fail cutoff at `0x84`
- the JS exporter now pads Regret classes that have no recovered debug lines with a synthetic `2991`-line scaffold, capped so the total indexed line table stays within the debugger's `5999`-line source limit before the appendix is appended
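A generated file can be pre-checked against the two limits quoted above before it is copied next to the executable. This is a conservative sketch: the `0x84` cutoff and the `5999`-line limit come from these notes, but counting lines by `\n`, dividing total file bytes by that count, and applying the line limit to the whole file (rather than to the indexed table before the appendix) are assumptions about how the loader measures them.

```python
AVG_LINE_FAIL_CUTOFF = 0x84  # fail cutoff quoted in the notes (132 bytes/line)
MAX_SOURCE_LINES = 5999      # source line limit quoted in the notes


def check_unk_limits(payload: bytes) -> dict:
    """Report line count and average line length for one .unk payload."""
    line_count = payload.count(b"\n")
    average = len(payload) / line_count if line_count else float("inf")
    return {
        "lines": line_count,
        "avg_bytes_per_line": average,
        # Assumption: the loader rejects at or above the cutoff.
        "avg_ok": average < AVG_LINE_FAIL_CUTOFF,
        # Assumption: whole-file count, stricter than the indexed-table rule.
        "lines_ok": line_count <= MAX_SOURCE_LINES,
    }


report = check_unk_limits(b"abc\r\n" * 10)
```

Running this over every file in the cache-local `.data` directory would flag outliers before a debugger session is attempted.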
So the current live experiment surface is no longer hypothetical: the patched Regret build can now be tested directly against manufactured per-unit `.unk` files copied from the map-viewer cache under `.cache/usecode/regret/EUSECODE/.data`.
### 8. Best current synthesis
@ -1059,6 +1604,310 @@ That model matches all of the currently recovered evidence:
- the very small size of `UNKDS.DAT`
- and the complete absence of any stock `.unk` files in the current workspace
## What Proper Debugger State Requires
The file-format work answered only the loader side of the problem. The deeper runtime pass now shows that a loadable `.unk` file is necessary, but it is **not** sufficient for a stable debugger session.
Current safest model:
- the debugger has one global break-state object at `1480:712c/712e`
- the source viewer reads text from the loaded `.unk` file through `DAT_1480_64ec/64ee`
- the debugger runtime state that drives `RUN`, `break next`, `single step`, current-unit open, watches, and inspect is stored in the break-state object, not in the `.unk` file
- `usecode_debugger_open_modal` only creates and displays the debugger window
- `usecode_debugger_open_for_current_unit` is stronger because it expects a valid current-entry stack and current line, resolves the unit name from that stack, and then loads the matching `.unk`
So the current Regret result is now split cleanly into two separate requirements:
1. text/source requirement: loadable `.unk` file with a usable line table
2. runtime-context requirement: valid break-state current-entry records plus current line and break/step control state
The current JS exporter work now satisfies the first requirement well enough for file open and source display. The remaining blocker for `RUN` is the second requirement.
### 1. What lives in the break-state object
The break-state constructor at `13e0:0000` zeroes and initializes the object, including the debugger-current-entry depth at `+0x7a`.
Current verified fields that matter most:
- `+0x72` = current source line, stored zero-based by `usecode_debugger_break_state_update_line_and_maybe_break`
- `+0x74` = break-next style runtime latch
- `+0x75` = single-step latch
- `+0x76/+0x78` = step countdown / timing words used by the step logic
- `+0x7a` = current-entry depth/count
- `+0x7c ...` = current-entry stack, one entry per frame-like debugger context
Current-entry pointer rule:
- `usecode_debugger_break_state_get_current_entry` returns `break_state + depth * 0x15 + 0x67`
- with depth `1`, that lands at `break_state + 0x7c`, which is the first current-entry record
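The pointer rule is plain arithmetic and can be checked directly. The constants below are the ones stated above; the function name is only illustrative.

```python
ENTRY_SIZE = 0x15  # bytes per current-entry record
ENTRY_BIAS = 0x67  # constant term in the recovered pointer rule
STACK_BASE = 0x7C  # first current-entry record


def current_entry_offset(depth: int) -> int:
    """Offset, relative to the break-state object, of the current entry."""
    return depth * ENTRY_SIZE + ENTRY_BIAS


first = current_entry_offset(1)
second = current_entry_offset(2)
unseeded = current_entry_offset(0)
```

By the same arithmetic, a depth of `0` resolves to `+0x67`, so the `0x15` bytes returned would span `+0x67 .. +0x7b` and overlap the line, latch, and depth fields listed above rather than a real record. That may be one reason an unseeded session behaves badly, although the notes have not confirmed it.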
### 2. Current-entry record layout
The record push helper at `13e0:02f5` and the copy helper at `13e0:03b0` now make the current-entry shape visible enough to describe.
Current safest layout for one current-entry record (`0x15` bytes total):
- `+0x00 .. +0x08` = unit name string, max 8 chars plus terminating `0`
- `+0x09 .. +0x14` = three packed runtime dwords copied from live interpreter context
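The layout maps naturally onto a packed little-endian structure. This is a minimal sketch for offline experiments: the field sizes follow the layout above, while treating the three dwords as opaque unsigned 32-bit values is an assumption, since their semantics are not decoded yet.

```python
import struct

# 9-byte NUL-padded name followed by three little-endian dwords, no padding.
ENTRY = struct.Struct("<9s3I")
assert ENTRY.size == 0x15


def pack_entry(unit_name: str, dwords) -> bytes:
    """Build one 0x15-byte current-entry record."""
    raw = unit_name.encode("ascii")
    if len(raw) > 8:
        raise ValueError("unit name is limited to 8 characters")
    if len(dwords) != 3:
        raise ValueError("exactly three runtime dwords are required")
    return ENTRY.pack(raw, *dwords)  # struct pads the name with NUL bytes


def unpack_entry(record: bytes):
    """Split one record back into (unit_name, (dword0, dword1, dword2))."""
    name, d0, d1, d2 = ENTRY.unpack(record)
    return name.split(b"\0", 1)[0].decode("ascii"), (d0, d1, d2)


record = pack_entry("TESTUNIT", (1, 2, 3))
```

`TESTUNIT` and the dword values are placeholders, not captured data.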
What is firmly verified about those packed fields:
- they are **not** just cosmetic line metadata
- `13e0:02f5` pushes them as three dwords after the unit string and increments `+0x7a`
- `1398:045c` consumes at least the first two packed dwords when evaluating watch / inspect expressions, which means those fields carry real runtime context needed for symbol/value resolution
- `13e0:03f7` decrements `+0x7a` on interpreter-side unwind, so the current-entry stack is meant to track live nested interpreter context rather than merely loaded source files
Current safest interpretation:
- the current-entry record is a live interpreter/debug frame descriptor: unit name plus several runtime context pointers/handles
- it is intended to be pushed on entry to a debugger-relevant usecode context and popped on exit
### 3. Who seeds the records
This answer is now partly verified and partly inferred.
What is directly verified:
- `usecode_debugger_bootstrap_init` seeds the **object itself** and stores it into `1480:712c/712e`
- `13e0:02f5` seeds a **new current-entry record** from `(unit name, three runtime dwords)`
- `13e0:03b0` seeds a **new current-entry record** by copying an already-built `0x15`-byte record wholesale
- `13e0:03f7` pops one current-entry record on interpreter-side unwind
- `usecode_debugger_interpreter_hook` is the live interpreter-side consumer that owns the pop on exit
What is not yet fully recovered as a normal xref in the current MCP session:
- the exact direct callsite that invokes `13e0:02f5` or `13e0:03b0`
Current best evidence-based conclusion anyway:
- the record seeding is almost certainly performed by interpreter-side instrumentation around real usecode entry, not by the source loader and not by the modal-open wrapper
- the push/pop pairing strongly points to a runtime-enter / runtime-leave model: enter usecode context -> push record -> run -> unwind -> pop record
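The inferred enter/leave model can be expressed as a small simulation, which is also a convenient way to reason about push/pop imbalance. Everything here is a model of the hypothesis above, not recovered code: the class and function names are invented, and raising on a pop at depth `0` is a modelling choice, since the real behaviour of `13e0:03f7` in that case is not established.

```python
from contextlib import contextmanager


class CurrentEntryStackModel:
    """Toy model of the current-entry stack tracked by depth field +0x7a."""

    def __init__(self):
        self.entries = []

    @property
    def depth(self) -> int:
        return len(self.entries)

    def push(self, unit_name, dwords):
        # Models 13e0:02f5: append one record and increment the depth.
        self.entries.append((unit_name, tuple(dwords)))

    def pop(self):
        # Models 13e0:03f7: decrement the depth on interpreter-side unwind.
        if not self.entries:
            raise RuntimeError("pop at depth 0: push/pop imbalance")
        return self.entries.pop()


@contextmanager
def usecode_context(stack, unit_name, dwords):
    """Enter usecode context -> push -> run -> unwind -> pop."""
    stack.push(unit_name, dwords)
    try:
        yield stack
    finally:
        stack.pop()


stack = CurrentEntryStackModel()
observed = []
with usecode_context(stack, "OUTER", (0, 0, 0)):
    with usecode_context(stack, "INNER", (0, 0, 0)):
        observed.append(stack.depth)
    observed.append(stack.depth)
observed.append(stack.depth)
```

A wrapper that pushes without a matching pop, or the reverse, shows up in this model as a depth that never returns to zero or as the imbalance error.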
So the current best answer to `who seeds the records?` is:
> the debugger bootstrap seeds the object, but a separate interpreter-side runtime path seeds the current-entry records. The exact static caller of `13e0:02f5` / `13e0:03b0` is still not recovered cleanly in the present database, but the runtime ownership is clearly on the interpreter side rather than on the file-loader side.
### 4. Why `open_modal` is not enough
This is now the key practical distinction.
`usecode_debugger_open_modal`:
- creates the debugger gump
- shows the debugger UI
- does **not** load a current unit automatically
- does **not** seed a current-entry record stack
`usecode_debugger_open_for_current_unit`:
- creates the debugger gump
- reads the current-entry record from the break-state object
- derives the unit name from that record
- loads the matching `.unk`
- recenters the view around break-state line `+0x72 - 1`
So the current crash after `RUN` is fully consistent with this model:
- the `.unk` now opens and displays because the file requirement is satisfied
- but the debugger session still lacks the same runtime current-entry state that a real interpreter-driven break would provide
- therefore `RUN` can still resume into an invalid or half-seeded debugger context and crash
### 5. What can be done manually right now
There are now three manual tiers, each with different reliability.
#### Tier A: menu-only bring-up
This is the already-validated state.
Minimum manual state:
1. ensure `1480:712c/712e` points at a valid debugger object
2. load a debugger-safe `.unk` file for the desired unit
3. open the debugger gump
This is enough for:
- opening the debugger window
- opening source text
- browsing text and using some viewer-side commands
It is **not** enough for:
- stable `RUN`
- stable single-step / break-next
- reliable watch / inspect value evaluation
#### Tier B: manually seeded synthetic current-entry session
This is the first plausible manual path toward a truly runnable session.
Required manual state:
1. bootstrap the debugger object if `1480:712c/712e` is null
2. set `break_state + 0x7a = 1`
3. write one current-entry record at `break_state + 0x7c`
4. set `break_state + 0x72` to the current source line minus one
5. optionally arm `+0x75` or `+0x74`
6. open via `usecode_debugger_open_for_current_unit`, not only `open_modal`
Minimum record contents for that experiment:
- unit name at record `+0x00`
- three packed runtime dwords at record `+0x09 .. +0x14`
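The Tier B writes can be rehearsed against an offline byte image of the break-state object before touching a live process. This sketch makes several assumptions that must be confirmed against the decompile: `+0x72` and `+0x7a` are treated as 16-bit little-endian words because of the spacing between the listed offsets, a latch is treated as armed when its byte is nonzero, and the `0x100`-byte image size is arbitrary. How the bytes reach the running game is out of scope here.

```python
import struct


def seed_break_state(image: bytearray, unit_name: str, dwords,
                     source_line: int, single_step: bool = False) -> None:
    """Apply the Tier B field writes to an in-memory image of the object."""
    raw_name = unit_name.encode("ascii")
    if len(raw_name) > 8:
        raise ValueError("unit name is limited to 8 characters")
    # Step 2: current-entry depth = 1 (assumed 16-bit field).
    struct.pack_into("<H", image, 0x7A, 1)
    # Step 3: one 0x15-byte record at the first stack slot.
    struct.pack_into("<9s3I", image, 0x7C, raw_name, *dwords)
    # Step 4: current source line minus one (assumed 16-bit field).
    struct.pack_into("<H", image, 0x72, source_line - 1)
    # Step 5: optionally arm the single-step latch (assumed nonzero = armed).
    if single_step:
        image[0x75] = 1


image = bytearray(0x100)  # hypothetical size, large enough for the fields used
seed_break_state(image, "UNIT",
                 (0x11111111, 0x22222222, 0x33333333),  # placeholder dwords
                 source_line=42, single_step=True)
```

The placeholder dwords are exactly the kind of invented values the limitation below warns about; the sketch only fixes the mechanics so that captured values can be dropped in later.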
Main limitation:
- we do not yet have a fully decoded semantic map for the three packed dwords
- at least some of them are consumed by expression evaluation and probably by other debugger-context code
- so zero-filling them is unlikely to give a fully stable session
Current safest read:
- manual synthetic seeding is plausible, but it needs **captured real runtime values** from a live interpreter context, not invented placeholders
#### Tier C: let the interpreter seed the record naturally, then open the debugger
This is now the strongest practical route.
Desired sequence:
1. bootstrap the debugger object
2. arm break-next or single-step in the object
3. allow one real usecode/interpreter pass to run
4. let the interpreter-side debugger path push a genuine current-entry record
5. open through the normal current-unit path or let the slot-`0` auto-open path fire
Why this is stronger than manual record fabrication:
- the runtime itself provides the three packed context dwords
- the current line is seeded by the real update path
- watch / inspect / resume have the best chance of seeing coherent context
### 6. Best current manual recipe
If the immediate goal is `make the debugger work properly`, the best current manual recipe is no longer `generate a better .unk` by itself. The better recipe is:
1. keep the current JS exporter and synthetic Regret line floor so source load stays stable
2. capture one real current-entry record from a live interpreter-side seed point
3. reuse those captured values either:
- by manual object seeding, or
- by patching the launcher so the interpreter seeds them naturally before open
Current candidate capture points:
- breakpoint on `13e0:02f5`
- breakpoint on `13e0:03b0`
- breakpoint on `13e0:03f7` to see the matching pop side
- breakpoint on `13e0:0053` if a real break path can be reached
What to record when one of those hits:
- debugger object pointer (`1480:712c/712e`)
- `+0x72`
- `+0x7a`
- the full `0x15` bytes at the current-entry record start
- the three packed dwords separately
- the unit name string
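Once a breakpoint hit has been dumped to bytes, the items in that list can be pulled out mechanically. The decoder below is a sketch under the same assumptions as the field list earlier in these notes: 16-bit little-endian words at `+0x72` and `+0x7a` (inferred from offset spacing, not verified), and the recovered pointer rule for the current entry. `decode_snapshot` is a hypothetical helper name.

```python
import struct


def decode_snapshot(image: bytes) -> dict:
    """Extract the fields worth recording from a break-state memory dump."""
    (line_field,) = struct.unpack_from("<H", image, 0x72)
    (depth,) = struct.unpack_from("<H", image, 0x7A)
    entry_offset = depth * 0x15 + 0x67  # recovered current-entry pointer rule
    raw_entry = bytes(image[entry_offset:entry_offset + 0x15])
    name, d0, d1, d2 = struct.unpack("<9s3I", raw_entry)
    return {
        "line_field_0x72": line_field,
        "depth_0x7a": depth,
        "entry_offset": entry_offset,
        "entry_hex": raw_entry.hex(),
        "unit_name": name.split(b"\0", 1)[0].decode("ascii", errors="replace"),
        "dwords": [f"{value:08x}" for value in (d0, d1, d2)],
    }
```

Logging the result at each of the candidate capture points gives directly comparable records, which should make it easier to see which of the three dwords change between units and which stay fixed.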
That would immediately answer the remaining unknowns that static decompilation has not resolved cleanly.
## Workable Plan
The path to a fully working debugger session now looks like this.
### Phase 1. Finish the record-seeding recovery
Goal:
- confirm the exact interpreter-side caller that reaches `13e0:02f5` / `13e0:03b0`
Concrete actions:
1. use runtime breakpoints on `13e0:02f5`, `13e0:03b0`, and `13e0:03f7`
2. trigger ordinary usecode activity while the debugger object is live
3. capture caller addresses and arguments
4. map the three packed dwords to concrete runtime structures
Deliverable:
- one verified record-seeding path with argument semantics
### Phase 2. Prove a coherent manual seed
Goal:
- show that a manually seeded break-state can survive `RUN`
Concrete actions:
1. bootstrap the debugger object
2. write one captured real current-entry record into `break_state + 0x7c`
3. set `+0x7a = 1`
4. set `+0x72` to a matching line
5. open with `usecode_debugger_open_for_current_unit`
6. test `RUN`, then `single step`, then `watch`
Deliverable:
- one working manual debugger session without relying on the exact hidden launcher path
### Phase 3. Replace manual seeding with a real launch path
Goal:
- stop fabricating debugger context and let the runtime produce it naturally
Concrete actions:
1. patch a hidden input lane only far enough to bootstrap the object and arm break-next / single-step
2. do **not** open only through `open_modal`
3. let the next real interpreter pass seed the current-entry record stack
4. open through `usecode_debugger_open_for_current_unit` or the slot-`0` auto-open path
Deliverable:
- the smallest reproducible launcher patch that yields a coherent debugger context
### Phase 4. Tighten the source side only after runtime seeding is solved
Goal:
- improve source fidelity once the runtime context path is stable
Concrete actions:
1. keep the current synthetic Regret line floor for stability
2. continue investigating whether Regret still contains recoverable original line markers through another parser pass or another source of metadata
3. only after runtime stability, revisit whether the `.unk` needs better line correlation for breakpoint fidelity
Deliverable:
- stable debugger first, better line fidelity second
## Bottom Line
The investigation has now advanced beyond `what file will the debugger open?`.
Current safest answer to `what does it take for the debugger to work properly?` is:
- a debugger-safe `.unk` file
- a live debugger object
- a valid current-entry record stack seeded from real interpreter context
- a valid current line in `+0x72`
- and only then `RUN` / step / inspect / watch have a defensible chance to behave properly
Current safest answer to `who seeds the records?` is:
- bootstrap seeds the object
- interpreter-side runtime logic seeds the current-entry records
- the exact push caller is still not fully recovered as a named xref, but the push/pop ownership is now clearly on the interpreter side, not on the source loader side
Current safest answer to `how can we do it manually?` is:
- not by `.unk` text alone
- by either capturing and writing one real current-entry record into the break-state object, or by patching a launcher path that lets the interpreter seed that record naturally before opening the debugger
## Open Questions After This Pass
1. Does retail No Remorse still contain a dormant analogue of the Regret vtable override, not just the writer/bootstrap?