Add detailed log for retail debugger patch attempts in CRUSADER.EXE

This commit introduces a comprehensive document outlining the various executable-patching attempts aimed at revealing the hidden retail usecode debugger within the CRUSADER.EXE file. The document serves multiple purposes, including preserving negative evidence, recording patch shapes and their rationales, and ensuring that runtime outcomes are linked to specific patch generations.

Key sections include:
- Ground rules for patching and validation processes.
- A table of stable facts regarding the debugger's structure and behavior.
- A detailed attempt log documenting each patch's shape, mechanical and runtime results, and verdicts.
- Root-cause findings from failed paths, providing insights into the challenges faced during the patching process.
- Current live candidates for further testing and exploration.

This documentation is intended to streamline future patching efforts and improve the understanding of the underlying mechanics of the debugger.
This commit is contained in:
Marco 2026-03-25 17:36:16 +01:00
commit 7310c4fe96
13 changed files with 1008 additions and 1959 deletions

View file

@ -79,6 +79,38 @@ Detailed completed analysis belongs in the files under `docs/`, not in this plan
- The next blocker layer is narrower too. Those modal wrappers are not abstract helpers; inside `World_HandleKeyboardInput_13e8_14b4` they already wrap concrete user-facing lanes including exit-to-DOS confirmation (`0x22d`), quick save (`0x13f`), quick load (`0x13e`), restart/main-menu handling (`Game_RestartMaybe`), and the neighboring load/menu gump lanes. Separately, event `0x7e` remains the only other recovered writer of `0x6045` besides `Key_CheckCheatToggle`, so a successful `jassica16` match can still be undone later by that independent runtime path. `Key_CheckCheatToggle` itself is now comment-backed as keydown-only and still requires top-row `1` / `6` scan codes at the tail, leaving keypad digits and other non-matching input routes as a still-live explanation for failed tests.
- Cross-game verification against the currently opened `REGRET.EXE` now has a runtime correction too. The F10 branch at `1148:0d0e` still reaches the same modifier helper at `11e0:01a8`, and live testing shows the practical gesture is hold `F10` first and then press `Ctrl`, not `Alt`. The same BIOS-backed helper swap should be verified directly in that target before promoting renames there. The same runtime test also explains the repeated immortality popups: the F10 branch is not debounced, so holding the keys lets repeated F10 keydown events flip immortality on and off multiple times. The real gameplay difference remains the latch code: `1148:34d2` (`Key_CheckSecretCodeSequences`) still contains a `jassica16` table at `1480:2ff0`, but the latch-enabling sequence in No Regret is the second table at `1480:2ffc`, decoded as `loosecannon`, which toggles `1480:0ac0` and mirrors the result into the F10 latch byte `1480:009b`.
- Retail hidden-menu patching remains open, but the failed branches are now better separated from the current writable candidate. Verified file/fixup anchors are `0007:0d75` / `0007:0d79` (file `0x70d75` / relocation entry `0x71d68`) and `000c:99dd` / `000c:99e0` (file `0xc99dd`, seg126 chain `0x25e0`). The deferred `0x42f -> 000c:99dd -> 000b:9c0d` design remains explicitly rejected: it visibly entered the hidden UI path, but it halted with the retail `FILE\FLEX.C, line 83` failure and dropped into the quit line, so `0x42f` is the wrong deferred context even though the modal wrapper address itself was valid. The newer direct `0007:0d79 -> 000b:9a86` current-slot retarget with the narrowed `000b:9a8d` arg patch was also runtime-tested and produced no hidden menu, so the writable `/Writable/CRUSADER-PATCHED.EXE` test build is now moved to the next defensible variant instead: restore the direct hook to `000a:5276`, keep the current-slot wrapper unpatched, and retarget the later controller-side `000c:99e0` call to `000b:9c0d` while zeroing only the inherited modal-wrapper words at `000b:9c4a`.
- The next retail test build is narrower still. User runtime feedback on the first `Ctrl+Q` patch is: the mouse pointer appears, then gameplay hangs with only a single right-edge pixel still updating. That makes the remaining failure look more like post-entry control-flow fallout than a bad entry address. The PowerShell patch therefore now also rewrites raw `000c:99e8` / live `13e8:25e8` from `PUSH 0x3e8` to a near jump into the shared epilogue at `13e8:29a7`, so the reused `13e8:25dd` deferred lane exits immediately after the retargeted `13e8:25e0 -> 13a0:020d` call instead of falling through into the original `0x42f` branch tail logic.
- DOSBox-X debugger capture now shows the same hang surviving that tail-skip patch, and the stop point is materially informative: the live runtime state matches seg131 `Interpreter_NextUsecodeOp` at `1418:04c3..051d`, specifically just before the `1408:02f5` call at `1418:051d`. That means the blunt `Ctrl+Q -> 13a0:020d` patch is not merely stuck in the seg109 modal wrapper; it has activated the interpreter-side debugger-state path guarded by non-null `0x659c/0x659e`, and the freeze now looks like a bad or incomplete seg1408 debugger-state lifecycle rather than a simple wrong branch tail. Current best implication: stop iterating on the blunt modal force-open patch and pivot the next patch design toward constructing or safely emulating a real `usecode_debugger_break_state_create` object at `1408:0000` before relying on the seg109 UI lane.
- The next executable patch still follows that pivot, but the boot-time callback rewrite is now explicitly retired. The current PowerShell build repurposes the gated `0x410` body at `13e8:230d` to lazily construct a seg1408 state object through the existing far-call slot at `13e8:2352 -> 1408:0000`, stores the returned far pointer into `0x659c/0x659e`, and then reuses the **second** existing far-call slot in that same body (`13e8:235c`) to jump directly to `usecode_debugger_open_for_current_unit` at `13a0:0086` with zeroed wrapper arguments. This keeps the patch hotkey-local instead of rewriting the shared seg1408 callback table at `1478:65ab`, while the older direct and deferred modal-force-open sites remain restored.
- The callback-table design is now negative evidence rather than the live candidate. Even after fixing an NE-segment indexing mistake (`1478:65ab` had first been retargeted to segment `109` instead of `117` for `13a0:0086`), the global callback rewrite still caused startup failure. The surviving script fixes from that pass remain important: the large `13e8:230d` body must use on-disk `FF FF 00 00` placeholders rather than disassembly-resolved far operands, and its patched byte array must include the final trailing `0xC7` so patch/restore verification matches the retail executable length. With the global callback rewrite removed and the second local call slot retargeted instead, the script now round-trips cleanly on a fresh copied retail EXE (`apply -> patched`, `restore -> original`) and also cleans up the stale old `1478:65ab` callback retarget if that earlier crashing build had already been applied.
- The direct hotkey-to-wrapper retarget is now negative evidence too. The local-call redesign fixed startup and let the game reach gameplay, but pressing `Ctrl+Q` immediately quit through the normal `"No pity. No mercy. No remorse."` shutdown line, which is more consistent with entering the modal UI while the original keypress is still live than with a boot-time relocation problem. The next patch therefore keeps the hotkey-local object creation but stops calling `13a0:0086` on the keypress itself.
- The live candidate is now a per-object callback redirect. The `0x410` body at `13e8:230d` still creates/stores the seg1408 debugger-state object at `0x659c/0x659e`, but the second existing far-call slot in that body (`13e8:235c`) is now retargeted to `1408:0419` (`usecode_debugger_enable_single_step`) instead of directly opening the UI. The created object's first word is rewritten from the shared callback-table offset `0x65ab` to the private relocated slot `0x65af`, and the private dword at `1478:65af` is retargeted from `1408:0474` to `13a0:0086`. That should let the *next* interpreter-side debugger callback open the current-unit UI without inheriting the live `Ctrl+Q` key event, while the original shared `1478:65ab` slot stays restored to the retail no-op.
- The PowerShell patcher now round-trips cleanly for this per-object callback design on a fresh copied retail EXE too: `13e8:230d` body patched/restored, `13e8:235c` step-arm call patched/restored, private callback slot `1478:65af` patched/restored, and legacy shared callback slot `1478:65ab` held at original in both states.
- User runtime on that per-object single-step variant is now also informative negative evidence: the game boots and reaches gameplay, but pressing `Ctrl+Q` produces no visible effect at all, not even the original CD-transfer toast, which implies the hotkey body is being intercepted but the deferred break still is not surfacing. Current best read is that the single-step path at `+0x75` remains gated by the seg1418 nesting counters `+0x76/+0x78` often enough that the callback never fires in the observed test path.
- The live patch candidate therefore now sets **break-next** mode directly instead of single-step mode. The repurposed `13e8:230d` body still constructs/stores the seg1408 debugger-state object and repoints that object to the private callback slot `1478:65af -> 13a0:0086`, but it now writes `+0x75 = 0` and `+0x74 = 1` in the object itself rather than retargeting the second `13e8:235c` call slot to `1408:0419`. That matches the surviving UI-side control path at `13a0:1e5d`, where `+0x74` is the unconditional break-on-next-callback mode while `+0x75` is the nesting-sensitive single-step mode.
- The PowerShell patcher also round-trips cleanly for this break-next design on a fresh copied retail EXE: `13e8:230d` body patched/restored, private callback slot `1478:65af` patched/restored, shared callback slot `1478:65ab` held at original, and the stale second-call-slot cleanup removed from the write path because those bytes now belong to the patched body itself.
- User runtime on that `1478:65af` break-next variant is now negative evidence as well: the game crashes on launch again, so even the supposedly private `65af` slot now looks too globally visible to repurpose. Current best implication is that the object-local `0x65af` first-word rewrite can stay as the arm point, but the actual callback entry must move off the live callback-table dword itself.
- The live candidate is therefore now a **guarded trampoline** at the original seg1408 no-op callback code. The PowerShell patcher keeps the `13e8:230d` break-next object creation/store path, but restores the shared `1478:65ab` slot, stops repointing `1478:65af` to `13a0:0086`, patches `1408:0474` into a tiny guard that returns immediately unless `0x659c/0x659e` is armed, and uses the apparently unused relocated dword at `1478:6597` as the far target slot for `13a0:0086`. This newer `6597`/`1408:0474` build now also round-trips cleanly on a fresh copied retail EXE: `13e8:230d` body patched/restored, guarded callback code at file `0xCEE6F` patched/restored, wrapper target slot at file `0xEA197` patched/restored, and all older direct/deferred experiment sites held at original bytes.
- The root-cause read on the `65af` startup failures is now sharper: `0x65af` is not an alternate vtable base at all. The constructor at `1408:0000` writes `0x65ab` to object offset `+0`, and the dispatch sites prove that object `CALLF [BX]` uses the dword at `65ab -> 1408:046f` while object `CALLF [BX+4]` uses the next dword at `65af -> 1408:0474`. Rewriting the object first word to `0x65af` therefore corrupts the second method lookup (`[BX+4]`) instead of selecting a “private callback table”, which explains the launch-time instability and the other inconsistent runtime fallout from the earlier single-step and break-next builds.
- The live candidate is now the narrower **method-0 deferred callback** design. The PowerShell patcher still keeps the `13e8:230d` lazy object creation/store path and still arms break-next mode by writing `+0x75 = 0` / `+0x74 = 1`, but it explicitly preserves the object's vtable base as `0x65ab`, restores the method-1 helper at `1408:0474`, patches only the method-0 break callback at `1408:046f` to indirect through the spare relocated dword `1478:6597`, and uses that slot as the far target for `13a0:0086`. This corrected `046f`/`6597` build also round-trips cleanly on a fresh copied retail EXE: `13e8:230d` body patched/restored, break callback code at file `0xCEE6F` patched/restored, deferred target slot at file `0xEA197` patched/restored, and all older direct/deferred experiment sites held at original bytes.
- User runtime on that shared-`046f` method-0 build is now negative evidence too: startup still crashes, which makes the shared method body just as globally sensitive as the shared `65ab/65af` vtable slots. Current implication: the deferred path still looks right, but the callback implementation must move to a truly private per-object table instead of any shared seg1408 body or shared vtable dword.
- The live candidate is now a **private two-entry vtable** built from unused relocated dwords with no current data uses. The PowerShell patcher still keeps the `13e8:230d` lazy object creation/store path and still arms break-next mode with `+0x75 = 0` / `+0x74 = 1`, but it now rewrites the created object's vtable base to `0x658f`, retargets private method 0 `1478:658f -> 13a0:0086`, retargets private method 1 `1478:6593 -> 1408:0474`, and leaves the shared callback bodies and shared `65ab/65af` table entries untouched. This private-vtable build also round-trips cleanly on a fresh copied retail EXE: `13e8:230d` body patched/restored, private method 0 slot at file `0xEA18F` patched/restored, private method 1 slot at file `0xEA193` patched/restored, and all older direct/deferred experiment sites held at original bytes.
- User runtime on that first private-vtable placement is now negative evidence too: startup still crashes, which proves the `658f/6593` pair is also startup-visible despite the lack of direct data-use hits. Current best implication is that the private-vtable strategy itself still looks structurally right, but the specific dword pair must move farther away from the debugger-global cluster and any hidden boot-time consumers.
- The full six-candidate private-vtable harness is now retired. User runtime results:
- Candidate A (`1478:6724/6728`) = DOSBox closes on start
- Candidate B (`1478:672c/6730`) = fatal `Load program failed -- error code 201 -- C:\CRUSADER.EXE`
- Candidate C (`1478:6734/6738`) = DOSBox closes on start
- Candidate D (`1478:6718/671c`) = startup crash
- Candidate E (`1478:6720/6724`) = startup crash
- Candidate F (`1478:6738/673c`) = startup crash
Ghidra follow-up now explains why: `1478:6718..673c` is a live function-pointer table containing `UsecodeProcess_*`, `Process_Terminate`, `Process_Fail`, and nearby null handlers, not spare relocated dwords. The script no longer offers those candidates.
- The first guarded shared-callback pair is now negative evidence too. Candidates G/H still crashed on startup, and the best new explanation is structural: that design overwrote both `1408:046f` and the adjacent `1408:0474`, but `0474` is a real helper that returns `DX:AX = 0`, not dead padding. Destroying that zero-return behavior may itself be enough to destabilize startup.
- The method-0-only shared-callback pair is now negative evidence too. User runtime on Patch 1 / Patch 2 showed both startup-crashing, which means preserving `1408:0474` was necessary but not sufficient: the shared `1408:046f` body is still too broad if it jumps straight into debugger UI code.
- The live patch family is now an **interpreter callsite retarget** design. Candidate M/N are retired negative evidence: both startup-crashed, the embedded private stub inside the patched `13e8:230d` body was malformed, and the supposed deferred target slot at `1478:6597` is no longer treated as spare storage. The current PowerShell build still keeps the retail debugger object's real `1478:65ab` vtable base and still arms break-next through the patched `13e8:230d` body, but it now avoids both the shared seg1408 callback bodies and the `1478:6597` data slot entirely. Instead, it patches the existing interpreter `CALLF usecode_debugger_maybe_break_on_current_line` at `1418:04b5` to a corrected private stub at `13e8:232d`, and it also reuses the second retail far-call slot inside `13e8:230d` (`13e8:235c`) as the actual private UI-call target. The `13e8:230d` body itself now correctly handles both cases: reuse and arm an existing debugger-state object at `0x659c/659e`, or lazily create/store one before arming break-next. One implementation bug from the first O/P refactor is now fixed too: the second `13e8:235c` relocation write is part of candidate application and verification, so the live build now really routes to the selected wrapper instead of accidentally leaving that slot on retail `Dispatch_ModalGump`. Current candidates:
- Candidate O = interpreter callsite retarget -> `13a0:020d`, with `13a0:024a` zeroed inherited modal-wrapper words
- Candidate P = interpreter callsite retarget -> `13a0:0086`, with `13a0:008f` zeroed inherited current-unit-wrapper words
Both apply/restore cleanly on a disposable retail copy and are the next runtime tests.
- Full chronology for this patch line now lives in `docs/retail-debugger-patch-attempts.md`, including the failed global callback rewrite, direct wrapper call, single-step `65af` build, break-next `65af` build, guarded `0474` trampoline, shared `046f` method patch, and the current private-vtable candidate.
- The hidden-menu orphan model is now materially stronger too. New live renames in seg1408 (`usecode_debugger_break_state_create`, `usecode_debugger_maybe_break_on_current_line`, `usecode_debugger_breakpoint_insert_sorted`, `usecode_debugger_has_breakpoint`, `usecode_debugger_callstack_push_entry`, `usecode_debugger_callstack_pop_entry`, `usecode_debugger_enable_single_step`, `usecode_debugger_clear_step_state`, `usecode_debugger_current_entry_get_unit_name`) line up with the seg109 UI in a way the cheat-only hook never did. The concrete interpreter-side handoff at `1418:04aa..04b5` now calls `usecode_debugger_maybe_break_on_current_line` whenever the far pointer at `0x659c/0x659e` is non-null, and that helper checks `(file,line)` breakpoints before callbacking through the debugger-state object's vtable. Current best read is therefore that the retail orphan happened one layer earlier than the cheat/event experiments: the seg109 current-unit debugger UI likely used to be entered from this seg1408 breakpoint object, but retail no longer appears to instantiate/store that object at `0x659c/0x659e`. That makes the breakpoint callback lane a stronger original-entry candidate than direct event `0x103` retargeting.
- The live NE `CRUSADER.EXE` mapping for that hidden-menu lane is now explicit and comment-backed in Ghidra too: direct hook `1130:2b75/2b78`, current-slot wrapper `13a0:0086` with constructor arg site `13a0:008d`, modal wrapper `13a0:020d` with inherited-arg patch subsite `13a0:024a`, listener create/dispatch `13a0:19b1` / `13a0:1df3`, compiled `0x410` CD-transfer-display body `13e8:2303`, deferred controller-side hook `13e8:25dd/25e0`, and the supporting cheat-state data cells at `1020:2833`, `1020:283d`, `1020:0844`, `1020:6045`, `1020:604f`, and `1020:6050`. The `0x410` body is still documented in place rather than renamed because it remains embedded inside the oversized `World_HandleKeyboardInput_13e8_14b4` function object. This improved live handoff and patch reproducibility still does not justify a headline estimate change by itself.