Enhance CLI functionality and improve common utilities

- Added new commands to the CLI for dumping regions, renaming functions by address, and setting various types of comments.
- Implemented JSON output formatting for CLI commands.
- Introduced functions for decompiling and disassembling functions, as well as retrieving cross-references.
- Enhanced common utilities with functions for reading memory regions, iterating Java items, and managing function metadata.
- Added suppress_output context manager to hide process output during Ghidra startup.
- Updated existing functions to improve error handling and output formatting.
This commit is contained in:
MaddoScientisto 2026-03-21 09:44:35 +01:00
commit a56851f994
16 changed files with 1072 additions and 36 deletions

View file

@ -75,6 +75,23 @@ Known call-site classifications (by argument pattern):
- `entity_fire_weapon` currently decompiles as a thin wrapper that calls `projectile_init_vector`.
- `fire_weapon_from_cursor` still decompiles poorly in the raw import, but disassembly shows it begins by pushing cursor sprite/state data from the `0x27d6` area, consistent with the existing seg001 notes.
### Raw seg091 Boundary Recovery (init/context + RNG helpers)
- Conservative PyGhidra boundary repair created the missing seg091 functions in `CRUSADER-RAW.EXE`:
- `000a:44fd` = `seg091_func_00fd`, body `000a:44fd-000a:454c`
- `000a:454d` = `seg091_func_014d`, body `000a:454d-000a:45fd`
- `000a:48a0` = `rng_advance_state`, body `000a:48a0-000a:48e2`
- `000a:48ff` = `rng_next_modulo`, body `000a:48ff-000a:4912`
- Additional adjacent helper identified directly in the raw import:
- `000a:48e3` = `rng_set_seed`
- Verified current behavior from the raw import:
- `seg091_func_00fd` shares runtime flag `0x44a4` with `runtime_init_or_abort`; if the flag is clear it sets it and dispatches through an unresolved far thunk, then falls into a second unresolved thunk path that Ghidra currently marks as non-returning.
- `seg091_func_014d` also shares flag `0x44a4`; it checks an optional long argument against the global context/cookie at `0x45a6`, zeroes the pointed byte when the argument is null, then dispatches through an unresolved far thunk. Keep the positional name until caller-side analysis resolves the thunk target and full signature.
- `rng_set_seed` writes the 32-bit RNG seed/state pair at `0x4584:0x4586` and forces the low word odd.
- `rng_advance_state` updates the same 32-bit state with a simple multiply/add step.
- `rng_next_modulo` advances the RNG state and returns the result modulo the requested bound, or `0` when the bound is zero.
- Short decompiler comments were added in Ghidra at all five seg091 entries so the current evidence stays attached to the raw database.
### Raw 0007 Gameplay Helper Batch (entity/tile aux state)
- New conservative gameplay-side helper renames (direct analysis from field writes and call structure):
@ -826,7 +843,7 @@ A scroll/camera management cluster found in the `0007:bxxx0007:dxxx` range.
| Address | Name | Evidence |
|---------|------|---------|
| `0007:5b6f` | `entity_set_at_target_update_facing` *(likely internal block, not true top-level function)* | Direct raw-analysis name from the visible local behavior: sets entity `+0x3a = 1` (arrived flag); calls `entity_set_facing_direction`; clears bit `0x10` from entity type table `0x7e1e[type*0x79+0x59]`; then tail-calls onward. Relocation data places it at `seg043:016f`, and resolved call sites exist immediately before/after it (`5b36`, `5b44`, `5bb9`), so this address is likely an internal labeled block inside the larger missing `0007:5a00` seg043 function rather than a true entrypoint. |
| `0007:5b6f` | internal block only *(no function after repair)* | Direct raw-analysis behavior remains useful as a local label: this block sets entity `+0x3a = 1` (arrived flag), calls `entity_set_facing_direction`, clears bit `0x10` from entity type table `0x7e1e[type*0x79+0x59]`, then tail-calls onward. After the PyGhidra boundary repair, `0007:5b6f` is no longer a function entry and should be treated only as an internal control-flow label inside the first repaired seg043 routine. |
### seg043 Standalone Boundary Recovery
@ -835,8 +852,15 @@ A scroll/camera management cluster found in the `0007:bxxx0007:dxxx` range.
- `seg043:0090` -> raw `0007:5a90`
- `seg043:017a` -> raw `0007:5b7a`
- `seg043:021c` -> raw `0007:5c1c`
- The first recovered standalone function spans `0x0090..0x0179`, which means the current raw label at `0007:5b6f` falls inside the tail of that routine and overlaps the true return at raw `0007:5b79`.
- Practical consequence: the missing raw `0007:5a00` seg043 function boundary should not start at segment offset `0x0000`, and the current `0007:5b6f` function object should be treated as a mis-split internal block until Ghidra-side function creation/repair is available.
- The first recovered standalone function spans `0x0090..0x0179`, which means raw `0007:5b6f` falls inside the tail of that routine and overlaps the true return at raw `0007:5b79`.
- Repair status: applied in `CRUSADER-RAW.EXE` via the local PyGhidra toolkit. The bad function object at `0007:5b6f` was removed, and three conservative replacement functions were created:
- `0007:5a90` = `seg043_func_0090` with body `0007:5a90..0007:5b79`
- `0007:5b7a` = `entity_set_at_target_update_facing` with body `0007:5b7a..0007:5c1b`
- `0007:5c1c` = `seg043_func_021c` with body `0007:5c1c..0007:5c80`
- Follow-up re-decompilation now supports one real behavioral rename: `0007:5b7a` sets entity `+0x3a` to 1, calls `entity_set_facing_direction`, clears class-detail bit `0x10` at `0x7e1e[type*0x79+0x59]`, then continues into downstream dispatch, so the repaired middle function has been renamed `entity_set_at_target_update_facing`.
- `0007:5a90` now has a stronger structural read from standalone disassembly: it allocates an object when the incoming far pointer is null (literal `0x98`), runs a far setup helper using DS:`0x4b48..0x4b4e` and the second incoming far pointer, writes `0x4c13` at the object base, calls `entity_set_at_target_update_facing` with the third incoming far pointer, then adjusts the nested object at `+0x38` using extents read from the object at `+0x34` before returning the object pointer.
- `0007:5c1c` also has a stronger structural read: it optionally calls a virtual method through `[object->vtable + 0x4c]` when `object+0x44/+0x46` is non-null, passes a local stack word through `entity_class_get_flag20`, then dispatches one or two downstream far helpers using `object+0x48`, gated by a local status byte at `[bp-0xe]`.
- `0007:5a90` and `0007:5c1c` remain intentionally positional because their current decompiles still collapse into unresolved thunk dispatches and do not yet support safe behavioral names.
### Entity Class Flag Helper
@ -1306,7 +1330,7 @@ Named via systematic analysis of 11,692 NE relocation fixup entries. These are t
| Rank | Address | Name | Calls | Description |
|------|---------|------|-------|-------------|
| 1 | `000a:44fd` | *(no function in Ghidra)* | 331 | Analysis gap at seg091:00fd. In comutils.c segment near joystick code. Needs manual function creation. |
| 1 | `000a:44fd` | `seg091_func_00fd` | 331 | Recovered boundary. Shares init flag `0x44a4` with `runtime_init_or_abort`; thunk-heavy non-returning wrapper. |
| 2 | `0003:ac7e` | `mem_alloc` | 272 | Allocation wrapper → seg082:0000 (`0009:a200`) |
| 3 | `0008:dbec` | `entity_word_list_destroy` | 238 | Already named. Frees entity word-list buffer. |
| 4 | `0003:a751` | `mem_free` | 207 | Free wrapper → seg082:007a (`0009:a27a` = `mem_free_checked`) |
@ -1372,7 +1396,7 @@ Named via systematic analysis of 11,692 NE relocation fixup entries. These are t
|------|---------|------|-------|-------------|
| 41 | `000a:7b58` | `nop_return_zero_b` | 56 | Returns 0 (default vtable slot) |
| 42 | `000b:3ab2` | `sprite_node_dispatch_event` | 56 | Large event dispatch: checks event type (2/4/8/0x100), updates global focus ptr at [0x4fd0:4fd2], dispatches via vtable methods [+0x14/+0x18/+0x20/+0x24] by event code. Switch table for 16 event types. |
| 43 | `000a:48ff` | *(no function in Ghidra)* | 55 | Analysis gap in comutils.c segment |
| 43 | `000a:48ff` | `rng_next_modulo` | 55 | Advances seg091 RNG state and returns the result modulo the requested bound; returns 0 when bound is 0. |
| 44 | `000b:3362` | `sprite_tree_unwind_check` | 55 | Validates SS == param_2 (stack segment guard), then decrements global counter at [0x4fd6] |
| 45 | `000b:40ee` | `sprite_node_update_and_dispatch` | 55 | If `sprite_node_is_dirty` returns false: marks dirty, calcs accumulated bounds via `sprite_tree_get_accumulated_bounds` (3ed8), then dispatches via thunk |
| 46 | `000a:7b5f` | `vtable_stub_trampoline` | 55 | Calls through fixup thunk (forwarder to another function) |
@ -1397,7 +1421,7 @@ Named via systematic analysis of 11,692 NE relocation fixup entries. These are t
- The earlier standalone seg001 port hypothesis in this subrange was wrong.
- Relocation data places raw `0007:5a00` at `seg043:0000`, and the already-named helper at `0007:5b6f` sits at `seg043:016f`.
- Because of that segment placement, standalone seg001 names such as `debris_spawn` (`0x7490`) and `entity_die` (`0x75ff`) should NOT be ported into this raw range.
- `0007:5b6f` currently remains `entity_set_at_target_update_facing` from direct raw analysis; its behavioral name is no longer in conflict with the standalone seg001 `entity_die` note.
- `0007:5b6f` no longer exists as a function after the PyGhidra repair pass. Its old raw-analysis behavior now lines up with the repaired function `0007:5b7a = entity_set_at_target_update_facing`, so `0007:5b6f` should be treated only as an internal control-flow location inside that function.
- Additional resolved call targets inside the missing seg043 block were annotated in Ghidra from relocation data:
- `0007:5a8a` -> `entity_set_event_type_checked`
- `0007:5a98` -> `FUN_0008_cc01` (timer-related flag/event helper; tests `+0x16 & 0x2`, sets `+0x16 |= 0x800`, copies event field `+0x06` to `+0x22`, checks `0x1000`, then conditionally dispatches)
@ -1406,19 +1430,21 @@ Named via systematic analysis of 11,692 NE relocation fixup entries. These are t
- `0007:5bb8` -> `entity_is_type_match`
- `0007:5c49` -> `entity_class_get_flag20`
- `0007:5c8b` -> `mem_alloc_far`
- Current boundary caveat:
- Ghidra likely split the real seg043 routine incorrectly. `0007:5b6f` has no inbound xrefs, while relocation-resolved calls exist on both sides of it inside the same segment window. Treat the current `0007:5b6f` label as a behavioral anchor for one internal block, not yet as a proven standalone function boundary.
- Standalone seg043 disassembly now strengthens that conclusion: real prologues are at raw `0007:5a90`, `0007:5b7a`, and `0007:5c1c`, so the current `0007:5b6f` boundary demonstrably overlaps an earlier function.
- Current boundary state:
- The seg043 split has now been repaired in Ghidra. Verified temporary functions exist at raw `0007:5a90`, `0007:5b7a`, and `0007:5c1c`.
- The repaired middle function at `0007:5b7a` has now been promoted from a positional label to `entity_set_at_target_update_facing` based on direct decompile/disassembly behavior.
- The remaining repaired functions at `0007:5a90` and `0007:5c1c` should keep their positional names until a later pass resolves the thunk-heavy bodies more clearly.
- The next pass on this region should continue re-decompiling `seg043_func_0090` and `seg043_func_021c`, resolve the still-unknown far thunks they call, and replace the positional names only when their behavior is directly supported.
| Address | NE Segment | Callers | Notes |
|---------|-----------|---------|-------|
| `000a:44fd` | seg091:00fd | 331 | #1 most-called target! In comutils.c segment. |
| `000a:44fd` | seg091:00fd | 331 | Recovered as `seg091_func_00fd`; thunk-heavy init wrapper sharing flag `0x44a4`. |
| `000b:2e00` | seg109:0000 | 74 | Start of segment 109. |
| `0007:5a00` | seg043:0000 | 64 | Start of segment 43. Earlier seg001 `debris_spawn` port was rejected; still needs manual function creation and direct analysis. |
| `000a:48ff` | seg091:04ff | 55 | In comutils.c segment near joystick code. |
| `000a:48ff` | seg091:04ff | 55 | Recovered as `rng_next_modulo`; bounded wrapper around seg091 RNG state advance. |
| `0003:a880` | seg005:0880 | 49 | In CRT segment near `far_memcpy`. |
| `0003:ad75` | seg005:0d75 | 43 | In CRT segment near `mem_alloc`. |
| `000a:454d` | seg091:014d | 32 | In comutils.c segment. |
| `000a:454d` | seg091:014d | 32 | Recovered as `seg091_func_014d`; init/context helper using the `0x45a6` cookie/context global. |
### Tier 4: Ranks 61-80 (29-42 callers)
@ -1438,7 +1464,7 @@ Named via systematic analysis of 11,692 NE relocation fixup entries. These are t
| 72 | `0009:c433` | `event_queue_align_index` | 34 | Returns `param_1 & 0xFFF8` — aligns ring index to 8-byte event slot boundary |
| 73 | `0009:2156` | `dos_file_get_size` | 33 | Saves file position, does INT 21h AH=42h AL=02 (seek to end), restores position. Returns file size in DX:AX |
| 74 | `000a:2c41` | `list_iterate_next` | 33 | Linked list iterator: if *out==0 returns first from obj+2; else follows next at ptr+2/+4. Returns bool (has more) |
| 75 | `000a:454d` | *(no function in Ghidra)* | 32 | Analysis gap in comutils.c segment |
| 75 | `000a:454d` | `seg091_func_014d` | 32 | Recovered boundary. Shares flag `0x44a4`; checks optional long argument against the `0x45a6` cookie/context global. |
| 76 | `000b:2446` | `sprite_clear_redraw_flag` | 31 | Clears flag at obj+0x17e, then dispatches via thunk |
| 77 | `0005:1238` | `entity_get_class_word` | 30 | Looks up table at [0x7e01] indexed by *param_1 * 2, returns word. Sister of `entity_get_type_word` (which uses [0x7df9]) |
| 78 | `000b:1446` | `display_null_check_dispatch` | 30 | Null-checks far ptr params, dispatches to different thunks based on result |
@ -1559,13 +1585,13 @@ Compares two 5-byte `map_position` structs: `{ x:word, y:word, layer:byte }`. Re
| `0x7df1` | word[] | 2 | Entity base type word |
| `0x7e1e` | struct[] | 0x79 | Entity class detail records (121 bytes per class) |
### Analysis Gaps (No Function in Ghidra)
### Recent Manual Boundary Repairs
These high-traffic addresses need manual function creation in Ghidra (Script Manager or UI):
Recent high-traffic addresses recovered with manual function creation in Ghidra/PyGhidra:
| Address | NE Segment | Callers | Notes |
|---------|-----------|---------|-------|
| `000a:44fd` | seg091:00fd | 331 | #1 most-called target! In comutils.c segment. |
| `000a:48ff` | seg091:04ff | 55 | Recovered as `rng_next_modulo`; manual boundary repair narrowed to `000a:48ff-000a:4912`. |
| `000b:2e00` | seg109:0000 | 74 | Start of segment 109. |
| `0007:5a00` | seg043:0000 | 64 | Start of segment 43. Earlier seg001 `debris_spawn` port was rejected; still needs manual function creation and direct analysis. |
| `0009:a200` | seg082:0000 | - | Target of `mem_alloc`. Start of segment 82. |