# Crusader: No Remorse — Binary Overview
## Binary Overview
- **Game**: Crusader: No Remorse (Origin Systems, 1995)
- **Platform**: DOS (16-bit protected mode)
- **DOS Extender**: Phar Lap 286 DOS-Extender (RUN286)
- **Executable Format**: Bound `MZ -> NE` executable with Phar Lap DOS-extender code
- **Entry Point**: `10da:7c40`
## Installed Copy Findings
- No standalone `.EXP` file exists in `F:\Apps\Crusader No Remorse`.
- `CRUSADER.EXE` is the original game binary and contains a valid internal `NE` header.
- Outer DOS `MZ` header points to `e_lfanew = 0x36F70`.
- Internal header at `0x36F70` starts with `NE` and describes **145 segments**.
- The NE segment table references data from the original file directly, so there is no separate embedded payload that needs to be carved out first.
- `CNRCEXP.EXE` is a modern Win32 helper tool, not part of the original DOS execution path.
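
The header findings above can be re-checked with a short script. This is a minimal sketch, assuming the standard layouts (`e_lfanew` at DOS header offset `0x3C`, NE segment count at NE header offset `0x1C`); the function name `read_ne_summary` is hypothetical and not part of any tool in this project.

```python
import struct


def read_ne_summary(data: bytes) -> dict:
    """Return e_lfanew and the NE segment count from raw EXE bytes."""
    if data[:2] != b"MZ":
        raise ValueError("missing outer MZ signature")
    # e_lfanew: 32-bit little-endian value at offset 0x3C of the DOS header.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 2] != b"NE":
        raise ValueError(f"no NE signature at {e_lfanew:#x}")
    # Segment count: 16-bit value at NE header offset 0x1C (standard NE layout).
    (segment_count,) = struct.unpack_from("<H", data, e_lfanew + 0x1C)
    return {"e_lfanew": e_lfanew, "segment_count": segment_count}
```

Run against the bytes of `CRUSADER.EXE`, this should report `e_lfanew = 0x36F70` and 145 segments if the findings above hold.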
## Raw Full-EXE Import Mapping
- A separate raw-binary import of the full executable (`crusader-raw.exe`) is usable: Ghidra discovers thousands of functions across a single flat `ram` block.
- Direct `file_offset -> flat_address` mapping from the standalone segment extracts is not reliable for porting names into that raw import.
- The extracted `segNNN_*.bin` files match `CRUSADER_NE.EXE`, but the raw full-EXE import must be mapped by verified byte signatures / known function bodies.
- Verified segment bases in the raw full-EXE import:
  - `seg001` base = `0x6E570` (`cursor_update_hover` at `0006:e5d0`, rel `0x0060`)
  - `seg021` base = `0x87170` (`entity_count_by_type_a` at `0008:7377`, rel `0x0207`)
- Porting rule for these verified segments:
  - `raw_full_exe_flat = verified_segment_base + standalone_segment_relative_offset`
- Naming note:
  - `seg001` and `seg021` both contain a keyboard handler; in the full program database, the seg001 copy is named `seg001_input_keyboard_handler` to avoid a symbol collision with seg021 `input_keyboard_handler`.
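
The porting rule can be sketched as a small helper. It covers only the two verified bases listed above and assumes the raw import displays a flat address as `(flat >> 16):(flat & 0xFFFF)`; the names `VERIFIED_SEGMENT_BASES` and `port_address` are hypothetical.

```python
# Verified bases copied from the list above; no other segment is covered.
VERIFIED_SEGMENT_BASES = {"seg001": 0x6E570, "seg021": 0x87170}


def to_ghidra(flat: int) -> str:
    """Format a flat address in the SSSS:OOOO form used by the raw import."""
    return f"{flat >> 16:04x}:{flat & 0xFFFF:04x}"


def port_address(segment: str, rel_offset: int) -> str:
    """Apply raw_full_exe_flat = verified_segment_base + relative offset."""
    if segment not in VERIFIED_SEGMENT_BASES:
        raise KeyError(f"{segment} has no verified base; match byte signatures first")
    return to_ghidra(VERIFIED_SEGMENT_BASES[segment] + rel_offset)


print(port_address("seg001", 0x0060))  # cursor_update_hover
print(port_address("seg021", 0x0207))  # entity_count_by_type_a
```

Refusing unknown segments is deliberate: the rule is only stated for bases that were confirmed against known function bodies.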
### Address Space Layout in the Raw Import
Ghidra segment:offset `SSSS:OOOO` = flat address `SSSS * 0x10000 + OOOO`.
| Flat range | Content |
|---|---|
| `0x00000 - 0x36F6F` | Phar Lap 286 DOS extender (outer MZ stub code) |
| `0x36F70` | NE header (145-segment game image begins here in file) |
| `0x6E570`+ | NE game segments at their Phar Lap linear load addresses |
Mapping rule (verified for seg001 and seg021):
```
runtime_flat_base = NE_segment_file_offset + 0x36F70
```
Example: seg004 at file `0x40A00` → runtime `0x77970` → Ghidra `0007:7970`.
Functions at Ghidra `0003:XXXX` / `0004:XXXX` are **Phar Lap extender code**: those flat addresses lie below the first verified game segment base (`0x6E570`). Functions at `0006:E570` and above are game NE segments.
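
The address arithmetic in this section can be sketched as follows. The helper names are hypothetical, and the `0x36F70` delta is only stated as verified for seg001 and seg021, so results for other segments should be treated as predictions to confirm.

```python
NE_LOAD_DELTA = 0x36F70  # delta from the mapping rule above


def ghidra_to_flat(addr: str) -> int:
    """Convert 'SSSS:OOOO' to a flat address: SSSS * 0x10000 + OOOO."""
    seg, off = addr.split(":")
    return int(seg, 16) * 0x10000 + int(off, 16)


def flat_to_ghidra(flat: int) -> str:
    """Convert a flat address back to 'SSSS:OOOO'."""
    return f"{flat // 0x10000:04x}:{flat % 0x10000:04x}"


def runtime_flat_base(ne_segment_file_offset: int) -> int:
    """Apply runtime_flat_base = NE_segment_file_offset + 0x36F70."""
    return ne_segment_file_offset + NE_LOAD_DELTA


# The seg004 example from the text: file offset 0x40A00.
base = runtime_flat_base(0x40A00)
print(hex(base), flat_to_ghidra(base))
```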
### `0000:ffff` — NE Fixup Placeholder (not a dispatcher)
`unresolved_far_thunk_dispatch` at `0000:ffff` is **not** a runtime function and **not** a real dispatcher; it is the NE binary fixup placeholder. Every `CALLF 0x0000:ffff` in the original NE image is a **different** external or inter-segment call that the NE loader patches at load time. The body at `0000:ffff` is just placeholder data, so decompiling it as a function is meaningless.
- In a Phar Lap 286 NE executable, inter-segment and external far calls are stored in the binary as `CALLF 0x0000:ffff` (or similar invalid sentinel values).
- The Phar Lap NE loader patches each of these call sites to the real segment:offset at load time using the per-segment relocation records in the NE file.
- In Ghidra's raw import, those fixups are never applied. Every unresolved far call collapses to the same `0000:ffff` stub.
- **Each `CALLF 0x0000:ffff` in the binary is a DIFFERENT call with a DIFFERENT actual target.**
Repair status in `CRUSADER-RAW.EXE`:
- A PyGhidra repair pass now applies the verified NE relocation table directly to the raw-program bytes for literal internal `CALLF 9A ptr16:16` sites, then re-disassembles each patched instruction.
- Current verified batch results:
  - `8851` internal literal `CALLF` sites patched to their real segment:offset targets.
  - `2841` far-pointer relocation entries skipped because they were not literal `CALLF` instructions (data or other non-call uses).
  - `119` import callsites annotated as `NE IMPORT -> module.symbol`.
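
The patch-or-skip logic described above can be illustrated on a plain byte buffer. This is a sketch only, not the PyGhidra pass itself: the `relocations` mapping (operand offset to resolved `(segment, offset)`) is a hypothetical stand-in for whatever the real pass derives from the NE relocation records, and no Ghidra API is used.

```python
import struct

CALLF_OPCODE = 0x9A  # far call with an immediate ptr16:16 operand


def patch_callf_sites(image: bytearray, relocations: dict) -> tuple:
    """Patch literal CALLF operands in place; return (patched, skipped).

    relocations maps the offset of the 4-byte far pointer (the byte right
    after the 0x9A opcode) to its resolved (segment, offset) target.
    """
    patched = skipped = 0
    for operand_at, (seg, off) in sorted(relocations.items()):
        in_range = operand_at >= 1 and operand_at + 4 <= len(image)
        if not in_range or image[operand_at - 1] != CALLF_OPCODE:
            # Far pointer used as data or by a non-call instruction.
            skipped += 1
            continue
        # ptr16:16 is stored offset first, then segment, both little-endian.
        struct.pack_into("<HH", image, operand_at, off, seg)
        patched += 1
    return patched, skipped
```

Counting skipped entries separately mirrors the batch results above, where far-pointer relocations that are not literal `CALLF` instructions are left untouched.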
Known call-site classifications (by argument pattern):
- `PUSH DS; PUSH imm_ordinal; CALLF` — Phar Lap extender calling a runtime-imported procedure by ordinal
- `PUSH ptr_seg; PUSH ptr_off; CALLF` — inter-NE-segment function call (intra-game far call)
- Multiple typed pushes, then `CALLF` — external C runtime / game subsystem call with normal args
### Latest Raw Full-EXE Porting Progress
Newly ported and renamed into `CRUSADER-RAW.EXE` from verified `seg001` mapping (`base 0x6E570`):
- `0007:28ce` = `shot_entity_alloc` (`seg001 + 0x435e`)
- `0007:2a19` = `shot_entity_free` (`seg001 + 0x44a9`)
- `0007:2bc9` = `projectile_init_vector` (`seg001 + 0x4659`)
- `0007:3001` = `entity_fire_weapon` (`seg001 + 0x4a91`)
- `0007:3088` = `fire_weapon_from_cursor` (`seg001 + 0x4b18`)
- `0007:30e8` = `projectile_check_hit` (`seg001 + 0x4b78`)
- `0007:319e` = `projectile_step_update` (`seg001 + 0x4c2e`)
- `0007:3298` = `projectile_trace_ray` (`seg001 + 0x4d28`)
- `0007:371d` = `projectile_update_tick` (`seg001 + 0x51ad`)
- `0007:4009` = `projectile_apply_hit` (`seg001 + 0x5a99`)
## Segment Map
| Segment | Address Range | Purpose |
|---------|--------------|---------|
| CODE_0 | `1000:0000 - 1000:01ff` | Interrupt dispatch table / thunks |
| CODE_1 | `1020:0000 - 1020:0b9f` | Low-level interrupt handlers, mode switching |
| CODE_2 | `10da:0000 - 10da:25ef` | **Main runtime** — C library, I/O, formatting, entry point |
| CODE_3 | `1339:0000 - 1339:0c2f` | **DOS/DPMI services** — INT 21h/31h wrappers, interrupt vector mgmt, fast memcpy |
| CODE_4 | `13fc:0000 - 13fc:27af` | **String data & runtime constants** — error messages, format strings, Phar Lap ID |
| CODE_5 | `1677:0000 - 1677:0e8f` | **EMS/XMS memory management** — expanded memory handlers |
| CODE_6 | `1760:0000 - 1760:7ccd` | **DOS Extender core** — EXP loader, command-line parser, memory management, system init |
| DATA | `1760:7cd0 - 1760:7cdf` | Global data |
| HEADER | `HEADER::0000 - HEADER::044f` | MZ/P2 file header |
## NE Import Details
- File to import: `F:\Apps\Crusader No Remorse\CRUSADER.EXE`
- Outer DOS header: `MZ`
- `e_lfanew`: `0x36F70`
- Internal executable header: `NE`
- Segment count: `145`
- Initial `CS:IP`: `0001:0000`
- Initial `SS:SP`: `0091:2000`
The currently analyzed protected-mode code at addresses like `10da:7c40` is consistent with the Phar Lap runtime/loader path. To reach the rest of the program, import `CRUSADER.EXE` again using an **NE-aware loader** or a workflow that starts from the internal NE header rather than the outer DOS stub.
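
As a starting point for a workflow that begins at the internal NE header, the sketch below reads the entry fields and the segment table. It assumes the standard NE layout (CS:IP at `0x14`, SS:SP at `0x18`, segment count at `0x1C`, segment table offset at `0x22`, alignment shift at `0x32`, 8-byte segment entries); whether this Phar Lap-bound image follows that layout in every detail has not been verified here, and both function names are hypothetical.

```python
import struct


def read_ne_entry(data: bytes, ne_offset: int) -> dict:
    """Return the initial CS:IP and SS:SP as 'SSSS:OOOO' strings."""
    ip, cs = struct.unpack_from("<HH", data, ne_offset + 0x14)
    sp, ss = struct.unpack_from("<HH", data, ne_offset + 0x18)
    return {"cs_ip": f"{cs:04x}:{ip:04x}", "ss_sp": f"{ss:04x}:{sp:04x}"}


def read_ne_segments(data: bytes, ne_offset: int) -> list:
    """Return one dict per NE segment table entry."""
    if data[ne_offset:ne_offset + 2] != b"NE":
        raise ValueError(f"no NE signature at {ne_offset:#x}")
    (count,) = struct.unpack_from("<H", data, ne_offset + 0x1C)
    (table_rel,) = struct.unpack_from("<H", data, ne_offset + 0x22)
    (shift,) = struct.unpack_from("<H", data, ne_offset + 0x32)
    segments = []
    for i in range(count):
        entry_at = ne_offset + table_rel + i * 8
        sector, length, flags, min_alloc = struct.unpack_from("<4H", data, entry_at)
        segments.append({
            "index": i + 1,                   # NE segment numbers are 1-based
            "file_offset": sector << shift,   # sector units -> byte offset
            "length": length or 0x10000,      # a stored 0 means 64 KiB
            "flags": flags,
            "min_alloc": min_alloc or 0x10000,
        })
    return segments
```

For `CRUSADER.EXE`, `ne_offset` would be `0x36F70`; the returned `file_offset` values are what the segment mapping rule consumes.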
## Next Steps
1. **NE Segment 1 imported and analyzed** — all 58 identified functions renamed and annotated
2. **Raw 0007 segment analyzed** — rendering, camera/scroll, save slot, and scroll region subsystems documented (~60 functions renamed and annotated)
3. **Import additional NE segments** — priority: segments 22, 30, 59, 86 (segment 21 complete)
4. **Analyze raw 0007 draw helper cluster**: `FUN_0007_03b4`, `FUN_0007_04b8`, `FUN_0007_04dc`, `FUN_0007_057f`, `FUN_0007_0614`; called by sprite/draw list functions
5. **Analyze `FUN_0007_4cdf`** — large 15-case animation/movement dispatcher; overlapping instruction warnings; cases 0, 2, 3, 6, 9, 0xa, 0xe are clean
6. **Map file format loaders**: `.FLX`, `.SHP`, `.MAP`, `.TNT` resource formats
7. **Cross-reference entity type constants** with game entities (robots, platforms, triggers)
8. **Identify external segment calls** — the `func_0x0000ffff()` placeholders are all cross-segment calls; resolving them requires importing the referenced segments