Crusader_Decomp/docs/overview.md
MaddoScientisto 3daffbf113 Add extractor for Crusader's EUSECODE.FLX container
- Implemented a Python script to extract data from the EUSECODE.FLX file format.
- Defined data structures for candidate entries and extracted chunks using dataclasses.
- Added functions to read and parse the FLX table, extract candidate data, and generate human-readable output files.
- Included functionality for analyzing extracted data, including generating summaries, descriptors, and event family reports.
- Implemented utilities for calculating printable ratios, zero ratios, and identifying text-like data.
- Added support for writing various output formats, including JSON, TSV, and Markdown.
2026-03-22 14:27:38 +01:00

# Crusader: No Remorse — Binary Overview
## Binary Overview
- **Game**: Crusader: No Remorse (Origin Systems, 1995)
- **Platform**: DOS (16-bit protected mode)
- **DOS Extender**: Phar Lap 286 DOS-Extender (RUN286)
- **Executable Format**: Bound `MZ -> NE` executable with Phar Lap DOS-extender code
- **Entry Point**: `10da:7c40`
## Installed Copy Findings
- No standalone `.EXP` file exists in `F:\Apps\Crusader No Remorse`.
- `CRUSADER.EXE` is the original game binary and contains a valid internal `NE` header.
- Outer DOS `MZ` header points to `e_lfanew = 0x36F70`.
- Internal header at `0x36F70` starts with `NE` and describes **145 segments**.
- The NE segment table references data from the original file directly, so there is no separate embedded payload that needs to be carved out first.
- `CNRCEXP.EXE` is a modern Win32 helper tool, not part of the original DOS execution path.
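
The header checks above can be reproduced with a few lines of Python. This is a minimal sketch, assuming the standard MZ and NE header layouts (`e_lfanew` at `0x3C`, initial `CS:IP` at NE `+0x14`, initial `SS:SP` at NE `+0x18`, segment count at NE `+0x1C`); the function name `find_ne_header` is hypothetical and not part of the project's scripts.

```python
import struct


def find_ne_header(image: bytes) -> dict:
    """Locate the internal NE header of an MZ-bound executable image."""
    if image[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew is a little-endian DWORD at offset 0x3C of the MZ header.
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 2] != b"NE":
        raise ValueError("no NE signature at e_lfanew")
    # Initial CS:IP and SS:SP are stored as IP, CS, SP, SS (four WORDs).
    ip, cs, sp, ss = struct.unpack_from("<HHHH", image, e_lfanew + 0x14)
    # The segment count is a WORD at NE header offset 0x1C.
    (segment_count,) = struct.unpack_from("<H", image, e_lfanew + 0x1C)
    return {
        "e_lfanew": e_lfanew,
        "segment_count": segment_count,
        "cs_ip": (cs, ip),
        "ss_sp": (ss, sp),
    }
```

Run against the bytes of `CRUSADER.EXE`, this should report `e_lfanew = 0x36F70` and `145` segments if the findings above hold.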
## Raw Full-EXE Import Mapping
- A separate raw-binary import of the full executable (`crusader-raw.exe`) is usable: Ghidra discovers thousands of functions across a single flat `ram` block.
- Direct `file_offset -> flat_address` mapping from the standalone segment extracts is not reliable for porting names into that raw import.
- The extracted `segNNN_*.bin` files match `CRUSADER_NE.EXE`, but the raw full-EXE import must be mapped by verified byte signatures / known function bodies.
- Verified segment bases in the raw full-EXE import:
  - `seg001` base = `0x6E570` (`cursor_update_hover` at `0006:e5d0`, rel `0x0060`)
  - `seg021` base = `0x87170` (`entity_count_by_type_a` at `0008:7377`, rel `0x0207`)
- Porting rule for these verified segments:
  - `raw_full_exe_flat = verified_segment_base + standalone_segment_relative_offset`
- Naming note:
  - `seg001` and `seg021` both contain a keyboard handler; in the full program database, the `seg001` copy is named `seg001_input_keyboard_handler` to avoid a symbol collision with the `seg021` `input_keyboard_handler`.
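
The signature-based verification and the porting rule can be sketched as follows. This is an illustration only: `find_segment_base` and `port_address` are hypothetical helper names, and the signature bytes would come from a known function body in a standalone `segNNN_*.bin` extract. A base is accepted only when the signature matches exactly once in the raw image.

```python
# Bases verified in this document; other segments must be verified first.
VERIFIED_BASES = {"seg001": 0x6E570, "seg021": 0x87170}


def find_segment_base(raw: bytes, signature: bytes, rel_offset: int) -> int:
    """Derive a segment's flat base from one known function body.

    `signature` is the byte pattern of the function and `rel_offset` is its
    offset inside the standalone segment extract.
    """
    first = raw.find(signature)
    if first < 0:
        raise LookupError("signature not found in raw image")
    if raw.find(signature, first + 1) >= 0:
        raise LookupError("signature is ambiguous (more than one match)")
    return first - rel_offset


def port_address(segment: str, rel_offset: int) -> int:
    """Apply raw_full_exe_flat = verified_segment_base + relative offset."""
    return VERIFIED_BASES[segment] + rel_offset
```

For example, `port_address("seg001", 0x0060)` gives `0x6E5D0`, which is `cursor_update_hover` at `0006:e5d0`.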
### Address Space Layout in the Raw Import
Ghidra segment:offset `SSSS:OOOO` = flat address `SSSS * 0x10000 + OOOO`.
| Flat range | Content |
|---|---|
| `0x00000` - `0x36F6F` | Phar Lap 286 DOS extender (outer MZ stub code) |
| `0x36F70` | NE header (145-segment game image begins here in file) |
| `0x6E570`+ | NE game segments at their Phar Lap linear load addresses |
Mapping rule (verified for seg001 and seg021):
```
runtime_flat_base = NE_segment_file_offset + 0x36F70
```
Example: seg004 at file `0x40A00` → runtime `0x77970` → Ghidra `0007:7970`.
Functions at Ghidra `0003:XXXX` / `0004:XXXX` are **Phar Lap extender code**: their flat addresses lie below `0x6E570`, the lowest game segment base listed above. Functions at `0006:E570` and above belong to the game's NE segments.
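
The two address rules in this section are plain arithmetic and can be checked with a short sketch. The helper names are hypothetical; the constant `0x36F70` and the `seg004` example come from the text above.

```python
NE_HEADER_OFFSET = 0x36F70  # e_lfanew of CRUSADER.EXE, as documented above


def ghidra_to_flat(address: str) -> int:
    """Convert a Ghidra 'SSSS:OOOO' string to SSSS * 0x10000 + OOOO."""
    seg, off = address.split(":")
    return int(seg, 16) * 0x10000 + int(off, 16)


def flat_to_ghidra(flat: int) -> str:
    """Convert a flat raw-import address back to 'SSSS:OOOO' form."""
    return f"{flat >> 16:04x}:{flat & 0xFFFF:04x}"


def runtime_flat_base(ne_segment_file_offset: int) -> int:
    """Apply the mapping rule verified for seg001 and seg021."""
    return ne_segment_file_offset + NE_HEADER_OFFSET
```

With these helpers, `flat_to_ghidra(runtime_flat_base(0x40A00))` reproduces the `seg004` example, `0007:7970`.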
### `0000:ffff` — NE Fixup Placeholder (not a dispatcher)
`unresolved_far_thunk_dispatch` at `0000:ffff` is NOT a runtime function and not a real dispatcher. It is the NE fixup placeholder: every `CALLF 0x0000:ffff` in the original NE image is a **different** external or inter-segment call that the NE loader patches at load time. The bytes at `0000:ffff` are just placeholder data, so decompiling them as a function is meaningless.
- In a Phar Lap 286 NE executable, inter-segment and external far calls are stored in the binary as `CALLF 0x0000:ffff` (or similar invalid sentinel values).
- The Phar Lap NE loader patches each of these call sites to the real segment:offset at load time using the per-segment relocation records in the NE file.
- In Ghidra's raw import, those fixups are never applied. Every unresolved far call collapses to the same `0000:ffff` stub.
- **Each `CALLF 0x0000:ffff` in the binary is a DIFFERENT call with a DIFFERENT actual target.**
Repair status in `CRUSADER-RAW.EXE`:
- A PyGhidra repair pass now applies the verified NE relocation table directly to the raw-program bytes for literal internal `CALLF 9A ptr16:16` sites, then re-disassembles each patched instruction.
- Current verified batch results:
  - `8851` internal literal `CALLF` sites patched to their real segment:offset targets.
  - `2841` far-pointer relocation entries skipped because they were not literal `CALLF` instructions (data or other non-call uses).
  - `119` import callsites annotated as `NE IMPORT -> module.symbol`.
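
The core of such a repair pass can be sketched outside Ghidra as a byte-level patch. This is a simplified illustration, not the project's PyGhidra script. It assumes the relocation records have already been resolved into `(pointer_offset, target_seg, target_off)` tuples, a hypothetical input shape, and it leaves out the re-disassembly step. A site counts as a literal `CALLF` only when the opcode byte `0x9A` sits directly before the 4-byte far pointer.

```python
import struct

CALLF_OPCODE = 0x9A  # far call with an immediate ptr16:16 operand


def patch_callf_sites(image: bytearray, fixups):
    """Patch literal CALLF operands in place.

    `fixups` yields (pointer_offset, target_seg, target_off) tuples, where
    pointer_offset is the position of the 4-byte far pointer in `image`.
    Returns (patched, skipped) counts.
    """
    patched = skipped = 0
    for pointer_offset, target_seg, target_off in fixups:
        # A literal CALLF has its opcode byte directly before the pointer.
        if pointer_offset < 1 or image[pointer_offset - 1] != CALLF_OPCODE:
            skipped += 1  # data or another non-call use of a far pointer
            continue
        # ptr16:16 is stored offset first, then segment, both little-endian.
        struct.pack_into("<HH", image, pointer_offset, target_off, target_seg)
        patched += 1
    return patched, skipped
```

The two counters correspond to the "patched" and "skipped" figures reported above; import callsites would need separate handling.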
Known call-site classifications (by argument pattern):
- `PUSH DS; PUSH imm_ordinal; CALLF` — Phar Lap extender calling a runtime-imported procedure by ordinal
- `PUSH ptr_seg; PUSH ptr_off; CALLF` — inter-NE-segment function call (intra-game far call)
- Multiple typed pushes then CALLF — external C runtime / game subsystem call with normal args
### Latest Raw Full-EXE Porting Progress
Newly ported and renamed into `CRUSADER-RAW.EXE` from verified `seg001` mapping (`base 0x6E570`):
- `0007:28ce` = `shot_entity_alloc` (`seg001 + 0x435e`)
- `0007:2a19` = `shot_entity_free` (`seg001 + 0x44a9`)
- `0007:2bc9` = `projectile_init_vector` (`seg001 + 0x4659`)
- `0007:3001` = `entity_fire_weapon` (`seg001 + 0x4a91`)
- `0007:3088` = `fire_weapon_from_cursor` (`seg001 + 0x4b18`)
- `0007:30e8` = `projectile_check_hit` (`seg001 + 0x4b78`)
- `0007:319e` = `projectile_step_update` (`seg001 + 0x4c2e`)
- `0007:3298` = `projectile_trace_ray` (`seg001 + 0x4d28`)
- `0007:371d` = `projectile_update_tick` (`seg001 + 0x51ad`)
- `0007:4009` = `projectile_apply_hit` (`seg001 + 0x5a99`)
## Segment Map
| Segment | Address Range | Purpose |
|---------|--------------|---------|
| CODE_0 | `1000:0000 - 1000:01ff` | Interrupt dispatch table / thunks |
| CODE_1 | `1020:0000 - 1020:0b9f` | Low-level interrupt handlers, mode switching |
| CODE_2 | `10da:0000 - 10da:25ef` | **Main runtime** — C library, I/O, formatting, entry point |
| CODE_3 | `1339:0000 - 1339:0c2f` | **DOS/DPMI services** — INT 21h/31h wrappers, interrupt vector mgmt, fast memcpy |
| CODE_4 | `13fc:0000 - 13fc:27af` | **String data & runtime constants** — error messages, format strings, Phar Lap ID |
| CODE_5 | `1677:0000 - 1677:0e8f` | **EMS/XMS memory management** — expanded memory handlers |
| CODE_6 | `1760:0000 - 1760:7ccd` | **DOS Extender core** — EXP loader, command-line parser, memory management, system init |
| DATA | `1760:7cd0 - 1760:7cdf` | Global data |
| HEADER | `HEADER::0000 - HEADER::044f` | MZ/P2 file header |
## NE Import Details
- File to import: `F:\Apps\Crusader No Remorse\CRUSADER.EXE`
- Outer DOS header: `MZ`
- `e_lfanew`: `0x36F70`
- Internal executable header: `NE`
- Segment count: `145`
- Initial `CS:IP`: `0001:0000`
- Initial `SS:SP`: `0091:2000`
The currently analyzed protected-mode code at addresses like `10da:7c40` is consistent with the Phar Lap runtime/loader path. To reach the rest of the program, import `CRUSADER.EXE` again using an **NE-aware loader** or a workflow that starts from the internal NE header rather than the outer DOS stub.
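
A workflow that starts from the internal NE header mainly needs the segment table. The sketch below assumes the standard NE layout (segment table offset at NE `+0x22`, alignment shift at NE `+0x32`, 8-byte table entries) and uses a hypothetical function name. It reports offsets exactly as stored in the table; for the raw full-EXE import, this document's mapping rule adds `0x36F70` to them.

```python
import struct


def read_ne_segment_table(image: bytes, ne_offset: int):
    """Return the segment-table entries for the NE header at ne_offset."""
    if image[ne_offset:ne_offset + 2] != b"NE":
        raise ValueError("no NE signature at ne_offset")
    (count,) = struct.unpack_from("<H", image, ne_offset + 0x1C)
    (table_rel,) = struct.unpack_from("<H", image, ne_offset + 0x22)
    (shift,) = struct.unpack_from("<H", image, ne_offset + 0x32)
    shift = shift or 9  # assumption: a zero shift means the default of 9
    segments = []
    for index in range(count):
        entry = ne_offset + table_rel + index * 8
        sector, length, flags, min_alloc = struct.unpack_from("<HHHH", image, entry)
        has_data = sector != 0  # a zero sector means no data in the file
        segments.append({
            "number": index + 1,  # NE segment numbers are 1-based
            "file_offset": sector << shift,
            # A stored length of zero means 64 KiB when file data exists.
            "length": (length or 0x10000) if has_data else 0,
            "flags": flags,
            "min_alloc": min_alloc or 0x10000,
        })
    return segments
```

For `CRUSADER.EXE` the call would be `read_ne_segment_table(data, 0x36F70)`, which should return 145 entries if the header values listed above are correct.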
## Next Steps
1. **NE Segment 1 imported and analyzed**: all 58 identified functions renamed and annotated
2. **Raw 0007 segment analyzed**: rendering, camera/scroll, save slot, and scroll region subsystems documented (~60 functions renamed and annotated)
3. **Import additional NE segments**: priority is segments 22, 30, 59, 86 (segment 21 complete)
4. **Analyze raw 0007 draw helper cluster**: `FUN_0007_03b4`, `FUN_0007_04b8`, `FUN_0007_04dc`, `FUN_0007_057f`, `FUN_0007_0614`; called by sprite/draw list functions
5. **Analyze `FUN_0007_4cdf`**: large 15-case animation/movement dispatcher; overlapping instruction warnings; cases 0, 2, 3, 6, 9, 0xa, 0xe are clean
6. **Map file format loaders**: `.FLX`, `.SHP`, `.MAP`, `.TNT` resource formats
7. **Cross-reference entity type constants** with game entities (robots, platforms, triggers)
8. **Identify external segment calls**: the `func_0x0000ffff()` placeholders are all cross-segment calls; resolving them requires importing the referenced segments