Crusader: No Remorse — Binary Overview

Binary Overview

  • Game: Crusader: No Remorse (Origin Systems, 1995)
  • Platform: DOS (16-bit protected mode)
  • DOS Extender: Phar Lap 286 DOS-Extender (RUN286)
  • Executable Format: Bound MZ -> NE executable with Phar Lap DOS-extender code
  • Entry Point: 10da:7c40

Installed Copy Findings

  • No standalone .EXP file exists in F:\Apps\Crusader No Remorse.
  • CRUSADER.EXE is the original game binary and contains a valid internal NE header.
  • Outer DOS MZ header points to e_lfanew = 0x36F70.
  • Internal header at 0x36F70 starts with NE and describes 145 segments.
  • The NE segment table references data from the original file directly, so there is no separate embedded payload that needs to be carved out first.
  • CNRCEXP.EXE is a modern Win32 helper tool, not part of the original DOS execution path.
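
The header checks above can be reproduced with a short script. This is a minimal sketch, not the project's own tooling: the function name `read_ne_summary` and the returned keys are hypothetical, and the field offsets are those of the standard MZ and NE header layouts (`e_lfanew` at 0x3C; initial CS:IP at NE+0x14, SS:SP at NE+0x18, segment count at NE+0x1C, segment table offset at NE+0x22).

```python
import struct


def read_ne_summary(data: bytes) -> dict:
    """Return a few header fields from a bound MZ -> NE image (hypothetical helper)."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew is a 32-bit little-endian file offset stored at 0x3C in the MZ header.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 2] != b"NE":
        raise ValueError("no NE signature at e_lfanew")
    # NE+0x14: IP, NE+0x16: CS, NE+0x18: SP, NE+0x1A: SS, NE+0x1C: segment count.
    ip, cs, sp, ss, seg_count = struct.unpack_from("<5H", data, e_lfanew + 0x14)
    # NE+0x22: segment table offset, relative to the start of the NE header.
    (seg_table_rel,) = struct.unpack_from("<H", data, e_lfanew + 0x22)
    return {
        "e_lfanew": e_lfanew,
        "segment_count": seg_count,
        "initial_cs_ip": (cs, ip),
        "initial_ss_sp": (ss, sp),
        "segment_table_offset": e_lfanew + seg_table_rel,
    }
```

Called as `read_ne_summary(open(path, "rb").read())` on CRUSADER.EXE, the findings above correspond to `e_lfanew == 0x36F70` and `segment_count == 145`.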

Raw Full-EXE Import Mapping

  • A separate raw-binary import of the full executable (crusader-raw.exe) is usable: Ghidra discovers thousands of functions across a single flat ram block.
  • Direct file_offset -> flat_address mapping from the standalone segment extracts is not reliable for porting names into that raw import.
  • The extracted segNNN_*.bin files match CRUSADER_NE.EXE, but the raw full-EXE import must be mapped by verified byte signatures / known function bodies.
  • Verified segment bases in the raw full-EXE import:
    • seg001 base = 0x6E570 (cursor_update_hover at 0006:e5d0, rel 0x0060)
    • seg021 base = 0x87170 (entity_count_by_type_a at 0008:7377, rel 0x0207)
  • Porting rule for these verified segments:
    • raw_full_exe_flat = verified_segment_base + standalone_segment_relative_offset
  • Naming note:
    • seg001 and seg021 both contain a keyboard handler; in the full program database, the seg001 copy is named seg001_input_keyboard_handler to avoid a symbol collision with seg021 input_keyboard_handler.
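
The porting rule can be written as a small helper. This is only a sketch: `VERIFIED_BASES`, `port_address`, and `ghidra_addr` are illustrative names, and the table holds only the two bases verified above. Any other segment is rejected rather than guessed.

```python
# Segment bases verified against known function bodies in the raw full-EXE import.
VERIFIED_BASES = {
    "seg001": 0x6E570,
    "seg021": 0x87170,
}


def port_address(segment: str, rel_offset: int) -> int:
    """raw_full_exe_flat = verified_segment_base + standalone_segment_relative_offset."""
    if segment not in VERIFIED_BASES:
        raise KeyError(f"{segment} has no verified base; match byte signatures first")
    return VERIFIED_BASES[segment] + rel_offset


def ghidra_addr(flat: int) -> str:
    """Format a flat address the way the raw import displays it (SSSS:OOOO)."""
    return f"{flat >> 16:04x}:{flat & 0xFFFF:04x}"


print(ghidra_addr(port_address("seg001", 0x0060)))  # cursor_update_hover
print(ghidra_addr(port_address("seg021", 0x0207)))  # entity_count_by_type_a
```

The two calls reproduce the verification points listed above, 0006:e5d0 and 0008:7377.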

Address Space Layout in the Raw Import

Ghidra segment:offset SSSS:OOOO = flat address SSSS * 0x10000 + OOOO.

| Flat range | Content |
| --- | --- |
| 0x00000 - 0x36F6F | Phar Lap 286 DOS extender (outer MZ stub code) |
| 0x36F70 | NE header (145-segment game image begins here in file) |
| 0x6E570+ | NE game segments at their Phar Lap linear load addresses |

Mapping rule (verified for seg001 and seg021):

runtime_flat_base = NE_segment_file_offset + 0x36F70

Example: seg004 at file 0x40A00 → runtime 0x77970 → Ghidra 0007:7970.
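
The mapping rule and the seg004 example can be checked with a few lines of arithmetic. This is a sketch under the document's own assumptions: the constant 0x36F70 and the rule are taken as stated (verified only for seg001 and seg021), and the function names are illustrative.

```python
NE_HEADER_OFFSET = 0x36F70  # e_lfanew of CRUSADER.EXE, as reported in this document


def runtime_flat_base(ne_segment_file_offset: int) -> int:
    """runtime_flat_base = NE_segment_file_offset + 0x36F70 (rule as stated above)."""
    return ne_segment_file_offset + NE_HEADER_OFFSET


def split_flat(flat: int) -> tuple:
    """Split a flat address into the (SSSS, OOOO) pair shown by the raw import."""
    return divmod(flat, 0x10000)


flat = runtime_flat_base(0x40A00)          # seg004 example
seg, off = split_flat(flat)
print(f"{flat:#x} -> {seg:04x}:{off:04x}")  # 0x77970 -> 0007:7970
```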

Functions at Ghidra 0003:XXXX / 0004:XXXX are Phar Lap extender code (flat < 0x40000 is below any game segment). Functions at 0006:e570 and above belong to game NE segments.

0000:ffff — NE Fixup Placeholder (not a dispatcher)

unresolved_far_thunk_dispatch at 0000:ffff is NOT a runtime function and NOT a real dispatcher; it is the NE fixup placeholder. Every CALLF 0x0000:ffff in the original NE image is a different external or inter-segment call that the NE loader patches at load time. The bytes at 0000:ffff are just placeholder data, so decompiling them as a function is meaningless.

  • In a Phar Lap 286 NE executable, inter-segment and external far calls are stored in the binary as CALLF 0x0000:ffff (or similar invalid sentinel values).
  • The Phar Lap NE loader patches each of these call sites to the real segment:offset at load time using the per-segment relocation records in the NE file.
  • In Ghidra's raw import, those fixups are never applied. Every unresolved far call collapses to the same 0000:ffff stub.
  • Each CALLF 0x0000:ffff in the binary is a DIFFERENT call with a DIFFERENT actual target.
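
To get a rough count of unpatched call sites in a segment extract, the placeholder can be searched for as a byte pattern. This is a sketch, not the project's tooling: a far call `CALLF ptr16:16` is opcode 0x9A followed by the 16-bit offset and then the 16-bit segment, both little-endian, so `CALLF 0x0000:ffff` encodes as `9A FF FF 00 00`. A raw byte scan can also match data or the middle of another instruction, so the results are candidates only, and sites that use a different sentinel value are not found.

```python
# CALLF 0000:ffff = opcode 9A, offset FFFF (little-endian), segment 0000.
PLACEHOLDER = bytes([0x9A, 0xFF, 0xFF, 0x00, 0x00])


def find_placeholder_calls(code: bytes) -> list:
    """Return offsets of every byte sequence that looks like CALLF 0000:ffff."""
    hits = []
    pos = code.find(PLACEHOLDER)
    while pos != -1:
        hits.append(pos)
        pos = code.find(PLACEHOLDER, pos + 1)
    return hits
```

Each hit still has to be paired with its relocation record to learn the real target; the scan only shows where fixups were left unapplied.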

Repair status in CRUSADER-RAW.EXE:

  • A PyGhidra repair pass now applies the verified NE relocation table directly to the raw-program bytes for literal internal CALLF 9A ptr16:16 sites, then re-disassembles each patched instruction.
  • Current verified batch results:
    • 8851 internal literal CALLF sites patched to their real segment:offset targets.
    • 2841 far-pointer relocation entries skipped because they were not literal CALLF instructions (data or other non-call uses).
    • 119 import callsites annotated as NE IMPORT -> module.symbol.
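
The patch-or-skip decision behind those counts can be illustrated on a plain byte buffer. This is a simplified sketch, not the PyGhidra pass itself: `FarFixup` is a hypothetical, already-decoded relocation entry (parsing the NE relocation records is not shown), and checking for opcode 0x9A directly before the operand is a stand-in for the real disassembly-based check.

```python
import struct
from dataclasses import dataclass

CALLF_OPCODE = 0x9A  # far call with a literal ptr16:16 operand


@dataclass(frozen=True)
class FarFixup:
    """Hypothetical decoded relocation: where the far pointer lives and its target."""
    operand_offset: int   # offset of the 4-byte ptr16:16 in the image
    target_segment: int
    target_offset: int


def apply_callf_fixups(image: bytearray, fixups) -> tuple:
    """Patch far pointers that are CALLF operands; return (patched, skipped)."""
    patched = skipped = 0
    for fx in fixups:
        opcode_at = fx.operand_offset - 1
        in_bounds = opcode_at >= 0 and fx.operand_offset + 4 <= len(image)
        if not in_bounds or image[opcode_at] != CALLF_OPCODE:
            skipped += 1  # data pointer or other non-call use
            continue
        # ptr16:16 is stored as offset first, then segment, both little-endian.
        struct.pack_into("<HH", image, fx.operand_offset,
                         fx.target_offset, fx.target_segment)
        patched += 1
    return patched, skipped
```

A data pointer that happens to follow a 0x9A byte would be misclassified by this check, which is one reason to re-disassemble each patched site, as the repair pass does.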

Known call-site classifications (by argument pattern):

  • PUSH DS; PUSH imm_ordinal; CALLF — Phar Lap extender calling a runtime-imported procedure by ordinal
  • PUSH ptr_seg; PUSH ptr_off; CALLF — inter-NE-segment function call (intra-game far call)
  • Multiple typed pushes then CALLF — external C runtime / game subsystem call with normal args

Latest Raw Full-EXE Porting Progress

Newly ported and renamed into CRUSADER-RAW.EXE from verified seg001 mapping (base 0x6E570):

  • 0007:28ce = shot_entity_alloc (seg001 + 0x435e)
  • 0007:2a19 = shot_entity_free (seg001 + 0x44a9)
  • 0007:2bc9 = projectile_init_vector (seg001 + 0x4659)
  • 0007:3001 = entity_fire_weapon (seg001 + 0x4a91)
  • 0007:3088 = fire_weapon_from_cursor (seg001 + 0x4b18)
  • 0007:30e8 = projectile_check_hit (seg001 + 0x4b78)
  • 0007:319e = projectile_step_update (seg001 + 0x4c2e)
  • 0007:3298 = projectile_trace_ray (seg001 + 0x4d28)
  • 0007:371d = projectile_update_tick (seg001 + 0x51ad)
  • 0007:4009 = projectile_apply_hit (seg001 + 0x5a99)

Segment Map

| Segment | Address Range | Purpose |
| --- | --- | --- |
| CODE_0 | 1000:0000 - 1000:01ff | Interrupt dispatch table / thunks |
| CODE_1 | 1020:0000 - 1020:0b9f | Low-level interrupt handlers, mode switching |
| CODE_2 | 10da:0000 - 10da:25ef | Main runtime: C library, I/O, formatting, entry point |
| CODE_3 | 1339:0000 - 1339:0c2f | DOS/DPMI services: INT 21h/31h wrappers, interrupt vector mgmt, fast memcpy |
| CODE_4 | 13fc:0000 - 13fc:27af | String data & runtime constants: error messages, format strings, Phar Lap ID |
| CODE_5 | 1677:0000 - 1677:0e8f | EMS/XMS memory management: expanded memory handlers |
| CODE_6 | 1760:0000 - 1760:7ccd | DOS Extender core: EXP loader, command-line parser, memory management, system init |
| DATA | 1760:7cd0 - 1760:7cdf | Global data |
| HEADER | HEADER::0000 - HEADER::044f | MZ/P2 file header |

NE Import Details

  • File to import: F:\Apps\Crusader No Remorse\CRUSADER.EXE
  • Outer DOS header: MZ
  • e_lfanew: 0x36F70
  • Internal executable header: NE
  • Segment count: 145
  • Initial CS:IP: 0001:0000
  • Initial SS:SP: 0091:2000

The currently analyzed protected-mode code at addresses like 10da:7c40 is consistent with the Phar Lap runtime/loader path. To reach the rest of the program, import CRUSADER.EXE again using an NE-aware loader or a workflow that starts from the internal NE header rather than the outer DOS stub.

Next Steps

  1. NE Segment 1 imported and analyzed — all 58 identified functions renamed and annotated
  2. Raw 0007 segment analyzed — rendering, camera/scroll, save slot, and scroll region subsystems documented (~60 functions renamed and annotated)
  3. Import additional NE segments — priority: segments 22, 30, 59, 86 (segment 21 complete)
  4. Analyze raw 0007 draw helper cluster: FUN_0007_03b4, FUN_0007_04b8, FUN_0007_04dc, FUN_0007_057f, FUN_0007_0614; called by sprite/draw list functions
  5. Analyze FUN_0007_4cdf — large 15-case animation/movement dispatcher; overlapping instruction warnings; cases 0, 2, 3, 6, 9, 0xa, 0xe are clean
  6. Map file format loaders: .FLX, .SHP, .MAP, .TNT resource formats
  7. Cross-reference entity type constants with game entities (robots, platforms, triggers)
  8. Identify external segment calls — the func_0x0000ffff() placeholders are all cross-segment calls; resolving them requires importing the referenced segments