Crusader_Decomp/docs/psx/psx.md

2026-03-30 00:19:01 +02:00
# Crusader: No Remorse (PlayStation) Recon
## Scope
- Target disc tree: `E:\emu\psx\Crusader - No Remorse`
- Goal of this pass: identify the boot executable, separate likely code from content, and find the most practical first extraction routes for PS1 assets.
## Immediate Conclusions
- `SYSTEM.CNF` is the disc boot configuration and points directly at `cdrom:\SLUS_002.68;1`.
- `SLUS_002.68` is the main game executable. It begins with a valid `PS-X EXE` header.
- No other top-level file currently looks like a second normal PS1 executable.
- Disc content is dominated by standard PS1 media (`.STR`, `.XA`) plus a large number of game-specific `.WDL` blobs.
- `FMV.BIN` looks like movie-playback support data or a resource blob, not a bootable executable.
- `ZZZ.ZZZ` looks much more like media/container data than code, and its size exactly matches `MOVIES/FMV3.STR`.
## Disc Boot Evidence
`SYSTEM.CNF` contents:
```ini
BOOT = cdrom:\SLUS_002.68;1
TCB = 4
EVENT = 10
STACK = 801FFFFC
```
`SLUS_002.68` header evidence:
- Magic: `PS-X EXE`
- Initial PC: `0x8004BD90`
- Load address: `0x80010000`
- Image size: `0x000A4800` (`673,792` bytes; together with the `0x800`-byte header this matches the `675,840`-byte file)
This is the clearest Ghidra import candidate for code analysis.
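
The header fields above can be re-read with a few lines of Python. This is a minimal sketch, not part of the existing tooling: the function name and returned keys are hypothetical, and it assumes the standard `PS-X EXE` layout (initial PC at `0x10`, load address at `0x18`, payload size at `0x1C`, header padded to `0x800` bytes).

```python
import struct

PSX_EXE_MAGIC = b"PS-X EXE"
PSX_EXE_HEADER_SIZE = 0x800  # the header occupies one 2048-byte sector


def parse_psx_exe_header(header: bytes) -> dict:
    """Return entry point, load address, and payload size from a PS-X EXE header."""
    if len(header) < 0x20 or header[:8] != PSX_EXE_MAGIC:
        raise ValueError("not a PS-X EXE header")
    # 0x10: initial PC, 0x14: initial GP, 0x18: load address, 0x1C: payload size
    initial_pc, _initial_gp, load_addr, image_size = struct.unpack_from(
        "<4I", header, 0x10
    )
    return {
        "initial_pc": initial_pc,
        "load_addr": load_addr,
        "image_size": image_size,
        # The size field excludes the header itself.
        "expected_file_size": image_size + PSX_EXE_HEADER_SIZE,
    }
```

Feeding it the first `0x800` bytes of `SLUS_002.68` should give an `expected_file_size` equal to the on-disc file size if the header is intact.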
## Top-Level File Classification
Top-level items:
- `SLUS_002.68`
- `SYSTEM.CNF`
- `FMV.BIN`
- `ZZZ.ZZZ`
- `SPEC_A.WDL`
- `LEGAL.SCR`
- `LICENSEA.DAT`
- `AUDIO/`
- `MOVIES/`
- `MENUS/`
- `LSET1/` through `LSET7/`
Recursive extension summary from the extracted disc tree:
| Extension | Count | Total bytes | Current classification |
|---|---:|---:|---|
| `.STR` | 33 | 415,027,744 | PS1 movie streams |
| `.XA` | 2 | 91,897,856 | PS1 XA audio |
| `.WDL` | 66 | 76,198,860 | custom game asset/level blobs |
| `.ZZZ` | 1 | 26,148,984 | likely media/container data |
| `.68` | 1 | 675,840 | main PS1 executable |
| `.SCR` | 1 | 153,600 | data asset |
| `.BIN` | 1 | 90,812 | support/resource blob |
| `.DAT` | 1 | 28,032 | data asset |
| `.CNF` | 1 | 70 | boot config |
## File Family Findings
### 1. `SLUS_002.68`
- This is the boot target from `SYSTEM.CNF`.
- It has a normal PS1 executable header.
- It should be the first import into Ghidra.
- Current working assumption: this is the only primary native code binary on the disc.
### 2. `AUDIO/*.XA`
- Files found:
  - `AUDIO/MULTI8.XA`
  - `AUDIO/TALK1.XA`
- These are standard PS1 XA audio files.
- `MULTI8.XA` is divisible by both `2352` and `2048`, which is consistent with sector-oriented media data.
- `TALK1.XA` is divisible by `2048` but not exactly by `2352`.
- Most practical extraction route: use standard PS1/XA tooling rather than custom RE first.
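
The divisibility checks above are easy to repeat for any file. The helper below is an illustrative sketch only; its name and labels are hypothetical. It tests a byte count against the common CD sector sizes: `2048` (user data), `2336` (raw sector without sync and header), and `2352` (full raw sector).

```python
SECTOR_SIZES = {
    "user_data_2048": 2048,
    "raw_minus_sync_header_2336": 2336,
    "raw_2352": 2352,
}


def sector_layout_candidates(size_bytes: int) -> dict:
    """Map each sector size that divides size_bytes evenly to its sector count."""
    return {
        name: size_bytes // sector
        for name, sector in SECTOR_SIZES.items()
        if size_bytes % sector == 0
    }
```

A clean division is only consistent with a sector layout; it does not prove which layout the file actually uses.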
### 3. `MOVIES/*.STR`
- Files found: `FMV0.STR` through `FMV32.STR`.
- These are the strongest candidates for standard PS1 video streams.
- The raw headers look consistent with sectorized PS1 stream data rather than executable code.
- Most practical extraction route: treat them as PS1 STR video and run them through PS1 media tooling first.
### 4. `FMV.BIN`
This file does not look like a normal executable.
First bytes begin with:
```text
\MOVIES\FMV%d.STR
MDEC_rest:bad option(%d)
MDEC_in_sync
MDEC_out_sync
DMA=(%d,%d), ADDR=(0x%08x->0x%08x)
FIFO=(%d,%d),BUSY=%d,DREQ=(%d,%d),RGB24=%d,STP=%d
```
Current best read:
- `FMV.BIN` is movie-related support data, code tables, or debug/resource text for the MDEC/FMVs.
- It clearly references the external movie path pattern `\MOVIES\FMV%d.STR`.
- It is worth a secondary Ghidra import only if the goal is to understand the movie subsystem specifically.
- It is not the disc boot executable.
### 5. `ZZZ.ZZZ`
Key findings:
- Size: `26,148,984` bytes
- That size exactly matches `MOVIES/FMV3.STR`.
- The file begins with stream-like binary data rather than an executable header.
- It also yielded movie-adjacent string evidence such as `MDEC`.
Current best read:
- `ZZZ.ZZZ` is probably not code.
- It is a strong candidate for either:
  - a renamed movie stream, or
  - a duplicate/alternate copy of `FMV3.STR`
Most practical next check:
1. compare a few sectors or hashes against `MOVIES/FMV3.STR`
2. try opening `ZZZ.ZZZ` directly in PS1 STR-capable tooling
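
The first check can be done with a plain sector-by-sector comparison. This is a minimal sketch; the function name is hypothetical and the `2048`-byte comparison unit is an assumption, not a confirmed property of either file.

```python
def first_mismatch_sector(path_a, path_b, sector_size=2048):
    """Return the index of the first differing sector, or None if the files match."""
    with open(path_a, "rb") as file_a, open(path_b, "rb") as file_b:
        index = 0
        while True:
            chunk_a = file_a.read(sector_size)
            chunk_b = file_b.read(sector_size)
            if chunk_a != chunk_b:
                return index  # also reached when one file ends early
            if not chunk_a:
                return None   # both files ended with no difference
            index += 1
```

A `None` result for `ZZZ.ZZZ` against `MOVIES/FMV3.STR` would mean a byte-identical duplicate; a small index would point at the first place the two diverge.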
### 6. `LSET*/L*.WDL`
These are the most important unknown asset family for content extraction.
Representative level sample: `LSET1/L0.WDL`
- Size: `1,312,624` bytes
- Header starts with structured values, not raw pixels:
```text
0x00000034 0x00006FDC 0x0000376C 0x00001000
0x00000160 0x00000498 0x0000025C 0x00000FE0
0x00000070 0x00072EC4 0x00034B6C 0x00007448
0x0007407C 0x00010824 0x00000002 0x00000000
```
- This does not behave like a flat framebuffer dump.
- It looks more like a custom structured level/container blob with internal offsets, lengths, or section pointers.
- Additional offset targets inside the file, such as `0x376C`, `0x6FDC`, and `0x10824`, land on repeating structured records rather than code.
Strict TIM-style scan results for `L0.WDL` found plausible embedded PS1 image headers at:
- `0xE7A84`
- `0x117DEC`
- `0x12CECC`
- `0x135F18`
- `0x1369F4`
- `0x136B38`
- `0x136C40`
Current best read:
- `LSET*.WDL` likely holds mixed level resources.
- At least some of those resources may include standard embedded PS1 TIM-like image blocks.
- These files are the strongest current target for a custom extractor.
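
For reference, a strict TIM-style check can be sketched as below. This is not the scanner in `tools/psx_extract_wdl.py`, whose exact rules are not shown here; the function names, the 4-byte scan step, and the specific strictness rules are assumptions. It relies on the standard TIM layout: magic `0x10`, a flags word (bits 0-2 select 4/8/16/24 bpp, bit 3 marks a CLUT block), then one or two blocks whose length field equals `12 + w * h * 2`.

```python
import struct

TIM_MAGIC = 0x00000010
BPP_MODES = {0: 4, 1: 8, 2: 16, 3: 24}


def check_tim_at(data: bytes, offset: int):
    """Return a description if a strictly valid TIM starts at offset, else None."""
    if offset + 8 > len(data):
        return None
    magic, flags = struct.unpack_from("<II", data, offset)
    if magic != TIM_MAGIC or flags & ~0xF or (flags & 7) not in BPP_MODES:
        return None
    has_clut = bool(flags & 8)
    pos = offset + 8
    for _ in range(2 if has_clut else 1):  # optional CLUT block, then image block
        if pos + 12 > len(data):
            return None
        block_len, _x, _y, width, height = struct.unpack_from("<I4H", data, pos)
        # Width is in 16-bit units, so the payload is width * height * 2 bytes.
        if width == 0 or height == 0 or block_len != 12 + width * height * 2:
            return None
        if pos + block_len > len(data):
            return None
        pos += block_len
    return {
        "offset": offset,
        "bpp": BPP_MODES[flags & 7],
        "has_clut": has_clut,
        "size": pos - offset,
    }


def scan_tim(data: bytes, step: int = 4):
    """Scan data at step-aligned offsets and collect strict TIM hits."""
    return [
        hit
        for off in range(0, max(len(data) - 7, 0), step)
        if (hit := check_tim_at(data, off))
    ]
```

Requiring the block length to agree with the declared dimensions is what keeps random `0x10` words in level data from registering as hits.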
Executable-guided extraction status:
- `lset_level_bundle_load` in the imported PSX executable now confirms the executable builds `\LSETn\Lx.WDL` paths directly and treats those files as the live level-bundle format.
- The same loader reads a small level header blob first, then a large SPU/audio blob, then dispatches the remaining level resource stream through `level_resource_stream_load`.
- `image_resource_bind_vram_slot` and `image_bundle_load_to_vram` show that resource types `4` and `5` are image/sprite-oriented resources: they resolve VRAM placement and upload image data through `LoadImage`.
- `sprite_rle_decode_rows` is now confirmed as the row-based decompressor used when a type-5 frame record has its compressed bit set.
- Current consequence: sprite extraction now has a real executable-backed path, while map extraction has a reliable raw-carving path even though the full tile/object semantics are not decoded yet.
### 7. `MENUS/*.WDL` and `SPEC_A.WDL`
Representative menu sample: `MENUS/M13.WDL`
- Size: `1,475,928` bytes
- Starts with dense repeating values like:
```text
0x9D499D29 0x9D299D29 0x9D4A9D4A 0xA14A9D4A
0x9D49A14A 0x9D299D29 0x9D4A9D4A 0xA14A9D4A
```
Representative special sample: `SPEC_A.WDL`
- Size: `545,424` bytes
- Starts with the same raw-looking pattern as `M13.WDL`.
Current best read:
- These do not begin with obvious pointer tables.
- They look more like raw or lightly wrapped image/screen asset data than like level containers.
- `M13.WDL` had no strict TIM hits in the quick scan.
- `SPEC_A.WDL` did have a few plausible stricter TIM-style hits at:
  - `0x449A8`
  - `0x80ED8`
So the `.WDL` family probably is not one single uniform format. Current evidence supports at least two subfamilies:
- structured level/resource blobs (`LSET*/*.WDL`)
- raw-looking menu/special screen blobs (`MENUS/*.WDL`, `SPEC_A.WDL`)
Executable-guided extraction status:
- `SPEC_A.WDL` does not behave like the `LSET*.WDL` container family.
- The executable-backed extractor work currently treats it as a raw blob with embedded image candidates rather than as a contiguous section table.
- A validated carve on `SPEC_A.WDL` currently finds strict TIM-style hits at `0x1B5CC` and `0x80ED8`.
## Executable-Backed Extraction Model
These findings are now grounded in both file inspection and the imported `SLUS_002.68` executable.
### Level bundles: `LSET*/*.WDL`
Validated current extraction model for `LSET1/L0.WDL`:
- top-level header size: `0x34`
- immediately following blob: `0x6FDC` bytes
- post-audio resource area starts at: `0x7010`
- high-confidence internal boundaries recovered from the header and validated with the extractor:
  - `0x7448`
  - `0x34B6C`
  - `0x72EC4`
  - `0x7407C`
Current carved regions from `L0.WDL`:
| Region | Offset | Size | Current interpretation |
|---|---:|---:|---|
| `audio_or_spu_blob` | `0x34` | `0x6FDC` | SPU/sequence data loaded by the audio init path |
| `post_audio_region_00` | `0x7010` | `0x438` | small table/directory block |
| `post_audio_region_01` | `0x7448` | `0x2D724` | strong map/placement candidate |
| `post_audio_region_02` | `0x34B6C` | `0x3E358` | strong map/placement candidate |
| `post_audio_region_03` | `0x72EC4` | `0x11B8` | small control/index block |
| `post_audio_region_04` | `0x7407C` | `0xCC6F4` | strongest current sprite/graphics bank candidate |
Important consequence:
- for map work, the best current extraction targets are `post_audio_region_01` and `post_audio_region_02`
- for sprite/graphics work, the best current extraction target is `post_audio_region_04`
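
The region table follows directly from the boundary offsets and the file size. The sketch below reproduces that arithmetic for `LSET1/L0.WDL`; the helper name is hypothetical, and the boundaries are hardcoded from the values listed above rather than parsed from the header.

```python
def carve_regions(boundaries, file_size):
    """Turn region start offsets into (name, offset, size) rows."""
    starts = sorted(boundaries)
    ends = starts[1:] + [file_size]  # each region runs to the next start or EOF
    return [
        (f"post_audio_region_{index:02d}", start, end - start)
        for index, (start, end) in enumerate(zip(starts, ends))
    ]


HEADER_SIZE = 0x34
AUDIO_SIZE = 0x6FDC
POST_AUDIO_START = HEADER_SIZE + AUDIO_SIZE  # 0x7010

regions = carve_regions(
    [POST_AUDIO_START, 0x7448, 0x34B6C, 0x72EC4, 0x7407C],
    file_size=1_312_624,
)
for name, offset, size in regions:
    print(f"{name}  0x{offset:X}  0x{size:X}")
```

The last region ends exactly at the `1,312,624`-byte file size, which is a useful sanity check when repeating this on other `LSET` files.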
Important correction from the next executable pass:
- a previously suspected text-like block in the broader PSX resource system is now confirmed separately in executable analysis as a menu/prompt text resource, not map data
- a heavily used level-side table is also now confirmed as a per-type flag/behavior table used by collision/order logic, not a raw map grid
- so the late-level extraction focus stays on `LSET*.WDL` post-audio regions, not on every large runtime table seen in the executable
The validated strict TIM carver currently finds one confirmed embedded TIM block in `L0.WDL` at:
- `0xBBA54`
That hit lands inside `post_audio_region_04`, which supports treating the late large region as the current best graphics bank candidate.
The same structure now reproduces on `LSET1/L1.WDL` too:
- header size: `0x34`
- audio blob: `0x3244`
- post-audio start: `0x3278`
- high-confidence boundaries: `0x6F48`, `0x334D8`, `0x602C4`, `0x732D4`
- late graphics candidate region: `0x732D4 .. EOF`
- current strict TIM hit: `0xB4DC8`
So the current working model is no longer based on just one level file: `LSET` bundles appear to share a stable pattern of:
1. fixed `0x34` header
2. SPU/audio blob
3. several map/meta candidate regions
4. one large late graphics-oriented region
What is not stable yet:
- the internal semantics of `post_audio_region_01` and `post_audio_region_02` are still unresolved
- `L0.WDL` starts with rows that look structured when viewed as `u16x6`, but `L1.WDL` does not preserve the same obvious interpretation at the same region boundary
- current safest reading is that these are still raw candidate map/meta payloads, not yet a decoded placement format
### Menu / special blobs: `SPEC_A.WDL`, `MENUS/*.WDL`
These currently behave like raw image-oriented blobs, not like the structured `LSET` family.
Validated current extraction model for `SPEC_A.WDL`:
- whole-file raw blob fallback works cleanly
- strict TIM hits currently validate at:
  - `0x1B5CC`
  - `0x80ED8`
Representative secondary check on `MENUS/M13.WDL`:
- whole-file raw blob fallback also works cleanly there
- current strict TIM hit validates at:
  - `0x493EC`
This keeps menus/special screens as a secondary image-carving problem instead of a map/container problem.
## Working Extractor
Current extractor script:
- `tools/psx_extract_wdl.py`
What it does right now:
- recognizes the validated `LSET*.WDL` top-level layout
- carves the audio blob and header-directed post-audio regions
- scans the whole file for strict TIM blocks and extracts them
- falls back to raw-blob carving for `SPEC_A.WDL` / menu-like files
- emits an exploratory `u16x6` CSV view for the first post-audio LSET candidate region so raw row patterns can be inspected without claiming final semantics
- scans the large late LSET graphics region for type-5 sprite bundle headers
- decodes row-RLE compressed sprite frames and writes raw frame payloads plus grayscale preview images
- writes carved output under `out/psx_wdl/<stem>/`
Current practical usage:
```powershell
c:/Users/Maddo/.PYENV/PYENV-WIN/versions/3.14.3/python.exe tools/psx_extract_wdl.py "E:/emu/psx/Crusader - No Remorse/LSET1/L0.WDL"
c:/Users/Maddo/.PYENV/PYENV-WIN/versions/3.14.3/python.exe tools/psx_extract_wdl.py "E:/emu/psx/Crusader - No Remorse/SPEC_A.WDL"
```
This is enough to start extracting:
- raw map-candidate blocks from level bundles
- strict TIM sprite/image blocks from both level and menu/special blobs
- exploratory raw row exports for the first LSET post-audio candidate region
- actual extracted sprite frames from at least some type-5 bundles inside the late LSET graphics region
Current caution:
- the `u16x6` export is only a raw inspection aid
- `L0.WDL` gives structured-looking rows such as `0041,177B,0F7F,0000,0002,0020`, but `L1.WDL` shows very different values at the same relative region
- so this export should be treated as evidence-gathering for map decoding, not as a solved object-placement parser yet
## Confirmed Sprite Extraction
The extractor now produces actual sprite-frame outputs from at least part of the late LSET graphics bank.
The late-graphics scan is now widened beyond the original first-32-bundle probe. Current `L0.WDL` extraction finds `159` candidate bundles in `post_audio_region_04`. The larger-first overview now clearly includes not just floor/wall tiles but also several object/UI-like assets: framed panels, cabinets, a portrait, a hand-shaped sprite, a bone, and other small pickup-like art.
Confirmed current example from `LSET1/L0.WDL`:
- graphics-region-relative bundle offset: `0xE5B8`
- whole-file bundle offset: `0x82634`
- mode: `2` (current best read: 4bpp indexed)
- frame count: `3`
- first extracted frame dimensions: `40 x 66`
- runtime default bundle palette index: `12`
Confirmed output files:
- raw frame bytes: `out/psx_wdl/L0/sprite_bundles/bundle_0000E5B8/frame_000.bin`
- preview image: `out/psx_wdl/L0/sprite_bundles/bundle_0000E5B8/frame_000.png`
- colored preview image: `out/psx_wdl/L0/sprite_bundles/bundle_0000E5B8/frame_000_color.png`
- colored sprite atlas: `out/psx_wdl/L0/sprite_bundles/bundle_0000E5B8/atlas_color.png`
- bundle metadata: `out/psx_wdl/L0/sprite_bundles/bundle_0000E5B8/bundle.json`
- palette metadata: `out/psx_wdl/L0/sprite_bundles/bundle_0000E5B8/palette.json`
Current result:
- this is no longer just a graphics-bank hypothesis
- the workflow can now extract actual sprite/image frame payloads and render preview images from at least some LSET graphics bundles
- the workflow can now also render colored previews and a proper per-bundle RGBA atlas using the bundle default palette index recovered from the PSX executable
- widened grayscale overviews now confirm that the late graphics bank contains recognizable object and UI art, not just texture/noise candidates
Current palette model:
- the executable-backed palette source for `LSET*.WDL` is the first `0x1000` bytes loaded into `DAT_800676d8` by `lset_level_bundle_load`
- `level_palette_upload_cluts` uploads that `0x1000` block as `8 x 16` raw 16-color CLUTs and then caches the raw CLUT handles in `DAT_800a9f48`
- `level_palette_expand_5bit_to_16color` separately builds a second `0x1000` grayscale-expanded table, but `DAT_800a9f48` is populated only from the first 8 raw rows
- the bundle header field at `+0x14` is the default palette-table index used when no override is supplied at draw time
- the remaining blocker is not locating the raw CLUT block anymore; it is recovering the per-placement palette override metadata that can replace the bundle default during map/tile rendering
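
As a rough illustration of the palette model above, a `0x1000`-byte block holds `2048` 16-bit entries, which is `128` CLUTs of 16 colors (`8 x 16`). The sketch below only splits the block; the function name is hypothetical, and how the bundle field at `+0x14` is encoded is deliberately left out because its width is not established here.

```python
import struct

PALETTE_BLOCK_SIZE = 0x1000
COLORS_PER_CLUT = 16


def split_cluts(palette_block: bytes):
    """Split a 0x1000-byte palette block into 128 tuples of 16 raw 16-bit colors."""
    if len(palette_block) != PALETTE_BLOCK_SIZE:
        raise ValueError("expected a 0x1000-byte palette block")
    words = struct.unpack(f"<{PALETTE_BLOCK_SIZE // 2}H", palette_block)
    return [
        words[i : i + COLORS_PER_CLUT]
        for i in range(0, len(words), COLORS_PER_CLUT)
    ]
```

Under this reading, a default palette index such as `12` would select `split_cluts(block)[12]`, which is only correct when no draw-time override replaces it.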
Current color blocker:
- both main texture draw helpers (`FUN_80044bdc` and `FUN_80044e9c`) fall back to the bundle default palette index only when no override is present
- the important caller path at `FUN_80041458` ORs in a high-byte palette override from object/tile metadata pointed to by object field `+0xa0`
- that means standalone bundle previews can still be wrong even when the bundle parser and raw CLUT table are both correct
- the extractor now emits wider `u16x12` raw CSV views for `post_audio_region_01` and `post_audio_region_02` because the relevant override state appears to live beyond the first 6 words of those candidate placement records
- the current top-ranked portrait bundle (`bundle_00064478`, default palette index `106`) is a useful color-validation anchor because the grayscale frame is obviously correct while all raw-palette candidates remain visibly wrong
- another important unresolved issue is the exact on-disk location of the second-stage runtime header after the initial `0x3520` front image block. The loader assembly proves the runtime sequence is: `0x3520` front block -> `12`-byte header -> palette blob -> audio blob -> `4`-byte stream count, but the raw file bytes at offset `0x3520` do not yet reconcile cleanly with those expected sizes.
- `cd_file_read` itself does not transform or decompress bytes; it performs sector-based buffered CD reads. So the remaining palette-source problem is now narrowed to file-layout interpretation rather than hidden read-time decoding.
### Runtime Dump Grounding: cabinet console bundle
The new RAM/VRAM dump pair was used to ground the known-colored cabinet console bundle against live runtime state instead of continuing static palette guessing.
Verified cabinet anchor:
- bundle: `out/psx_wdl/L0/sprite_bundles/bundle_000A1B04`
- mode: `1`
- frame `0`: `56 x 68`
- default bundle palette index: `0`
Verified live-texture result from `binary/Crusader - No Remorse (USA) GPU RAM.bin`:
- the frame payload from `bundle_000A1B04/frame_000.bin` exists in live VRAM as one exact `8bpp` texture match
- exact match location: texel `x=258`, `y=256`
- texture page: `(1,1)`
- in-page offset: `(2,0)`
- no flipped exact match was found
This is a strong confirmation that the current `mode 1` pixel decode is correct. The remaining problem is CLUT selection, not texture extraction.
Verified CLUT result from the same dump:
- the active CLUT band used by `level_palette_upload_cluts` still sits at rows `0xF0..0xF7`
- the important successful step was simpler than the later screen-match ranking pass: in `live_vram_clut_atlas.png`, the very first candidate, at the top-left corner, is the correct palette for this visible cabinet family
- that top-left candidate is the contiguous `256`-entry palette taken directly from live GPU row `0xF0` at `x=0`
- in other words, current best read for this `mode 1` family is: `byte value -> direct index into the 256-word slice [row 0xF0, x 0..255]`
- equivalently, this behaves like `16` adjacent live `16-color` CLUTs flattened into one `256`-entry lookup table for the sprite byte stream
- the later numeric ranking pass that preferred handle `64` / row `0xF4` was misleading for this case and should not be treated as the correct palette formula
Important consequence:
- the dump-grounded success case is not `bundle default row 0` from `L0.WDL` and not the later `row 0xF4` ranking result
- the working palette source is the live VRAM CLUT row `0xF0`, `x=0`, treated as one contiguous `256`-entry table
- this means the current extractor problem for `mode 1` bundles is better described as `recover the runtime CLUT-row formula` rather than `pick one cached CLUT handle index`
- for the visible wall-console bundle, that runtime formula now has a concrete verified answer even though the higher-level metadata path that selects it is still unresolved
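
The verified `mode 1` rule can be sketched as a direct lookup. This is an illustrative sketch, not the extractor's code: the function names are hypothetical, and it assumes the GPU RAM dump is a raw `1024 x 512` array of little-endian 16-bit words (1 MiB) with PS1 15-bit color (red in bits 0-4, green in bits 5-9, blue in bits 10-14).

```python
import struct

VRAM_WIDTH = 1024   # 16-bit words per VRAM row
VRAM_HEIGHT = 512


def bgr555_to_rgb(word: int):
    """Expand one PS1 15-bit color word to an 8-bit (r, g, b) tuple."""
    channels = (word & 0x1F, (word >> 5) & 0x1F, (word >> 10) & 0x1F)
    return tuple((c << 3) | (c >> 2) for c in channels)


def clut_from_vram_row(vram: bytes, row: int, x: int = 0, count: int = 256):
    """Read count consecutive color words starting at (x, row) of the VRAM dump."""
    if len(vram) != VRAM_WIDTH * VRAM_HEIGHT * 2:
        raise ValueError("expected a raw 1 MiB VRAM dump")
    offset = (row * VRAM_WIDTH + x) * 2
    words = struct.unpack_from(f"<{count}H", vram, offset)
    return [bgr555_to_rgb(w) for w in words]


def colorize_8bpp(frame: bytes, palette):
    """Map each 8bpp pixel byte straight to its palette entry."""
    return [palette[value] for value in frame]
```

For the cabinet case this would be called as `clut_from_vram_row(vram, 0xF0, 0)` and applied to the decoded `frame_000.bin` bytes.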
Wider decode result using the corrected formula:
- a focused batch renderer was run over the detected `mode 1` bundles in `LSET1/L0.WDL` post-audio graphics region `04`
- using the same live palette source `row 0xF0 / x=0`, the pass rendered `92` `mode 1` bundles with plausible colored output instead of only the single cabinet proof case
- the strongest batch proof is the generated overview:
  - `out/psx_wdl/L0/mode1_live_clut_row_f0_x0/overview_live_row_f0_x0.png`
- per-bundle outputs and summary metadata now live under:
  - `out/psx_wdl/L0/mode1_live_clut_row_f0_x0/`
  - `out/psx_wdl/L0/mode1_live_clut_row_f0_x0/summary.json`
- that wider pass now shows many object-like assets decoding plausibly under the same rule: cabinets, panels, tanks, wall fixtures, floor markers, weapons, pickups, and small machinery props
Generated runtime-grounded artifacts:
- `binary/psx_framebuffer_left.png`
- `binary/psx_framebuffer_console_crop.png`
- `out/psx_wdl/L0/sprite_bundles/bundle_000A1B04/live_vram_clut_atlas.png`
- `out/psx_wdl/L0/sprite_bundles/bundle_000A1B04/live_vram_clut_top_matches.png`
- `out/psx_wdl/L0/sprite_bundles/bundle_000A1B04/live_vram_clut_best.png`
- `out/psx_wdl/L0/sprite_bundles/bundle_000A1B04/live_vram_clut_rank.txt`
- `out/psx_wdl/L0/mode1_live_clut_row_f0_x0/overview_live_row_f0_x0.png`
- `out/psx_wdl/L0/mode1_live_clut_row_f0_x0/summary.json`
One important caveat from the dump-grounded pass:
- none of the `8` raw `256-color` palette blocks carved from `LSET1/L0.WDL` matched the live CLUT rows byte-for-byte in this dump
- that means either the dump was captured from a different loaded level/resource set, or the active runtime palette source for this on-screen console is not the raw `L0.WDL` palette blob currently being tested
Palette follow-up note:
- most extracted PSX sprite data is now structurally correct, but palette selection is still only partially solved
- the current exporter should therefore be treated as `good enough to continue map work`, not as a final automatic color pipeline
- we need a later pass to recover the runtime palette-selection rule well enough to assign the correct palette automatically for every bundle family instead of relying on the currently verified `mode 1` rule plus heuristics
## PSX Map Decode Plan
Current objective:
- decode the PSX `LSET*.WDL` map/resource layout well enough to render PSX maps through the existing public map renderer pipeline instead of building a one-off viewer
Current working split:
- `post_audio_region_01` is now the first high-confidence map-placement candidate
- `post_audio_region_02` still looks more like compressed or mixed resource payload than directly renderable placement rows
- `post_audio_region_04` remains the late graphics bank and should be treated as the PSX art source for any eventual renderer integration
Immediate working hypothesis:
- `post_audio_region_01` is a fixed-row authored layout stream
- the raw `u16x12` view is already showing that each `24`-byte row behaves like two adjacent `6`-word records with similar field structure
- the strongest early evidence is that neighboring left/right halves carry similar value ranges and repeat the same small control words in the tail fields, which is what we would expect from paired cell/object placements rather than opaque compressed data
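
The paired-record hypothesis can be checked mechanically by splitting each `24`-byte row into two 6-word halves and tracking per-position value ranges. This is a sketch under that hypothesis only; the helper names are hypothetical and no field meanings are implied.

```python
import struct

ROW_SIZE = 24    # one u16x12 row
HALF_WORDS = 6   # each half is read as six little-endian u16 words


def split_paired_rows(region: bytes):
    """Yield (row_offset, left_half, right_half) for each complete 24-byte row."""
    usable = len(region) - len(region) % ROW_SIZE  # ignore a trailing partial row
    for offset in range(0, usable, ROW_SIZE):
        words = struct.unpack_from("<12H", region, offset)
        yield offset, words[:HALF_WORDS], words[HALF_WORDS:]


def word_ranges(halves):
    """Return (min, max) per word position across a list of 6-word records."""
    return [(min(column), max(column)) for column in zip(*halves)]
```

Comparing `word_ranges` for the left halves against the right halves is a quick way to see whether both sides really share the same field structure.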
Practical renderer goal:
- adapt the PSX decode into the same broad source model the public renderer already uses for PC fixed maps: coordinates, shape/frame identity, and a few raw metadata bytes/words kept for inspection
- do not block on fully naming every field before producing a first renderer-fed PSX map source
Planned work order:
1. Lock down `post_audio_region_01` row structure across more `LSET` files and confirm whether `24` bytes is the true authored row size.
2. Separate the two half-rows into individual candidate placement records and track stable min/max ranges for each word position.
3. Identify which words are likely coordinates by checking for bounded map-like ranges and local spatial continuity between neighboring rows.
4. Identify which words are likely tile/object ids by checking whether the same values recur in ways that match repeated wall/floor/object motifs.
5. Correlate the placement stream against `post_audio_region_04` bundle offsets or bundle-local ids to recover the art linkage.
6. Determine whether `post_audio_region_02` is a secondary map layer, a lookup table for region `01`, or a different compressed resource class entirely.
7. Prototype a PSX map-source exporter that emits JSON in a renderer-friendly form even if some fields are still labeled as raw words.
8. Add a PSX-specific loader path to the existing map renderer instead of creating a separate PSX viewer.
9. Once the first map renders, iterate on field naming, layer semantics, and art binding rather than trying to solve the whole format up front.
Current evidence-backed next step:
- the extractor now needs to keep emitting a paired-record export for `post_audio_region_01` so the candidate row model can be checked quickly across multiple maps without reinterpreting the CSV by hand each time
Current renderer-compatibility result:
- a first PSX-compatible static real-art probe scene is now exported for the public map renderer
- exporter script:
  - `tools/psx_export_map_debug_scene.py`
- current generated public-report outputs:
  - `k:\ghidra\Crusader_Decomp_Public\map_renderer\site\data\maps\psx-remorse\map-0\scene.json`
  - multiple copied frame atlases such as `k:\ghidra\Crusader_Decomp_Public\map_renderer\site\data\maps\psx-remorse\map-0\bundle_0003917C_frame_000.png`
  - `k:\ghidra\Crusader_Decomp_Public\map_renderer\site\data\catalog.json`
  - `k:\ghidra\Crusader_Decomp_Public\map_renderer\site\data\catalogs\psx-remorse.csv`
- current scene characteristics:
  - source: filtered `LSET1/L0.WDL` `post_audio_region_01` paired-record candidates
  - rendered items: `1050`
  - unique bundle-backed shape definitions: `49`
  - copied atlas/frame PNGs: `62`
  - bounds: `3896 x 8431`
  - scene format version: `psx-region01-bundle-probe-v1`
  - current probe stats: `u0` span `62..111`, fallback frame count `187`
Current art-binding hypothesis used by this probe:
- region-01 `u0` is treated as a provisional direct bundle index into the extracted `sprite_bundles/` set
- region-01 `u4` is treated as a provisional frame index within that bundle, clamped to the highest available frame when out of range
- this is evidence-backed enough to render real PSX art in the existing map renderer, but not strong enough yet to call the binding solved
- the strongest negative check so far is that the region-01 `u5` values (`0x20`, `0x22`, `0x30`) do not match the bundle default palette indexes, so the palette-selection/control path is still missing
Current invalidation result:
- this direct `u0 -> bundle index` mapping is now considered invalid for real renderer output
- the produced scene repeats a small set of obviously wrong assets, including portrait/UI-like art, in places that do not make spatial sense for the map
- executable-side tracing shows that art selection is type-driven through `DAT_800758cc/d0/d4/d8` resource tables loaded by `level_resource_stream_load`, not by directly indexing the raw `post_audio_region_04` bundle scan
New loader/data evidence from this pass:
- `post_audio_region_00` now has dedicated extractor diagnostics:
  - `out/psx_wdl/L0/post_audio_region_00_00007010_u16x6.csv`
  - `out/psx_wdl/L0/post_audio_region_00_00007010_u16x12.csv`
  - `out/psx_wdl/L0/post_audio_region_00_00007010_u32x5.csv`
  - `out/psx_wdl/L0/post_audio_region_00_00007010_stream_probe.json`
- the new raw probe confirms that `post_audio_region_00` begins with a little-endian count value `0x20`
- after an initial short header/preamble, the bytes from about `0x3c` onward look like tightly packed `12`-byte records in the same broad shape family as the old candidate placement rows:
  - example bytes at `0x3c`: `4a 00 03 16 e7 0e 00 00 01 00 20 00`
  - little-endian words: `0x004A, 0x1603, 0x0EE7, 0x0000, 0x0001, 0x0020`
- that record family is a better next target than the invalidated direct bundle probe because it already exposes a small type-like word (`0x004A`) plus coordinate-like words without forcing an arbitrary raw-bundle index
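
The example record decodes as six little-endian 16-bit words, which is easy to confirm. The sketch below is illustrative; the function names are hypothetical, and the `0x3c` start offset is the approximate value quoted above, passed as a parameter rather than treated as settled.

```python
import struct

RECORD_SIZE = 12


def decode_record(raw: bytes):
    """Decode one 12-byte record as six little-endian u16 words."""
    if len(raw) != RECORD_SIZE:
        raise ValueError("expected a 12-byte record")
    return struct.unpack("<6H", raw)


def iter_records(region: bytes, start: int = 0x3C):
    """Yield (offset, words) for each complete record from start to the region end."""
    for offset in range(start, len(region) - RECORD_SIZE + 1, RECORD_SIZE):
        yield offset, decode_record(region[offset : offset + RECORD_SIZE])


example = bytes.fromhex("4a 00 03 16 e7 0e 00 00 01 00 20 00")
print([f"0x{word:04X}" for word in decode_record(example)])
```

If `0x3c` is region-relative, the remaining `0x438 - 0x3c = 0x3fc` bytes of `post_audio_region_00` divide evenly into `85` such records. That is arithmetic only, not a confirmed record count.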
What this first public renderer pass means:
- the existing renderer app can now load a PSX scene bundle from the static report without any PC `FIXED.DAT` dependency
- this is currently a real-art probe of filtered placement candidates, not a final decoded PSX map
- the renderer now displays extracted bundle art from `post_audio_region_04` instead of synthetic colored stand-ins
- the current output is still useful because it shows that filtered region-01 records can drive recognizable, repeatedly used PSX art through the existing renderer pipeline
- one bad extracted origin (`1x6` sprite with `xoff=65535`) initially blew out the fit bounds; the exporter now sanitizes implausible origins before writing scene metadata
Current app compatibility notes:
- the public renderer app was updated so non-`FIXED.DAT` map sources do not advertise a bogus binary export path
- for the PSX probe scene, `Download Map Binary` is intentionally disabled while `Download PNG`, `Download Map JSON`, and `Download Atlas PNG` remain available
- the static app successfully loads the `PSX LSET1/L0 Region 01 Art Probe` catalog entry and currently fits it at about `8%` zoom instead of the earlier collapsed `2%` fit
Immediate implications for the next decode pass:
- the public renderer integration path is now proven enough to use as a live debug target for PSX map-format work
- the next priority is to replace the invalidated `u0 -> bundle index` hypothesis with a real type/resource lookup recovered from `level_resource_stream_load`
- `post_audio_region_00` is now a top-tier candidate for that work because its new diagnostics expose a count-prefixed preamble plus compact typed records that look more loader-compatible than the old region-01 art probe
- the palette override path is still the main blocker to correct final color selection even when the bundle/frame choice is plausible
- once the bundle key and palette control path are recovered, the same scene-export path can graduate from `real-art probe` to actual PSX map rendering
## PSX Script / Usecode Equivalent
Current status:
- there is no evidence yet that the PSX build carries the exact same external `USECODE`/`EUSECODE.FLX` style asset pipeline used by the DOS version
- the current PSX executable-backed work has mostly exposed compiled resource loaders, animation/audio handlers, and image upload/decode paths rather than a separate obvious bytecode container
Current working question:
- the likely PSX equivalent, if one exists, may be either:
  - compiled gameplay logic directly inside `SLUS_002.68`, or
  - a separate embedded event/script resource format inside the `LSET`/other disc blobs that is not yet isolated
Immediate plan:
1. scan the PSX executable and current renamed function set for script/event-dispatch terminology or obvious VM-style control loops
2. compare any candidate dispatch path against the DOS usecode model only at the behavioral level, not by assuming the asset format is shared
3. keep this as a secondary track while map decoding takes priority
## Practical Extraction Paths
### Standard media first
The easiest wins are the standard PS1 media formats:
- `MOVIES/*.STR`: treat as PS1 video streams
- `AUDIO/*.XA`: treat as XA audio
- `ZZZ.ZZZ`: try as a movie stream too, especially against `FMV3.STR`
This does not need custom reverse engineering first.
### Custom `.WDL` extraction second
The `.WDL` files are the main custom-content frontier.
Current executable-backed extraction order:
1. run `tools/psx_extract_wdl.py` over representative `LSET*.WDL` files
2. treat `post_audio_region_01` and `post_audio_region_02` as the current best map-data extraction targets
3. treat `post_audio_region_04` as the current best sprite/graphics extraction target
4. carve any strict TIM blocks first, because those now have executable support via the type `4` / type `5` image handlers
5. separately carve `SPEC_A.WDL` / `MENUS/*.WDL` as raw image-oriented blobs
The level files and menu/special files should not be assumed to share one parser until that is proven.
## Recommended Ghidra Import Candidates
### Primary
1. `E:\emu\psx\Crusader - No Remorse\SLUS_002.68`
Reason:
- confirmed by `SYSTEM.CNF`
- valid `PS-X EXE`
- main native code image
### Secondary, only if useful for subsystem RE
2. `E:\emu\psx\Crusader - No Remorse\FMV.BIN`
Reason:
- clearly tied to FMV playback
- contains path and MDEC-related strings
- could be worth importing as a raw binary/data blob if the movie subsystem becomes a target
### Not primary code imports
These currently look like content, not executables:
- `E:\emu\psx\Crusader - No Remorse\ZZZ.ZZZ`
- `E:\emu\psx\Crusader - No Remorse\SPEC_A.WDL`
- `E:\emu\psx\Crusader - No Remorse\LSET1\L0.WDL`
- `E:\emu\psx\Crusader - No Remorse\MENUS\M13.WDL`
They may still be worth loading as raw binaries later for format RE, but they are not first-choice code imports.
## Current Working Model
- `SLUS_002.68` = main PS1 executable
- `FMV.BIN` = FMV helper/support blob
- `MOVIES/*.STR` = standard movie streams
- `AUDIO/*.XA` = standard XA audio
- `ZZZ.ZZZ` = likely renamed or duplicated movie stream data
- `LSET*.WDL` = structured level/resource containers
- `MENUS/*.WDL` and `SPEC_A.WDL` = raw-looking screen/menu resource blobs, possibly with some embedded standard PS1 image content
## Executable Catalog Findings
This batch focused on the imported `SLUS_002.68` executable as a catalog source rather than on the raw `WDL` bundles alone.
### Map inventory and mission-facing structure
Current executable-backed map findings:
- `wdl_resource_bundle_load_by_index` now has a direct string-backed proof for the shipped folder layout. The loader copies one of seven hardcoded path prefixes `\LSET1\L` through `\LSET7\L` based on map-index thresholds `10`, `20`, `30`, `40`, `50`, and `60`, then formats the final `.WDL` path.
- The extracted disc tree currently ships `62` level bundles total:
  - `LSET1`: `L0` through `L9`
  - `LSET2`: `L10` through `L19`
  - `LSET3`: `L20` through `L29`
  - `LSET4`: `L30` through `L39`
  - `LSET5`: `L40` through `L49`
  - `LSET6`: `L50` through `L58`
  - `LSET7`: `L62` through `L64`
- So the shipped PSX map-bundle range is `L0..L64` with a real on-disc gap at `L59..L61`.
- The executable also preserves only `15` plain-text `Mission Briefing ^Mission N` strings, for `Mission 1` through `Mission 15`.
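The threshold-based folder selection and the shipped-bundle arithmetic above can be modelled in a few lines. This is a behavioral sketch only: the comparison direction (`index < threshold`) is inferred from the shipped layout, and the exact formatted path (for example any drive prefix or `;1` suffix) is an assumption rather than something recovered from the loader.

```python
import bisect

# Map-index thresholds reported for wdl_resource_bundle_load_by_index.
LSET_THRESHOLDS = [10, 20, 30, 40, 50, 60]


def wdl_path_for_index(map_index: int) -> str:
    """Model the folder choice: indexes below 10 go to LSET1, 10..19 to LSET2, etc."""
    folder = bisect.bisect_right(LSET_THRESHOLDS, map_index) + 1
    return f"\\LSET{folder}\\L{map_index}.WDL"


# Shipped bundle indexes from the extracted disc tree: L0..L58 plus L62..L64.
SHIPPED_INDEXES = sorted(set(range(0, 59)) | {62, 63, 64})
MISSING_INDEXES = [i for i in range(0, 65) if i not in SHIPPED_INDEXES]
```

Under this model `wdl_path_for_index(58)` lands in `LSET6` and `wdl_path_for_index(62)` in `LSET7`, matching the on-disc folders, and the missing indexes are exactly `59`, `60`, and `61`.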
Current safest read:
- the PSX disc contains `62` shipped map/resource bundles used by the `LSET` loader
- the player-facing campaign/briefing flow exposed by the executable is `15` numbered missions
- any bundles beyond that mission-facing set are currently better treated as lower-level map/resource inventory; the `15` missions should not be assumed to map one-to-one onto the `62` shipped bundles
Per-bundle shipped inventory from the extracted disc tree:
| Bundle range | Folder | Count | Size range (bytes) |
|---|---|---:|---:|
| `L0..L9` | `LSET1` | 10 | `987,932 .. 1,312,624` |
| `L10..L19` | `LSET2` | 10 | `1,107,380 .. 1,314,992` |
| `L20..L29` | `LSET3` | 10 | `904,384 .. 1,221,556` |
| `L30..L39` | `LSET4` | 10 | `1,104,316 .. 1,321,656` |
| `L40..L49` | `LSET5` | 10 | `1,120,084 .. 1,303,732` |
| `L50..L58` | `LSET6` | 9 | `1,012,956 .. 1,341,684` |
| `L62..L64` | `LSET7` | 3 | `965,072 .. 1,150,428` |
### Passcodes and password-screen cheat status
Current executable-backed passcode findings:
- The mission-complete passcode display path at `80022cd4` and `80022f1c` synthesizes a `4`-character code from generated indexes.
- Those indexes are mapped through the hardcoded alphabet at `80063ef0`:
```text
BCDFGHJKLMNPQRSTVWXZ0123456789
```
- The resulting `4` characters are written into the temporary display buffer at `80063f6e..80063f71`, null-terminated at `80063f72`, and shown through the completion message at `80063f10`:
```text
^Congratulations!^ You have completed your mission.^^The passcode for the next mission is:^
```
- So PSX mission passcodes are definitely real executable-generated `4`-character values, not just external manual text.
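The index-to-character step described above amounts to a table lookup. The sketch below models only that display-side mapping; how the four indexes are generated is not covered here, and the function name `render_passcode` is a hypothetical label.

```python
# Hardcoded alphabet recovered at 80063ef0: 20 consonants followed by 10 digits.
PASSCODE_ALPHABET = "BCDFGHJKLMNPQRSTVWXZ0123456789"


def render_passcode(indexes):
    """Map four alphabet indexes to the displayed 4-character passcode."""
    if len(indexes) != 4:
        raise ValueError("a passcode uses exactly four indexes")
    for i in indexes:
        if not 0 <= i < len(PASSCODE_ALPHABET):
            raise ValueError(f"index {i} is outside the alphabet")
    return "".join(PASSCODE_ALPHABET[i] for i in indexes)


# Upper bound on distinct displayable codes: 30 symbols in 4 positions.
PASSCODE_SPACE = len(PASSCODE_ALPHABET) ** 4
```

With this alphabet the indexes `[3, 17, 12, 11]` render as `FWQP`, one of the public mission passwords checked in this pass, and the full display space is `810,000` codes.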
Current best password-screen cheat list from public PSX references:
- `XXXX` = hidden pictures
- `L0SR` or `L0SER` = cheat-mode password reported by public sources; the sources disagree on the transcription, most likely because of a `0` vs `O` mix-up, and the correct form is not yet closed directly from the executable
Important executable-side caveat:
- none of the known public PSX mission passwords checked in this pass (`FWQP`, `HWQP`, `LRTN`) appear as plain ASCII strings inside `SLUS_002.68`
- the same is true for the public cheat-password candidates `XXXX`, `L0SR`, and `L0SER`
- current safest read is therefore `password entry and/or validation is numeric or transformed`, not `a plain embedded string table of passcodes`
- this pass closed the visible generation/display side, but it did **not** yet directly close the hidden cheat-password compare path
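The two checks behind this caveat are easy to reproduce: a plain ASCII search over the executable bytes, and a test of whether each candidate can even be spelled with the display alphabet. This is a sketch under the assumption that the password-entry screen uses the same alphabet as the display path, which is not yet proven; both helper names are hypothetical.

```python
PASSCODE_ALPHABET = "BCDFGHJKLMNPQRSTVWXZ0123456789"


def find_plain_ascii(data: bytes, candidates):
    """Return the first offset of each candidate as raw ASCII, or -1 if absent."""
    return {c: data.find(c.encode("ascii")) for c in candidates}


def fits_display_alphabet(code: str) -> bool:
    """True if the code is 4 characters long and uses only alphabet symbols."""
    return len(code) == 4 and all(ch in PASSCODE_ALPHABET for ch in code)


CANDIDATES = ["FWQP", "HWQP", "LRTN", "XXXX", "L0SR", "L0SER"]
REPRESENTABLE = [c for c in CANDIDATES if fits_display_alphabet(c)]
```

Of the candidates listed, only `L0SER` fails the alphabet test (it has five characters and contains `E`), which is a weak hint in favor of the `L0SR` transcription if the entry alphabet matches the display one. Calling `find_plain_ascii` on the bytes of `SLUS_002.68` is the check that reported every candidate as absent.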
### Weapons and items
The executable does preserve user-facing text tables for equipment.
Recovered ammo names:
- `INVALID AMMO`
- `JL-2 AMMO`
- `AR-7 AMMO`
- `GL-303 AMMO`
- `RP-22 AMMO`
- `SG-A1 AMMO`
Recovered item names:
- `NULL ITEM`
- `INHIBITOR`
- `CREDITS`
- `SCI PLANS`
- `BLAST PAC`
- `DET PAC`
- `DATA LINK`
- `LAND MINE`
- `SPIDER BOMB`
- `MEDICAL KIT`
- `ENERGY CUBE`
- `FUSION PAC`
- `CHEMICAL BATTERY`
- `FISSION BATTERY`
- `FUSION BATTERY`
- `GRAVITON GENERATOR`
- `IONIC GENERATOR`
- `PLASMA GENERATOR`
Recovered weapon names:
- `RP-16`
- `RP-22`
- `RP-32`
- `SG-A1`
- `AC-88`
- `PA-31`
- `EM-4`
- `PL-1`
- `UV-9`
- `GL-303`
- `AR-7`
- `JL-2`
- `JL-9`
Current safest read:
- these are real executable-backed display-name tables, not guessed carryovers from the DOS build
- the PSX build still uses a recognizable Crusader equipment taxonomy even where some item labels differ from the more familiar DOS-side vocabulary
JL-2 / JL-9 follow-up:
- neither `JL-2` nor `JL-9` appears in the known DOS `Weapon_GetNameForShapeNo` tables already extracted in this repo for retail Remorse or Regret; those tables stop at the older DOS weapon families such as `BA-40`, `BA-41`, `PA-21`, `EM-4`, `SG-A1`, `RP-22`, `RP-32`, `AR-7`, `GL-303`, `PA-31`, `PL-1`, `AC-88`, `UV-9`, and the Regret-only additions `BK-16`, `LNR-81`, `XP-5`
- that makes `JL-2` and `JL-9` strong PSX-only naming additions rather than inherited PC names
- `JL-2` is also the only one of the two with an explicit PSX ammo label (`JL-2 AMMO`) in the nearby executable text table, while no matching `JL-9 AMMO` string has been recovered
- the extracted PSX `pickups_and_weapons` sprite category contains repeated weapon-pickup art across a large spread of maps, but this pass still does not have a defensible sprite-to-name mapping for specific `JL-2` or `JL-9` pickup appearances
### Enemies
This pass did **not** recover a comparable plain-text enemy-name table from `SLUS_002.68`.
What is closed:
- the PSX executable has clean user-facing text for mission briefings, passcode UI, ammo, items, and weapons
- the same executable does **not** expose an equally obvious plain-text enemy catalog in its main printable-string regions
Current safest read:
- enemy identities in the PSX build are probably carried primarily as numeric resource/type ids, spawn tables, or script/resource references rather than as a direct display-name list
- the next enemy-focused pass should start from enemy spawn/type dispatch or resource-stream type tables, not from more blind string hunting
## Highest-Value Next Steps
1. Run `tools/psx_extract_wdl.py` over more `LSET*.WDL` samples and compare whether the high-offset region pattern stays stable across level sets.
2. Recover the password-entry validation path directly so the hidden PSX cheat-password compare logic can be proven from code instead of only cross-referenced from public password lists.
3. Focus map decoding on `post_audio_region_01` and `post_audio_region_02`, starting with table structures, coordinate ranges, and repeated record widths.
4. Focus sprite/graphics decoding on `post_audio_region_04`, including more aggressive TIM validation and possible packed-image expansion.
5. Recover the exact type IDs consumed by `level_resource_stream_load` so the sprite/image resource records can be labeled more precisely.
6. Compare carved `post_audio_region_04` image assets against on-screen level graphics to separate sprite sheets from tiles.
7. Run the raw-blob fallback across `MENUS/*.WDL` to identify which menu files contain usable embedded TIM data and which are likely packed 15-bit images.
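For the packed 15-bit image case in step 7, a decoder for the standard PS1 16-bit pixel layout (red in bits 0-4, green in bits 5-9, blue in bits 10-14, bit 15 as the semi-transparency flag) is enough to eyeball candidate blobs. This is a minimal sketch; whether the menu blobs actually use this raw layout, and at what width, is still an open question, and the function name is illustrative.

```python
import struct


def decode_psx_15bit(raw: bytes):
    """Convert little-endian PS1 15-bit pixels to a list of (r, g, b) tuples."""
    if len(raw) % 2:
        raise ValueError("pixel data must be a whole number of 16-bit words")
    pixels = []
    for (word,) in struct.iter_unpack("<H", raw):
        r5 = word & 0x1F
        g5 = (word >> 5) & 0x1F
        b5 = (word >> 10) & 0x1F
        # Expand 5-bit channels to 8 bits by replicating the top bits.
        pixels.append(tuple((c << 3) | (c >> 2) for c in (r5, g5, b5)))
    return pixels
```

Trying a few plausible widths on the decoded pixel list and looking for coherent rows is usually a quick way to tell a raw framebuffer-style image from compressed or structured data.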