Crusader_Decomp/psx-map-exporter/docs/spec.md
2026-04-18 14:38:40 +02:00

PSX Map Exporter Spec

Goal

psx-map-exporter is a standalone Node.js probe for Crusader PSX map extraction.

It exists to prove a fresh end-to-end path from raw LSET*.WDL input to:

  • extracted intermediate sprite assets under .cache
  • a rendered map PNG under .output

This project does not reuse Crusader-Map-Viewer code, scene caches, donor mappings, or sidecar summaries as binding inputs. It only consumes raw PSX assets plus the documented executable-backed findings from docs/psx and the live Ghidra session.

Scope

Version v0 is intentionally narrow.

It will:

  • read one PSX LSET*.WDL file
  • parse the documented 0x38-byte top-level header
  • carve the post-audio map/art regions from header-derived boundaries
  • parse the loader-sized post-audio sections as a second, higher-value view of the file layout
  • extract the dense constructor-placement family from post_audio_section_00
  • keep the smaller root-dispatch family available as a comparison probe
  • render a layered authored probe that can combine constructor placements with the smaller root-dispatch lane
  • scan post_audio_region_04 for type-4/type-5 sprite bundles
  • decode bundle frames directly from the raw WDL
  • write extracted frame PNGs to .cache
  • compose a probe map PNG to .output

It will not claim full runtime parity yet.

Known non-goals for v0:

  • exact CLUT reproduction
  • full stage-1 dependency-graph ordering
  • full post_audio_region_01 / post_audio_region_02 semantic decode

Landed in the current pass (was previously a non-goal):

  • loader-faithful DAT_800758d8 active-header bank binding via explicit parses of the artInstall and override blocks in both SPEC_A.WDL and the map-local LSET*.WDL. See Loader Layout and Art Binding Rule below.

Evidence Constraints

The implementation is grounded in these current facts from the docs and Ghidra:

  • LSET*.WDL begins with 14 little-endian u32 size fields (56 bytes total) describing the sequence of post-header blocks. The loader (wdl_resource_bundle_load_by_index @ 0x80039444) reads each size and carves the blocks in order: packPreamble, dispatchRootsSize, ctorPlacementsSize, packTailRewindSize, ctorPlacementSection, sectionPackBaseSize, policyTableSize, table8006754cSize, opcodeStreamsSize, detachedBlobSize, artInstallSize, stateBankSize, overrideSize, stateBank2Size.
  • SPEC_A.WDL (global bundle A) begins with a fixed 0x3520-byte VRAM preload, followed by the same 14-field size header. Bundle A skips the sectionPack and detachedBlob blocks entirely; the remaining blocks are still present in the same order.
  • The old "audio blob at word[1]" model was incorrect for this stream; the first 14 u32 words are block-size descriptors, not a single audio size.
  • The post-audio four-region carve is kept as a fallback diagnostic view but is no longer the primary input for record or art extraction.
  • The small count-prefixed section-0 root-dispatch rows are real, but they are not the whole map object set.
  • The dense constructor-placement records recovered from loader-sized post_audio_section_00 are the live-object seed source for rendering.
  • post_audio_region_04 is retained only as a fallback bundle source; real art now flows through the artInstall and override blocks parsed out of the 14-u32 layout.
  • Type-4/type-5 drawable bundles expose width, height, palette mode/index, frame count, frame table offset, and data offset in the raw 0x58-byte bundle header.
  • Bundle frame entries use a 20-byte row with size, relative data offset, width, height, origin x/y, and flags.
  • sprite_rle_decode_rows uses row-local control bytes:
    • positive: repeat next byte N times
    • negative: copy next abs(N) literal bytes
    • zero: end row
  • The executable projection basis (per psx_project_object_main_visible @ 0x80040d44) is, in pixel units, with no extra scale factor:

$$
\begin{aligned}
\text{screen\_x} &= y - x \\
\text{screen\_y} &= 2z - \frac{x + y}{2}
\end{aligned}
$$

  • Record X/Y values are already in screen-pixel units. The live view-cull box is camera +/- 0x140 = +/- 320 pixels, matching PSX screen width. The exporter therefore uses PSX_SCREEN_SCALE = 1; earlier builds multiplied by 2, producing over-spaced maps.
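The row-local control-byte scheme listed above for sprite_rle_decode_rows can be sketched as follows. This is a minimal illustration, not the exporter's code: the function name, its arguments, and the choice to emit one output element per decoded byte are assumptions, and the real decoder may unpack 4bpp nibbles afterwards.

```javascript
// Hypothetical helper: decode one RLE row from a Node.js Buffer.
// Control byte semantics follow the spec text:
//   positive N -> repeat the next byte N times
//   negative N -> copy the next abs(N) literal bytes
//   zero       -> end of row
function decodeRleRow(src, offset, rowWidth) {
  const row = new Uint8Array(rowWidth); // unwritten elements stay 0
  let cursor = offset;
  let x = 0;
  while (cursor < src.length) {
    const control = src.readInt8(cursor++); // signed control byte
    if (control === 0) break;
    if (control > 0) {
      const value = src[cursor++];
      for (let i = 0; i < control && x < rowWidth; i++) row[x++] = value;
    } else {
      const count = -control;
      for (let i = 0; i < count; i++) {
        const value = src[cursor++];
        if (x < rowWidth) row[x++] = value; // clamp instead of overrunning the row
      }
    }
  }
  // nextOffset points just past the terminating zero byte.
  return { row, nextOffset: cursor };
}
```

Calling it once per row and feeding nextOffset back in as the next offset walks a frame top to bottom, assuming rows are stored back to back.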

Loader Layout

Both SPEC_A.WDL and LSET*.WDL are fed to the same loader body, once per WDL pass. Each pass runs two art installs, two state-bank installs, and one override install. The loader reads the 14-u32 size header starting at offset 0 (LSET) or 0x3520 (SPEC_A) and lays out blocks sequentially.
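As a minimal sketch of the header read described above, the fourteen size fields can be pulled out of a Buffer like this. The field names are the ones listed under Evidence Constraints; the function name, the options object, and the returned shape are illustrative assumptions, and the sketch deliberately stops before carving any blocks.

```javascript
// Field order as documented for wdl_resource_bundle_load_by_index.
const HEADER_FIELDS = [
  "packPreamble", "dispatchRootsSize", "ctorPlacementsSize", "packTailRewindSize",
  "ctorPlacementSection", "sectionPackBaseSize", "policyTableSize", "table8006754cSize",
  "opcodeStreamsSize", "detachedBlobSize", "artInstallSize", "stateBankSize",
  "overrideSize", "stateBank2Size",
];

// SPEC_A.WDL carries a fixed VRAM preload ahead of the size header.
const SPEC_A_PRELOAD_SIZE = 0x3520;

// Hypothetical helper: read the 14 little-endian u32 sizes.
function readSizeHeader(buf, { isSpecA = false } = {}) {
  const headerOffset = isSpecA ? SPEC_A_PRELOAD_SIZE : 0;
  const sizes = {};
  HEADER_FIELDS.forEach((name, i) => {
    sizes[name] = buf.readUInt32LE(headerOffset + i * 4);
  });
  // blocksStart is the first byte after the 0x38-byte header.
  const blocksStart = headerOffset + HEADER_FIELDS.length * 4;
  return { headerOffset, blocksStart, sizes };
}
```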

  • artInstall block (at 0x800396a0 for bundle A, 0x80039988 for bundle B): directory and payloads live at block + 0x2718. The first 0x2710 bytes of the block are a scratch header cache used while resources are built. The directory format is { u32 count; u32 directoryOffset; } at the start of the block, then count entries of { u32 size; u32 typeId; } at block + 0x2718 + directoryOffset. For each non-zero entry the loader installs a built-resource pair { u16 kind; u16 _; u32 resource_ptr } into DAT_800758d8[typeId] (0x18-byte stride).
  • override block (at 0x80039730 for bundle A, 0x80039a18 for bundle B): same directory format, but the payload cursor starts at block + 8 (directly after the 8-byte prefix). Each non-zero entry payload is a raw 0x58-byte drawable header whose pointer is written straight into DAT_800758d8[typeId] at 0x8003977c / 0x80039a64, overwriting whatever the earlier artInstall pass installed. Zero-size entries clear the bank slot.
  • Apply order per loader call: SPEC_A artInstall → SPEC_A override → LSET artInstall → LSET override. Later writes win, so the final DAT_800758d8 state is a mix of built-resource pointers and raw override headers.
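The directory walk shared by the artInstall and override blocks might look like the sketch below. It is hedged in two places: it assumes the { count, directoryOffset } prefix sits at the very start of the block, and it assumes payloads are packed back to back starting at the payload base (0x2718 for artInstall, 8 for override). Neither packing detail is confirmed by the text above, and the function and constant names are hypothetical.

```javascript
const ART_INSTALL_BASE = 0x2718; // directory and payload base for artInstall
const OVERRIDE_BASE = 8;         // payload cursor start for override

// Hypothetical helper: list directory entries for one block.
function readBankDirectory(buf, blockOffset, base) {
  const count = buf.readUInt32LE(blockOffset);
  const directoryOffset = buf.readUInt32LE(blockOffset + 4);
  const dirStart = blockOffset + base + directoryOffset;
  // ASSUMPTION: payloads are stored contiguously from blockOffset + base.
  let payloadCursor = blockOffset + base;
  const entries = [];
  for (let i = 0; i < count; i++) {
    const size = buf.readUInt32LE(dirStart + i * 8);
    const typeId = buf.readUInt32LE(dirStart + i * 8 + 4);
    // Zero-size entries carry no payload; for override they clear the bank slot.
    entries.push({ typeId, size, payloadOffset: size > 0 ? payloadCursor : null });
    payloadCursor += size;
  }
  return entries;
}
```

For an override block each non-null payloadOffset would then point at a raw 0x58-byte drawable header.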

Evidence retained for reference

  • The direct typeWord -> bundle slot scan-order binding is disproven as a final art rule and is retained only as a diagnostic bundle-family probe.

Input Model

The exporter accepts either:

  • a direct --wdl path
  • or a --source path relative to a PSX disc root

Default disc root for local workspace runs:

  • d:/Ghidra/Crusader-Map-Viewer/map_renderer/STATIC_PSX

Expected source examples:

  • LSET1/L0.WDL
  • LSET4/L37.WDL

Output Layout

.cache

Per-run cache path:

  • .cache/<map-stem>/

Contents:

  • wdl-summary.json
  • records.json
  • bundles.json
  • frame-manifest.json
  • active-header-overrides.json
  • sprites/<bundle-offset>/frame_<n>.png

The cache is disposable. It exists to preserve intermediate evidence and make re-runs inspectable.

records.json now also records constructor-stream detection metadata when available: stream header offset, record start offset, reported count, and the initial structured-prefix run.

The cache also records candidate late DAT_800758d8 header-only override blobs as a standalone diagnostic. Those candidates are not used as final art binding yet.

wdl-summary.json now also emits sceneInterpretation, which is an explicit warning-bearing classification of what the current export most likely represents. For constructor-placement exports this should currently read as a constructor-fed live-object seed lane rather than a final visible-world reconstruction.

.output

Per-run final outputs:

  • .output/<map-stem>.png
  • .output/<map-stem>.json
  • .output/<map-stem>_<layer>.png for each rendered authored layer when layered mode is active

The JSON stores the final probe scene manifest used to draw the PNG.

The .output folder is reset at the start of each export so evaluation only sees artifacts from the current run.

The .output/<map-stem>.json manifest inherits sceneInterpretation from wdl-summary.json so consumers do not need to infer that warning from prose docs alone.

Record Extraction Rules

v0 pulls scene records from two loader-faithful lanes inside the section pack, matching the executable's two dispatch iterators. Both lanes are indexed through packSubranges from the 14-u32 loader layout.

Constructor placements (12-byte stride)

  • Source: ctorPlacements pack subrange (word 2).
  • Dispatcher: psx_dispatch_section0_constructor_placements @ 0x800258cc.
  • Layout: [u32 count][count * { u16 typeWord; u16 X; u16 Y; u16 Z; u16 selector; u16 flags }].
  • The dispatcher passes each record directly to descriptor_table[typeWord].slot0(record, 0) and downstream spawners (e.g. psx_object_create_compound_record) read exactly the six u16 fields.
  • Older heuristic region-01 / section-0 scans are retained as compatibility fallbacks when the loader block is absent or empty.
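A minimal sketch of reading the constructor-placement layout above, assuming the caller already knows the byte offset of the count word inside the ctorPlacements subrange. The function name and record property names are illustrative.

```javascript
const CTOR_STRIDE = 12; // six u16 fields per record

// Hypothetical helper: parse [u32 count][count * 12-byte records].
function readCtorPlacements(buf, offset) {
  const count = buf.readUInt32LE(offset);
  const records = [];
  for (let i = 0; i < count; i++) {
    const p = offset + 4 + i * CTOR_STRIDE;
    if (p + CTOR_STRIDE > buf.length) break; // truncated stream: stop early
    records.push({
      typeWord: buf.readUInt16LE(p),
      x: buf.readUInt16LE(p + 2),
      y: buf.readUInt16LE(p + 4),
      z: buf.readUInt16LE(p + 6),
      selector: buf.readUInt16LE(p + 8),
      flags: buf.readUInt16LE(p + 10),
    });
  }
  return records;
}
```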

Dispatch roots (24-byte stride)

  • Source: dispatchRoots pack subrange (word 1).
  • Dispatcher: psx_dispatch_section0_dispatch_roots @ 0x800256b0.
  • Layout: [u32 count] followed by 24-byte entries whose dispatcher-visible fields are:
    • +0x04 u16 typeId indexes psx_type_descriptor_table
    • +0x08 u16 screenX used directly by the +/- 0x140 view-cull
    • +0x0A u16 screenY same
    • +0x10 u16 flags bit 3 skips the record
  • Remaining fields are forwarded to descriptor slot 0. The exporter empirically projects +0x06 as z, +0x0C as selector, +0x0E as lane, with relaxed plausibility because the live dispatcher only requires the fields above.
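The dispatch-root fields above can be read with a sketch like this one. It assumes the listed offsets are relative to the start of each 24-byte entry and that "bit 3" means mask 0x08 with bit 0 as the least significant bit. The z, selector, and lane offsets are the exporter's empirical reading rather than dispatcher-confirmed fields, and all names are illustrative.

```javascript
const ROOT_STRIDE = 24;
const ROOT_SKIP_FLAG = 0x08; // ASSUMPTION: "bit 3" counted from bit 0

// Hypothetical helper: parse [u32 count][count * 24-byte entries].
function readDispatchRoots(buf, offset) {
  const count = buf.readUInt32LE(offset);
  const records = [];
  for (let i = 0; i < count; i++) {
    const p = offset + 4 + i * ROOT_STRIDE;
    if (p + ROOT_STRIDE > buf.length) break; // truncated stream: stop early
    const flags = buf.readUInt16LE(p + 0x10);
    records.push({
      typeId: buf.readUInt16LE(p + 0x04),
      z: buf.readUInt16LE(p + 0x06),        // empirical
      screenX: buf.readUInt16LE(p + 0x08),
      screenY: buf.readUInt16LE(p + 0x0a),
      selector: buf.readUInt16LE(p + 0x0c), // empirical
      lane: buf.readUInt16LE(p + 0x0e),     // empirical
      flags,
      skipped: (flags & ROOT_SKIP_FLAG) !== 0,
    });
  }
  return records;
}
```

Keeping skipped records in the result, flagged rather than dropped, leaves the decision to the caller and keeps the cache output auditable.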

Selection modes

  • auto / combined / layered merge both lanes into one layered probe.
  • constructors / region01 return only the 12-byte constructor placement records (preferring the loader block; falling back to the region-01 heuristic stream).
  • roots / region00 return only the 24-byte dispatch-root records (preferring the loader block; falling back to the region-00 paired-record scan).
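The alias groups above reduce to a small lookup. The table and function below are a hypothetical sketch of that mapping, not the exporter's actual option handling.

```javascript
// Each --map-source value maps to the lanes it selects.
const MAP_SOURCE_LANES = {
  auto: ["constructors", "roots"],
  combined: ["constructors", "roots"],
  layered: ["constructors", "roots"],
  constructors: ["constructors"],
  region01: ["constructors"],
  roots: ["roots"],
  region00: ["roots"],
};

// Hypothetical helper: resolve a mode name to its lane list.
function lanesForMapSource(mode = "auto") {
  const lanes = MAP_SOURCE_LANES[mode];
  if (!lanes) throw new Error(`unknown map source: ${mode}`);
  return lanes.slice(); // copy so callers cannot mutate the table
}
```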

Renderable-record counts for the current validation set (auto mode):

  • LSET1/L0.WDL: 2334 total (1182 constructor placements + 1152 dispatch roots).
  • LSET4/L37.WDL: 1463 total.

This is now a loader-faithful schema for the two main visible-object lanes. The older count-prefixed region heuristics are kept only as compatibility fallbacks.

Art Binding Rule

v0 now binds art via a loader-faithful DAT_800758d8 parse. For each scene record with typeWord = T:

  1. First preference: the bundle installed at DAT_800758d8[T] by the LSET override pass (bundleSource = override-bank-lset).
  2. Then: SPEC_A override pass (bundleSource = override-bank-spec-a).
  3. Then: LSET artInstall pass (bundleSource = art-install-lset).
  4. Then: SPEC_A artInstall pass (bundleSource = art-install-spec-a).
  5. Fallback only when no loader block covers the type: raw post_audio_region_04 scan slot (bundleSource = raw-scan).

Mapping sources are recorded per item so failures stay auditable. For the current L0 / L9 / L37 validation runs there are no raw-scan fallbacks; every rendered type resolves through artInstall or override.
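The preference list above can be sketched as a first-match lookup. The bundleSource strings are the ones named in the list; the banks object (one Map from typeWord to bundle per source) and the function name are assumptions made for illustration.

```javascript
// Preference order exactly as listed in the binding rule.
const BINDING_PREFERENCE = [
  "override-bank-lset",
  "override-bank-spec-a",
  "art-install-lset",
  "art-install-spec-a",
  "raw-scan",
];

// Hypothetical helper: banks is { [bundleSource]: Map<typeWord, bundle> }.
function resolveBundle(typeWord, banks) {
  for (const bundleSource of BINDING_PREFERENCE) {
    const bundle = banks[bundleSource]?.get(typeWord);
    if (bundle !== undefined) return { bundle, bundleSource };
  }
  return null; // no source covers this type
}
```

Returning bundleSource alongside the bundle is what lets each rendered item record where its art came from.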

The opt-in runtime-map0-masked-proxy mode is retained as a secondary override for research against the runtime map-0 RAM snapshot. It no longer supplies the primary binding.

The older typeWord -> bundle slot scan-order rule is retained only as a named binding mode (raw) for negative-evidence experiments. It is not claimed as executable truth.

When debug labels are enabled for a map render, labels identify unique rendered resources rather than per-instance placements. The stable label key is bundle offset + clamped frame + resolved palette.

Rendering Rule

For each record:

  • compute screenX and screenY from the documented projection basis
  • select frame index from selectorWord, clamped to available frames
  • place sprite top-left at:
    • screenX - originX
    • screenY - originY
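A minimal sketch of the frame clamp and top-left placement above. The frames array shape ({ originX, originY } per frame) and the function name are illustrative assumptions.

```javascript
// Hypothetical helper: pick a frame and compute the blit position.
function placeSprite(screenX, screenY, selectorWord, frames) {
  if (frames.length === 0) return null; // nothing to draw
  // Clamp the selector into the available frame range.
  const frameIndex = Math.min(Math.max(selectorWord, 0), frames.length - 1);
  const frame = frames[frameIndex];
  return {
    frameIndex,
    left: screenX - frame.originX,
    top: screenY - frame.originY,
  };
}
```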

Projection sign convention

psx_project_object_main_visible @ 0x80040d44 writes proj_x = Y - X and proj_y = 2*Z - (X + Y) / 2 to obj+0x78/0x7c. The engine's draw step at 0x80040e3c/0x80040e5c then computes screen = (cam - proj) - origin, which flips the sign of the projection relative to canvas Y-down space. For a camera-less full-map export the exporter bakes that flip into projectCtorPlacement, so higher world-Z and smaller (X+Y) sit visibly higher on the output PNG. The same projection is applied to both constructor placements and dispatch-root records; dispatch-root X/Y fields are in world coordinates, not pre-projected screen coordinates, despite the runtime camera +/- 0x140 cull comparing against them directly.
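The sign convention can be sketched as two small functions. This assumes the camera term is zero for a full-map export and that the flip applies to both axes, as the screen = (cam - proj) - origin form suggests. The function names are hypothetical stand-ins and are not projectCtorPlacement itself; the engine may also use an integer shift where this sketch divides by 2.

```javascript
// Engine-side projection as written to obj+0x78/0x7c.
function projectEngine(x, y, z) {
  return { projX: y - x, projY: 2 * z - (x + y) / 2 };
}

// Camera-less canvas position: screen = (0 - proj), origin applied later.
function projectForCanvas(x, y, z) {
  const { projX, projY } = projectEngine(x, y, z);
  return { canvasX: 0 - projX, canvasY: 0 - projY };
}
```

With this flip, raising z lowers canvasY, so higher objects land nearer the top of the PNG.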

Authored layer semantics

The two authored lanes carry different responsibilities:

  • constructors: static level geometry — walls, floors, architecture placed by psx_dispatch_section0_constructor_placements.
  • roots: interactive / dynamic objects — crates, terminals, doors, pickups placed by psx_dispatch_section0_dispatch_roots.

Despite the dispatcher name, the roots lane is not the map background; it is the live-object seed list. For the exporter, "constructors" is the geometry layer and "roots" is the object layer.

Painter's order

The exporter sorts items before blitting using the following keys, in order:

  1. stage ascending. stage = 1 when typeWord === 4 or when (laneWord & 0x0400) is nonzero; those overlays draw last.
  2. Authored-layer priority: constructors (0) before roots (1). Static geometry draws first so interactive props in the same room do not get hidden behind the floor or wall pieces that occupy the same cell.
  3. Isometric depth ascending: back-to-front by world X + Y (isometric ground-depth axis). Falls back to projected screenY when world coordinates are unavailable.
  4. World Z ascending within the same ground cell so lower elevations draw before taller objects sharing the same footprint.
  5. screenX ascending as a stable tie-breaker.

This is still an approximation of the engine's stage-1 graph order but is closer to what an isometric painter's algorithm would produce than the earlier screenY-only sort.
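The five sort keys can be expressed as one comparator. The item shape (layer, typeWord, laneWord, x, y, z, screenX, screenY) and the function names are assumptions for illustration; only the key order comes from the list above.

```javascript
const LAYER_PRIORITY = { constructors: 0, roots: 1 };

// Key 1: overlays (typeWord 4 or lane bit 0x0400) form stage 1.
function stageOf(item) {
  return item.typeWord === 4 || (item.laneWord & 0x0400) !== 0 ? 1 : 0;
}

// Key 3: world X + Y, falling back to projected screenY.
function depthOf(item) {
  return item.x !== undefined && item.y !== undefined
    ? item.x + item.y
    : item.screenY;
}

// Hypothetical comparator for Array.prototype.sort (back to front).
function comparePaintOrder(a, b) {
  return (
    stageOf(a) - stageOf(b) ||
    LAYER_PRIORITY[a.layer] - LAYER_PRIORITY[b.layer] ||
    depthOf(a) - depthOf(b) ||
    (a.z ?? 0) - (b.z ?? 0) ||
    a.screenX - b.screenX
  );
}
```

Usage would be items.sort(comparePaintOrder) immediately before blitting.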

The rendered PNG uses a neutral opaque background by default so probe silhouettes are legible without relying on transparency.

Color Rule

The exporter resolves palettes entirely from the WDL contents. It does not require any RAM or VRAM dump; those paths are now optional research overrides.

The map-local palette blob lives at headerWords[2] .. headerWords[2] + 0x1000 (4096 bytes = 2048 colors = 128 × 16-entry CLUTs). The blob is what the engine uploads to VRAM rows 0xf0 .. 0xf7 on map load; each VRAM row is 16 CLUTs wide so the 128 CLUTs tile exactly 8 rows of 16 CLUTs.

Resolution rules by bundle mode:

  • Mode 2 (4bpp): the bundle header's paletteIndex at +0x14 is the 16-entry CLUT index into the WDL palettes16 bank. When that index points at a sparse/empty CLUT the exporter falls back to a per-bundle palette sweep that picks a CLUT covering the pixel-index set used by the bundle frames.
  • Mode 1 (8bpp): the 256-color CLUT is the concatenation of 16 consecutive 16-entry CLUTs from the WDL bank. The bundle's paletteIndex is treated as the starting CLUT index. For the current L0 dataset every mode-1 bundle stores paletteIndex = 0, which is the top-left 256-color bank. Mode-1 color fidelity is therefore approximate until the level-specific 256-CLUT source (suspected to live in the stateBank block) is decoded — tracked as a follow-up.

Transparent pixel index 0 stays transparent during blit regardless of the color value stored at CLUT index 0.
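A sketch of the palette path under stated assumptions: colors are taken to be standard PSX 15-bit values (red in the low five bits, then green, then blue), the 5-to-8-bit expansion is a common convention rather than something the WDL dictates, and paletteOffset stands for the blob start described above. Function names are hypothetical.

```javascript
const CLUT_COUNT = 128;
const CLUT_ENTRIES = 16;

// Expand one 15-bit PSX color to [r, g, b, a].
function psxColorToRgba(raw) {
  const expand = (v) => (v << 3) | (v >> 2); // 5-bit -> 8-bit
  return [expand(raw & 31), expand((raw >> 5) & 31), expand((raw >> 10) & 31), 255];
}

// Read the 0x1000-byte blob as 128 CLUTs of 16 colors each.
function readClutBank(buf, paletteOffset) {
  const cluts = [];
  for (let c = 0; c < CLUT_COUNT; c++) {
    const entries = [];
    for (let e = 0; e < CLUT_ENTRIES; e++) {
      entries.push(psxColorToRgba(buf.readUInt16LE(paletteOffset + (c * CLUT_ENTRIES + e) * 2)));
    }
    cluts.push(entries);
  }
  return cluts;
}

// Mode 2 (4bpp): one CLUT. Mode 1 (8bpp): 16 consecutive CLUTs.
function buildPalette(cluts, mode, paletteIndex) {
  const span = mode === 1 ? 16 : 1;
  const palette = cluts
    .slice(paletteIndex, paletteIndex + span)
    .flat()
    .map((color) => color.slice()); // copy so the bank is not mutated
  if (palette.length > 0) palette[0][3] = 0; // pixel index 0 is always transparent
  return palette;
}
```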

CLI

Primary command:

node src/cli.js --source LSET1/L0.WDL

Supported options:

  • --source <relative-path>
  • --wdl <absolute-or-relative-file>
  • --disc-root <path>
  • --binding-mode <raw|runtime-map0-masked-proxy>
  • --map-source <auto|combined|layered|constructors|roots|region01|region00>
  • --out-name <stem>

Success Criteria

v0 is successful if it can:

  • parse a raw LSET*.WDL
  • recover the loader-sized section view alongside the region carve
  • scan bundles directly from post_audio_region_04
  • decode at least one frame from raw data
  • extract a stable constructor-placement record set from post_audio_section_00
  • write extracted sprite PNGs into .cache
  • write a readable diagnostic probe PNG into .output

Planned Follow-Ups

  • decode the stateBank and stateBank2 blocks to recover the level-specific 256-color CLUT used by mode-1 sprites. Current mode-1 palettes default to CLUT-bank start 0, which produces plausible colors for some sprites but renders many indoor floor tiles as solid green plates.
  • extend sceneInterpretation so it reflects the landed loader-faithful binding instead of the older repeated-wrong-art warning.
  • recover the engine's stage-1 graph ordering instead of approximating with isometric (X + Y, Z, screenX) sort keys.
  • compare the probe scene against fixed live samples such as map 104 without reintroducing viewer-side donor assumptions.