Crusader_Decomp/psx-map-exporter/docs/spec.md
2026-04-16 23:52:41 +02:00

PSX Map Exporter Spec

Goal

psx-map-exporter is a standalone Node.js probe for Crusader PSX map extraction.

It exists to prove a fresh end-to-end path from raw LSET*.WDL input to:

  • extracted intermediate sprite assets under .cache
  • a rendered map PNG under .output

This project does not reuse Crusader-Map-Viewer code, scene caches, donor mappings, or sidecar summaries as binding inputs. It only consumes raw PSX assets plus the documented executable-backed findings from docs/psx and the live Ghidra session.

Scope

Version v0 is intentionally narrow.

It will:

  • read one PSX LSET*.WDL file
  • parse the documented 0x38-byte top-level header
  • carve the post-audio map/art regions from header-derived boundaries
  • parse the loader-sized post-audio sections as a second, higher-value view of the file layout
  • extract the dense constructor-placement family from post_audio_section_00
  • keep the smaller root-dispatch family available as a comparison probe
  • render a layered authored probe that can combine constructor placements with the smaller root-dispatch lane
  • scan post_audio_region_04 for type-4/type-5 sprite bundles
  • decode bundle frames directly from the raw WDL
  • write extracted frame PNGs to .cache
  • compose a probe map PNG to .output

It will not claim full runtime parity yet.

Known non-goals for v0:

  • exact CLUT reproduction
  • full stage-1 dependency-graph ordering
  • full post_audio_region_01 / post_audio_region_02 semantic decode

Landed in the current pass (was previously a non-goal):

  • loader-faithful DAT_800758d8 active-header bank binding via explicit parses of the artInstall and override blocks in both SPEC_A.WDL and the map-local LSET*.WDL. See Loader Layout and Art Binding Rule below.

Evidence Constraints

The implementation is grounded in these current facts from the docs and Ghidra:

  • LSET*.WDL begins with 14 little-endian u32 size fields (56 bytes total) describing the sequence of post-header blocks. The loader (wdl_resource_bundle_load_by_index @ 0x80039444) reads each size and carves the blocks in order: packPreamble, dispatchRootsSize, ctorPlacementsSize, packTailRewindSize, ctorPlacementSection, sectionPackBaseSize, policyTableSize, table8006754cSize, opcodeStreamsSize, detachedBlobSize, artInstallSize, stateBankSize, overrideSize, stateBank2Size.
  • SPEC_A.WDL (global bundle A) begins with a fixed 0x3520-byte VRAM preload, followed by the same 14-field size header. Bundle A skips the sectionPack and detachedBlob blocks entirely; the remaining blocks are still present in the same order.
  • The old "audio blob at word[1]" model was incorrect for this stream; the first 14 u32 words are block-size descriptors, not a single audio size.
  • The post-audio four-region carve is kept as a fallback diagnostic view but is no longer the primary input for record or art extraction.
  • The small count-prefixed section-0 root-dispatch rows are real, but they are not the whole map object set.
  • The dense constructor-placement records recovered from loader-sized post_audio_section_00 are the live-object seed source for rendering.
  • post_audio_region_04 is retained only as a fallback bundle source; real art now flows through the artInstall and override blocks parsed out of the 14-u32 layout.
  • Type-4/type-5 drawable bundles expose width, height, palette mode/index, frame count, frame table offset, and data offset in the raw 0x58-byte bundle header.
  • Bundle frame entries use a 20-byte row with size, relative data offset, width, height, origin x/y, and flags.
  • sprite_rle_decode_rows uses row-local control bytes:
    • positive: repeat next byte N times
    • negative: copy next abs(N) literal bytes
    • zero: end row
  • The executable projection basis (per psx_project_object_main_visible @ 0x80040d44), expressed in pixel units with no extra scale factor, is:

$$\mathrm{screen}_x = y - x$$

$$\mathrm{screen}_y = 2z - \frac{x + y}{2}$$

  • Record X/Y values are already in screen-pixel units. The live view-cull box is camera +/- 0x140 = +/- 320 pixels, matching PSX screen width. The exporter therefore uses PSX_SCREEN_SCALE = 1; earlier builds multiplied by 2, producing over-spaced maps.
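
The row-local control bytes listed above for sprite_rle_decode_rows can be sketched as a small decoder. This is a minimal illustration, not the exporter's actual code: the function name `decodeRleRow`, its return shape, and the choice to leave undecoded pixels at index 0 and to clip runs at the frame width are all assumptions made for the example.

```javascript
// Hypothetical helper: decodes one RLE row starting at `offset`.
// Control byte semantics follow the spec list: N > 0 repeats the next byte
// N times, N < 0 copies abs(N) literal bytes, N == 0 ends the row.
function decodeRleRow(src, offset, width) {
  const row = new Uint8Array(width); // assumption: untouched pixels stay index 0
  let out = 0;
  let pos = offset;
  for (;;) {
    if (pos >= src.length) throw new RangeError('row ran past end of input');
    const control = (src[pos++] << 24) >> 24; // sign-extend the u8 to int8
    if (control === 0) break; // end of row
    if (control > 0) {
      if (pos >= src.length) throw new RangeError('missing repeat value');
      const value = src[pos++];
      // assumption: runs longer than the row are clipped at `width`
      for (let i = 0; i < control && out < width; i++) row[out++] = value;
    } else {
      const count = -control;
      if (pos + count > src.length) throw new RangeError('truncated literal run');
      for (let i = 0; i < count; i++, pos++) {
        if (out < width) row[out++] = src[pos];
      }
    }
  }
  return { row, nextOffset: pos };
}
```

Calling it once per row, feeding each `nextOffset` into the next call, would walk a whole frame under these assumptions.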

Loader Layout

Both SPEC_A.WDL and LSET*.WDL are fed to the same loader body, once per WDL pass. Each pass runs two art installs, two state-bank installs, and one override install. The loader reads the 14-u32 size header starting at offset 0 (LSET) or 0x3520 (SPEC_A) and lays out blocks sequentially.
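
As a rough sketch of the header read described above, the 14 size fields can be pulled out with Node's `Buffer` API. The field names are the ones this spec lists; the function name `readSizeHeader` and the returned object shape are hypothetical and chosen only for illustration.

```javascript
// Field names in the block order given by this spec.
const HEADER_FIELDS = [
  'packPreamble', 'dispatchRootsSize', 'ctorPlacementsSize', 'packTailRewindSize',
  'ctorPlacementSection', 'sectionPackBaseSize', 'policyTableSize', 'table8006754cSize',
  'opcodeStreamsSize', 'detachedBlobSize', 'artInstallSize', 'stateBankSize',
  'overrideSize', 'stateBank2Size',
];

// SPEC_A.WDL carries a fixed VRAM preload before its size header.
const SPEC_A_PRELOAD_SIZE = 0x3520;

// Hypothetical helper: reads the 14 little-endian u32 size fields.
// headerOffset is 0 for LSET*.WDL and SPEC_A_PRELOAD_SIZE for SPEC_A.WDL.
function readSizeHeader(buffer, headerOffset = 0) {
  const byteLength = HEADER_FIELDS.length * 4; // 14 * 4 = 56 = 0x38
  if (buffer.length < headerOffset + byteLength) {
    throw new RangeError('buffer too small for the 14-u32 size header');
  }
  const header = {};
  HEADER_FIELDS.forEach((name, i) => {
    header[name] = buffer.readUInt32LE(headerOffset + i * 4);
  });
  return header;
}
```

For an LSET file this would be called as `readSizeHeader(fileBuffer)`, and for the global bundle as `readSizeHeader(fileBuffer, SPEC_A_PRELOAD_SIZE)`.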

  • artInstall block (at 0x800396a0 for bundle A, 0x80039988 for bundle B): directory and payloads live at block + 0x2718. The first 0x2710 bytes of the block are a scratch header cache used while resources are built. The directory format is { u32 count; u32 directoryOffset; } at the start of the block, then count entries of { u32 size; u32 typeId; } at block + 0x2718 + directoryOffset. For each non-zero entry the loader installs a built-resource pair { u16 kind; u16 _; u32 resource_ptr } into DAT_800758d8[typeId] (0x18-byte stride).
  • override block (at 0x80039730 for bundle A, 0x80039a18 for bundle B): same directory format, but the payload cursor starts at block + 8 (directly after the 8-byte prefix). Each non-zero entry payload is a raw 0x58-byte drawable header whose pointer is written straight into DAT_800758d8[typeId] at 0x8003977c / 0x80039a64, overwriting whatever the earlier artInstall pass installed. Zero-size entries clear the bank slot.
  • Apply order per loader call: SPEC_A artInstall → SPEC_A override → LSET artInstall → LSET override. Later writes win, so the final DAT_800758d8 state is a mix of built-resource pointers and raw override headers.
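
The "later writes win" behaviour of the apply order can be modelled with a plain `Map` standing in for DAT_800758d8. This is only a sketch: `applyLoaderPasses`, the `passes` input shape, and the `payloadOffset` field are hypothetical, and it assumes that zero-size artInstall entries leave the slot untouched (the spec only says non-zero entries install).

```javascript
// Apply order per loader call, as stated in this spec.
const APPLY_ORDER = [
  { bundle: 'spec-a', kind: 'artInstall' },
  { bundle: 'spec-a', kind: 'override' },
  { bundle: 'lset', kind: 'artInstall' },
  { bundle: 'lset', kind: 'override' },
];

// Hypothetical model of the bank: typeId -> { source, payloadOffset }.
// `passes[bundle][kind]` is an array of { typeId, size, payloadOffset }.
function applyLoaderPasses(passes) {
  const bank = new Map();
  for (const { bundle, kind } of APPLY_ORDER) {
    const entries = (passes[bundle] && passes[bundle][kind]) || [];
    for (const entry of entries) {
      if (entry.size === 0) {
        // Zero-size override entries clear the slot; zero-size artInstall
        // entries are skipped (assumption).
        if (kind === 'override') bank.delete(entry.typeId);
        continue;
      }
      // A later pass simply overwrites whatever an earlier pass installed.
      bank.set(entry.typeId, {
        source: `${kind}:${bundle}`,
        payloadOffset: entry.payloadOffset,
      });
    }
  }
  return bank;
}
```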

Evidence retained for reference

  • The direct typeWord -> bundle slot scan-order binding is disproven as a final art rule and is retained only as a diagnostic bundle-family probe.

Input Model

The exporter accepts either:

  • a direct --wdl path
  • or a --source path relative to a PSX disc root

Default disc root for local workspace runs:

  • d:/Ghidra/Crusader-Map-Viewer/map_renderer/STATIC_PSX

Expected source examples:

  • LSET1/L0.WDL
  • LSET4/L37.WDL
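
One way to picture the two input forms is a small resolver like the one below. It is a sketch only: `resolveWdlPath` and its option names are hypothetical, and letting a direct path take precedence when both are supplied is an assumption the spec does not state.

```javascript
const path = require('node:path');

// Default disc root for local workspace runs, as listed in this spec.
const DEFAULT_DISC_ROOT = 'd:/Ghidra/Crusader-Map-Viewer/map_renderer/STATIC_PSX';

// Hypothetical helper: turns the --wdl / --source / --disc-root inputs
// into one file path.
function resolveWdlPath({ wdl, source, discRoot = DEFAULT_DISC_ROOT } = {}) {
  if (wdl) return path.resolve(wdl); // assumption: a direct path wins
  if (source) return path.join(discRoot, source); // e.g. LSET1/L0.WDL
  throw new Error('provide either a direct WDL path or a disc-relative source');
}
```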

Output Layout

.cache

Per-run cache path:

  • .cache/<map-stem>/

Contents:

  • wdl-summary.json
  • records.json
  • bundles.json
  • frame-manifest.json
  • active-header-overrides.json
  • sprites/<bundle-offset>/frame_<n>.png

The cache is disposable. It exists to preserve intermediate evidence and make re-runs inspectable.

records.json now also records constructor-stream detection metadata when available: stream header offset, record start offset, reported count, and the initial structured-prefix run.

The cache also records candidate late DAT_800758d8 header-only override blobs as a standalone diagnostic. Those candidates are not used as final art binding yet.

wdl-summary.json now also emits sceneInterpretation, which is an explicit warning-bearing classification of what the current export most likely represents. For constructor-placement exports this should currently read as a constructor-fed live-object seed lane rather than a final visible-world reconstruction.

.output

Per-run final outputs:

  • .output/<map-stem>.png
  • .output/<map-stem>.json
  • .output/<map-stem>_<layer>.png for each rendered authored layer when layered mode is active

The JSON stores the final probe scene manifest used to draw the PNG.

The .output folder is reset at the start of each export so evaluation only sees artifacts from the current run.

The .output/<map-stem>.json manifest inherits sceneInterpretation from wdl-summary.json so consumers do not need to infer that warning from prose docs alone.

Record Extraction Rules

v0 pulls scene records from two loader-faithful lanes inside the section pack, matching the executable's two dispatch iterators. Both lanes are indexed through packSubranges from the 14-u32 loader layout.

Constructor placements (12-byte stride)

  • Source: ctorPlacements pack subrange (word 2).
  • Dispatcher: psx_dispatch_section0_constructor_placements @ 0x800258cc.
  • Layout: [u32 count][count * { u16 typeWord; u16 X; u16 Y; u16 Z; u16 selector; u16 flags }].
  • The dispatcher passes each record directly to descriptor_table[typeWord].slot0(record, 0), and downstream spawners (e.g. psx_object_create_compound_record) read exactly the six u16 fields.
  • Older heuristic region-01 / section-0 scans are retained as compatibility fallbacks when the loader block is absent or empty.
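
The 12-byte layout above maps onto a straightforward reader. The sketch below assumes little-endian fields (consistent with the header) and that `buffer` and `offset` already point at the ctorPlacements subrange; the function name and the lower-case `x`/`y`/`z` property names are illustrative.

```javascript
const CTOR_STRIDE = 12; // six u16 fields per record

// Hypothetical helper: parses [u32 count][count * 12-byte record].
function parseConstructorPlacements(buffer, offset = 0) {
  const count = buffer.readUInt32LE(offset);
  const start = offset + 4;
  if (start + count * CTOR_STRIDE > buffer.length) {
    throw new RangeError('reported count exceeds the available bytes');
  }
  const records = [];
  for (let i = 0; i < count; i++) {
    const base = start + i * CTOR_STRIDE;
    records.push({
      typeWord: buffer.readUInt16LE(base),
      x: buffer.readUInt16LE(base + 2),
      y: buffer.readUInt16LE(base + 4),
      z: buffer.readUInt16LE(base + 6),
      selector: buffer.readUInt16LE(base + 8),
      flags: buffer.readUInt16LE(base + 10),
    });
  }
  return records;
}
```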

Dispatch roots (24-byte stride)

  • Source: dispatchRoots pack subrange (word 1).
  • Dispatcher: psx_dispatch_section0_dispatch_roots @ 0x800256b0.
  • Layout: [u32 count] followed by count 24-byte entries, whose dispatcher-visible fields are:
    • +0x04 u16 typeId indexes psx_type_descriptor_table
    • +0x08 u16 screenX used directly by the +/- 0x140 view-cull
    • +0x0A u16 screenY same
    • +0x10 u16 flags bit 3 skips the record
  • Remaining fields are forwarded to descriptor slot 0. The exporter empirically projects +0x06 as z, +0x0C as selector, +0x0E as lane, with relaxed plausibility because the live dispatcher only requires the fields above.
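
A matching sketch for the 24-byte lane is shown below. The offsets for typeId, screenX, screenY, and flags come from the list above; the `z`, `selector`, and `lane` reads mirror the exporter's empirical projection and should be treated as tentative. Little-endian fields and the function name are assumptions.

```javascript
const ROOT_STRIDE = 24;
const ROOT_SKIP_FLAG = 1 << 3; // flags bit 3: dispatcher skips the record

// Hypothetical helper: parses [u32 count][count * 24-byte entry].
function parseDispatchRoots(buffer, offset = 0) {
  const count = buffer.readUInt32LE(offset);
  const start = offset + 4;
  if (start + count * ROOT_STRIDE > buffer.length) {
    throw new RangeError('reported count exceeds the available bytes');
  }
  const records = [];
  for (let i = 0; i < count; i++) {
    const base = start + i * ROOT_STRIDE;
    const flags = buffer.readUInt16LE(base + 0x10);
    records.push({
      typeId: buffer.readUInt16LE(base + 0x04),
      screenX: buffer.readUInt16LE(base + 0x08),
      screenY: buffer.readUInt16LE(base + 0x0a),
      flags,
      skipped: (flags & ROOT_SKIP_FLAG) !== 0,
      // Empirical fields, not required by the live dispatcher:
      z: buffer.readUInt16LE(base + 0x06),
      selector: buffer.readUInt16LE(base + 0x0c),
      lane: buffer.readUInt16LE(base + 0x0e),
    });
  }
  return records;
}
```

Renderable roots would then be `parseDispatchRoots(buf).filter((r) => !r.skipped)`.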

Selection modes

  • auto / combined / layered merge both lanes into one layered probe.
  • constructors / region01 returns only the 12-byte constructor placement records (preferring the loader block; falling back to the region-01 heuristic stream).
  • roots / region00 returns only the 24-byte dispatch-root records (preferring the loader block; falling back to the region-00 paired-record scan).
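
The mode aliases above amount to a small lookup table. The sketch below is illustrative only: `selectRecords` and the `lanes` input shape are hypothetical, and placing constructor placements before dispatch roots in the merged result is an assumption.

```javascript
// Which lanes each selection mode draws from, per the list above.
const MODE_LANES = {
  auto: ['constructors', 'roots'],
  combined: ['constructors', 'roots'],
  layered: ['constructors', 'roots'],
  constructors: ['constructors'],
  region01: ['constructors'],
  roots: ['roots'],
  region00: ['roots'],
};

// Hypothetical helper: `lanes` is { constructors: [...], roots: [...] }.
function selectRecords(mode, lanes) {
  const wanted = MODE_LANES[mode];
  if (!wanted) throw new Error(`unknown map source mode: ${mode}`);
  // Tag each record with its lane so a layered render can separate them.
  return wanted.flatMap((lane) =>
    (lanes[lane] || []).map((record) => ({ ...record, lane })));
}
```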

Renderable-record counts for the current validation set (auto mode):

  • LSET1/L0.WDL: 2334 total (1182 constructor placements + 1152 dispatch roots).
  • LSET4/L37.WDL: 1463 total.

This is now a loader-faithful schema for the two main visible-object lanes. The older count-prefixed region heuristics are kept only as compatibility fallbacks.

Art Binding Rule

v0 now binds art via a loader-faithful DAT_800758d8 parse. For each scene record with typeWord = T:

  1. First preference: the bundle installed at DAT_800758d8[T] by the LSET override pass (bundleSource = override-bank-lset).
  2. Then: SPEC_A override pass (bundleSource = override-bank-spec-a).
  3. Then: LSET artInstall pass (bundleSource = art-install-lset).
  4. Then: SPEC_A artInstall pass (bundleSource = art-install-spec-a).
  5. Fallback only when no loader block covers the type: raw post_audio_region_04 scan slot (bundleSource = raw-scan).

Mapping sources are recorded per item so failures stay auditable. For the current L0 / L9 / L37 validation runs there are no raw-scan fallbacks; every rendered type resolves through artInstall or override.
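
The preference list can be read as a first-match lookup across per-source banks. The following sketch takes the order exactly as the numbered list gives it; `resolveBundle`, the `banks` shape (one `Map` per bundleSource), and the `rawScan` map are hypothetical names for illustration.

```javascript
// Preference order copied from the numbered list in this section.
const BINDING_PREFERENCE = [
  'override-bank-lset',
  'override-bank-spec-a',
  'art-install-lset',
  'art-install-spec-a',
];

// Hypothetical helper: `banks` maps each bundleSource to a Map keyed by
// typeWord; `rawScan` is the fallback scan-slot Map.
function resolveBundle(typeWord, banks, rawScan = new Map(),
  preference = BINDING_PREFERENCE) {
  for (const bundleSource of preference) {
    const bank = banks[bundleSource];
    if (bank && bank.has(typeWord)) {
      return { bundleSource, bundle: bank.get(typeWord) };
    }
  }
  // Fallback only when no loader block covers the type.
  if (rawScan.has(typeWord)) {
    return { bundleSource: 'raw-scan', bundle: rawScan.get(typeWord) };
  }
  return null; // unresolved: the caller can record this for auditing
}
```

Returning the `bundleSource` alongside the bundle keeps each mapping auditable per item.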

The opt-in runtime-map0-masked-proxy mode is retained as a secondary override for research against the runtime map-0 RAM snapshot. It no longer supplies the primary binding.

The older typeWord -> bundle slot scan-order rule is retained only as a named binding mode (raw) for negative-evidence experiments. It is not claimed as executable truth.

When debug labels are enabled for a map render, labels identify unique rendered resources rather than per-instance placements. The stable label key is bundle offset + clamped frame + resolved palette.

Rendering Rule

For each record:

  • compute screenX and screenY from the documented projection basis
  • select frame index from selectorWord, clamped to available frames
  • place sprite top-left at:
    • screenX - originX
    • screenY - originY

Current draw order is conservative:

  • main-visible before special-visible
  • then ascending screenY
  • then ascending screenX

This is a probe approximation. The later graph-based stage-1 ordering still belongs to a future pass.
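
Taken together, the projection, frame clamp, placement, and conservative draw order might look like the sketch below. The function names and the `visibility` field ('main' or 'special') are hypothetical, and the projection keeps any fractional half-pixel rather than guessing how the executable rounds.

```javascript
const PSX_SCREEN_SCALE = 1; // record units are already screen pixels

// Projection basis from this spec: screen_x = y - x, screen_y = 2z - (x + y) / 2.
function projectRecord({ x, y, z }) {
  return {
    screenX: (y - x) * PSX_SCREEN_SCALE,
    screenY: (2 * z - (x + y) / 2) * PSX_SCREEN_SCALE,
  };
}

// Clamp the selector into the bundle's available frame range.
function clampFrameIndex(selector, frameCount) {
  return Math.min(Math.max(selector, 0), frameCount - 1);
}

// Sprite top-left is the projected point minus the frame origin.
function placeSprite(screen, frame) {
  return { left: screen.screenX - frame.originX, top: screen.screenY - frame.originY };
}

// Conservative order: main-visible first, then ascending screenY, then screenX.
function sortForDraw(items) {
  const rank = (item) => (item.visibility === 'main' ? 0 : 1);
  return [...items].sort((a, b) =>
    rank(a) - rank(b) || a.screenY - b.screenY || a.screenX - b.screenX);
}
```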

The rendered PNG uses a neutral opaque background by default so probe silhouettes are legible without relying on transparency.

Color Rule

v0 emits grayscale art from raw pixel indices.

Reason:

  • bundle frame decode is already well constrained
  • full CLUT parity is not
  • grayscale preserves shape/variant evidence without pretending the palette problem is solved

Transparent index 0 stays transparent.
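
The grayscale rule can be illustrated as a direct index-to-RGBA expansion. This sketch assumes the raw index is used as the gray level unchanged (so low-bit-depth art would appear dark unless rescaled); `indicesToGrayscaleRgba` is a hypothetical name.

```javascript
// Hypothetical helper: expands raw pixel indices into RGBA bytes.
// Index 0 becomes fully transparent; every other index is opaque gray.
function indicesToGrayscaleRgba(indices) {
  const rgba = new Uint8ClampedArray(indices.length * 4);
  for (let i = 0; i < indices.length; i++) {
    const value = indices[i];
    const base = i * 4;
    rgba[base] = value;     // R
    rgba[base + 1] = value; // G
    rgba[base + 2] = value; // B
    rgba[base + 3] = value === 0 ? 0 : 255; // A
  }
  return rgba;
}
```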

CLI

Primary command:

node src/cli.js --source LSET1/L0.WDL

Supported options:

  • --source <relative-path>
  • --wdl <absolute-or-relative-file>
  • --disc-root <path>
  • --binding-mode <raw|runtime-map0-masked-proxy>
  • --map-source <auto|combined|layered|constructors|roots|region01|region00>
  • --out-name <stem>

Success Criteria

v0 is successful if it can:

  • parse a raw LSET*.WDL
  • recover the loader-sized section view alongside the region carve
  • scan bundles directly from post_audio_region_04
  • decode at least one frame from raw data
  • extract a stable constructor-placement record set from post_audio_section_00
  • write extracted sprite PNGs into .cache
  • write a readable diagnostic probe PNG into .output

Planned Follow-Ups

  • extend sceneInterpretation so it reflects the landed loader-faithful binding instead of the older repeated-wrong-art warning
  • identify and parse the separate static-world or subordinate level substrate that complements the constructor-fed live-object lane, instead of treating section-0 constructor placements as the whole map
  • add palette/CLUT reconstruction
  • add stage-1 graph ordering recovery
  • compare the probe scene against fixed live samples such as map 104 without reintroducing viewer-side donor assumptions