Crusader_Decomp/docs/scummvm-crusader-reference.md
MaddoScientisto de42fd1ea1 Add Crusader-specific USECODE data and documentation
- Introduced new file `vm_mask_ladder.tsv` containing detailed mappings for Crusader USECODE VM masks and their associated descriptors.
- Added comprehensive documentation in `scummvm-crusader-reference.md` outlining the structure, findings, and implications for reverse-engineering the Crusader engine within ScummVM.
- Created `usecode-roundtrip-ir.md` to document the plan for converting Crusader USECODE bytes into a human-readable format, detailing the container layout, event names, and intrinsic tables.
- Implemented a PowerShell script `temp_usecode_sample.ps1` for extracting and analyzing USECODE data from the Crusader FLX files, providing insights into class and event structures.
2026-03-22 17:26:39 +01:00

ScummVM Crusader Reference

Purpose

This note catalogs the Crusader-specific code inside ScummVM's Ultima 8 engine so it can be used as a planning aid for Crusader reverse-engineering work.

Primary source tree: K:\misc\scummvm\engines\ultima\ultima8

Important limitation: this is a high-level reimplementation, not a symbol map for the original DOS binaries. It is most useful for:

  • identifying original data files and container formats
  • naming likely subsystem boundaries
  • understanding USECODE VM and event structure
  • spotting Remorse versus Regret divergences
  • finding concrete file-format footholds for parsers and validators

It is not sufficient on its own for direct raw-function renaming.

Highest-Value Findings

  1. ScummVM keeps a Crusader-specific USECODE description layer with named event ids and large intrinsic signature tables. Files: usecode/uc_machine.cpp, usecode/usecode_flex.cpp, convert/crusader/convert_usecode_crusader.h, convert/crusader/convert_usecode_regret.h, usecode/remorse_intrinsics.h, usecode/regret_intrinsics.h.

  2. ScummVM has explicit parsers for the core Crusader container families used by gameplay assets: FLEX archives, raw archives, USECODE containers, shapes, sound archives, speech archives, save files, and movie subtitle files. Files: filesys/flex_file.cpp, filesys/archive.cpp, filesys/raw_archive.cpp, usecode/usecode_flex.cpp, audio/sound_flex.cpp, audio/speech_flex.cpp, filesys/savegame.cpp, gumps/movie_gump.cpp.

  3. Crusader-specific gameplay metadata is loaded centrally from a predictable file set. File: games/game_data.cpp. This is the best ScummVM-side inventory of original asset families to compare against current RE notes.

  4. World and item loading diverge for Crusader in a few concrete ways that likely reflect real original-engine differences. Files: world/map.cpp, world/current_map.cpp, world/item_factory.cpp, gfx/shape_info.cpp, world/weapon_info.h, world/world.cpp, world/egg.cpp.

  5. Crusader UI, media, and player-control code is separated into clear game-specific files. Files: gumps/cru_*.cpp, world/actors/cru_avatar_mover_process.cpp, audio/cru_music_process.cpp, games/start_crusader_process.cpp, games/cru_game.cpp.

Detection, Boot, and Game Split

metaengine.cpp

  • ScummVM treats Ultima 8 and Crusader as one engine family but gives Crusader its own control map.
  • The Crusader keymap is a useful external reference for action vocabulary: weapon cycling, inventory cycling, medikit, energy cube, bomb detonation, search/select item, use selection, grab item, attack, center camera on player, jump/roll/crouch, sidesteps, rolls, and crouch toggle.
  • querySaveMetaInfos() uses SavegameReader, which is the entry point for ScummVM-side Crusader save metadata.

ultima8.cpp

  • Engine startup registers Crusader-specific process loaders such as CruAvatarMoverProcess, CruPathfinderProcess, and CruMusicProcess.
  • initializePath() explicitly adds a data subdirectory for at least one Regret variant.

games/cru_game.cpp

  • loadFiles() loads Crusader palettes from static/gamepal.pal, cred.pal, diff.pal, misc.pal, misc2.pal, and optionally star.pal.
  • loadFiles() then calls GameData::loadRemorseData(), which is the central Crusader asset-loader in ScummVM.
  • startGame() creates the main actor with shape 1, reserves object ids 384..511, initializes HP and energy-like stats from NPCDat, and switches to map 0.
  • playIntroMovie() uses T01 and T02 for Remorse, origin and ANIM01 for Regret, and warns that FLICS and SOUND directories must be copied from the CD.

games/start_crusader_process.cpp

  • Startup sequence is explicit: intro movie 1, intro movie 2, difficulty menu, then live game setup.
  • ScummVM creates the Crusader HUD gumps (CruStatusGump, CruPickupAreaGump) before normal play begins.
  • It seeds inventory with shape 0x4d4 (datalink) and 0x598 (smiley), sets shield type, teleports the actor to egg 0x1e on map 1, and applies a Regret-specific combat-ready start state.
  • This file is a good checklist for early-game object ids, item shapes, and startup-only side effects.

Core Asset Loading

games/game_data.cpp

GameData::loadRemorseData() is the single best source-file summary of original Crusader asset families known to ScummVM.

Loaded files and why they matter:

  • static/fixed.dat: fixed-object archive for world/map loading.
  • usecode/<lang>usecode.flx: main USECODE container.
  • static/shapes.flx: main shape archive, loaded with Crusader-specific shape format.
  • remorseweapons.ini or regretweapons.ini: ScummVM-maintained weapon metadata overlays.
  • remorsegame.ini: ScummVM-maintained game config overlay.
  • static/typeflag.dat: per-shape type flags.
  • static/anim.dat: animation metadata.
  • static/wpnovlay.dat: weapon overlay metadata.
  • static/glob.flx: glob data loaded into MapGlob objects.
  • static/fonts.flx: font archive.
  • static/mouse.shp: cursor shapes.
  • static/gumps.flx: UI art.
  • static/dtable.flx: NPC data table (NPCDat).
  • static/damage.flx: damage data consumed by main shape logic.
  • sound/sound.flx: sound archive.
  • sound/<lang><shape>.flx: speech per shape, loaded lazily by getSpeechFlex().

Implication for RE:

  • This gives a concrete file-driven decomposition of the engine: world placement, usecode, shape/type metadata, overlay metadata, NPC tables, damage rules, UI art, sound, and speech are all separated.
  • dtable.flx, damage.flx, glob.flx, and wpnovlay.dat should be treated as high-value parser targets if they are not already covered in local tooling.

Container and File-Format Evidence

filesys/flex_file.cpp

  • FLEX detection looks for a padded header region filled with 0x1A.
  • Metadata reader uses:
    • table offset 0x80
    • entry count at file offset 0x54
    • 8-byte table entries of <offset, size>
  • ScummVM rejects counts above 4095 and notes that the largest observed Crusader/U8 FLEX has 3074 entries.
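
The header fields listed above are enough for a minimal table reader. The sketch below assumes little-endian 32-bit integers for the count and for both table fields; the function and constant names are illustrative and do not come from ScummVM.

```python
import struct

FLEX_COUNT_OFFSET = 0x54  # uint32 entry count (little-endian assumed)
FLEX_TABLE_OFFSET = 0x80  # start of the <offset, size> table
FLEX_MAX_ENTRIES = 4095   # sanity limit quoted in the notes above


def read_flex_table(data: bytes) -> list[tuple[int, int]]:
    """Return (offset, size) pairs from a FLEX image held in memory."""
    if len(data) < FLEX_TABLE_OFFSET:
        raise ValueError("too short to be a FLEX file")
    (count,) = struct.unpack_from("<I", data, FLEX_COUNT_OFFSET)
    if count > FLEX_MAX_ENTRIES:
        raise ValueError(f"implausible entry count: {count}")
    if len(data) < FLEX_TABLE_OFFSET + 8 * count:
        raise ValueError("entry table is truncated")
    # Each table entry is 8 bytes: uint32 offset followed by uint32 size.
    return [
        struct.unpack_from("<II", data, FLEX_TABLE_OFFSET + 8 * i)
        for i in range(count)
    ]
```

Running this over a local FLEX file and comparing the result with the local extractor's table is a cheap way to surface edge cases such as zero-size entries.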

Implication for RE:

  • This strongly matches the currently validated EUSECODE/FLEX structure already recovered locally.
  • It also gives a second independent implementation to compare against any local extractor edge cases.

filesys/archive.cpp and filesys/raw_archive.cpp

  • Archive layers multiple FlexFile sources and resolves objects from newest source to oldest source.
  • RawArchive caches raw object bytes and exposes them as memory streams.
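
The newest-to-oldest resolution can be modelled in a few lines. This is a hypothetical stand-in, not ScummVM's class: sources are plain dictionaries, and treating a zero-length entry as "absent" is an assumption made for illustration.

```python
from typing import Optional


class LayeredArchive:
    """Illustrative model of layered sources with newest-first lookup."""

    def __init__(self) -> None:
        self._sources: list[dict[int, bytes]] = []  # oldest first

    def add_source(self, objects: dict[int, bytes]) -> None:
        """Register a source; later sources take precedence."""
        self._sources.append(objects)

    def get_object(self, index: int) -> Optional[bytes]:
        # Walk from the newest source back to the oldest; first hit wins.
        for source in reversed(self._sources):
            payload = source.get(index)
            if payload:  # assumption: empty entries fall through
                return payload
        return None
```

A local tool that reads only one FLEX file per resource family would behave like a `LayeredArchive` with a single source, which is the assumption worth checking.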

Implication for RE:

  • If any Crusader resources use overlay-style replacement behavior, ScummVM already models that archive precedence.
  • This is worth checking before assuming a single-file source of truth for a given object id.

usecode/usecode_flex.cpp

  • USECODE classes are addressed as classid + 2 inside the archive.
  • Class names are read from object 1 at name_object + 4 + 13 * classid.
  • For Crusader, class base offset is read from bytes 8..11 of the class object and decremented by 1.
  • Crusader event count is computed as (get_class_base_offset(classid) + 19) / 6.
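
The four rules above translate almost directly into code. The following sketch assumes the 32-bit value at bytes 8..11 is little-endian and that each class name is a NUL-terminated string inside its 13-byte slot; the helper names are illustrative.

```python
import struct


def class_object_index(classid: int) -> int:
    """Archive object index that holds the given USECODE class."""
    return classid + 2


def read_class_name(name_object: bytes, classid: int) -> str:
    """Read a class name from archive object 1 (the name table)."""
    start = 4 + 13 * classid
    field = name_object[start:start + 13]
    return field.split(b"\x00", 1)[0].decode("ascii", errors="replace")


def class_base_offset(class_object: bytes) -> int:
    """Bytes 8..11 of the class object, decremented by 1."""
    (raw,) = struct.unpack_from("<I", class_object, 8)
    # Note: a raw value of 0 yields -1 here rather than wrapping as uint32.
    return raw - 1


def class_event_count(class_object: bytes) -> int:
    """Crusader event count: (base offset + 19) / 6, integer division."""
    return (class_base_offset(class_object) + 19) // 6
```

For example, a class object whose bytes 8..11 encode 42 gives a base offset of 41 and an event count of 10.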

Implication for RE:

  • This is directly relevant to current USECODE work. It provides ScummVM's concrete interpretation of the Crusader class header layout and event-table sizing.
  • If local EUSECODE or USECODE parsing still has uncertainties around header size, entry table layout, or event count, this file is the first external cross-check to apply.

USECODE VM, Events, and Intrinsics

usecode/uc_machine.cpp

  • Crusader uses a ByteSet(0x1000) global-state store, unlike the U8 BitSet path.
  • Remorse initializes global 0x003c to avatar number 1; Regret initializes global 0x001e.
  • The VM selects ConvertUsecodeCrusader for Remorse and ConvertUsecodeRegret for Regret.

Implication for RE:

  • This is concrete evidence that the Crusader VM/global model diverges from U8 enough that it should not be treated as a drop-in match.
  • The initialized global slots are worth comparing against already-known runtime globals in the raw executable.

convert/crusader/convert_usecode_crusader.h

  • ScummVM ships a named Crusader event table for event ids 0x00..0x1f.
  • Named events include look, use, anim, setActivity, cachein, hit, gotHit, hatch, schedule, release, equip, unequip, combine, calledFromAnim, enterFastArea, leaveFastArea, avatarStoleSomething, animGetHit, and unhatch.
  • The same file also includes a large 512-entry intrinsic signature table with many behavior comments extracted from prior Pentagram reverse-engineering.

convert/crusader/convert_usecode_regret.h

  • Regret reuses the Crusader event-name table but has a different intrinsic numbering/signature map.

usecode/remorse_intrinsics.h and usecode/regret_intrinsics.h

  • These provide the live intrinsic dispatch tables used by the engine.
  • High-value entries for current RE include weapon firing, status/quality accessors, object creation/destruction, camera moves, palette fades, movie playback, teleport-to-egg, keycard clearing, damage reception, and Crusader-specific audio calls.

High-value USECODE bridge examples from ScummVM's tables:

  • Item::I_fireWeapon
  • AudioProcess::I_playSFXCru
  • AudioProcess::I_playAmbientSFXCru
  • StatusGump::I_hideStatusGump / I_showStatusGump
  • MovieGump::I_playMovieOverlay
  • World::I_setControlledNPCNum
  • MainActor::I_clrKeycards
  • PaletteFaderProcess fade/jump helpers
  • Egg::I_getEggId, I_getEggXRange, I_setEggXRange

Implication for RE:

  • These files are an immediate planning aid for USECODE annotation. Even where names are approximate, they constrain argument counts, broad behavior, and event purpose.
  • convert_usecode_crusader.h is especially valuable because it records many comments of the form "based on disasm" or "same coff as", which likely came from earlier disassembly-level Crusader RE.

Shapes, Type Flags, Weapons, and Item Families

convert/crusader/convert_shape_crusader.cpp

  • ScummVM declares two Crusader-specific shape layouts: CrusaderShapeFormat and Crusader2DShapeFormat.
  • The main 3D-ish shape format uses:
    • 6-byte header
    • 8-byte frame header
    • 28-byte secondary frame header
    • explicit width/height/xoff/yoff fields
  • The 2D shape format uses a 20-byte secondary frame header.

Implication for RE:

  • This is the quickest external reference for main-world versus UI/mouse/gump shape decoding.

gfx/shape_info.cpp

  • Crusader type flags are decoded with a different bit layout than U8.
  • ScummVM treats Crusader type-flag space as extending to at least bit 71, with several still-marked unknown ranges.
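
A flag space reaching bit 71 implies at least 9 bytes per shape. A minimal bit-access helper for such records might look like the sketch below. The 9-byte record size and the LSB-first bit order within each byte are assumptions to verify against gfx/shape_info.cpp, and no specific flag meanings are implied.

```python
def split_records(data: bytes, record_size: int = 9) -> list[bytes]:
    """Cut typeflag data into fixed-size per-shape records."""
    usable = len(data) - len(data) % record_size
    return [data[i:i + record_size] for i in range(0, usable, record_size)]


def get_bit(record: bytes, bit: int) -> bool:
    """Test a single flag bit (bit 0 = least significant bit of byte 0)."""
    return bool(record[bit >> 3] & (1 << (bit & 7)))


def get_bits(record: bytes, first: int, count: int) -> int:
    """Read a multi-bit field starting at `first`, low bit first."""
    value = 0
    for i in range(count):
        if get_bit(record, first + i):
            value |= 1 << i
    return value
```

Keeping the accessor generic makes it easy to tabulate the still-unknown bit ranges per shape without committing to names for them.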

Implication for RE:

  • Any local typeflag decoder should treat Crusader as its own layout, not as the U8 layout with extra cases.

world/weapon_info.h

  • Crusader-specific weapon fields include _sound, _reloadSound, _ammoType, _ammoShape, _displayGumpShape, _displayGumpFrame, _small, _clipSize, _energyUse, _field8, and _shotDelay.

Implication for RE:

  • This header is a good target schema for interpreting weapon-related tables and shape metadata in the original data.
  • _field8 is still uncertain in ScummVM, which is a useful warning not to over-claim its meaning in the raw game.

world/item_factory.cpp

  • Crusader item families include SF_CRUWEAPON, SF_CRUAMMO, SF_CRUBOMB, and SF_CRUINVITEM.
  • Item construction applies Crusader-only defaults:
    • damage points from shape damage info
    • weapon clip size copied into initial quality
    • ammo and bomb quality initialized to 1

Implication for RE:

  • This ties together shape family, shape damage info, weapon tables, and runtime item state.
  • ScummVM's handling indicates that the quality field is overloaded for ammo/clip counts and inventory stack-like quantities; this still needs confirmation against the original executable.

World, Maps, Eggs, and Cache-In Behavior

world/map.cpp

  • Fixed and nonfixed map objects are read as 16-byte records.
  • ScummVM reads each record as:
    • x = uint16
    • y = uint16
    • z = uint8
    • shape = uint16
    • frame = uint8
    • flags = uint16
    • quality = uint16
    • npcNum = uint8
    • mapNum = uint8
    • next = uint16
  • It then applies World_FromUsecodeXY(x, y) before constructing items.
  • Container nesting is not read from a separate structure: the on-disk x field is temporarily treated as container depth while reading hierarchical contents.
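
The field list above packs into exactly 16 bytes with no padding, so it can be expressed as a single struct format. This sketch assumes little-endian fields and leaves out both the World_FromUsecodeXY conversion and the container-depth handling; the `MapRecord` type and its field names are illustrative.

```python
import struct
from typing import Iterator, NamedTuple

# x, y, z, shape, frame, flags, quality, npcNum, mapNum, next
MAP_RECORD = struct.Struct("<HHBHBHHBBH")


class MapRecord(NamedTuple):
    x: int
    y: int
    z: int
    shape: int
    frame: int
    flags: int
    quality: int
    npc_num: int
    map_num: int
    next_obj: int


def iter_map_records(data: bytes) -> Iterator[MapRecord]:
    """Yield raw records; any trailing partial record is ignored."""
    usable = len(data) - len(data) % MAP_RECORD.size
    for fields in MAP_RECORD.iter_unpack(data[:usable]):
        yield MapRecord(*fields)
```

Because x doubles as container depth while hierarchical contents are being read, the raw `x` value should not be interpreted as a coordinate until that context is known.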

Implication for RE:

  • This is one of the most concrete format descriptions in the ScummVM codebase.
  • It is directly useful for validating fixed/nonfixed parsers and for checking whether any currently unnamed raw loader functions correspond to this record layout.

world/current_map.cpp

  • Crusader uses _mapChunkSize = 1024; U8 uses 512.
  • When loading a map, ScummVM always calls cache-in events in Crusader (callCacheIn = (_currentMap != nullptr || GAME_IS_CRUSADER)).
  • It also explicitly calls actor cache-in events for Crusader after actor scheduling.

Implication for RE:

  • Cache-in behavior appears more aggressive or more semantically important in Crusader than in U8.
  • This may help explain some map-enter or object-activation behavior currently attributed to general dispatch code.

world/egg.cpp

  • Crusader supports unhatch() as a real egg event path; U8 does not.
  • Eggs store a _hatched state and expose get/set egg x/y range plus get/set egg id intrinsics.

Implication for RE:

  • unhatch is a strong clue for interpreting Crusader trigger/reset behavior.

world/world.cpp

  • Crusader save/load stores extra world fields beyond the shared baseline:
    • alert active
    • difficulty
    • controlled NPC number
    • Vargas shield value
  • setAlertActiveRemorse() and setAlertActiveRegret() search for concrete shape ids and mutate frames/shapes to update world-state visuals.
  • setGameDifficulty() contains a Remorse-specific BA-40 ammo patch that modifies weapon metadata at runtime.

Implication for RE:

  • Alert-state and difficulty are not just UI globals; ScummVM models them as world-affecting state with concrete shape mutations.

UI, Interaction, and Player-Control Code

gumps/cru_status_gump.cpp

  • Crusader HUD is composed from five child gumps: weapon, ammo, inventory, health, and energy.

gumps/cru_weapon_gump.cpp, cru_ammo_gump.cpp, cru_inventory_gump.cpp

  • HUD display is driven by weapon metadata fields such as _displayGumpShape, _displayGumpFrame, _ammoShape, and live quality values.
  • CruAmmoGump confirms bullets are current weapon quality and reserve clips are counted from the first inventory item matching ammoShape.
  • CruInventoryGump renders the active inventory item through the weapon-info display fields and shows quantity when quality > 1.

Implication for RE:

  • These files are a good external model for active-weapon, ammo-reserve, and active-inventory state fields.

gumps/game_map_gump.cpp

  • Double-click use range is 512 in Crusader versus 128 in the shared path.

world/actors/cru_avatar_mover_process.cpp

  • Crusader movement logic is explicitly different from U8 and models combat movement, one-shot moves, short jump, crouch, sidesteps, rolls, rebel-base special cases, and combat-angle smoothing.

Implication for RE:

  • This file is a practical behavioral checklist when classifying input/combat locomotion code in the raw executable.

Audio, Speech, and Movies

audio/sound_flex.cpp

  • Crusader sound.flx differs from U8:
    • object 0 contains an index whose entries start with a leading 0x00 or 0xFF, then 3 bytes of extra data, then a null-terminated sound name
    • ASFX entries are interpreted as 32-byte header plus raw 11025 Hz sample data
  • Non-ASFX entries fall back to Sonarc decoding.
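
Both structures can be probed with a small amount of code. The sketch below makes several assumptions that are not stated in the notes: that an ASFX entry starts with the ASCII tag `ASFX`, that the index ends at the first entry with an empty name or an unexpected lead byte, and that names are ASCII. Sonarc decoding is out of scope.

```python
from typing import Optional

ASFX_HEADER_SIZE = 32     # header length given in the notes above
ASFX_SAMPLE_RATE = 11025  # Hz, for the raw sample data after the header


def parse_sound_index(index: bytes) -> list[tuple[int, bytes, str]]:
    """Return (lead_byte, extra, name) triples from sound.flx object 0."""
    entries = []
    pos = 0
    while pos + 4 < len(index) and index[pos] in (0x00, 0xFF):
        lead = index[pos]
        extra = index[pos + 1:pos + 4]
        end = index.find(b"\x00", pos + 4)
        if end == -1:
            break
        name = index[pos + 4:end].decode("ascii", errors="replace")
        if not name:  # assumption: an empty name marks the end
            break
        entries.append((lead, extra, name))
        pos = end + 1
    return entries


def split_asfx(entry: bytes) -> Optional[tuple[bytes, bytes]]:
    """Split an ASFX entry into (header, samples); None if not ASFX."""
    if len(entry) < ASFX_HEADER_SIZE or entry[:4] != b"ASFX":
        return None
    return entry[:ASFX_HEADER_SIZE], entry[ASFX_HEADER_SIZE:]
```

Entries for which `split_asfx` returns `None` are the ones that would need a Sonarc decoder.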

Implication for RE:

  • This is one of the strongest container-format anchors in the ScummVM codebase.
  • If local tooling still treats Crusader audio as opaque FLEX payloads, this file should drive the next parser pass.

audio/speech_flex.cpp

  • Speech FLEX object 0 is parsed as a sequence of null-terminated phrases.
  • Playback lookup is phrase-prefix based: ScummVM normalizes text and searches phrase table entries to map text to sound samples.
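
A simplified model of the phrase table and prefix lookup is sketched below. The normalization used here (collapse whitespace, compare case-insensitively) and the Latin-1 decoding are assumptions for illustration, not a transcription of ScummVM's rules, and mapping a phrase index to a sample object is left to the caller.

```python
from typing import Optional


def parse_phrases(index: bytes) -> list[str]:
    """Split speech FLEX object 0 into its NUL-terminated phrases."""
    parts = index.split(b"\x00")
    if parts and parts[-1] == b"":
        parts.pop()  # drop the empty tail after the final terminator
    return [p.decode("latin-1") for p in parts]


def normalize(text: str) -> str:
    """Collapse runs of whitespace and ignore case (assumed rules)."""
    return " ".join(text.split()).lower()


def find_phrase(phrases: list[str], text: str) -> Optional[int]:
    """Return the index of the first phrase that prefixes `text`."""
    wanted = normalize(text)
    for i, phrase in enumerate(phrases):
        key = normalize(phrase)
        if key and wanted.startswith(key):
            return i
    return None
```

Empty phrases are kept in the list so that indices stay aligned with the on-disk order, and are skipped only during lookup.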

Implication for RE:

  • Speech archives are not just sample banks; they embed text phrase indices.
  • This can help tie dialog strings back to per-shape voice resources.

audio/cru_music_process.cpp

  • Remorse and Regret have separate track name tables.
  • Regret track 0x45 means "use the current map's default track" via a hardcoded map-to-track table.
  • Remorse track 16 cycles through M16A, M16B, and M16C.
  • Music is loaded from sound/<track>.amf.
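
The Remorse track 16 rotation and the path rule can be sketched as follows. The `TrackResolver` class, the caller-supplied `track_names` table, and the choice to start the rotation at M16A are all assumptions; the Regret 0x45 map lookup is not modelled because its table is not reproduced in these notes.

```python
from itertools import cycle


class TrackResolver:
    """Illustrative resolver from a Remorse track number to a file path."""

    def __init__(self, track_names: dict[int, str]) -> None:
        self._track_names = track_names  # hypothetical number-to-name table
        self._track16 = cycle(["M16A", "M16B", "M16C"])

    def resolve(self, track_num: int) -> str:
        if track_num == 16:
            # Each request for track 16 advances to the next variant.
            name = next(self._track16)
        else:
            name = self._track_names[track_num]
        return f"sound/{name}.amf"
```

Repeated calls to `resolve(16)` walk through the three variants and then wrap around, which is the behavior to look for in the original music-selection code.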

Implication for RE:

  • This is useful for identifying music-selection logic and map-to-music linkage in the original executable.

gumps/movie_gump.cpp

  • Crusader movie playback uses AVI files under flics/.
  • Subtitle loading accepts either .txt or .iff sidecar files.
  • ScummVM normalizes certain movie names because USECODE references mva1, mva3a, mva5a, etc., while files on disk may be mva01, mva03a, mva05a.
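
The name mismatch amounts to a missing zero pad on single-digit movie numbers. The sketch below is a generalized regex version of that idea, not ScummVM's actual implementation: it pads any `mva` name with exactly one digit and an optional single-letter suffix, and leaves everything else untouched.

```python
import re

# "mva" + exactly one digit + optional single letter, e.g. mva1 or mva3a.
_SHORT_MVA = re.compile(r"^(mva)(\d)([a-z]?)$", re.IGNORECASE)


def normalize_movie_name(name: str) -> str:
    """Zero-pad single-digit mva names; return other names unchanged."""
    match = _SHORT_MVA.match(name)
    if match is None:
        return name
    prefix, digit, suffix = match.groups()
    return f"{prefix}0{digit}{suffix}"
```

With this rule `mva1` becomes `mva01` and `mva3a` becomes `mva03a`, while already padded names such as `mva01` pass through unchanged.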

Implication for RE:

  • This is a concrete example of ScummVM compensating for original asset-name/usecode mismatches.
  • The subtitle .iff fallback is a useful clue for unexplained IFF-like resources.

Save/Load Format

filesys/savegame.cpp

  • ScummVM supports two save formats:
    • native VMU8 saves with versioned file-entry archive payloads
    • older Pentagram zip-based saves
  • Native saves use a 12-byte file name field and per-entry size/data blocks.

Implication for RE:

  • This is mostly relevant to ScummVM compatibility, not original DOS save format recovery.
  • It still matters because ScummVM serializes engine state explicitly enough to reveal which runtime fields it considers necessary for Crusader continuity.

Best Files For Immediate RE Follow-Up

If time is limited, the most valuable ScummVM files to mine first are:

  1. games/game_data.cpp. Why: best single inventory of Crusader data files and subsystems.

  2. usecode/usecode_flex.cpp. Why: concrete Crusader USECODE class header and event-count interpretation.

  3. convert/crusader/convert_usecode_crusader.h. Why: named event ids plus a large intrinsic-signature table with comments.

  4. audio/sound_flex.cpp. Why: concrete Crusader sound archive interpretation.

  5. world/map.cpp. Why: concrete fixed/nonfixed map record layout and container nesting behavior.

  6. world/weapon_info.h and world/item_factory.cpp. Why: practical schema for weapon/ammo/inventory metadata.

  7. gumps/movie_gump.cpp. Why: movie filename normalization and subtitle sidecar handling.

  8. world/current_map.cpp and world/world.cpp. Why: Crusader-only cache-in, alert-state, difficulty, and map chunk differences.

Suggested RE Uses In This Repo

USECODE parsing

  • Compare local USECODE/EUSECODE container assumptions against usecode/usecode_flex.cpp.
  • Import ScummVM's event-name table as a conservative annotation source for event ids 0x00..0x1f.
  • Use convert_usecode_crusader.h and remorse_intrinsics.h as a cross-check for intrinsic numbering, argument counts, and broad semantics.
  • Compare Remorse versus Regret intrinsic numbering before assuming one numbering scheme is universal.

Data-format work

  • Validate local FLEX readers against filesys/flex_file.cpp.
  • Prioritize parsers for dtable.flx, damage.flx, glob.flx, and wpnovlay.dat because ScummVM treats them as core runtime inputs.
  • Split shape decoding between Crusader main shapes and 2D/gump shapes using convert_shape_crusader.cpp.
  • Treat sound.flx and speech FLEX files as structured formats, not opaque blob stores.

Raw executable classification

  • Use ScummVM's subsystem boundaries to guide search targets for:
    • cache-in and unhatch event paths
    • alert-state world mutations
    • map chunking and area search behavior
    • weapon clip/ammo/energy metadata consumers
    • movie name normalization and subtitle loading
    • Regret map-to-track music selection

Conservative Takeaways

  • ScummVM does not directly solve raw-symbol naming, but it materially sharpens the planning surface for Crusader RE.
  • The most actionable ScummVM contributions are format schemas, event/intrinsic vocabularies, and subsystem boundaries.
  • For current repo priorities, the strongest leverage is on USECODE parsing, data-file parser expansion, and validation of world/object metadata structures.