# FaceAI Processor Technical Design

## Goal

Add an internal processor service that executes `face_matcher` jobs for the public FaceAI site, while preventing duplicate searches per user and keeping all state short-lived and restart-safe.

## Scope Of This Slice

- add Redis-backed queue and job state
- add a dedicated `processor` workspace and container scaffold
- replace in-memory search orchestration in the public backend
- preserve the existing frontend polling and legacy return flow
- support local PKL testing from `test_pkl/`

This slice does not yet implement production NAS mounting, persistent databases, or a final parser tailored to the real matcher CSV format.

## Runtime Architecture

### Public backend

- owns the authenticated API used by the Vue frontend
- stores uploaded selfies in a shared runtime volume
- enqueues jobs into BullMQ
- keeps per-search state, results, rate limits, and active-user locks in Redis
- never executes `face_matcher` directly

### Processor

- consumes queue jobs from Redis using BullMQ worker concurrency
- resolves the race-scoped PKL path for each job
- executes the Linux `face_matcher` binary
- parses the CSV result into legacy-compatible `photoId` matches
- writes final state and result payload back to Redis

### Redis

- queue broker for BullMQ
- source of truth for active-user locks
- source of truth for search status and short-lived results
- source of truth for rate-limit counters

## Queue And Locking Model

- queue name: `faceai-searches`
- active lock key: `faceai:active-search:user:{legacyUserId}`
- search record key: `faceai:search:{searchId}`
- result record key: `faceai:result:{resultId}`
- rate limit key prefix: `faceai:rate-limit:{legacyUserId}`

`POST /api/searches` must acquire the active-user lock before enqueueing. If the lock already exists, the backend returns `409` with error code `ACTIVE_SEARCH_EXISTS`. The lock is released only when the processor marks the search as terminal: `completed`, `failed`, or `timed_out`.

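The acquire-then-enqueue rule above can be sketched roughly as follows. This is a minimal illustration, not the actual backend code: `LockStore`, `InMemoryLockStore`, `tryStartSearch`, `releaseIfTerminal`, and the `enqueue` callback are hypothetical names. Against a real Redis, `setIfAbsent` would correspond to `SET key value NX EX ttl` and `remove` to `DEL key`. The TTL is assumed to mirror the 24h active search retention and to act only as a safety net.

```typescript
// Key builder following the active lock key layout listed above.
const activeLockKey = (legacyUserId: string): string =>
  `faceai:active-search:user:${legacyUserId}`;

// Terminal statuses named in this section.
const TERMINAL_STATUSES = new Set<string>(["completed", "failed", "timed_out"]);
const isTerminalStatus = (status: string): boolean => TERMINAL_STATUSES.has(status);

// Hypothetical minimal store interface (assumption, not an existing API).
// With Redis: setIfAbsent ~ `SET key value NX EX ttlSeconds`, remove ~ `DEL key`.
interface LockStore {
  setIfAbsent(key: string, value: string, ttlSeconds: number): Promise<boolean>;
  remove(key: string): Promise<void>;
}

// In-memory stand-in so the sketch runs without Redis. It ignores the TTL.
class InMemoryLockStore implements LockStore {
  private readonly entries = new Map<string, string>();

  async setIfAbsent(key: string, value: string, _ttlSeconds: number): Promise<boolean> {
    if (this.entries.has(key)) return false;
    this.entries.set(key, value);
    return true;
  }

  async remove(key: string): Promise<void> {
    this.entries.delete(key);
  }
}

type StartOutcome =
  | { ok: true; searchId: string }
  | { ok: false; status: 409; errorCode: "ACTIVE_SEARCH_EXISTS" };

// Assumed TTL: mirrors the 24h active search retention default.
const LOCK_TTL_SECONDS = 24 * 60 * 60;

// Acquire the per-user lock first, then enqueue. If enqueueing throws, the
// lock is released again so the user is not blocked (a design assumption).
async function tryStartSearch(
  store: LockStore,
  legacyUserId: string,
  searchId: string,
  enqueue: (searchId: string) => Promise<void>,
): Promise<StartOutcome> {
  const key = activeLockKey(legacyUserId);
  const acquired = await store.setIfAbsent(key, searchId, LOCK_TTL_SECONDS);
  if (!acquired) {
    return { ok: false, status: 409, errorCode: "ACTIVE_SEARCH_EXISTS" };
  }
  try {
    await enqueue(searchId);
  } catch (err) {
    await store.remove(key);
    throw err;
  }
  return { ok: true, searchId };
}

// Processor side: release the lock only when the new status is terminal.
async function releaseIfTerminal(
  store: LockStore,
  legacyUserId: string,
  status: string,
): Promise<boolean> {
  if (!isTerminalStatus(status)) return false;
  await store.remove(activeLockKey(legacyUserId));
  return true;
}
```

Doing the check and the write in one atomic set-if-absent step is what prevents two concurrent `POST /api/searches` calls from both passing the lock check. A separate "get, then set" sequence would leave a window for duplicates.
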
## Race And PKL Resolution

The canonical race key is the legacy `id_gara`, already exposed as `raceId` in the existing handoff flow.

The processor resolves the PKL path using a race-based directory layout:

```text
/data/pkl/
  101/
    face_encodings.pkl
  202/
    face_encodings.pkl
```

The lookup rule is:

1. try `/data/pkl/{raceId}/face_encodings.pkl`
2. optionally fall back to `/data/pkl/{raceId}.pkl`
3. fail the job if neither exists

For local development, `test_pkl/` is mounted into `/data/pkl/test` and the backend can fall back to the first `.pkl` file in that folder when no race-specific file exists yet.

## Shared Runtime Storage

Both the public backend and the processor mount the same writable runtime directory:

```text
/data/runtime/
  uploads/
  searches/
```

- uploaded selfies are written under `uploads/{searchId}/`
- worker output and logs are written under `searches/{searchId}/`
- cleanup can safely remove old per-search directories after retention expires

## Search Lifecycle

1. frontend uploads a selfie and calls `POST /api/searches`
2. backend validates session, rate limit, and active-user lock
3. backend stores the upload and creates a Redis search record with status `queued`
4. backend enqueues a BullMQ job
5. processor picks up the job and sets status `processing`
6. processor runs `face_matcher`
7. processor parses CSV output into matches
8. processor stores a result record and marks the search `completed`
9. frontend polling reads Redis-backed state through `GET /api/searches/:id`
10. existing redirect flow sends the user back to the legacy filtered page

## Search Record Shape

```json
{
  "id": "search_...",
  "status": "queued",
  "raceId": "101",
  "userId": "legacy-user-1",
  "returnUrl": "https://...",
  "lang": "it",
  "selfieName": "selfie.jpg",
  "selfiePath": "/data/runtime/uploads/search_.../selfie.jpg",
  "resultId": null,
  "matchCount": 0,
  "errorCode": null,
  "errorMessage": null,
  "createdAt": 0,
  "startedAt": null,
  "completedAt": null
}
```

## Result Shape

```json
{
  "id": "result_...",
  "raceId": "101",
  "raceName": "Mezza di Firenze",
  "userId": "legacy-user-1",
  "returnUrl": "https://...",
  "lang": "it",
  "matches": [
    {
      "photoId": "legacy-photo-id",
      "score": 0.98,
      "label": "legacy-photo-id"
    }
  ],
  "createdAt": 0
}
```

## Compose Topology

- `faceai`: public backend plus built frontend
- `processor`: queue consumer and matcher executor
- `redis`: queue and short-lived state
- `legacy-php`: local bridge simulator for end-to-end testing

## Operational Defaults

- worker concurrency: `2`
- active search retention: `24h`
- result retention: `24h`
- rate limit window: `5 requests / 10 minutes / user`
- worker timeout: `5 minutes`

## Known Follow-Up Work

- confirm the real CSV columns emitted by `face_matcher`
- verify the Linux binary shared library requirements inside the processor image
- replace the PKL fallback with a strict NAS-backed race mapping once the final folder layout is agreed
- add cleanup jobs for expired runtime files
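
As a reference point for the PKL fallback follow-up, the current lookup rule from Race And PKL Resolution could be expressed roughly like this. It is a sketch under assumptions: `pklCandidates` and `resolvePklPath` are hypothetical names, the `exists` predicate is injected so the logic stays testable (in the processor it might be `fs.existsSync`), and the `raceId` character check is an added guard that the design does not specify. The local `test_pkl/` fallback is intentionally left out.

```typescript
// Candidate paths in the order given by the lookup rule.
function pklCandidates(raceId: string, root: string = "/data/pkl"): string[] {
  return [
    `${root}/${raceId}/face_encodings.pkl`, // step 1: race directory
    `${root}/${raceId}.pkl`,                // step 2: optional flat-file fallback
  ];
}

// Returns the first existing candidate, or null when neither exists.
// A null result corresponds to step 3: the caller should fail the job.
function resolvePklPath(
  raceId: string,
  exists: (path: string) => boolean,
  root: string = "/data/pkl",
): string | null {
  // Assumption: raceId is a simple token. Rejecting anything else keeps a
  // value such as "../x" from escaping the PKL root.
  if (!/^[A-Za-z0-9_-]+$/.test(raceId)) return null;

  for (const candidate of pklCandidates(raceId, root)) {
    if (exists(candidate)) return candidate;
  }
  return null;
}
```

Keeping the candidate list in one function means a later switch to a strict NAS-backed mapping would only need to change `pklCandidates`, not the callers.
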