MaddoScientisto bbb9c193ce feat: add processor service with Redis-backed job queue

- Introduced a new `processor` service in the Docker Compose setup to handle face matching jobs.
- Configured Redis as a job queue and state management system for processing searches.
- Updated the backend to enqueue jobs and manage user locks using Redis.
- Added environment variables for Redis configuration and runtime paths.
- Created technical design documentation for the processor service outlining architecture, queue model, and search lifecycle.
- Updated package.json and package-lock.json to include dependencies for BullMQ and ioredis in the processor workspace.
- Added sample PKL files for local testing in the `test_pkl` directory.

2026-04-11 17:53:22 +02:00

FaceAI Processor Technical Design

Goal

Add an internal processor service that executes face_matcher jobs for the public FaceAI site, while preventing duplicate searches per user and keeping all state short-lived and restart-safe.

Scope Of This Slice

add Redis-backed queue and job state
add a dedicated processor workspace and container scaffold
replace in-memory search orchestration in the public backend
preserve the existing frontend polling and legacy return flow
support local PKL testing from test_pkl/

This slice does not yet implement production NAS mounting, persistent databases, or a final parser tailored to the real matcher CSV format.

Runtime Architecture

Public backend

owns the authenticated API used by the Vue frontend
stores uploaded selfies in a shared runtime volume
enqueues jobs into BullMQ
keeps per-search state, results, rate limits, and active-user locks in Redis
never executes face_matcher directly

Processor

consumes queue jobs from Redis through a BullMQ worker with bounded concurrency
resolves the race-scoped PKL path for each job
executes the Linux face_matcher binary
parses the CSV result into legacy-compatible photoId matches
writes final state and result payload back to Redis

Redis

queue broker for BullMQ
source of truth for active-user locks
source of truth for search status and short-lived results
source of truth for rate-limit counters

Queue And Locking Model

queue name: faceai-searches
active lock key: faceai:active-search:user:{legacyUserId}
search record key: faceai:search:{searchId}
result record key: faceai:result:{resultId}
rate limit key prefix: faceai:rate-limit:{legacyUserId}

POST /api/searches must acquire the active-user lock before enqueueing. If the lock already exists, the backend responds with HTTP 409 and the error code ACTIVE_SEARCH_EXISTS.

The lock is released only when the processor marks the search as terminal: completed, failed, or timed_out.
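
The acquire-then-release rule can be sketched as follows. This is a minimal illustration, not the backend's actual code: LockStore, InMemoryLockStore, acquireActiveSearchLock, and releaseIfTerminal are hypothetical names, and the TTL argument is an assumption, since the document does not say whether the lock expires on its own. With a real Redis client, setIfAbsent would correspond to the SET command with the NX and EX options.

```typescript
// Stand-in for the two Redis operations the lock needs. With a real client
// these would map to `SET key value NX EX ttl` and `DEL key`.
interface LockStore {
  setIfAbsent(key: string, value: string, ttlSeconds: number): boolean;
  remove(key: string): void;
}

// In-memory store, present only so the sketch runs without Redis.
class InMemoryLockStore implements LockStore {
  private entries = new Map<string, { value: string; expiresAtMs: number }>();
  private now: () => number;

  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  setIfAbsent(key: string, value: string, ttlSeconds: number): boolean {
    const existing = this.entries.get(key);
    if (existing !== undefined && existing.expiresAtMs > this.now()) {
      return false; // lock is held and has not expired
    }
    this.entries.set(key, { value, expiresAtMs: this.now() + ttlSeconds * 1000 });
    return true;
  }

  remove(key: string): void {
    this.entries.delete(key);
  }
}

// Terminal statuses as listed in this document.
const TERMINAL_STATUSES: ReadonlySet<string> = new Set(["completed", "failed", "timed_out"]);

function activeLockKey(legacyUserId: string): string {
  return `faceai:active-search:user:${legacyUserId}`;
}

type AcquireOutcome =
  | { ok: true }
  | { ok: false; httpStatus: 409; errorCode: "ACTIVE_SEARCH_EXISTS" };

// Called by the backend before enqueueing; the lock value records the search id.
function acquireActiveSearchLock(
  store: LockStore,
  legacyUserId: string,
  searchId: string,
  ttlSeconds: number, // assumption: a safety TTL, not confirmed by the design
): AcquireOutcome {
  const acquired = store.setIfAbsent(activeLockKey(legacyUserId), searchId, ttlSeconds);
  return acquired ? { ok: true } : { ok: false, httpStatus: 409, errorCode: "ACTIVE_SEARCH_EXISTS" };
}

// Called by the processor; releases the lock only for terminal statuses.
// Returns true when the lock was released.
function releaseIfTerminal(store: LockStore, legacyUserId: string, searchStatus: string): boolean {
  if (!TERMINAL_STATUSES.has(searchStatus)) {
    return false;
  }
  store.remove(activeLockKey(legacyUserId));
  return true;
}
```

A second acquireActiveSearchLock call for the same user returns the 409 outcome until releaseIfTerminal runs with completed, failed, or timed_out.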

Race And PKL Resolution

The canonical race key is the legacy id_gara, already exposed as raceId in the existing handoff flow.

The processor resolves the PKL path using a race-based directory layout:

/data/pkl/
  101/
    face_encodings.pkl
  202/
    face_encodings.pkl

The lookup rule is:

try /data/pkl/{raceId}/face_encodings.pkl
optionally fall back to /data/pkl/{raceId}.pkl
fail the job if neither exists
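
The lookup rule above could be expressed as a small pure function. This is a sketch under stated assumptions: pklCandidates and resolvePklPath are hypothetical names, the existence check is injected so the example has no filesystem dependency, and the raceId validation is an added precaution that the design does not mention.

```typescript
// Predicate reporting whether a file exists. Injected so the sketch stays
// testable; in the processor it could wrap a filesystem check.
type FileExists = (path: string) => boolean;

// Candidate paths in lookup order: the directory layout first, then the flat fallback.
function pklCandidates(raceId: string, root: string = "/data/pkl"): string[] {
  return [`${root}/${raceId}/face_encodings.pkl`, `${root}/${raceId}.pkl`];
}

function resolvePklPath(raceId: string, exists: FileExists, root: string = "/data/pkl"): string {
  // Assumption: race ids are simple tokens; anything else is rejected so the
  // id cannot escape the PKL root.
  if (!/^[A-Za-z0-9_-]+$/.test(raceId)) {
    throw new Error(`invalid raceId: ${raceId}`);
  }
  for (const candidate of pklCandidates(raceId, root)) {
    if (exists(candidate)) {
      return candidate;
    }
  }
  // Neither path exists, so the job fails.
  throw new Error(`no PKL file found for race ${raceId}`);
}
```

Injecting the predicate keeps the rule itself independent of where the files live, which may help when the NAS-backed layout replaces the current one.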

For local development, test_pkl/ is mounted into /data/pkl/test, and the lookup can fall back to the first .pkl file in that folder when no race-specific file exists yet.

Shared Runtime Storage

Both the public backend and the processor mount the same writable runtime directory:

/data/runtime/
  uploads/
  searches/

uploaded selfies are written under uploads/{searchId}/
worker output and logs are written under searches/{searchId}/
cleanup can safely remove old per-search directories after retention expires
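
As an illustration of the layout, the per-search directories and a cleanup selection might look like the sketch below. The helper names are hypothetical, and the retention period is passed in by the caller because the document does not fix a retention value for runtime files.

```typescript
const RUNTIME_ROOT = "/data/runtime";

// Directory holding the uploaded selfie for one search.
function uploadDir(searchId: string): string {
  return `${RUNTIME_ROOT}/uploads/${searchId}`;
}

// Directory holding worker output and logs for one search.
function searchDir(searchId: string): string {
  return `${RUNTIME_ROOT}/searches/${searchId}`;
}

// Hypothetical listing entry: one per-search directory and its creation time.
interface RuntimeDirEntry {
  searchId: string;
  createdAtMs: number;
}

// Returns the search ids whose directories are older than the retention period.
function expiredSearchIds(entries: RuntimeDirEntry[], nowMs: number, retentionMs: number): string[] {
  return entries
    .filter((entry) => nowMs - entry.createdAtMs > retentionMs)
    .map((entry) => entry.searchId);
}
```

Because both directories are keyed by searchId, a cleanup job can remove uploadDir(id) and searchDir(id) together for every id that expiredSearchIds returns.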

Search Lifecycle

frontend uploads a selfie and calls POST /api/searches
backend validates the session, checks the rate limit, and acquires the active-user lock
backend stores the upload and creates a Redis search record with status queued
backend enqueues a BullMQ job
processor picks up the job and sets status processing
processor runs face_matcher
processor parses CSV output into matches
processor stores a result record and marks the search completed
frontend polling reads Redis-backed state through GET /api/searches/:id
existing redirect flow sends the user back to the legacy filtered page
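
The status changes implied by this lifecycle can be modelled as a small transition table. This is a sketch, not the processor's implementation: the status names come from this document, while the exact set of allowed transitions (for example, that failed and timed_out are reachable only from processing) is an assumption.

```typescript
type SearchStatus = "queued" | "processing" | "completed" | "failed" | "timed_out";

// Assumed transition table: queued searches start processing, and processing
// searches end in one of the three terminal statuses.
const ALLOWED_TRANSITIONS: Record<SearchStatus, readonly SearchStatus[]> = {
  queued: ["processing"],
  processing: ["completed", "failed", "timed_out"],
  completed: [],
  failed: [],
  timed_out: [],
};

// A status is terminal when nothing can follow it.
function isTerminal(searchStatus: SearchStatus): boolean {
  return ALLOWED_TRANSITIONS[searchStatus].length === 0;
}

// Returns the next status, or throws when the move is not allowed.
function transition(current: SearchStatus, next: SearchStatus): SearchStatus {
  if (!ALLOWED_TRANSITIONS[current].includes(next)) {
    throw new Error(`illegal transition: ${current} -> ${next}`);
  }
  return next;
}
```

Guarding every status write with transition would keep a late or duplicated job from moving a finished search back to processing.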

Search Record Shape

{
  "id": "search_...",
  "status": "queued",
  "raceId": "101",
  "userId": "legacy-user-1",
  "returnUrl": "https://...",
  "lang": "it",
  "selfieName": "selfie.jpg",
  "selfiePath": "/data/runtime/uploads/search_.../selfie.jpg",
  "resultId": null,
  "matchCount": 0,
  "errorCode": null,
  "errorMessage": null,
  "createdAt": 0,
  "startedAt": null,
  "completedAt": null
}
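
A typed view of this record might look like the sketch below. The field names mirror the JSON above; the factory function, its input type, and the reading of the timestamps as epoch milliseconds are assumptions made for illustration.

```typescript
type SearchStatus = "queued" | "processing" | "completed" | "failed" | "timed_out";

interface SearchRecord {
  id: string;
  status: SearchStatus;
  raceId: string;
  userId: string;
  returnUrl: string;
  lang: string;
  selfieName: string;
  selfiePath: string;
  resultId: string | null;
  matchCount: number;
  errorCode: string | null;
  errorMessage: string | null;
  createdAt: number; // assumption: epoch milliseconds
  startedAt: number | null;
  completedAt: number | null;
}

// Hypothetical input: the fields known when the backend accepts a search.
interface NewSearchInput {
  id: string;
  raceId: string;
  userId: string;
  returnUrl: string;
  lang: string;
  selfieName: string;
}

function searchRecordKey(searchId: string): string {
  return `faceai:search:${searchId}`;
}

// Builds the initial record with status queued and all outcome fields empty.
function newSearchRecord(input: NewSearchInput, nowMs: number): SearchRecord {
  return {
    ...input,
    status: "queued",
    selfiePath: `/data/runtime/uploads/${input.id}/${input.selfieName}`,
    resultId: null,
    matchCount: 0,
    errorCode: null,
    errorMessage: null,
    createdAt: nowMs,
    startedAt: null,
    completedAt: null,
  };
}
```

The record serializes directly with JSON.stringify, so it could be stored as a single string value under searchRecordKey(id).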

Result Shape

{
  "id": "result_...",
  "raceId": "101",
  "raceName": "Mezza di Firenze",
  "userId": "legacy-user-1",
  "returnUrl": "https://...",
  "lang": "it",
  "matches": [
    {
      "photoId": "legacy-photo-id",
      "score": 0.98,
      "label": "legacy-photo-id"
    }
  ],
  "createdAt": 0
}
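
To show how the matches array could be produced, here is a sketch of a CSV-to-matches step. The CSV layout is purely hypothetical (a header row followed by photo id and score columns), because the real face_matcher columns are still unconfirmed; parseMatcherCsv and PhotoMatch are illustrative names, and setting label equal to photoId only mirrors the sample above.

```typescript
interface PhotoMatch {
  photoId: string;
  score: number;
  label: string;
}

function resultRecordKey(resultId: string): string {
  return `faceai:result:${resultId}`;
}

// ASSUMPTION: the CSV has one header row, then `photoId,score` per line.
// The real face_matcher output format has not been confirmed.
function parseMatcherCsv(csv: string): PhotoMatch[] {
  const lines = csv
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);

  const matches: PhotoMatch[] = [];
  for (const row of lines.slice(1)) { // skip the header row
    const [photoId, rawScore] = row.split(",").map((cell) => cell.trim());
    const score = Number(rawScore);
    // Skip rows with a missing id, a missing score, or a non-numeric score.
    if (!photoId || !rawScore || !Number.isFinite(score)) {
      continue;
    }
    matches.push({ photoId, score, label: photoId });
  }
  // Highest score first.
  return matches.sort((a, b) => b.score - a.score);
}
```

Malformed rows are skipped rather than failing the job; whether that is the right policy depends on the real matcher output.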

Compose Topology

faceai: public backend plus built frontend
processor: queue consumer and matcher executor
redis: queue and short-lived state
legacy-php: local bridge simulator for end-to-end testing

Operational Defaults

worker concurrency: 2
active search retention: 24h
result retention: 24h
rate limit: 5 requests per user per 10-minute window
worker timeout: 5 minutes
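
These defaults could be collected in one typed constant, with the rate limit enforced by a simple counter per user. The sketch below assumes a fixed-window strategy, which the document does not specify; OPERATIONAL_DEFAULTS and FixedWindowRateLimiter are hypothetical names, and the in-memory map stands in for Redis, where the same idea is commonly built from INCR and EXPIRE.

```typescript
// Values taken from the defaults listed above, converted to explicit units.
const OPERATIONAL_DEFAULTS = {
  workerConcurrency: 2,
  activeSearchRetentionSeconds: 24 * 60 * 60,
  resultRetentionSeconds: 24 * 60 * 60,
  rateLimitMaxRequests: 5,
  rateLimitWindowMs: 10 * 60 * 1000,
  workerTimeoutMs: 5 * 60 * 1000,
} as const;

// Assumption: a fixed window that starts at the user's first request.
class FixedWindowRateLimiter {
  private windows = new Map<string, { count: number; resetAtMs: number }>();
  private maxRequests: number;
  private windowMs: number;

  constructor(maxRequests: number, windowMs: number) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
  }

  // Returns true when the request is allowed and counts it.
  tryConsume(legacyUserId: string, nowMs: number): boolean {
    const key = `faceai:rate-limit:${legacyUserId}`;
    const current = this.windows.get(key);
    if (current === undefined || nowMs >= current.resetAtMs) {
      // First request, or the previous window has ended: start a new window.
      this.windows.set(key, { count: 1, resetAtMs: nowMs + this.windowMs });
      return true;
    }
    if (current.count >= this.maxRequests) {
      return false;
    }
    current.count += 1;
    return true;
  }
}
```

With the defaults, a limiter built from rateLimitMaxRequests and rateLimitWindowMs allows five searches per user and rejects the sixth until ten minutes have passed since the first.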

Known Follow-Up Work

confirm the real CSV columns emitted by face_matcher
verify the Linux binary shared library requirements inside the processor image
replace the PKL fallback with a strict NAS-backed race mapping once the final folder layout is agreed
add cleanup jobs for expired runtime files
