Regalamiunsorriso/faceai/docs/processor-technical-design.md
MaddoScientisto bbb9c193ce feat: add processor service with Redis-backed job queue
- Introduced a new `processor` service in the Docker Compose setup to handle face matching jobs.
- Configured Redis as a job queue and state management system for processing searches.
- Updated the backend to enqueue jobs and manage user locks using Redis.
- Added environment variables for Redis configuration and runtime paths.
- Created technical design documentation for the processor service outlining architecture, queue model, and search lifecycle.
- Updated package.json and package-lock.json to include dependencies for BullMQ and ioredis in the processor workspace.
- Added sample PKL files for local testing in the `test_pkl` directory.
2026-04-11 17:53:22 +02:00

FaceAI Processor Technical Design

Goal

Add an internal processor service that executes face_matcher jobs for the public FaceAI site, while preventing duplicate searches per user and keeping all state short-lived and restart-safe.

Scope Of This Slice

  • add Redis-backed queue and job state
  • add a dedicated processor workspace and container scaffold
  • replace in-memory search orchestration in the public backend
  • preserve the existing frontend polling and legacy return flow
  • support local PKL testing from test_pkl/

This slice does not yet implement production NAS mounting, persistent databases, or a final parser tailored to the real matcher CSV format.

Runtime Architecture

Public backend

  • owns the authenticated API used by the Vue frontend
  • stores uploaded selfies in a shared runtime volume
  • enqueues jobs into BullMQ
  • keeps per-search state, results, rate limits, and active-user locks in Redis
  • never executes face_matcher directly

Processor

  • consumes queue jobs from Redis using BullMQ worker concurrency
  • resolves the race-scoped PKL path for each job
  • executes the Linux face_matcher binary
  • parses the CSV result into legacy-compatible photoId matches
  • writes final state and result payload back to Redis

Redis

  • queue broker for BullMQ
  • source of truth for active-user locks
  • source of truth for search status and short-lived results
  • source of truth for rate-limit counters

Queue And Locking Model

  • queue name: faceai-searches
  • active lock key: faceai:active-search:user:{legacyUserId}
  • search record key: faceai:search:{searchId}
  • result record key: faceai:result:{resultId}
  • rate limit key prefix: faceai:rate-limit:{legacyUserId}
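The key layout above can be centralised in a small helper so the backend and the processor never build keys by hand. This is a minimal sketch: the key formats come from the list above, while the names `FACEAI_QUEUE_NAME` and `redisKeys` are hypothetical.

```typescript
// Queue name and key formats are taken from the list above.
// The helper and constant names are illustrative, not part of the codebase.
const FACEAI_QUEUE_NAME = "faceai-searches";

const redisKeys = {
  // One lock per legacy user; its presence means a search is in flight.
  activeSearch: (legacyUserId: string): string =>
    `faceai:active-search:user:${legacyUserId}`,
  // Per-search state read by the polling endpoint.
  search: (searchId: string): string => `faceai:search:${searchId}`,
  // Short-lived result payload.
  result: (resultId: string): string => `faceai:result:${resultId}`,
  // Rate-limit counter for a user.
  rateLimit: (legacyUserId: string): string =>
    `faceai:rate-limit:${legacyUserId}`,
};
```

Keeping the formats in one place makes it harder for the two services to drift apart on key names.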

POST /api/searches must acquire the active-user lock before enqueueing. If the lock already exists, the backend returns 409 with error code ACTIVE_SEARCH_EXISTS.

The lock is released only when the processor marks the search as terminal: completed, failed, or timed_out.
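The acquire-then-enqueue rule and the terminal-only release can be sketched as follows. This is an illustration under stated assumptions: `ActiveSearchLocks` is an in-memory stand-in for Redis, and `handleCreateSearch` and `CreateSearchOutcome` are hypothetical names. With Redis, the acquire step would typically be a single atomic `SET` with the `NX` option so that two concurrent requests cannot both succeed.

```typescript
// Statuses that end a search, as listed in the text above.
const TERMINAL_STATUSES: string[] = ["completed", "failed", "timed_out"];

// In-memory stand-in for the Redis lock keys (illustration only).
class ActiveSearchLocks {
  private readonly held = new Map<string, string>(); // lock key -> searchId

  // Returns true if the lock was taken, false if the user already holds one.
  acquire(legacyUserId: string, searchId: string): boolean {
    const key = `faceai:active-search:user:${legacyUserId}`;
    if (this.held.has(key)) return false;
    this.held.set(key, searchId);
    return true;
  }

  // Releases the lock only for a terminal status and only for the holder.
  releaseIfTerminal(legacyUserId: string, searchId: string, status: string): boolean {
    if (TERMINAL_STATUSES.indexOf(status) === -1) return false;
    const key = `faceai:active-search:user:${legacyUserId}`;
    if (this.held.get(key) !== searchId) return false;
    this.held.delete(key);
    return true;
  }
}

type CreateSearchOutcome =
  | { ok: true; searchId: string }
  | { ok: false; httpStatus: 409; errorCode: "ACTIVE_SEARCH_EXISTS" };

// Hypothetical handler core: lock first, enqueue second.
function handleCreateSearch(
  locks: ActiveSearchLocks,
  legacyUserId: string,
  searchId: string,
  enqueue: (searchId: string) => void,
): CreateSearchOutcome {
  if (!locks.acquire(legacyUserId, searchId)) {
    return { ok: false, httpStatus: 409, errorCode: "ACTIVE_SEARCH_EXISTS" };
  }
  enqueue(searchId); // only reached while the lock is held
  return { ok: true, searchId };
}
```

Checking that the releasing search is the one holding the lock is an extra safeguard assumed here; it prevents a late, stale job from unlocking a newer search.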

Race And PKL Resolution

The canonical race key is the legacy id_gara, already exposed as raceId in the existing handoff flow.

The processor resolves the PKL path using a race-based directory layout:

/data/pkl/
  101/
    face_encodings.pkl
  202/
    face_encodings.pkl

The lookup rule is:

  1. try /data/pkl/{raceId}/face_encodings.pkl
  2. optionally fall back to /data/pkl/{raceId}.pkl
  3. fail the job if neither exists

For local development, test_pkl/ is mounted at /data/pkl/test. When no race-specific file exists yet, the processor can fall back to the first .pkl file in that folder.
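The lookup rule can be expressed as a small pure function. The sketch below is an assumption-laden illustration: `resolvePklPath` is a hypothetical name, `fileExists` is injected so the rule can be exercised without touching the disk, and `devTestFiles` stands for a directory listing of /data/pkl/test whose order is whatever the caller provides.

```typescript
// Resolves the PKL file for a race, or returns null so the caller can fail the job.
function resolvePklPath(
  raceId: string,
  fileExists: (path: string) => boolean,
  devTestFiles: string[] = [], // file names found in /data/pkl/test (local dev only)
  pklRoot: string = "/data/pkl",
): string | null {
  const candidates = [
    `${pklRoot}/${raceId}/face_encodings.pkl`, // 1. race directory
    `${pklRoot}/${raceId}.pkl`, // 2. optional flat fallback
  ];
  for (const candidate of candidates) {
    if (fileExists(candidate)) return candidate;
  }
  // Local development fallback: first .pkl file in the test folder, if any.
  for (const name of devTestFiles) {
    if (/\.pkl$/.test(name)) return `${pklRoot}/test/${name}`;
  }
  return null; // 3. neither exists
}
```

In production the caller would pass an empty `devTestFiles` list, which keeps the strict three-step rule.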

Shared Runtime Storage

Both the public backend and the processor mount the same writable runtime directory:

/data/runtime/
  uploads/
  searches/

  • uploaded selfies are written under uploads/{searchId}/
  • worker output and logs are written under searches/{searchId}/
  • cleanup can safely remove old per-search directories after retention expires
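Because every path is scoped by searchId, cleanup reduces to selecting expired searches and removing their two directories. A minimal sketch, assuming each search has a known creation time in epoch milliseconds; the names `uploadDirFor`, `searchDirFor`, and `expiredRuntimeDirs` are hypothetical, and the retention period is a parameter because the document does not fix one for runtime files.

```typescript
const RUNTIME_ROOT = "/data/runtime";

function uploadDirFor(searchId: string): string {
  return `${RUNTIME_ROOT}/uploads/${searchId}`;
}

function searchDirFor(searchId: string): string {
  return `${RUNTIME_ROOT}/searches/${searchId}`;
}

interface RuntimeDirEntry {
  searchId: string;
  createdAtMs: number; // assumed: epoch milliseconds
}

// Returns the directories a cleanup pass could delete; it does not delete anything.
function expiredRuntimeDirs(
  entries: RuntimeDirEntry[],
  nowMs: number,
  retentionMs: number,
): string[] {
  const dirs: string[] = [];
  for (const entry of entries) {
    if (nowMs - entry.createdAtMs >= retentionMs) {
      dirs.push(uploadDirFor(entry.searchId), searchDirFor(entry.searchId));
    }
  }
  return dirs;
}
```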

Search Lifecycle

  1. frontend uploads a selfie and calls POST /api/searches
  2. backend validates session, rate limit, and active-user lock
  3. backend stores the upload and creates a Redis search record with status queued
  4. backend enqueues a BullMQ job
  5. processor picks up the job and sets status processing
  6. processor runs face_matcher
  7. processor parses CSV output into matches
  8. processor stores a result record and marks the search completed
  9. frontend polling reads Redis-backed state through GET /api/searches/:id
  10. existing redirect flow sends the user back to the legacy filtered page
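The status changes implied by these steps can be captured as a transition table. The sketch below is inferred from the lifecycle above rather than taken from the implementation; in particular, it assumes a search always passes through processing before reaching a terminal status, and the names `SearchStatus`, `canTransition`, and `isTerminalStatus` are hypothetical.

```typescript
type SearchStatus = "queued" | "processing" | "completed" | "failed" | "timed_out";

// Assumed transition table: queued -> processing -> one terminal status.
const ALLOWED_TRANSITIONS: Record<SearchStatus, SearchStatus[]> = {
  queued: ["processing"],
  processing: ["completed", "failed", "timed_out"],
  completed: [],
  failed: [],
  timed_out: [],
};

function canTransition(from: SearchStatus, to: SearchStatus): boolean {
  return ALLOWED_TRANSITIONS[from].indexOf(to) !== -1;
}

// A status is terminal when nothing can follow it.
function isTerminalStatus(status: SearchStatus): boolean {
  return ALLOWED_TRANSITIONS[status].length === 0;
}
```

Guarding writes with `canTransition` would stop a retried or delayed job from moving a completed search back to processing.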

Search Record Shape

{
  "id": "search_...",
  "status": "queued",
  "raceId": "101",
  "userId": "legacy-user-1",
  "returnUrl": "https://...",
  "lang": "it",
  "selfieName": "selfie.jpg",
  "selfiePath": "/data/runtime/uploads/search_.../selfie.jpg",
  "resultId": null,
  "matchCount": 0,
  "errorCode": null,
  "errorMessage": null,
  "createdAt": 0,
  "startedAt": null,
  "completedAt": null
}
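For reference, the record above maps onto a TypeScript type roughly as follows. This is a sketch: the field names come from the JSON sample, but the nullable types and the reading of the timestamps as epoch milliseconds are assumptions, and `newQueuedSearchRecord` is a hypothetical factory.

```typescript
interface SearchRecord {
  id: string;
  status: "queued" | "processing" | "completed" | "failed" | "timed_out";
  raceId: string;
  userId: string;
  returnUrl: string;
  lang: string;
  selfieName: string;
  selfiePath: string;
  resultId: string | null;
  matchCount: number;
  errorCode: string | null;
  errorMessage: string | null;
  createdAt: number; // assumed: epoch milliseconds
  startedAt: number | null;
  completedAt: number | null;
}

// Builds the initial record written when the backend accepts a search.
function newQueuedSearchRecord(
  input: {
    id: string;
    raceId: string;
    userId: string;
    returnUrl: string;
    lang: string;
    selfieName: string;
  },
  nowMs: number,
): SearchRecord {
  return {
    id: input.id,
    status: "queued",
    raceId: input.raceId,
    userId: input.userId,
    returnUrl: input.returnUrl,
    lang: input.lang,
    selfieName: input.selfieName,
    selfiePath: `/data/runtime/uploads/${input.id}/${input.selfieName}`,
    resultId: null,
    matchCount: 0,
    errorCode: null,
    errorMessage: null,
    createdAt: nowMs,
    startedAt: null,
    completedAt: null,
  };
}
```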

Result Shape

{
  "id": "result_...",
  "raceId": "101",
  "raceName": "Mezza di Firenze",
  "userId": "legacy-user-1",
  "returnUrl": "https://...",
  "lang": "it",
  "matches": [
    {
      "photoId": "legacy-photo-id",
      "score": 0.98,
      "label": "legacy-photo-id"
    }
  ],
  "createdAt": 0
}
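The matches array is produced by parsing the matcher's CSV output. Since the real columns are not yet confirmed, the following is only a placeholder sketch: it assumes a comma-separated file with a header row, no quoted fields, and columns named `photo_id` and `score`, all of which are hypothetical. `FaceMatch` and `parseMatcherCsv` are illustrative names, and setting label to the photo id simply mirrors the sample above.

```typescript
interface FaceMatch {
  photoId: string;
  score: number;
  label: string;
}

// ASSUMPTION: header row, comma-separated, no quoted fields.
// The default column names are hypothetical until the real format is confirmed.
function parseMatcherCsv(
  csv: string,
  columns: { photoId: string; score: string } = { photoId: "photo_id", score: "score" },
): FaceMatch[] {
  const lines = csv.split(/\r?\n/).filter((line) => line.trim() !== "");
  const [headerLine, ...rows] = lines;
  if (headerLine === undefined) return []; // empty output means no matches

  const header = headerLine.split(",").map((cell) => cell.trim());
  const idIndex = header.indexOf(columns.photoId);
  const scoreIndex = header.indexOf(columns.score);
  if (idIndex === -1 || scoreIndex === -1) {
    throw new Error("Unexpected CSV header: " + headerLine);
  }

  const matches: FaceMatch[] = [];
  for (const row of rows) {
    const cells = row.split(",").map((cell) => cell.trim());
    const photoId = cells[idIndex];
    const rawScore = cells[scoreIndex];
    if (!photoId || !rawScore) continue; // skip incomplete rows
    const score = Number(rawScore);
    if (!isFinite(score)) continue; // skip rows with a non-numeric score
    matches.push({ photoId, score, label: photoId });
  }
  return matches;
}
```

Failing loudly on an unknown header, rather than returning an empty list, keeps a format change from being mistaken for "no matches found".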

Compose Topology

  • faceai: public backend plus built frontend
  • processor: queue consumer and matcher executor
  • redis: queue and short-lived state
  • legacy-php: local bridge simulator for end-to-end testing

Operational Defaults

  • worker concurrency: 2
  • active search retention: 24h
  • result retention: 24h
  • rate limit: 5 requests per user per 10-minute window
  • worker timeout: 5 minutes
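These defaults can be gathered into one constant, and the rate limit illustrated with a simple counter. The sketch below makes two assumptions that the document does not state: the window is a fixed window that starts at the first request, and state is held in memory rather than in Redis. `PROCESSOR_DEFAULTS` and `FixedWindowRateLimiter` are hypothetical names.

```typescript
const PROCESSOR_DEFAULTS = {
  workerConcurrency: 2,
  activeSearchRetentionMs: 24 * 60 * 60 * 1000, // 24h
  resultRetentionMs: 24 * 60 * 60 * 1000, // 24h
  rateLimitMaxRequests: 5,
  rateLimitWindowMs: 10 * 60 * 1000, // 10 minutes
  workerTimeoutMs: 5 * 60 * 1000, // 5 minutes
};

// In-memory fixed-window counter (illustration only; the service keeps counters in Redis).
class FixedWindowRateLimiter {
  private readonly maxRequests: number;
  private readonly windowMs: number;
  private readonly windows = new Map<string, { startedAtMs: number; count: number }>();

  constructor(maxRequests: number, windowMs: number) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed and counts it.
  allow(legacyUserId: string, nowMs: number): boolean {
    const current = this.windows.get(legacyUserId);
    if (current === undefined || nowMs - current.startedAtMs >= this.windowMs) {
      // No window yet, or the previous one has elapsed: start a new window.
      this.windows.set(legacyUserId, { startedAtMs: nowMs, count: 1 });
      return true;
    }
    if (current.count >= this.maxRequests) return false;
    current.count += 1;
    return true;
  }
}
```

With Redis, a fixed window of this kind is commonly built from an incrementing counter plus a key expiry equal to the window length.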

Known Follow-Up Work

  • confirm the real CSV columns emitted by face_matcher
  • verify the Linux binary shared library requirements inside the processor image
  • replace the PKL fallback with a strict NAS-backed race mapping once the final folder layout is agreed
  • add cleanup jobs for expired runtime files