Refactor code structure for improved readability and maintainability

MaddoScientisto 2026-04-12 17:26:17 +02:00
commit c71e4b4cd0
27 changed files with 1738 additions and 324 deletions


@@ -10,7 +10,7 @@ Add an internal processor service that executes `face_matcher` jobs for the publ
 - add a dedicated `processor` workspace and container scaffold
 - replace in-memory search orchestration in the public backend
 - preserve the existing frontend polling and legacy return flow
-- support local PKL testing from `test_pkl/`
+- support local PKL testing from `test_pkl/` mounted with the same directory shape used in hosted deployment
 This slice does not yet implement production NAS mounting, persistent databases, or a final parser tailored to the real matcher CSV format.
@@ -53,25 +53,34 @@ The lock is released only when the processor marks the search as terminal: `comp
 ## Race And PKL Resolution
-The canonical race key is the legacy `id_gara`, already exposed as `raceId` in the existing handoff flow.
+The canonical race key is still the legacy `id_gara`, but the worker no longer guesses the PKL path from `raceId` alone.
-The processor resolves the PKL path using a race-based directory layout:
+The legacy handoff must provide a `raceStorage` object with:
+- `year`
+- `monthFolder` like `04.APRILE`
+- `raceFolder` like `PISA`
+The processor resolves the PKL path using this mounted directory layout:
 ```text
 /data/pkl/
-  101/
-    face_encodings.pkl
-  202/
-    face_encodings.pkl
+  2026/
+    04.APRILE/
+      PISA/
+        face_encodings_20260330_170210.pkl
+      LUCCA/
+        face_encodings_20260330_170155.pkl
 ```
 The lookup rule is:
-1. try `/data/pkl/{raceId}/face_encodings.pkl`
-2. optionally fall back to `/data/pkl/{raceId}.pkl`
-3. fail the job if neither exists
+1. resolve `/data/pkl/{year}/{monthFolder}/{raceFolder}`
+2. list files at that race root
+3. take the first `.pkl` file found there, regardless of filename
+4. fail the job if the directory does not exist or contains no `.pkl` file
-For local development, `test_pkl/` is mounted into `/data/pkl/test` and the backend can fall back to the first `.pkl` file in that folder when no race-specific file exists yet.
+For local development, `test_pkl/` is mounted directly into `/data/pkl` in both the public FaceAI container and the processor container, so the same rule is used in every environment.
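
The new lookup rule can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `resolvePklPath` and the `RaceStorage` interface name are hypothetical, the rejection of segments containing path separators is an added assumption, and the directory listing is sorted so that "first" is deterministic (the design text only says "the first `.pkl` file found").

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Field names come from the design text; the interface name is hypothetical.
interface RaceStorage {
  year: string;
  monthFolder: string; // e.g. "04.APRILE"
  raceFolder: string; // e.g. "PISA"
}

// Hypothetical helper: returns the path of the PKL for a race,
// or throws so the caller can fail (or reject) the job.
function resolvePklPath(pklRoot: string, storage: RaceStorage): string {
  const segments = [storage.year, storage.monthFolder, storage.raceFolder];
  for (const segment of segments) {
    // Assumption: each segment is a single folder name, never a nested path.
    if (!segment || segment === "." || segment === ".." || /[\\/]/.test(segment)) {
      throw new Error(`invalid raceStorage segment: ${JSON.stringify(segment)}`);
    }
  }

  // Step 1: resolve {root}/{year}/{monthFolder}/{raceFolder}.
  const raceDir = path.join(pklRoot, ...segments);
  if (!fs.existsSync(raceDir) || !fs.statSync(raceDir).isDirectory()) {
    throw new Error(`race directory not found: ${raceDir}`);
  }

  // Steps 2 and 3: list the race root and keep regular files ending in ".pkl".
  // Sorting is an assumption made here to get a stable "first" file.
  const pklFiles = fs
    .readdirSync(raceDir, { withFileTypes: true })
    .filter((entry) => entry.isFile() && entry.name.endsWith(".pkl"))
    .map((entry) => entry.name)
    .sort();

  // Step 4: fail when no PKL is present.
  if (pklFiles.length === 0) {
    throw new Error(`no .pkl file in race directory: ${raceDir}`);
  }
  return path.join(raceDir, pklFiles[0]);
}
```

Because `test_pkl/` is mounted at the same root locally, a call such as `resolvePklPath("/data/pkl", { year: "2026", monthFolder: "04.APRILE", raceFolder: "PISA" })` would behave the same way in development and in hosted deployment.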
## Shared Runtime Storage
@@ -91,14 +100,15 @@ Both the public backend and the processor mount the same writable runtime direct
 1. frontend uploads a selfie and calls `POST /api/searches`
 2. backend validates session, rate limit, and active-user lock
-3. backend stores the upload and creates a Redis search record with status `queued`
-4. backend enqueues a BullMQ job
-5. processor picks up the job and sets status `processing`
-6. processor runs `face_matcher`
-7. processor parses CSV output into matches
-8. processor stores a result record and marks the search `completed`
-9. frontend polling reads Redis-backed state through `GET /api/searches/:id`
-10. existing redirect flow sends the user back to the legacy filtered page
+3. backend verifies that the mounted race directory exists and already contains a `.pkl`; if not, it rejects the request before queueing
+4. backend stores the upload and creates a Redis search record with status `queued`
+5. backend enqueues a BullMQ job
+6. processor picks up the job and sets status `processing`
+7. processor runs `face_matcher`
+8. processor parses CSV output into matches
+9. processor stores a result record and marks the search `completed`
+10. frontend polling reads Redis-backed state through `GET /api/searches/:id`
+11. existing redirect flow sends the user back to the legacy filtered page
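
The ordering of steps 3 to 5 in the new flow (check the PKL, then persist, then enqueue) can be illustrated with a small sketch. All names here (`submitSearch`, `SubmitDeps`, `hasPkl`, `saveRecord`, `enqueue`) are hypothetical stand-ins; in the design the record lives in Redis and the job goes to BullMQ, both of which would be asynchronous. The sketch stays synchronous and dependency-free so that only the ordering is shown.

```typescript
interface RaceStorage {
  year: string;
  monthFolder: string;
  raceFolder: string;
}

interface SearchRequest {
  userId: string;
  raceId: string;
  raceStorage: RaceStorage;
}

interface SearchRecord extends SearchRequest {
  id: string;
  status: "queued";
}

// Hypothetical ports standing in for the filesystem check, Redis, and BullMQ.
interface SubmitDeps {
  hasPkl(storage: RaceStorage): boolean; // step 3
  saveRecord(record: SearchRecord): void; // step 4
  enqueue(searchId: string): void; // step 5
}

type SubmitResult =
  | { ok: true; record: SearchRecord }
  | { ok: false; reason: "pkl_missing" };

function submitSearch(
  req: SearchRequest,
  deps: SubmitDeps,
  newId: () => string,
): SubmitResult {
  // Step 3: reject before anything is stored or queued.
  if (!deps.hasPkl(req.raceStorage)) {
    return { ok: false, reason: "pkl_missing" };
  }
  // Step 4: create the search record with status "queued".
  const record: SearchRecord = { ...req, id: newId(), status: "queued" };
  deps.saveRecord(record);
  // Step 5: enqueue only after the record exists.
  deps.enqueue(record.id);
  return { ok: true, record };
}
```

Rejecting up front means a request for a race without a PKL never creates a search record or a queue entry.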
## Search Record Shape
@@ -107,6 +117,11 @@ Both the public backend and the processor mount the same writable runtime direct
   "id": "search_...",
   "status": "queued",
   "raceId": "101",
+  "raceStorage": {
+    "year": "2026",
+    "monthFolder": "04.APRILE",
+    "raceFolder": "PISA"
+  },
   "userId": "legacy-user-1",
   "returnUrl": "https://...",
   "lang": "it",
@@ -162,5 +177,4 @@ Both the public backend and the processor mount the same writable runtime direct
 - confirm the real CSV columns emitted by `face_matcher`
 - verify the Linux binary shared library requirements inside the processor image
-- replace the PKL fallback with a strict NAS-backed race mapping once the final folder layout is agreed
 - add cleanup jobs for expired runtime files