# Direct Parser Source Ingestion Design

## Goal

Parser runtime must write parsed source records directly into the organization-centric
polymorphic storage:

- `organizations_organization`
- `organizations_source_extension`
- source extension subclass tables
- `organizations_source_record`
- `organizations_source_financial_line`

Legacy parser record tables remain only as migration/audit inputs until a later
destructive cleanup. They must not be part of the parser runtime write path or the
runtime read path used by the application.

## Current Runtime Problem

Current parser tasks write source rows into legacy parser tables such as
`GenericParserRecord`, `InspectionRecord`, `ProcurementRecord`,
`IndustrialProductRecord`, and `FinancialReport`, then enqueue a source backfill into
the new organization storage. This keeps the old tables in the hot path and lets the
two stores diverge until the async backfill runs.

## Target Runtime

Parser tasks keep using `ParserLoadLog`, `ParserBatchSequence`, and `BackgroundJob`
as operational metadata. Parsed records are converted into normalized source-record
inputs and persisted through one ingestion service.

The ingestion service is responsible for:

- normalizing identity fields before writing canonical organizations;
- resolving or creating `Organization`;
- creating or updating the source-group polymorphic extension;
- creating or updating `OrganizationSourceRecord` by `(source, external_id)`;
- writing structured financial lines for FNS reports;
- refreshing extension counters in the same transaction.
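
The responsibilities above can be sketched as a single transactional function. This is a minimal illustration, not the project's implementation: it uses the standard-library `sqlite3` module instead of the ORM, and the table names, the `tax_id` identity field, and the functions `normalize_tax_id` and `ingest_records` are all assumptions made for the example. Financial-line writing is omitted.

```python
import json
import sqlite3

# Hypothetical, simplified schema; the real tables are the organizations_* ones.
SCHEMA = """
CREATE TABLE organization (
    id INTEGER PRIMARY KEY,
    tax_id TEXT NOT NULL UNIQUE
);
CREATE TABLE source_extension (
    organization_id INTEGER NOT NULL REFERENCES organization(id),
    source TEXT NOT NULL,
    record_count INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (organization_id, source)
);
CREATE TABLE source_record (
    source TEXT NOT NULL,
    external_id TEXT NOT NULL,
    organization_id INTEGER NOT NULL REFERENCES organization(id),
    payload TEXT NOT NULL,
    PRIMARY KEY (source, external_id)
);
"""


def normalize_tax_id(raw):
    """Keep digits only so differently formatted inputs map to one organization."""
    return "".join(ch for ch in str(raw) if ch.isdigit())


def ingest_records(conn, source, records):
    """Upsert records for one source in one transaction; return rows written."""
    written = 0
    touched = set()
    with conn:  # commits on success, rolls back if any statement raises
        for rec in records:
            tax_id = normalize_tax_id(rec["tax_id"])
            # Resolve or create the organization by its normalized identity.
            conn.execute(
                "INSERT OR IGNORE INTO organization (tax_id) VALUES (?)", (tax_id,)
            )
            org_id = conn.execute(
                "SELECT id FROM organization WHERE tax_id = ?", (tax_id,)
            ).fetchone()[0]
            # Create the per-source extension row if it does not exist yet.
            conn.execute(
                "INSERT OR IGNORE INTO source_extension (organization_id, source) "
                "VALUES (?, ?)",
                (org_id, source),
            )
            # Create or update the source record keyed by (source, external_id).
            conn.execute(
                "INSERT INTO source_record "
                "(source, external_id, organization_id, payload) "
                "VALUES (?, ?, ?, ?) "
                "ON CONFLICT (source, external_id) DO UPDATE SET "
                "organization_id = excluded.organization_id, "
                "payload = excluded.payload",
                (source, rec["external_id"], org_id, json.dumps(rec["payload"])),
            )
            written += 1
            touched.add(org_id)
        # Refresh counters for the organizations touched by this batch,
        # still inside the same transaction.
        for org_id in touched:
            conn.execute(
                "UPDATE source_extension SET record_count = ("
                "  SELECT COUNT(*) FROM source_record"
                "  WHERE source_record.organization_id = source_extension.organization_id"
                "    AND source_record.source = source_extension.source"
                ") WHERE organization_id = ? AND source = ?",
                (org_id, source),
            )
    return written
```

Because every lookup in the sketch goes through a primary key or unique column, re-ingesting a record with the same `(source, external_id)` updates it in place rather than scanning or duplicating rows.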

Parser save services return the number of inserted or updated source records. They no
longer create or query legacy parser record models for runtime decisions.

## Runtime Read Scope

The following runtime reads must use organization source storage:

- parser source cards and source item counters;
- parser log organization counts;
- source detail lists;
- source record detail reads;
- frontend-facing parser result compatibility endpoints while they remain exposed;
- admin/dashboard/export paths that are used by the app during normal operation.

Legacy parser tables may still be read by explicit migration/backfill tooling only.
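
As a rough illustration of the read scope above, source card counters and organization counts can be derived from source records alone. The `SourceRecord` dataclass and `source_card_counters` function below are hypothetical stand-ins; in the application this would more likely be a grouped database query over `OrganizationSourceRecord`.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceRecord:
    """Hypothetical in-memory stand-in for a stored source record."""
    source: str
    external_id: str
    organization_id: int


def source_card_counters(records):
    """Return per-source item and distinct-organization counts."""
    items = defaultdict(int)
    organizations = defaultdict(set)
    for record in records:
        items[record.source] += 1
        organizations[record.source].add(record.organization_id)
    return {
        source: {
            "items": items[source],
            "organizations": len(organizations[source]),
        }
        for source in items
    }
```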

## Compatibility

Existing v1 parser-result URLs can remain during transition, but their data source must
be `OrganizationSourceRecord`, not the legacy parser models. The response shape can be
preserved on a best-effort basis through serializers/adapters that read source-record
payloads.
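
A best-effort adapter of this kind might look like the sketch below. The field list `LEGACY_INSPECTION_FIELDS`, the `SourceRecord` dataclass, and `to_legacy_inspection` are assumptions for illustration only; the actual v1 response fields are not specified in this document.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class SourceRecord:
    """Hypothetical stand-in for a stored source record."""
    id: uuid.UUID
    source: str
    external_id: str
    payload: dict = field(default_factory=dict)


# Assumed v1 field names; replace with the real legacy response fields.
LEGACY_INSPECTION_FIELDS = ("inspection_number", "status", "start_date")


def to_legacy_inspection(record):
    """Project a source-record payload onto the legacy response shape.

    Fields missing from the payload come back as None instead of raising,
    which is what keeps the shape best-effort.
    """
    body = {name: record.payload.get(name) for name in LEGACY_INSPECTION_FIELDS}
    body["id"] = str(record.id)  # UUID string where the legacy integer PK used to be
    body["external_id"] = record.external_id
    return body
```

Returning `None` for absent fields keeps older clients working with a stable key set even though payloads are not yet strongly typed.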

## Non-Goals

- Do not drop legacy parser tables in this phase.
- Do not rewrite parser clients.
- Do not remove parser load logs or background jobs.
- Do not make every payload strongly typed immediately.

## Risks

- Industrial product ingestion is large; the writer must avoid per-record table scans.
- Existing tests assert legacy model counts and must be updated to assert source-record
  behavior.
- Some compatibility endpoints expose legacy primary keys. New records use UUIDs, so
  compatibility adapters must accept source-record UUIDs where needed.
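
For the last risk, one possible approach is to classify the identifier taken from the URL before looking anything up. The helper `parse_record_identifier` below is hypothetical and only shows the classification step; how each kind is then resolved to a record is left open.

```python
import re
import uuid


def parse_record_identifier(raw):
    """Classify a URL identifier as ('uuid', UUID) or ('legacy_pk', int).

    Raises ValueError when the value is neither a UUID nor a decimal integer.
    """
    text = str(raw).strip()
    try:
        return ("uuid", uuid.UUID(text))
    except ValueError:
        pass
    if re.fullmatch(r"[0-9]+", text):
        return ("legacy_pk", int(text))
    raise ValueError(f"unrecognized record identifier: {raw!r}")
```

An endpoint could then route `"uuid"` identifiers to a source-record lookup and decide separately whether `"legacy_pk"` values are still honored during the transition.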