mostovik-backend/docs/superpowers/plans/2026-05-18-direct-parser-source-ingestion.md


Direct Parser Source Ingestion Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.

Goal: Move parser runtime reads and writes from legacy parser record tables to organization source storage.

Architecture: Add a focused ingestion service in the organizations app that persists normalized source-record inputs directly into polymorphic source extensions. Parser services become adapters that convert parser dataclasses into ingestion inputs. Runtime reads use OrganizationSourceRecord and extension counters.

Tech Stack: Django 3.2, PostgreSQL, DRF, django-polymorphic, pytest.


Task 1: Direct Ingestion Core

Files:

  • Create: src/organizations/source_identity.py

  • Create: src/organizations/source_ingestion.py

  • Modify: src/organizations/source_backfill.py

  • Test: tests/apps/organizations/test_source_ingestion.py

  • Test: tests/apps/organizations/test_source_backfill.py

Steps:

  • Write failing tests for direct generic source ingestion.

  • Write failing tests for FNS report ingestion with financial lines.

  • Extract identity normalization from backfill into a shared helper.

  • Implement SourceRecordInput and SourceFinancialLineInput.

  • Implement OrganizationSourceIngestionService.save_records.

  • Keep backfill tests green by routing backfill through the same identity normalization helper.
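
The input types and shared identity helper from Task 1 could be sketched roughly as follows. This is an illustration under stated assumptions, not the planned implementation: the field names (`source_code`, `external_id`, `payload`, `financial_lines`), the normalization rule, and the in-memory `InMemoryIngestionService` stand-in are all hypothetical. The real `OrganizationSourceIngestionService.save_records` would persist through the Django ORM instead of a dict.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Any, Dict, Iterable, List, Optional, Tuple


def normalize_identity(value: Optional[str]) -> str:
    """Hypothetical rule: trim, collapse inner whitespace, uppercase."""
    if value is None:
        return ""
    return " ".join(value.split()).upper()


@dataclass(frozen=True)
class SourceFinancialLineInput:
    # Assumed shape of one financial line in an FNS report.
    code: str
    amount: Decimal


@dataclass(frozen=True)
class SourceRecordInput:
    # Assumed shape of a normalized source-record input.
    source_code: str
    external_id: str
    payload: Dict[str, Any] = field(default_factory=dict)
    financial_lines: List[SourceFinancialLineInput] = field(default_factory=list)

    def identity_key(self) -> Tuple[str, str]:
        # Ingestion and backfill would share this helper so both agree on identity.
        return (self.source_code, normalize_identity(self.external_id))


class InMemoryIngestionService:
    """Dict-backed stand-in for OrganizationSourceIngestionService."""

    def __init__(self) -> None:
        self.records: Dict[Tuple[str, str], SourceRecordInput] = {}

    def save_records(self, inputs: Iterable[SourceRecordInput]) -> int:
        """Upsert by identity key and return the number of newly created records."""
        created = 0
        for item in inputs:
            key = item.identity_key()
            if key not in self.records:
                created += 1
            self.records[key] = item
        return created
```

With this shape, two inputs whose identifiers differ only in whitespace or letter case resolve to one stored record, which is the property the shared helper is meant to guarantee for both ingestion and backfill.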

Task 2: Parser Save Services

Files:

  • Modify: src/apps/parsers/services.py

  • Test: tests/apps/parsers/test_services.py

Steps:

  • Switch generic source saves to OrganizationSourceIngestionService.

  • Switch industrial certificate/manufacturer/product saves.

  • Switch inspection and procurement saves.

  • Switch FNS report saves and duplicate checks.

  • Replace period/deduplication helpers with source-record queries.
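
As a rough illustration of the adapter role described for parser services, the sketch below converts a parser dataclass into an ingestion input and filters out duplicates against a set of already-stored identity keys. `ParsedCertificate`, the simplified `SourceRecordInput`, the `"industrial_certificates"` source code, and both function names are hypothetical. In the real service the existing keys would come from a source-record query rather than an in-memory set.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Iterable, List, Set, Tuple


@dataclass
class ParsedCertificate:
    # Hypothetical parser dataclass; the real parser types are not shown in the plan.
    number: str
    holder_inn: str
    issued_on: str


@dataclass
class SourceRecordInput:
    # Simplified stand-in for the ingestion input type.
    source_code: str
    external_id: str
    payload: Dict[str, Any]


def certificate_to_input(cert: ParsedCertificate) -> SourceRecordInput:
    """Adapt one parsed certificate into an ingestion input."""
    return SourceRecordInput(
        source_code="industrial_certificates",  # assumed source code
        external_id=cert.number.strip(),
        payload=asdict(cert),
    )


def filter_new(
    inputs: Iterable[SourceRecordInput],
    existing_keys: Set[Tuple[str, str]],
) -> List[SourceRecordInput]:
    """Drop inputs whose (source_code, external_id) is already stored."""
    return [
        item
        for item in inputs
        if (item.source_code, item.external_id) not in existing_keys
    ]
```

Keeping the adapter free of persistence logic means each parser only has to describe how its dataclass maps to an input, while duplicate checks stay in one place.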

Task 3: Parser Tasks

Files:

  • Modify: src/apps/parsers/tasks.py

  • Test: tests/apps/parsers/test_tasks.py

Steps:

  • Remove source backfill queueing from parser completion.

  • Keep parser load logs and background job progress unchanged.

  • Return source-record identifiers for FNS processing instead of legacy report ids.

Task 4: Runtime Reads

Files:

  • Modify: src/apps/parsers/source_cards.py

  • Modify: src/apps/parsers/views.py

  • Modify: src/apps/parsers/serializers.py

  • Modify: src/apps/core/admin_dashboard.py

  • Modify: src/apps/backups/services.py

  • Test: parser source-card and result endpoint tests.

Steps:

  • Move source card counts and timestamps to source extensions/source records.

  • Move parser log organization counts to source records.

  • Adapt v1 parser result endpoints to read source records.

  • Move dashboard/export runtime reads off legacy parser models.
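
The counts and timestamps that Task 4 moves onto source records amount to a per-source aggregation. The sketch below shows that aggregation in plain Python over hypothetical rows. `SourceRecordRow`, `SourceCardStats`, and their fields are assumptions, and in the Django code the same result would more likely come from a grouped ORM query or from the extension counters the plan mentions.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Iterable, Optional, Set


@dataclass
class SourceRecordRow:
    # Hypothetical projection of a stored source record.
    source_code: str
    organization_id: int
    updated_at: datetime


@dataclass
class SourceCardStats:
    record_count: int = 0
    organization_count: int = 0
    last_updated_at: Optional[datetime] = None


def build_source_card_stats(rows: Iterable[SourceRecordRow]) -> Dict[str, SourceCardStats]:
    """Group rows by source and compute counts plus the latest timestamp."""
    stats: Dict[str, SourceCardStats] = defaultdict(SourceCardStats)
    organizations: Dict[str, Set[int]] = defaultdict(set)
    for row in rows:
        card = stats[row.source_code]
        card.record_count += 1
        organizations[row.source_code].add(row.organization_id)
        if card.last_updated_at is None or row.updated_at > card.last_updated_at:
            card.last_updated_at = row.updated_at
    for code, card in stats.items():
        # Distinct organizations, not rows, drive the organization count.
        card.organization_count = len(organizations[code])
    return dict(stats)
```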

Task 5: Frontend Record Detail

Files:

  • Modify: mostovik-frontend/src/pages/main/model/source-record-detail/*

  • Test: frontend source-detail/source-record-detail unit tests.

Steps:

  • Replace legacy generated v1 detail clients with organization source-record reads.

  • Use payload plus top-level source-record fields for detail rendering.

  • Keep source-detail lists on the new source-record list endpoint.

Task 6: Validation

Files:

  • No production files.

Steps:

  • Run focused backend parser/organization tests.

  • Run frontend source-detail/source-record-detail checks.

  • Run a live parser smoke test against one small generic source.

  • Confirm legacy parser record counts do not change during the smoke test.

  • Confirm new organization source-record counts do change.
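
The two count checks above can be expressed as a small before/after comparison. This is only a sketch: the dictionary keys `legacy_records` and `source_records` and the function name are hypothetical, and the counts themselves would be collected manually or by a query run before and after the smoke test.

```python
from typing import Dict, List


def check_smoke_counts(before: Dict[str, int], after: Dict[str, int]) -> List[str]:
    """Return a list of problems; an empty list means the smoke test passed."""
    problems: List[str] = []
    if after["legacy_records"] != before["legacy_records"]:
        problems.append(
            "legacy parser record count changed: "
            f"{before['legacy_records']} -> {after['legacy_records']}"
        )
    if after["source_records"] == before["source_records"]:
        problems.append(
            f"organization source-record count did not change: {before['source_records']}"
        )
    return problems
```

Returning a list of problems rather than raising on the first failure lets a single run report both conditions at once.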