# Dashboard Registry Enrichment Analytics Design

Date: 2026-05-06

## Goal

Rework the dashboard analytics tab so it treats active registry organizations as the primary population and parser/enrichment jobs as the operational process that fills data for those organizations.

The existing analytics page is source-centric: total records, source counts, and load quality. That remains useful, but secondary. The new first screen must answer:

- How many active registry organizations are under control?
- How many have additional data from enrichment sources?
- Which registries are under-covered by source?
- Which enrichment jobs are scheduled, running, successful, failed, or stale?
- What actions should the operator take next?

## Scope

In scope:

- Rebuild only the `analyticsPanel` dashboard tab.
- Keep current navigation and other dashboard tabs unchanged.
- Add dashboard API aggregate data under `/api/v1/parsers/dashboard/`.
- Use active `RegistryMembershipPeriod` rows as the population.
- Exclude `unfair_suppliers` from completeness degradation. It is a risk signal, not a required enrichment source.
- Keep source record totals available, but move them below the primary registry analytics.

Out of scope:

- Changing v2 organization API contracts.
- Changing parser execution behavior.
- Changing Celery scheduling semantics.
- Adding external chart dependencies.

## UX Structure

The analytics tab becomes a hybrid of "coverage center" and "enrichment pipeline".

Top section: Registry Organization Coverage

- KPI cards:
  - Active registry organizations.
  - Organizations with at least one enrichment source.
  - Organizations with core profile coverage.
  - Organizations requiring attention.
- Coverage by source:
  - Bar rows for FNS reports, industrial certificates, products, manufacturers, inspections, procurements, arbitration, bankruptcy, FSTEC, vacancies, etc.
  - Each row shows matched organization count and percent of active registry organizations.
  - `unfair_suppliers` is not included here.
- Registry × source matrix:
  - Rows are registries.
  - Columns are important enrichment sources.
  - Cells show percent coverage for organizations in that registry.
  - This gives a fast view of which registry/source pair needs work.

Second section: Enrichment Pipeline

- Job KPI cards:
  - Active schedules.
  - Running jobs.
  - Recent successes.
  - Recent failures.
- Recent job quality meter:
  - Reuse existing load log status data, but frame it as enrichment pipeline health.
- Action queue:
  - Organizations without enrichment data.
  - Organizations with identifier/matching problems.
  - Snapshots older than latest parser batches.
  - Risk signals such as unfair suppliers, bankruptcy, GOZ evasion shown separately from coverage.

Third section: Secondary Technical Counters

- Current source record totals and source mode breakdown move below the registry-focused blocks.
- These remain useful for diagnostics, but no longer dominate the page.

## Backend Data Contract

Extend `/api/v1/parsers/dashboard/` with an analytics object:

```json
{
  "registry_enrichment_analytics": {
    "population": {
      "active_registry_organizations": 252,
      "active_memberships": 647,
      "registries_with_data_percent": 100
    },
    "coverage_summary": {
      "with_any_enrichment": 68,
      "with_any_enrichment_percent": 27.0,
      "core_profile_complete": 21,
      "core_profile_complete_percent": 8.3,
      "requires_attention": 184
    },
    "source_coverage": [
      {
        "source": "fns_reports",
        "label": "ФНС отчетность",
        "organizations_count": 45,
        "coverage_percent": 17.9,
        "required_for_core_profile": true,
        "risk_signal": false
      }
    ],
    "registry_source_matrix": [
      {
        "registry_id": "uuid",
        "registry_name": "Реестр ГК Росатом ГОЗ",
        "active_organizations": 139,
        "sources": {
          "fns_reports": {
            "organizations_count": 20,
            "coverage_percent": 14.4
          }
        }
      }
    ],
    "risk_signals": [
      {
        "source": "unfair_suppliers",
        "label": "Недобросовестные поставщики",
        "organizations_count": 3,
        "coverage_percent": 1.2
      }
    ],
    "pipeline": {
      "active_schedules": 15,
      "running_jobs": 0,
      "recent_success": 13,
      "recent_failed": 0,
      "recent_other": 1
    }
  }
}
```

The existing `registry_data_coverage` can remain temporarily for compatibility inside dashboard JS, but new UI should read `registry_enrichment_analytics`.

## Aggregation Rules

- Population is distinct organizations from active registry memberships: `ended_at IS NULL`.
- Source coverage matches parser records to registry organizations by INN or OGRN.
- `FinancialReport` matches by OGRN.
- Legacy `ProcurementRecord` matches by `customer_inn` and `customer_ogrn`.
- `unfair_suppliers` is excluded from completeness and shown as a risk signal.
- Percent values use one decimal place.
- If a source has records but no identifiers, it does not count as organization coverage.

Core profile completeness for the first version:

- Organization has FNS reports.
- Organization has at least one industrial/manufacturer/product source.

This is intentionally conservative and can become configurable later.

## Frontend Design

Implementation remains in `src/templates/dashboard.html` for now, following the current dashboard pattern.

New/updated DOM blocks:

- `analyticsRegistryKpis`
- `registrySourceCoverageChart`
- `registrySourceMatrix`
- `enrichmentPipelinePanel`
- `analyticsActionQueue`
- `technicalSourceCounters`

No chart dependency is added. Use CSS bars, compact matrix cells, and existing badges/cards. This keeps the dashboard self-contained.

## Error Handling

- If analytics aggregate is missing, show empty states instead of crashing.
- If registries are unavailable, keep the pipeline and technical counters visible.
- If coverage has zero population, render zeroed KPIs and explanatory empty states.

## Testing

Add/update tests:

- Dashboard API returns `registry_enrichment_analytics`.
- `unfair_suppliers` appears in `risk_signals`, not `source_coverage`.
- Matrix counts source coverage per registry.
- Template contains new analytics sections and still includes secondary source counters.
- Existing parser/dashboard tests continue to pass.

Manual validation:

- Open `/dashboard`.
- Confirm first visible analytics content is registry organization coverage.
- Confirm source record totals are below primary registry analytics.
- Confirm FNS table and existing organization drill-down are unaffected.

## Risks

- Matching by INN/OGRN can undercount sources with incomplete identifiers.
- Current dashboard API may become heavier with matrix aggregation. Keep queries bounded and use grouped SQL where practical.
- Core completeness definition is a business rule; first implementation uses a conservative default and should be easy to adjust.
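
Because the core completeness rule and the risk-signal exclusion are the parts most likely to be adjusted, a minimal in-memory sketch of the aggregation rules may help when reviewing them. It is not the planned implementation, which is expected to use grouped SQL. The function name `build_analytics`, the input shape (sets of organization ids already matched by INN/OGRN), and the source keys other than `fns_reports` and `unfair_suppliers` are assumptions made for illustration. The `requires_attention` formula (population minus enriched organizations) is inferred from the sample payload, not stated as a rule.

```python
# Source groupings for the first-version core profile rule.
# Keys other than "fns_reports" and "unfair_suppliers" are hypothetical.
CORE_REPORT_SOURCES = {"fns_reports"}
CORE_INDUSTRIAL_SOURCES = {"industrial_certificates", "manufacturers", "products"}
RISK_SIGNAL_SOURCES = {"unfair_suppliers"}


def percent(part: int, total: int) -> float:
    """One-decimal percent; a zero population yields 0.0 instead of an error."""
    return round(part * 100 / total, 1) if total else 0.0


def build_analytics(active_org_ids: set[str],
                    matches_by_source: dict[str, set[str]]) -> dict:
    """Aggregate coverage from per-source sets of matched organization ids.

    Records without identifiers never reach these sets, so they do not
    count as organization coverage.
    """
    total = len(active_org_ids)
    source_coverage, risk_signals = [], []
    enriched: set[str] = set()

    for source, matched in sorted(matches_by_source.items()):
        covered = matched & active_org_ids  # drop orgs outside the population
        row = {
            "source": source,
            "organizations_count": len(covered),
            "coverage_percent": percent(len(covered), total),
        }
        if source in RISK_SIGNAL_SOURCES:
            risk_signals.append(row)  # reported, but never counts as enrichment
        else:
            source_coverage.append(row)
            enriched |= covered

    def covered_by(sources: set[str]) -> set[str]:
        """Active organizations matched by at least one of the given sources."""
        found: set[str] = set()
        for source in sources:
            found |= matches_by_source.get(source, set())
        return found & active_org_ids

    core = covered_by(CORE_REPORT_SOURCES) & covered_by(CORE_INDUSTRIAL_SOURCES)

    return {
        "population": {"active_registry_organizations": total},
        "coverage_summary": {
            "with_any_enrichment": len(enriched),
            "with_any_enrichment_percent": percent(len(enriched), total),
            "core_profile_complete": len(core),
            "core_profile_complete_percent": percent(len(core), total),
            # Assumption inferred from the sample payload (252 - 68 = 184).
            "requires_attention": total - len(enriched),
        },
        "source_coverage": source_coverage,
        "risk_signals": risk_signals,
    }
```

In this sketch an organization matched only by `unfair_suppliers` still counts as requiring attention, which mirrors the rule that risk signals do not contribute to completeness.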