Add organizations v2 API and registry enrichment
All checks were successful
CI/CD Pipeline / Quality Gate (push) Successful in 26s
CI/CD Pipeline / Build and Push Images (push) Successful in 6s
CI/CD Pipeline / Internal Notify (push) Successful in 0s
CI/CD Pipeline / Deploy Dev in Dokploy (push) Successful in 1s

2026-05-06 19:04:46 +02:00
parent f54aa4cb0b
commit 0f17ff6773
62 changed files with 10311 additions and 430 deletions


@@ -0,0 +1,205 @@
# Dashboard Registry Enrichment Analytics Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Rebuild the dashboard analytics tab around active registry organizations and the enrichment pipeline that fills data for them.
**Architecture:** Keep the existing Django API endpoint `/api/v1/parsers/dashboard/` and template `src/templates/dashboard.html`. Add a focused backend aggregate `registry_enrichment_analytics` beside existing fields, then make the analytics tab render registry coverage, matrix, pipeline, action queue, and secondary technical counters from that aggregate.
**Tech Stack:** Django 3.2, DRF, existing parser/register ORM models, server-rendered HTML with inline vanilla JS/CSS, pytest.
---
### Task 1: Backend Analytics Contract
**Files:**
- Modify: `src/apps/parsers/views.py`
- Test: `tests/apps/parsers/test_views.py`
- [ ] **Step 1: Write failing API test**
Add a test that creates two active registry organizations in two registries, FNS data and industrial data for the same organization (so that it satisfies the core profile rule), and `unfair_suppliers` for one organization. Assert:
- `registry_enrichment_analytics` exists.
- active population is `2`.
- `source_coverage` contains FNS/industrial and excludes `unfair_suppliers`.
- `risk_signals` contains `unfair_suppliers`.
- `registry_source_matrix` has per-registry source counts.
- `core_profile_complete` is `1`, because exactly one organization has both FNS and industrial coverage.
Run:
```bash
PYTHONPATH=src .venv/bin/pytest tests/apps/parsers/test_views.py::ParsersViewSetTest::test_dashboard_data_exposes_registry_enrichment_analytics -v
```
Expected: FAIL because `registry_enrichment_analytics` is missing.
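The assertions listed in this step could take roughly the following shape. This is a sketch against a plain dictionary, not the real DRF response: the helper name `assert_enrichment_contract`, the source key `industrial_certificates`, and the sample payload are illustrative assumptions. The real test would build fixtures through the ORM and call `/api/v1/parsers/dashboard/`.

```python
def assert_enrichment_contract(payload: dict) -> None:
    """Check the response shape described in Step 1 (illustrative helper)."""
    assert "registry_enrichment_analytics" in payload
    analytics = payload["registry_enrichment_analytics"]

    assert analytics["population"]["active_registry_organizations"] == 2

    coverage_sources = {entry["source"] for entry in analytics["source_coverage"]}
    risk_sources = {entry["source"] for entry in analytics["risk_signals"]}
    assert "fns_reports" in coverage_sources
    assert "industrial_certificates" in coverage_sources  # assumed source key
    assert "unfair_suppliers" not in coverage_sources
    assert "unfair_suppliers" in risk_sources

    # Every matrix row carries per-registry counts keyed by source.
    for row in analytics["registry_source_matrix"]:
        for cell in row["sources"].values():
            assert "organizations_count" in cell

    assert analytics["coverage_summary"]["core_profile_complete"] == 1


# Hand-written sample standing in for response.json() in the real test.
sample_payload = {
    "registry_enrichment_analytics": {
        "population": {"active_registry_organizations": 2},
        "coverage_summary": {"core_profile_complete": 1},
        "source_coverage": [
            {"source": "fns_reports", "organizations_count": 1},
            {"source": "industrial_certificates", "organizations_count": 1},
        ],
        "risk_signals": [{"source": "unfair_suppliers", "organizations_count": 1}],
        "registry_source_matrix": [
            {
                "registry_name": "Registry A",
                "sources": {"fns_reports": {"organizations_count": 1}},
            },
            {
                "registry_name": "Registry B",
                "sources": {"fns_reports": {"organizations_count": 0}},
            },
        ],
    }
}

assert_enrichment_contract(sample_payload)
```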
- [ ] **Step 2: Implement aggregate helpers**
In `src/apps/parsers/views.py`, add helpers based on the existing registry coverage matching:
- active registry organization identity indexes.
- source matched organization id sets.
- source coverage entries.
- risk signal entries.
- registry/source matrix rows.
- pipeline summary from schedules, jobs, and load logs.
Use source matching rules already present:
- default fields: `inn`, `ogrn`.
- FNS: `ogrn` only.
- legacy procurements: `customer_inn`, `customer_ogrn`.
- `unfair_suppliers`: risk signal, not completeness.
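A minimal sketch of how these matching rules might be applied, using plain dictionaries instead of ORM rows. The source keys (`fns_reports`, `legacy_procurements`), the function names, and the sample identifiers are assumptions for illustration; the existing registry coverage matching in `views.py` remains the reference.

```python
from collections import defaultdict

DEFAULT_MATCH_FIELDS = ("inn", "ogrn")
# Per-source overrides from the plan; the dictionary keys are assumed names.
SOURCE_MATCH_FIELDS = {
    "fns_reports": ("ogrn",),
    "legacy_procurements": ("customer_inn", "customer_ogrn"),
}
# Sources reported as risk signals instead of completeness.
RISK_SIGNAL_SOURCES = {"unfair_suppliers"}


def identifier_kind(field):
    """Map a record field such as `customer_inn` to its identifier kind."""
    return "inn" if field.endswith("inn") else "ogrn"


def build_identity_index(organizations):
    """Index active registry organizations by INN and by OGRN."""
    index = {"inn": defaultdict(set), "ogrn": defaultdict(set)}
    for org in organizations:
        for kind in ("inn", "ogrn"):
            value = (org.get(kind) or "").strip()
            if value:
                index[kind][value].add(org["id"])
    return index


def matched_organization_ids(source, records, index):
    """Return ids of organizations matched by at least one record of a source."""
    fields = SOURCE_MATCH_FIELDS.get(source, DEFAULT_MATCH_FIELDS)
    matched = set()
    for record in records:
        for field in fields:
            value = (record.get(field) or "").strip()
            if value:
                matched |= index[identifier_kind(field)].get(value, set())
    return matched


# Placeholder identifiers, not real organizations.
organizations = [
    {"id": 1, "inn": "7700000001", "ogrn": "1027700000001"},
    {"id": 2, "inn": "7700000002", "ogrn": "1027700000002"},
]
index = build_identity_index(organizations)
record = {"inn": "7700000002", "ogrn": "1027700000001"}
fns_matches = matched_organization_ids("fns_reports", [record], index)
default_matches = matched_organization_ids("inspections", [record], index)
```

With this data the FNS rule only consults `ogrn` and matches organization 1, while a source using the default fields matches both organizations.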
- [ ] **Step 3: Add aggregate to dashboard response**
Return `registry_enrichment_analytics` from `ParserDashboardDataView.get()` while keeping existing `registry_data_coverage`.
- [ ] **Step 4: Verify backend test passes**
Run:
```bash
PYTHONPATH=src .venv/bin/pytest tests/apps/parsers/test_views.py::ParsersViewSetTest::test_dashboard_data_exposes_registry_enrichment_analytics -v
```
Expected: PASS.
### Task 2: Analytics Template Structure
**Files:**
- Modify: `src/templates/dashboard.html`
- Test: `tests/apps/parsers/test_dashboard_page.py`
- [ ] **Step 1: Write failing template test**
Add assertions that `/dashboard` contains:
- `analyticsRegistryKpis`
- `registrySourceCoverageChart`
- `registrySourceMatrix`
- `enrichmentPipelinePanel`
- `analyticsActionQueue`
- `technicalSourceCounters`
- `renderRegistryEnrichmentAnalytics`
- `renderRegistrySourceMatrix`
Run:
```bash
PYTHONPATH=src .venv/bin/pytest tests/apps/parsers/test_dashboard_page.py::ParserDashboardPageTest::test_dashboard_prioritizes_registry_enrichment_analytics -v
```
Expected: FAIL because the new DOM/functions are missing.
- [ ] **Step 2: Replace analytics panel markup**
Update `analyticsPanel` so its first visible blocks are:
- registry organization KPI grid.
- registry source coverage + matrix.
- enrichment pipeline + action queue.
- secondary technical counters below.
- [ ] **Step 3: Add CSS for compact bars and matrix**
Add focused classes:
- `.analytics-hero-grid`
- `.registry-coverage-layout`
- `.source-coverage-list`
- `.registry-matrix`
- `.matrix-cell`
- `.action-list`
- `.technical-counters`
- [ ] **Step 4: Verify template test passes**
Run:
```bash
PYTHONPATH=src .venv/bin/pytest tests/apps/parsers/test_dashboard_page.py::ParserDashboardPageTest::test_dashboard_prioritizes_registry_enrichment_analytics -v
```
Expected: PASS.
### Task 3: Frontend Rendering
**Files:**
- Modify: `src/templates/dashboard.html`
- Test: `tests/apps/parsers/test_dashboard_page.py`
- [ ] **Step 1: Implement render functions**
Add or update inline JS:
- `renderRegistryEnrichmentAnalytics()`
- `renderRegistrySourceCoverage()`
- `renderRegistrySourceMatrix()`
- `renderEnrichmentPipeline()`
- `renderAnalyticsActionQueue()`
- `renderTechnicalSourceCounters()`
Change `renderAnalytics()` to call these functions and keep existing status/source totals as secondary content.
- [ ] **Step 2: Add defensive empty states**
If `dashboardData.registry_enrichment_analytics` is missing, render an empty state and keep secondary counters visible.
- [ ] **Step 3: Syntax-check JS**
Run:
```bash
perl -0ne 'while (m{<script>(.*?)</script>}sg) { print $1 }' src/templates/dashboard.html > /tmp/mostovik-dashboard-inline.js
node --check /tmp/mostovik-dashboard-inline.js
```
Expected: exit code 0.
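Note that the Perl pattern above matches only bare `<script>` tags, so a block opened as `<script type="...">` would be skipped. If that becomes a concern, one possible alternative is a small extractor built on the standard library `html.parser`. This is a sketch only; the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class InlineScriptExtractor(HTMLParser):
    """Collect the bodies of <script> elements that have no src attribute."""

    def __init__(self):
        super().__init__()
        self._in_inline_script = False
        self.scripts = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and "src" not in dict(attrs):
            self._in_inline_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_inline_script = False

    def handle_data(self, data):
        # Script content may arrive in several chunks, so append to the last entry.
        if self._in_inline_script:
            self.scripts[-1] += data


def extract_inline_scripts(html):
    """Return the inline script bodies found in an HTML string."""
    extractor = InlineScriptExtractor()
    extractor.feed(html)
    extractor.close()
    return extractor.scripts


sample_html = (
    '<html><script src="vendor.js"></script>'
    '<script type="text/javascript">const answer = 42;</script></html>'
)
inline_scripts = extract_inline_scripts(sample_html)
```

The joined result can be written to the same `/tmp/mostovik-dashboard-inline.js` file and passed to `node --check` as before.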
### Task 4: Verification
**Files:**
- No new files.
- [ ] **Step 1: Run focused tests**
Run:
```bash
PYTHONPATH=src .venv/bin/pytest tests/apps/parsers/test_views.py::ParsersViewSetTest::test_dashboard_data_exposes_registry_enrichment_analytics tests/apps/parsers/test_dashboard_page.py::ParserDashboardPageTest::test_dashboard_prioritizes_registry_enrichment_analytics -v
```
Expected: PASS.
- [ ] **Step 2: Run dashboard/parser regression suite**
Run:
```bash
PYTHONPATH=src .venv/bin/pytest tests/apps/parsers/test_views.py tests/apps/parsers/test_dashboard_page.py tests/apps/organizations tests/apps/registers/test_services.py
```
Expected: PASS.
- [ ] **Step 3: Run Django system check**
Run:
```bash
PYTHONPATH=src .venv/bin/python src/manage.py check
```
Expected: `System check identified no issues`.
- [ ] **Step 4: Browser smoke test**
Open `/dashboard`, confirm:
- first analytics section is registry coverage, not raw source records.
- matrix renders.
- pipeline renders.
- source totals moved below primary analytics.
- existing organization/FNS drill-down still works.


@@ -0,0 +1,198 @@
# Dashboard Registry Enrichment Analytics Design
Date: 2026-05-06
## Goal
Rework the dashboard analytics tab so it treats active registry organizations as the primary population and parser/enrichment jobs as the operational process that fills data for those organizations.
The existing analytics page is source-centric: total records, source counts, and load quality. That remains useful, but secondary. The new first screen must answer:
- How many active registry organizations are under control?
- How many have additional data from enrichment sources?
- Which registries are under-covered by source?
- Which enrichment jobs are scheduled, running, successful, failed, or stale?
- What actions should the operator take next?
## Scope
In scope:
- Rebuild only the `analyticsPanel` dashboard tab.
- Keep current navigation and other dashboard tabs unchanged.
- Add dashboard API aggregate data under `/api/v1/parsers/dashboard/`.
- Use active `RegistryMembershipPeriod` rows as the population.
- Exclude `unfair_suppliers` from completeness degradation. It is a risk signal, not a required enrichment source.
- Keep source record totals available, but move them below the primary registry analytics.
Out of scope:
- Changing v2 organization API contracts.
- Changing parser execution behavior.
- Changing Celery scheduling semantics.
- Adding external chart dependencies.
## UX Structure
The analytics tab becomes a hybrid of "coverage center" and "enrichment pipeline".
Top section: Registry Organization Coverage
- KPI cards:
  - Active registry organizations.
  - Organizations with at least one enrichment source.
  - Organizations with core profile coverage.
  - Organizations requiring attention.
- Coverage by source:
  - Bar rows for FNS reports, industrial certificates, products, manufacturers, inspections, procurements, arbitration, bankruptcy, FSTEC, vacancies, etc.
  - Each row shows matched organization count and percent of active registry organizations.
  - `unfair_suppliers` is not included here.
- Registry × source matrix:
  - Rows are registries.
  - Columns are important enrichment sources.
  - Cells show percent coverage for organizations in that registry.
  - This gives a fast view of which registry/source pair needs work.
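One way to picture a single matrix row is as a set intersection between the organizations in a registry and the organizations matched by each source. The sketch below works on plain sets of ids; the function name `build_matrix_row` and the sample values are assumptions, while the output keys follow the backend data contract in this document.

```python
def build_matrix_row(registry_id, registry_name, registry_org_ids, matched_by_source):
    """Build one registry x source row from sets of organization ids."""
    total = len(registry_org_ids)
    sources = {}
    for source, matched_ids in matched_by_source.items():
        # Only organizations that belong to this registry count for its cell.
        count = len(registry_org_ids & matched_ids)
        percent = round(count * 100 / total, 1) if total else 0.0
        sources[source] = {
            "organizations_count": count,
            "coverage_percent": percent,
        }
    return {
        "registry_id": registry_id,
        "registry_name": registry_name,
        "active_organizations": total,
        "sources": sources,
    }


row = build_matrix_row(
    "registry-a",
    "Registry A",
    {1, 2, 3},
    {"fns_reports": {1, 9}, "products": set()},
)
```

Here organization 9 is matched by FNS data but is not a member of the registry, so the `fns_reports` cell counts one organization out of three.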
Second section: Enrichment Pipeline
- Job KPI cards:
  - Active schedules.
  - Running jobs.
  - Recent successes.
  - Recent failures.
- Recent job quality meter:
  - Reuse existing load log status data, but frame it as enrichment pipeline health.
- Action queue:
  - Organizations without enrichment data.
  - Organizations with identifier/matching problems.
  - Snapshots older than latest parser batches.
  - Risk signals such as unfair suppliers, bankruptcy, GOZ evasion shown separately from coverage.
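The job KPI cards could be fed by a small summary over recent load log statuses. The sketch below assumes the statuses arrive as strings and that `"success"` and `"failed"` are the relevant values; the real status names in the load log model may differ, and `summarize_pipeline` is a hypothetical helper.

```python
from collections import Counter


def summarize_pipeline(active_schedules, running_jobs, recent_statuses):
    """Collapse recent load log statuses into the pipeline KPI counters."""
    counts = Counter(recent_statuses)
    success = counts.get("success", 0)  # assumed status value
    failed = counts.get("failed", 0)    # assumed status value
    return {
        "active_schedules": active_schedules,
        "running_jobs": running_jobs,
        "recent_success": success,
        "recent_failed": failed,
        # Anything that is neither success nor failure lands in "other".
        "recent_other": sum(counts.values()) - success - failed,
    }


pipeline = summarize_pipeline(
    active_schedules=15,
    running_jobs=0,
    recent_statuses=["success"] * 13 + ["partial"],
)
```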
Third section: Secondary Technical Counters
- Current source record totals and source mode breakdown move below the registry-focused blocks.
- These remain useful for diagnostics, but no longer dominate the page.
## Backend Data Contract
Extend `/api/v1/parsers/dashboard/` with an analytics object:
```json
{
  "registry_enrichment_analytics": {
    "population": {
      "active_registry_organizations": 252,
      "active_memberships": 647,
      "registries_with_data_percent": 100
    },
    "coverage_summary": {
      "with_any_enrichment": 68,
      "with_any_enrichment_percent": 27.0,
      "core_profile_complete": 21,
      "core_profile_complete_percent": 8.3,
      "requires_attention": 184
    },
    "source_coverage": [
      {
        "source": "fns_reports",
        "label": "ФНС отчетность",
        "organizations_count": 45,
        "coverage_percent": 17.9,
        "required_for_core_profile": true,
        "risk_signal": false
      }
    ],
    "registry_source_matrix": [
      {
        "registry_id": "uuid",
        "registry_name": "Реестр ГК Росатом ГОЗ",
        "active_organizations": 139,
        "sources": {
          "fns_reports": {
            "organizations_count": 20,
            "coverage_percent": 14.4
          }
        }
      }
    ],
    "risk_signals": [
      {
        "source": "unfair_suppliers",
        "label": "Недобросовестные поставщики",
        "organizations_count": 3,
        "coverage_percent": 1.2
      }
    ],
    "pipeline": {
      "active_schedules": 15,
      "running_jobs": 0,
      "recent_success": 13,
      "recent_failed": 0,
      "recent_other": 1
    }
  }
}
```
The existing `registry_data_coverage` can remain temporarily for compatibility inside dashboard JS, but new UI should read `registry_enrichment_analytics`.
## Aggregation Rules
- Population is distinct organizations from active registry memberships: `ended_at IS NULL`.
- Source coverage matches parser records to registry organizations by INN or OGRN.
- `FinancialReport` matches by OGRN.
- legacy `ProcurementRecord` matches by `customer_inn` and `customer_ogrn`.
- `unfair_suppliers` is excluded from completeness and shown as a risk signal.
- Percent values use one decimal place.
- If a source has records but no identifiers, it does not count as organization coverage.
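The percent and risk-signal rules above can be sketched as follows. The helper names are hypothetical and the inputs are plain sets of organization ids; the numeric checks reuse the sample values from the data contract (for example 68 of 252 organizations).

```python
def coverage_percent(matched, population):
    """Percent of the population that is matched, rounded to one decimal place."""
    if population <= 0:
        # Zero population yields zeroed KPIs instead of a division error.
        return 0.0
    return round(matched * 100 / population, 1)


def split_coverage_entries(population_ids, matched_by_source,
                           risk_sources=("unfair_suppliers",)):
    """Return (source_coverage, risk_signals) entry lists."""
    source_coverage, risk_signals = [], []
    for source, matched_ids in matched_by_source.items():
        # Matches outside the active population do not count as coverage.
        count = len(population_ids & matched_ids)
        entry = {
            "source": source,
            "organizations_count": count,
            "coverage_percent": coverage_percent(count, len(population_ids)),
        }
        target = risk_signals if source in risk_sources else source_coverage
        target.append(entry)
    return source_coverage, risk_signals


coverage, risks = split_coverage_entries(
    {1, 2, 3, 4},
    {"fns_reports": {1, 2, 7}, "unfair_suppliers": {4}},
)
```

In this example `fns_reports` covers two of four organizations (50.0 percent), and `unfair_suppliers` appears only in the risk list.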
Core profile completeness for the first version:
- Organization has FNS reports.
- Organization has at least one industrial/manufacturer/product source.
This is intentionally conservative and can become configurable later.
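As an illustration, the first-version rule could be expressed as a predicate over the set of sources matched for one organization. The source keys in `INDUSTRIAL_GROUP` and both function names are assumptions; keeping the group in one constant is one way to make the rule easy to adjust later.

```python
FNS_SOURCE = "fns_reports"
# Assumed keys for the industrial/manufacturer/product group.
INDUSTRIAL_GROUP = frozenset({"industrial_certificates", "manufacturers", "products"})


def is_core_profile_complete(org_sources):
    """True when an organization has FNS reports and at least one industrial source."""
    return FNS_SOURCE in org_sources and bool(INDUSTRIAL_GROUP & set(org_sources))


def count_core_profile_complete(sources_by_org):
    """Count organizations whose matched sources satisfy the core profile rule."""
    return sum(
        1 for sources in sources_by_org.values() if is_core_profile_complete(sources)
    )


sources_by_org = {
    1: {"fns_reports", "products"},
    2: {"fns_reports"},
    3: {"manufacturers", "unfair_suppliers"},
}
complete_count = count_core_profile_complete(sources_by_org)
```

Only organization 1 has both FNS reports and an industrial source, so `complete_count` is 1.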
## Frontend Design
Implementation remains in `src/templates/dashboard.html` for now, following the current dashboard pattern.
New/updated DOM blocks:
- `analyticsRegistryKpis`
- `registrySourceCoverageChart`
- `registrySourceMatrix`
- `enrichmentPipelinePanel`
- `analyticsActionQueue`
- `technicalSourceCounters`
No chart dependency is added. Use CSS bars, compact matrix cells, and existing badges/cards. This keeps the dashboard self-contained.
## Error Handling
- If analytics aggregate is missing, show empty states instead of crashing.
- If registries are unavailable, keep the pipeline and technical counters visible.
- If coverage has zero population, render zeroed KPIs and explanatory empty states.
## Testing
Add/update tests:
- Dashboard API returns `registry_enrichment_analytics`.
- `unfair_suppliers` appears in `risk_signals`, not `source_coverage`.
- Matrix counts source coverage per registry.
- Template contains new analytics sections and still includes secondary source counters.
- Existing parser/dashboard tests continue to pass.
Manual validation:
- Open `/dashboard`.
- Confirm first visible analytics content is registry organization coverage.
- Confirm source record totals are below primary registry analytics.
- Confirm FNS table and existing organization drill-down are unaffected.
## Risks
- Matching by INN/OGRN can undercount sources with incomplete identifiers.
- Current dashboard API may become heavier with matrix aggregation. Keep queries bounded and use grouped SQL where practical.
- Core completeness definition is a business rule; first implementation uses a conservative default and should be easy to adjust.