Add organizations v2 API and registry enrichment
@@ -0,0 +1,205 @@
# Dashboard Registry Enrichment Analytics Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Rebuild the dashboard analytics tab around active registry organizations and the enrichment pipeline that fills data for them.

**Architecture:** Keep the existing Django API endpoint `/api/v1/parsers/dashboard/` and template `src/templates/dashboard.html`. Add a focused backend aggregate `registry_enrichment_analytics` beside existing fields, then make the analytics tab render registry coverage, matrix, pipeline, action queue, and secondary technical counters from that aggregate.

**Tech Stack:** Django 3.2, DRF, existing parser/register ORM models, server-rendered HTML with inline vanilla JS/CSS, pytest.

---

### Task 1: Backend Analytics Contract

**Files:**
- Modify: `src/apps/parsers/views.py`
- Test: `tests/apps/parsers/test_views.py`

- [ ] **Step 1: Write failing API test**

Add a test that creates two active registry organizations in two registries, FNS data for one organization, industrial data for one organization, and `unfair_suppliers` for one organization. Assert:

- `registry_enrichment_analytics` exists.
- active population is `2`.
- `source_coverage` contains FNS/industrial and excludes `unfair_suppliers`.
- `risk_signals` contains `unfair_suppliers`.
- `registry_source_matrix` has per-registry source counts.
- `core_profile_complete` is `1` when one organization has FNS and industrial coverage.

Run:

```bash
PYTHONPATH=src .venv/bin/pytest tests/apps/parsers/test_views.py::ParsersViewSetTest::test_dashboard_data_exposes_registry_enrichment_analytics -v
```

Expected: FAIL because `registry_enrichment_analytics` is missing.
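
The assertions listed in Step 1 can be sketched against a plain dictionary, so the shape is visible without a Django test client. This is an illustration under assumptions: `check_analytics_contract` and `sample_payload` are hypothetical names, the nesting under `population` and `coverage_summary` follows the data contract in the design document, and `industrial_certificates` is an assumed source key. In the real test the payload would be the decoded dashboard API response.

```python
def check_analytics_contract(payload):
    """Assert the Step 1 expectations on a decoded dashboard response."""
    analytics = payload["registry_enrichment_analytics"]

    # Two active registry organizations were created by the fixture.
    assert analytics["population"]["active_registry_organizations"] == 2

    coverage_sources = {row["source"] for row in analytics["source_coverage"]}
    risk_sources = {row["source"] for row in analytics["risk_signals"]}
    assert "fns_reports" in coverage_sources
    assert "unfair_suppliers" not in coverage_sources
    assert "unfair_suppliers" in risk_sources

    # Every matrix row must carry a per-registry count for each source.
    for row in analytics["registry_source_matrix"]:
        for cell in row["sources"].values():
            assert "organizations_count" in cell

    # Only one organization has both FNS and industrial coverage.
    assert analytics["coverage_summary"]["core_profile_complete"] == 1


# Hypothetical payload shaped like the fixture described in Step 1.
sample_payload = {
    "registry_enrichment_analytics": {
        "population": {"active_registry_organizations": 2},
        "coverage_summary": {"core_profile_complete": 1},
        "source_coverage": [
            {"source": "fns_reports", "organizations_count": 1},
            {"source": "industrial_certificates", "organizations_count": 1},
        ],
        "risk_signals": [
            {"source": "unfair_suppliers", "organizations_count": 1},
        ],
        "registry_source_matrix": [
            {"registry_id": "r1", "sources": {"fns_reports": {"organizations_count": 1}}},
            {"registry_id": "r2", "sources": {"fns_reports": {"organizations_count": 0}}},
        ],
    }
}

check_analytics_contract(sample_payload)
```

Before Step 3 is implemented, the first key lookup raises `KeyError`, which is the failure this step expects.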

- [ ] **Step 2: Implement aggregate helpers**

In `src/apps/parsers/views.py`, add helpers based on the existing registry coverage matching:

- active registry organization identity indexes.
- source-matched organization ID sets.
- source coverage entries.
- risk signal entries.
- registry/source matrix rows.
- pipeline summary from schedules, jobs, and load logs.

Use the source matching rules already present:

- default fields: `inn`, `ogrn`.
- FNS: `ogrn` only.
- legacy procurements: `customer_inn`, `customer_ogrn`.
- `unfair_suppliers`: risk signal, not completeness.
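
The matching rules above can be sketched with plain dictionaries and sets. This is a minimal illustration, not the existing code in `views.py`: the helper names, the `procurements_legacy` source key, and the record shapes are assumptions made for the example.

```python
from collections import defaultdict

# Identifier fields checked per source; unlisted sources use the default pair.
DEFAULT_MATCH_FIELDS = ("inn", "ogrn")
SOURCE_MATCH_FIELDS = {
    "fns_reports": ("ogrn",),
    "procurements_legacy": ("customer_inn", "customer_ogrn"),  # hypothetical key
}
# Sources reported as risk signals rather than completeness.
RISK_SIGNAL_SOURCES = {"unfair_suppliers"}


def build_identity_index(organizations):
    """Map (identifier kind, value) to the ids of organizations holding it."""
    index = defaultdict(set)
    for org in organizations:
        for kind in ("inn", "ogrn"):
            value = org.get(kind)
            if value:
                index[(kind, value)].add(org["id"])
    return index


def matched_organization_ids(source, records, index):
    """Return ids of registry organizations matched by a source's records."""
    fields = SOURCE_MATCH_FIELDS.get(source, DEFAULT_MATCH_FIELDS)
    matched = set()
    for record in records:
        for field in fields:
            # customer_inn is compared against inn, customer_ogrn against ogrn.
            kind = "inn" if field.endswith("inn") else "ogrn"
            value = record.get(field)
            if value:
                matched |= index.get((kind, value), set())
    return matched


orgs = [
    {"id": 1, "inn": "7700000001", "ogrn": "1027700000001"},
    {"id": 2, "inn": "7700000002", "ogrn": "1027700000002"},
]
index = build_identity_index(orgs)

# FNS ignores the INN, so only the OGRN owner (organization 1) matches.
fns_ids = matched_organization_ids(
    "fns_reports", [{"inn": "7700000002", "ogrn": "1027700000001"}], index
)
procurement_ids = matched_organization_ids(
    "procurements_legacy", [{"customer_inn": "7700000002"}], index
)
```

Records with no usable identifier contribute nothing to the matched set, so such a source can hold many records and still report zero organization coverage.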

- [ ] **Step 3: Add aggregate to dashboard response**

Return `registry_enrichment_analytics` from `ParserDashboardDataView.get()` while keeping existing `registry_data_coverage`.

- [ ] **Step 4: Verify backend test passes**

Run:

```bash
PYTHONPATH=src .venv/bin/pytest tests/apps/parsers/test_views.py::ParsersViewSetTest::test_dashboard_data_exposes_registry_enrichment_analytics -v
```

Expected: PASS.

### Task 2: Analytics Template Structure

**Files:**
- Modify: `src/templates/dashboard.html`
- Test: `tests/apps/parsers/test_dashboard_page.py`

- [ ] **Step 1: Write failing template test**

Add assertions that `/dashboard` contains:

- `analyticsRegistryKpis`
- `registrySourceCoverageChart`
- `registrySourceMatrix`
- `enrichmentPipelinePanel`
- `analyticsActionQueue`
- `technicalSourceCounters`
- `renderRegistryEnrichmentAnalytics`
- `renderRegistrySourceMatrix`

Run:

```bash
PYTHONPATH=src .venv/bin/pytest tests/apps/parsers/test_dashboard_page.py::ParserDashboardPageTest::test_dashboard_prioritizes_registry_enrichment_analytics -v
```

Expected: FAIL because the new DOM/functions are missing.
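
One way to keep the template assertions compact is a helper that reports which markers are absent from the rendered page. The sketch below is an assumption about how the test could be organized; `REQUIRED_MARKERS` and `missing_markers` are hypothetical names, and the HTML string stands in for the decoded response body of `/dashboard`.

```python
# DOM ids and JS function names the rendered dashboard is expected to contain.
REQUIRED_MARKERS = (
    "analyticsRegistryKpis",
    "registrySourceCoverageChart",
    "registrySourceMatrix",
    "enrichmentPipelinePanel",
    "analyticsActionQueue",
    "technicalSourceCounters",
    "renderRegistryEnrichmentAnalytics",
    "renderRegistrySourceMatrix",
)


def missing_markers(html):
    """Return the required markers that do not occur in the HTML text."""
    return [marker for marker in REQUIRED_MARKERS if marker not in html]


# Stand-in for the current template, which has none of the new blocks yet.
old_html = "<div id='analyticsPanel'></div>"
print(missing_markers(old_html))
```

In the test itself, asserting `missing_markers(html) == []` produces a failure message that names every missing id at once instead of stopping at the first one.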

- [ ] **Step 2: Replace analytics panel markup**

Update `analyticsPanel` so its first visible blocks are:

- registry organization KPI grid.
- registry source coverage + matrix.
- enrichment pipeline + action queue.
- secondary technical counters below.

- [ ] **Step 3: Add CSS for compact bars and matrix**

Add focused classes:

- `.analytics-hero-grid`
- `.registry-coverage-layout`
- `.source-coverage-list`
- `.registry-matrix`
- `.matrix-cell`
- `.action-list`
- `.technical-counters`

- [ ] **Step 4: Verify template test passes**

Run:

```bash
PYTHONPATH=src .venv/bin/pytest tests/apps/parsers/test_dashboard_page.py::ParserDashboardPageTest::test_dashboard_prioritizes_registry_enrichment_analytics -v
```

Expected: PASS.

### Task 3: Frontend Rendering

**Files:**
- Modify: `src/templates/dashboard.html`
- Test: `tests/apps/parsers/test_dashboard_page.py`

- [ ] **Step 1: Implement render functions**

Add or update inline JS:

- `renderRegistryEnrichmentAnalytics()`
- `renderRegistrySourceCoverage()`
- `renderRegistrySourceMatrix()`
- `renderEnrichmentPipeline()`
- `renderAnalyticsActionQueue()`
- `renderTechnicalSourceCounters()`

Change `renderAnalytics()` to call these functions and keep existing status/source totals as secondary content.

- [ ] **Step 2: Add defensive empty states**

If `dashboardData.registry_enrichment_analytics` is missing, render an empty state and keep secondary counters visible.

- [ ] **Step 3: Syntax-check JS**

Run:

```bash
perl -0ne 'while (m{<script>(.*?)</script>}sg) { print $1 }' src/templates/dashboard.html > /tmp/mostovik-dashboard-inline.js
node --check /tmp/mostovik-dashboard-inline.js
```

Expected: exit code 0.
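
Where perl is not available, the same extraction can be sketched in Python. This is an illustrative equivalent rather than part of the plan: `extract_inline_scripts` is a hypothetical helper, and the regular expression deliberately mirrors the perl pattern, so it only sees bare `<script>` tags and skips tags that carry attributes such as `src` or `type`.

```python
import re

# Same pattern as the perl one-liner: non-greedy, with "." matching newlines.
INLINE_SCRIPT_RE = re.compile(r"<script>(.*?)</script>", re.DOTALL)


def extract_inline_scripts(html):
    """Concatenate the bodies of all bare <script> blocks, in document order."""
    return "".join(INLINE_SCRIPT_RE.findall(html))


sample_html = (
    "<script>var a = 1;\n</script>"
    '<script src="vendor.js"></script>'
    "<script>var b = 2;</script>"
)
inline_js = extract_inline_scripts(sample_html)
```

The extracted text would then be written to a file and passed to `node --check` exactly as in the shell commands above.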

### Task 4: Verification

**Files:**
- No new files.

- [ ] **Step 1: Run focused tests**

Run:

```bash
PYTHONPATH=src .venv/bin/pytest tests/apps/parsers/test_views.py::ParsersViewSetTest::test_dashboard_data_exposes_registry_enrichment_analytics tests/apps/parsers/test_dashboard_page.py::ParserDashboardPageTest::test_dashboard_prioritizes_registry_enrichment_analytics -v
```

Expected: PASS.

- [ ] **Step 2: Run dashboard/parser regression suite**

Run:

```bash
PYTHONPATH=src .venv/bin/pytest tests/apps/parsers/test_views.py tests/apps/parsers/test_dashboard_page.py tests/apps/organizations tests/apps/registers/test_services.py
```

Expected: PASS.

- [ ] **Step 3: Run Django system check**

Run:

```bash
PYTHONPATH=src .venv/bin/python src/manage.py check
```

Expected: `System check identified no issues`.

- [ ] **Step 4: Browser smoke test**

Open `/dashboard`, confirm:

- first analytics section is registry coverage, not raw source records.
- matrix renders.
- pipeline renders.
- source totals moved below primary analytics.
- existing organization/FNS drill-down still works.
@@ -0,0 +1,198 @@
# Dashboard Registry Enrichment Analytics Design

Date: 2026-05-06

## Goal

Rework the dashboard analytics tab so it treats active registry organizations as the primary population and parser/enrichment jobs as the operational process that fills data for those organizations.

The existing analytics page is source-centric: total records, source counts, and load quality. That remains useful, but secondary. The new first screen must answer:

- How many active registry organizations are under control?
- How many have additional data from enrichment sources?
- Which registries are under-covered by source?
- Which enrichment jobs are scheduled, running, successful, failed, or stale?
- What actions should the operator take next?

## Scope

In scope:

- Rebuild only the `analyticsPanel` dashboard tab.
- Keep current navigation and other dashboard tabs unchanged.
- Add dashboard API aggregate data under `/api/v1/parsers/dashboard/`.
- Use active `RegistryMembershipPeriod` rows as the population.
- Exclude `unfair_suppliers` from completeness degradation. It is a risk signal, not a required enrichment source.
- Keep source record totals available, but move them below the primary registry analytics.

Out of scope:

- Changing v2 organization API contracts.
- Changing parser execution behavior.
- Changing Celery scheduling semantics.
- Adding external chart dependencies.

## UX Structure

The analytics tab becomes a hybrid of "coverage center" and "enrichment pipeline".

Top section: Registry Organization Coverage

- KPI cards:
  - Active registry organizations.
  - Organizations with at least one enrichment source.
  - Organizations with core profile coverage.
  - Organizations requiring attention.
- Coverage by source:
  - Bar rows for FNS reports, industrial certificates, products, manufacturers, inspections, procurements, arbitration, bankruptcy, FSTEC, vacancies, etc.
  - Each row shows matched organization count and percent of active registry organizations.
  - `unfair_suppliers` is not included here.
- Registry × source matrix:
  - Rows are registries.
  - Columns are important enrichment sources.
  - Cells show percent coverage for organizations in that registry.
  - This gives a fast view of which registry/source pair needs work.

Second section: Enrichment Pipeline

- Job KPI cards:
  - Active schedules.
  - Running jobs.
  - Recent successes.
  - Recent failures.
- Recent job quality meter:
  - Reuse existing load log status data, but frame it as enrichment pipeline health.
- Action queue:
  - Organizations without enrichment data.
  - Organizations with identifier/matching problems.
  - Snapshots older than latest parser batches.
  - Risk signals such as unfair suppliers, bankruptcy, GOZ evasion shown separately from coverage.

Third section: Secondary Technical Counters

- Current source record totals and source mode breakdown move below the registry-focused blocks.
- These remain useful for diagnostics, but no longer dominate the page.

## Backend Data Contract

Extend `/api/v1/parsers/dashboard/` with an analytics object:

```json
{
  "registry_enrichment_analytics": {
    "population": {
      "active_registry_organizations": 252,
      "active_memberships": 647,
      "registries_with_data_percent": 100
    },
    "coverage_summary": {
      "with_any_enrichment": 68,
      "with_any_enrichment_percent": 27.0,
      "core_profile_complete": 21,
      "core_profile_complete_percent": 8.3,
      "requires_attention": 184
    },
    "source_coverage": [
      {
        "source": "fns_reports",
        "label": "ФНС отчетность",
        "organizations_count": 45,
        "coverage_percent": 17.9,
        "required_for_core_profile": true,
        "risk_signal": false
      }
    ],
    "registry_source_matrix": [
      {
        "registry_id": "uuid",
        "registry_name": "Реестр ГК Росатом ГОЗ",
        "active_organizations": 139,
        "sources": {
          "fns_reports": {
            "organizations_count": 20,
            "coverage_percent": 14.4
          }
        }
      }
    ],
    "risk_signals": [
      {
        "source": "unfair_suppliers",
        "label": "Недобросовестные поставщики",
        "organizations_count": 3,
        "coverage_percent": 1.2
      }
    ],
    "pipeline": {
      "active_schedules": 15,
      "running_jobs": 0,
      "recent_success": 13,
      "recent_failed": 0,
      "recent_other": 1
    }
  }
}
```

The existing `registry_data_coverage` can remain temporarily for compatibility inside dashboard JS, but new UI should read `registry_enrichment_analytics`.
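
The `registry_source_matrix` rows in the contract above could be assembled from two inputs: the active member ids of each registry and the matched organization ids of each source. The sketch below assumes both are already available as Python sets; `build_registry_source_matrix` and its parameters are hypothetical names, and the production code would more likely derive these sets from grouped ORM queries.

```python
def build_registry_source_matrix(registries, members_by_registry, matched_by_source):
    """Build one matrix row per registry with per-source counts and percents."""
    rows = []
    for registry_id, registry_name in registries:
        members = members_by_registry.get(registry_id, set())
        sources = {}
        for source, matched in matched_by_source.items():
            # Only organizations active in this registry count toward its cell.
            count = len(members & matched)
            percent = round(count * 100 / len(members), 1) if members else 0.0
            sources[source] = {
                "organizations_count": count,
                "coverage_percent": percent,
            }
        rows.append(
            {
                "registry_id": registry_id,
                "registry_name": registry_name,
                "active_organizations": len(members),
                "sources": sources,
            }
        )
    return rows


# Illustrative data: 139 members, 20 of them matched by FNS reports,
# plus one matched organization (id 1000) that is not in this registry.
matrix = build_registry_source_matrix(
    registries=[("uuid", "Registry A")],
    members_by_registry={"uuid": set(range(139))},
    matched_by_source={"fns_reports": set(range(20)) | {1000}},
)
```

With these inputs the single cell reports 20 organizations and 14.4 percent, the same figures as the sample row in the contract.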

## Aggregation Rules

- Population is distinct organizations from active registry memberships: `ended_at IS NULL`.
- Source coverage matches parser records to registry organizations by INN or OGRN.
- `FinancialReport` matches by OGRN.
- legacy `ProcurementRecord` matches by `customer_inn` and `customer_ogrn`.
- `unfair_suppliers` is excluded from completeness and shown as a risk signal.
- Percent values use one decimal place.
- If a source has records but no identifiers, it does not count as organization coverage.
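
The population and percent rules above reduce to a distinct count and a rounded ratio. The following sketch uses plain dictionaries in place of `RegistryMembershipPeriod` rows; the helper names and the `organization_id` key are assumptions, and returning `0.0` for an empty population is one possible reading of the zero-population case, not a confirmed rule.

```python
def active_population(memberships):
    """Distinct organization ids among memberships that have not ended."""
    return {m["organization_id"] for m in memberships if m["ended_at"] is None}


def coverage_percent(matched_count, population_count):
    """Percent of the population covered, rounded to one decimal place."""
    if population_count == 0:
        return 0.0  # assumed behaviour for an empty population
    return round(matched_count * 100 / population_count, 1)


memberships = [
    {"organization_id": 1, "ended_at": None},
    {"organization_id": 1, "ended_at": None},          # same organization, second registry
    {"organization_id": 2, "ended_at": "2025-01-01"},  # ended, so not active
    {"organization_id": 3, "ended_at": None},
]
population = active_population(memberships)  # {1, 3}
```

Applied to the sample figures in the data contract, `coverage_percent(45, 252)` gives `17.9`, `coverage_percent(68, 252)` gives `27.0`, and `coverage_percent(21, 252)` gives `8.3`.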

Core profile completeness for the first version:

- Organization has FNS reports.
- Organization has at least one industrial/manufacturer/product source.

This is intentionally conservative and can become configurable later.
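
Expressed over sets of matched organization ids, the first-version rule is an intersection of the FNS set with the union of the industrial-type sources. This is a sketch under assumptions: the source keys other than `fns_reports` and the name `core_profile_complete_ids` are hypothetical.

```python
# Hypothetical keys for the industrial/manufacturer/product group of sources.
INDUSTRIAL_SOURCES = ("industrial_certificates", "manufacturers", "products")


def core_profile_complete_ids(matched_by_source):
    """Ids of organizations with FNS reports and at least one industrial source."""
    fns = matched_by_source.get("fns_reports", set())
    industrial = set()
    for source in INDUSTRIAL_SOURCES:
        industrial |= matched_by_source.get(source, set())
    return fns & industrial


complete = core_profile_complete_ids(
    {
        "fns_reports": {1, 2},
        "products": {2, 3},
        "unfair_suppliers": {1},  # risk signal, ignored by the rule
    }
)
```

Keeping the source group in one constant is what would make the rule easy to turn into configuration later.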

## Frontend Design

Implementation remains in `src/templates/dashboard.html` for now, following the current dashboard pattern.

New/updated DOM blocks:

- `analyticsRegistryKpis`
- `registrySourceCoverageChart`
- `registrySourceMatrix`
- `enrichmentPipelinePanel`
- `analyticsActionQueue`
- `technicalSourceCounters`

No chart dependency is added. Use CSS bars, compact matrix cells, and existing badges/cards. This keeps the dashboard self-contained.

## Error Handling

- If analytics aggregate is missing, show empty states instead of crashing.
- If registries are unavailable, keep the pipeline and technical counters visible.
- If coverage has zero population, render zeroed KPIs and explanatory empty states.

## Testing

Add/update tests:

- Dashboard API returns `registry_enrichment_analytics`.
- `unfair_suppliers` appears in `risk_signals`, not `source_coverage`.
- Matrix counts source coverage per registry.
- Template contains new analytics sections and still includes secondary source counters.
- Existing parser/dashboard tests continue to pass.

Manual validation:

- Open `/dashboard`.
- Confirm first visible analytics content is registry organization coverage.
- Confirm source record totals are below primary registry analytics.
- Confirm FNS table and existing organization drill-down are unaffected.

## Risks

- Matching by INN/OGRN can undercount sources with incomplete identifiers.
- Current dashboard API may become heavier with matrix aggregation. Keep queries bounded and use grouped SQL where practical.
- Core completeness definition is a business rule; first implementation uses a conservative default and should be easy to adjust.