Product Version History
| Version | Purpose | Date | Owner |
|---|---|---|---|
| V01 | Initial release | 1/19/2026 | Alex Wines |
Glossary
| Term | Definition |
|---|---|
| Ground Truth | The authoritative source of records directly from vendor systems (e.g., ASM OIS DBs, Fuji sources, FlexFlow DBs), accessed through the Broker where required. |
| Data Lake | Arch-hosted aggregated dataset used for reporting, KPI sets, dashboards, and downstream analysis. |
| Validation Run | A single execution of the validation pipeline over one or more sites and a date range (manual or scheduled). |
Purpose of this Feature
What This Feature Does
Sprint 2 expands the Data Validation Tool from a manual spreadsheet-based workflow into a system that supports:
- Hosted dashboards (Fuji/ASM + FlexFlow) to replace the “pivot spreadsheet” experience
- More actionable discrepancy views, including:
  - Clear “missing source” attribution (Data Lake vs Ground Truth)
  - Better filtering, sorting, and readability
  - Stable date behavior using UTC
- Operational scaling improvements, including:
  - Scheduled validation runs
  - Automated Data Lake uploads
  - On-demand Excel generation using data already in the Data Lake (vs re-pulling Ground Truth)
- Stability for ASM validation through a repeatable OIS discovery + onboarding workflow
The goal is to allow Flex and Arch to answer one question in a single UI:
“Where is machine truth not matching what’s in the Arch Data Lake, and why?”
Who should use it
- Flex stakeholders monitoring integrity of ASM / Fuji / FlexFlow ingestion
- Arch Customer Success and Support operators delivering reports and diagnosing gaps
What value it brings
- Makes validation results visible without manual spreadsheet work
- Speeds root cause diagnosis (missing Data Lake vs missing Ground Truth)
- Improves trust in downstream dashboards and KPI Sets
- Enables scalable onboarding of new ASM OIS databases without fragile manual CSV maintenance
Where it appears in the UI
- Validation Dashboards
  - Fuji + ASM Validation Dashboard (Flex Advanced Grafana org)
  - FlexFlow Validation Dashboard (Flex Advanced Grafana org)
- Validation Outputs
  - Consolidated Excel output folder per run (for packaging/sharing)
  - Data Lake tables powering dashboards and reporting
- Operational Scheduling
  - The tool runs weekly
Usage Instructions
Viewing Validation Results in the Fuji + ASM Validation Dashboard (Pivot View)
The Fuji + ASM Validation Dashboard replaces the Excel pivot workflow for Fuji and ASM validations. It provides a hosted, filterable view of discrepancies between Ground Truth and Data Lake data.
In the Flex Advanced Grafana org (the hosted dashboard environment), open the Fuji + ASM Data Validation Dashboard to access the pivot view.
Filtering Validation Results by Vendor, Site, and Line
You can narrow validation results using the available filters:
- Vendor: ASM or Fuji
- Site: One or more Flex sites (multi-select)
- Line: One or more lines within a selected Site
These filters update the pivot table in real time and allow you to focus on specific areas of interest.
Understanding Discrepancy and Missing Data Indicators
Each cell in the pivot table represents the validation result for a specific source on a specific day.
- When both Ground Truth and Data Lake data are available, the cell displays a discrepancy ratio or percentage
- When data is missing from one side, the cell clearly indicates the reason:
  - Missing Data from Data Lake
  - Missing Data from Ground Truth
This makes it easy to distinguish true data mismatches from ingestion, connectivity, or availability issues.
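The cell logic described above can be sketched roughly as follows. This is an illustration only, not the tool's actual implementation: the function name `classify_cell`, the use of `None` for a missing side, the "No Data" label when both sides are missing, and the ratio formula (difference divided by the larger of the two counts) are all assumptions, since the document does not state how the ratio is computed.

```python
from typing import Optional


def classify_cell(ground_truth: Optional[int], data_lake: Optional[int]) -> str:
    """Return the text a pivot cell might show for one source on one day.

    Hypothetical sketch: None means "no records on that side". The percentage
    formula is an assumption, not the tool's documented calculation.
    """
    if ground_truth is None and data_lake is None:
        return "No Data"
    if data_lake is None:
        return "Missing Data from Data Lake"
    if ground_truth is None:
        return "Missing Data from Ground Truth"

    denominator = max(ground_truth, data_lake)
    if denominator == 0:
        # Both sides report zero records, so there is nothing to compare.
        return "0.0%"

    # Positive values mean Ground Truth has more records than the Data Lake.
    percent = 100 * (ground_truth - data_lake) / denominator
    return f"{percent:.1f}%"
```

Keeping the missing-data checks ahead of the arithmetic is what lets a reader tell an availability problem apart from a genuine count mismatch.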
Reviewing Discrepancies with Noise Suppression
To keep the dashboard focused on meaningful differences, very small discrepancies are automatically suppressed:
- Any discrepancy within ±1% is displayed as 0%
This prevents minor noise from obscuring real validation gaps and helps prioritize investigation.
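As a minimal sketch of this rule, assuming discrepancies are held as percentages and that the ±1% boundary is inclusive (the document does not say which), the suppression could look like the following. The function and parameter names are hypothetical.

```python
def suppress_noise(discrepancy_pct: float, threshold_pct: float = 1.0) -> float:
    """Return 0.0 for discrepancies within +/- threshold_pct, else the value unchanged.

    Assumption: the boundary itself (exactly 1%) is treated as noise.
    """
    if abs(discrepancy_pct) <= threshold_pct:
        return 0.0
    return discrepancy_pct
```

For example, `suppress_noise(0.8)` and `suppress_noise(-1.0)` both yield `0.0`, while `suppress_noise(1.5)` is passed through as `1.5`.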
Highlighting Specific Conditions Using the Custom Filter (Fuji + ASM)
The Ratio table includes a Custom Filter that allows you to highlight rows based on specific validation conditions.
Using this filter, you can focus on rows that contain:
- Discrepancies (non-zero differences)
- No Data
- Missing Data Lake
- Missing Ground Truth
Multiple options can be selected at the same time, or the default ALL option can be used to show all rows.
This filter highlights relevant rows without removing context from the table.
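One way to picture the matching behind this filter is sketched below. It is a hypothetical model, not the dashboard's real query: the cell strings, the `cell_condition` and `row_matches` names, and the treatment of ALL as "every row matches" are assumptions made for illustration.

```python
from typing import Iterable, Set

ALL = "ALL"


def cell_condition(cell: str) -> str:
    """Map a cell's display text to one of the filter options (assumed labels)."""
    if cell == "No Data":
        return "No Data"
    if cell == "Missing Data from Data Lake":
        return "Missing Data Lake"
    if cell == "Missing Data from Ground Truth":
        return "Missing Ground Truth"
    # Any other cell is assumed to be a percentage string such as "4.2%".
    return "Discrepancies" if float(cell.rstrip("%")) != 0.0 else "Match"


def row_matches(cells: Iterable[str], selected: Set[str]) -> bool:
    """Return True if any cell in the row meets any selected condition."""
    if ALL in selected:
        return True
    return any(cell_condition(cell) in selected for cell in cells)
```

Because several options can be selected together, the check is an "any of" match: a row qualifies as soon as one of its cells meets one selected condition.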
Viewing Validation Run Status by Site (Fuji + ASM)
Below the Ratio table, the dashboard includes a summary view showing whether validation ran successfully at each site on each day.
The Data Validation Report Run Status by Site table displays:
- One row per Flex site
- One column per UTC day in the selected range
- Cell values indicating:
  - Run Successful when validation data exists
  - Run Unsuccessful when no validation data exists
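The table above can be modeled as a site-by-day grid. The sketch below assumes validation records arrive as `(site, timestamp)` pairs with timezone-aware timestamps; the function names and the record shape are hypothetical, and the actual dashboard query may differ.

```python
from datetime import date, datetime, timedelta, timezone
from typing import Dict, Iterable, List, Tuple


def utc_day(ts: datetime) -> date:
    """Bucket a timezone-aware timestamp into its UTC calendar day."""
    return ts.astimezone(timezone.utc).date()


def run_status_by_site(
    sites: List[str],
    records: Iterable[Tuple[str, datetime]],
    start: date,
    end: date,
) -> Dict[str, Dict[date, str]]:
    """Build one row per site and one column per UTC day in [start, end]."""
    seen = {(site, utc_day(ts)) for site, ts in records}
    days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
    return {
        site: {
            day: "Run Successful" if (site, day) in seen else "Run Unsuccessful"
            for day in days
        }
        for site in sites
    }
```

Bucketing by UTC day means a record written late in the evening in a site's local time can land on the following UTC day, which is worth keeping in mind when reading the grid.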
Viewing FlexFlow Validation Results (Table View)
In the Flex Advanced Grafana org, open the FlexFlow Data Validation Dashboard to view validation results.
Identifying Large Discrepancies Using Delta Columns
To make discrepancies easier to identify and prioritize, the table includes delta columns:
- History Count Delta: FlexFlow History Count − Data Lake History Count
- Test Count Delta: FlexFlow Test Count − Data Lake Test Count
These columns appear next to their corresponding Data Lake columns and can be sorted to quickly surface the largest gaps.
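The delta definitions above amount to simple subtraction per row. A minimal sketch follows, assuming each table row is a dictionary; the key names (`flexflow_history_count` and so on) and the helper names are hypothetical and are not the dashboard's real column identifiers.

```python
from typing import Dict, List

Row = Dict[str, int]


def add_deltas(rows: List[Row]) -> List[Row]:
    """Return copies of the rows with the two delta columns added."""
    enriched_rows = []
    for row in rows:
        enriched = dict(row)
        # Delta = FlexFlow (Ground Truth) count minus Data Lake count.
        enriched["history_count_delta"] = (
            row["flexflow_history_count"] - row["data_lake_history_count"]
        )
        enriched["test_count_delta"] = (
            row["flexflow_test_count"] - row["data_lake_test_count"]
        )
        enriched_rows.append(enriched)
    return enriched_rows


def largest_gaps(rows: List[Row], column: str) -> List[Row]:
    """Sort rows so the largest absolute delta in `column` comes first."""
    return sorted(rows, key=lambda r: abs(r[column]), reverse=True)
```

Sorting by absolute value is a design choice in this sketch: it surfaces large gaps in either direction, whereas a plain descending sort would only surface cases where FlexFlow has more records than the Data Lake.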
Viewing FlexFlow Validation Pull Status Over Time
Below the main FlexFlow report table, the dashboard includes a timeline view showing validation pull status by site and database.
This table displays:
- One row per Site + DB Name combination
- A day-by-day timeline for the selected date range
- Color-coded bars:
  - Green for successful pulls
  - Red for days where no data was returned
  - Bars change color as status changes over time
This view helps quickly identify intermittent connectivity or ingestion issues.
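To illustrate how a day-by-day status series turns into colored bars, the sketch below collapses consecutive days with the same status into one segment. The input shape (a date-sorted list of `(day, data_returned)` pairs for a single Site + DB Name) and the function name are assumptions for illustration only.

```python
from datetime import date
from itertools import groupby
from typing import List, Tuple

Segment = Tuple[date, date, str]


def status_segments(daily: List[Tuple[date, bool]]) -> List[Segment]:
    """Collapse consecutive days with the same pull status into colored bars.

    `daily` must already be sorted by day. Each segment is (first day, last day, color).
    """
    segments = []
    for data_returned, group in groupby(daily, key=lambda item: item[1]):
        days = [day for day, _ in group]
        color = "green" if data_returned else "red"
        segments.append((days[0], days[-1], color))
    return segments
```

A series that alternates between short green and red segments is the pattern that suggests intermittent connectivity, while a single long red segment points to a sustained outage or an ingestion gap.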