Data Verification: Design Document

The Core Problem

The current Atria platform validates connectivity but not data quality or configuration correctness. This creates a dangerous blind spot: devices appear healthy while producing meaningless or incorrect data.

Real-World Failures

Scenario 1: Wrong Pulse Rate Configuration

A pulse meter was installed on a meter at a site. The installer entered a pulse rate of "1", but the correct value was a different scalar (the value represented by each pulse). A team member checked the Site Report and saw:

  • Status: Connected (green)

  • Readings: Data coming in

Everything appeared normal. However, the data was meaningless because the pulse rate configuration was wrong. This was only discovered during a conversation with the installer — not through any system alert or verification.

What should have happened: The system should have flagged "Pulse scalar value of 1 is outside the expected range for this device model" as a configuration error.

Scenario 2: Dying Battery

A sensor's battery was dying — voltage was trending downward over weeks. This was only discovered by manually querying the Timestream table and inspecting BatteryLevel values. No alert, no visual indicator, no proactive warning.

What should have happened: The system should have flagged "Battery voltage 2.3V is below critical threshold (2.8V)" and sent a Chime notification.

Scenario 3: Decoder Errors (Silent Data Loss)

When LoRaWAN device data is received, it passes through a decoder Lambda before landing in Timestream. If the decoder fails (malformed payload, unsupported firmware version, etc.), the error is logged in CloudWatch but the data never reaches Timestream. From the frontend, the device simply appears to have a data gap — there's no visibility into why data stopped.

What should have happened: The system should have checked CloudWatch decoder logs and flagged "3 decoder errors in the last 2 hours for device XYZ — payload format mismatch."

Why This Matters

  • Installers leave sites thinking everything is working when it's not

  • Managers make decisions based on data they believe is accurate but isn't

The system collects the right data but never interprets it.

Solution: Verification as a Layered Engine

Instead of a flat "check if data exists" approach, verification is organized as 5 layers — each catching a different class of problem:

Layer 5: Pipeline Integrity  → Timestream
Layer 4: Data Sensibility    → Values physically possible? Patterns normal?
Layer 3: Device Health       → Battery OK? Signal OK? Packet loss?
Layer 2: Data Presence       → Is data arriving? Any gaps? Decoder errors?
Layer 1: Configuration       → Pulse rate plausible? Units correct? Setup valid?

Scenario 1 mapping: Wrong pulse rate = Layer 1 failure. Data was flowing (Layer 2 pass), device was healthy (Layer 3 pass), but the configuration was wrong.

Scenario 2 mapping: Dying battery = Layer 3 failure. Data may still have been arriving, but the device was degrading.

Scenario 3 mapping: Decoder errors = Layer 2 failure. Data stopped arriving because the decoder was failing silently.
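
The scenario mappings above can be expressed by rolling rule results up per layer and reporting the lowest layer that fails. The following is a minimal sketch under stated assumptions: `LAYER_ORDER`, `RuleResult`, `summarize_layers`, and `first_failing_layer` are hypothetical names, the layer keys mirror the `Layer` values used in the rule records (config, presence, health, sensibility), and `SCALAR_RANGE` is an illustrative rule ID that is not defined in this document.

```python
from dataclasses import dataclass

# Layer keys ordered bottom-up (Layer 1 first). These names are assumptions
# that mirror the "Layer" field of the rule records.
LAYER_ORDER = ["config", "presence", "health", "sensibility"]
SEVERITY = {"pass": 0, "warn": 1, "fail": 2}


@dataclass
class RuleResult:
    rule_id: str
    layer: str
    status: str  # "pass" | "warn" | "fail"


def summarize_layers(results):
    """Return the worst status per layer; layers with no results stay 'pass'."""
    summary = {layer: "pass" for layer in LAYER_ORDER}
    for r in results:
        if SEVERITY[r.status] > SEVERITY[summary[r.layer]]:
            summary[r.layer] = r.status
    return summary


def first_failing_layer(summary):
    """Return the lowest layer whose status is 'fail', or None if none fail."""
    for layer in LAYER_ORDER:
        if summary[layer] == "fail":
            return layer
    return None


# Scenario 1: data flowing, device healthy, configuration wrong.
scenario_1 = [
    RuleResult("SCALAR_RANGE", "config", "fail"),  # hypothetical rule ID
    RuleResult("DATA_FRESHNESS", "presence", "pass"),
    RuleResult("BATTERY_LOW", "health", "pass"),
]
```

Checking layers bottom-up means a configuration failure is reported first even when the higher layers pass, which is the situation Scenario 1 describes.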

Solution Approach: Dedicated Applet + Embedded Widgets

Data verification gets its own dedicated applet — it is a first-class operational concern, not a side feature. Embedding it inside Site Report or Installer alone would minimize its importance and limit its scope.

Why a Dedicated Applet

  1. Different users need it differently:

  • Installer (post-install): "Did I set this up correctly?" — Quick pass/fail check

  • Manager (daily monitoring): "Is everything healthy across my sites?" — Dashboard overview

  • Engineer (investigation): "Why is this device sending weird data?" — Deep drill-down with raw readings

  2. Scope beyond installation: Verification must work for legacy devices (installed before the Atria app), not just newly installed ones.

  3. History and audit: Verification runs, sign-offs, and overrides need a dedicated home for accountability.

  4. Extensibility: Adding new device types, new rules, or new data sources should happen in one place.

Three UI Surfaces, One Engine

Build a shared data verification engine that checks sensor data is reaching Timestream, validates device configuration (e.g., pulse scalar), detects anomalies, and checks CloudWatch for decoder errors. Three surfaces consume the same backend verification API:

  1. Data Verification applet (new, dedicated) — the central hub for all verification: full detail, history, re-run, all devices (new + legacy), audit trail, scheduled run results, cross-site capability. This is the source of truth.

  2. Verify & Handoff section in Installer Review — inline verification + sign-off after installation, with a link to the Data Verification applet for deeper investigation

  3. Verification badges in Site Report — per-device pass/warn/fail at a glance, with a "View Full Verification" link that redirects to the Data Verification applet for that site

The applet is the source of truth; Installer and Site Report provide quick views and link into it.

Architecture Overview

┌─────────────────────────────────┐
│       Verification Lambda       │
├─────────────────────────────────┤
│ Triggers:                       │
│  1. API call (manual from UI)   │
│  2. Post-install event (SNS)    │
│  3. Scheduled (EventBridge)     │
└────┬───────────────┬────────────┘
     │               │
┌────▼──┐     ┌──────▼─────┐
│Timest-│     │CloudWatch  │
│ream   │     │Decoder Logs│
│Query  │     │            │
└────┬──┘     └──────┬─────┘
     │               │
┌────▼───────────────▼────────────┐
│           Rule Engine           │
│ Rules from AtriaVerification-   │
│ Rules DDB table (per DeviceType)│
└────────────┬────────────────────┘
┌────────────▼────────────────────┐
│             Results             │
│ ├─AtriaInstallerAppDeviceDetails│
│ ├─ SNS → Chime (alerts)         │
│ └─ API response (to UI)         │
└─────────────────────────────────┘

Note: Verification results are returned directly in the API response to the caller. They are not persisted to any DynamoDB table. Each verification run is stateless.

Data Storage Architecture

| Storage | What | Why |
| --- | --- | --- |
| AtriaVerificationRules (new DDB table) | Verification rules. Each rule is its own record with thresholds, appliesTo (device types), and an enabled flag. | Admin-configurable rules; will be consumed by the future Chimes/notification system. |
| AtriaInstallerAppDeviceDetails | Stores installer sign-off data (InstallerSignoffBy, InstallerSignoffAt, InstallerSignoffNotes, InstallerSignoffVerificationResult). Sign-off overwrites the previous record, allowing re-signoff. | To track sign-off status. |
| LorawanDevicesTable | Device metadata (read-only): DeviceType, Site, Vendor, Model, PulseScalarValue, BatteryLevel. | Source of truth for device configuration and attributes. Queried during verification for rule evaluation. Not written to by the verification system. |

Why a rules table (not hardcoded)?

  1. Admin users need to configure thresholds per device type via Rules Config View

  2. The future Chimes/notification system will consume the same rules table — Chimes can monitor Battery Health, Sensor Data, etc. by referencing verification rules and their thresholds

  3. New rules can be added without code deployment

AtriaVerificationRules record structure:

Each rule is its own DDB record:

{
  "RuleId": "BATTERY_LOW",               // Primary key, uppercase with underscores
  "Type": "BATTERY",                     // Rule evaluator type (maps to Python class)
  "DisplayName": "Battery Low",          // Human-readable name for UI
  "Description": "...",                  // What the rule checks
  "Layer": "health",                     // config | presence | health | sensibility
  "Enabled": true,                       // Toggle on/off without deleting
  "AppliesTo": ["PWM", "PGM"],           // Device types, empty = all devices
  "Thresholds": { ... },                 // Rule-specific parameters
  "CreatedAt": "2026-01-01T00:00:00Z",
  "CreatedBy": "system@seed",
  "UpdatedAt": "2026-03-15T10:30:00Z",   // Set on UI edits
  "UpdatedBy": "admin@atria.com"         // From Cognito token
}
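
To illustrate how the Lambda might consume these records, the sketch below maps a rule item onto a small Python class and filters rules for one device type. It assumes the items have already been read from the table as plain dicts. The class `VerificationRule`, the helper `rules_for_device`, the `Type` value `"FRESHNESS"`, and the threshold key names are hypothetical; the attribute names (`RuleId`, `Type`, `Layer`, `Enabled`, `AppliesTo`, `Thresholds`) come from the record structure above.

```python
from dataclasses import dataclass, field


@dataclass
class VerificationRule:
    rule_id: str
    type: str
    layer: str
    enabled: bool = True
    applies_to: list = field(default_factory=list)
    thresholds: dict = field(default_factory=dict)

    @classmethod
    def from_item(cls, item):
        """Build a rule from one AtriaVerificationRules item (a plain dict)."""
        return cls(
            rule_id=item["RuleId"],
            type=item["Type"],
            layer=item["Layer"],
            enabled=item.get("Enabled", True),
            applies_to=list(item.get("AppliesTo", [])),
            thresholds=dict(item.get("Thresholds", {})),
        )

    def applies(self, device_type):
        # An empty AppliesTo list means the rule covers all device types.
        return self.enabled and (not self.applies_to or device_type in self.applies_to)


def rules_for_device(items, device_type):
    """Return the enabled rules that apply to the given device type."""
    rules = [VerificationRule.from_item(i) for i in items]
    return [r for r in rules if r.applies(device_type)]


# Illustrative items; threshold key names are assumptions.
sample_items = [
    {"RuleId": "BATTERY_LOW", "Type": "BATTERY", "Layer": "health",
     "Enabled": True, "AppliesTo": ["PWM", "PGM"],
     "Thresholds": {"warnBelowVolts": 3.0, "failBelowVolts": 2.7}},
    {"RuleId": "DATA_FRESHNESS", "Type": "FRESHNESS", "Layer": "presence",
     "Enabled": True, "AppliesTo": [],
     "Thresholds": {"warnAfterHours": 6, "failAfterHours": 12}},
    {"RuleId": "ZERO_FLOW", "Type": "ZERO_FLOW", "Layer": "sensibility",
     "Enabled": False, "AppliesTo": ["PWM"], "Thresholds": {}},
]
```

With these sample items, a `PWM` device gets `BATTERY_LOW` and `DATA_FRESHNESS`, while an `LWM` device gets only `DATA_FRESHNESS`; the disabled `ZERO_FLOW` rule is skipped in both cases.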

UI Surfaces

Data Verification Applet (New, Dedicated)

Where: New applet with its own CDK stack, route `/data-verification`, API Gateway, and navigation entry in the app menu.

Who uses it: Installers (via redirect from Installer Review), Managers/PMs (via redirect from Site Report), IoT engineers, admins — anyone who needs to investigate data quality.

Purpose: The central hub for all data verification. This is where the full analysis lives — detailed per-device results, verification history, audit trail, cross-site verification. Both Installer and Site Report link here for deeper investigation.

Applet Views:

The Data Verification applet has 4 distinct views:

View 1: Search View (Landing Page)

The landing page when opening the applet directly.

  • Two search modes:

    • Search by Site ID: Enter a site code (e.g., NCT2, BVA1), click "View Site" to navigate to Site View. Verification runs automatically.

    • Search by Device EUI: Enter the last 5 digits of a device EUI (e.g., 56170), click "Search" to find and navigate directly to the device detail view.

  • Search results for device EUI show matching devices with EUI, Device Type, and Site.
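
The Device EUI search could be implemented as a case-insensitive suffix match, which also covers a full EUI. This is a sketch, not the actual handler: `search_by_eui` and the `DeviceEui` field name are assumptions, and the sample devices are illustrative only.

```python
def search_by_eui(devices, query):
    """Return EUI, Device Type, and Site for devices whose EUI ends with the query."""
    q = query.strip().lower()
    if not q:
        return []
    return [
        {"DeviceEui": d["DeviceEui"], "DeviceType": d["DeviceType"], "Site": d["Site"]}
        for d in devices
        if d["DeviceEui"].lower().endswith(q)
    ]


# Illustrative device records, not real inventory.
sample_devices = [
    {"DeviceEui": "0004a30b012e2891", "DeviceType": "PWM", "Site": "NCT2", "Vendor": "x"},
    {"DeviceEui": "0004A30B01256170", "DeviceType": "PGM", "Site": "BVA1", "Vendor": "y"},
]
```

Calling `search_by_eui(sample_devices, "56170")` returns the single `BVA1` device, with only the three fields the results list displays.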

View 2: Site View (All Devices at a Site)

Shown when a site is selected from Search View, or when navigating to /monitor/data-verification/site/{siteId}.

  • Site ID displayed prominently

  • Overall status banner: "N passed, N warnings, N failed" with color coding

    • "Run Verification" button: Runs POST /data-verification/verify for all devices at the site

    • "Refresh" button: Re-runs verification (same as Run Verification — results are not cached)

  • Per-device cards in a grid/list:

    • Traffic light badge (green/yellow/red)

    • Device EUI, vendor, model, DeviceType, location

    • Top issue summary (e.g., "Pulse scalar out of range", "Battery 2.3V")

    • Click card → navigates to Device View for that device

  • Filters by DeviceType, status (pass/warn/fail), vendor

  • Shows verification result per device from the latest verify API response

  • Sign-off status visible in Device View when checking installation status (from AtriaInstallerAppDeviceDetails table)
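
The overall status banner can be derived directly from the per-device statuses in the verify response. A minimal sketch, assuming a hypothetical `status_banner` helper and a mapping of device EUI to `pass`/`warn`/`fail`; the rule that any failure turns the banner red, and any warning turns it yellow, is an assumption about the color coding.

```python
from collections import Counter


def status_banner(device_statuses):
    """Return (banner text, color) for a mapping of device EUI to status."""
    counts = Counter(device_statuses.values())
    warn_label = "warning" if counts["warn"] == 1 else "warnings"
    text = f"{counts['pass']} passed, {counts['warn']} {warn_label}, {counts['fail']} failed"
    if counts["fail"]:
        color = "red"
    elif counts["warn"]:
        color = "yellow"
    else:
        color = "green"
    return text, color
```

For three passing devices, one warning, and one failure, this yields `("3 passed, 1 warning, 1 failed", "red")`.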

View 3: Device View (Single Device Deep-Dive)

Shown when a device is selected from Site View, or when arriving via `?deviceEui=` query param.

  • Full device details: EUI, vendor, model, DeviceType, location, floor, site

    • Data Verification Results:

      • Timestream status: last reading age, total readings in window, anomalies with evidence

      • IoT Connectivity Status: On-demand diagnostic check (separate "Check Connectivity" button) — checks IoT Core for device uplinks

      • Decoder Status: On-demand diagnostic check (separate "Check Decoder" button) — checks CloudWatch decoder logs for errors

      • Each check shown with pass/warn/fail badge and detailed message

    • Config Verification Results:

      • Pulse scalar: current value, expected range, model default (for PWM/PGM/PEM)

      • Each config check shown with pass/fail and explanation

    • Anomaly Evidence: Timestamps, values, duration for each detected anomaly

    • Last Good Reading: Timestamp and value of last known-good reading before any anomaly

    • Raw Readings Table: Recent Timestream readings for manual inspection (configurable limit)

  • Back navigation to Site View
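
The anomaly evidence shown in this view could come from a single pass over the device's cumulative readings. The sketch below covers three of the sensibility rules defined in the Verification Criteria section (NEGATIVE_CONSUMPTION, SPIKE_DETECTION, CONSTANT_READING) and uses their stated thresholds of 20 identical readings and 10x the rolling average. The function name `detect_anomalies`, the rolling window of 10 deltas, and the shape of the evidence dicts are assumptions.

```python
from collections import deque


def detect_anomalies(readings, constant_run=20, spike_factor=10.0, window=10):
    """Scan (timestamp, cumulative_value) pairs, oldest first, and return evidence dicts."""
    anomalies = []
    recent = deque(maxlen=window)  # recent non-negative deltas for the rolling average
    run_start = 0                  # index where the current run of identical values began
    for i in range(1, len(readings)):
        ts, value = readings[i]
        prev_value = readings[i - 1][1]
        delta = value - prev_value

        if delta < 0:
            # A cumulative meter reading should never decrease.
            anomalies.append({"rule": "NEGATIVE_CONSUMPTION", "at": ts,
                              "value": value, "previous": prev_value})
        elif recent:
            avg = sum(recent) / len(recent)
            if avg > 0 and delta > spike_factor * avg:
                anomalies.append({"rule": "SPIKE_DETECTION", "at": ts,
                                  "delta": delta, "rolling_avg": avg})

        if delta != 0:
            run_start = i
        elif i - run_start + 1 == constant_run:
            # Reported once, when the run first reaches the threshold length.
            anomalies.append({"rule": "CONSTANT_READING", "from": readings[run_start][0],
                              "to": ts, "count": constant_run})

        if delta >= 0:
            recent.append(delta)
    return anomalies
```

Each evidence dict carries the timestamps and values the view needs; the last good reading would be the reading immediately before the earliest reported anomaly.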

View 4: Rules Config View

Accessible via a settings/config tab within the applet.

  • View current verification rules from `AtriaVerificationRules` DDB table

  • Table showing: Rule ID, Description, Layer, Severity, Applies To (device types), Enabled, Thresholds

  • Vendor/model-specific threshold overrides displayed per rule

  • Toggle rules on/off per device type

  • Admin users can edit thresholds in-app (via PUT /data-verification/rules/{ruleId})

  • Future: Linked to Chimes/notification system — rules feed into notification triggers

Deep-link support:

  • /monitor/data-verification → Search page

  • /monitor/data-verification/site/ABC1 → Site View for ABC1 (auto-runs verification)

  • /monitor/data-verification/site/ABC1/0004a30b012e2891 → Device View for specific device

  • /monitor/data-verification/device/56170 → Device View via EUI search
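
The deep-link patterns above resolve to views as follows. In practice the frontend router would do this; the mapping is sketched in Python only to make the rules explicit. `parse_deep_link` and the keys of the returned dict are hypothetical.

```python
PREFIX = "/monitor/data-verification"


def parse_deep_link(path):
    """Map a deep-link path to a view descriptor, or None if it does not match."""
    if not path.startswith(PREFIX):
        return None
    parts = [p for p in path[len(PREFIX):].split("/") if p]
    if not parts:
        return {"view": "search"}
    if parts[0] == "site" and len(parts) == 2:
        # Site View runs verification automatically on arrival.
        return {"view": "site", "siteId": parts[1], "autoRun": True}
    if parts[0] == "site" and len(parts) == 3:
        return {"view": "device", "siteId": parts[1], "deviceEui": parts[2]}
    if parts[0] == "device" and len(parts) == 2:
        # The segment is an EUI search term (for example, the last 5 digits).
        return {"view": "device", "euiQuery": parts[1]}
    return None
```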

Installer Review — "Verify & Handoff" Section

Where: `InstallerReviewPage.jsx`, below the existing "Uncaptured Pulse Counts" section and above "Devices Ready for Installation."

Who uses it: Installers, on-site after installing devices.

Purpose: Quick inline verification + sign-off before the installer leaves the site. Shows a summary and links to the Data Verification applet for deeper investigation.

Behavior:

  • On page load (when `siteId` is available), automatically runs verification for ALL devices at the site — both newly installed devices (from `installationSummary.allDevices`) and legacy LoRaWAN devices (from `lorawanDevices`).

  • Shows per-device verification status with pass/warn/fail badges (compact summary view).

  • Expandable detail per device: Timestream data freshness, config issues (e.g., pulse scalar out of range), decoder log snippets.

  • "Re-run Verification" button for manual refresh.

  • If any device has `fail` status, show a prominent banner: "Verification failed for N device(s). Review before leaving site."

  • Sign-off: "Sign Off & Complete" button that records the installer's sign-off in the AtriaInstallerAppDeviceDetails table (InstallerSignoffBy, InstallerSignoffAt, InstallerSignoffNotes, InstallerSignoffVerificationResult). If warnings or failures exist, a note is required before sign-off is enabled.

  • Previous sign-off display: If a previous installer has already signed off for this site, a blue banner shows who signed off, when, the verification status at the time, and any notes. The installer can re-verify and overwrite the previous sign-off.

  • Sign-off record stored for audit: who signed off (from Cognito email), when, verification result (green/yellow/red), and notes.

  • "View Full Verification" link: Redirects to the Data Verification applet (`/data-verification?siteId=ABC1`) for deeper investigation, full history, and detailed per-device analysis.

Integration point: The existing `loadInstallationSummary()` already fetches device lists. After that completes, fire the verification call using the combined device EUI list.
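
The sign-off gating described in the behavior list (a banner on any `fail`, and a required note when warnings or failures exist) can be sketched as below. The helper names are hypothetical, and treating the worst device status as the site's overall result is an assumption.

```python
SEVERITY = {"pass": 0, "warn": 1, "fail": 2}


def overall_status(device_statuses):
    """Worst status across all devices; an empty list counts as 'pass'."""
    return max(device_statuses, key=SEVERITY.__getitem__, default="pass")


def can_sign_off(device_statuses, notes):
    """Sign-off is enabled on a clean run, or once a non-blank note is entered."""
    if overall_status(device_statuses) == "pass":
        return True
    return bool(notes and notes.strip())


def failure_banner(device_statuses):
    """Return the banner text when any device failed, else None."""
    failed = sum(1 for s in device_statuses if s == "fail")
    if failed == 0:
        return None
    return f"Verification failed for {failed} device(s). Review before leaving site."
```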

Site Report — Verification Badges + Redirect

Where: `SiteReportPage.jsx` / `SiteHealthDisplay.jsx`, per-device cards and a site-level summary.

Who uses it: Installers, Managers, PMs, ops — anyone reviewing a site's device health remotely.

Purpose: Give non-technical users an at-a-glance view of verification status on the page they already use. This is where a reviewer would have seen "Config mismatch" on the device card. For full details, they click through to the Data Verification applet.

Behavior:

  • Per-device verification badge (green/yellow/red) on each device card. Since verification results are not cached in DDB, badges require running verification on demand via the verify API, or showing "Not Verified" by default until verification is triggered.

  • Site-level summary banner: "3 passed, 1 warning, 1 failed" with color coding.

  • Click any badge or the summary banner → redirects to the Data Verification applet for that site (`/data-verification?siteId=ABC1`), optionally deep-linking to a specific device (`/data-verification?siteId=ABC1&deviceEui=ABCDE`).

  • "Verify Now" button to trigger a fresh run from Site Report (results appear as badges, full detail in the applet).

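  • Planned: results from automated scheduled runs appear as badges automatically, so users see the latest status even if they did not trigger a run. Because results are not currently persisted, this depends on scheduled runs making their latest results available to the frontend.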

Backend — Verification API (Dedicated)

Why a Separate API

The Data Verification service gets its own API Gateway + Lambda stack, following the existing pattern where each applet has its own API. Reasons:

  1. Separation of concerns — verification logic (rule engine, anomaly detection, multi-source checks) should not be entangled with device health or installer APIs

  2. Different IAM permissions — the verification Lambda needs access to CloudWatch Logs, Timestream, and DynamoDB — a broader scope than any existing Lambda

  3. Independent scaling — verification runs (especially scheduled cross-site scans) have different load patterns than regular API calls

  4. Independent deployment — rules, thresholds, and verification logic can be updated without touching other services

Frontend env variable: `VITE_DATA_VERIFICATION_API_URL`

API Endpoints

Base path: /data-verification (on a new dedicated API Gateway)

  • POST /data-verification/verify: Run verification rules and return results directly (no DDB write). Accepts optional deviceEui for a single device and ruleIds for a subset of rules.

  • POST /data-verification/signoff: Record installer sign-off in the AtriaInstallerAppDeviceDetails table with InstallerSignoff* fields. Supports re-signoff (overwrites previous). Uses ConditionExpression to skip devices not in the installer table.

  • GET /data-verification/sites/{siteId}/devices: Get device metadata at site (read-only, no verification status).

  • GET /data-verification/devices/search?eui=XXXXX: Search devices by partial or full EUI.

  • GET /data-verification/devices/{deviceEui}/readings: Get Timestream readings for a device.

  • GET /data-verification/devices/{deviceEui}/installation-status: Check installer table for installation record + signoff status.

  • GET /data-verification/devices/{deviceEui}/iot-status: IoT Core connectivity diagnostic chain.

  • GET /data-verification/devices/{deviceEui}/decoder-status: CloudWatch decoder log check.

  • GET /data-verification/rules: Get all verification rules (Admin).

  • POST /data-verification/rules: Create a new verification rule (Admin).

  • PUT /data-verification/rules/{ruleId}: Update a rule (Admin).

  • DELETE /data-verification/rules/{ruleId}: Delete a rule (Admin).
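
For the sign-off endpoint, the conditional write could be assembled as shown below. This is a sketch, not the actual handler: the helper `build_signoff_update` and the key attribute name `DeviceEui` are assumptions, while the `InstallerSignoff*` attribute names come from this document.

```python
from datetime import datetime, timezone


def build_signoff_update(device_eui, signed_off_by, result, notes="", now=None):
    """Build keyword arguments for a DynamoDB update_item call that records a sign-off."""
    now = now or datetime.now(timezone.utc)
    return {
        "Key": {"DeviceEui": device_eui},  # key attribute name is an assumption
        "UpdateExpression": (
            "SET InstallerSignoffBy = :by, InstallerSignoffAt = :at, "
            "InstallerSignoffNotes = :notes, "
            "InstallerSignoffVerificationResult = :result"
        ),
        # Only update devices that already have a record in the installer table.
        "ConditionExpression": "attribute_exists(DeviceEui)",
        "ExpressionAttributeValues": {
            ":by": signed_off_by,
            ":at": now.isoformat(),
            ":notes": notes,
            ":result": result,
        },
    }
```

These arguments could be passed to a boto3 DynamoDB `Table.update_item(**kwargs)` call. When the condition fails, DynamoDB rejects the write with a `ConditionalCheckFailedException`, which the handler would catch in order to skip that device. Because the update uses `SET`, a later sign-off overwrites the earlier one, which matches the re-signoff behavior.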

Verification Criteria — Rule Engine

Data Verification Rules

All rules are stored in the AtriaVerificationRules DynamoDB table. Each rule is its own record with thresholds, appliesTo (device types), and enabled flag. The Lambda loads all enabled rules at verification time.

| Rule | Layer | Applies To | Description |
| --- | --- | --- | --- |
| DATA_FRESHNESS | presence | All sensors | Warns >6h, fails >12h since last reading. Runs IoT diagnostic chain for stale devices. |
| BATTERY_LOW | health | All sensors | Auto-detects voltage vs percentage. Voltage: warn <3.0V, fail <2.7V. Percentage: warn <20%, fail <10%. |
| SIGNAL_WEAK | health | All sensors | RSSI warn <-115 dBm, fail <-120 dBm. |
| CONSTANT_READING | sensibility | PWM, WM, PGM, PEM | 20 consecutive identical readings on consumption measures. |
| CONSUMPTION_CHECK | sensibility | PWM, WM, PGM, PEM | Daily/monthly consumption exceeds physical limits (e.g., PWM max 10,000 gal/day). |
| NEGATIVE_CONSUMPTION | sensibility | PWM, WM, PGM, PEM | Cumulative meter reading decreased. |
| ZERO_FLOW | sensibility | PWM, WM, PGM, PEM | Daily consumption is zero (meter unchanged for 24h). |
| SPIKE_DETECTION | sensibility | PWM, WM, PGM, PEM | Single delta exceeds 10x the rolling average. |
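
The threshold-based rules in this table (DATA_FRESHNESS, BATTERY_LOW, SIGNAL_WEAK) reduce to simple comparisons. The sketch below uses the thresholds listed above; the function names are hypothetical. The document does not say how voltage is distinguished from percentage, so the cutoff used here (values at or below 5 are treated as volts) is purely an assumption, and it would misread a battery percentage of 5% or less as a voltage.

```python
def classify_low(value, warn_below, fail_below):
    """Lower is worse: 'fail' below fail_below, 'warn' below warn_below, else 'pass'."""
    if value < fail_below:
        return "fail"
    if value < warn_below:
        return "warn"
    return "pass"


def check_battery(level):
    """BATTERY_LOW: pick voltage or percentage thresholds based on the value's magnitude."""
    # Assumption: readings at or below 5 are volts; larger readings are percentages.
    if level <= 5.0:
        return classify_low(level, warn_below=3.0, fail_below=2.7)
    return classify_low(level, warn_below=20.0, fail_below=10.0)


def check_signal(rssi_dbm):
    """SIGNAL_WEAK: warn below -115 dBm, fail below -120 dBm."""
    return classify_low(rssi_dbm, warn_below=-115.0, fail_below=-120.0)


def check_freshness(hours_since_last_reading):
    """DATA_FRESHNESS: warn after 6 hours, fail after 12 hours without a reading."""
    if hours_since_last_reading > 12:
        return "fail"
    if hours_since_last_reading > 6:
        return "warn"
    return "pass"
```

In a deployed engine the numeric thresholds would be read from each rule's `Thresholds` field rather than written inline, so that admins can change them without a code deployment.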

Config Verification Rules

| Rule | Layer | Applies To | Description |
| --- | --- | --- | --- |
| SCALAR_PRESENCE | config | PWM, PGM, PEM | PulseScalarValue must exist. |
| DEVICE_TYPE_PRESENCE | config | All sensors | DeviceType must be set. |
| LWM_NO_INPUTS | config | LWM | LWM device must have _P1/_P2/_P3 pulse counter records. |

Note: Unit-to-DeviceType consistency (e.g., PWM uses water units, PEM uses electrical units) is already validated in the Installer applet at form submission time and does not need to be re-verified here.
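
The three config rules can be evaluated from device metadata alone. A minimal sketch under stated assumptions: `check_config` and the `DeviceEui` field name are hypothetical, pulse counter records are assumed to be identified by the device EUI followed by `_P1`, `_P2`, or `_P3`, and LWM_NO_INPUTS is read as "at least one such record must exist" because the rule name suggests it fires when there are none.

```python
PULSE_TYPES = {"PWM", "PGM", "PEM"}
LWM_SUFFIXES = ("_P1", "_P2", "_P3")


def check_config(device, known_record_ids):
    """Return the IDs of config rules that fail for one device metadata record."""
    failed = []
    device_type = device.get("DeviceType")

    if not device_type:
        failed.append("DEVICE_TYPE_PRESENCE")

    if device_type in PULSE_TYPES and device.get("PulseScalarValue") is None:
        failed.append("SCALAR_PRESENCE")

    if device_type == "LWM":
        eui = device["DeviceEui"]
        # Assumption: pulse counter records are keyed as "<EUI>_P1", "<EUI>_P2", "<EUI>_P3".
        if not any(f"{eui}{suffix}" in known_record_ids for suffix in LWM_SUFFIXES):
            failed.append("LWM_NO_INPUTS")

    return failed
```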

Model/Type Metadata (Config Store)

Store in DynamoDB table `VerificationConfig` or SSM parameters. Examples for different device types:

Automated Triggers

  1. Post-Install Event

  • Current implementation: When the VerifyAndHandoff component mounts on the Installer Review page, it automatically calls POST /data-verification/verify for the site. Results are returned directly in the response and displayed inline. No SNS, no polling.

  • Planned: Post-install SNS trigger after handleSubmitAll() for async verification.

  2. Scheduled (EventBridge)

  • EventBridge rule: every 1 hour (configurable).

  • Triggers Verification Lambda for all active sites (or sites with recent installs).

  • Planned: EventBridge scheduled verification. Results would be returned via the verify API (not stored in DynamoDB). Alert integration with SNS → Chime on warn/fail.

  3. Manual (API Call)

  • From Installer Review "Verify & Handoff" button.

  • From Site Report "Verify Now" button.

  • From Data Verification applet "Verify Site" or "Re-verify" button.

Notifications

  • SNS Topic: `atria-verification-alerts`

  • Chime Webhook: Configured as SSM parameter `/atria/verification/chime-webhook-url`

  • Alert format:

[VERIFICATION FAIL] Site: ABC1

Device: 0004A30B00ABCDE (LWL03A)

Issues:

  • Config: Pulse scalar value 1 outside expected range (0.01–100)

  • Data: 5 consecutive duplicate readings

Triggered by: installer_handoff

View: https://atria.example.com/data-verification?siteId=ABC1
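
A message in this format could be assembled from a verification result as sketched below. `format_alert` and its parameters are hypothetical; the layout follows the alert format shown above.

```python
def format_alert(site_id, device_eui, model, issues, triggered_by, base_url):
    """Build the alert text; issues is a list of (category, message) pairs."""
    lines = [
        f"[VERIFICATION FAIL] Site: {site_id}",
        f"Device: {device_eui} ({model})",
        "Issues:",
    ]
    lines += [f"  • {category}: {message}" for category, message in issues]
    lines += [
        f"Triggered by: {triggered_by}",
        f"View: {base_url}/data-verification?siteId={site_id}",
    ]
    return "\n".join(lines)


example_alert = format_alert(
    site_id="ABC1",
    device_eui="0004A30B00ABCDE",
    model="LWL03A",
    issues=[
        ("Config", "Pulse scalar value 1 outside expected range (0.01–100)"),
        ("Data", "5 consecutive duplicate readings"),
    ],
    triggered_by="installer_handoff",
    base_url="https://atria.example.com",
)
```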

Future: Chimes Notification System

The AtriaVerificationRules table is designed to integrate with a future Chimes notification system:

  • Chimes are assigned to users and have types: Site, Sensor, Gateway, User, Admin

  • Chimes can monitor verification rules — e.g., a Chime of type "Sensor" watches for `BATTERY_LOW` rule failures and notifies the assigned user

  • Clauses in notification emails reference rule details (ruleId, severity, threshold, current value)

  • In-app NotificationBoard shows verification alerts alongside other Chimes

  • Toggle/snooze: Users can snooze specific rule notifications per device or site

  • The rules table's ruleId, severity, thresholds, and appliesTo fields provide the data Chimes needs to determine which notifications to fire