Blocking Now — Must resolve before any feature work starts
What is the testing pyramid for this product?
Without a defined testing strategy, coverage is inconsistent across the monorepo and migration validation has no acceptance criteria. This must be agreed before the first feature PR is merged — retrofitting a test strategy onto existing code is significantly more expensive than building it in.
E2E: Critical paths only (Playwright for web · Maestro for mobile)
Integration: Heavy coverage, real Postgres in Docker (Vitest + testcontainers)
Unit: Domain logic and utilities (Vitest)
Option A — Comprehensive
Unit + integration + E2E for all layers. Highest confidence, highest implementation cost. Realistic for a greenfield product only if E2E is scoped tightly.
Option B ★ Recommended — Pragmatic
Integration-heavy (real Postgres, real RLS policies, real auth middleware). Unit tests for pure domain logic. Playwright for critical web paths (work order CRUD, login). Maestro for the mobile happy path. Skip E2E for admin-only screens until Phase 4.
Option C — Minimal
Unit tests only until post-launch. Low confidence in the integration layer — RLS policy bugs and migration errors go undetected until production.
How do you verify that RLS policies never leak one tenant's data to another?
Multi-tenant data isolation is the highest-stakes correctness requirement in the system. A single misconfigured RLS policy can expose one customer's work orders, assets, or personnel records to another customer. This is not a "test before GA" problem — it must be verified on every PR that touches the database layer.
Option A
Manual test scripts run before each release. Human error risk is high; this does not scale as the schema grows.
Option B ★ Recommended
Automated integration tests that create two isolated tenant contexts, insert data for each, and assert that queries run as one tenant return none of the other tenant's rows. The tests are written before any RLS policy ships to production, run on every PR that touches packages/db, and must pass before any migration is approved to go live.
Option C
Formal security audit of all RLS policies by a third party. Valuable before GA, but not a substitute for automated regression tests — an audit is a point-in-time assessment, not continuous verification.
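The two-tenant check described in Option B can be sketched as a small harness. This is a minimal illustration under stated assumptions, not the project's test suite: the `TenantRow` shape, the `RowsVisibleTo` query stand-in, and the in-memory fixtures are all hypothetical. In a real Vitest + testcontainers suite the query function would be asynchronous and would run against Postgres under each tenant's session so the actual RLS policies are exercised.

```typescript
// Hypothetical row shape; the real work order schema is not specified here.
interface TenantRow {
  id: string;
  tenantId: string;
}

// Stand-in for "run this query as tenant X". Assumption: a real suite would
// make this an async call over a connection carrying that tenant's claims.
type RowsVisibleTo = (tenantId: string) => TenantRow[];

interface IsolationResult {
  ownRowCount: number; // rows the viewing tenant is expected to see
  leakedRows: TenantRow[]; // rows that belong to some other tenant
}

function checkTenantIsolation(query: RowsVisibleTo, viewerTenantId: string): IsolationResult {
  const visible = query(viewerTenantId);
  return {
    ownRowCount: visible.filter((row) => row.tenantId === viewerTenantId).length,
    leakedRows: visible.filter((row) => row.tenantId !== viewerTenantId),
  };
}

// Checks every tenant as the viewer, since a policy can leak A's rows to B
// without leaking B's rows to A. Returns human-readable failures.
function isolationFailures(query: RowsVisibleTo, tenantIds: string[]): string[] {
  const failures: string[] = [];
  for (const tenantId of tenantIds) {
    const result = checkTenantIsolation(query, tenantId);
    if (result.ownRowCount === 0) {
      // An empty result would otherwise pass without proving anything.
      failures.push(`${tenantId}: sees none of its own rows`);
    }
    if (result.leakedRows.length > 0) {
      const ids = result.leakedRows.map((row) => row.id).join(", ");
      failures.push(`${tenantId}: can read foreign rows ${ids}`);
    }
  }
  return failures;
}

// In-memory fixtures so the sketch runs without a database.
const seededRows: TenantRow[] = [
  { id: "wo-1", tenantId: "tenant-a" },
  { id: "wo-2", tenantId: "tenant-b" },
];

// Behaves like a correctly scoped policy.
const withPolicy: RowsVisibleTo = (tenantId) =>
  seededRows.filter((row) => row.tenantId === tenantId);

// Behaves like a missing or misconfigured policy: every row is visible.
const withoutPolicy: RowsVisibleTo = () => [...seededRows];

console.log(isolationFailures(withPolicy, ["tenant-a", "tenant-b"])); // no failures
console.log(isolationFailures(withoutPolicy, ["tenant-a", "tenant-b"])); // one failure per tenant
```

The own-row check matters as much as the leak check: a query that returns nothing for everyone would otherwise look perfectly isolated.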
Needed Soon — Resolve before Phase 3 mobile work begins
Mobile testing: real devices or simulators?
Camera access, push notifications, and offline sync (WatermelonDB) do not work reliably in simulators. The field app depends on all three. CI can use simulators for regression, but a physical device is required for any meaningful QA of the features that differentiate the new app from the old one.
Option A — Simulators only
iOS Simulator + Android Emulator in CI. Fast, free, but camera, push notifications, and real network transitions are unreliable. Cannot validate offline sync behavior.
Option B ★ Recommended
Team-owned devices for daily QA (1x iPhone, 1x Android mid-range), iOS Simulator for CI regression. Physical devices are essential for camera, push notifications, and offline sync validation. Low cost, no vendor dependency.
Option C — Device farm
Sauce Labs or BrowserStack physical device farm. Broad device coverage, high cost (~$400–800/mo). Appropriate for public consumer apps; likely over-engineered for a B2B field tool with a controlled device environment.
What are the acceptance criteria for a successful customer migration?
Without defined migration acceptance criteria, there is no objective standard for declaring a migration complete. Teams default to "it looks right" — which means migration quality varies by customer and rollback decisions are made under pressure. These criteria must be written before the first beta migration.
Option A — Automated only
Automated validation script checks record counts and foreign key integrity. Fast, but misses semantic errors (e.g., work order statuses mapped to wrong enum values).
Option B ★ Recommended
Automated script (record count parity, relationship integrity, orphan check) plus manual spot-check: 10 randomly selected work orders compared field-by-field, 1 report output compared between old and new systems, and sign-off from the customer's primary contact before go-live.
Option C — Full parallel run
Both systems live simultaneously, all outputs compared. Maximum confidence, 2–4x the migration labor per customer. Justified only for the largest or highest-risk customers.
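The automated half of Option B (record count parity, relationship integrity, orphan check) and the random selection of work orders for the manual spot-check might look roughly like the sketch below. The `MigrationSnapshot` shape and both function names are assumptions made for illustration; the real export format of the old and new systems is not described here. Semantic errors such as wrongly mapped statuses are deliberately out of scope for the script, which is why the manual comparison remains part of the criteria.

```typescript
// Hypothetical minimal snapshot of one customer's data on either system.
interface MigrationSnapshot {
  assetIds: string[];
  workOrders: { id: string; assetId: string }[];
}

interface MigrationCheckResult {
  countParity: boolean; // same number of assets and work orders on both sides
  missingWorkOrderIds: string[]; // present in legacy, absent after migration
  orphanedWorkOrderIds: string[]; // migrated work orders pointing at a missing asset
  passed: boolean;
}

function validateMigration(
  legacy: MigrationSnapshot,
  migrated: MigrationSnapshot,
): MigrationCheckResult {
  const countParity =
    legacy.assetIds.length === migrated.assetIds.length &&
    legacy.workOrders.length === migrated.workOrders.length;

  const migratedIds = new Set(migrated.workOrders.map((wo) => wo.id));
  const missingWorkOrderIds = legacy.workOrders
    .map((wo) => wo.id)
    .filter((id) => !migratedIds.has(id));

  const migratedAssets = new Set(migrated.assetIds);
  const orphanedWorkOrderIds = migrated.workOrders
    .filter((wo) => !migratedAssets.has(wo.assetId))
    .map((wo) => wo.id);

  const passed =
    countParity && missingWorkOrderIds.length === 0 && orphanedWorkOrderIds.length === 0;
  return { countParity, missingWorkOrderIds, orphanedWorkOrderIds, passed };
}

// Picks up to `size` distinct IDs for the manual field-by-field comparison.
// `random` is injectable so a given selection can be reproduced; it is
// assumed to return values in [0, 1) like Math.random.
function pickSpotCheckSample(
  ids: string[],
  size: number,
  random: () => number = Math.random,
): string[] {
  const pool = [...ids];
  const sample: string[] = [];
  while (sample.length < size && pool.length > 0) {
    const index = Math.min(pool.length - 1, Math.floor(random() * pool.length));
    sample.push(...pool.splice(index, 1)); // remove the pick so it cannot repeat
  }
  return sample;
}
```

A migration run would call `validateMigration` first and only hand the `pickSpotCheckSample` result (size 10, per the criteria above) to a reviewer once `passed` is true.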
What are the performance targets the system must meet?
Performance testing cannot be designed without targets. Define these before Phase 2 ends so that performance tests can be written alongside feature development, not bolted on before launch. These targets also serve as SLA inputs for any enterprise contracts.
| Endpoint / Operation | Target | Rationale |
|---|---|---|
| API list endpoints (work orders, assets) | p95 < 300ms | Field workers refresh lists frequently; latency directly affects perceived app speed |
| Work order detail / single record | p95 < 150ms | Cached in Redis; should be near-instant |
| Report generation | p95 < 1s | Background job acceptable for large reports; inline acceptable for dashboard widgets |
| Mobile offline sync (typical WO queue) | < 5s | Field workers sync at shift start; >10s causes abandonment |
| Concurrent tenants (load test) | 50 tenants, 200 req/s | Covers current ~80 customer base with room for growth |
Recommended approach
Adopt the targets above. Write k6 load tests that assert these thresholds. Run them in CI against the staging environment on every release candidate. Do not defer target definition — "test and see" produces no baseline and no regression detection.
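To make "assert these thresholds" concrete, the sketch below computes a p95 from recorded latencies and compares it with the three p95 targets in the table. It is an illustration only: the operation labels (`list`, `detail`, `report`) and function names are hypothetical, and the nearest-rank percentile is one common definition that may differ slightly from what a load tool reports. In k6 itself the same check would normally be declared as a threshold on a metric such as `http_req_duration` (for example `p(95)<300`) rather than computed by hand. The sync and concurrency targets are not modeled here.

```typescript
interface LatencyTarget {
  operation: string;
  p95LimitMs: number;
}

// Limits copied from the targets table; the operation keys are hypothetical labels.
const latencyTargets: LatencyTarget[] = [
  { operation: "list", p95LimitMs: 300 },
  { operation: "detail", p95LimitMs: 150 },
  { operation: "report", p95LimitMs: 1000 },
];

// Nearest-rank percentile: the smallest sample such that at least `percent`
// percent of all samples are less than or equal to it.
function percentileNearestRank(samplesMs: number[], percent: number): number {
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((percent * sorted.length) / 100)); // 1-based
  const value = sorted[rank - 1];
  if (value === undefined) {
    throw new Error("no sample exists at the requested rank");
  }
  return value;
}

// Returns one message per target that is breached or has no data.
// Missing data counts as a failure so a skipped scenario cannot pass silently.
function findTargetBreaches(
  samplesByOperation: Map<string, number[]>,
  targets: LatencyTarget[],
): string[] {
  const breaches: string[] = [];
  for (const target of targets) {
    const samples = samplesByOperation.get(target.operation) ?? [];
    if (samples.length === 0) {
      breaches.push(`${target.operation}: no samples recorded`);
      continue;
    }
    const p95 = percentileNearestRank(samples, 95);
    if (p95 >= target.p95LimitMs) {
      breaches.push(`${target.operation}: p95 ${p95}ms is not below ${target.p95LimitMs}ms`);
    }
  }
  return breaches;
}
```

A release candidate gate would fail the build whenever `findTargetBreaches` returns a non-empty list, which is what gives the targets a regression-detecting role rather than a reporting one.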
Before Launch — Resolve before any customer goes live on production
Who performs security testing, and when?
The multi-tenancy layer (RLS + JWT claims) and authentication are the two highest-risk attack surfaces. CNIC and government customers will ask for evidence of security testing during procurement. An internal review is insufficient for these customers — they need a third-party attestation.
Option A — Internal only
OWASP Top 10 review performed by the engineering team. Lowest cost, no external attestation document, likely insufficient for government procurement.
Option B ★ Recommended
Third-party penetration test before GA, focused on multi-tenancy isolation, authentication flows, and API authorization. Produces a report that can be shared with compliance-tier customers. Remediation SLA: critical findings fixed before GA, high findings within 30 days.
Option C — Bug bounty
Public or invite-only bug bounty program. Ongoing rather than point-in-time, but not a substitute for a structured pre-launch assessment. Consider adding one post-launch in addition to, not instead of, a pen test.
What accessibility standard does the product target?
Many MainTrac customers are municipal or government agencies with accessibility procurement requirements. WCAG 2.1 AA is the standard cited in most US government accessibility procurement language. Retrofitting accessibility onto an existing product is 3–5x more expensive than building it in from the start — especially in a React Native app where accessible component patterns must be established early.
Option A — WCAG 2.1 A (minimum)
Covers only the most basic accessibility requirements. Insufficient for many government contracts.
Option B ★ Recommended — WCAG 2.1 AA
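The level most US government accessibility procurement asks for. Section 508 itself incorporates WCAG 2.0 AA by reference; WCAG 2.1 AA is a backward-compatible superset, so meeting it also satisfies Section 508. Build accessible component patterns into the design system from Phase 1. Include automated accessibility checks (axe-core in Playwright) in the E2E suite.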
Option C — No formal commitment
Maximizes short-term development velocity, but creates procurement blockers with any government customer and makes post-launch retrofitting very expensive.
What is the test environment topology and data strategy?
Without a defined environment topology, QA runs against inconsistent environments and production deployments have no validated staging gate. The data strategy for each environment determines whether tests reflect real-world behavior or produce false confidence with synthetic data.
Option A ★ Recommended — dev + staging + prod
Dev (local): Docker Compose with Postgres + Redis. Each engineer runs the full stack locally. Seed data from a shared fixture set.
Staging: Mirrors prod infrastructure (same ECS task definitions, same RDS instance class). Uses anonymized production data refreshed weekly. QA runs the full automated test suite against staging before any prod deployment. No prod deployment without a passing staging run.
Prod: Live customer data, automated backups, no direct test access.
Option B — dev + qa + staging + prod
Adds a dedicated QA environment between dev and staging. Increases ops cost and environment drift risk. Justified for larger teams; likely over-engineered for the initial build team size.
Option C — dev + prod only
Minimum overhead, maximum production risk. No gate between development and live customer data. Not acceptable for a multi-tenant SaaS product.