QA Decisions — MainTrac

Blocking Now 01

What is the testing pyramid for this product?

Without a defined testing strategy, coverage is inconsistent across the monorepo and migration validation has no acceptance criteria. This must be agreed before the first feature PR is merged — retrofitting a test strategy onto existing code is significantly more expensive than building it in.

E2E: critical paths only. Tooling: Playwright (web) · Maestro (mobile)

Integration: heavy coverage, real Postgres in Docker. Tooling: Vitest + testcontainers

Unit: domain logic and utilities. Tooling: Vitest

Option A — Comprehensive

Unit + integration + E2E for all layers. Highest confidence, highest implementation cost. Realistic for a greenfield product only if E2E is scoped tightly.

Option B ★ Recommended — Pragmatic

Integration-heavy (real Postgres, real RLS policies, real auth middleware). Unit tests for pure domain logic. Playwright for critical web paths (work order CRUD, login). Maestro for the mobile happy path. Skip E2E for admin-only screens until Phase 4.

Option C — Minimal

Unit tests only until post-launch. Low confidence in the integration layer — RLS policy bugs and migration errors go undetected until production.

Blocking Now 02

How do you verify that RLS policies never leak one tenant's data to another?

Multi-tenant data isolation is the highest-stakes correctness requirement in the system. A single misconfigured RLS policy can expose one customer's work orders, assets, or personnel records to another customer. This is not a "test before GA" problem — it must be verified on every PR that touches the database layer.

Option A

Manual test scripts run before each release. Human error risk is high; this does not scale as the schema grows.

Option B ★ Recommended

Automated integration tests that spin up two isolated tenant contexts, insert data for each, and assert that cross-tenant queries return no rows. These tests are written before any RLS policy merges to production, run on every PR that touches packages/db, and must pass before any migration is approved to go live.
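The core assertion of such a test can be sketched as a pure function, kept separate from the database plumbing. This is a minimal sketch, not the project's implementation: the `tenant_id` column, the `Observation` shape, and the table names are assumptions made for illustration. The integration test would collect one observation per tenant session and table (an unfiltered SELECT, so that RLS alone does the filtering) and pass them in.

```typescript
// Hypothetical row shape: assumes every tenant-scoped table has a tenant_id column.
type Row = { id: string; tenant_id: string };

// What one tenant session saw when it ran an unfiltered SELECT on one table.
interface Observation {
  table: string;
  viewer: string; // tenant whose session issued the query
  rows: Row[];
}

interface IsolationFailure {
  table: string;
  viewer: string;
  reason: string;
}

function checkTenantIsolation(
  observations: Observation[],
  seededPerTenant: number,
): IsolationFailure[] {
  const failures: IsolationFailure[] = [];
  for (const { table, viewer, rows } of observations) {
    // Any row owned by a different tenant is a leak.
    const foreign = rows.filter((row) => row.tenant_id !== viewer);
    if (foreign.length > 0) {
      failures.push({
        table,
        viewer,
        reason: `sees ${foreign.length} row(s) owned by another tenant`,
      });
    }
    // Guard against a vacuous pass: a policy that hides everything also leaks nothing.
    const own = rows.length - foreign.length;
    if (own !== seededPerTenant) {
      failures.push({
        table,
        viewer,
        reason: `sees ${own} own row(s), expected ${seededPerTenant}`,
      });
    }
  }
  return failures;
}
```

The test passes only when the returned list is empty. Checking that each tenant still sees its own seeded rows matters as much as the leak check, because an over-restrictive policy would otherwise pass silently.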

Option C

Formal security audit of all RLS policies by a third party. Valuable before GA, but not a substitute for automated regression tests — an audit is a point-in-time assessment, not continuous verification.

Needed Soon 03

Mobile testing: real devices or simulators?

Camera access, push notifications, and offline sync (WatermelonDB) do not work reliably in simulators. The field app depends on all three. CI can use simulators for regression, but a physical device is required for any meaningful QA of the features that differentiate the new app from the old one.

Option A — Simulators only

iOS Simulator + Android Emulator in CI. Fast, free, but camera, push notifications, and real network transitions are unreliable. Cannot validate offline sync behavior.

Option B ★ Recommended

Team-owned devices for daily QA (one iPhone, one mid-range Android), with the iOS Simulator for CI regression. Physical devices are essential for camera, push notifications, and offline sync validation. Low cost, no vendor dependency.

Option C — Device farm

Sauce Labs or BrowserStack physical device farm. Broad device coverage, high cost (~$400–800/mo). Appropriate for public consumer apps; likely over-engineered for a B2B field tool with a controlled device environment.

Needed Soon 04

What are the acceptance criteria for a successful customer migration?

Without defined migration acceptance criteria, there is no objective standard for declaring a migration complete. Teams default to "it looks right" — which means migration quality varies by customer and rollback decisions are made under pressure. These criteria must be written before the first beta migration.

Option A — Automated only

Automated validation script checks record counts and foreign key integrity. Fast, but misses semantic errors (e.g., work order statuses mapped to wrong enum values).

Option B ★ Recommended

Automated script (record count parity, relationship integrity, orphan check) plus manual spot-check: 10 randomly selected work orders compared field-by-field, 1 report output compared between old and new systems, and sign-off from the customer's primary contact before go-live.
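The automated half of this option could look roughly like the following. It is a sketch under stated assumptions: the `WorkOrder` and `Snapshot` shapes and all field names are hypothetical, and a real script would read both systems from their databases rather than from in-memory arrays. It checks count parity, finds work orders lost in transit, finds orphans whose asset no longer exists, and draws the random sample for the manual field-by-field comparison.

```typescript
// Hypothetical, simplified shapes for the data exported from each system.
interface WorkOrder { id: string; assetId: string; status: string }
interface Snapshot { workOrders: WorkOrder[]; assetIds: string[] }

interface MigrationReport {
  countParity: boolean;
  missingWorkOrderIds: string[];  // present in legacy, absent after migration
  orphanedWorkOrderIds: string[]; // migrated work orders pointing at a missing asset
  spotCheckIds: string[];         // random sample for the manual comparison
}

// Partial Fisher-Yates shuffle: the first k slots end up holding a uniform random sample.
function sampleIds(ids: string[], n: number, random: () => number): string[] {
  const pool = [...ids];
  const k = Math.min(n, pool.length);
  for (let i = 0; i < k; i++) {
    const j = i + Math.floor(random() * (pool.length - i));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  return pool.slice(0, k);
}

function validateMigration(
  legacy: Snapshot,
  migrated: Snapshot,
  sampleSize: number = 10,
  random: () => number = Math.random,
): MigrationReport {
  const migratedIds = new Set(migrated.workOrders.map((w) => w.id));
  const migratedAssets = new Set(migrated.assetIds);
  return {
    countParity:
      legacy.workOrders.length === migrated.workOrders.length &&
      legacy.assetIds.length === migrated.assetIds.length,
    missingWorkOrderIds: legacy.workOrders
      .filter((w) => !migratedIds.has(w.id))
      .map((w) => w.id),
    orphanedWorkOrderIds: migrated.workOrders
      .filter((w) => !migratedAssets.has(w.assetId))
      .map((w) => w.id),
    spotCheckIds: sampleIds(migrated.workOrders.map((w) => w.id), sampleSize, random),
  };
}
```

The script deliberately does not judge whether a status was mapped to the right enum value. That semantic check is what the sampled work orders and the customer sign-off are for.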

Option C — Full parallel run

Both systems live simultaneously, all outputs compared. Maximum confidence, 2–4x the migration labor per customer. Justified only for the largest or highest-risk customers.

Needed Soon 05

What are the performance targets the system must meet?

Performance testing cannot be designed without targets. Define these before Phase 2 ends so that performance tests can be written alongside feature development, not bolted on before launch. These targets also serve as SLA inputs for any enterprise contracts.

Endpoint / Operation	Target	Rationale
API list endpoints (work orders, assets)	p95 < 300ms	Field workers refresh lists frequently; latency directly affects perceived app speed
Work order detail / single record	p95 < 150ms	Cached in Redis; should be near-instant
Report generation	p95 < 1s	Background job acceptable for large reports; inline acceptable for dashboard widgets
Mobile offline sync (typical WO queue)	< 5s	Field workers sync at shift start; >10s causes abandonment
Concurrent tenants (load test)	50 tenants, 200 req/s	Covers current ~80 customer base with room for growth

Recommended approach

Adopt the targets above. Write k6 load tests that assert these thresholds. Run them in CI against the staging environment on every release candidate. Do not defer target definition — "test and see" produces no baseline and no regression detection.
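To make "assert these thresholds" concrete, the check a load test performs on each latency target can be sketched in a few lines. This is not a k6 script: it is a plain TypeScript illustration of the underlying arithmetic, using the nearest-rank definition of a percentile (one of several in common use). The operation keys are hypothetical labels for three rows of the table above.

```typescript
// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// p95 targets in milliseconds, taken from the table; the keys are illustrative labels.
const P95_TARGETS_MS: Record<string, number> = {
  "list endpoints": 300,
  "record detail": 150,
  "report generation": 1000,
};

interface TargetResult {
  operation: string;
  p95Ms: number;
  targetMs: number;
  pass: boolean;
}

function checkTargets(samplesByOperation: Record<string, number[]>): TargetResult[] {
  return Object.entries(P95_TARGETS_MS).map(([operation, targetMs]) => {
    // A missing operation throws via percentile(): absent data must not count as a pass.
    const p95Ms = percentile(samplesByOperation[operation] ?? [], 95);
    // The targets are strict ("p95 < 300ms"), so equality fails.
    return { operation, p95Ms, targetMs, pass: p95Ms < targetMs };
  });
}
```

A release candidate would be rejected if any result has `pass` set to false. Storing the measured `p95Ms` per run, not only the pass/fail flag, is what gives the baseline for regression detection.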

Before Launch 06

Who performs security testing, and when?

The multi-tenancy layer (RLS + JWT claims) and authentication are the two highest-risk attack surfaces. CNIC and government customers will ask for evidence of security testing during procurement. An internal review is insufficient for these customers — they need a third-party attestation.

Option A — Internal only

OWASP Top 10 review performed by the engineering team. Lowest cost, no external attestation document, likely insufficient for government procurement.

Option B ★ Recommended

Third-party penetration test before GA, focused on multi-tenancy isolation, authentication flows, and API authorization. Produces a report that can be shared with compliance-tier customers. Remediation SLA: critical findings fixed before GA, high findings within 30 days.

Option C — Bug bounty

Public or invite-only bug bounty program. Ongoing rather than point-in-time, but not a substitute for a structured pre-launch assessment. Consider adding one post-launch as a complement to a pen test, not a replacement for it.

Before Launch 07

What accessibility standard does the product target?

Many MainTrac customers are municipal or government agencies with accessibility procurement requirements. WCAG 2.1 AA is the standard cited in most US government accessibility procurement language. Retrofitting accessibility onto an existing product is 3–5x more expensive than building it in from the start — especially in a React Native app where accessible component patterns must be established early.

Option A — WCAG 2.1 A (minimum)

Covers only the most basic accessibility requirements. Insufficient for many government contracts.

Option B ★ Recommended — WCAG 2.1 AA

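The level most US government accessibility procurement language asks for. Section 508 itself incorporates WCAG 2.0 AA; WCAG 2.1 AA is a superset, so conforming to 2.1 AA also covers the WCAG criteria Section 508 references. Build accessible component patterns into the design system from Phase 1. Include automated accessibility checks (axe-core in Playwright) in the E2E suite.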
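One way the automated check might decide pass or fail is to keep only the violations that carry a WCAG 2.0 or 2.1 A/AA rule tag and require that list to be empty. The sketch below assumes a simplified subset of the result shape axe-core reports (`violations`, each with `id`, `impact`, and `tags`); the interfaces are illustrative and the scan itself, which would run inside a Playwright test, is out of scope here.

```typescript
// Simplified subset of an axe-core result; real results carry more fields.
interface AxeViolation {
  id: string;
  impact: string | null;
  tags: string[];
}
interface AxeResults {
  violations: AxeViolation[];
}

// axe-core rule tags for WCAG 2.0 and 2.1 at levels A and AA.
const WCAG_21_AA_TAGS = new Set(["wcag2a", "wcag2aa", "wcag21a", "wcag21aa"]);

// Keep only violations that count against a WCAG 2.1 AA commitment.
function wcag21aaViolations(results: AxeResults): AxeViolation[] {
  return results.violations.filter((violation) =>
    violation.tags.some((tag) => WCAG_21_AA_TAGS.has(tag)),
  );
}

// One line per violation, suitable for a test failure message.
function describeViolations(violations: AxeViolation[]): string {
  return violations
    .map((violation) => `${violation.id} (${violation.impact ?? "unknown impact"})`)
    .join("\n");
}
```

Filtering by tag keeps best-practice findings from failing the build while still surfacing them in reports. Automated checks of this kind cover only part of WCAG, so they complement manual review and do not replace it.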

Option C — No formal commitment

Maximizes short-term development velocity, but creates procurement blockers with any government customer and makes post-launch retrofitting very expensive.

Before Launch 08

What is the test environment topology and data strategy?

Without a defined environment topology, QA runs against inconsistent environments and production deployments have no validated staging gate. The data strategy for each environment determines whether tests reflect real-world behavior or produce false confidence with synthetic data.

Option A ★ Recommended — dev + staging + prod

Dev (local): Docker Compose with Postgres + Redis. Each engineer runs the full stack locally. Seed data from a shared fixture set.

Staging: Mirrors prod infrastructure (same ECS task definitions, same RDS instance class). Uses anonymized production data refreshed weekly. QA runs the full automated test suite against staging before any prod deployment. No prod deployment without a passing staging run.

Prod: Live customer data, automated backups, no direct test access.
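The rule "no prod deployment without a passing staging run" can be enforced mechanically in the deploy pipeline. The following is a minimal sketch of such a gate, assuming a hypothetical `StagingRun` record keyed by commit SHA; how runs are stored and how the pipeline calls the gate are left open.

```typescript
// Hypothetical record of one full-suite run against staging.
interface StagingRun {
  commitSha: string;
  passed: boolean;
  finishedAt: Date;
}

interface GateDecision {
  allowed: boolean;
  reason: string;
}

function canDeployToProd(commitSha: string, runs: StagingRun[]): GateDecision {
  // Only runs of the exact commit being deployed count.
  const forCommit = runs.filter((run) => run.commitSha === commitSha);
  if (forCommit.length === 0) {
    return { allowed: false, reason: "no staging run for this commit" };
  }
  // Use the most recent run, so a later failure overrides an earlier pass.
  const latest = forCommit.reduce((a, b) =>
    b.finishedAt.getTime() > a.finishedAt.getTime() ? b : a,
  );
  return latest.passed
    ? { allowed: true, reason: "latest staging run passed" }
    : { allowed: false, reason: "latest staging run failed" };
}
```

Matching on the commit SHA, not on "the last staging run", prevents a green run of an older build from unlocking a newer, untested one.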

Option B — dev + qa + staging + prod

Adds a dedicated QA environment between dev and staging. Increases ops cost and environment drift risk. Justified for larger teams; likely over-engineered for the initial build team size.

Option C — dev + prod only

Minimum overhead, maximum production risk. No gate between development and live customer data. Not acceptable for a multi-tenant SaaS product.