Engineering Decisions

Blocking Now — Must resolve before writing any authenticated or infrastructure code

Blocking Now 01

Auth provider: Clerk or Auth0?

This decision gates the auth package and the entire permission model. Every authenticated route, organization model, and permission check depends on it. Check first whether VSI has an existing Auth0 enterprise contract — if credits or a negotiated rate exist, that changes the calculus.

Option A ★ Recommended (if no Auth0 contract)

Clerk — faster setup, the organization/membership model maps cleanly to multi-tenant SaaS, excellent Next.js and React Native SDKs, JWT custom claims are straightforward to configure.

Option B

Auth0 — enterprise pedigree, stronger compliance story for government customers. Prefer this if VSI already has a contract. More configuration required to achieve the same org model.

Option C

Self-hosted Keycloak — zero vendor cost, full control, required for any future on-prem/dedicated deployments. Significant ops burden to maintain and upgrade.
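
Because every route and permission check depends on the provider, one way to limit the blast radius of this decision is to keep provider-specific code behind a small neutral type. The following is a minimal sketch under that assumption. The names AuthContext, Role, Permission, hasPermission, and fromClaims, the claim names, and the role table are all hypothetical; none of them is a Clerk, Auth0, or Keycloak API.

```typescript
// Hypothetical provider-neutral shape the auth package could expose.
// An adapter per provider would map verified JWT claims into this type.
type Role = "admin" | "manager" | "technician";
type Permission = "workorder:read" | "workorder:write" | "org:manage";

interface AuthContext {
  userId: string;
  orgId: string; // tenant / organization the session is scoped to
  role: Role;
}

// Illustrative role-to-permission table; the real permission model is undecided.
const ROLE_PERMISSIONS: Record<Role, readonly Permission[]> = {
  admin: ["workorder:read", "workorder:write", "org:manage"],
  manager: ["workorder:read", "workorder:write"],
  technician: ["workorder:read"],
};

function hasPermission(ctx: AuthContext, permission: Permission): boolean {
  return ROLE_PERMISSIONS[ctx.role].includes(permission);
}

function isRole(value: unknown): value is Role {
  return value === "admin" || value === "manager" || value === "technician";
}

// Maps already-verified claims into an AuthContext, or null if they are unusable.
// The claim names "sub", "org_id", and "role" are assumptions for illustration.
function fromClaims(claims: Record<string, unknown>): AuthContext | null {
  const { sub, org_id, role } = claims;
  if (typeof sub !== "string" || typeof org_id !== "string") return null;
  if (!isRole(role)) return null;
  return { userId: sub, orgId: org_id, role };
}
```

With this shape, route handlers would depend only on AuthContext and hasPermission, so switching providers later would mean rewriting the adapter rather than every permission check.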

Blocking Now 02

Cloud provider: AWS, Azure, or Railway-first?

Every infrastructure choice (ECS Fargate, RDS, RDS Proxy, S3) is cloud-specific. Check whether VSI has existing AWS or Azure accounts, credits, or enterprise agreements before deciding. Starting greenfield on Railway has the lowest operational overhead, but ECS and RDS are AWS services and are not available there.

Option A ★ Recommended (if existing VSI account)

AWS — aligns with the recommended stack (ECS Fargate, RDS, RDS Proxy, S3). Use existing accounts and credits. Most documentation and tooling assumes AWS.

Option B

Railway for first 12 months, then migrate to AWS. Lowest ops overhead to get running, no IAM/VPC complexity. Trade-off: a migration event at the worst possible time (post-launch growth).

Option C

Azure — prefer if VSI is on Microsoft stack with existing Azure credits. Good managed Postgres option (Azure Database for PostgreSQL). Less community tooling for this specific stack.

Blocking Now 03

CI/CD pipeline: what replaces the existing Jenkinsfile?

The existing RecTrac Jenkinsfile is built for the Progress/OpenEdge monolith. A Node/Bun/Expo/TypeScript monorepo needs a different CI/CD setup from day one. The Turborepo setup assumes a pipeline that can detect which packages a change affects and build and test only those. This must be chosen before the first PR is merged.

Option A ★ Recommended

GitHub Actions — zero new vendors, free for private repos at reasonable usage, excellent Expo EAS and Node ecosystem integrations, and well documented for this exact stack. Turborepo remote caching can be enabled from within a workflow.

Option B

Buildkite — if VSI already uses it for other products, reuse the existing agents and billing. Higher setup cost if starting fresh.

Option C

CircleCI — strong Node/Docker support, parallelism features. Additional vendor cost; no strong reason to prefer over GitHub Actions unless the team has existing expertise.

Needed Soon — Resolve before Phase 2 data layer work begins

Needed Soon 04

Connection pooling: RDS Proxy or PgBouncer?

Row-Level Security depends on a per-tenant setting (SET LOCAL app.current_tenant) that a pooler must never leak between clients: either the connection stays pinned to one client, or the setting is scoped to a single transaction. The two pooling options handle this differently. This decision must be made before writing the first database connection, because changing it later requires rewriting connection management across the codebase.

Option A ★ Recommended (if on AWS)

RDS Proxy — managed, AWS-native, connection pinning is supported (a pinned connection is not shared with other clients until the session ends), no additional container to operate. Works cleanly with Drizzle ORM. Far less ops overhead than PgBouncer.

Option B

PgBouncer as a sidecar container — portable (works on Railway, self-hosted, any cloud), open source, and its transaction pooling mode is compatible with issuing SET LOCAL inside each transaction. Requires container management.
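
Whichever pooler is chosen, the transaction-scoped approach can be sketched as a single helper that every query path goes through. This is a minimal sketch, not the project's connection layer: Queryable and withTenant are hypothetical names, and the Queryable interface is only a structural stand-in for whatever client the driver or ORM provides. It uses PostgreSQL's set_config(name, value, is_local) function with is_local set to true, which behaves like SET LOCAL but accepts a bound parameter.

```typescript
// Minimal structural type so the sketch does not depend on a specific driver.
// (Assumption: the real client exposes a compatible query(text, values) method.)
interface Queryable {
  query(text: string, values?: unknown[]): Promise<unknown>;
}

// Runs `work` inside a transaction with app.current_tenant set for that
// transaction only. Because is_local is true, the setting is discarded at
// COMMIT or ROLLBACK and cannot leak to the next client that reuses the
// pooled connection.
async function withTenant<T>(
  client: Queryable,
  tenantId: string,
  work: (client: Queryable) => Promise<T>,
): Promise<T> {
  await client.query("BEGIN");
  try {
    await client.query("SELECT set_config('app.current_tenant', $1, true)", [tenantId]);
    const result = await work(client);
    await client.query("COMMIT");
    return result;
  } catch (err) {
    await client.query("ROLLBACK");
    throw err;
  }
}
```

An RLS policy would then compare a tenant column against current_setting('app.current_tenant'). Funnelling all access through one helper like this is also what keeps a later pooler change from touching the whole codebase.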

Needed Soon 05

Run a WatermelonDB spike before committing to it for offline mobile sync

WatermelonDB is the recommended offline sync solution for the field app, but it requires a custom server-side sync endpoint and a conflict resolution strategy. Before the mobile sprint begins, a 1-week spike should validate that it works with the MainTrac data model (work orders, assets, combination logs) and that the team can implement the sync protocol.

Option A ★ Recommended

WatermelonDB + custom sync server — spike in Week 1 of Phase 3. Full offline capability, battle-tested in production React Native apps. The sync server is ~200 lines of Hono route code once the conflict strategy is defined.

Option B

TanStack Query + optimistic updates only — no true offline, but vastly simpler. Acceptable if customers have reliable cell coverage in the field. Validate this assumption with beta customers before choosing.

Option C

PowerSync — managed sync service, handles the server side for you. Reduces implementation effort but adds a vendor dependency and monthly cost.
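
To make the scope of the Option A spike concrete, the server side of a pull can be sketched as a pure function. The change-set shape (created, updated, and deleted lists per table) follows the format described in the WatermelonDB sync documentation. Everything else is an assumption for illustration: the ServerRow columns, soft deletes that bump updatedAt, and the conflict rule that rejects a push when the server copy changed after the client's last pull.

```typescript
type RawRecord = { id: string } & Record<string, unknown>;

interface TableChanges {
  created: RawRecord[];
  updated: RawRecord[];
  deleted: string[];
}

type Changes = Record<string, TableChanges>;

// Hypothetical server-side row; the column names are assumptions.
interface ServerRow {
  id: string;
  data: Record<string, unknown>;
  createdAt: number; // epoch milliseconds
  updatedAt: number; // bumped on every write, including soft delete
  deletedAt: number | null;
}

// Builds the change set for one table since the client's last pull.
function buildPullChanges(table: string, rows: ServerRow[], lastPulledAt: number | null): Changes {
  const since = lastPulledAt ?? 0; // a first sync pulls everything
  const changes: TableChanges = { created: [], updated: [], deleted: [] };
  for (const row of rows) {
    if (row.updatedAt <= since) continue; // unchanged since the last pull
    if (row.deletedAt !== null) {
      // Only report deletions of rows the client could have seen.
      if (row.createdAt <= since) changes.deleted.push(row.id);
    } else if (row.createdAt > since) {
      changes.created.push({ ...row.data, id: row.id });
    } else {
      changes.updated.push({ ...row.data, id: row.id });
    }
  }
  return { [table]: changes };
}

// Assumed conflict rule: a pushed update conflicts if the server copy was
// modified after the client's last pull; the client must pull and retry.
function hasPushConflict(serverRow: ServerRow | undefined, lastPulledAt: number): boolean {
  return serverRow !== undefined && serverRow.updatedAt > lastPulledAt;
}
```

The spike would need to confirm that a rule this simple is acceptable for work orders edited by several technicians, or whether per-field merging is required.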

Needed Soon 06

How is data extracted from Progress/OpenEdge for customer migrations?

Each customer migration requires extracting their data from the Progress database and loading it into the new PostgreSQL schema. The extraction approach affects how much control the team has over data validation and transformation. This must be designed before the first migration, not during it.

Option A ★ Recommended

Progress ABL export script — writes data to JSON or CSV. The team already knows ABL, the schema is well-understood, and a custom script gives full control over validation logic and transformation rules. Pair with a Node.js loader that validates and inserts into Postgres.

Option B

ETL tool (Airbyte, Fivetran) — automated connectors, but Progress/OpenEdge connectors are limited and may not handle the MainTrac schema's quirks. Significant setup time for limited gain.

Option C

Node.js script via Progress ODBC driver — avoids ABL, but ODBC connectivity from Node to Progress is fragile and poorly documented.
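
The validation half of the Node.js loader in Option A can be sketched as follows. This is an illustrative sketch only: the WorkOrderRow fields, the status values, and the function names are hypothetical, since the real MainTrac schema is not shown here. The point is that every exported record is treated as untrusted input, and rejected records are collected with a reason instead of aborting the whole load.

```typescript
// Hypothetical target shape for one exported record.
interface WorkOrderRow {
  id: string;
  assetId: string;
  status: "open" | "closed";
  openedAt: Date;
}

type ParseResult =
  | { ok: true; row: WorkOrderRow }
  | { ok: false; reason: string };

// Validates one record from the ABL export (parsed JSON, so typed as unknown).
function parseWorkOrder(raw: unknown): ParseResult {
  if (typeof raw !== "object" || raw === null) return { ok: false, reason: "not an object" };
  const { id, assetId, status, openedAt } = raw as Record<string, unknown>;
  if (typeof id !== "string" || id === "") return { ok: false, reason: "missing id" };
  if (typeof assetId !== "string" || assetId === "") return { ok: false, reason: "missing assetId" };
  if (status !== "open" && status !== "closed") return { ok: false, reason: "unknown status" };
  if (typeof openedAt !== "string") return { ok: false, reason: "missing openedAt" };
  const opened = new Date(openedAt);
  if (Number.isNaN(opened.getTime())) return { ok: false, reason: "invalid openedAt" };
  return { ok: true, row: { id, assetId, status, openedAt: opened } };
}

// Splits an export into rows ready to insert and a rejection report.
function partitionExport(records: unknown[]): {
  valid: WorkOrderRow[];
  rejected: { index: number; reason: string }[];
} {
  const valid: WorkOrderRow[] = [];
  const rejected: { index: number; reason: string }[] = [];
  records.forEach((record, index) => {
    const result = parseWorkOrder(record);
    if (result.ok) valid.push(result.row);
    else rejected.push({ index, reason: result.reason });
  });
  return { valid, rejected };
}
```

A rejection report like this gives the team something to review with each customer before cutover, which is where the control over validation logic mentioned above pays off.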

Needed Soon 07

How are customers routed between the old and new system during the strangler fig migration?

The strangler fig pattern requires a mechanism to route each customer to either RecTrac MainTrac (old) or the new standalone system. This routing mechanism must be in place before the first customer migrates and must be operable without a code deployment.

Option A ★ Recommended

Simple boolean flag on the tenants table (is_migrated). Flip it per customer at migration time. No new services, no new vendors. Graduate to a feature flag service (LaunchDarkly, Unleash) if per-feature granularity is needed later.

Option B

LaunchDarkly — full-featured, supports gradual rollouts and per-user flags. Adds $300–500/mo in cost and a vendor dependency before it's needed.

Option C

Unleash self-hosted — open source feature flags. Adds a service to operate; reasonable if the team already runs Unleash elsewhere.
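
Option A can be sketched in a few lines. This is a minimal sketch under stated assumptions: the Tenant shape mirrors a row of the tenants table with is_migrated exposed as isMigrated, the URLs are placeholders, and loadTenant is a hypothetical data-access function supplied by the caller.

```typescript
// Mirrors a row of the tenants table; isMigrated corresponds to is_migrated.
interface Tenant {
  id: string;
  isMigrated: boolean;
}

type Backend = "legacy" | "standalone";

// Placeholder URLs, not real endpoints.
const BACKEND_URLS: Record<Backend, string> = {
  legacy: "https://legacy.example.com",
  standalone: "https://app.example.com",
};

function resolveBackend(tenant: Tenant): Backend {
  return tenant.isMigrated ? "standalone" : "legacy";
}

// Looks the tenant up on every call, so flipping is_migrated in the database
// takes effect without a code deployment.
async function routeFor(
  tenantId: string,
  loadTenant: (id: string) => Promise<Tenant | null>,
): Promise<string> {
  const tenant = await loadTenant(tenantId);
  if (tenant === null) throw new Error(`unknown tenant: ${tenantId}`);
  return BACKEND_URLS[resolveBackend(tenant)];
}
```

If the lookup is cached for performance, the cache lifetime becomes the delay between flipping the flag and the customer actually moving, so it should be kept short during migration windows.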

Before Launch — Resolve before the first PR is merged on production code

Before Launch 08

What are the TypeScript and linting standards for the monorepo?

This must be established before the first PR is merged — not after six months of inconsistent code. In a Turborepo monorepo with multiple apps and packages, loose typing in one package propagates into others. The cost of retrofitting strict TypeScript after the fact in a multi-package repo is very high.

Recommendation: Strict TypeScript (strict: true, noUncheckedIndexedAccess: true) across all packages from day one. Single shared ESLint config in packages/config/eslint. Prettier enforced in CI. A zero-any policy: use unknown with type guards instead. These standards are non-negotiable in a shared codebase where a type error in packages/core can silently break both apps/api and apps/mobile.

Option A ★ Recommended

Strict TypeScript + ESLint + Prettier with a shared config package. Enforced in CI — PRs fail if type errors or lint violations exist. No exceptions without explicit suppression comments that require justification.

Option B

Flexible per-app config — each app/package sets its own strictness. Fast to start, accumulates inconsistency quickly across the monorepo boundary.

Option C

No enforced standard until post-launch. Maximizes short-term velocity, guarantees a painful refactor before any serious scale or contributor growth.
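
What the recommended settings mean in day-to-day code can be shown with a short sketch. The Asset type and the function names are hypothetical. The first function shows the effect of noUncheckedIndexedAccess, and the rest shows unknown with a type guard in place of any.

```typescript
// With noUncheckedIndexedAccess, indexing an array yields T | undefined,
// so the missing-element case has to be handled explicitly.
function firstAssetId(ids: string[]): string {
  const first = ids[0]; // string | undefined under the recommended config
  if (first === undefined) throw new Error("no asset ids provided");
  return first;
}

// Hypothetical domain type shared from a core package.
interface Asset {
  id: string;
  name: string;
}

// A type guard narrows unknown to Asset without resorting to any.
function isAsset(value: unknown): value is Asset {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return typeof candidate.id === "string" && typeof candidate.name === "string";
}

// JSON.parse returns any; assigning it to unknown forces validation
// before the value can be used as an Asset.
function parseAsset(json: string): Asset {
  const parsed: unknown = JSON.parse(json);
  if (!isAsset(parsed)) throw new Error("payload is not an Asset");
  return parsed;
}
```

The same code compiles with the flag off, which is the risk with per-package strictness: the unchecked variant type-checks in a loose package and then fails at runtime in a caller.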