DataStudios← 4 Star Hub/Architecture & Flow
4 Star × DataStudios · Artifact

Architecture & Flow

Current state, the 8 gaps, and the target state

Offline conversion flow — ad click → RFQ → Quote → Order

How an ad click ties to revenue today, where it leaks, and the target state. Grounded in the live Supabase schema (supabase-data-dictionary.md), the "4 Star Data Setup" doc, and measured match rates (../findings/match-key-and-match-rate-2026-06-22.md). Items marked ⚠ are inferred / pending confirmation (e.g. the existing n8n workflow, which we can't yet read).

Current state

flowchart LR subgraph Web click["Ad click
gclid / msclkid"] --> form["Web RFQ form
/part_search/straightrfq.asp"] form -. "gclid, utm, ga_cid" .-> tids["deals.tracking_ids (JSON)"] form --> gtm["GTM: GA4 + Ads RFQ
(1072234740) + LinkedIn + CIO"] end offline["Phone / Email / Fax RFQ
(no ad click)"] --> crm form --> crm[("CRM — MS SQL Server")] rep["Sales rep (offline)"] --> crm crm -- "hourly sync" --> sb[("Supabase")] sb --> deals["deals
rfq/quote_sent/booked dates, value"] deals -- "hourly explode" --> touches["rfq_attribution_touches
gclid / msclkid / utm"] deals -. "LOSSY: Google Sheet import ⚠" .-> ads["Google Ads
Quotes / Orders actions"] n8n["n8n 'Daily 06:00 EST' ⚠"] -. "today ⚠" .-> ads

Stage by stage: 1. Ad click — Google/Bing ad click carries gclid/msclkid; GTM Conversion Linker sets the _gcl cookie. ~74% of touches carry a gclid; only web RFQs have any click. 2. RFQ submit — the web form mints an RFQ in the CRM and persists gclid/utm/ ga_cid into deals.tracking_ids. GTM fires GA4, the Ads RFQ conversion (1072234740), LinkedIn, and Customer.io. Phone/email/fax RFQs also create deals but have no click to attribute (~1/3 of RFQs). 3. Sync — CRM → Supabase hourly; an hourly function explodes tracking_ids into rfq_attribution_touches (the click join, keyed by deal_id). 4. Quote / Order — the rep advances the deal offline; quote_sent_date, booked_date, order_total are set on the same deals row. 5. Upload to platforms (the part we replace) — a lossy Google Sheet import feeds offline conversions to Google Ads ⚠, alongside an existing n8n "Daily @ 06:00 EST" workflow we have not yet been able to read ⚠. campaign_daily_metrics separately stores platform-reported rfqs/quotes/orders/cost by day.

Where it leaks (gaps)

# Gap Impact Evidence
1 Lossy Google Sheet upload — manual, no idempotency, error-prone Mis/over/under-counted conversions; no audit client-stated; pain #1
2 No per-row "did not import" log Can't see which rows failed → can't fix client pain #2
3 gclid-only matching — ECL (email/phone) not used as fallback ~1,860 web quotes + ~278 orders/90d unattributed match-rate findings; contacts has email/phone
4 90-day window loss — long B2B cycles outlast the gclid window ~828 web quotes/90d have a gclid but it's expired match-rate findings
5 Value signal weak — quotes often uploaded at $0 (rfq_value=0) Bidding can't optimize on revenue EXPIRED_EVENT/value probes; order_total is the real value
6 Bing/msclkid unrouted ~134 quote / 12 order clicks/90d ignored touches data
7 No systematic reconciliation (Obj 2) CRM vs GA4/Ads/Microsoft compared ad hoc client pain #3; campaign_daily_metrics unreconciled vs deals
8 GA4 offline parity missing — no RFI/order-id custom dimension events No observed-side view of offline conversions platform audit

Future state

flowchart LR sb[("Supabase
deals + touches + contacts")] --> poll["n8n: poll since
sync_watermarks"] poll --> match["Normalize + match key
gclid → ECL(hashed email/phone)
website-only, in-window"] match --> dry["validateOnly dry-run"] dry --> dm["Data Manager API"] dm --> ads["Google Ads
Quotes 7208316397 / Orders 7208513754
value = order_total"] dm --> ga4["GA4 events
+ RFI/order-id custom dim"] match --> log["Write-back import log
status / error / run_id"] log --> recon["Reconciliation (Obj 2)
deals vs campaign_daily_metrics
website-only, exclude unsettled days"] bing["Microsoft/Bing leg (msclkid) ⚠ scope"] -.-> match

What changes: - Direct Data Manager API replaces the Sheet — idempotent (transactionId = rfq_id), validateOnly dry-run before every write (closes gaps 1, 5). - Per-row import log written back to Supabase (status/error/run_id) → the "which rows didn't import" answer + a feed into reconciliation (gap 2). - gclid → ECL fallback: when no gclid, hash contacts.email/phone (filtered to valid + consented) → recovers the no-click web conversions (gap 3). - Value = order_total on orders for value-based bidding; quote value confirmed with the client (gap 5). - Reconciliation job compares deals (source of truth) to campaign_daily_metrics (platform-reported), website-only, ad platforms click-dated / GA4 event-dated, excluding unsettled days (gaps 7, 8). - Incremental via sync_watermarks; Bing leg added if scope is confirmed (gaps 6). - Window loss (gap 4) is largely structural (Google's 90-day limit vs. long sales cycles) — surfaced honestly in the reconciliation, partially mitigated by ECL and by uploading the earlier Quote conversion (closer to the click) in addition to the Order.

Build status against this flow

Extraction + matching + the Data Manager leg + a read-only account validation are built and verified against live data (fourstar.pipeline, scripts/validate_account.py). Remaining: ECL wiring (now data-unblocked via contacts), GA4 leg (RAPT-gated), reconciliation job, and orchestration inside the client's n8n.