industry · 2026-05-24 · 13 min read
The webhook retry mechanism has never worked since launch: schema drift made the cron crash silently (PR #529 II)
Sercan, chef-owner of "Mardin Eski Kapi" on Eski Kale Caddesi in Mardin, is in his 40s and has spent 16 months running a modern Mesopotamian kitchen across 2 locations (Mardin and Midyat). His CRM is Pipedrive, fed by the webhook https://crm.mardineskikapi.com/webhook/thmenu. 8 months ago the subscription panel read "Retry on 5xx with exponential backoff up to 5 attempts". A mid-May Sunday sales report showed 142 reservations (thMenu) against 128 deals (Pipedrive) = **14 dropped events**. The Webhook Delivery Log listed those 14 entries as "failed" with attempts: 1, where 5 had been promised. He wrote to support.

Engineering forensics first busted two wrong theories. (1) "5xx responses were never eligible for retry": ruled out by verifying that apps/web-admin/src/lib/webhooks/dispatch.ts does INSERT a pending row on a 5xx. (2) "The handleScheduled wiring is missing": ruled out by verifying that cloudflare/src/index.ts calls runCronSafe for webhook-retry at xx:00 and xx:30, and that Cloudflare Workers logs hold 8,640 webhook-retry entries for the last 6 months. **(3) The right theory: schema-vs-code drift**. Every one of those log entries contained the same message: D1_ERROR: no such column: payload. The cron hit the same crash on every tick, and the try/catch in runCronSafe swallowed it.

Migration 0011, shipped at launch, created payload_size_bytes INTEGER, attempt_number INTEGER, status_code INTEGER, delivered_at TEXT, error TEXT, status TEXT. Since launch the cron has been running a SELECT on event_id, **payload**, **attempts**, **next_attempt_at**, status, **last_status_code**, **last_response_at**. That is 5 column names that do not match the schema, and one of them, next_attempt_at, has no counterpart in the table at all. The schema implementer worked from one mental model, the cron implementer from another, and code review missed the gap.

The result was **16 months of silent retry failure**. The cron crashed on every tick and runCronSafe swallowed the error. The console.error logging was there, but nobody was watching the logs, there was no Sentry alert and no dashboard widget. Running SELECT status, COUNT(*) FROM webhook_delivery_log GROUP BY status returned 8,247 pending and 0 succeeded or dead.

**PR #529 batch II** is a 3-layer fix. **Layer 1, schema align**: migration 0074 added payload TEXT, attempts INTEGER, next_attempt_at TEXT, last_status_code INTEGER and last_response_at TEXT, backfilled the existing rows (attempts = attempt_number, etc.), and marked the old columns as deprecated.
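The mismatch described above is essentially a set difference between the columns the table has and the columns the code asks for. The following is a minimal sketch of such a check, not the code from PR #529: `findMissingColumns`, `launchSchema` and `cronSelectColumns` are hypothetical names, the row shape mirrors what SQLite's `PRAGMA table_info` returns, and `event_id` is assumed to exist in the launch schema because the post does not flag it as mismatched.

```typescript
// One row of `PRAGMA table_info(<table>)` output in SQLite.
interface TableInfoRow {
  cid: number;
  name: string;
  type: string;
  notnull: number;
  dflt_value: string | null;
  pk: number;
}

// Returns every referenced column that the table does not actually have,
// preserving the order in which the code references them.
function findMissingColumns(
  tableInfo: Pick<TableInfoRow, "name">[],
  referenced: string[],
): string[] {
  const present = new Set(tableInfo.map((row) => row.name));
  return referenced.filter((column) => !present.has(column));
}

// Columns from migration 0011 as listed in the post.
// ASSUMPTION: event_id is included because the post treats it as present.
const launchSchema: Pick<TableInfoRow, "name">[] = [
  "event_id",
  "payload_size_bytes",
  "attempt_number",
  "status_code",
  "delivered_at",
  "error",
  "status",
].map((name) => ({ name }));

// Columns the retry cron has been selecting since launch, per the post.
const cronSelectColumns: string[] = [
  "event_id",
  "payload",
  "attempts",
  "next_attempt_at",
  "status",
  "last_status_code",
  "last_response_at",
];

const missing = findMissingColumns(launchSchema, cronSelectColumns);
// missing is ["payload", "attempts", "next_attempt_at",
//             "last_status_code", "last_response_at"]
```

In a CI setting the `tableInfo` argument would come from a stored snapshot of the real `PRAGMA table_info` output, and a non-empty result would fail the build.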
**Layer 2, atomic retry claim**: to avoid a race condition between concurrent cron ticks, rows are claimed with `UPDATE ... RETURNING`, which SQLite supports from 3.35 and D1 supports as well: `UPDATE webhook_delivery_log SET status = 'in_progress', attempts = attempts + 1 WHERE event_id IN (SELECT ... WHERE status = 'pending' AND next_attempt_at < ? LIMIT 50) RETURNING event_id, payload, attempts, subscription_id, type`. Because the claim and the read happen in one statement, two overlapping ticks cannot pick up the same event.

**Layer 3, observability**: the catch branch of runCronSafe now sends a Sentry beacon, with an alert rule on [BEACON:cron_failed] that pages PagerDuty at a threshold of 5+ per hour; the dashboard gets a cron success rate widget; and a migration-drift-check cron (PR #333) dumps PRAGMA table_info for the critical tables and checks it against a manually maintained column list.

**Restore**: the first working run picked up the 8,247 pending events. About 480 were recovered because they were still inside the retry window (5 min to 6 h); about 6,200 were already dead (older than 6 h). Engineering built a "lost notifications" ZIP archive and sent DMs to the restaurant owners. Of Sercan's 14 failed events, 2 were recovered and 12 were entered into Pipedrive as deals by hand; he also received 6 months of the Pro tier for free and a sympathetic apology. 23 restaurants credited thMenu in a Twitter Spaces session.

Pattern: **on D1, PostgreSQL or MySQL, a schema migration and the code that depends on it (cron, handler, repository) are written in parallel by different developers, code review misses the schema drift, and try/catch wrappers turn the crash into a silent failure mode. Mitigation: (1) a PRAGMA table_info() dump as a vitest fixture, diffed in CI; (2) a Sentry beacon plus alert rule plus success-rate widget; (3) an atomic UPDATE RETURNING claim that is race-safe.**

Implementation checklist: (1) PRAGMA snapshot fixture; (2) Sentry beacon [BEACON:cron_failed] in the runCronSafe catch branch; (3) cron success rate widget; (4) migration-drift-check cron with a critical column list; (5) pre-merge CI guard covering migrations and repository code together; (6) atomic UPDATE RETURNING; (7) quarterly schema-vs-code audit.

Mathieu, of Maison Sainte-Croix in Saint-Gilles, Brussels, went through the same flow with the HubSpot version.
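The Layer 3 idea, a cron wrapper that still never throws but no longer fails quietly, can be sketched as follows. This is an illustration under stated assumptions, not the actual runCronSafe from the codebase: `runCronSafeWithBeacon`, `Reporter` and `CronResult` are hypothetical names, and the reporter is injected so that the sketch makes no claim about any real Sentry API.

```typescript
// Hypothetical reporter: in production this would forward to an error
// tracker; here it is just a callback so the sketch stays self-contained.
type Reporter = (tag: string, detail: { cron: string; message: string }) => void;

interface CronResult {
  ok: boolean;
  error?: string;
}

// Runs a cron job without letting an exception escape, but reports every
// failure under a fixed tag so an alert rule can count occurrences.
async function runCronSafeWithBeacon(
  name: string,
  job: () => Promise<void>,
  report: Reporter,
): Promise<CronResult> {
  try {
    await job();
    return { ok: true };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    // Keep the log line, but do not rely on anyone reading it.
    console.error(`[cron:${name}] ${message}`);
    // The tag matches the alert rule named in the post.
    report("[BEACON:cron_failed]", { cron: name, message });
    return { ok: false, error: message };
  }
}
```

Returning a `CronResult` instead of `void` is a design choice of this sketch: it lets the caller feed a success-rate counter for the dashboard widget without parsing logs.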