Success rate per service in the last 24 hours. A green ring at 100% means every API call succeeded; the ring turns yellow below 95% (some calls failing) and red below 80% (service is degraded). Click a card to filter the Recent Errors table below.
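The thresholds above can be sketched as a small helper (illustrative names, not the dashboard's actual code):

```javascript
// Map a 24h success rate (fraction in [0, 1]) to the health-ring color.
function ringColor(successRate) {
  if (successRate < 0.80) return "red";    // service is degraded
  if (successRate < 0.95) return "yellow"; // some calls failing
  return "green";                          // healthy
}
```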
If you see a red or yellow ring: That service is experiencing failures right now.
Most likely: The external API (Gemini, Deepgram, Google) is having an outage or rate-limiting us.
Also possible: A recent code change broke how we call that API (wrong params, missing auth).
What to do:
Click the card to filter the Recent Errors table below — read the error messages to diagnose.
Check the provider's status page (e.g. status.cloud.google.com) to see if it's on their end.
Free tier limits to watch:
Supabase: ~30 signups/min, 60 DB connections, 4 auth emails/hr, 500K edge invocations/mo, 500MB DB. If supabase is red, it may be time to upgrade to Pro ($25/mo).
Deepgram: 600 min/month per user, ~10 concurrent WebSocket connections, ~3GB max file upload. If deepgram is red, check if users are recording very long meetings (>4 hours). voice_transcribe errors indicate voice input issues in coach chat (auth, network, or empty recordings).
Audio Capture: If audio_capture is red, check for native CLI errors (permission denied, missing binary) or disk space issues. Common error codes: SPAWN_FAILED, EXIT_CODE_1, LOW_DISK_SPACE, EMPTY_AUDIO_FILE.
Gemini: Rate limits vary by model. If gemini is red during peak hours, multiple users may be analyzing meetings simultaneously. generate_prep_flashcards errors = meeting prep feature (uses gemini-2.0-flash, NOT gemini-2.5-flash, which produces empty output for this feature).
Resend: Email delivery for meeting prep flashcards. Free tier: 100 emails/day. send_prep_email errors = email delivery failures (check Resend dashboard for bounces/complaints). If resend is red, check RESEND_API_KEY Supabase Secret.
Google Calendar: 1M API calls/day per GCP project.
Outlook Calendar: Microsoft Graph API. oauth_connect errors = user's org may block third-party app consent (commonly error AADSTS65004). token_refresh errors = session expired or revoked. OAUTH_TIMEOUT = user didn't complete auth within 5 min. Web app uses the outlook-oauth-exchange Edge Function for token exchange. If Outlook errors spike, check that OUTLOOK_CLIENT_SECRET_WEB is set in Supabase Secrets and that the Azure app registration is still active.
Exa (Learning Feed): $10 free credit (~2500 searches at $0.004/search). generate_feed errors = feed generation failed (Exa search + Gemini curation pipeline). Common error codes: NETWORK_ERROR (can't reach Exa/Gemini), RATE_LIMITED (API quota), TIMEOUT (Edge Function took too long, typical >30s), SQLITE_SAVE_FAILED (local DB write error). Auto-refresh errors (AUTO_REFRESH_ANALYSIS_FAILED, AUTO_REFRESH_DIGEST_FAILED) are non-critical — user can still manual refresh. If exa credit exhausted, check dashboard.exa.ai billing.
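The free-tier limits listed above can be kept in one lookup table so alerts can annotate themselves. This is a hypothetical sketch — the key names and shape are illustrative, not the app's actual config:

```javascript
// Free-tier quotas from this runbook, keyed by service name as it appears
// on the dashboard cards.
const FREE_TIER_LIMITS = {
  supabase: { signupsPerMin: 30, dbConnections: 60, authEmailsPerHr: 4,
              edgeInvocationsPerMo: 500000, dbStorageMB: 500 },
  deepgram: { minutesPerMonthPerUser: 600, concurrentWebSockets: 10, maxUploadGB: 3 },
  resend:   { emailsPerDay: 100 },
  google_calendar: { apiCallsPerDayPerProject: 1000000 },
  exa:      { freeCreditUSD: 10, costPerSearchUSD: 0.004 }, // ~2500 searches
};

// Render a service's limits for an alert footer.
function describeLimits(service) {
  const limits = FREE_TIER_LIMITS[service];
  if (!limits) return "no free-tier limits recorded";
  return Object.entries(limits).map(([k, v]) => `${k}=${v}`).join(", ");
}
```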
Error Timeline
Failed API calls over time, stacked by severity. Red = hard errors (API returned an error response), orange = timeouts (API didn't respond in time), yellow = rate limited (hit quota/throttle), grey = aborted (request cancelled). A tall bar means many failures in that time bucket. Spikes suggest an outage or bug.
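The severity-to-color mapping above, as a sketch (the dashboard's real rendering code may differ):

```javascript
// Map a failure type to its bar color in the stacked timeline.
function barColor(failureType) {
  switch (failureType) {
    case "error":        return "red";    // API returned an error response
    case "timeout":      return "orange"; // API didn't respond in time
    case "rate_limited": return "yellow"; // hit quota/throttle
    case "aborted":      return "grey";   // request cancelled
    default:             return "grey";   // unknown types render as aborted
  }
}
```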
If you see a spike: Many API calls failed during that time window.
Most likely: An external API had a temporary outage or we hit a rate limit (e.g. Supabase free tier auth limits, Google Calendar API quota).
Also possible: Multiple users signed up or connected calendars simultaneously, overwhelming shared resources.
What to do:
Hover over the spike to see which services were affected. A single-service spike = that provider had issues. Multi-service spike = likely our network or Supabase was down.
If yellow (rate limited) bars dominate, you're hitting free tier limits. Supabase free tier: ~30 signups/min, 60 DB connections. Consider upgrading to Supabase Pro ($25/mo) or adding request throttling.
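One way to add the request throttling suggested above is a small concurrency cap on the client. This is a minimal sketch under the assumption that callers wrap each outbound request in a task function — it is not the app's existing code:

```javascript
// Create a throttle that runs at most `maxConcurrent` tasks at once,
// queueing the rest, so bursts don't exhaust the 60-connection pool.
function createThrottle(maxConcurrent) {
  let active = 0;
  const queue = [];
  const next = () => {
    if (active >= maxConcurrent || queue.length === 0) return;
    active++;
    const { task, resolve, reject } = queue.shift();
    task().then(resolve, reject).finally(() => { active--; next(); });
  };
  // Returns a promise for the task's eventual result.
  return (task) => new Promise((resolve, reject) => {
    queue.push({ task, resolve, reject });
    next();
  });
}
```

Usage: `const throttled = createThrottle(10); throttled(() => fetch(url))`.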
Latency Trends
Average response time per service over the selected period. A spike means that service was slow during that window. Sustained high latency (e.g. Gemini >10s, Deepgram >5s) may indicate API issues or large payloads.
If you see a latency spike: API calls to that service were abnormally slow during that window.
Most likely: The provider was under load (Gemini thinking models can take 10-30s during peak). Large payloads (long meeting transcripts) also increase latency.
Also possible: Network issues between the user's machine and the API, or the user was on a slow/unstable connection.
What to do:
If Gemini latency is consistently >15s, check if we're sending oversized prompts or if the model tier has changed.
If Deepgram latency spikes, check if audio files were unusually large (long meetings without live streaming).
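The latency thresholds above (Gemini >10s, Deepgram >5s, from this runbook) can be encoded in one place. A sketch with illustrative names:

```javascript
// Per-service "sustained latency is abnormal" thresholds, in milliseconds.
const LATENCY_THRESHOLDS_MS = { gemini: 10000, deepgram: 5000 };

// True if a service's average latency exceeds its known threshold.
function isLatencyAbnormal(service, avgMs) {
  const threshold = LATENCY_THRESHOLDS_MS[service];
  return threshold !== undefined && avgMs > threshold;
}
```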
Recent Errors
Individual failed API calls, newest first. Each row is one call that returned an error, timed out, was rate-limited, or was aborted. Use the service filter to narrow down. Hover over the message column to see the full error text.
If you see repeated errors here: The same API call is failing multiple times.
Most likely: A persistent issue — wrong API key, expired token, or the provider changed their API (breaking change).
Also possible: A transient issue that's already resolved — check timestamps. If the last error was hours ago and health is green, it's likely fixed.
What to do:
Look at the error_code and message columns. Common patterns:
"401" = auth issue (check API keys).
"429" / "RATE_LIMITED" = hitting free tier limits (consider upgrading).
"CONNECTION_POOL_FULL" = too many simultaneous DB queries (Supabase free tier: 60 connections).
"PGRST116" = missing database row (profile backfill needed).
"email not confirmed" = user signed up but never confirmed email (check Supabase email delivery or disable confirmation).
Filter by service to isolate the problem. If it's supabase/profile_create — a user may have signed up but their profile wasn't saved. If you see many sign_up successes followed by sign_in errors with "email not confirmed", the Supabase email provider may be rate-limiting (free tier: 4 emails/hr).
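The triage patterns above can be sketched as a lookup helper. This is purely illustrative, not part of the dashboard:

```javascript
// Return a first-pass diagnosis for a failed call from its error code
// and (optionally) its message text.
function diagnose(errorCode, message = "") {
  if (errorCode === "401") return "auth issue — check API keys";
  if (errorCode === "429" || errorCode === "RATE_LIMITED")
    return "hitting free tier limits — consider upgrading";
  if (errorCode === "CONNECTION_POOL_FULL")
    return "too many simultaneous DB queries (Supabase free tier: 60 connections)";
  if (errorCode === "PGRST116") return "missing database row — profile backfill needed";
  if (/email not confirmed/i.test(message))
    return "user never confirmed email — check Supabase email delivery";
  return "unrecognized — read the full error message";
}
```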
Table columns: Time, Service, Operation, Status, Error Code, Message, Latency.
Investigation Results
Automated root-cause analyses from the Gemini-powered error investigator. These run automatically when the app detects error patterns (3+ of the same error in 10 minutes). Red border = critical (needs immediate fix), yellow = warning (should investigate), blue = info (likely transient).
If you see an investigation here: The system detected a repeated error pattern and automatically analyzed it using Gemini.
Most likely: A real issue that needs attention — the same error happened 3+ times in 10 minutes, which rules out one-off glitches.
Also possible: A transient issue (marked with "transient" badge) that resolved itself — e.g. a brief API outage.
What to do:
Read the root cause and recommended action. If severity is "critical", act immediately. If "warning", investigate within 24 hours.
If no investigations appear but errors are present above, the service may not be monitored yet — check MONITORED_SERVICES in limitsMonitor.js.
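The trigger rule described above (3+ of the same error within 10 minutes) can be sketched as a sliding-window check. Names are illustrative; see limitsMonitor.js for the real monitoring logic:

```javascript
const WINDOW_MS = 10 * 60 * 1000; // 10-minute window
const MIN_REPEATS = 3;            // minimum identical errors to trigger

// timestamps: epoch-ms times of one error signature, in any order.
// True if any MIN_REPEATS of them fall inside a single window.
function shouldInvestigate(timestamps) {
  const sorted = [...timestamps].sort((a, b) => a - b);
  for (let i = 0; i + MIN_REPEATS - 1 < sorted.length; i++) {
    if (sorted[i + MIN_REPEATS - 1] - sorted[i] <= WINDOW_MS) return true;
  }
  return false;
}
```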