When you start a focus block with a task context ("writing my essay", "coding the API"), the AI reads a short snippet of your screen every 30 seconds and decides whether what you're doing matches your stated task.
If you drift — say you open Reddit while your task is "studying math" — the AI intervenes. On Nudge mode it force-quits the offending app with a brief overlay. On Hard mode it force-quits and locks you out for 5 minutes.
The AI never takes screenshots. It reads text only — visible DOM text in browsers (via the extension) and window titles / accessibility text on native apps. No images, no screen recording.