- The Next Input by Cylentis AI
The Next Input: Issue #068
Your New "Invisible Intern"
The Briefing (60 sec)
Google DeepMind unveils Gemini's "Computer Use" model. New day, new model: Gemini just learned how to use a computer better than most humans.
Anthropic expands global operations to India. The Claude empire goes east.
Wall Street explains how AMD's own stock will fund OpenAI's chip bill. The game keeps changing: finance meets silicon.
The Playbook: AI Computer Use Agent, the "Invisible Intern"
Mission: Deploy an AI that can autonomously handle on-screen workflows (filling forms, navigating apps, sending emails, and updating dashboards) with no API integrations required.
Difficulty: Expert | Build time: 3–5 hours (pilot)
ROI: Saves ≈ 15–25 h/week of repetitive manual tasks across ops, admin, and support functions.
0) Why This Matters
Gemini's "Computer Use" model is a breakthrough: AI can now work through many desktop and browser tasks the way you do, by moving the mouse, clicking buttons, reading screens, and reasoning across windows. It's the missing piece between "talking" and "doing."
1) Architecture
| Layer | Tooling | Purpose |
|---|---|---|
| Vision Input | Screen capture (Gemini / Rewind / Cursor) | See what's on screen |
| Reasoning Model | Gemini Computer Use / GPT-4V | Understand interface layout & intent |
| Controller | PyAutoGUI / Selenium / Playwright | Execute clicks, drags, keystrokes |
| Memory | Supabase / Redis | Track past actions, window states |
| Orchestrator | LangChain / AgentKit | Decide multi-step plans |
| Human-in-Loop | Slack / CLI Approval | Confirm risky actions (send, delete, submit) |
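One way to keep these layers swappable is to put a thin interface in front of the Controller. The sketch below is an illustration only, not Gemini's or PyAutoGUI's API: `Action`, `Controller`, and `RecordingController` are hypothetical names, and the recording controller just logs what it would do, so it runs without a display.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Action:
    """One UI action requested by the reasoning layer (hypothetical shape)."""
    kind: str            # "click" or "type"
    x: int = 0
    y: int = 0
    text: str = ""


class Controller(Protocol):
    """Anything that can carry out an Action, e.g. a PyAutoGUI or Playwright wrapper."""
    def perform(self, action: Action) -> str: ...


class RecordingController:
    """Dry-run controller: records actions instead of touching the screen."""

    def __init__(self) -> None:
        self.history: list[str] = []

    def perform(self, action: Action) -> str:
        if action.kind == "click":
            # A real controller might call pyautogui.click(action.x, action.y) here.
            result = f"click at ({action.x}, {action.y})"
        elif action.kind == "type":
            # A real controller might call pyautogui.write(action.text) here.
            result = f"type {action.text!r}"
        else:
            raise ValueError(f"unsupported action kind: {action.kind}")
        self.history.append(result)
        return result


controller = RecordingController()
controller.perform(Action(kind="click", x=120, y=340))
controller.perform(Action(kind="type", text="Q3 metrics"))
```

Starting with a dry-run controller like this lets you review the agent's intended actions before a real mouse or keyboard is wired in.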
2) Workflow
Trigger
User says: "Update our customer dashboard and email today's metrics."
Observe
Model takes a screenshot of the screen or browser → identifies open apps.
Plan
LLM breaks task into steps:
Open Excel sheet.
Copy metrics.
Paste into dashboard CMS.
Export summary → attach → email team.
Execute
Controller performs actions in order, logging every move.
Verify
Slack message: "Dashboard updated. Draft email below. Send?"
Memory Update
Logs what was done, where, and why for reuse tomorrow.
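The six stages above can be sketched as a single loop. This is a minimal sketch under stated assumptions: `run_workflow` and its parameters are hypothetical, and the lambdas stand in for the real model, controller, and Slack approval so the example runs without a screen.

```python
from typing import Callable


def run_workflow(
    command: str,
    observe: Callable[[], str],
    plan: Callable[[str, str], list[str]],
    execute: Callable[[str], str],
    approve: Callable[[str], bool],
    memory: list[dict],
) -> bool:
    """Run one Trigger -> Observe -> Plan -> Execute -> Verify -> Memory pass."""
    screen = observe()                              # Observe: describe what is on screen
    steps = plan(command, screen)                   # Plan: break the command into steps
    results = [execute(step) for step in steps]     # Execute: act in order
    summary = f"{len(results)} steps done for: {command}"
    approved = approve(summary)                     # Verify: ask a human before finishing
    memory.append(                                  # Memory update: keep a reusable record
        {"command": command, "steps": steps, "results": results, "approved": approved}
    )
    return approved


# Stand-in components (assumptions, not real integrations).
memory_log: list[dict] = []
ok = run_workflow(
    command="Update dashboard and email metrics",
    observe=lambda: "Excel and dashboard CMS are open",
    plan=lambda cmd, screen: ["open sheet", "copy metrics", "paste into CMS", "draft email"],
    execute=lambda step: f"done: {step}",
    approve=lambda summary: True,
    memory=memory_log,
)
```

Passing each stage in as a function keeps the loop testable: swap a lambda for a real model call or Slack prompt once the flow behaves as expected.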
3) Example Prompt
SYSTEM: You are a computer operator agent.
INPUT: Screenshot + user command.
GOAL: Complete the request using on-screen applications.
RULES:
- Never click destructive actions without user confirmation.
- Always log {step, tool, action, result}.
- When uncertain, ask for clarification.
OUTPUT: JSON plan of clicks/keystrokes + summary of expected outcome.
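The "always log {step, tool, action, result}" rule could be enforced with one JSON object per line. A minimal sketch, assuming those four fields are all you need; `LogEntry` and `to_json_line` are hypothetical names, not part of any agent framework.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class LogEntry:
    """Fields taken from the RULES line: step, tool, action, result."""
    step: int
    tool: str
    action: str
    result: str


def to_json_line(entry: LogEntry) -> str:
    """Serialize one entry as a single JSON line (easy to append and search)."""
    return json.dumps(asdict(entry), sort_keys=True)


line = to_json_line(LogEntry(step=1, tool="browser", action="click 'Export'", result="ok"))
restored = LogEntry(**json.loads(line))   # round-trip back into a typed record
```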
4) Guardrails
Safety Checks:
Disable file deletions by default.
Require approval for financial or external emails.
Compliance:
Screen logging must redact PII before upload.
Boundaries:
Cap runtime to <10 minutes per command.
Limit accessible apps to a whitelist.
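These guardrails can be expressed as a small policy check that runs before every action. The sketch below is illustrative: the app names, action names, and the email-only redaction pattern are assumptions, and real PII redaction would need to cover far more than email addresses.

```python
import re
import time

ALLOWED_APPS = {"excel", "dashboard_cms", "mail"}      # assumed allowlist
NEEDS_APPROVAL = {"send_email", "submit_payment"}      # assumed risky actions
BLOCKED = {"delete_file"}                              # deletions off by default
MAX_RUNTIME_S = 10 * 60                                # runtime cap per command

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def check_action(app: str, action: str, started_at: float, now: float) -> str:
    """Return "block", "ask", or "allow" for a proposed action."""
    if now - started_at >= MAX_RUNTIME_S:
        return "block"
    if app not in ALLOWED_APPS or action in BLOCKED:
        return "block"
    if action in NEEDS_APPROVAL:
        return "ask"
    return "allow"


def redact(text: str) -> str:
    """Mask email addresses before a log line leaves the machine."""
    return EMAIL_RE.sub("[REDACTED_EMAIL]", text)


start = time.monotonic()
decision = check_action("mail", "send_email", started_at=start, now=start + 5)
clean = redact("Sent summary to jane.doe@example.com")
```

Returning "ask" rather than a boolean leaves room for the human-approval step instead of forcing a hard yes or no.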
5) Pilot Rollout (3 Hours)
Choose one repeatable task (e.g., updating CRM or dashboard).
Run Gemini Computer Use model in observation mode (no clicks yet).
Test "read-only" step extraction accuracy.
Enable control mode with human confirmation.
Log time saved vs manual execution.
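For the read-only test, one simple way to score step extraction is to compare the model's steps with a hand-written list. This is a rough sketch, assuming exact position-by-position matching after lowercasing; `step_accuracy` is a hypothetical helper, and a real pilot may want fuzzier matching.

```python
def step_accuracy(expected: list[str], extracted: list[str]) -> float:
    """Fraction of expected steps the model reproduced at the same position."""
    if not expected:
        return 0.0
    matches = sum(
        1
        for want, got in zip(expected, extracted)
        if want.strip().lower() == got.strip().lower()
    )
    return matches / len(expected)


# Hypothetical CRM task: the model got three of four steps right.
expected_steps = ["Open CRM", "Search customer", "Update status", "Save"]
extracted_steps = ["open crm", "Search customer", "Edit notes", "Save"]
accuracy = step_accuracy(expected_steps, extracted_steps)
```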
6) Metrics
Time saved per task.
Number of steps automated per workflow.
Accuracy of UI element selection.
Manual review interventions per week.
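The four metrics above can be rolled up from per-run records. A minimal sketch with made-up numbers: `RunRecord` and its fields are hypothetical log columns, not output from any particular tool.

```python
from dataclasses import dataclass


@dataclass
class RunRecord:
    """One automated run; every field is an assumed log column."""
    manual_minutes: float     # how long the task takes by hand
    agent_minutes: float      # how long the agent took
    steps_automated: int
    ui_hits: int              # UI elements selected correctly
    ui_attempts: int          # UI elements the agent tried to select
    needed_review: bool       # a human had to step in


def summarize(runs: list[RunRecord]) -> dict:
    """Aggregate runs into time saved, steps automated, UI accuracy, interventions."""
    if not runs:
        raise ValueError("need at least one run to summarize")
    attempts = sum(r.ui_attempts for r in runs)
    return {
        "minutes_saved_per_task": sum(r.manual_minutes - r.agent_minutes for r in runs) / len(runs),
        "avg_steps_automated": sum(r.steps_automated for r in runs) / len(runs),
        "ui_accuracy": sum(r.ui_hits for r in runs) / attempts if attempts else 0.0,
        "interventions": sum(r.needed_review for r in runs),
    }


# Illustrative data only.
week = [
    RunRecord(30, 6, 12, 11, 12, False),
    RunRecord(20, 5, 8, 7, 8, True),
]
report = summarize(week)
```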
Pro tip: Pair with AgentKit to make your Computer Use agent a "multi-surface" powerhouse: browsing, clicking, and API-calling seamlessly.
The Arsenal: Tools & Prompts
| Asset | What it does | Link |
|---|---|---|
| Gemini Computer Use Model | AI that controls real apps visually. | https://blog.google/technology/google-deepmind/gemini-computer-use-model/ |
| Playwright / Selenium | Browser automation at code level. | |
| Supabase | Tracks agent logs & session states. | |
| Prompt · UI Action Plan | Screenshot → step-by-step plan. | |
From screenshot + command, list UI actions in JSON:
[{step, element, action, target, expected_result}]
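Before handing a plan like this to a controller, it helps to check that every step carries the five fields the prompt asks for. A hedged sketch: `parse_plan` is a hypothetical helper and the sample reply is invented for illustration.

```python
import json

REQUIRED_KEYS = {"step", "element", "action", "target", "expected_result"}


def parse_plan(raw: str) -> list[dict]:
    """Parse the model's JSON reply and reject steps with missing fields."""
    plan = json.loads(raw)
    if not isinstance(plan, list):
        raise ValueError("plan must be a JSON array of steps")
    for index, item in enumerate(plan):
        if not isinstance(item, dict):
            raise ValueError(f"step {index} is not a JSON object")
        missing = REQUIRED_KEYS - set(item)
        if missing:
            raise ValueError(f"step {index} is missing: {sorted(missing)}")
    return plan


# Invented example reply, shaped like the prompt's output format.
reply = """[
  {"step": 1, "element": "Export button", "action": "click",
   "target": "toolbar", "expected_result": "CSV download starts"}
]"""
plan = parse_plan(reply)
```

Failing fast on a malformed plan is cheaper than discovering a missing target halfway through a click sequence.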
Free Office Hours
Want an AI that clicks, types, and thinks like your best intern?
Book a free 15-minute Office Hours slot: no sales pitch, just workflows solved.
→ Grab a slot: https://calendly.com/aaron-cylentis/the-next-input-office-hours
How Canva, Perplexity and Notion turn feedback chaos into actionable customer intelligence
Support tickets, reviews, and survey responses pile up faster than you can read.
Enterpret unifies all feedback, auto-tags themes, and ties insights to revenue, CSAT, and NPS, helping product teams find high-impact opportunities.
- Canva: created VoC dashboards that aligned all teams on top issues.
- Perplexity: set up an AI agent that caught revenue-impacting issues, cutting diagnosis time by hours.
- Notion: generated monthly user insights reports 70% faster.
Stop manually tagging feedback in spreadsheets. Keep all customer interactions in one hub and turn them into clear priorities that drive roadmap, retention, and revenue.
Game Over
Ship one "Computer Use" agent today. Tomorrow your AI will literally work beside you.
Share your win; you could headline Issue #069.
- Aaron
Automating the boring. Amplifying the brilliant.

