🎮 The Next Input — Issue #073

The AI That Tracks All Other AIs

Aaron Bost
October 16, 2025

⚡ The Briefing — 60 sec

Anthropic launches a new version of its scaled-down Haiku model. These models get better and cheaper every week. At some point, the bubble will pop—but until then? Let’s cook! 🔥
Google unveils Veo 3.1 with upgraded video generation tools. Speaking of model upgrades… we’re definitely living in “update season.”
Timbaland debuts AI artist TaTa—and the internet reacts. Timbo’s beats are legendary. The AI artist? We’ll give this one a polite pass.

🛠️ The Playbook — AI Model Observatory: Build Your Internal Model Tracker

Mission: Create an internal system that monitors new model releases, benchmarks capabilities, and recommends potential integrations for your business use cases.
Difficulty: Advanced | Build time: 3–4 hours (pilot)
ROI: Teams save ≈ 10–15 h/week in research and stay ahead of model innovation without getting lost in hype cycles.

0) Why This Matters

Between Haiku, Veo, Sora, Claude, and each week’s new “mini miracle,” AI now moves too fast to track by hand.
Your org needs a Model Observatory—a knowledge engine that filters noise, evaluates impact, and translates breakthroughs into actionable upgrades for your stack.

1) Architecture

Layer	Tooling	Purpose
Collector	Feedly AI / Apify / Hugging Face API	Aggregate model news & metadata
Classifier	Claude 3.5 / GPT-4o	Cluster updates by type: “Language”, “Image”, “Video”, “Multi-Modal”, “Infra”
Benchmark Engine	LM-Eval / Open Decomp	Score models against key metrics
Memory	Supabase / Airtable	Store {model_name, date, provider, score, relevance}
Interface	Notion / Looker Studio	Weekly summary dashboards
Alerts	Slack / Email Digest	Notify team when “relevant” models drop
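
To make the Memory row concrete, here is a minimal Python sketch of one record. The field names come from the table above; the `category` field, the 1-5 range check on `relevance`, and the `to_row` helper are assumptions added for illustration, not part of any Supabase or Airtable API.

```python
import datetime as dt
from dataclasses import dataclass, asdict

# Categories taken from the Classifier row in the table above.
CATEGORIES = {"Language", "Image", "Video", "Multi-Modal", "Infra"}


@dataclass
class ModelRecord:
    """One Memory row: {model_name, date, provider, score, relevance}."""
    model_name: str
    date: dt.date       # release date
    provider: str
    score: float        # benchmark score; the scale is left to you (assumption)
    relevance: int      # 1-5, matching the relevance prompt
    category: str = "Language"  # extra field, not in the original schema

    def __post_init__(self):
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category}")
        if not 1 <= self.relevance <= 5:
            raise ValueError("relevance must be between 1 and 5")

    def to_row(self) -> dict:
        """Flatten to a JSON-friendly dict, ready to send to your database."""
        row = asdict(self)
        row["date"] = self.date.isoformat()
        return row
```

Validating at construction time keeps bad LLM tags out of the database, which is cheaper than cleaning them up later.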

2) Workflow

Collect
- RSS/API feeds → “New Models” database (TechCrunch, Hugging Face, Anthropic, Google, OpenAI).
Classify
- LLM tags:
  - Language (GPT/Claude updates)
  - Image/Video (Veo, Midjourney)
  - Infra (H100s, TPU upgrades)
Score
- Evaluate performance via public benchmarks + context fit (speed, cost, latency).
Contextual Relevance
- LLM prompt filters: “Would this model materially improve our workflow or product?”
Store + Notify
- Append results to Supabase → push Slack digest every Friday.
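
The steps above can be wired together in a few lines. This is a rough Python sketch, not a finished build: the keyword lookup is a stand-in for the LLM tagger, and `run_pipeline`, its `relevance` callback, and the item fields (`title`, `summary`) are hypothetical names chosen for the example.

```python
from typing import Callable

# Keyword stand-in for the LLM tagger; a real build would call your LLM here.
KEYWORDS = {
    "Image/Video": ("veo", "midjourney", "video", "image"),
    "Infra": ("h100", "tpu", "gpu"),
}


def classify(item: dict) -> str:
    """Tag a feed item using the three buckets from the Classify step."""
    text = f"{item['title']} {item.get('summary', '')}".lower()
    for tag, words in KEYWORDS.items():
        if any(word in text for word in words):
            return tag
    return "Language"  # default bucket for GPT/Claude-style updates


def run_pipeline(
    items: list[dict],
    relevance: Callable[[dict], int],
    threshold: int = 4,
) -> list[dict]:
    """Classify each item, score its relevance, keep rows at or above threshold."""
    rows = []
    for item in items:
        row = dict(item, category=classify(item))
        row["relevance"] = relevance(row)
        if row["relevance"] >= threshold:
            rows.append(row)
    return rows  # these rows are what you would store and put in the digest
```

Passing `relevance` in as a function means you can test the pipeline with a dummy scorer first, then swap in the real LLM call.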

3) Example Prompt

Relevance Filter Prompt

SYSTEM: You are an AI procurement analyst.
INPUT: {model_description, benchmark_data, company_use_cases}.
TASK: Score 1-5 how relevant this model is to our operations.
If score ≥ 4, summarise:
- Business impact (1 line)
- Suggested integration
- Cost or latency considerations
OUTPUT JSON:
{model_name, provider, category, score, summary, integration_hint}
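
LLM replies do not always come back as clean JSON, so it helps to validate before storing anything. A minimal Python sketch, assuming the reply contains a single JSON object with the keys listed above; treating `summary` and `integration_hint` as required only when the score is 4 or higher is one reading of the prompt, not a rule from it.

```python
import json

ALWAYS_REQUIRED = {"model_name", "provider", "category", "score"}
HIGH_SCORE_REQUIRED = {"summary", "integration_hint"}  # only when score >= 4


def parse_relevance_reply(raw: str) -> dict:
    """Extract and validate the JSON object from a relevance-filter reply."""
    # Replies may wrap the JSON in prose, so take the outermost braces.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in reply")
    data = json.loads(raw[start:end + 1])

    missing = ALWAYS_REQUIRED - set(data)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")

    score = int(data["score"])
    if not 1 <= score <= 5:
        raise ValueError(f"score out of range: {score}")
    if score >= 4:
        missing = HIGH_SCORE_REQUIRED - set(data)
        if missing:
            raise ValueError(f"missing keys for a high score: {sorted(missing)}")

    data["score"] = score
    return data
```

Rejecting a malformed reply and retrying is usually safer than logging a half-filled row.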

4) Guardrails

Avoid Vendor Bias: Don’t trust benchmark claims without external data.
Filter Noise: Ignore models <10M params or without release notes.
Data Hygiene: Keep changelog per model (so you know when features actually matter).
Security: Validate any “downloadable” model sources to prevent malware.
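
The Filter Noise rule is easy to encode. A small Python sketch, where the `param_count` and `release_notes` field names are assumptions about how your collector stores metadata. Letting a release through when the parameter count is unknown is a design choice made here, since closed models often do not publish one.

```python
MIN_PARAMS = 10_000_000  # the 10M floor from the Filter Noise guardrail


def passes_noise_filter(release: dict) -> bool:
    """Keep a release only if it has release notes and is not under 10M params."""
    notes = (release.get("release_notes") or "").strip()
    if not notes:
        return False
    params = release.get("param_count")
    # Unknown size is allowed through; only a known, too-small count is rejected.
    if params is not None and params < MIN_PARAMS:
        return False
    return True
```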

5) Pilot Rollout — 2 Hours

Pull 10 most recent model releases via Feedly or Hugging Face API.
Run classification + scoring prompt.
Log top 3 “relevant” models to Airtable with summaries.
Share results via Slack digest (“This Week in Models”).
Review with leadership: which to test internally next week.
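
For the first pilot step, here is one way to pull recent releases from Hugging Face using only the Python standard library. Treat it as a sketch: the `/api/models` endpoint and its `sort`, `direction`, and `limit` parameters reflect my understanding of the public Hub API, so check the current docs before relying on them. The helper names and the output field names are made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

HF_MODELS_URL = "https://huggingface.co/api/models"


def recent_models_url(limit: int = 10) -> str:
    """Build the URL for the most recently modified public models."""
    query = urlencode({"sort": "lastModified", "direction": -1, "limit": limit})
    return f"{HF_MODELS_URL}?{query}"


def to_candidates(payload: list[dict]) -> list[dict]:
    """Keep only the fields the pilot needs; missing fields become None."""
    return [
        {
            "model_name": m.get("id") or m.get("modelId"),
            "task": m.get("pipeline_tag"),
            "last_modified": m.get("lastModified"),
        }
        for m in payload
    ]


def fetch_recent_models(limit: int = 10) -> list[dict]:
    """Network call: run it by hand, not inside unit tests."""
    with urlopen(recent_models_url(limit), timeout=30) as resp:
        return to_candidates(json.load(resp))
```

Call `fetch_recent_models(10)` once, eyeball the output, then feed those candidates into your classification and scoring prompt.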

6) Metrics

Avg time saved in R&D scanning.
% of identified models later adopted.
Benchmark accuracy vs vendor claims.
“Relevance hit rate” (score ≥ 4 models that become useful).
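
Two of these metrics are simple ratios. A short Python sketch, assuming each logged record carries a `score` plus two booleans you fill in during later reviews; `became_useful` and `adopted` are hypothetical field names.

```python
def relevance_hit_rate(records: list[dict], threshold: int = 4) -> float:
    """Share of models scored >= threshold that later proved useful (0.0 to 1.0)."""
    flagged = [r for r in records if r["score"] >= threshold]
    if not flagged:
        return 0.0
    return sum(1 for r in flagged if r["became_useful"]) / len(flagged)


def adoption_rate(records: list[dict]) -> float:
    """Percentage of all identified models that were later adopted."""
    if not records:
        return 0.0
    return 100 * sum(1 for r in records if r["adopted"]) / len(records)
```

A low hit rate usually means the relevance prompt is too generous, so tighten the use-case description you pass in before changing the threshold.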

Pro tip: Add a “Retirement Policy”—flag models that become obsolete (e.g., GPT-3.5, Claude 1) to avoid paying for outdated endpoints.
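
A Retirement Policy can start as a single function. In this Python sketch the rule (flag anything superseded or older than a year) is an assumption for illustration, as are the `released` and `superseded_by` field names; pick thresholds that match your own contracts.

```python
import datetime as dt


def flag_for_retirement(
    endpoints: list[dict],
    today: dt.date,
    max_age_days: int = 365,
) -> list[str]:
    """Return names of endpoints that are superseded or older than max_age_days."""
    flagged = []
    for ep in endpoints:
        age_days = (today - ep["released"]).days
        if ep.get("superseded_by") or age_days > max_age_days:
            flagged.append(ep["model_name"])
    return flagged
```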

🎯 The Arsenal — Tools & Prompts

Asset	What it does	Link
Feedly AI	Curates new AI model releases.	https://feedly.com/ai
Hugging Face API	Pulls model metadata + versions.	https://huggingface.co/docs
Supabase	Database for structured model tracking.	https://supabase.com
Prompt · Weekly Digest	Curate and score top 5 new models.

From the last 7 days of releases, select top 5 models by relevance.
Output markdown digest:
- Model Name (Provider) — Category
- Key Upgrade
- Potential Business Use
- Score (1–5)
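
If you would rather build the digest from stored rows than ask an LLM each week, the same template can be rendered in code. A Python sketch under assumptions: the record keys (`key_upgrade`, `business_use`, and so on) are hypothetical, records are ranked by `score`, and the heading reuses the “This Week in Models” title from the pilot.

```python
import datetime as dt


def weekly_digest(records: list[dict], today: dt.date, top_n: int = 5) -> str:
    """Render the top_n highest-scoring releases from the last 7 days as markdown."""
    cutoff = today - dt.timedelta(days=7)
    recent = [r for r in records if cutoff <= r["date"] <= today]
    top = sorted(recent, key=lambda r: r["score"], reverse=True)[:top_n]

    lines = ["# This Week in Models", ""]
    for r in top:
        lines.append(f"- {r['model_name']} ({r['provider']}) - {r['category']}")
        lines.append(f"  - Key Upgrade: {r['key_upgrade']}")
        lines.append(f"  - Potential Business Use: {r['business_use']}")
        lines.append(f"  - Score: {r['score']}/5")
    return "\n".join(lines)
```

The returned string can be pasted straight into a Slack message or an email digest.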

💡 Free Office Hours

Want to build an AI Model Observatory that filters hype into action?
Book a free 15-minute Office Hours slot—no sales pitch, just workflows solved.

→ Grab a slot: https://calendly.com/aaron-cylentis/the-next-input-office-hours

How to pick the right global payroll model

Find your fit: Deel’s free guide breaks down 3 global payroll models with key benefits and tradeoffs for HR and finance teams.


🕹️ Game Over

Launch your Model Observatory today—tomorrow, you’ll stop chasing trends and start choosing winners.
Share your win; you could headline Issue #074.

— Aaron
Automating the boring. Amplifying the brilliant.
