Tiny Lab
Fast AI deployments that ship to production and measure real impact. Documenting what works, what fails, and what we learn along the way.
Customer Support Triage Agent
We built an AI agent to automate customer support ticket triage, shipped it live, and measured a 40% reduction in response time. The agent categorizes by urgency, routes to the right team, and suggests responses.
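The categorize-and-route flow can be sketched in a few lines. The team names, urgency labels, and the `classify_ticket` stub below are illustrative stand-ins, not our production code (the real classifier is a model call):

```python
from dataclasses import dataclass

@dataclass
class Ticket:
    subject: str
    body: str

# Hypothetical category -> team mapping for the sketch.
ROUTES = {
    "outage": "on-call",
    "billing": "payments-team",
    "how-to": "support-tier-1",
}

def classify_ticket(ticket: Ticket) -> tuple[str, str]:
    """Stand-in for the model call: returns (category, urgency)."""
    text = f"{ticket.subject} {ticket.body}".lower()
    if "down" in text or "outage" in text:
        return "outage", "high"
    if "invoice" in text or "charge" in text:
        return "billing", "medium"
    return "how-to", "low"

def route(ticket: Ticket) -> str:
    """Triage a ticket: classify, then route to the owning team."""
    category, urgency = classify_ticket(ticket)
    return f"{ROUTES[category]}:{urgency}"
```

Swapping `classify_ticket` for a real model call keeps the routing table and the rest of the pipeline deterministic and testable.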
Why Multi-Agent Coordination Failed
Case study: our multi-agent system looked great in demos but collapsed in production. The problem wasn't the agents—it was the coordination layer. Agents were waiting on each other, creating cascading timeouts. Fixed by making agents fully async and adding circuit breakers.
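A minimal circuit breaker of the kind described above: after a run of consecutive failures the breaker opens, and further calls fail fast instead of queuing behind a stalled agent. The thresholds and naming here are illustrative, not our production values:

```python
import time

class CircuitBreaker:
    """Fail fast after repeated failures; retry after a cooldown."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0):
        self.max_failures = max_failures   # consecutive failures before opening
        self.reset_after = reset_after     # seconds before allowing a trial call
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: failing fast")
            # Cooldown elapsed: half-open, allow one trial call through.
            self.opened_at = None
            self.failures = 0
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0  # any success resets the count
        return result
```

Wrapping each inter-agent call in a breaker like this turns a cascading timeout into a fast, local error the caller can handle.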
Sales Outreach Automation
Deployed an AI system that personalizes cold emails based on prospect data scraped from LinkedIn and company websites. Measured a 25% lift in reply rate and a 15% increase in meeting bookings within the first week.
Local LLMs vs API Calls
We spent 3 months testing whether local LLMs could replace API calls for our use case. Tried 6 models across four families (Llama, Mistral, Qwen, Gemma) on code generation, data extraction, and summarization. Local models matched API quality on two of the three use cases while cutting costs by 80%.
3-Step AI Validation Pattern
After running 5 different AI deployments, we noticed they all required a similar 3-step validation pattern: (1) Schema validation before model call, (2) Output format check after model call, (3) Business logic validation before using results. This catches 95% of model errors.
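The three steps above can be sketched as a single pipeline. The field names and the fake `call_model` are assumptions for the demo, not a specific deployment:

```python
import json

def validate_request(payload: dict) -> None:
    # Step 1: schema validation before the model call (catches bad prompts).
    for field in ("user_id", "prompt"):
        if field not in payload:
            raise ValueError(f"missing field: {field}")

def call_model(payload: dict) -> str:
    # Stand-in for the real model call; returns a JSON string.
    return json.dumps({"category": "billing", "confidence": 0.92})

def validate_output(raw: str) -> dict:
    # Step 2: output format check after the model call (catches bad outputs).
    data = json.loads(raw)
    if "category" not in data or "confidence" not in data:
        raise ValueError("malformed model output")
    return data

def apply_business_rules(data: dict) -> dict:
    # Step 3: business logic validation before using the result.
    if not 0.0 <= data["confidence"] <= 1.0:
        raise ValueError("confidence out of range")
    return data

def run(payload: dict) -> dict:
    validate_request(payload)
    return apply_business_rules(validate_output(call_model(payload)))
```

The key property is that each step raises early with a specific error, so a failure tells you which layer broke: the prompt, the model, or the downstream assumptions.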
Model Selection Framework
Framework we use to pick which model for which task: (1) Does it need reasoning? Use o1/o3. (2) Is latency critical? Use Haiku. (3) Is accuracy critical? Use Opus. (4) Is cost critical? Use local. Start with this decision tree before optimizing.
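The decision tree above, written as a function. The model names come from the post; the boolean flags and the fallback default are illustrative:

```python
def pick_model(needs_reasoning: bool = False,
               latency_critical: bool = False,
               accuracy_critical: bool = False,
               cost_critical: bool = False) -> str:
    """Walk the decision tree in priority order and return a model tier."""
    if needs_reasoning:
        return "o1/o3"
    if latency_critical:
        return "Haiku"
    if accuracy_critical:
        return "Opus"
    if cost_critical:
        return "local"
    return "Haiku"  # assumed default when no constraint dominates
```

Encoding the tree as code makes the priority order explicit: reasoning wins over latency, latency over accuracy, accuracy over cost.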
Token Limits Break Everything
Quick observation: models consistently fail when given inputs with special characters. Turns out we were hitting token limits we didn't know existed. Emoji and Unicode characters tokenize into way more tokens than expected. Always use tiktoken to count tokens before sending.
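For exact counts, use tiktoken as the post says. As a rough illustration of *why* emoji blow past budgets: BPE tokenizers fall back toward byte-level pieces for rare characters, so a single emoji can cost several tokens while a common English word costs about one. The byte-based estimate and the 4-chars-per-token ratio below are loose heuristics for the demo, not a real tokenizer:

```python
def rough_token_estimate(text: str) -> int:
    """Crude upper-bound token estimate; use tiktoken for exact counts."""
    ascii_chars = sum(1 for c in text if ord(c) < 128)
    non_ascii_bytes = len(text.encode("utf-8")) - ascii_chars
    # ~4 ASCII chars per token; assume ~1 token per non-ASCII byte (worst case).
    return ascii_chars // 4 + non_ascii_bytes

def check_budget(text: str, limit: int) -> None:
    """Fail before sending rather than after a silent truncation."""
    estimate = rough_token_estimate(text)
    if estimate > limit:
        raise ValueError(f"estimated {estimate} tokens exceeds limit {limit}")
```

Note the asymmetry: a 4-byte emoji estimates higher than a 4-letter English word, which is exactly the surprise that bit us in production.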
Always Validate JSON Schema Early
Model output parsing was our biggest source of errors until we added strict JSON schema validation. Now we validate schema before even calling the model (catches bad prompts) and after (catches bad outputs). Cut production errors by 70%.
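A minimal shape check in the spirit of the post. In production you would reach for a real validator (jsonschema, Pydantic); this hand-rolled version and the expected fields are illustrative:

```python
import json

# Hypothetical expected shape: field name -> required type.
EXPECTED = {"category": str, "confidence": float}

def validate_model_output(raw: str) -> dict:
    """Parse model output and fail loudly on any schema violation."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as e:
        raise ValueError(f"not valid JSON: {e}") from e
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for key, typ in EXPECTED.items():
        if key not in data:
            raise ValueError(f"missing key: {key}")
        if not isinstance(data[key], typ):
            raise ValueError(f"{key} should be {typ.__name__}")
    return data
```

The win is that every failure mode (truncated JSON, wrong shape, wrong types) surfaces as one specific `ValueError` at the boundary, instead of a `KeyError` three functions deeper.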
Data Pipeline Builder
Shipped a system that auto-generates ETL pipelines from natural language descriptions. Reduced data pipeline setup time from 3 days to 45 minutes for our analytics team.
Can AI Replace Manual Data Entry?
Tested whether vision models (GPT-4V, Claude Vision, Gemini Pro Vision) could extract structured data from scanned invoices and receipts. Ran 500 test cases across 3 document types. Vision models matched human accuracy (98%) but were 10x slower and 5x more expensive than expected.