Experiment
Testing hypotheses or exploring approaches to learn something new. Long-form investigations that document what we tried and what we learned.
EXP · 02
Local LLMs vs API Calls
We spent 3 months testing whether local LLMs could replace API calls for our workloads. We tried 6 models from four families (Llama, Mistral, Qwen, Gemma) across code generation, data extraction, and summarization. Local models matched API quality in 2 of the 3 use cases while cutting costs by 80%.
EXP · 01
Can AI Replace Manual Data Entry?
We tested whether vision models (GPT-4V, Claude Vision, Gemini Pro Vision) could extract structured data from scanned invoices and receipts, running 500 test cases across 3 document types. The models matched human accuracy (98%) but were 10x slower and 5x more expensive than expected.