Hopper Research

Prefill-first inference for open models

Starting with real-time voice

200ms TTFT on GLM-5.2
80ms TTFT on Qwen3.6-35B-A3B

BLOG

Optimizing TTFT for a Hybrid MoE model
Jashwanth Pedapudi
July 2026

Optimizing the LLM Client for Voice Agents
Jashwanth Pedapudi
June 2026

How to Make Diffusion TTS Faster
Pavan Katta
June 2026