Long-context understanding has been a persistent challenge in language model research. Despite architectural innovations (ALiBi, RoPE variants such as YaRN) and massive context window expansions (Claude 3.5 at 200k tokens, GPT-5 at 256k+), models still exhibit performance degradation on long inputs, a phenomenon known as "context rot." The community