Category: Reasoning & Inference Scaling

Week 22, 2026 — Reasoning & Reinforcement Learning for LLMs

Test-time compute and reasoning methods dominated this week’s research, with breakthroughs in self-verification, efficient sampling, and working memory mechanisms.

Self-Trained Verification Unlocks Both Test-Time and Training-Time Gains

Self-Trained Verification (STV) by Chen Henry Wu and Aditi Raghunathan addresses the central bottleneck in LLM self-improvement: the verifier. The key insight is that while a model cannot…

31 May 2026
Beyond the One-Shot: How Dynamic Inference Compute Is Reshaping AI Reasoning

34 papers surveyed | A year of progress in reasoning and inference-time compute scaling (May 2025 – May 2026) — For most of the last decade, the AI inference pipeline looked the same: you train a model, deploy it, and every query costs the same amount of compute. A simple factual lookup gets the same…

25 June 2025
