Case studies | project lazarus

Recent projects

Scaling laws for diffusion models

Diffusion models outperform autoregressive models in data-constrained settings.

SpatialReasoner

Builds an explicit 3D scene and reasons over it step by step, boosting accuracy and generalization on 3D spatial QA benchmarks.

Tensor decomposition for force-field prediction

Replaces heavy tensor operations in molecular force-field models with low-rank approximation, reducing compute while keeping accuracy.

Bifrost-1

Aligns VLMs with diffusion models through shared CLIP patch embeddings, enabling controllable high-quality generation while preserving reasoning.

BLEUBERI

Using simple BLEU scores as feedback on hard instructions can train instruction-following models that rival those tuned with expensive learned rewards.

OverLayBench

A dataset that stress-tests layout-to-image models on heavily overlapping scenes, exposing current failures and offering an improved baseline.

Industries we have touched

Bold ideas, funded and refined through the Lambda Research Grant. These are the projects shaping how AI learns, reasons, and scales — built by the researchers defining what’s next.

SAEBench

A comprehensive benchmark for sparse autoencoders in language model interpretability

Adam Karvonen, Can Rager, Johnny Lin, Curt Tigges, Joseph Bloom, David Chanin, Yeu-Tong Lau, Eoin Farrell, Callum McDougall, Kola Ayonrinde, Demian Till, Matthew Wearden, Arthur Conmy, Samuel Marks, and Neel Nanda — ICML 2025

VideoHallu

Evaluating and mitigating multi-modal hallucinations on synthetic video understanding

Zongxia Li, Xiyang Wu, Guangyao Shi, Yubin Qin, Hongyang Du, Tianyi Zhou, Dinesh Manocha, and Jordan Lee Boyd-Graber — NeurIPS 2025

VLM2Vec-V2

Advancing multimodal embedding for videos, images, and visual documents

Rui Meng, Ziyan Jiang, Ye Liu, Mingyi Su, Xinyi Yang, Yuepeng Fu, Can Qin, Zeyuan Chen, Ran Xu, Caiming Xiong, and others — arXiv preprint 2025

Think, prune, train, improve

Scaling reasoning without scaling models

Caia Costello, Simon Guo, Anna Goldie, and Azalia Mirhoseini — ICLR 2025 workshop

NeoBERT

A next-generation BERT

Lola Le Breton, Quentin Fournier, Mariam El Mezouar, and Sarath Chandar — TMLR 2025

Software archaeology

reading room

the orphaned-software lifecycle

the eight systems every trades company runs on

a field guide to sql database audits

why the 60-year-old contractor is right to be afraid

three automations every mid-sized shop quietly needs

what “ai front end” actually means when your developer is gone

How we bring old systems back

Case study: Lorem ipsum manufacturing

Case study: Lorem ipsum logistics

What gets reanimated

Field notes

Common patterns we see

Diffusion from scratch

Text2Video pretraining

GPU benchmarks

MLCommons benchmark

What clients say

Latent thought models

Product of experts with LLMs

DepR

Video MMLU

Latent adaptive planner

AimBot

Word salad chopper

DEL-ToM for theory-of-mind reasoning

VeriFastScore