Nuro AI Labs
Research

Open foundation research on the path to general intelligence.

We publish weights, training code and synthetic-data pipelines wherever the research permits. Apache 2.0 is the default. We don't believe the systems on the path to general intelligence should be closed boxes held by three companies.

Teams · Self-RAG · Memory architectures · Reasoning · Open Science
Featured paper
Released · April 2026 · Apache 2.0 · Sub-3B · Self-RAG

AVALON-2B

The first sub-3B language model that knows what it doesn't know.

1.88B parameters, built on Qwen 3.5 2B. A five-token reflection vocabulary, including [Retrieval], [No Retrieval], [Relevant] and [Utility:5], is paired with a 22M-parameter MiniLM router that reaches 90.5% accuracy. The model scores 82.5% Self-RAG token accuracy under LoRA and runs at 40 tok/s on Apple M3 and 12 tok/s on iPhone 15 Pro.

Akhil Ponnada · Naga Sri Arvapalli · Nuro AI Labs
REFLECT architecture · live
Example query: "What's in the news today about lithium prices?"
Router · MiniLM: 22M params, 5ms latency, 90.5% accuracy. Decides: [Retrieval]
Generation · Qwen 3.5 2B + LoRA: 1.88B, 18 GDN + 6 softmax. Emits: [Utility:5]
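
The two-stage flow in the diagram above (a small router decides whether to retrieve, then the generator answers and emits a reflection token) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the lab's implementation: `keyword_router`, `fake_retrieve`, `fake_generate` and `answer` are hypothetical names, and the keyword heuristic only stands in for the learned MiniLM router.

```python
from dataclasses import dataclass
from typing import Callable, List

RETRIEVAL = "[Retrieval]"
NO_RETRIEVAL = "[No Retrieval]"


@dataclass
class Answer:
    decision: str        # reflection token chosen by the router
    passages: List[str]  # retrieved context (empty when retrieval is skipped)
    text: str            # generator output


def keyword_router(query: str) -> str:
    """Hypothetical stand-in for the MiniLM router.

    The real router is a learned classifier; this heuristic simply
    flags queries that look time-sensitive.
    """
    fresh_markers = ("today", "latest", "news", "current", "price")
    q = query.lower()
    return RETRIEVAL if any(m in q for m in fresh_markers) else NO_RETRIEVAL


def fake_retrieve(query: str) -> List[str]:
    """Placeholder retriever: returns one stub passage."""
    return [f"(stub passage for: {query})"]


def fake_generate(query: str, passages: List[str]) -> str:
    """Placeholder generator: a real model would emit the utility token itself."""
    grounded = "with" if passages else "without"
    return f"(stub answer {grounded} retrieved context) [Utility:5]"


def answer(
    query: str,
    router: Callable[[str], str],
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str, List[str]], str],
) -> Answer:
    # Stage 1: the router decides whether external context is needed.
    decision = router(query)
    # Stage 2: retrieve only when the router asks for it.
    passages = retrieve(query) if decision == RETRIEVAL else []
    # Stage 3: the generator answers, conditioned on any retrieved passages.
    return Answer(decision, passages, generate(query, passages))


if __name__ == "__main__":
    result = answer(
        "What's in the news today about lithium prices?",
        keyword_router, fake_retrieve, fake_generate,
    )
    print(result.decision, result.text)
```

Passing the router, retriever and generator as plain callables keeps the sketch independent of any particular model runtime; each stub could be swapped for a real component without changing `answer`.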
Publications

Everything the lab has put out.

Papers, preprints and technical notes. Open source by default: Apache 2.0, with weights, code and data on Hugging Face.
Open by default

Apache 2.0 is the default for our research output. AVALON-2B ships with weights, GGUF quants and the synthetic-data recipe on Hugging Face and Ollama. PLMR and Hydra inherit the same posture. We don't believe the systems on the path to general intelligence should be held inside three companies.

Hiring

Come build the next paper with us. Research engineers, applied engineers, GTM.

We hire on a rolling basis. The bar is the work — show us a paper, a model, a system you shipped. The lab is small, the surface area is large, and every hire moves a research line.