New · Introducing AVALON-2B, the first sub-3B Self-RAG language model. →

Nuro AI Labs · 2026 · London · Companies House #16079959

Building personal and general intelligence.

An AI research lab. Personal intelligence is the through-line — but we work across language models, architectures, reasoning, agents and the systems on top.

We don't pre-announce. We ship.

Read AVALON-2B · Try Hypersave

Now live · AVALON-2B · Hypersave · Khyaa

Featured · Research

AVALON-2B — the first sub-3B language model that knows what it doesn't know.

1.88B parameters built on Qwen 3.5 2B. A five-token reflection vocabulary. A 22M-parameter MiniLM router at 90.5% accuracy. 82.5% Self-RAG token accuracy under LoRA fine-tune. Apache 2.0. Live on Hugging Face and Ollama.

[Retrieval][NoRetrieval][Utility][Relevant][Continue]

Read the paper · Hugging Face ↗

Akhil Ponnada · Naga Sri Arvapalli

REFLECT architecture · live

router 90.5%

›What's in the news today about lithium prices?

router · MiniLM

22M params · 5ms latency

decides:

[Retrieval]

generation · Qwen 3.5 2B + LoRA

1.88B · 18 GDN + 6 softmax

emits:

[Utility:5]
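
The flow above (the router picks `[Retrieval]` or `[NoRetrieval]`, the generator emits tokens such as `[Utility:5]`) can be consumed downstream with a small parser. This is a minimal sketch, not the AVALON-2B reference code: the five token names come from this page, while the bracketed `[Name]` / `[Name:score]` syntax and the helper names `parse_reflection` and `needs_retrieval` are assumptions made for illustration.

```python
import re

# Token names as listed on this page; the "[Name]" / "[Name:score]" syntax is assumed.
REFLECTION_TOKENS = ("Retrieval", "NoRetrieval", "Utility", "Relevant", "Continue")
_TOKEN_RE = re.compile(r"\[(%s)(?::(\d+))?\]" % "|".join(REFLECTION_TOKENS))


def parse_reflection(text):
    """Return (plain_text, tokens), where tokens is a list of (name, score) pairs.

    score is an int when the token carries one (e.g. "[Utility:5]"), else None.
    """
    tokens = [
        (m.group(1), int(m.group(2)) if m.group(2) else None)
        for m in _TOKEN_RE.finditer(text)
    ]
    # Strip the tokens and collapse the whitespace they leave behind.
    plain = " ".join(_TOKEN_RE.sub(" ", text).split())
    return plain, tokens


def needs_retrieval(tokens):
    """True if the (hypothetical) token stream asks for a retrieval step."""
    return any(name == "Retrieval" for name, _ in tokens)


# Illustrative output string, not real model output.
plain, tokens = parse_reflection("[Retrieval] Lithium prices rose today. [Utility:5]")
print(plain, tokens, needs_retrieval(tokens))
```

Keeping parsing separate from the retrieval decision means the same helper can be reused if the token syntax turns out to differ; only the regular expression would need to change.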

Latest from the lab

In production, in flight, and in development.

All research →

Released · April 2026

AVALON-2B

The first sub-3B language model with Self-Reflective Retrieval-Augmented Generation.

Apache 2.0 · Sub-3B · Self-RAG

Preprint imminent

PLMR

Pre-tokenizer Latent Memory Routing for byte-level language models.

Byte-level · Methodology · BLT extension

In active development

Hydra

A next-generation model architecture in active development.

Architecture · In flight

How we work · three concentric layers

Research → infrastructure → applied.

Same shape as Anthropic, OpenAI and DeepMind. Open-weights at the research layer, commercial at the product layer. The applied work isn't a distraction from the research — it's the test of it.

01 · Open by default

Research

Apache 2.0 papers and weights. AVALON-2B is live; PLMR preprint imminent; Hydra in active development.

AVALON-2B · PLMR · Hydra

02 · Commercial

Infrastructure

The cognitive memory layer for AI agents. SOC 2 Type II. Sub-200ms p95. Beats every adjacent system on LoCoMo.

Hypersave

03 · Commercial

Applied

Vertical and consumer products built on the same stack. Compliance-aware, citation-backed, on-device first.

Khyaa · Nuro Studio · Nuro Chat · Nuro One

Thesis · 01

The path to general intelligence runs through personal intelligence.

Read the full thesis →

We're a research lab — first and foremost. That means we work on the open questions across language models, architectures, reasoning, agents and the systems on top of them, and we publish what we learn.

Personal intelligence is the through-line that ties our work together — persistent memory, self-reflection, per-agent cognition. It's the part of intelligence the current frontier mostly skips, and we think it's a credible bottom-up route to the general intelligence everyone is racing toward top-down.

But personal intelligence isn't the only thing we work on. Some of our research is on language-model reflection (AVALON-2B). Some is on byte-level architectures and routing (PLMR). Some is on systems we haven't named yet (Hydra). Some ships as open weights; some ships as commercial infrastructure (Hypersave) or applied products (Khyaa, the Nuro stack). One lab, many bets.

agent.py · python

from hypersave import Memory
mem = Memory(agent_id="researcher")

# every turn, the agent both remembers and reflects
mem.write("Sarah's youngest is heading to Stanford in August.",
          sector="episodic")

ctx = mem.recall("What's top of mind for Sarah right now?")
# → cited, ranked, decay-weighted answer — not chunks

Hypersave SDK · TypeScript and Python · npm i @hypersave/sdk

Flagship infrastructure

Memory that thinks.

Hypersave is the cognitive memory layer for AI agents. Brain-inspired architecture: five cognitive sectors, each with its own Ebbinghaus decay curve. Knowledge graph + vector + keyword + RRF hybrid retrieval. Ten-stage query pipeline. Answer synthesis — not chunks.
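
The hybrid retrieval step named above fuses several ranked lists with RRF (reciprocal rank fusion). The sketch below shows the standard RRF formula, where each item scores the sum of 1 / (k + rank) over the lists it appears in. It is an illustration under assumptions, not Hypersave's implementation: the function name, the memory ids, and the constant k = 60 (a commonly used default) are not taken from this page.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of ids into one list, best first.

    Each id scores sum(1 / (k + rank)) over the lists that contain it,
    with rank starting at 1. Ties are broken by id so the order is stable.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from three retrievers (ids are made up).
vector_hits = ["m3", "m1", "m7"]
keyword_hits = ["m1", "m9", "m3"]
graph_hits = ["m1", "m3", "m4"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits, graph_hits])
print(fused)  # → ['m1', 'm3', 'm9', 'm4', 'm7']
```

RRF uses only rank positions, so it needs no score normalisation across retrievers whose raw scores are not comparable, such as cosine similarity and keyword relevance.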

LoCoMo: 86%

p95 latency: <200ms

Token savings: 92%

Fact accuracy: 94%

Try Hypersave · Read the docs

Cognitive sectors · Ebbinghaus decay

S(t) = S₀ · e^(-λt)

Time axis: 0d to 700d
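
The exponential decay curve above can be worked through numerically. This is a minimal sketch of the formula only: the function names are illustrative, and the per-sector half-lives are hypothetical values chosen for the example ("episodic" appears in the SDK sample on this page; "semantic" and both numbers are assumptions, not published Hypersave settings).

```python
import math


def retention(s0, decay_rate, days):
    """Memory strength after `days`, following S(t) = S0 * exp(-lambda * t)."""
    return s0 * math.exp(-decay_rate * days)


def decay_rate_for_half_life(half_life_days):
    """Decay rate lambda such that strength halves every `half_life_days`."""
    return math.log(2) / half_life_days


# Hypothetical half-lives in days; each sector gets its own decay rate.
SECTOR_HALF_LIFE_DAYS = {"episodic": 30.0, "semantic": 365.0}

for sector, half_life_days in SECTOR_HALF_LIFE_DAYS.items():
    rate = decay_rate_for_half_life(half_life_days)
    strengths = [round(retention(1.0, rate, t), 3) for t in (0, 30, 365, 700)]
    print(sector, strengths)
```

Expressing each sector by its half-life rather than by a raw decay rate keeps the configuration readable: after one half-life the strength is exactly half of its starting value, whatever the sector.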

LoCoMo · long-context memory benchmark

86%

Hypersave v1 · audited March 2026

Reported in good faith from internal benchmarks. Reproducible script will ship with the v1.1 release notes.

Sample brief · cited

14:45 → 15:00 meeting

Sarah Chen joins at 15:00. Last note flagged inflation; today's CPI print came in 30 bps below consensus^[1]. Recent touchpoints: Roth conversion (Oct 12), Stripe RSU vesting (Oct 4), 529 contribution limits (Sep 21)^[2]. Open commitments: Q3 statement due Friday; estate-attorney intro by Nov 1^[3].

[1] CRM note · 2026-03-12

[2] Meeting transcripts · Sep–Oct

[3] Voice capture · 2026-04-22

Khyaa · sample

Applied · Khyaa

Never walk into a client meeting cold.

Personal intelligence for US financial advisors. Pre-meeting briefs, on-device voice capture, natural-language Ask across your entire book — every answer cites its source.

200-word brief 15 minutes before each external meeting
1-button voice recorder — recap leaves your phone as text, not audio
Ask the book in English; every claim has a source
SEC marketing rule, Reg BI, Rule 204-2 retention aware

Try the beta, free · How it works →

Products · the personal-intelligence stack

Built on the same research stack.

All products →

Infrastructure · v1.0 live

Hypersave

Memory that thinks. The cognitive memory layer for AI agents.

Hypersave, Inc. · platform.hypersave.io

Applied · Free in beta

Khyaa

Never walk into a client meeting cold.

Khyaa · khyaa.com

Applied · Live

Nuro Chat

Multi-model chat with persistent memory.

Nuro · nuro.chat

Applied · Live

Nuro Studio

The studio for building on the Nuro stack.

Nuro · studio.nuro.one

Applied · Live

Nuro One

The consumer surface of the Nuro stack.

Nuro · nuro.one

Coming soon · In development

Prodclip

In active development.

Nuro

Open-weights · Released April 2026

AVALON-2B

Our first released foundation model. Apache 2.0.

Nuro AI Labs · huggingface.co/nuroai/Avalon-2B

Principles · how we operate

Seven things we hold ourselves to.

Stated up front so the next time you read a Nuro release, you can hold us to them, and so we don't lose the plot.

Open research, commercial products.

Our research output is open by default — weights, training code, and the synthetic-data pipelines that produced them. AVALON-2B is the proof. Our products are commercial: Hypersave, Khyaa and the Nuro stack are how we fund the lab. The split is deliberate, and it's how Anthropic, OpenAI, Mistral and DeepMind all actually operate.

Personal first.

Every advance we ship has to make at least one specific mind better — one user, one agent, one team — before it makes the world better. We do not chase abstract benchmarks at the cost of the unit we serve.

Compounding, not theatrical.

We ship in public, incrementally. We do not pre-announce. We do not promise products we have not built. The published research is the marketing.

Accountable.

Every claim is auditable. Every benchmark we publish has reproducible code. Every Khyaa answer cites its source. Every Hypersave query carries a confidence score. We treat citation as a first-class product feature.

Useful before grand.

A frontier lab that doesn't ship what it builds is a research department. We ship Hypersave, Khyaa and Nuro Studio alongside AVALON and PLMR. The applied work is not a distraction from the research — it is the test of it.

For every mind.

Personal intelligence is not luxury infrastructure. The cognitive memory layer should run on a phone, in a regulated bank, in an SME, in a classroom — not just in a hyperscaler. Geography of compute should not gate access to intelligence.

Patient about the destination.

General intelligence is the goal. We don't think it's three months away, and we don't think the path runs through a single 10-trillion-parameter pretraining run. We think it runs through compounding — through architecture, through memory, through reflection. We are willing to take a decade.

The labs that ship across the whole stack — open research, infrastructure, applied products — define the next decade of AI. The ones that pick one and stop don't.

The case for a research-first lab · Excerpt · Nuro AI Labs internal note · 2026

By the numbers · live

Research lines

AVALON · PLMR · Hydra

Shipped products

infra + applied + open-weights

86%

LoCoMo (Hypersave)

head-to-head SOTA

1.88B

AVALON-2B params

Apache 2.0 · on-device

Our journey · a short version

From an idea about memory to a shipping research lab.

Four chapters, in order. The longer one is below.

2024

Incorporated

Nuro AI Labs Limited registered in London (Companies House #16079959). The thesis: personal intelligence as the bottom-up route to general intelligence.

2025

Hypersave research

Brain-inspired memory architecture — five cognitive sectors, Ebbinghaus decay, hybrid retrieval, ten-stage query pipeline, answer synthesis.

How Hypersave works →

Q1 2026

Hypersave v1.0 GA

SOC 2 Type II. TypeScript + Python SDKs. 86% LoCoMo — beats every adjacent agent-memory system head-to-head.

Apr 2026

AVALON-2B released

The first sub-3B Self-RAG language model. Apache 2.0. Live on Hugging Face and Ollama.

Read the paper →

Trajectory · in public

Frontier-grade work, incrementally.

We don't pre-announce. Every milestone is a thing that shipped.

Nov 2024
Nuro AI Labs incorporated
Companies House #16079959. London-registered private limited.
Q1 2025
Hypersave research kicks off
Brain-inspired cognitive memory architecture for AI agents — five sectors, Ebbinghaus decay curves, hybrid retrieval.
Q3 2025
Khyaa first design partners
Personal intelligence for US financial advisors. Free beta with named RIA design partners.
Q1 2026
Hypersave v1.0 GA
SOC 2 Type II. TypeScript + Python SDKs. 86% LoCoMo — beats every adjacent agent-memory system head-to-head.
Apr 2026
AVALON-2B released
First sub-3B Self-RAG language model. Apache 2.0. Live on Hugging Face and Ollama.
Next
PLMR preprint · AVALON-3 · Hydra
Continued open research. Continued commercial product. Same compounding.

Team · the lab

A small lab. High signal.

Founded November 2024. We hire research engineers, applied engineers and GTM operators on a rolling basis.

01 / 03

Akhil Ponnada

Founder & CEO

Author on AVALON-2B and PLMR. Sets research direction and overall company strategy. MSc International Business Management, Heriot-Watt University; BBA, Amity University.

02 / 03

Naga Sri Arvapalli

CTO

Co-author on AVALON-2B; led the AVALON training pipeline. Owns the technical architecture across research, Hypersave infrastructure and the applied product stack.

03 / 03

Naveen Yelloji

Executive Director

Senior CXO-level operator with three decades of leadership across AI, media & entertainment, telecom, infrastructure and technology. IIM Ahmedabad alumnus; University of Hull.

Hiring · 01

Research

Pretraining, fine-tuning, evals. AVALON, PLMR, Hydra. Open by default.

Open roles →

Hiring · 02

Infrastructure

Hypersave platform engineering. Distributed memory, latency at p95, SOC 2.

Open roles →

Hiring · 03

Applied

Khyaa, Nuro Studio, Nuro Chat, Nuro One. Vertical and consumer surfaces.

Open roles →

Hiring · 04

Operations

GTM, partnerships, finance. Help us turn frontier research into a durable lab.

Open roles →

Where we live

London. Quietly, deliberately.

Registered office at 128 City Road. Companies House #16079959. We're hiring across research, infrastructure and applied — remote-friendly, London-anchored.

See open roles · Say hello

Registered office

London

128 City Road, EC1V 2NX · United Kingdom

Founded

Nov 2024

Nuro AI Labs Limited · #16079959

Posture

Open + Commercial

Research is open by default. Products fund the lab.

Press

press@nuroailabs.com

Press inquiries

press@nuroailabs.com

Other inquiries

hello@nuroailabs.com

Press kit

Boilerplate · bios · logos →

FAQ · honest answers

We get asked these a lot.

If your question isn't here, ask us directly.

What does "personal intelligence" actually mean?

Persistent memory + self-reflection + per-agent cognition. The set of capabilities a current frontier LLM does not have because it is restarted from scratch every conversation. Hypersave is the cognitive memory layer that adds those capabilities to any agent. AVALON-2B is the first runtime that knows when it doesn't know.

How is this different from OpenAI, Anthropic or DeepMind?

They are scaling top-down — bigger models, more compute, hoping general intelligence emerges from sheer parameter count. We are building bottom-up — the memory and reflection systems that compound personal intelligence into general intelligence. Both paths matter. Ours is under-served.

Are your products open source?

Our research is open. AVALON-2B is Apache 2.0; weights, GGUF quants and the paper are public. PLMR and Hydra inherit the posture. Our products are commercial — Hypersave, Khyaa and the Nuro stack are how we fund the research. The split is deliberate.

Where are you based?

London. Companies House #16079959. Registered office at 128 City Road, EC1V 2NX.

Who is the team?

Akhil Ponnada (Founder & CEO). Naga Sri Arvapalli (CTO, AVALON-2B co-author). Naveen Yelloji (Executive Director). We hire research engineers, applied engineers and GTM operators on a rolling basis.

Are you raising?

We are talking to a small set of seed investors who share the personal-intelligence thesis. Reach us at press@nuroailabs.com.

Newsletter · slow channel

One email per release. That's it.

Papers, weights, and product updates as they ship. No marketing, no tracking, unsubscribe anytime.

Personal intelligence. General intelligence.

Build with us. Read the research. Try the products. Or just say hello.

Read AVALON-2B · Try Hypersave · Press & partners

nuroailabs.com · @nurolabs · huggingface.co/nuroai
