
Google PM open-sources Always On Memory Agent, ditching vector databases for LLM-driven persistent memory

Google senior AI product manager Shubham Saboo has turned one of the thorniest problems in agent design into an open-source engineering exercise: persistent memory.

This week, he published an open-source “Always On Memory Agent” on the official Google Cloud Platform GitHub page under a permissive MIT license, which allows commercial use.

It was built with Google's Agent Development Kit (ADK), introduced in spring 2025, and Gemini 3.1 Flash-Lite, a low-cost model Google released on March 3, 2026 as the fastest and most cost-efficient member of the Gemini 3 series.

The project serves as a practical reference implementation for something many AI teams want but few have productionized cleanly: an agent system that can ingest information continuously, consolidate it in the background, and retrieve it later without relying on a conventional vector database.

For enterprise developers, the release matters less as a product launch than as a signal about where agent infrastructure is headed.

The repo packages a view of long-running autonomy that is increasingly attractive for support systems, research assistants, internal copilots and workflow automation. It also brings governance questions into sharper focus as soon as memory stops being session-bound.

What the repo appears to do — and what it does not clearly claim

The repo appears to use a multi-agent internal architecture, with specialist components handling ingestion, consolidation and querying.

But the supplied materials do not clearly establish a broader claim that this is a shared memory framework for multiple independent agents.

That distinction matters. ADK as a framework supports multi-agent systems, but this specific repo is best described as an always-on memory agent, or memory layer, built with specialist subagents and persistent storage.

Even at this narrower level, it addresses a core infrastructure problem many teams are actively working through.

The architecture favors simplicity over a traditional retrieval stack

According to the repository, the agent runs continuously, ingests files or API input, stores structured memories in SQLite, and performs scheduled memory consolidation every 30 minutes by default.

A local HTTP API and Streamlit dashboard are included, and the system supports text, image, audio, video and PDF ingestion. The repo frames the design with an intentionally provocative claim: “No vector database. No embeddings. Just an LLM that reads, thinks, and writes structured memory.”
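The pattern the repo describes can be sketched in a few dozen lines. This is a hypothetical illustration, not Saboo's code: the schema, the `summarize` stand-in for the LLM call, and the topic-based merging are all assumptions; only the SQLite storage and the 30-minute consolidation default come from the repo's description.

```python
import json
import sqlite3
import time

CONSOLIDATION_INTERVAL_S = 30 * 60  # repo default: consolidate every 30 minutes


def init(conn):
    conn.execute("""CREATE TABLE IF NOT EXISTS memories (
        id INTEGER PRIMARY KEY,
        topic TEXT,
        content TEXT,
        created REAL,
        consolidated INTEGER DEFAULT 0)""")


def ingest(conn, topic, content):
    # Continuous ingestion path: raw observations land as unconsolidated rows.
    conn.execute("INSERT INTO memories (topic, content, created) VALUES (?, ?, ?)",
                 (topic, content, time.time()))


def consolidate(conn, summarize):
    # Scheduled background pass: merge unconsolidated rows per topic into one
    # structured entry. `summarize` stands in for the LLM call that reads,
    # thinks, and rewrites memory.
    topics = [t for (t,) in conn.execute(
        "SELECT DISTINCT topic FROM memories WHERE consolidated = 0").fetchall()]
    for topic in topics:
        rows = [r[0] for r in conn.execute(
            "SELECT content FROM memories WHERE topic = ? AND consolidated = 0",
            (topic,)).fetchall()]
        merged = summarize(topic, rows)
        conn.execute("DELETE FROM memories WHERE topic = ? AND consolidated = 0",
                     (topic,))
        conn.execute("INSERT INTO memories (topic, content, created, consolidated) "
                     "VALUES (?, ?, ?, 1)", (topic, merged, time.time()))


conn = sqlite3.connect(":memory:")
init(conn)
ingest(conn, "user_prefs", "prefers concise answers")
ingest(conn, "user_prefs", "works in UTC+1")
# A real deployment would call the model here; a lambda keeps the sketch runnable.
consolidate(conn, lambda topic, rows: json.dumps({"topic": topic, "facts": rows}))
row = conn.execute("SELECT content FROM memories WHERE consolidated = 1").fetchone()
print(row[0])
```

The point of the structure is that the database holds plain structured text the model itself produced, so there is no embedding pipeline to keep in sync with the store.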

That design choice is likely to draw attention from developers managing cost and operational complexity. Traditional retrieval stacks often require separate embedding pipelines, vector storage, indexing logic and synchronization work.

Saboo's example instead leans on the model to organize and update memory directly. In practice, that can simplify prototypes and reduce infrastructure sprawl, especially for smaller or medium-memory agents. It also shifts the performance question from vector search overhead to model latency, memory compaction logic and long-run behavioral stability.

Flash-Lite gives the always-on model some economic logic

That is where Gemini 3.1 Flash-Lite enters the story.

Google says the model is built for high-volume developer workloads at scale and priced at $0.25 per 1 million input tokens and $1.50 per 1 million output tokens.

The company also says Flash-Lite is 2.5 times faster than Gemini 2.5 Flash in time to first token and delivers a 45% increase in output speed while maintaining similar or better quality.

On Google’s published benchmarks, the model posts an Elo score of 1432 on Arena.ai, 86.9% on GPQA Diamond and 76.8% on MMMU Pro. Google positions those characteristics as a fit for high-frequency tasks such as translation, moderation, UI generation and simulation.

Those numbers help explain why Flash-Lite is paired with a background-memory agent. A 24/7 service that periodically re-reads, consolidates and serves memory needs predictable latency and low enough inference cost to avoid making “always on” prohibitively expensive.
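A back-of-envelope calculation shows why the pricing matters. The per-pass token counts below are illustrative assumptions, not measurements from the repo; only the prices and the 30-minute cadence come from the published figures.

```python
# Google's published Flash-Lite prices.
INPUT_PRICE = 0.25 / 1_000_000   # dollars per input token
OUTPUT_PRICE = 1.50 / 1_000_000  # dollars per output token

passes_per_day = 24 * 60 // 30   # one consolidation pass every 30 minutes
in_tokens_per_pass = 20_000      # assumed: memory store re-read each pass
out_tokens_per_pass = 2_000      # assumed: rewritten structured memory

daily_cost = passes_per_day * (in_tokens_per_pass * INPUT_PRICE
                               + out_tokens_per_pass * OUTPUT_PRICE)
print(f"${daily_cost:.2f}/day")
```

Under those assumptions the background loop costs well under a dollar a day, which is the kind of number that makes "always on" plausible as a default rather than a luxury.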

Google’s ADK documentation reinforces the broader story. The framework is presented as model-agnostic and deployment-agnostic, with support for workflow agents, multi-agent systems, tools, evaluation and deployment targets including Cloud Run and Vertex AI Agent Engine. That combination makes the memory agent feel less like a one-off demo and more like a reference point for a broader agent runtime strategy.

The enterprise debate is about governance, not just capability

Public reaction shows why enterprise adoption of persistent memory will not hinge on speed or token pricing alone.

Several responses on X highlighted exactly the concerns enterprise architects are likely to raise. Franck Abe called Google ADK and 24/7 memory consolidation “brilliant leaps for continuous agent autonomy,” but warned that an agent “dreaming” and cross-pollinating memories in the background without deterministic boundaries becomes “a compliance nightmare.”

ELED made a related point, arguing that the main cost of always-on agents is not tokens but “drift and loops.”

Those critiques go directly to the operational burden of persistent systems: who can write memory, what gets merged, how retention works, when memories are deleted, and how teams audit what the agent learned over time.

Another reaction, from Iffy, challenged the repo’s “no embeddings” framing, arguing that the system still has to chunk, index and retrieve structured memory, and that it may work well for small-context agents but break down once memory stores become much larger.

That criticism is technically important. Removing a vector database does not remove retrieval design; it changes where the complexity lives.

For developers, the tradeoff is less about ideology than fit. A lighter stack may be attractive for low-cost, bounded-memory agents, while larger-scale deployments may still demand stricter retrieval controls, more explicit indexing strategies and stronger lifecycle tooling.
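One way to see where the complexity moves is a sketch of embedding-free retrieval. The repo's actual retrieval path is not spelled out in the supplied materials; the two-stage shape below (a cheap SQL filter, then a relevance ranking that a real system would delegate to the model) is an illustrative assumption, with a keyword scorer standing in for the LLM call.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memories (topic TEXT, content TEXT)")
conn.executemany("INSERT INTO memories VALUES (?, ?)", [
    ("deploy", "service runs on Cloud Run behind an HTTP API"),
    ("deploy", "consolidation job fires every 30 minutes"),
    ("prefs",  "user wants answers in bullet points"),
])


def retrieve(query, limit=2):
    # Stage 1: structural filter in SQL (this is where indexing decisions
    # resurface even without a vector database).
    rows = conn.execute("SELECT topic, content FROM memories").fetchall()
    # Stage 2: relevance ranking; in the real system this would be the model
    # reading the candidates, here a word-overlap score keeps it runnable.
    words = set(query.lower().split())
    scored = sorted(rows,
                    key=lambda r: -len(words & set(r[1].lower().split())))
    return [content for _, content in scored[:limit]]


print(retrieve("consolidation schedule and minutes"))
```

The sketch makes the critics' point concrete: once the memory store grows, stage 1 needs real chunking and indexing and stage 2 needs bounded context, so the retrieval design work has moved rather than vanished.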

ADK broadens the story beyond a single demo

Other commenters focused on developer workflow. One asked for the ADK repo and documentation and wanted to know whether the runtime is serverless or long-running, and whether tool-calling and evaluation hooks are available out of the box.

Based on the supplied materials, the answer is effectively both: the memory-agent example itself is structured like a long-running service, while ADK more broadly supports multiple deployment patterns and includes tools and evaluation capabilities.

The always-on memory agent is interesting on its own, but the larger message is that Saboo is trying to make agents feel like deployable software systems rather than isolated prompts. In that framing, memory becomes part of the runtime layer, not just an add-on feature.

What Saboo has shown — and what he has not

What Saboo has not shown yet is just as important as what he's published.

The provided materials do not include a direct Flash-Lite versus Anthropic Claude Haiku benchmark for agent loops in production use.

They also do not lay out enterprise-grade compliance controls specific to this memory agent, such as deterministic policy boundaries, retention guarantees, segregation rules or formal audit workflows.

And while the repo appears to use multiple specialist agents internally, the materials do not clearly prove a larger claim about persistent memory shared across multiple independent agents.

For now, the repo reads as a compelling engineering template rather than a complete enterprise memory platform.

Why this matters now

Still, the release lands at the right time. Enterprise AI teams are moving beyond single-turn assistants and into systems expected to remember preferences, preserve project context and operate across longer horizons.

Saboo's open-source memory agent offers a concrete starting point for that next layer of infrastructure, and Flash-Lite gives the economics some credibility.

But the strongest takeaway from the reaction around the launch is that continuous memory will be judged on governance as much as capability.

That is the real enterprise question behind Saboo's demo: not whether an agent can remember, but whether it can remember in ways that stay bounded, inspectable and safe enough to trust in production.
