A new ServiceNow research project, MosaicLeaks, finds that unmodified AI deep-research agents leak sensitive private enterprise data in 34% of test cases via their external web query logs, with standard performance training actually increasing leakage risk.
The benchmark, published on Hugging Face, demonstrates that agents routinely expose private internal facts, such as unreleased financial metrics, migration timelines, and security incident details, through seemingly benign sequential web searches that let outside observers reconstruct sensitive information. Its core result is direct: research agents that have not been trained or configured for privacy do not keep proprietary secrets.
Roughly one in three test runs (34%) exposes high-sensitivity internal data to an external observer who can see only the agent’s web query log.
How MosaicLeaks measures agent privacy leakage
MosaicLeaks treats the cumulative web query log of a research agent as its primary leakage channel. An adversary with access only to these public-facing queries—no access to the agent’s private internal documents or internal reasoning process—can still reconstruct sensitive enterprise information, per the benchmark’s official documentation 1.
The framework quantifies leakage risk across three escalating tiers. The lowest tier, intent leakage, occurs when an observer can infer the agent’s private research goals from its query log alone. The second tier, answer leakage, is more severe: given a specific private question, the observer can answer it correctly using only the query log, with no access to the agent’s internal data.
The highest risk tier, full-information leakage, lets an observer state verifiably true private claims about the target enterprise without even being given a specific question to investigate.
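The three tiers can be read as an ordered scale. The sketch below is one illustrative way to encode that ordering in Python. The `LeakageTier` enum, the `ObserverResult` record, and the `highest_tier` helper are hypothetical names introduced here; they are not taken from the MosaicLeaks code.

```python
from dataclasses import dataclass
from enum import IntEnum


class LeakageTier(IntEnum):
    """Ordered leakage tiers; a higher value is more severe (assumed encoding)."""
    NONE = 0
    INTENT = 1            # observer infers the private research goal
    ANSWER = 2            # observer answers a given private question
    FULL_INFORMATION = 3  # observer states true private claims unprompted


@dataclass
class ObserverResult:
    """What an observer with only the query log managed to do (hypothetical)."""
    inferred_intent: bool
    answered_given_question: bool
    stated_true_claim_unprompted: bool


def highest_tier(result: ObserverResult) -> LeakageTier:
    """Return the most severe tier the observer reached."""
    if result.stated_true_claim_unprompted:
        return LeakageTier.FULL_INFORMATION
    if result.answered_given_question:
        return LeakageTier.ANSWER
    if result.inferred_intent:
        return LeakageTier.INTENT
    return LeakageTier.NONE
```

Using an `IntEnum` keeps the tiers comparable, so a run can be summarized by its worst tier.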
The benchmark is built around 1,001 multi-hop research chains that interleave queries to private local enterprise documents and public web sources, explicitly designed to force agents to use private data to form subsequent public queries. Local document sets are drawn from DRBench-style enterprise tasks, while web sources come from the BrowseComp-Plus corpus 1.
The 1,001 chains are split into 559 training, 98 validation, and 344 held-out test chains, all drawn from unseen companies to avoid overfitting to specific enterprise data.
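As a quick sanity check, the reported split sizes do add up to 1,001. The short snippet below uses only the figures stated above; the variable names are illustrative.

```python
# Split sizes as reported in the article.
splits = {"train": 559, "validation": 98, "test": 344}

total = sum(splits.values())
assert total == 1001, f"splits sum to {total}, expected 1001"

# Share of each split in percent, rounded to one decimal place.
shares = {name: round(100 * count / total, 1) for name, count in splits.items()}
print(shares)
```

In other words, a little over half of the chains are used for training and about a third are held out for testing.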
The testing harness used to evaluate models follows a four-step workflow adapted from the DRBench framework. The first step, Plan, generates both local document and public web search queries; the second step, Choose, picks which retrieved documents to prioritize for processing; the third step, Read, extracts relevant information from those selected documents to address the current research hop; and the fourth step, Resolve, determines whether to provide a final answer, pull additional documents, or draft follow-up search queries 1.
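The four steps can be pictured as a loop over a single research hop. The following is a minimal sketch under stated assumptions, not the DRBench or MosaicLeaks harness: every function name, the `HopState` fields, and the toy stub logic are hypothetical, and Resolve is simplified to "answer" or "continue". Its main purpose is to show where web queries enter the externally visible log.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class HopState:
    question: str
    notes: list = field(default_factory=list)
    web_query_log: list = field(default_factory=list)  # the leakage channel
    answer: Optional[str] = None


def plan(state: HopState) -> dict:
    """Plan: propose local and web queries for the hop (toy stub)."""
    return {"local": [state.question], "web": [state.question]}


def choose(docs: list) -> list:
    """Choose: pick which retrieved documents to process (toy stub: the first)."""
    return docs[:1]


def read(state: HopState, docs: list) -> None:
    """Read: extract information from the chosen documents into notes."""
    state.notes.extend(docs)


def resolve(state: HopState) -> str:
    """Resolve: answer now, or ask for another round (simplified)."""
    if state.notes:
        state.answer = state.notes[-1]
        return "answer"
    return "continue"


def run_hop(state: HopState,
            search_local: Callable[[str], list],
            search_web: Callable[[str], list],
            max_steps: int = 3) -> HopState:
    for _ in range(max_steps):
        queries = plan(state)
        docs = []
        for q in queries["local"]:
            docs.extend(search_local(q))
        for q in queries["web"]:
            state.web_query_log.append(q)  # every web query is visible externally
            docs.extend(search_web(q))
        read(state, choose(docs))
        if resolve(state) == "answer":
            break
    return state
```

For example, `run_hop(HopState("q1"), lambda q: ["local doc"], lambda q: ["web doc"])` finishes in one step and leaves `["q1"]` in `web_query_log`. In a real agent the Plan step is where private details from local documents can end up inside web queries.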
A sample chain cited in the research illustrates the risk in concrete terms: an agent investigating Lee’s Market’s 2020 traffic growth issued two web queries referencing the company and 2020 traffic metrics, then a third query to confirm the 15% growth rate. Individually, each query is benign, but combined they let an external party reconstruct the private fact without ever accessing the company’s internal data.
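The mosaic effect in this example can be made concrete with a small audit over a query log. The sketch below is illustrative only: the query strings are hypothetical stand-ins (the article does not quote the actual queries), the `fragments` dictionary is an assumed decomposition of the private fact, and the naive substring match is far weaker than a real observer, who could reason over the log.

```python
def exposed_fragments(query: str, fragments: dict) -> set:
    """Return the names of private fragments whose text appears in the query."""
    q = query.lower()
    return {name for name, text in fragments.items() if text.lower() in q}


def audit_log(queries: list, fragments: dict) -> dict:
    """Compare what each query exposes alone with what the whole log exposes."""
    per_query = [exposed_fragments(q, fragments) for q in queries]
    cumulative = set().union(*per_query)
    everything = set(fragments)
    return {
        "per_query": per_query,
        "cumulative": cumulative,
        "fully_reconstructable": cumulative == everything,
        "any_single_query_sufficient": any(p == everything for p in per_query),
    }


# Assumed decomposition of the private fact from the example above.
fragments = {
    "company": "Lee's Market",
    "year": "2020",
    "metric": "traffic",
    "value": "15%",
}

# Hypothetical queries, written only to mirror the pattern described.
queries = [
    "Lee's Market store locations",
    "Lee's Market 2020 traffic",
    "grocery traffic growth 15% 2020",
]

report = audit_log(queries, fragments)
print(report["fully_reconstructable"], report["any_single_query_sufficient"])
```

With these inputs no single query contains all four fragments, yet the log as a whole does, which is the pattern the benchmark is named for.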
Standard training makes leakage worse, not better
Counterintuitively, the research found that standard fine-tuning for task performance exacerbated leakage rather than reducing it. Across all tested models, agents trained only to answer research questions correctly leaked private information at higher rates than unmodified base models, as they learned to prioritize retrieving relevant data over obscuring the context of their queries 1.
The baseline strict chain success rate—where every hop in a multi-step research chain is answered correctly—sat at just 48.7% for unmodified agents. Answer and full-information leakage rates hit 34.0% for these unmodified base models 1.
This failure stems from how current agent training frameworks reward end-task accuracy with no penalty for query privacy. When an agent needs to answer a question that requires both private internal data and public web context, it has no incentive to obscure private bridge entities—such as company names, internal metrics, or dates—when forming its public web queries 1.
Even small, seemingly irrelevant details carried over from local document retrieval can give an adversary enough context to reconstruct larger sensitive facts. For example, a query referencing a company’s 2020 traffic metrics and a 15% growth rate, even without explicit mention of the metric being unreleased, can let an external observer confirm the private figure with minimal effort.
PA-DR training cuts leakage while boosting performance
To address this gap, the ServiceNow team developed Privacy-Aware Deep Research (PA-DR), a reinforcement learning training method that penalizes leakage while rewarding task accuracy. The approach raised strict chain success to 58.7%—a 10 percentage point improvement over the 48.7% baseline for unmodified agents 1.
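The article says PA-DR penalizes leakage while rewarding task accuracy, but it does not give the reward formula. The sketch below shows one plausible shape for such a signal and should be read as an assumption: the linear penalty, the `penalty_weight` parameter, and both function names are hypothetical, not the authors’ definition.

```python
def accuracy_only_reward(correct: bool) -> float:
    """Standard task reward: pays for a right answer, ignores the query log."""
    return 1.0 if correct else 0.0


def privacy_aware_reward(correct: bool,
                         leaked_entities: int,
                         total_entities: int,
                         penalty_weight: float = 1.0) -> float:
    """Task reward minus a leakage penalty (assumed form, not the paper's)."""
    task_reward = 1.0 if correct else 0.0
    # Fraction of private bridge entities that appeared in web queries.
    leak_fraction = leaked_entities / total_entities if total_entities else 0.0
    return task_reward - penalty_weight * leak_fraction
```

Under the accuracy-only reward, a correct answer earns the same score whether or not the queries exposed private entities. Under the penalized variant, a correct run that leaks two of four entities scores 0.5 instead of 1.0, so the policy has a reason to rephrase its queries.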
PA-DR also cut answer and full-information leakage to 9.9%, a 71% relative reduction (24.1 percentage points) from the 34.0% leakage rate of unmodified agents. The training framework explicitly rewards agents for completing research tasks using queries that do not expose private bridge entities from local documents, such as company names, internal metrics, or dates, forcing the model to learn to separate public and private context when forming external queries 1.
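The headline figures mix percentage points and relative percentages, so it helps to work them through. The snippet below uses only the numbers reported above; the variable names are illustrative.

```python
# Reported figures, in percent.
baseline_success, padr_success = 48.7, 58.7
baseline_leak, padr_leak = 34.0, 9.9

# Absolute changes, in percentage points.
success_gain_pp = round(padr_success - baseline_success, 1)
leak_drop_pp = round(baseline_leak - padr_leak, 1)

# Relative reduction in leakage, as a percent of the baseline rate.
relative_leak_reduction = round(
    100 * (baseline_leak - padr_leak) / baseline_leak, 1
)

print(success_gain_pp, leak_drop_pp, relative_leak_reduction)
```

The success gain is 10.0 percentage points, while the leakage drop is 24.1 percentage points, or about 70.9% of the baseline rate, which rounds to the 71% quoted.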
The PA-DR method is designed to work with existing agent harnesses and does not require changes to the underlying model architecture, making it a drop-in addition for teams running deep-research agents on proprietary data 1. For example, an agent trained with PA-DR and tasked with cross-referencing internal product launch timelines against public market competition data would be expected to phrase its external queries without the unreleased launch date.
Real-world enterprise agents face the same leakage risk
The risk MosaicLeaks identifies is not theoretical: enterprise AI agent deployments are already handling sensitive internal data at scale. GitHub’s internal Qubot analytics agent, which lets employees query proprietary company data in natural language through Slack, VS Code, and the Copilot CLI, is a concrete example of an enterprise agent that handles sensitive data and could face similar query-log leakage risks 2.
Qubot handles sensitive product telemetry, financial metrics, and security incident data—all of which could be exposed if the agent’s external tool queries leak private context. GitHub’s own documentation notes the agent relies on federated context layers and external tool calls to internal data warehouses to answer questions 2.
For teams running similar internal research or analytics agents, the MosaicLeaks findings highlight a gap in current privacy safeguards: most agent deployment frameworks do not include audit tools for query log privacy, and standard fine-tuning pipelines do not account for leakage risk 1.
The PA-DR training method offers a starting point for addressing this gap, but the research team notes that broader tooling—such as query log auditing, context separation guardrails, and privacy-focused agent harness design—will be needed to fully mitigate the risk for production deployments 1.
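One simple form of a context-separation guardrail is an outbound filter that checks each web query against a list of known private terms before it is sent. The sketch below is a hypothetical illustration, not tooling described by the research team: `redact_query`, the placeholder string, and the example terms are all assumptions.

```python
import re


def redact_query(query: str,
                 private_terms: list,
                 placeholder: str = "[REDACTED]") -> tuple:
    """Replace known private terms in an outbound query.

    Returns the redacted query and the list of terms that were found.
    Matching is case-insensitive and literal (no fuzzy matching).
    """
    hits = []
    redacted = query
    for term in private_terms:
        pattern = re.compile(re.escape(term), re.IGNORECASE)
        if pattern.search(redacted):
            hits.append(term)
            redacted = pattern.sub(placeholder, redacted)
    return redacted, hits


# Hypothetical private terms and query, for illustration only.
terms = ["Acme Corp", "Q3 2025"]
safe_query, found = redact_query(
    "acme corp launch date Q3 2025 competitors", terms
)
print(safe_query, found)
```

A filter like this only catches exact terms in a single query. It does not address facts that are reconstructed across several individually clean queries, which is why log-level auditing and privacy-aware training are listed alongside it.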
Bottom line: Organizations running AI deep-research agents on proprietary data should audit their agents’ external query logs for mosaic leakage, and evaluate privacy-aware training methods like PA-DR as part of their standard fine-tuning pipeline to reduce the risk of exposing sensitive internal information.
