
Onyx: A serious solution for enterprise RAG deployment?

Jun 30, 2026 · 8 min read · by Scroll

Onyx helps CIOs deploy enterprise RAG connected to internal data, with search, connectors, and permissions.

Onyx: Enterprise RAG reaches a new level

The topic of enterprise RAG is increasingly appearing on executive committees’ agendas. And for good reason. Employees already use AI to write, summarize, compare, or generate ideas. But as soon as they want to query real internal data, things get complicated.

Where is the correct procedure? Which contract version is authoritative? Which Jira ticket explains this decision? Which sales note mentions this client? Which SharePoint document truly answers the question?

This is precisely where RAG adds value.

RAG, or Retrieval-Augmented Generation, enables an AI assistant to search through a company’s documents before responding. Instead of relying solely on its general knowledge, the model draws from internal sources. It can thus produce a more useful, contextualized, and verifiable answer.

For a CIO, the issue is not just technical. It involves security, access rights, data quality, user experience, model costs, oversight, and change management.

In this landscape, Onyx clearly deserves attention.

Onyx presents itself as an open-source AI chat connected to an organization’s documents, applications, and people. The platform highlights advanced RAG features, hybrid search, contextual search, business connectors, and access control.

Put simply: Onyx aims to become the internal AI interface that allows teams to retrieve, understand, and leverage knowledge scattered across the company’s tools.

Why CIOs are interested in enterprise RAG

The initial problem is rarely “we need a chatbot.”

The real problem is more like this: the information exists, but it’s scattered everywhere.

It’s in Google Drive, SharePoint, Confluence, Notion, Slack, Jira, GitHub, Salesforce, HubSpot, Gmail, support tickets, PDFs, meeting notes, or business knowledge bases. Teams therefore spend too much time searching. They keep consulting the same experts. They sometimes recreate documents that already exist. And they make decisions with only a partial view of the information.

Enterprise RAG addresses this problem by adding an intelligent layer between users and internal data.

An employee no longer searches by keyword alone. They ask a question in natural language. The system identifies relevant sources, extracts useful passages, and then prompts the LLM to formulate a clear answer. When the system is well designed, the response cites its sources and respects the user’s permissions.
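The flow described above (find relevant sources, extract passages, then prompt the LLM with them) can be sketched in a few lines. This is a minimal illustration, not how Onyx implements it: the `Passage` type, the document names, and the word-overlap score are all assumptions, and a real system would use proper keyword or semantic search instead of counting shared words.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    source: str  # hypothetical document identifier
    text: str


def score(question: str, passage: Passage) -> int:
    """Toy relevance score: number of question words found in the passage."""
    question_words = set(question.lower().split())
    passage_words = set(passage.text.lower().split())
    return len(question_words & passage_words)


def retrieve(question: str, corpus: list[Passage], k: int = 2) -> list[Passage]:
    """Return the k best-scoring passages, dropping those with no overlap."""
    ranked = sorted(corpus, key=lambda p: score(question, p), reverse=True)
    return [p for p in ranked[:k] if score(question, p) > 0]


def build_prompt(question: str, passages: list[Passage]) -> str:
    """Assemble the text sent to the LLM, with numbered sources it can cite."""
    context = "\n".join(
        f"[{i}] ({p.source}) {p.text}" for i, p in enumerate(passages, start=1)
    )
    return (
        "Answer using only the sources below and cite them by number.\n"
        f"Sources:\n{context}\n"
        f"Question: {question}"
    )


# Made-up corpus for illustration only.
corpus = [
    Passage("wiki/vpn-procedure", "To reset the VPN token open a ticket with IT support"),
    Passage("hr/leave-policy", "Annual leave requests go through the HR portal"),
    Passage("wiki/laptop-setup", "New laptop setup requires the VPN client and a token"),
]
question = "how do I reset my VPN token"
passages = retrieve(question, corpus)
prompt = build_prompt(question, passages)
print(prompt)
```

The numbered sources in the prompt are what allow the final answer to cite where each statement comes from.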

This last point is what changes everything for a CIO.

A demo RAG might work with ten PDFs uploaded to an interface. An enterprise RAG must work with permissions, groups, sensitive documents, changing sources, duplicates, conflicting versions, and users who don’t all have access to the same information.

This is where the difference lies between a simple AI prototype and a true RAG architecture.

To explore this topic further, Scroll has published a comprehensive guide on enterprise RAG, covering architecture, cost, compliance, and scoping challenges.

What exactly is Onyx?

Onyx is an open-source AI platform designed to connect LLMs to an organization’s internal knowledge. Its goal is to provide a single interface for interacting with AI, searching through company data, conducting in-depth searches, and creating agents tailored to specific business use cases.

On its GitHub repository, Onyx describes the platform as an application layer for LLMs. It includes capabilities such as RAG, web search, code execution, file creation, deep research, and custom agents.

This is significant because Onyx is not just a chat interface. It is also a search and orchestration layer. For a CIO, this means the tool can fit into a broader enterprise AI strategy.

The platform notably offers:

An AI chat interface for internal users.

Connectors to the company’s tools.

A document search and RAG layer.

Self-hosted deployment options.

Access controls on data sources.

The ability to connect different LLMs.

Agents with actions and MCP integrations.

Onyx also states that its Community Edition is available under an MIT license and covers core features such as chat, RAG, agents, and actions. The Enterprise Edition adds functions primarily aimed at large organizations.

For a CIO, the key advantage is clear: Onyx allows starting with an open-source foundation while maintaining a path toward more advanced use cases.

What Onyx brings to enterprise RAG

Onyx’s promise is based on a simple principle: connecting AI to the tools teams already use.

Onyx is not limited to manual file imports. The documentation mentions connectors designed to bridge the gap between the organization’s data sources and the platform’s generative AI functions. These connectors can synchronize changes from the sources.

This is a major point for enterprise RAG. A knowledge base is never static. Procedures change. Confluence pages evolve. Jira tickets close. SharePoint folders are moved. Sales documents are updated. A good RAG system must therefore reflect this reality.
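To make the idea of synchronization concrete, here is a minimal sketch of how a sync pass could decide what to re-index and what to drop. It is an assumption for illustration, not Onyx's connector code: the `plan_sync` function, the document identifiers, and the use of last-modified timestamps are all hypothetical.

```python
from datetime import datetime


def plan_sync(
    source_docs: dict[str, datetime],
    indexed_docs: dict[str, datetime],
) -> tuple[list[str], list[str]]:
    """Compare source and index; return (ids to re-index, ids to delete).

    Both arguments map a document id to its last-modified timestamp.
    """
    to_index = sorted(
        doc_id
        for doc_id, modified in source_docs.items()
        if doc_id not in indexed_docs or modified > indexed_docs[doc_id]
    )
    to_delete = sorted(doc_id for doc_id in indexed_docs if doc_id not in source_docs)
    return to_index, to_delete


# Made-up state: one page edited, one ticket created, one file removed at the source.
source_docs = {
    "conf/page-1": datetime(2026, 6, 1),
    "conf/page-2": datetime(2026, 6, 20),
    "jira/OPS-7": datetime(2026, 6, 25),
}
indexed_docs = {
    "conf/page-1": datetime(2026, 6, 1),
    "conf/page-2": datetime(2026, 6, 10),
    "sp/old-folder/doc": datetime(2026, 5, 1),
}
to_index, to_delete = plan_sync(source_docs, indexed_docs)
print(to_index, to_delete)
```

Handling deletions matters as much as handling updates: a document removed at the source should stop appearing in answers.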

Onyx lists connectors for tools such as Confluence, SharePoint, Notion, Google Drive, Jira, Zendesk, Airtable, Slack, Microsoft Teams, Gmail, Salesforce, HubSpot, Gong, GitHub, GitLab, Bitbucket, and other sources.

This positions Onyx on very practical ground for CIOs: data isn’t just in a clean document folder. It lives within business applications.

This is often the major limitation of early AI projects. The team tests an assistant on a controlled corpus. The results are good. Then they try to scale and discover that useful knowledge is fragmented across ten different tools. Without connectors, an indexing strategy, and permission management, the project remains stuck at the POC stage.

Onyx partially addresses this issue with a connector-oriented approach, search capabilities, and access control.

The key issue: permissions

For a CIO, RAG is not just about relevance. It’s also about access.

An AI assistant connected to internal data can be highly valuable. But it can also become risky if it exposes information to the wrong person.

A salesperson shouldn’t necessarily read all HR documents. An employee shouldn’t access sensitive legal files. A manager shouldn’t see data belonging to another entity. And an AI assistant should never become a shortcut to bypass existing permissions.

Onyx emphasizes a “permission-aware” logic. Its documentation states that some connectors can synchronize permissions from the sources, ensuring users only see data they have access to. This synchronization is available for Confluence, Jira, Google Drive, Gmail, Slack, Salesforce, GitHub, and SharePoint, with conditions depending on the connector.

This is a genuine architectural challenge. Semantic search alone is not enough. A result may be relevant but forbidden. The system must therefore filter data before or during document retrieval.
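A minimal sketch of filtering before retrieval, assuming each indexed document carries the groups allowed to read it. The `Document` type, the group names, and the substring match are hypothetical and stand in for whatever the real index and search engine provide.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: str
    text: str
    allowed_groups: set[str] = field(default_factory=set)  # hypothetical ACL


def visible_documents(docs: list[Document], user_groups: set[str]) -> list[Document]:
    """Keep only documents shared with at least one of the user's groups."""
    return [d for d in docs if d.allowed_groups & user_groups]


def search(query: str, docs: list[Document], user_groups: set[str]) -> list[str]:
    """Filter by permission first, then match: forbidden documents never become candidates."""
    candidates = visible_documents(docs, user_groups)
    terms = query.lower().split()
    return [d.doc_id for d in candidates if any(t in d.text.lower() for t in terms)]


# Made-up documents and groups for illustration only.
docs = [
    Document("hr/salary-grid", "Salary grid by level", {"hr"}),
    Document("sales/pricing", "Pricing and salary-based commission rules", {"sales", "finance"}),
    Document("wiki/onboarding", "Onboarding checklist", {"everyone"}),
]
sales_results = search("salary", docs, {"sales", "everyone"})
hr_results = search("salary", docs, {"hr", "everyone"})
print(sales_results, hr_results)
```

Because the filter runs before matching, a relevant but forbidden document cannot leak into the context passed to the model.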

This is also why an enterprise RAG project must be framed with the IT department from the outset. Permissions cannot be “added cleanly” at the end. They must guide the choice of data sources, connectors, service accounts, user groups, logs, and refusal rules.

Onyx vs OpenWebUI: two similar visions, but different use cases

OpenWebUI is often mentioned in discussions about self-hosted AI. And rightfully so.

OpenWebUI is a self-hosted AI interface, designed for local or sovereign use, that connects to model backends and providers such as Ollama, OpenAI-compatible APIs, Anthropic, vLLM, and others. Its documentation highlights RAG, vector databases, hybrid search, and multiple document extraction engines.

In a dedicated article, Scroll presents OpenWebUI as a self-hosted interface that delivers an experience close to ChatGPT, but using your own models and infrastructure: read the article on OpenWebUI.

Comparing Onyx to OpenWebUI is interesting because both tools appeal to organizations seeking to regain control over their AI. However, they do not share the same center of gravity.

OpenWebUI excels at providing a self-hosted AI interface, testing models, connecting Ollama, centralizing multiple LLM providers, and giving teams an internal chat experience.

Onyx appears more focused on “enterprise search” and RAG connected to business data sources. Its value is greater when the priority is to connect AI to internal data, with connectors, permissions, and a more structured document search logic.

The choice therefore depends on the objective.

If your main challenge is to provide a simple interface for your teams to use local models or LLM APIs, OpenWebUI can be highly relevant.

If your main challenge is to create an AI assistant connected to internal data—with search across business tools, synchronization, and permission management—Onyx deserves a closer look.

In many companies, the real question is not “Onyx or OpenWebUI?”. The real question is: what level of RAG do we want to deploy, for which users, with which data, and under what level of control?

The most credible Onyx use cases for an IT department

The first use case is an augmented internal search engine. This is often the easiest to understand. Employees ask a question, and Onyx searches the connected sources. For an IT department, this use case allows for quick value assessment: time saved, reduced requests to experts, and better knowledge reuse.

The second use case is the support assistant. In a support team, useful knowledge is found in past tickets, knowledge bases, product sheets, escalation procedures, and internal conversations. A well-designed enterprise RAG can suggest an answer, cite sources, and help agents resolve requests faster.

The third use case is the IT assistant. Internal teams can query technical documentation, security procedures, runbooks, Jira tickets, GitHub or GitLab repositories, and incident histories. For an IT department, this is a natural fit, as technical data is often rich but difficult to navigate.

The fourth use case is the sales assistant. Sales teams often look for arguments, customer cases, objection responses, offer sheets, or proposal elements. An AI assistant connected to internal data can reduce preparation time and standardize responses.

The fifth use case is onboarding. A new employee can ask questions to an assistant linked to validated internal documents. This reduces the burden on managers and accelerates skill acquisition.

These use cases are similar to those Scroll details in its approach to AI assistants connected to your data. The challenge is not just to “build a chatbot.” The challenge is to create a useful, sourced, controlled, and measurable assistant.

Key considerations before choosing Onyx

Onyx is an interesting option, but it’s not a magic wand.

The first point to consider is document quality. If the documents are outdated, contradictory, or poorly named, the RAG will produce mediocre answers. An AI assistant can’t guess which document is authoritative if the company itself doesn’t know.

The second point is the connector strategy. Connecting all sources from the start is tempting but rarely useful. It’s better to start with a clear scope: one team, one corpus, one use case, a few well-chosen sources. This makes evaluation easier and security simpler.

The third point is rights management. You need to check how permissions are synchronized, which sources support this synchronization, which service accounts are used, which groups are created, and which logs are available.

The fourth point is model selection. Onyx is compatible with multiple approaches, but the choice of LLM remains structural. A proprietary model can offer very high quality. A self-hosted open-source model may better meet sovereignty constraints. An IT department must decide based on concrete criteria: answer quality, cost per query, latency, confidentiality, supervision, and maintainability.
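Cost per query is easy to estimate once token volumes are known. The sketch below shows the arithmetic only; the token counts, prices, and monthly volume are made-up assumptions, not the rates of any real provider.

```python
def cost_per_query(
    input_tokens: int,
    output_tokens: int,
    price_in_per_million: float,
    price_out_per_million: float,
) -> float:
    """Cost of one query, given token counts and per-million-token prices."""
    return (
        input_tokens * price_in_per_million + output_tokens * price_out_per_million
    ) / 1_000_000


# Hypothetical figures: a RAG prompt with retrieved context is input-heavy.
per_query = cost_per_query(
    input_tokens=4000,
    output_tokens=500,
    price_in_per_million=3.0,
    price_out_per_million=15.0,
)
monthly = per_query * 20_000  # assumed 20,000 queries per month
print(f"{per_query:.4f} per query, {monthly:.2f} per month")
```

In a RAG setup, most of the input tokens come from retrieved passages, so the number and size of passages sent to the model weigh directly on the bill.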

The fifth point is evaluation. An enterprise RAG isn’t judged by a demo. It’s judged by a set of real questions, expected answers, validated sources, and business users. Scroll often recommends benchmarking multiple models and settings on the actual corpus, with metrics like accuracy rate, cited sources, hallucinations, cost per query, and user feedback. This approach is part of Scroll’s AI methodology, as presented on the page dedicated to data-connected assistants.
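As an illustration, a test set of this kind can be scored with a few lines of code. This is a simplified sketch under stated assumptions: the `EvalCase` and `Answer` types are hypothetical, the answers are hard-coded instead of coming from a real assistant, and "accuracy" is reduced to checking that an expected keyword appears, which human review would need to complement.

```python
from dataclasses import dataclass


@dataclass
class EvalCase:
    question: str
    expected_source: str   # document a correct answer should cite
    expected_keyword: str  # crude proxy for a correct answer


@dataclass
class Answer:
    text: str
    cited_sources: list[str]


def evaluate(cases: list[EvalCase], answers: list[Answer]) -> dict[str, float]:
    """Compute simple rates over a test set; answers[i] responds to cases[i]."""
    if not cases or len(cases) != len(answers):
        raise ValueError("need one answer per case, and at least one case")
    n = len(cases)
    pairs = list(zip(cases, answers))
    correct = sum(c.expected_keyword.lower() in a.text.lower() for c, a in pairs)
    sourced = sum(c.expected_source in a.cited_sources for c, a in pairs)
    uncited = sum(not a.cited_sources for a in answers)
    return {
        "accuracy": correct / n,
        "source_hit_rate": sourced / n,
        "uncited_rate": uncited / n,
    }


# Made-up questions and answers for illustration only.
cases = [
    EvalCase("Who approves level 2 escalations?", "support/escalation-procedure", "duty manager"),
    EvalCase("What is the VPN token reset procedure?", "wiki/vpn-procedure", "ticket"),
    EvalCase("Which plan includes SSO?", "sales/offer-sheet", "enterprise"),
    EvalCase("How long are logs retained?", "security/retention-policy", "90 days"),
]
answers = [
    Answer("The duty manager approves level 2 escalations.", ["support/escalation-procedure"]),
    Answer("Open a ticket with IT support.", ["wiki/laptop-setup"]),
    Answer("SSO is part of the Enterprise plan.", ["sales/offer-sheet"]),
    Answer("Logs are kept for one year.", []),
]
metrics = evaluate(cases, answers)
print(metrics)
```

Separating answer accuracy from source accuracy is useful: the second case above gives a plausible answer but cites the wrong document, which a single score would hide.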

Typical architecture with Onyx

An Onyx architecture for enterprise RAG can remain fairly readable.

At the core, Onyx acts as the interface and search layer. It connects to internal sources via connectors. Documents are indexed, chunked, enriched, and stored to enable semantic or hybrid search. When a user asks a question, Onyx retrieves the relevant passages, checks permissions, then passes the context to the LLM.
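Hybrid search means combining a keyword ranking with a semantic ranking. One common way to merge the two lists is reciprocal rank fusion, sketched below. It is shown as a general technique, not as the method Onyx uses; the document identifiers are made up, and `k = 60` is simply the constant usually quoted for this formula.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists: each document scores 1 / (k + rank) per list."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a keyword engine and a vector index.
keyword_hits = ["runbook-vpn", "ticket-812", "faq-network"]
semantic_hits = ["ticket-812", "wiki-remote-access", "runbook-vpn"]
fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
print(fused)
```

Documents found by both searches rise to the top, while a document found by only one of them still stays in the list.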

Around this foundation, the IT department must frame several components.

Identity and access: SSO, groups, roles, permissions per source.

Sources: SharePoint, Drive, Confluence, Jira, Slack, GitHub, CRM, document databases.

Models: OpenAI, Anthropic, Mistral, open-source models via Ollama or vLLM depending on the strategy.

Infrastructure: managed cloud, self-hosted, internal network, Kubernetes or Docker depending on the context.

Security: logs, audits, refusal rules, filtering of sensitive data, retention policy.

Evaluation: test sets, human review, hallucination measurement, satisfaction tracking.

Support: documentation, training, governance, error reporting process.

This breakdown may seem simple. In practice, the difficulty lies in the details: inherited permissions, duplicates, obsolete files, confidential documents, overly broad prompts, token costs, latency, and user adoption.

This is why an enterprise RAG project must be treated as an IT project, not as an isolated side experiment.

When Onyx is a good choice

Onyx is likely a strong candidate if your company checks several boxes.

You have a lot of knowledge scattered across internal tools. You want smarter search than traditional keyword-based search. You need an AI assistant connected to internal data. You want to keep an open-source option. You have security and rights management requirements. You want to connect multiple LLMs. You’re looking for a more structured platform than a simple AI chat.

For an IT department, Onyx can also be interesting if the goal is to create an official “AI gateway” for the company.

This is a governance issue. If teams use their own AI tools, data ends up everywhere. Practices become hard to control. Costs are unclear. Confidentiality risks increase.

An internal platform like Onyx can help channel usage. It doesn’t solve everything, but it provides a framework: a single interface, known sources, controlled models, access rules, and metrics.

When Onyx isn’t necessarily the right choice

Onyx isn’t always necessary.

If you simply want to test a local model on a workstation, OpenWebUI will often be simpler.

If your need is for structured business automation, such as populating a CRM, following up with a client, or routing a ticket, an n8n workflow or a custom business application may sometimes be more suitable than RAG.

If your data is poorly organized, unreliable, or outdated, Onyx shouldn’t be the first priority. The first step is document governance.

If you only need a marketing chatbot or a public FAQ, a lighter solution may suffice.

Finally, if your use case requires absolute guarantees—such as in legal, medical, financial, or regulatory contexts—RAG must remain supported by human oversight and strict rules. It can help retrieve and synthesize information, but it doesn’t replace critical validation.

How to scope an Onyx POC without wasting time

A good Onyx POC should be short but rigorous.

The classic pitfall is connecting too many sources and testing vague questions. The results become hard to interpret. You can no longer tell whether the issue lies with the model, the corpus, the connector, the prompt, the permissions, or the question itself.

A useful POC starts with a specific use case.

For example: “Enable the support team to retrieve procedures and tickets related to level 2 customer incidents.”

From there, select three or four sources. Define user groups. Prepare around fifty real questions. Set success criteria. Measure response quality, source relevance, time saved, errors, and limitations.
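Success criteria are most useful when they are written down before the POC and checked mechanically afterwards. A minimal sketch, assuming the criteria are minimum thresholds; the metric names, thresholds, and measured values are hypothetical.

```python
def poc_verdict(
    measured: dict[str, float],
    criteria: dict[str, float],
) -> tuple[bool, list[str]]:
    """Return (go, failed criteria); a missing measurement counts as a failure."""
    failed = sorted(
        name for name, minimum in criteria.items() if measured.get(name, 0.0) < minimum
    )
    return (not failed, failed)


# Thresholds agreed before the POC (made-up values).
criteria = {"accuracy": 0.80, "source_hit_rate": 0.70}

# Results over fifty real questions: 42 correct answers, 31 correctly sourced.
measured = {"accuracy": 42 / 50, "source_hit_rate": 31 / 50}

go, failed = poc_verdict(measured, criteria)
print("go" if go else f"no-go, failed: {failed}")
```

Here the answers are good enough but the sources are not, which points to the corpus or the retrieval settings and not to the model.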

The goal isn’t to prove how impressive AI is. The goal is to determine whether enterprise RAG delivers measurable value in a real-world context.

At Scroll, this approach often involves a short scoping phase, followed by a POC focused on a core use case. The method outlined on Scroll’s AI page describes a multi-stage project: scoping, model selection, POC, evaluation, and then industrialization if the criteria are met.

This is exactly the approach to take with Onyx.

What a CIO should know before deploying Onyx

Onyx arrives at the right time. Companies want to use AI, but they also want to retain control over their data, rights, models, and costs. Enterprise RAG is a serious answer to this need—provided it isn’t reduced to a simple chatbot demo.

For a CIO, Onyx is compelling because it combines several key components: AI interface, document search, connectors, RAG, agents, self-hosted options, and permission management. This is precisely the kind of foundation that can help an organization transition from AI experimentation to more structured internal use.

But success doesn’t depend solely on the tool.

It depends on scoping, source selection, document quality, permissions, evaluation, and team adoption. A poor corpus will produce a poor assistant. Poorly managed permissions will create risk. An overly broad POC will yield unclear results.

The right approach is incremental: one use case, one corpus, real questions, clear metrics, and then industrialization if the results are good.

At Scroll, we help companies scope and deploy AI assistants connected to their internal data, with a strong focus on security, reliability, cost, and production deployment. Onyx may be one of the options to consider, alongside OpenWebUI, LlamaIndex, LangChain, pgvector, Qdrant, Supabase, n8n, or a custom architecture.

The goal isn’t to pick the trendiest tool. The goal is to build a useful, controlled, and maintainable AI system that truly integrates with the IS.

Frequently asked questions

Is Onyx open source?

Yes, Onyx offers a Community Edition available under the MIT license. This edition covers core features such as chat, RAG, agents, and actions, while the Enterprise Edition adds functions aimed at large organizations.

Does Onyx replace ChatGPT Enterprise?

Not exactly. Onyx can serve as an internal AI interface connected to company data. ChatGPT Enterprise provides a powerful, managed AI experience, but the strategy for connecting to data sources, governance, permissions, and IS integration must be assessed based on the specific context.

Can Onyx be self-hosted?

Yes. Onyx provides deployment documentation covering Docker, Kubernetes, Terraform, and cloud deployments. The documentation also includes a quick-install script for Docker Compose.