RAGAuth • Grepedia

RAGAuth is a specialized security layer designed to address critical data leakage vulnerabilities in Retrieval-Augmented Generation (RAG) pipelines. Developed by Praalak Tech Solutions, it introduces a permission-aware architecture that enforces access control at the retrieval stage rather than through post-hoc filtering. By integrating with OpenFGA, it ensures that only authorized content is ever retrieved from a vector database, effectively preventing unauthorized documents from being injected into the Large Language Model (LLM) prompt. This approach is intended for organizations operating in multi-tenant or sensitive environments where strict data isolation is a regulatory or security requirement.

RAGAuth acts as an intermediary between the application and the vector database. It identifies the caller from a JSON Web Token (JWT) and resolves that caller's permissions before any vector similarity search takes place. The system uses OpenFGA to model relationship-based access control and translates the resulting policies into a per-request document allow-list. The allow-list is applied as a strict payload filter during the HNSW graph traversal, so the vector engine only considers documents the user is explicitly permitted to access. Because restricted documents never enter the candidate pool, they cannot surface in AI responses, however high their similarity score.
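
The pre-filtering idea can be illustrated with a minimal sketch. It assumes a brute-force cosine search over an in-memory dictionary as a stand-in for a real vector engine's filtered HNSW traversal; the names prefiltered_search and INDEX are hypothetical and are not part of RAGAuth.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def prefiltered_search(query, index, allow_list, k=2):
    """Drop every document outside allow_list, then rank what is left.

    Filtering happens before scoring, so a restricted document is never
    a candidate, no matter how similar it is to the query.
    """
    candidates = {doc_id: vec for doc_id, vec in index.items() if doc_id in allow_list}
    ranked = sorted(candidates, key=lambda d: cosine(query, candidates[d]), reverse=True)
    return ranked[:k]

# Hypothetical toy index: document ID -> embedding.
INDEX = {
    "doc-a": [1.0, 0.0],   # closest to the query below, but not allowed
    "doc-b": [0.9, 0.1],
    "doc-c": [0.0, 1.0],
}

results = prefiltered_search([1.0, 0.0], INDEX, allow_list={"doc-b", "doc-c"})
```

In this example doc-a is the nearest neighbour of the query, yet it is absent from results because it was removed from the candidate pool before any score was computed.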

Key features include:

Pre-filter Architecture: Enforces security policies before vector similarity scoring to ensure restricted documents are never part of the search candidate pool.
OpenFGA Integration: Leverages robust relationship-based access control models to manage permissions across teams, roles, and tenants.
Identity-Based Scoping: Derives the retrieval scope from the identity claims in the caller's JWT rather than from request parameters, a design intended to rule out cross-tenant data access at the architectural level.
Instant Revocation: Reflects permission changes immediately at query time without requiring costly re-indexing of vector data.
LLM-Agnostic: Operates exclusively at the retrieval layer, allowing compatibility with OpenAI, Ollama, and various other local or cloud-based LLM providers.
Self-Hostable: Offers an open-source core released under the MIT license, with simple deployment via Docker Compose.
Compliance Support: Includes features such as audit log exports designed to assist with EU AI Act Article 12 compliance requirements.
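
The relationship-based model and the instant-revocation behaviour listed above can be sketched as follows. This is a deliberately simplified, in-memory stand-in for a relationship store and does not use the OpenFGA SDK or its modelling language; the TupleStore class, the "member" and "viewer" relation names, and the identifiers are assumptions made for illustration.

```python
class TupleStore:
    """Toy store of (subject, relation, object) relationship tuples."""

    def __init__(self):
        self._tuples = set()

    def write(self, subject, relation, obj):
        self._tuples.add((subject, relation, obj))

    def delete(self, subject, relation, obj):
        self._tuples.discard((subject, relation, obj))

    def list_viewable(self, user):
        """Return every object the user can view, directly or via a team."""
        # The user plus each team the user is a "member" of.
        subjects = {user} | {
            obj for (subj, rel, obj) in self._tuples
            if subj == user and rel == "member"
        }
        return {
            obj for (subj, rel, obj) in self._tuples
            if subj in subjects and rel == "viewer"
        }

store = TupleStore()
store.write("user:anne", "member", "team:finance")
store.write("team:finance", "viewer", "doc:budget")
store.write("user:anne", "viewer", "doc:handbook")

before = store.list_viewable("user:anne")   # includes doc:budget via the team

# Revocation: remove the membership; no vector data is touched.
store.delete("user:anne", "member", "team:finance")
after = store.list_viewable("user:anne")    # doc:budget is gone on the next query
```

Because the allow-list is recomputed from the relationship data on each request, removing a single tuple changes what the next query can retrieve, while the embeddings stay as they were.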

In operation, the system intercepts an incoming request, extracts the user's identity from the JWT, and queries the Authorizer/OpenFGA engine to determine the document allow-list for that specific request. This allow-list is then applied as a filter to the vector retrieval process. Consequently, the retriever cannot return any document outside the generated list, so the LLM never "sees" unauthorized content in its prompt and its output remains confined to authorized information. This process is transparent to the end user while providing developers with a hardened interface for their AI applications.
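
The request flow described above can be sketched end to end with the Python standard library. This is an illustrative outline under stated assumptions, not RAGAuth's actual interface: it assumes HS256-signed tokens with a "sub" claim, skips expiry and header checks that a production JWT library would perform, and treats handle_query, list_allowed and retrieve as hypothetical names.

```python
import base64
import hashlib
import hmac
import json

def _b64url_encode(raw):
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()

def _b64url_decode(part):
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))

def sign_hs256(claims, secret):
    """Build a compact HS256 token (demo helper only)."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(sig)}"

def verify_hs256(token, secret):
    """Return the claims if the signature is valid, else raise ValueError.

    Simplification: 'exp' and the header's 'alg' are not checked here.
    """
    try:
        header, payload, sig = token.split(".")
    except ValueError:
        raise ValueError("malformed token")
    expected = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(sig)):
        raise ValueError("bad signature")
    return json.loads(_b64url_decode(payload))

def handle_query(token, secret, list_allowed, retrieve, query):
    """Identity -> allow-list -> filtered retrieval, failing closed."""
    claims = verify_hs256(token, secret)       # 1. who is asking?
    allow_list = list_allowed(claims["sub"])   # 2. what may they see?
    if not allow_list:
        return []                              # no permissions: retrieve nothing
    return retrieve(query, allow_list)         # 3. search only within the allow-list

# Hypothetical wiring with in-memory stand-ins for the authorizer and retriever.
SECRET = b"demo-secret"
ACL = {"user:anne": {"doc:1"}}
DOCS = ["doc:1", "doc:2"]

token = sign_hs256({"sub": "user:anne"}, SECRET)
hits = handle_query(
    token, SECRET,
    list_allowed=lambda user: ACL.get(user, set()),
    retrieve=lambda query, allow: [d for d in DOCS if d in allow],
    query="quarterly report",
)
```

Passing the authorizer and retriever in as callables keeps the sketch independent of any particular vector database or policy engine; only the order of the three steps matters.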

Common use cases include:

Multi-tenant SaaS: Ensuring that customer documents remain strictly isolated within a single RAG pipeline to prevent cross-tenant data leaks.
Healthcare Platforms: Managing patient data access in line with HIPAA requirements by ensuring the assistant retrieves only records belonging to the authenticated patient.
ERP/Database Integration: Mirroring database-level access control lists (ACLs) directly into the vector retrieval flow to maintain enterprise security standards.
Legal and Compliance: Protecting attorney-client privilege by ensuring that sensitive matter documents from one client are never surfaced in queries related to another client.
Internal Knowledge Assistants: Restricting access to internal documentation based on employee roles, preventing staff from retrieving highly confidential financial or board-level information without proper authorization.