Data sources
Requirements
Data sources are available starting with the Business license. The available storage quota depends on your license:
| License | Storage quota |
|---|---|
| Business | 1 GB per license, shared in the workspace |
| Max | 10 GB per license, shared in the workspace |
The quota is shared across the workspace: all data sources, personal and workspace alike, count toward it. You can see the current usage on the data sources page under Settings, Data sources.
If you want to connect larger data sources, get in touch with us. We can enable additional quotas.
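As a rough illustration of how the shared pool adds up, the following sketch computes a workspace quota from license counts. The function names are hypothetical, and the rule that quotas of different license tiers simply add together is an assumption; the table above only states the per-license values.

```python
# Per-license quotas in GB, taken from the table above.
QUOTA_GB_PER_LICENSE = {"Business": 1, "Max": 10}


def workspace_quota_gb(license_counts: dict[str, int]) -> int:
    """Sum the per-license quotas into one shared workspace pool.

    Assumption: quotas of different license tiers simply add up.
    """
    return sum(
        QUOTA_GB_PER_LICENSE[tier] * count
        for tier, count in license_counts.items()
    )


def remaining_gb(license_counts: dict[str, int], used_gb: float) -> float:
    """Remaining shared storage; personal and workspace sources count together."""
    return workspace_quota_gb(license_counts) - used_gb
```

For example, a workspace with five Business licenses would share a pool of 5 GB, regardless of which user or data source consumes it.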
What are data sources?
Data sources connect external file storage with 9brains. Documents from these stores are indexed automatically and become available to the AI as searchable knowledge.
This approach is also known as RAG, short for Retrieval-Augmented Generation. The idea: before the AI formulates an answer, it first searches the connected documents for relevant information. That way, answers are based not only on the general language model but also on the actual content of your files, and they include source references.
Without data sources, the AI uses only knowledge from manually maintained knowledge bases.
With data sources, it can additionally:
- Search documents in OneDrive or SharePoint
- Provide answers with source references to the original files
- Automatically detect and index new and changed files
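To make the retrieval idea more concrete, here is a deliberately simplified sketch of "search first, then answer". It ranks documents by plain word overlap with the question and builds a prompt that carries the source references. This is not how 9brains is implemented: production RAG systems typically use embeddings and a vector index, and all names below (`Document`, `retrieve`, `build_prompt`) are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Document:
    source: str  # hypothetical file path, used as the source reference
    text: str


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, docs: list[Document], k: int = 2) -> list[Document]:
    """Return up to k documents that share the most words with the query."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(d.text)), d) for d in docs]
    scored = [(score, d) for score, d in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:k]]


def build_prompt(query: str, hits: list[Document]) -> str:
    """Place the retrieved excerpts, tagged with their source, before the question."""
    context = "\n".join(f"[{d.source}] {d.text}" for d in hits)
    return (
        "Answer using only these excerpts and cite the source in brackets:\n"
        f"{context}\n\nQuestion: {query}"
    )
```

The prompt produced by `build_prompt` is what would be handed to the language model, which is why the answer can point back to the original file.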
Two types of data sources
Personal data sources
Personal data sources are visible only to the user who connected them. Other users and administrators cannot view the file listings.
Example: An employee connects their personal OneDrive. Only they can search the files from it in the chat.
Workspace data sources
Workspace data sources are available to all users in the workspace. They are set up by administrators and are well suited for shared files.
Example: An administrator connects a SharePoint Document Library with company documents. All employees can search these documents in the chat.
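The difference between the two types comes down to a visibility rule, which can be sketched as follows. The `DataSource` type and the `can_search` function are hypothetical and only restate the behaviour described above; they are not part of any 9brains API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataSource:
    name: str
    kind: str                    # "personal" or "workspace"
    owner: Optional[str] = None  # user who connected a personal source


def can_search(user: str, source: DataSource, is_admin: bool = False) -> bool:
    """Decide whether a user may search the files of a data source."""
    if source.kind == "workspace":
        # Workspace data sources are available to all users in the workspace.
        return True
    # Personal data sources: only the owner. The is_admin flag is
    # deliberately ignored, since administrators get no access either.
    return source.owner == user
```

With a personal source owned by `"alice"`, `can_search("bob", source, is_admin=True)` is still `False`, which mirrors the privacy guarantee for personal data sources.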
Looking ahead: more storage types planned
Additional storage types will soon be supported as native data sources with automatic indexing:
- Windows File Server, NAS systems, and SMB shares
- S3-compatible storage (e.g. Amazon S3, MinIO)
Already possible today, but without automatic indexing: the On-premises connector lets you establish a network connection to internal file servers, NAS, or SMB systems. Until native indexing is available, you need a custom skill that makes the files readable for the AI.
Overview: What can I do?
| Action | Who can do it? |
|---|---|
| Connect a personal data source (OneDrive) | All users |
| Search files in the chat | All users |
| Connect a workspace data source (SharePoint) | Administrators |
| Trigger a manual sync | Owner / Admin |
| Delete a data source | Owner / Admin |
Navigation
Data source management is available in the settings:
Settings, Data sources
The page shows three areas:
- My data sources: personal data sources of the current user
- Workspace data sources: available to all users in the workspace (SharePoint)
- Personal data sources (admins only): data sources of other users
Each data source shows:
- Name and type (OneDrive, SharePoint)
- Status: active, sync running, error
- Last sync: time of the last successful sync
- File count: how many files are indexed
Supported file types
Data sources index all file types that 9brains can process:
| Category | File types |
|---|---|
| Documents | PDF, DOCX, PPTX, XLSX |
| Text | TXT, MD, CSV, HTML |
| Images | PNG, JPG, JPEG, TIFF, BMP, GIF |
Unsupported file types (e.g. videos, ZIP archives) are skipped automatically.
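The skip behaviour can be pictured as a simple extension filter. The sketch below is illustrative only: the extension set is copied from the table above, while the function names and the case-insensitive matching are assumptions.

```python
from pathlib import Path

# Extensions from the supported file types table.
SUPPORTED_EXTENSIONS = {
    "pdf", "docx", "pptx", "xlsx",               # documents
    "txt", "md", "csv", "html",                  # text
    "png", "jpg", "jpeg", "tiff", "bmp", "gif",  # images
}


def is_indexable(filename: str) -> bool:
    """True if the file extension appears in the supported set.

    Assumption: matching ignores upper and lower case.
    """
    return Path(filename).suffix.lower().lstrip(".") in SUPPORTED_EXTENSIONS


def split_files(filenames: list[str]) -> tuple[list[str], list[str]]:
    """Separate files into (indexed, skipped) while keeping their order."""
    indexed = [f for f in filenames if is_indexable(f)]
    skipped = [f for f in filenames if not is_indexable(f)]
    return indexed, skipped
```

Calling `split_files(["report.PDF", "intro.mp4", "notes.md", "backup.zip"])` puts the PDF and the Markdown file in the first list and the video and the archive in the second.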
Synchronization
Data sources are synchronized automatically:
- Initial indexing: When the data source is created, all files are indexed
- Real-time updates: OneDrive and SharePoint are connected via webhook; changes to files are typically detected immediately and processed via delta sync
- Regular reconciliation: In addition, a full reconciliation runs every 60 minutes to make sure no changes are missed
- Change detection: Only new, changed, or deleted files are processed; unchanged files are not re-indexed
You can also trigger the sync manually: click on the data source and choose “Start sync”.
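The change detection described in this section can be thought of as a comparison between what is already indexed and what the storage currently contains. The following sketch shows that idea under stated assumptions: each file is identified by its path and a fingerprint (for example a content hash or version tag), and `plan_sync` is a hypothetical helper, not the actual sync implementation.

```python
def plan_sync(indexed: dict[str, str], remote: dict[str, str]) -> dict[str, list[str]]:
    """Compare file fingerprints and return what needs processing.

    Both dicts map a file path to a fingerprint. Files whose fingerprint
    is identical on both sides appear in no list and are not re-indexed.
    """
    return {
        "new": sorted(p for p in remote if p not in indexed),
        "changed": sorted(
            p for p in remote if p in indexed and indexed[p] != remote[p]
        ),
        "deleted": sorted(p for p in indexed if p not in remote),
    }
```

A webhook-triggered delta sync and the 60-minute reconciliation could both feed into a comparison like this; the reconciliation simply runs it over the complete listing so that a missed notification is eventually caught.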
Frequently asked questions
Which license do I need?
Data sources are available starting with the Business license. The feature is not accessible with the Pro or Starter license.
What happens when the storage quota is exhausted?
Existing data sources remain searchable, but new files are no longer indexed until storage becomes available again. You can delete data sources to free up space.
Can I connect multiple OneDrive accounts?
No. Each user can connect one personal OneDrive. For shared files, SharePoint data sources are the right fit.
Can I connect multiple SharePoint sites?
Yes. Administrators can connect any number of SharePoint Document Libraries as separate workspace data sources, as long as the storage quota is sufficient.
Are deleted files removed from the index immediately?
Deleted files are typically detected immediately via webhook and removed from the index. At the latest, they are removed during the next regular reconciliation (every 60 minutes).
Who can see my personal data sources?
No one but you. In the overview, administrators only see that a personal data source exists; the file listings are not visible for privacy reasons.