⚠️
Caution! The following list are concerns when using the Automatic Document Sync feature preview.
- Increased Embedder use or cost if using third party embedder
- Corruption of local database
- Corruption of local vector database
About Automatic document sync
The Automatic Document Sync feature for AnythingLLM allows you to “watch” a document for active changes. When changes are detected the file will be re-embed and all workspaces using this file will automatically be updated.
This enables you to reference a document and have it’s content consistently updates so answers are always accurate to the original source.
Scope of documents
Docker
- Any website link
- Any file collected via a Data connector (eg: Confluence, Github, and YouTube)
- Manually uploaded files are not synced since the browser cannot read from your computer
Desktop
- Any manually uploaded local file
- Any website link
- Any file collected via a Data connector (eg: Confluence, Github, and YouTube)
Enable the feature
First, you need to enable the feature from the feature preview management page.
How to watch a file for changes
Once enabled, you will see an “eye” icon on an currently embedded file. You currently cannot watch an entire directory. If this option on the row is not available - this file is not available for watching.
If you add the same file in any other workspace you will notice the file is automatically watched. If you delete the document totally from the system, it will automatically be unwatched.
Manage and observe watched files easily
Any watched file is checked hourly if it is stale. A stale file is any file that has not had its content refreshed in the last 7 days.
In the future, you will be able to force-refresh a document or change its default stale time.
Summary and notes
Watching a file with AnythingLLM’s Automatic Document Sync will periodically fetch and replace all embeddings of that document across all of your active workspace.
This requires use of the connected embedder and therefore you may want to only watch a few files for resource reasons or cost concerns.
Currently, if you close the application or docker container, the watched files will not be synced as the background worker does not run if the process is killed.
Troubleshooting
If you are having issue with the document sync feature simply disable the toggle for the feature and it will not run any background workers while using AnythingLLM or on reboots.
Please ping the core team with a GitHub issue or Discord message for any questions or bug reports.
How does document sync work with local files?
While in beta, you should use this feature with files that update frequently. Otherwise, it wont help much!
Reminder: this is only available on AnythingLLM Desktop
On AnythingLLM Desktop you can now “watch” any locally uploaded file! Functionally, this works the exact same as watching content that comes from a website or elsewhere, but there are some tips and things to know before watching every locally uploaded file.
Only watch relevant files!
While you can watch any local files it only really makes sense to use this feature on files that can or do change a lot. For example, PDF files don’t change that often.
How often does it sync?
Files will be checked for new content every 10 minutes. The app must be open for this to occur as AnythingLLM does not minimize to the tray or taskbar when closed. If changes are found, the document content and all workspaces will be updated automatically.
How can I check it synced?
Open the feature dashboard and see when the last sync was. Currently there is no easy way to verify the content synced - it will be live soon.
How can I change how often documents sync?
You cannot modify the sync time currently.
What if I move or delete the original document from where it was during upload?
AnythingLLM cannot and does not know where a file is relocated should you move it. On the next interval sync the document will be marked as “Not Found” and it will become automatically unwatched. The existing content and embeddings will not change. You cannot update its current location reference.
Why can’t I watch a file I already uploaded?
Prior to v1.5.9 the required changes to track file locations did not exist. Any files uploaded prior are not available to be watched and should be uploaded & embedded again.