Database
The following covers available content storage configuration options.
content
content: boolean|sqlite|duckdb|client|url|custom
Enables content storage. When true, the default storage engine, sqlite, is used to save metadata alongside embeddings vectors.
Client-server connections are supported with either client or a full connection URL. When set to client, the CLIENT_URL environment variable must be set to the full connection URL. See the SQLAlchemy documentation for more information on how to construct connection strings for client-server databases.
Add custom storage engines by setting this parameter to the fully resolvable class string.
Settings specific to a content storage engine are set with a corresponding configuration object having the same name as the engine (i.e. duckdb or sqlite). None of these settings are required; any that are omitted fall back to their defaults.
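As a rough illustration of the options above, the content setting might be expressed as Python dictionaries like the ones below. The connection URL, its credentials, and the class path mypackage.storage.CustomDatabase are placeholders assumed for this sketch; they are not values defined by txtai.

```python
import os

# content=True selects the default storage engine (sqlite)
default_config = {"content": True}

# Name a built-in engine explicitly
duckdb_config = {"content": "duckdb"}

# "client" reads the connection URL from the CLIENT_URL environment variable.
# The URL below is a placeholder in SQLAlchemy format, not a real server.
os.environ["CLIENT_URL"] = "postgresql+psycopg2://user:password@localhost:5432/txtai"
client_config = {"content": "client"}

# Alternatively, pass the full connection URL directly
url_config = {"content": os.environ["CLIENT_URL"]}

# Custom engine: fully resolvable class string (hypothetical class path)
custom_config = {"content": "mypackage.storage.CustomDatabase"}
```

Each dictionary would typically be supplied as the configuration of an embeddings instance.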
sqlite
sqlite:
    wal: enable write-ahead logging - allows concurrent read/write operations,
         defaults to false
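The sketch below shows a configuration fragment that turns the option on, followed by a standard library illustration of what write-ahead logging means at the SQLite level. How txtai applies the setting internally is not covered here; the PRAGMA statement is the usual SQLite mechanism and is an assumption about what the option maps to. The file name content.db is arbitrary.

```python
import os
import sqlite3
import tempfile

# Configuration fragment: sqlite content storage with write-ahead logging enabled
config = {"content": True, "sqlite": {"wal": True}}

# Standard library illustration: switch a file-backed database to WAL mode.
# The PRAGMA returns the journal mode that is active after the call.
path = os.path.join(tempfile.mkdtemp(), "content.db")
connection = sqlite3.connect(path)
mode = connection.execute("PRAGMA journal_mode=WAL").fetchone()[0]
print(mode)
connection.close()
```

WAL mode applies to file-backed databases; an in-memory SQLite database does not switch to it.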
objects
objects: boolean|image|pickle
Enables object storage. Supports storing binary content alongside embeddings vectors and metadata. Requires content storage to also be enabled.
Object encoding options are:
- standard: Default encoder when boolean set. Encodes and decodes objects as byte arrays.
- image: Image encoder. Encodes and decodes objects as image objects.
- pickle: Pickle encoder. Encodes and decodes objects with the pickle module. Supports arbitrary objects.
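A minimal sketch of the pickle option follows: a configuration fragment, then a round trip through the pickle module to show the kind of encoding and decoding involved. The sample object is invented for illustration, and the round trip only demonstrates the standard library module, not txtai's own encoder class.

```python
import pickle

# Configuration fragment: object storage requires content storage to be enabled
config = {"content": True, "objects": "pickle"}

# What pickle encoding amounts to: arbitrary object -> bytes -> equal object
original = {"id": 1, "tags": ["a", "b"], "score": 0.5}
encoded = pickle.dumps(original)
decoded = pickle.loads(encoded)
```

Because unpickling can execute arbitrary code, the pickle option is best reserved for data from trusted sources.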
functions
functions: list
List of user-defined SQL functions. Only used when content is enabled. Each list element must be one of the following:
- function
- callable object
- dict with fields for name, argcount and function
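The three element forms listed above could look like the following sketch. The names length, Upper and reverse are hypothetical. The final lines use the standard library sqlite3 module to show what registering a user-defined SQL function involves in general; they are not txtai's own registration code.

```python
import sqlite3

# Form 1: plain function
def length(text):
    return len(text)

# Form 2: callable object
class Upper:
    def __call__(self, text):
        return text.upper()

# Form 3: dict with name, argcount and function fields
functions = [
    length,
    Upper(),
    {"name": "reverse", "argcount": 1, "function": lambda text: text[::-1]},
]

# Configuration fragment: functions are only used when content is enabled
config = {"content": True, "functions": functions}

# Standard library illustration: register the dict form and call it from SQL
connection = sqlite3.connect(":memory:")
spec = functions[2]
connection.create_function(spec["name"], spec["argcount"], spec["function"])
result = connection.execute("SELECT reverse('txtai')").fetchone()[0]
connection.close()
```

The dict form is the most explicit of the three, since it states the SQL name and argument count directly rather than leaving them to be inferred.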
query
query:
    path: sets the path for the query model - this can be any model on the
          Hugging Face Model Hub or a local file path.
    prefix: text prefix to prepend to all inputs
    maxlength: maximum generated sequence length
Query translation model. Translates natural language queries to txtai compatible SQL statements.
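A configuration fragment for query translation might look like the sketch below. The model path, prefix text and maxlength value are assumptions chosen for illustration; substitute whichever query model and settings fit the deployment.

```python
# Query translation settings; all three values are illustrative assumptions
config = {
    "content": True,
    "query": {
        # Hub model id or local file path (assumed example model)
        "path": "NeuML/t5-small-txtsql",
        # Text prepended to every input before translation (assumed prefix)
        "prefix": "translate English to SQL: ",
        # Upper bound on the length of the generated sequence
        "maxlength": 512,
    },
}
```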