Examples

See below for a comprehensive series of example notebooks and applications covering txtai.

Semantic Search

Build semantic/similarity/vector/neural search applications.

| Notebook | Description |
|---|---|
| Introducing txtai ▶️ | Overview of the functionality provided by txtai |
| Build an Embeddings index with Hugging Face Datasets | Index and search Hugging Face Datasets |
| Build an Embeddings index from a data source | Index and search a data source with word embeddings |
| Add semantic search to Elasticsearch | Add semantic search to existing search systems |
| Similarity search with images | Embed images and text into the same space for search |
| Custom Embeddings SQL functions | Add user-defined functions to Embeddings SQL |
| Model explainability | Explainability for semantic search |
| Query translation | Domain-specific natural language queries with query translation |
| Build a QA database | Question matching with semantic search |
| Semantic Graphs | Explore topics, data connectivity and run network analysis |
| Topic Modeling with BM25 | Topic modeling backed by a BM25 index |
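
The index-then-search pattern these notebooks share can be sketched without any dependencies. The example below is a toy illustration only, not txtai's API: it assumes bag-of-words count vectors in place of learned embeddings, and the helper names (`vectorize`, `build_index`, `search`) are hypothetical. It shows documents being converted to vectors once, then ranked against a query by cosine similarity.

```python
import math
from collections import Counter


def vectorize(text):
    """Toy vector: token -> count. A stand-in for a learned embedding (assumption)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(documents):
    """Vectorize every document once; ids are list positions."""
    return [(uid, vectorize(text)) for uid, text in enumerate(documents)]


def search(index, query, limit=3):
    """Return the top `limit` (id, score) pairs, best match first."""
    query_vector = vectorize(query)
    scored = [(uid, cosine(query_vector, vector)) for uid, vector in index]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


documents = [
    "maine man wins lottery",
    "canada ice shelf collapses",
    "wildlife officials rescue bear cub",
]

index = build_index(documents)
results = search(index, "ice shelf", limit=1)
print(results)
```

Because the vectors here are plain word counts, the query only matches documents that share its exact tokens. A semantic search system replaces `vectorize` with a model that places related meanings near each other, which is what the notebooks above demonstrate.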

LLM

Prompt-driven search, retrieval augmented generation (RAG), pipelines and workflows that interface with large language models (LLMs).

| Notebook | Description |
|---|---|
| Prompt-driven search with LLMs | Embeddings-guided and Prompt-driven search with Large Language Models (LLMs) |
| Prompt templates and task chains | Build model prompts and connect tasks together with workflows |

Pipelines

Transform data with language model backed pipelines.

| Notebook | Description |
|---|---|
| Extractive QA with txtai | Introduction to extractive question-answering with txtai |
| Extractive QA with Elasticsearch | Run extractive question-answering queries with Elasticsearch |
| Extractive QA to build structured data | Build structured datasets using extractive question-answering |
| Apply labels with zero shot classification | Use zero shot learning for labeling, classification and topic modeling |
| Building abstractive text summaries | Run abstractive text summarization |
| Extract text from documents | Extract text from PDF, Office, HTML and more |
| Text to speech generation | Generate speech from text |
| Transcribe audio to text | Convert audio files to text |
| Translate text between languages | Streamline machine translation and language detection |
| Generate image captions and detect objects | Captions and object detection for images |
| Near duplicate image detection | Identify duplicate and near-duplicate images |

Workflows

Efficiently process data at scale.

| Notebook | Description |
|---|---|
| Run pipeline workflows ▶️ | Simple yet powerful constructs to efficiently process data |
| Transform tabular data with composable workflows | Transform, index and search tabular data |
| Tensor workflows | Performant processing of large tensor arrays |
| Entity extraction workflows | Identify entity/label combinations |
| Workflow Scheduling | Schedule workflows with cron expressions |
| Push notifications with workflows | Generate and push notifications with workflows |
| Pictures are a worth a thousand words | Generate webpage summary images with DALL-E mini |
| Run txtai with native code | Execute workflows in native code with the Python C API |

Model Training

Train NLP models.

| Notebook | Description |
|---|---|
| Train a text labeler | Build text sequence classification models |
| Train without labels | Use zero-shot classifiers to train new models |
| Train a QA model | Build and fine-tune question-answering models |
| Train a language model from scratch | Build new language models |
| Export and run models with ONNX | Export models with ONNX, run natively in JavaScript, Java and Rust |
| Export and run other machine learning models | Export and run models from scikit-learn, PyTorch and more |

Scale

Run distributed txtai, integrate with the API and cloud endpoints.

| Notebook | Description |
|---|---|
| API Gallery | Using txtai in JavaScript, Java, Rust and Go |
| Distributed embeddings cluster | Distribute an embeddings index across multiple data nodes |
| Embeddings in the Cloud | Load and use an embeddings index from the Hugging Face Hub |

Architecture

Deep dives into project architecture, data formats and performance.

| Notebook | Description |
|---|---|
| Anatomy of a txtai index | Deep dive into the file formats behind a txtai embeddings index |
| Embeddings components | Composable search with vector, SQL and scoring components |
| Customize your own embeddings database | Ways to combine vector indexes with relational databases |
| Building an efficient sparse keyword index in Python | Fast and accurate sparse keyword indexing |
| Benefits of hybrid search | Improve accuracy with a combination of semantic and keyword search |
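
To make the idea of combining semantic and keyword search concrete, here is a minimal sketch of one common way to merge two result sets. It is not taken from txtai: the min-max normalization, the `weight` parameter, and the function names are assumptions chosen for illustration, and the scores are made-up inputs rather than measured results.

```python
def normalize(scores):
    """Min-max scale a {doc_id: score} mapping into the range [0, 1]."""
    if not scores:
        return {}
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        # All scores equal: treat every document as a full match
        return {uid: 1.0 for uid in scores}
    return {uid: (value - low) / (high - low) for uid, value in scores.items()}


def hybrid(keyword_scores, vector_scores, weight=0.5):
    """
    Blend keyword and vector scores.

    weight is the share given to the vector score (assumed convention);
    a document missing from one result set contributes 0 for that side.
    """
    keyword = normalize(keyword_scores)
    vector = normalize(vector_scores)
    combined = {
        uid: weight * vector.get(uid, 0.0) + (1 - weight) * keyword.get(uid, 0.0)
        for uid in keyword.keys() | vector.keys()
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Illustrative inputs only: keyword scores are unbounded, vector scores are similarities
keyword_scores = {"a": 12.0, "b": 3.0, "c": 0.5}
vector_scores = {"a": 0.30, "b": 0.82, "d": 0.75}

ranked = hybrid(keyword_scores, vector_scores, weight=0.5)
print(ranked)
```

Normalizing first matters because keyword scores and vector similarities live on different scales; without it, the larger-valued side would dominate the blend regardless of `weight`.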

Releases

New functionality added in major releases.

| Notebook | Description |
|---|---|
| What’s new in txtai 4.0 | Content storage, SQL, object storage, reindex and compressed indexes |
| What’s new in txtai 6.0 | Sparse, hybrid and subindexes for embeddings, LLM improvements |

Applications

Series of example applications with txtai. Links to hosted versions on Hugging Face Spaces are also provided, when available.

| Application | Description | Hosted version |
|---|---|---|
| Basic similarity search | Basic similarity search example. Data from the original txtai demo. | 🤗 |
| Baseball stats | Match historical baseball player stats using vector search. | 🤗 |
| Benchmarks | Calculate performance metrics for the BEIR datasets. | Local run only |
| Book search | Book similarity search application. Index book descriptions and query using natural language statements. | Local run only |
| Image search | Image similarity search application. Index a directory of images and run searches to identify images similar to the input query. | 🤗 |
| Summarize an article | Summarize an article. Workflow that extracts text from a webpage and builds a summary. | 🤗 |
| Wiki search | Wikipedia search application. Queries Wikipedia API and summarizes the top result. | 🤗 |
| Workflow builder | Build and execute txtai workflows. Connect summarization, text extraction, transcription, translation and similarity search pipelines together to run unified workflows. | 🤗 |