Home

Build AI-powered semantic search applications

txtai executes machine-learning workflows to transform data and build AI-powered semantic search applications.

Traditional search systems use keywords to find data. Semantic search applications understand natural language and identify results that have the same meaning, not necessarily the same keywords.

Backed by state-of-the-art machine learning models, data is transformed into vector representations for search (also known as embeddings). Innovation is happening at a rapid pace: models can understand concepts in documents, audio, images and video.
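
To make the difference between keyword and embedding-based search concrete, here is a minimal sketch in plain Python. It does not use txtai or a real model: the three-dimensional vectors, the sample documents and the helper names (`cosine`, `keyword_search`, `semantic_search`) are all hand-written assumptions for illustration. A real model would produce the vectors, typically with hundreds of dimensions.

```python
import math

# Toy "embeddings", written by hand for illustration only.
# In practice a machine learning model would generate these vectors.
documents = {
    "Maine man wins $1M from lottery ticket": [0.9, 0.1, 0.0],
    "Canada's last intact ice shelf has collapsed": [0.0, 0.2, 0.9],
    "Beijing mobilises invasion craft along coast": [0.1, 0.9, 0.2],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def keyword_search(query, docs):
    """Return documents that share at least one exact word with the query."""
    terms = set(query.lower().split())
    return [text for text in docs if terms & set(text.lower().split())]


def semantic_search(query_vector, docs):
    """Return the document whose vector is closest to the query vector."""
    return max(docs, key=lambda text: cosine(query_vector, docs[text]))


query = "feel good story"
# Assumed embedding for the query, chosen to sit near the lottery headline.
query_vector = [0.8, 0.2, 0.1]

keyword_hits = keyword_search(query, documents)
best = semantic_search(query_vector, documents)

print("keyword hits:", keyword_hits)
print("semantic best match:", best)
```

The keyword search finds nothing because no document contains the words "feel", "good" or "story", while the vector comparison still ranks the lottery headline first because its vector points in nearly the same direction as the query vector.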

Summary of txtai features:

🔎 Large-scale similarity search with multiple index backends (Faiss, Annoy, Hnswlib) and support for external vector databases
📄 Create embeddings for text snippets, documents, audio, images and video
💡 Machine-learning pipelines that run question-answering, labeling, transcription, translation, summarization, LLM prompts and more
↪️️ Workflows to join pipelines together and aggregate business logic. txtai processes can be microservices or full-fledged indexing workflows.
⚙️ Build with Python or YAML. API bindings available for JavaScript, Java, Rust and Go.
☁️ Cloud-native architecture that scales out with container orchestration systems (e.g. Kubernetes)
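
The idea of joining pipelines into a workflow can be sketched in plain Python. This is only a conceptual illustration, not txtai's workflow API: the `workflow` helper and the two placeholder steps (`normalize`, `first_sentence`) are hypothetical names, and the placeholder steps stand in for real model-backed pipelines such as summarization.

```python
from typing import Callable, Iterable, List

# In this sketch a "pipeline" is any callable that maps one string to another.
Pipeline = Callable[[str], str]


def workflow(steps: List[Pipeline]) -> Callable[[Iterable[str]], List[str]]:
    """Chain pipelines so each step consumes the previous step's output."""

    def run(items: Iterable[str]) -> List[str]:
        results = list(items)
        for step in steps:
            results = [step(item) for item in results]
        return results

    return run


def normalize(text: str) -> str:
    """Placeholder step: collapse repeated whitespace."""
    return " ".join(text.split())


def first_sentence(text: str) -> str:
    """Placeholder step standing in for a summarization pipeline."""
    return text.split(".")[0]


process = workflow([normalize, first_sentence])
output = process(["  txtai   runs workflows.  It also   builds search applications. "])
print(output)
```

Keeping each step a small callable means steps can be reordered or swapped without touching the others, which is the property the workflow feature above describes.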

Applications range from similarity search to NLP-driven data extractions that generate structured data. Semantic workflows transform and find data driven by user intent.

The following applications are powered by txtai.

Application	Description
paperai	Semantic search and workflows for medical/scientific papers
codequestion	Semantic search for developers
tldrstory	Semantic search for headlines and story text
neuspo	Fact-driven, real-time sports event and news site

txtai is built with Python 3.7+, Hugging Face Transformers, Sentence Transformers and FastAPI.