Introduction
Manticore Search is a multi-storage database specifically designed for search, with robust full-text search capabilities.
Manticore Search is an open-source database (available on GitHub) created in 2017 as a continuation of the Sphinx Search engine. Our development team took the best features of Sphinx and significantly improved its functionality, fixing hundreds of bugs along the way (as detailed in our Changelog). With its code almost completely rewritten, Manticore Search is now a modern, fast, lightweight, and full-featured database with exceptional full-text search capabilities.
Our key features are:
Powerful and fast full-text searching which works fine for small and big datasets
- Over 20 full-text operators and over 20 ranking factors
- Custom ranking
- Stemming
- Lemmatization
- Stopwords
- Synonyms
- Wordforms
- Advanced tokenization at character and word level
- Proper Chinese segmentation
- Text highlighting
Multithreading
Manticore Search uses smart query parallelization to lower response times and fully utilize all CPU cores when needed.
Cost-based query optimizer
A cost-based query optimizer uses statistical data about the indexed data to evaluate the relative costs of different execution plans for a given query. This allows the optimizer to determine the most efficient plan for retrieving the desired results, taking into account factors such as the size of the indexed data, the complexity of the query, and the available resources.
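As a rough illustration of the idea, here is a minimal sketch of cost-based plan selection. The plan names, the statistics, and the cost weights are all hypothetical and chosen only for the example; this is not Manticore's actual cost model.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    name: str
    estimated_cost: float


def choose_plan(total_docs: int, filter_selectivity: float, fulltext_matches: int) -> Plan:
    """Pick the cheapest of three hypothetical execution plans.

    The weights (4.0 and 1.5) are made-up per-document costs, not real numbers.
    """
    docs_passing_filter = total_docs * filter_selectivity
    plans = [
        # Read every document and test the filter on each one.
        Plan("full_scan", float(total_docs)),
        # Jump straight to the documents that pass the filter via an index;
        # each index lookup is assumed to cost more than a sequential read.
        Plan("secondary_index", docs_passing_filter * 4.0),
        # Walk the full-text matches first, then apply the filter to each.
        Plan("fulltext_then_filter", fulltext_matches * 1.5),
    ]
    return min(plans, key=lambda plan: plan.estimated_cost)


if __name__ == "__main__":
    # A very selective filter makes the index plan the cheapest.
    print(choose_plan(1_000_000, 0.001, 500_000).name)
```

The point of the sketch is only that the choice depends on statistics about the data: the same query shape can get a different plan when the filter matches most documents or when the full-text part matches very few.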
Storage options
Manticore offers both row-wise and column-oriented storage options to accommodate datasets of various sizes. The traditional and default row-wise storage option is available for datasets of all sizes - small, medium, and large, while the columnar storage option is provided through the Manticore Columnar Library for even larger datasets. The key difference between these storage options is that row-wise storage requires all attributes (excluding full-text fields) to be kept in RAM for optimal performance, while columnar storage does not, thus offering lower RAM consumption, but with a potential for slightly slower performance (as demonstrated by the statistics on https://db-benchmarks.com/).
Automatic secondary indexes
Manticore Columnar Library uses the Piecewise Geometric Model index (PGM-index), which exploits a learned mapping between the indexed keys and their location in memory. The succinctness of this mapping, coupled with a peculiar recursive construction algorithm, makes the PGM-index a data structure that dominates traditional indexes by orders of magnitude in space while still offering the best query and update time performance. Secondary indexes are ON by default for all numeric fields.
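To give a feel for what a "learned mapping between keys and their location" means, here is a deliberately simplified sketch. It is not the PGM-index itself: it uses fixed-size segments and a single level instead of the optimal piecewise segmentation and recursive construction. The class name and the `segment_size` parameter are assumptions made for the example.

```python
from bisect import bisect_left, bisect_right


class LearnedIndexSketch:
    """Toy learned index: one straight line per fixed-size segment of sorted keys."""

    def __init__(self, keys, segment_size=64):
        self.keys = sorted(set(keys))
        self.segments = []  # (first_key, slope, start_position, max_error)
        for start in range(0, len(self.keys), segment_size):
            chunk = self.keys[start:start + segment_size]
            first, last = chunk[0], chunk[-1]
            slope = (len(chunk) - 1) / (last - first) if last != first else 0.0
            # Largest distance between the predicted and the real position.
            worst = max(
                abs(start + i - self._predict(start, slope, first, key))
                for i, key in enumerate(chunk)
            )
            self.segments.append((first, slope, start, int(worst) + 1))
        self.first_keys = [segment[0] for segment in self.segments]

    @staticmethod
    def _predict(start, slope, first, key):
        return start + slope * (key - first)

    def find(self, key):
        """Return the position of key in the sorted key list, or None if absent."""
        if not self.keys:
            return None
        n = len(self.keys)
        seg = max(bisect_right(self.first_keys, key) - 1, 0)
        first, slope, start, err = self.segments[seg]
        predicted = int(self._predict(start, slope, first, key))
        # Only a small window around the prediction needs to be searched.
        lo = min(max(predicted - err, 0), n)
        hi = max(min(predicted + err + 1, n), lo)
        pos = bisect_left(self.keys, key, lo, hi)
        if pos < n and self.keys[pos] == key:
            return pos
        return None
```

Each segment stores only a few numbers regardless of how many keys it covers, which is where the space saving of this family of indexes comes from; the lookup then corrects the prediction with a binary search over a narrow window.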
SQL-first
Manticore’s native syntax is SQL. It supports SQL over HTTP and over the MySQL protocol, so you can connect with popular MySQL clients from any programming language.
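A minimal sketch of sending SQL over HTTP from Python, using only the standard library. The host, port 9308, the form-encoded `query` parameter, and the `products` table are assumptions for illustration; check them against your installation and version before relying on them.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_sql_request(sql, host="localhost", port=9308):
    """Build (but do not send) a POST request to the /sql endpoint.

    Host, port, and the 'query' form parameter are assumed defaults.
    """
    body = urlencode({"query": sql}).encode("utf-8")
    return Request(f"http://{host}:{port}/sql", data=body, method="POST")


def run_sql(sql, **kwargs):
    """Send the query and decode the JSON response (needs a running server)."""
    with urlopen(build_sql_request(sql, **kwargs)) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # 'products' is a hypothetical table name.
    request = build_sql_request("SELECT * FROM products WHERE MATCH('bag')")
    print(request.get_method(), request.full_url)
```

The same statement could be sent unchanged through any MySQL client library, which is the practical benefit of the SQL-first design.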
JSON over HTTP
For a more programmatic approach to managing data and schemas, Manticore provides an HTTP JSON protocol similar to that of Elasticsearch.
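The sketch below shows why a JSON interface is convenient to use programmatically: the search request is just a data structure that code can build and modify. The `products` table and `title` field are hypothetical, and the `table` key is an assumption that may differ between versions (older releases used `index`).

```python
import json


def build_search_request(table, field, text, limit=10):
    """Build the JSON body for a full-text search request.

    The key names follow the Elasticsearch-like style of the HTTP JSON
    protocol; treat them as assumptions and check them for your version.
    """
    return {
        "table": table,
        "query": {"match": {field: text}},
        "limit": limit,
    }


if __name__ == "__main__":
    body = build_search_request("products", "title", "leather bag", limit=5)
    # This string would be sent as the body of a POST request to the search endpoint.
    print(json.dumps(body))
```

Because the request is a plain dictionary, adding a filter or changing the limit is an ordinary data manipulation rather than string concatenation.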
Elasticsearch-compatible writes
You can execute Elasticsearch-compatible insert and replace JSON queries, which enables using Manticore with tools like Logstash (version < 7.13), Filebeat, and other tools from the Beats family.
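For illustration, this is the newline-delimited JSON shape that Elasticsearch-style bulk writes use and that tools such as Filebeat produce. The sketch only builds the request body; the `logs` table name is hypothetical, and which endpoint accepts this body in your Manticore version is an assumption to verify.

```python
import json


def build_bulk_body(table, documents):
    """Build an Elasticsearch-style bulk body from (id, document) pairs.

    Each document takes two lines: an action line and the document itself.
    """
    lines = []
    for doc_id, doc in documents:
        lines.append(json.dumps({"index": {"_index": table, "_id": doc_id}}))
        lines.append(json.dumps(doc))
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_bulk_body("logs", [(1, {"message": "started"}), (2, {"message": "stopped"})]))
```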
Declarative and imperative schema management
Easily create, update, and delete tables online or through a configuration file.
The benefits of C++ and the convenience of PHP
The Manticore Search daemon is developed in C++, offering fast start times and efficient memory utilization, and low-level optimizations boost performance further. Another crucial component, Manticore Buddy, is written in PHP and handles high-level functionality that does not require lightning-fast response times or extremely high processing power. Although contributing to the C++ code may be challenging, adding a new SQL/JSON command using Manticore Buddy should be straightforward.
Real-Time inserts
Newly added or updated documents can be immediately read.
Interactive courses for easy learning
We offer free interactive courses to make learning effortless.
Transactions
While Manticore is not fully ACID-compliant, it supports isolated transactions for atomic changes and binary logging for safe writes.
Built-In replication and load balancing
Data can be distributed across servers and data centers with any Manticore Search node acting as both a load balancer and a data node. Manticore implements virtually synchronous multi-master replication using the Galera library, ensuring data consistency across all nodes, preventing data loss, and providing exceptional replication performance.
Built-in backup capabilities
Manticore comes with an external tool, manticore-backup, and the BACKUP SQL command to simplify backing up and restoring your data.
Out-of-the-box data sync
The indexer tool and comprehensive configuration syntax of Manticore make it easy to sync data from sources like MySQL, PostgreSQL, ODBC-compatible databases, XML, and CSV.
Integration options
You can integrate Manticore Search with a MySQL/MariaDB server using the FEDERATED engine or via ProxySQL.
Stream filtering made easy
Manticore offers a special table type, the “percolate” table, which allows you to search queries instead of data, making it an efficient tool for filtering full-text data streams. Simply store your queries in the table, process your data stream by sending each batch of documents to Manticore Search, and receive only the results that match your stored queries.
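The following toy sketch illustrates the "search queries instead of data" idea in plain Python. It is a conceptual model only: stored queries are reduced to sets of required words, whereas a real percolate table stores full-text queries with the complete query syntax. All names are hypothetical.

```python
def matching_queries(stored_queries, document_text):
    """Return the ids of stored queries whose required words all occur in the document."""
    words = set(document_text.lower().split())
    return [
        query_id
        for query_id, required_words in stored_queries.items()
        if all(word in words for word in required_words)
    ]


def filter_stream(stored_queries, documents):
    """Yield only the documents that match at least one stored query."""
    for document in documents:
        hits = matching_queries(stored_queries, document)
        if hits:
            yield document, hits


if __name__ == "__main__":
    # The queries are stored once; the documents arrive as a stream.
    stored = {"q_shoes": {"red", "shoes"}, "q_bag": {"bag"}}
    stream = ["Red leather shoes", "Blue hat", "Small red bag"]
    for document, hits in filter_stream(stored, stream):
        print(document, hits)
```

Note the reversal of roles compared with ordinary search: the queries are the stored, long-lived data, and each incoming document acts as the thing being looked up.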
Possible applications:
Manticore has a variety of use cases, including:
- Full-text search
  - With small data volumes, enjoy the benefits of powerful full-text search syntax and low memory consumption (as low as 7-8 MB).
  - With large data, benefit from Manticore’s high availability and ability to handle massive tables.
- OLAP: Use Manticore Search and the Manticore Columnar Library to analyze terabytes of data on one or more servers.
- Faceted search
- Geo-spatial search
- Spell correction
- Autocomplete
- Data stream filtering
Read this first
About this manual
The manual is arranged as a reflection of the most likely way you would use Manticore:
- starting from some basic information about it and how to install and connect
- through some essential things like adding documents and running searches
- to some performance optimization tips and tricks and extending Manticore with the help of plugins and custom functions
Do not skip ✔️
Key sections of the manual are marked with a ✔️ sign in the menu for your convenience, since their functionality is the most used. If you are new to Manticore, we highly recommend not skipping them.
Quick start guide
If you want a quick understanding of how Manticore works in general, the ⚡ Quick start guide section is a good place to start.
Using examples
Each query example has a small 📋 icon in the top-right corner. You can use it to copy the example to the clipboard. If the query is an HTTP request, it is copied as a curl command. You can configure the host/port by pressing the settings icon.
Search in this manual
We love search and we’ve done our best to make searching in this manual as convenient as possible. Of course, it’s backed by Manticore Search. Besides using the search bar, which requires opening the manual first, there is a very easy way to find something: just open mnt.cr/your-search-keyword.
Best practices
There are a few things you need to understand about Manticore Search that will help you follow the best practices for using it.
Real-time table vs plain table
- A real-time table allows adding, updating, and deleting documents, with the changes available immediately.
- A plain table is a mostly immutable data structure and a basic building block of real-time tables. A plain table stores a set of documents, their common dictionary, and indexing settings. One real-time table can consist of multiple plain tables (chunks), but Manticore also provides direct access to building plain tables with the indexer tool. This makes sense when your data is mostly immutable and you therefore don’t need a real-time table.
Real-time mode vs plain mode
Manticore Search works in two modes:
- Real-time mode (RT mode). This is the default mode and lets you manage your data schema imperatively:
  - allows managing your data schema online using the SQL commands CREATE/ALTER/DROP TABLE and their equivalents in non-SQL clients
  - in the configuration file you need to define only server-related settings, including data_dir
- Plain mode lets you define your data schemas in a configuration file, i.e. it provides a declarative kind of schema management. It makes sense in three cases:
  - when you only deal with plain tables
  - when your data schema is very stable and you don’t need replication (as it’s available only in the RT mode)
  - when you have to make your data schema portable (e.g. for easier deployment on a new server)
You cannot combine the two modes. You choose between them by whether you specify data_dir in your configuration file: specifying it (the default behaviour) selects the RT mode. If you are unsure, we recommend the RT mode, because even if you need a plain table, you can build it with a separate plain table config and import it into your main Manticore instance.
Real-time tables can be used in both RT and plain modes. In the RT mode a real-time table is defined with a CREATE TABLE command, while in the plain mode it is defined in the configuration file. Plain (offline) tables are supported only in the plain mode. Plain tables cannot be created in the RT mode, but existing plain tables made in the plain mode can be converted to real-time tables and imported in the RT mode.
SQL vs JSON
Manticore provides multiple ways and interfaces to manage your schemas and data, but the two main ones are:
- SQL. This is Manticore’s native language, which enables all of Manticore’s functionality. The best practice is to use SQL to:
  - manage your schemas and do other DBA routines, as it’s the easiest way to do that
  - design your queries, as SQL is much closer to natural language than the JSON DSL, which is important when you design something new. You can use Manticore SQL via any MySQL client or /sql.
- JSON. Most functionality is also available via a JSON domain-specific language. This is especially useful when you integrate Manticore with your application, as JSON lets you do it more programmatically than SQL. The best practice is to first explore how to do something via SQL and then use JSON to integrate it into your application.