$text
Definition
$text
Changed in version 3.2.
$text performs a text search on the content of the fields indexed with a text index. A $text expression has the following syntax:
- {
- $text:
- {
- $search: <string>,
- $language: <string>,
- $caseSensitive: <boolean>,
- $diacriticSensitive: <boolean>
- }
- }
The $text operator accepts a text query document with the following fields:
Field | Type | Description
$search | string | A string of terms that MongoDB parses and uses to query the text index. MongoDB performs a logical OR search of the terms unless specified as a phrase. See Behavior for more information on the field.
$language | string | Optional. The language that determines the list of stop words for the search and the rules for the stemmer and tokenizer. If not specified, the search uses the default language of the index. For supported languages, see Text Search Languages. If you specify a language value of "none", then the text search uses simple tokenization with no list of stop words and no stemming.
$caseSensitive | boolean | Optional. A boolean flag to enable or disable case sensitive search. Defaults to false; i.e. the search defers to the case insensitivity of the text index. For more information, see Case Insensitivity. New in version 3.2.
$diacriticSensitive | boolean | Optional. A boolean flag to enable or disable diacritic sensitive search against version 3 text indexes. Defaults to false; i.e. the search defers to the diacritic insensitivity of the text index. Text searches against earlier versions of the text index are inherently diacritic sensitive and cannot be diacritic insensitive. As such, the $diacriticSensitive option has no effect with earlier versions of the text index. For more information, see Diacritic Insensitivity. New in version 3.2.
The $text operator, by default, does not return results sorted in terms of the results’ scores. For more information on sorting by the text search scores, see the Text Score documentation.
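As a minimal sketch of the syntax above, the following builds a $text query document in plain JavaScript. The helper name buildTextQuery and its options object are hypothetical and not part of MongoDB; only the $search, $language, $caseSensitive, and $diacriticSensitive field names come from this page.

```javascript
// Hypothetical helper (not a MongoDB API): builds a $text query document
// and includes only the optional fields the caller actually set.
function buildTextQuery(search, options = {}) {
  const text = { $search: search };
  if (options.language !== undefined) text.$language = options.language;
  if (options.caseSensitive !== undefined) text.$caseSensitive = options.caseSensitive;
  if (options.diacriticSensitive !== undefined) {
    text.$diacriticSensitive = options.diacriticSensitive;
  }
  return { $text: text };
}

const exampleQuery = buildTextQuery("leche", { language: "es", diacriticSensitive: true });
console.log(JSON.stringify(exampleQuery));
// {"$text":{"$search":"leche","$language":"es","$diacriticSensitive":true}}
```

The resulting document could then be passed as the query argument to find(), for example db.articles.find(exampleQuery) in the shell.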
Behavior
Restrictions
- A query can specify, at most, one $text expression.
- The $text query cannot appear in $nor expressions.
- The $text query cannot appear in $elemMatch query expressions or $elemMatch projection expressions.
- To use a $text query in an $or expression, all clauses in the $or array must be indexed.
- You cannot use hint() if the query includes a $text query expression.
- You cannot specify $natural sort order if the query includes a $text expression.
- You cannot combine the $text expression, which requires a special text index, with a query operator that requires a different type of special index. For example, you cannot combine a $text expression with the $near operator.
- Views do not support text search.
If using the $text operator in aggregation, the following restrictions also apply.
- The $match stage that includes a $text must be the first stage in the pipeline.
- A $text operator can only occur once in the stage.
- The $text operator expression cannot appear in $or or $not expressions.
- The text search, by default, does not return the matching documents in order of matching scores. To sort by score, use the $meta aggregation expression in the $sort stage.
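The aggregation restrictions above can be sketched as a pipeline definition. The pipeline below is illustrative only; textMatchIsFirst is a hypothetical helper, not a MongoDB function, and it only inspects a top-level $text inside each $match stage.

```javascript
// Pipeline sketch: the $match stage with $text comes first, and the
// $sort stage uses { $meta: "textScore" } to order results by relevance.
const pipeline = [
  { $match: { $text: { $search: "cake" } } },
  { $sort: { score: { $meta: "textScore" } } },
  { $project: { subject: 1, _id: 0 } }
];

// Hypothetical check mirroring the first restriction: a $match stage that
// contains $text must be the first stage in the pipeline.
function textMatchIsFirst(stages) {
  const index = stages.findIndex(
    (stage) => stage.$match !== undefined && stage.$match.$text !== undefined
  );
  return index <= 0; // -1: no $text stage at all; 0: the $text stage is first
}

console.log(textMatchIsFirst(pipeline)); // true
```

In a shell session, the pipeline array would be passed to db.articles.aggregate(pipeline).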
$search Field
In the $search field, specify a string of words that the $text operator parses and uses to query the text index.
The $text operator treats most punctuation in the string as delimiters, except a hyphen-minus (-) that negates a term or escaped double quotes (\") that specify a phrase.
Phrases
To match on a phrase, as opposed to individual terms, enclose the phrase in escaped double quotes (\"), as in:
- "\"ssl certificate\""
If the $search string includes a phrase and individual terms, text search will only match the documents that include the phrase.
For example, passed a $search string:
- "\"ssl certificate\" authority key"
The $text operator searches for the phrase "ssl certificate".
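The backslashes in the strings above are string-literal escaping; the $search value itself contains plain double quotes around the phrase. A small sketch, assuming a hypothetical helper named phrase that is not part of MongoDB:

```javascript
// Hypothetical helper: wraps words in double quotes so that $text
// treats them as a single phrase inside the $search string.
function phrase(words) {
  return '"' + words + '"';
}

const phraseSearch = phrase("ssl certificate") + " authority key";
console.log(phraseSearch); // "ssl certificate" authority key

// The escaped literal from the text above denotes the same string.
console.log(phraseSearch === "\"ssl certificate\" authority key"); // true
```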
Negations
Prefixing a word with a hyphen-minus (-) negates a word:
- The negated word excludes documents that contain the negated word from the result set.
- When passed a search string that only contains negated words, text search will not match any documents.
- A hyphenated word, such as pre-market, is not a negation. If used in a hyphenated word, the $text operator treats the hyphen-minus (-) as a delimiter. To negate the word market in this instance, include a space between pre and -market, i.e., pre -market.
The $text operator adds all negations to the query with the logical AND operator.
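The negation rules above can be sketched as a small string builder. buildSearchString is a hypothetical helper, and throwing on a negation-only search is a design choice of this sketch (MongoDB itself simply returns no documents in that case).

```javascript
// Hypothetical helper: joins included terms and negated terms with spaces,
// so a negated term is always a separate, hyphen-prefixed word.
function buildSearchString(include, exclude) {
  if (include.length === 0) {
    // Sketch-level guard: a search made only of negated words matches nothing.
    throw new Error("a search string needs at least one non-negated term");
  }
  const negated = exclude.map((term) => "-" + term);
  return include.concat(negated).join(" ");
}

console.log(buildSearchString(["coffee"], ["shop"])); // coffee -shop
console.log(buildSearchString(["pre"], ["market"]));  // pre -market
```

Keeping the space before the hyphen is what distinguishes the negation pre -market from the hyphenated word pre-market.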
Match Operation
Stop Words
The $text operator ignores language-specific stop words, such as "the" and "and" in English.
Stemmed Words
For case insensitive and diacritic insensitive text searches, the $text operator matches on the complete stemmed word. So if a document field contains the word blueberry, a search on the term blue will not match. However, blueberry or blueberries will match.
Case Sensitive Search and Stemmed Words
For case sensitive search (i.e. $caseSensitive: true), if the suffix stem contains uppercase letters, the $text operator matches on the exact word.
Diacritic Sensitive Search and Stemmed Words
For diacritic sensitive search (i.e. $diacriticSensitive: true), if the suffix stem contains the diacritic mark or marks, the $text operator matches on the exact word.
Case Insensitivity
Changed in version 3.2.
The $text operator defaults to the case insensitivity of the text index:
- The version 3 text index is case insensitive for Latin characters with or without diacritics and characters from non-Latin alphabets, such as the Cyrillic alphabet. See text index for details.
- Earlier versions of the text index are case insensitive for Latin characters without diacritic marks; i.e. for [A-z].
$caseSensitive Option
To support case sensitive search where the text index is case insensitive, specify $caseSensitive: true.
Case Sensitive Search Process
When performing a case sensitive search ($caseSensitive: true) where the text index is case insensitive, the $text operator:
- First searches the text index for case insensitive and diacritic insensitive matches.
- Then, to return just the documents that match the case of the search terms, the $text query operation includes an additional stage to filter out the documents that do not match the specified case.
For case sensitive search (i.e. $caseSensitive: true), if the suffix stem contains uppercase letters, the $text operator matches on the exact word.
Specifying $caseSensitive: true may impact performance.
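The two-step process above can be modeled in plain JavaScript. This is a conceptual sketch only: it uses whitespace splitting in place of a real text index, ignores stemming and stop words, and the function and variable names are hypothetical.

```javascript
// Conceptual model only: no stemming, no stop words, no real text index.
const sampleDocs = [
  { _id: 1, subject: "coffee" },
  { _id: 2, subject: "Coffee Shopping" },
  { _id: 7, subject: "coffee and cream" }
];

function caseSensitiveSearch(docs, term) {
  // Step 1: case insensitive lookup, standing in for the index search.
  const candidates = docs.filter((doc) =>
    doc.subject.toLowerCase().split(/\s+/).includes(term.toLowerCase())
  );
  // Step 2: filter out candidates that do not contain the term in the exact case.
  return candidates.filter((doc) => doc.subject.split(/\s+/).includes(term));
}

console.log(caseSensitiveSearch(sampleDocs, "Coffee").map((doc) => doc._id)); // [ 2 ]
```

The extra filtering pass over the candidates is the kind of additional work that can make a case sensitive search slower than the default.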
Diacritic Insensitivity
Changed in version 3.2.
The $text operator defaults to the diacritic insensitivity of the text index:
- The version 3 text index is diacritic insensitive. That is, the index does not distinguish between characters that contain diacritical marks and their non-marked counterpart, such as é, ê, and e.
- Earlier versions of the text index are diacritic sensitive.
$diacriticSensitive Option
To support diacritic sensitive text search against the version 3 text index, specify $diacriticSensitive: true.
Text searches against earlier versions of the text index are inherently diacritic sensitive and cannot be diacritic insensitive. As such, the $diacriticSensitive option for the $text operator has no effect with earlier versions of the text index.
Diacritic Sensitive Search Process
To perform a diacritic sensitive text search ($diacriticSensitive: true) against a version 3 text index, the $text operator:
- First searches the text index, which is diacritic insensitive.
- Then, to return just the documents that match the diacritic marked characters of the search terms, the $text query operation includes an additional stage to filter out the documents that do not match.
Specifying $diacriticSensitive: true may impact performance.
To perform a diacritic sensitive search against an earlier version of the text index, the $text operator searches the text index, which is diacritic sensitive.
For diacritic sensitive search, if the suffix stem contains the diacritic mark or marks, the $text operator matches on the exact word.
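The diacritic sensitive process for a version 3 text index can be modeled the same way. The sketch below is an assumption-laden illustration, not MongoDB's implementation: it folds diacritics with Unicode NFD normalization, splits on whitespace, ignores stemming, and uses hypothetical names throughout.

```javascript
// Removes combining marks after canonical decomposition (conceptual folding only).
function stripDiacritics(text) {
  return text.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
}

const cafeDocs = [
  { _id: 5, subject: "Caf\u00e9 Con Leche" }, // "Café Con Leche"
  { _id: 8, subject: "Cafe con Leche" }
];

function diacriticSensitiveSearch(docs, term) {
  const words = (doc) => doc.subject.split(/\s+/);
  const fold = (text) => stripDiacritics(text).toLowerCase();
  const exact = (text) => text.normalize("NFC").toLowerCase();
  // Step 1: diacritic (and case) insensitive lookup, standing in for the index.
  const candidates = docs.filter((doc) => words(doc).some((w) => fold(w) === fold(term)));
  // Step 2: filter out candidates whose words do not carry the same diacritics.
  return candidates.filter((doc) => words(doc).some((w) => exact(w) === exact(term)));
}

// "CAF\u00c9" is "CAFÉ"; case is still ignored, diacritics are not.
console.log(diacriticSensitiveSearch(cafeDocs, "CAF\u00c9").map((doc) => doc._id)); // [ 5 ]
```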
Text Score
The $text operator assigns a score to each document that contains the search term in the indexed fields. The score represents the relevance of a document to a given text search query. The score can be part of a sort() method specification as well as part of the projection expression. The { $meta: "textScore" } expression provides information on the processing of the $text operation. See $meta projection operator for details on accessing the score for projection or sort.
Examples
The following examples assume a collection articles that has a version 3 text index on the field subject:
- db.articles.createIndex( { subject: "text" } )
Populate the collection with the following documents:
- db.articles.insert(
- [
- { _id: 1, subject: "coffee", author: "xyz", views: 50 },
- { _id: 2, subject: "Coffee Shopping", author: "efg", views: 5 },
- { _id: 3, subject: "Baking a cake", author: "abc", views: 90 },
- { _id: 4, subject: "baking", author: "xyz", views: 100 },
- { _id: 5, subject: "Café Con Leche", author: "abc", views: 200 },
- { _id: 6, subject: "Сырники", author: "jkl", views: 80 },
- { _id: 7, subject: "coffee and cream", author: "efg", views: 10 },
- { _id: 8, subject: "Cafe con Leche", author: "xyz", views: 10 }
- ]
- )
Search for a Single Word
The following query specifies a $search string of coffee:
- db.articles.find( { $text: { $search: "coffee" } } )
This query returns the documents that contain the term coffee in the indexed subject field, or more precisely, the stemmed version of the word:
- { "_id" : 2, "subject" : "Coffee Shopping", "author" : "efg", "views" : 5 }
- { "_id" : 7, "subject" : "coffee and cream", "author" : "efg", "views" : 10 }
- { "_id" : 1, "subject" : "coffee", "author" : "xyz", "views" : 50 }
See also
Case Insensitivity, Stemmed Words
Match Any of the Search Terms
If the search string is a space-delimited string, the $text operator performs a logical OR search on each term and returns documents that contain any of the terms.
The following query specifies a $search string of three terms delimited by spaces, "bake coffee cake":
- db.articles.find( { $text: { $search: "bake coffee cake" } } )
This query returns documents that contain either bake or coffee or cake in the indexed subject field, or more precisely, the stemmed version of these words:
- { "_id" : 2, "subject" : "Coffee Shopping", "author" : "efg", "views" : 5 }
- { "_id" : 7, "subject" : "coffee and cream", "author" : "efg", "views" : 10 }
- { "_id" : 1, "subject" : "coffee", "author" : "xyz", "views" : 50 }
- { "_id" : 3, "subject" : "Baking a cake", "author" : "abc", "views" : 90 }
- { "_id" : 4, "subject" : "baking", "author" : "xyz", "views" : 100 }
See also
Case Insensitivity, Stemmed Words
Search for a Phrase
To match the exact phrase as a single term, escape the quotes.
The following query searches for the phrase coffee shop:
- db.articles.find( { $text: { $search: "\"coffee shop\"" } } )
This query returns documents that contain the phrase coffee shop:
- { "_id" : 2, "subject" : "Coffee Shopping", "author" : "efg", "views" : 5 }
Exclude Documents That Contain a Term
A negated term is a term that is prefixed by a minus sign (-). If you negate a term, the $text operator will exclude the documents that contain that term from the results.
The following example searches for documents that contain the word coffee but do not contain the term shop, or more precisely the stemmed version of the words:
- db.articles.find( { $text: { $search: "coffee -shop" } } )
The query returns the following documents:
- { "_id" : 7, "subject" : "coffee and cream", "author" : "efg", "views" : 10 }
- { "_id" : 1, "subject" : "coffee", "author" : "xyz", "views" : 50 }
Search a Different Language
Use the optional $language field in the $text expression to specify a language that determines the list of stop words and the rules for the stemmer and tokenizer for the search string.
If you specify a language value of "none", then the text search uses simple tokenization with no list of stop words and no stemming.
The following query specifies es, i.e. Spanish, as the language that determines the tokenization, stemming, and stop words:
- db.articles.find(
- { $text: { $search: "leche", $language: "es" } }
- )
The query returns the following documents:
- { "_id" : 5, "subject" : "Café Con Leche", "author" : "abc", "views" : 200 }
- { "_id" : 8, "subject" : "Cafe con Leche", "author" : "xyz", "views" : 10 }
The $text expression can also accept the language by name, spanish. See Text Search Languages for the supported languages.
Case and Diacritic Insensitive Search
Changed in version 3.2.
The $text operator defers to the case and diacritic insensitivity of the text index. The version 3 text index is diacritic insensitive and expands its case insensitivity to include the Cyrillic alphabet as well as characters with diacritics. For details, see text Index Case Insensitivity and text Index Diacritic Insensitivity.
The following query performs a case and diacritic insensitive text search for the terms сы́рники or CAFÉS:
- db.articles.find( { $text: { $search: "сы́рники CAFÉS" } } )
Using the version 3 text index, the query matches the following documents:
- { "_id" : 6, "subject" : "Сырники", "author" : "jkl", "views" : 80 }
- { "_id" : 5, "subject" : "Café Con Leche", "author" : "abc", "views" : 200 }
- { "_id" : 8, "subject" : "Cafe con Leche", "author" : "xyz", "views" : 10 }
With the previous versions of the text index, the query would not match any document.
See also
Case Insensitivity, Diacritic Insensitivity, Stemmed Words, Text Indexes
Perform Case Sensitive Search
Changed in version 3.2.
To enable case sensitive search, specify $caseSensitive: true. Specifying $caseSensitive: true may impact performance.
Case Sensitive Search for a Term
The following query performs a case sensitive search for the term Coffee:
- db.articles.find( { $text: { $search: "Coffee", $caseSensitive: true } } )
The search matches just the document:
- { "_id" : 2, "subject" : "Coffee Shopping", "author" : "efg", "views" : 5 }
See also
Case Insensitivity, Case Sensitive Search and Stemmed Words
Case Sensitive Search for a Phrase
The following query performs a case sensitive search for the phrase Café Con Leche:
- db.articles.find( {
- $text: { $search: "\"Café Con Leche\"", $caseSensitive: true }
- } )
The search matches just the document:
- { "_id" : 5, "subject" : "Café Con Leche", "author" : "abc", "views" : 200 }
See also
Case Sensitive Search and Stemmed Words, Case Insensitivity
Case Sensitivity with Negated Term
A negated term is a term that is prefixed by a minus sign (-). If you negate a term, the $text operator will exclude the documents that contain that term from the results. You can also specify case sensitivity for negated terms.
The following example performs a case sensitive search for documents that contain the word Coffee but do not contain the lower-case term shop, or more precisely the stemmed version of the words:
- db.articles.find( { $text: { $search: "Coffee -shop", $caseSensitive: true } } )
The query matches the following document:
- { "_id" : 2, "subject" : "Coffee Shopping", "author" : "efg", "views" : 5 }
See also
Case Sensitive Search and Stemmed Words, Negations
Diacritic Sensitive Search
Changed in version 3.2.
To enable diacritic sensitive search against a version 3 text index, specify $diacriticSensitive: true. Specifying $diacriticSensitive: true may impact performance.
Diacritic Sensitive Search for a Term
The following query performs a diacritic sensitive text search on the term CAFÉ, or more precisely the stemmed version of the word:
- db.articles.find( { $text: { $search: "CAFÉ", $diacriticSensitive: true } } )
The query only matches the following document:
- { "_id" : 5, "subject" : "Café Con Leche", "author" : "abc", "views" : 200 }
See also
Diacritic Sensitive Search and Stemmed Words, Diacritic Insensitivity, Case Insensitivity
Diacritic Sensitivity with Negated Term
The $diacriticSensitive option also applies to negated terms. A negated term is a term that is prefixed by a minus sign (-). If you negate a term, the $text operator will exclude the documents that contain that term from the results.
The following query performs a diacritic sensitive text search for documents that contain the term leches but not the term cafés, or more precisely the stemmed version of the words:
- db.articles.find(
- { $text: { $search: "leches -cafés", $diacriticSensitive: true } }
- )
The query matches the following document:
- { "_id" : 8, "subject" : "Cafe con Leche", "author" : "xyz", "views" : 10 }
See also
Diacritic Sensitive Search and Stemmed Words, Diacritic Insensitivity, Case Insensitivity
Return the Text Search Score
The following query searches for the term cake and returns the score assigned to each matching document:
- db.articles.find(
- { $text: { $search: "cake" } },
- { score: { $meta: "textScore" } }
- )
Each returned document includes an additional field score that contains the document’s score associated with the text search. [1]
Sort by Text Search Score
To sort by the text score, include the same $meta expression in both the projection document and the sort expression. [1]
The following query searches for the term coffee and sorts the results by the descending score:
- db.articles.find(
- { $text: { $search: "coffee" } },
- { score: { $meta: "textScore" } }
- ).sort( { score: { $meta: "textScore" } } )
The query returns the matching documents sorted by descending score.
Return Top 2 Matching Documents
Use the limit() method in conjunction with a sort() to return the top n matching documents.
The following query searches for the term coffee and sorts the results by the descending score, limiting the results to the top two matching documents:
- db.articles.find(
- { $text: { $search: "coffee" } },
- { score: { $meta: "textScore" } }
- ).sort( { score: { $meta: "textScore" } } ).limit(2)
Text Search with Additional Query and Sort Expressions
The following query searches for documents where the author equals "xyz" and the indexed field subject contains the terms coffee or bake. The operation also specifies a sort order of ascending date, then descending text search score:
- db.articles.find(
- { author: "xyz", $text: { $search: "coffee bake" } },
- { score: { $meta: "textScore" } }
- ).sort( { date: 1, score: { $meta: "textScore" } } )
See also
Text Search in the Aggregation Pipeline
[1] (1, 2) The behavior and requirements of the $meta projection operator differ from those of the $meta aggregation operator. For details on the $meta aggregation operator, see the $meta aggregation operator reference page.