Elasticsearch makes it easy to search natural language text in a language-aware way. It parses the text and applies stemming, stop words, and synonyms to improve matching. It also has a built-in notion of scoring and sorting that brings the best results to the top.
The sample data includes a plot field with short descriptions of the movies. This field is processed as natural language text. The following query runs a match search against the title field and returns only the title of each hit:
GET movies/_search
{
  "query": {
    "match": {
      "title": "star man"
    }
  },
  "_source": "title"
}
You’ll see that you get a mix of movies, including Star Wars, Star Trek, and a couple of others. By default, the match query uses a union (OR) of the terms from the query, so only one term is required to match. You could change the query to a bool query that forces both star and man to match. Let’s instead experiment with the scoring.
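For reference, the stricter variant mentioned above might look like the following sketch. It assumes the same movies index and title field as the earlier query. The first request puts one match clause per term inside a bool query's must list, so both terms are required. The second request is an alternative that gets the same effect by setting operator to and on a single match query.

```console
# Sketch: require both terms by using one must clause per term
GET movies/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "title": "star" } },
        { "match": { "title": "man" } }
      ]
    }
  },
  "_source": "title"
}

# Alternative sketch: a single match query with an AND operator
GET movies/_search
{
  "query": {
    "match": {
      "title": {
        "query": "star man",
        "operator": "and"
      }
    }
  },
  "_source": "title"
}
```

Either form should return only titles that contain both terms, so expect far fewer hits than the OR version. Returning to scoring, the next query wraps the original match query in a function_score query: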
GET movies/_search
{
  "query": {
    "function_score": {
      "query": {
        "match": {
          "title": "star man"
        }
      },
      "functions": [
        {
          "exp": {
            "year": {
              "origin": "1977",
              "scale": "1d",
              "decay": 0.5
            }
          }
        },
        {
          "script_score": {
            "script": "_score * doc['rating'].value / 100"
          }
        }
      ],
      "boost_mode": "replace"
    }
  }
}
You’ve added an exponential decay function, centered on the year 1977, so movies score lower the further their year is from 1977. You’ve also used a script to multiply Elasticsearch’s base score by each document’s rating divided by 100. This brings Star Wars to the top and sorts the rest of the movies mostly by their release date.
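To make the decay settings concrete, here is the formula Elasticsearch documents for the exp decay function, written out as a sketch. The symbols S, v, and λ are names chosen here for illustration: v is a document's value in the decayed field (year in the query above), and offset defaults to 0 when it is not set, as in the query above.

```latex
\lambda = \frac{\ln(\text{decay})}{\text{scale}}, \qquad
S(v) = \exp\bigl(\lambda \cdot \max(0,\; |v - \text{origin}| - \text{offset})\bigr)
```

With decay set to 0.5, a document exactly at the origin gets S = 1, a document one scale away gets S = 0.5, and a document two scales away gets S = 0.25. A small scale therefore makes the function score fall off very quickly as the year moves away from 1977.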