Searching for keywords in Elasticsearch


I'm indexing objects like these:

[{
  nome: "bom_atendimento",
  chaves: ["bem atendido", "atendimento bom"]
},
{
  nome: "ruim_atendimento",
  chaves: ["pessimo atendimento", "atendimento ruim"]
}]

I need these keys (chaves) to be identified in a text.

Examples of the input text and the result I expect:

1 - "This service was bad"

result:

    {
      nome: "ruim_atendimento",
      chaves: ["pessimo atendimento", "atendimento ruim"]
    }

2 - "Today I was well attended"

result:

    {
      nome: "bom_atendimento",
      chaves: ["bem atendido", "atendimento bom"]
    }

My index settings and mapping:

    {
      "settings": {
        "analysis": {
          "analyzer": {
            "custom_keyword_analyzer": {
              "type": "custom",
              "tokenizer": "standard",
              "filter": [
                "asciifolding",
                "lowercase",
                "custom_stopwords",
                "custom_stemmer"
              ]
            },
            "custom_shingle_analyzer": {
              "type": "custom",
              "tokenizer": "whitespace",
              "filter": [
                "custom_stopwords",
                "custom_stemmer",
                "asciifolding",
                "lowercase",
                "custom_shingle"
              ]
            }
          },
          "filter": {
            "custom_stemmer": {
              "type": "stemmer",
              "name": "brazilian"
            },
            "custom_stopwords": {
              "type": "stop",
              "stopwords": [
                "a",
                "as",
                "o",
                "os",
                "fui"
              ],
              "ignore_case": true
            },
            "custom_shingle": {
              "type": "shingle",
              "min_shingle_size": 2,
              "output_unigrams": false,
              "output_unigrams_if_no_shingles": true
            }
          }
        }
      },
      "mappings": {
        "meutipo": {
          "properties": {
            "nome": { "type": "keyword" },
            "chaves": {
              "type": "text",
              "analyzer": "custom_keyword_analyzer",
              "search_analyzer": "custom_shingle_analyzer"
            }
          }
        }
      }
    }

My search:

    {
      index: 'myindex',
      type: 'meutipo',
      body: {
        "query": {
          "match": { "chaves": "texto" }
        }
      }
    }

I'm not getting any results. When I remove custom_shingle from the filters of custom_shingle_analyzer, any text that contains "atendimento" returns both documents stored in Elasticsearch.
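One likely reason for the empty result can be sketched outside Elasticsearch. With `output_unigrams: false`, the search analyzer emits only two-word shingles, while the index analyzer stores single terms, so the two sides never share a token. The `shingles` helper below is hypothetical and deliberately simplified: it ignores stemming, and the stopword "fui" is dropped by hand.

```javascript
// Build word shingles (n-grams of tokens), similar in spirit to the
// Elasticsearch "shingle" token filter with output_unigrams: false.
// Simplified sketch: no stemming and no stopword handling.
function shingles(tokens, size) {
  const out = [];
  for (let i = 0; i + size <= tokens.length; i++) {
    out.push(tokens.slice(i, i + size).join(" "));
  }
  return out;
}

// Roughly what the index analyzer (no shingle filter) stores for one key.
const indexedTerms = "bem atendido".split(" ");

// Roughly what the search analyzer (shingles only) produces for the input.
const searchTerms = shingles("hoje eu bem atendido".split(" "), 2);

// No search term equals an indexed term, so a match query finds nothing.
const overlap = searchTerms.filter((t) => indexedTerms.includes(t));

console.log(indexedTerms); // → [ 'bem', 'atendido' ]
console.log(searchTerms);  // → [ 'hoje eu', 'eu bem', 'bem atendido' ]
console.log(overlap);      // → []
```

Without the shingle filter, both sides are single terms, which is why a shared word such as "atendimento" then matches every document.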

I only want a result when the text contains at least one expression that exactly matches a key (chave) stored in Elasticsearch. In my example, to get the result:

    {
      nome: "bom_atendimento",
      chaves: ["bem atendido", "atendimento bom"]
    }

The text must contain "bem atendido" ("well attended") or "atendimento bom" ("good service").
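The matching rule described here can be stated precisely with a small client-side sketch. It is not an Elasticsearch query, only an illustration of the expected behavior; `normalize` and `findEntries` are hypothetical names, and the normalization only approximates the `lowercase` and `asciifolding` filters.

```javascript
// Lowercase, strip accents and punctuation, and collapse whitespace.
function normalize(text) {
  return text
    .toLowerCase()
    .normalize("NFD")
    .replace(/[\u0300-\u036f]/g, "") // remove combining accent marks
    .replace(/[^a-z0-9]+/g, " ")     // punctuation and spaces become one space
    .trim();
}

// Return the entries that have at least one key occurring in the text
// as a complete, whole-word phrase.
function findEntries(text, entries) {
  const padded = " " + normalize(text) + " ";
  return entries.filter((entry) =>
    entry.chaves.some((chave) => padded.includes(" " + normalize(chave) + " "))
  );
}

const entries = [
  { nome: "bom_atendimento", chaves: ["bem atendido", "atendimento bom"] },
  { nome: "ruim_atendimento", chaves: ["pessimo atendimento", "atendimento ruim"] }
];

const found = findEntries("Hoje eu fui bem atendido", entries);
console.log(found.map((e) => e.nome)); // → [ 'bom_atendimento' ]
```

A text that only shares the word "atendimento" with the keys returns nothing under this rule, which is the behavior the search should reproduce.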

What is the best way to do this? Should I use synonyms and shingles?

Elasticsearch version: 5.1.2

    
asked by anonymous 22.09.2017 / 17:33

1 answer


You can get this result with a Function Score Query.

Assuming you are using elasticsearch-js as your Elasticsearch client:

1 - First, use a terms query, because the input text as a whole will never exactly match a stored key:

var terms = "Hoje eu fui bem atendido".split(" ");

es.search({
  index: 'seu-index',
  type: 'seu tipo',
  body: {
    query: {
      terms: { chaves: terms }
    }
  }
}, function(error, result) {
  console.log(JSON.stringify(result, null, 2));
});

Even with the terms query, Elasticsearch returns both documents, because their keys share terms (such as "atendimento").

2 - Add the Function Score to filter the keys:

In the scores returned by the first query, "bom_atendimento" scored higher than "ruim_atendimento", so we can set a minimum score.

var terms = "Hoje eu fui bem atendido".split(" ");

es.search({
  index: 'seu-index',
  type: 'seu tipo',
  body: {
    query: {
      function_score: {
        query: { terms: { chaves: terms } },
        min_score: 0.7
      }
    }
  }
}, function(error, result) {
  console.log(JSON.stringify(result, null, 2));
});

This way, only the keys that reach the minimum score are returned.
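The request body above can be wrapped in a small helper so the same query is built for any input text. This is only a sketch: `buildSearchBody` is a hypothetical name, and lowercasing is an added assumption, made because a `terms` query compares its input with the indexed tokens without analyzing it. The sketch does not reproduce the stemmer or the stopword filter.

```javascript
// Hypothetical helper: builds the function_score body for any input text.
function buildSearchBody(text, minScore) {
  // The terms query is not analyzed, so lowercase the words here (assumption).
  const terms = text.trim().toLowerCase().split(/\s+/);
  return {
    query: {
      function_score: {
        query: { terms: { chaves: terms } },
        min_score: minScore
      }
    }
  };
}

const body = buildSearchBody("Hoje eu fui bem atendido", 0.7);
console.log(JSON.stringify(body, null, 2));
```

The resulting object can be passed as the `body` option of `es.search`, next to `index` and `type`, exactly as in the calls above.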

Tip: why do you need to store the term "atendimento" in the keys?

Storing only the terms that matter for the search would make it much simpler, e.g.:

 {
   nome: "ruim_atendimento",
   chaves: ["pessimo", "ruim", "horrivel"]
 }
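With single-word keys, matching reduces to a lookup of each word of the text in the list of keys. The sketch below shows that idea on the client side; `tokenize` and `findByTerm` are hypothetical names, and the keys of "bom_atendimento" are invented for the example.

```javascript
// Split a text into lowercase, accent-free words.
function tokenize(text) {
  return text
    .toLowerCase()
    .normalize("NFD")
    .replace(/[\u0300-\u036f]/g, "") // remove combining accent marks
    .split(/[^a-z0-9]+/)
    .filter((word) => word.length > 0);
}

// Return the entries that have at least one key among the words of the text.
function findByTerm(text, entries) {
  const words = new Set(tokenize(text));
  return entries.filter((entry) => entry.chaves.some((chave) => words.has(chave)));
}

const entries = [
  { nome: "ruim_atendimento", chaves: ["pessimo", "ruim", "horrivel"] },
  // Hypothetical keys, not taken from the question.
  { nome: "bom_atendimento", chaves: ["bem", "bom", "otimo"] }
];

console.log(findByTerm("Esse atendimento foi ruim", entries).map((e) => e.nome));
// → [ 'ruim_atendimento' ]
```

Because "atendimento" is no longer a key, a text that mentions only that word matches nothing.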
    
03.10.2017 / 16:24