How to know which word is the most used in all registers

4

I'm doing a top trendig system, however I'm very confused: How do I make the select pull only the 10 most used words in all registers.

However, the words are in a column and separated by a comma.

NOTE: I'm using PHP 5.3

Tabela posts
Estrutura: 
id
post_id
post
author_id
author
hashtags
date
    
asked by anonymous 13.07.2017 / 08:03

3 answers

6

You need a dictionary.

Interestingly, PHP arrays can function as dictionaries. I found this out now, researching to answer the question. So let's get started:

$dicionario = array();

The logic is this: make a dictionary where keys are words, and value is the number of occurrences.

Since each record has multiple words separated by commas, let's start by extracting words from each record. In most languages we call it split ( split ), but since PHP is a terrorist thing in it we use the explode function:

$palavras = explode(",", $registro);

And then we include the words in the dictionary as follows: if the key does not exist, it is created with a value of zero. Then, regardless of existing or not, we increase its value.

foreach ($palavras as $chave) {
    if (!$dicionario[$chave]) {
        $dicionario[$chave] = 0;
    }
    $dicionario[$chave]++;
}

Note that we have to blast and add to the dictionary once for each record.

Finally, we need to get the records with the ten largest values. The algorithm below removes the ten largest values from the dictionary while including them in an orderly manner in another dictionary.

$dezMaiores = array();
for ($i = 1; $i <= 10; $i++) {
    $maiorValor = 0;
    $maiorChave = "";
    foreach ($dicionario as $chave => $valor) {
        if ($valor > $maiorValor) {
            $maiorValor = $valor;
            $maiorChave = $chave;
        }
    }
    $dezMaiores[$i] = $maiorChave;
    unset($dicionario[$maiorChave]);
}

Now you can use the ten most used expressions on your system:)

    
13.07.2017 / 14:18
3

Another solution running all logic in PHP is to use the native functions array_merge and array_map to create the word list, function array_count_values to calculate the frequency of each word and the arsort to sort them in descending order.

<?php

$rows = [
    "abacate,banana,caqui",
    "banana,melão,goiaba",
    "laranja,banana,abacate"
];

$palavras = array_merge(
    ...array_map(
        function ($value) { 
            return explode(',', $value); 
        }, 

        $rows
    )
);

$palavras = array_count_values($palavras);

arsort($palavras);

print_r($palavras);
  

The above solution uses splat operator , so it only works on PHP 5.6 + versions.

Output output is:

Array
(
    [banana] => 3
    [abacate] => 2
    [caqui] => 1
    [melão] => 1
    [goiaba] => 1
    [laranja] => 1
)
    
13.07.2017 / 16:55
0
$link = new mysqli ("localhost", "USUARIO", "SENHA", "NOME_BANCO");

$query = "SELECT Nome_Coluna FROM Nome_Tabela";

$results = mysqli_query($link,$query);

while($row = mysqli_fetch_assoc($results)) {

  $words .= $row["Nome_Coluna"].",";

}

$palavras = explode(',', $words);

//numero total de palavras
//echo count($palavras); 

$ocorrencias = array();

for($i = 0; $i<count($palavras); $i++){
    $palavra = $palavras[$i];

      // Evitar erros Undefined index vide OBS no final da resposta
      if (!isset($ocorrencias[$palavra])) {
          $ocorrencias[$palavra] = 0;
      }

    $ocorrencias[$palavra]++;
}

arsort($ocorrencias);

//mostra todo o array estruturado
//var_dump($ocorrencias);

    //mostra os dez primeiros valores e respectivas chaves
    $i=0;
     foreach($ocorrencias as $chave => $valor){
            if($i < 10){  
                echo $chave.' - '.$valor.'<br>';
            }
            $i=$i+1;
        }
  

NOTE: In the first word, $ocorrencias is an empty list and when you do $ocorrencias[$palavra] you will be accessing a nonexistent key. To avoid the error, simply verify that the key exists and, if it does not exist, create it with the value zero.
Source

    

13.07.2017 / 20:58