How to select all DB data and make a count of repeated words?

1

Well, the part of counting words is relatively easy. But I can only count one word at a time. Would it be possible to do a complete analysis in the database and count all the repeated words, and put the number of times it is repeated next to the word?

Type, I have 5 records in the database and I want to parse only the TAGS column

1st Registration: Php, Mysql, Java

2nd Registration: MYSQL, Apostila, Java

3rd Registration: Microsoft, C ++, Java

4th Registration: C ++, Apostille, Java

5th Registration: Apostila, Mysql, Java

As a result I would like something like this:

Result of the analysis: Java (5); Mysql (3); Apostille (3); C ++ (2); Microsoft (1);

As you can see, in the possible result, the code compares all existing words and counts the same, then displays them in a "grouped" way

    
asked by anonymous 04.12.2014 / 03:09

2 answers

2

What you describe can be done via MySQL without recourse to PHP.

Given the way you have records, my suggestion is to use a temporary table to insert a record for each value that is separated by a comma and then select from that temporary table with the count:

  • Simulate your table:

    CREATE TABLE minhaTabela
        ('id' int, 'nome' varchar(100), 'tags' varchar(255))
    ;
    
    INSERT INTO minhaTabela
        ('id', 'nome', 'tags')
    VALUES
        (1, "Gabe♦", 'Php, Mysql, Java'),
        (2, "bfavaretto♦", 'MYSQL, Apostila, Java'),
        (3, "utluiz♦", 'Microsoft, C++, Java'),
        (3, "John", 'C++, Apostila, Java'),
        (3, "Doe", 'Apostila, Mysql, Java')
    ;
    
  • Pass values to a temporary table:

    CREATE TEMPORARY TABLE temp (val CHAR(255));
    SET @S1 = CONCAT("INSERT INTO temp (val) VALUES ('",REPLACE((SELECT GROUP_CONCAT( DISTINCT 'tags') AS data FROM 'minhaTabela'), ",", "'),('"),"');");
    PREPARE stmt1 FROM @s1;
    EXECUTE stmt1;
    
  • Select the words and count the number of occurrences:

    SELECT val, count(*) FROM temp GROUP BY LOWER(TRIM(val));
    
  • Result obtained:

    SQL Fiddle

    ┌───────────┬──────────┐
    │ val       │ count(*) │
    ├───────────┼──────────┤
    │ Apostila  │    3     │
    ├───────────┼──────────┤
    │ C++       │    2     │
    ├───────────┼──────────┤
    │ Java      │    5     │
    ├───────────┼──────────┤
    │ Microsoft │    1     │
    ├───────────┼──────────┤
    │ Mysql     │    3     │
    ├───────────┼──────────┤
    │ Php       │    1     │
    └───────────┴──────────┘
    

    To use on the PHP side using PDO:

    /* Dados da ligação à base de dados
     */
    $dbcon = array(
        "host"     => 'localhost',
        "dbname"   => 'minhaBaseDados',
        "username" => 'utilizador',
        "password" => 'password'
    );
    
    /* Ligar à base de dados
     */
    $dbh = new PDO(
        'mysql:host='.$dbcon['host'].';dbname='.$dbcon['dbname'].';',
        $dbcon['username'],
        $dbcon['password'],
        array(
            PDO::ATTR_PERSISTENT               => false,
            PDO::MYSQL_ATTR_USE_BUFFERED_QUERY => true,
            PDO::ATTR_ERRMODE                  => PDO::ERRMODE_EXCEPTION,
            PDO::MYSQL_ATTR_INIT_COMMAND       => "SET NAMES utf8"
        )
    );
    
    /* Ler os dados para uma tabela temporária
     */
    $sql = "
    CREATE TEMPORARY TABLE temp (val CHAR(255));
    SET @S1 = CONCAT(\"INSERT INTO temp (val) VALUES ('\",REPLACE((SELECT GROUP_CONCAT( DISTINCT 'tags') AS data FROM 'minhaTabela'), \",\", \"'),('\"),\"');\");
    PREPARE stmt1 FROM @s1;
    EXECUTE stmt1";
    
    $sth = $dbh->prepare($sql);
    $sth->execute();
    $sth->closeCursor();
    
    /* Recolher a informação
     */
    $sth = $dbh->query("SELECT val, count(*) AS total FROM temp GROUP BY LOWER(TRIM(val))");
    $sth->setFetchMode(PDO::FETCH_OBJ);
    
    while ($row = $sth->fetch()) {
        echo '<p>A palavra '.$row->val.' está repetida '.$row->total.' '.($row->total==1?'vez':'vezes').'.</p>';
    }
    
    /* Matar a ligação e o script
     */
    $dbh = null;
    die();
    
        
    04.04.2015 / 20:33
    0

    Suppose the $registros array is the records you pulled from the database. We will create a $tags array with the count for each tag.

    $tagCount = [];
    foreach ($registros as registro) {
        $tags = explode(',', $registro['tags']);
        foreach ($tags as $tag) {
            $tag = trim($tag);
            if (!isset($tagCount[$tag])) {
                $tagCount[$tag] = 0;
            }
            $tagCount[$tag]++;
        }
    }
    

    Then just order:

    usort($tagCount, function($a, $b) {
        return $a > $b;
    } );
    
        
    04.12.2014 / 12:04