Assuming you've already split the text into words, the data structure you're looking for is java.util.Map (also known as a "dictionary"). Its job is simply to map one object (the "key") to another (the "value"); in your case, each word to the number of times it appears.
    Map<String,Integer> quantas = new HashMap<String,Integer>();
    quantas.put("foo", 1);        // Records that the word "foo" appeared once
    int x = quantas.get("foo");   // Reads how many times it has appeared
    quantas.put("foo", x+1);      // Updates the value (overwrites it)

    // Lists every word in the dictionary
    for ( Map.Entry<String,Integer> par : quantas.entrySet() )
        System.out.println(par.getKey() + " appeared " + par.getValue() + " times.");
If you're using Java 8, you can do this even more simply with lambdas:
    for ( String palavra : listaPalavras )
        quantas.compute(palavra, (k, v) -> v == null ? 1 : v+1);
(i.e. if the word is not yet in the dictionary, meaning v == null, record that it appeared once; otherwise add 1 to the number of times it appeared)
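Putting the pieces together, here is a minimal sketch of the counting step as a complete program. The class name WordCount, the method contar and the sample word list are made up for illustration. It also uses Map.merge, another Java 8 method that, for this particular case, has the same effect as the compute call above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class name, used only for this example
class WordCount {
    // Counts how many times each word occurs in the list
    public static Map<String, Integer> contar(List<String> listaPalavras) {
        Map<String, Integer> quantas = new HashMap<>();
        for (String palavra : listaPalavras) {
            // Stores 1 if the word is new, otherwise adds 1 to the stored count
            quantas.merge(palavra, 1, Integer::sum);
        }
        return quantas;
    }

    public static void main(String[] args) {
        // Sample data (an assumption; in practice this comes from splitting your text)
        List<String> listaPalavras = Arrays.asList("foo", "bar", "foo", "baz", "foo", "bar");
        Map<String, Integer> quantas = contar(listaPalavras);
        System.out.println(quantas.get("foo")); // prints 3
        System.out.println(quantas.get("bar")); // prints 2
    }
}
```

Whether you prefer compute or merge is mostly a matter of taste; merge just saves you the explicit null check.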
As for a "smart" way to find the N most frequent words, I suggest a priority queue (PriorityQueue) where:
- The comparator passed at construction orders the (word, number of occurrences) pairs in ascending order of number of occurrences;
- Once you've created and populated the Map quantas, you iterate over its entries adding them to the priority queue while keeping its size capped at the number you want (i.e. remove the smallest element whenever the queue grows beyond that size). This performs well because the queue never holds more than N+1 elements, so it avoids comparing elements that no longer matter.
I'm not going to post a full example, because even the simplest tasks in Java take 10x more code than in a decent language. But if you have trouble applying the method above, just ask and I'll explain it in more detail. Or, if you're not concerned about performance and just want a simple, straightforward method, put those pairs in a list and sort it.
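For the simple list-and-sort route, a rough sketch could look like the following. The names TopWordsBySorting, topN and pares are hypothetical, and sorting in descending order with an alphabetical tie-break is an assumption about what you want. It uses List.sort, so it needs Java 8.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class name, used only for this example
class TopWordsBySorting {
    // Returns up to max (word, count) pairs, most frequent first
    public static List<Map.Entry<String, Integer>> topN(Map<String, Integer> quantas, int max) {
        List<Map.Entry<String, Integer>> pares = new ArrayList<>(quantas.entrySet());
        // Descending by count; equal counts are ordered alphabetically (assumed tie-break)
        pares.sort((a, b) -> {
            int porContagem = b.getValue().compareTo(a.getValue());
            return porContagem != 0 ? porContagem : a.getKey().compareTo(b.getKey());
        });
        // Copy the first max elements (or fewer, if the map is small)
        return new ArrayList<>(pares.subList(0, Math.min(max, pares.size())));
    }

    public static void main(String[] args) {
        Map<String, Integer> quantas = new HashMap<>();
        quantas.put("foo", 3);
        quantas.put("bar", 2);
        quantas.put("baz", 1);
        quantas.put("qux", 5);
        for (Map.Entry<String, Integer> par : topN(quantas, 2))
            System.out.println(par.getKey() + " " + par.getValue()); // prints "qux 5" then "foo 3"
    }
}
```

This sorts every distinct word even though only the first few are kept, which is fine for small inputs and is exactly the cost the priority queue avoids.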
Update: it actually didn't turn out so bad, but damn, type inference is sorely missed, isn't it...
    int max = 3;
    PriorityQueue<Map.Entry<String,Integer>> fila =
        new PriorityQueue<Map.Entry<String,Integer>>(max+1, new Comparator<Map.Entry<String,Integer>>() {
            public int compare(Map.Entry<String,Integer> a, Map.Entry<String,Integer> b) {
                return a.getValue() < b.getValue() ? -1 :
                       a.getValue() > b.getValue() ? 1 :
                       a.getKey().compareTo(b.getKey()); // Tie-break
            }
        });

    for ( Map.Entry<String,Integer> entry : quantas.entrySet() ) {
        fila.add(entry);
        if ( fila.size() > max )
            fila.poll(); // Removes the smallest
    }
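One thing to keep in mind when reading the result: a PriorityQueue ordered this way hands elements back smallest-first through poll(), and its iterator makes no ordering guarantee at all. So if you want the most frequent word first, you have to drain the queue and reverse what you get. A self-contained sketch, where TopWordsByQueue, topN, crescente and resultado are hypothetical names and the sample counts are invented:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical class name, used only for this example
class TopWordsByQueue {
    // Returns up to max words, most frequent first
    public static List<String> topN(Map<String, Integer> quantas, int max) {
        // Ascending by count, ties broken by the word itself (same ordering as above)
        Comparator<Map.Entry<String, Integer>> crescente = (a, b) -> {
            int c = a.getValue().compareTo(b.getValue());
            return c != 0 ? c : a.getKey().compareTo(b.getKey());
        };
        PriorityQueue<Map.Entry<String, Integer>> fila = new PriorityQueue<>(max + 1, crescente);
        for (Map.Entry<String, Integer> entry : quantas.entrySet()) {
            fila.add(entry);
            if (fila.size() > max)
                fila.poll(); // Drops the current smallest, so at most max entries remain
        }
        // poll() yields the smallest first, so collect everything and then reverse
        List<String> resultado = new ArrayList<>();
        while (!fila.isEmpty())
            resultado.add(fila.poll().getKey());
        Collections.reverse(resultado);
        return resultado;
    }

    public static void main(String[] args) {
        Map<String, Integer> quantas = new HashMap<>();
        quantas.put("foo", 3);
        quantas.put("bar", 2);
        quantas.put("baz", 1);
        quantas.put("qux", 5);
        System.out.println(topN(quantas, 3)); // prints [qux, foo, bar]
    }
}
```

Note that with this tie-break, when two words have the same count the alphabetically earlier one counts as "smaller" and is the first to be dropped.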