How do I check for duplicate items in an ArrayList and remove them?

I am tokenizing a TXT file.

I need the code to save all tokens in an ArrayList, but there can be no duplicate tokens.

I would like to know how to remove duplicate tokens, or how to check whether a token already exists and, if so, not add it.

My current code:

for (org.cogroo.text.Token token : sentence.getTokens()) { // list of tokens

    token.getStart(); token.getEnd(); // characters where the token starts and ends
    token.getLexeme(); // the token's text (the word it extracts, e.g. "clinico")
    token.getLemmas(); // an array with the possible lemmas for the lexeme+postag pair
    token.getPOSTag(); // morphological class according to context (e.g. prp, adj, n (noun))
    token.getFeatures(); // gender, number, tense, etc.
    contadorTokens++;
    System.out.println(expandirAcronimos(token.getLexeme()) + "_" + token.getPOSTag() + "_" + token.getFeatures()); // prints the word with its tag
    gravarArq.println(token.getLexeme() + "_" + token.getPOSTag() + "_" + token.getFeatures()); // writes each tokenized word to the txt file
    gravarArquivo.println(token.getPOSTag() + "_" + token.getFeatures()); // writes each token to the "Tokens.txt" file

    listaTokens.add(token.getPOSTag()); // ADDS the tags to a list

    for (int s = 0; s < listaTokens.size(); s++) { // ITERATES OVER THE LIST
        if (!listaTokens.equals(token.getPOSTag())) {

        }
    }
}

java lista

asked by anonymous 08.11.2014 / 15:43

1 answer


score 3 · Accepted Answer

To store elements without repetition, it is best to use a "set" data type instead of a "list". I suggest HashSet, or LinkedHashSet if the order of the tokens should be preserved:

Set conjuntoTokens = new HashSet(); // Can be generic, i.e. Set<Type>

for (org.cogroo.text.Token token : sentence.getTokens()) { // list of tokens
    ...

    //listaTokens.add(token.getPOSTag()); // ADDS the tags to a list
    boolean mudou = conjuntoTokens.add(token.getPOSTag()); // adds the tags to the set
                                                           // instead of the list
    if ( !mudou ) {
        ... // The element was already in the set
    }
}

listaTokens.addAll(conjuntoTokens); // adds all elements of the set to the list
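
For anyone who wants to try the idea in isolation, here is a minimal self-contained sketch. It assumes the tags are plain strings; the class name DedupeTags, the method distinctTags and the sample tags are made up for illustration and are not part of CoGrOO. It uses LinkedHashSet so the tags keep the order in which they were first seen.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class DedupeTags {

    // Hypothetical helper: returns the distinct tags in first-seen order.
    static List<String> distinctTags(List<String> tags) {
        Set<String> seen = new LinkedHashSet<>();
        for (String tag : tags) {
            // Set.add returns false when the element is already in the set
            if (!seen.add(tag)) {
                System.out.println("duplicate skipped: " + tag);
            }
        }
        // Copy the set into a list, as listaTokens.addAll(conjuntoTokens) does
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        // Made-up sample values standing in for token.getPOSTag() results
        List<String> tags = Arrays.asList("n", "adj", "n", "prp", "adj");
        System.out.println(distinctTags(tags)); // prints [n, adj, prp]
    }
}
```

If the order of the tags does not matter, swapping LinkedHashSet for HashSet should work the same way, only without the ordering guarantee.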