Reading a StringBuffer or String line by line

I have a text with the following format:

Data            Valor
20140901        278
20140902        248
20140903        458
20141004        545
20141005        125
20141106        1020
20141207        249

This text is stored in an object of type StringBuffer, and I would like to read it line by line. Does anyone have any idea how to do that?

Note: A way to do the same with a String would be equally valid.

What I already have:

At the moment I do not have an actual implementation, but my idea is to use split (in this case, converting the StringBuffer to a String) and pass the \n character as the argument. My concern with this idea is performance, because I intend to process thousands of lines.

    
asked by anonymous 04.10.2014 / 10:19

1 answer

I have two suggestions, but the only way to say which one performs best is to test them.

Note: Before you begin, I suggest using a StringBuilder instead of a StringBuffer. The former is not thread-safe (it does no synchronization), so in single-threaded use it should perform better than the latter. Its API is practically the same, so your code will not need significant changes.
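
As a minimal sketch of that note (the class and method names here are hypothetical, not from the question), the sample text can be built with a StringBuilder using the same append and toString calls that StringBuffer offers:

```java
// Hypothetical helper for illustration only; not part of the original post.
final class BuilderSketch {

    // Builds a small sample in the question's format using StringBuilder.
    // Replacing StringBuilder with StringBuffer here compiles unchanged,
    // because both classes expose the same append/toString methods.
    static StringBuilder buildSample() {
        StringBuilder sb = new StringBuilder();
        sb.append("Data            Valor\n");
        sb.append("20140901        278\n");
        sb.append("20140902        248\n");
        return sb;
    }

    public static void main(String[] args) {
        StringBuilder sb = buildSample();
        // toString() copies the builder's contents into an immutable String.
        String texto = sb.toString();
        System.out.print(texto);
    }
}
```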

  • Using StringTokenizer (assuming texto is a String):

    // requires: import java.util.StringTokenizer;
    StringTokenizer st = new StringTokenizer(texto, "\n");
    while ( st.hasMoreTokens() ) {
        String linha = st.nextToken();
        ...
    }
    

    The original text is kept intact (as opposed to texto.split("\n"), for example, which would double the amount of memory in use at once), and only small strings are created, one per line. Over the whole run these add up to roughly the size of the original text again, but each one can be discarded as soon as its line has been processed.

  • Using CharBuffer. Unfortunately my knowledge of nio is quite limited, so I do not know if I can give a good example. But the potential advantage of CharBuffer over StringTokenizer is that there is no need to create a new String for each line of text - one can simply adjust the position (position) and limit (limit) of the buffer to designate the "current line", and use the buffer itself as a CharSequence (i.e. as if it were a String).

    The example below worked on ideone; I just cannot guarantee, as I said, that it is a good way to implement this (assuming texto is any CharSequence, including String, StringBuffer and StringBuilder):

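    // requires: import java.nio.CharBuffer;
    CharBuffer buffer = CharBuffer.wrap(texto);
    int inicio = 0, fim = 0;
    while ( fim < buffer.capacity() ) {
        if ( buffer.get(fim) == '\n' ) {
            // Narrow the buffer to the current line: [inicio, fim)
            buffer.position(inicio).limit(fim);
            // Use buffer as if it were a String (i.e. the "next line")
            ...
            // Restore the full view so the absolute get(fim) above stays in range
            buffer.position(0).limit(buffer.capacity());
            inicio = fim+1;
        }
        fim++;
    }
    // Note: a final line with no trailing '\n' is not visited by this loop.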
    

    As far as I can tell, the initial wrap does not copy the text: CharBuffer.wrap(CharSequence) returns a read-only buffer backed by the given sequence, and no additional objects are created while iterating. Alternatively, texto could be a CharBuffer from the start - for example by reading the input file directly into that format.

  • To reiterate: although method 2 looks better on paper, in practice the continuous creation and destruction of String objects should not hurt performance much, since modern JVMs have efficient garbage collectors for short-lived objects. In addition, other copies may be made inadvertently, cancelling the benefit. In other words, the only way to know which is "best" is to test.
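
To make that testing concrete, here is a rough sketch that runs both suggestions over the same text. The class name, method names and the "sum of line lengths" task are all hypothetical, chosen only so the two loops do comparable work. Unlike the snippet above, the CharBuffer loop here also treats the end of the text as a line terminator. The timing in main uses System.nanoTime once per approach, which is only a crude indication (no warm-up, no repetition), not a proper benchmark.

```java
import java.nio.CharBuffer;
import java.util.StringTokenizer;

// Hypothetical comparison harness; names are illustrative only.
final class LineIterationSketch {

    // Approach 1: StringTokenizer, which creates one new String per line.
    // Empty lines are skipped, because StringTokenizer never returns empty tokens.
    static long charsWithTokenizer(String texto) {
        long soma = 0;
        StringTokenizer st = new StringTokenizer(texto, "\n");
        while (st.hasMoreTokens()) {
            String linha = st.nextToken();
            soma += linha.length();
        }
        return soma;
    }

    // Approach 2: CharBuffer, moving position/limit instead of creating Strings.
    static long charsWithCharBuffer(CharSequence texto) {
        CharBuffer buffer = CharBuffer.wrap(texto);
        int total = buffer.capacity();
        long soma = 0;
        int inicio = 0;
        for (int fim = 0; fim <= total; fim++) {
            // The end of the text counts as a terminator for the last line.
            if (fim == total || buffer.get(fim) == '\n') {
                if (fim > inicio) {
                    // Narrow the buffer to the current line: [inicio, fim)
                    buffer.position(inicio).limit(fim);
                    // length() on a CharBuffer is the number of remaining chars
                    soma += buffer.length();
                    // Restore the full view for the next absolute get()
                    buffer.position(0).limit(total);
                }
                inicio = fim + 1;
            }
        }
        return soma;
    }

    public static void main(String[] args) {
        // Build a few thousand lines in the question's format.
        StringBuilder sb = new StringBuilder("Data            Valor\n");
        for (int i = 0; i < 100_000; i++) {
            sb.append("20140901        278\n");
        }
        String texto = sb.toString();

        long t0 = System.nanoTime();
        long a = charsWithTokenizer(texto);
        long t1 = System.nanoTime();
        long b = charsWithCharBuffer(texto);
        long t2 = System.nanoTime();

        System.out.println("StringTokenizer: " + a + " chars in " + (t1 - t0) / 1_000 + " us");
        System.out.println("CharBuffer:      " + b + " chars in " + (t2 - t1) / 1_000 + " us");
    }
}
```

Both methods should report the same character count for the same input; any timing difference will depend on the JVM, the input size and what is actually done with each line, so a dedicated benchmarking tool is a better basis for a real decision.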

        
    04.10.2014 / 11:32