How to use an OpenMP loop to count the lines in a text file?

0

How can I write a loop with OpenMP that counts the lines of a file?

#pragma omp parallel for
for (string line; getline(file, line); ) {
    count++;
}

Written this way it does not run. It seems OpenMP only accepts a regular counted for loop, one that goes from one number to another.

asked by anonymous 11.02.2014 / 17:14

3 answers

1

The problem is that the number of iterations cannot be determined when the loop starts. Knowing where each line begins requires having already read the previous line, and knowing the total number of lines requires having read all of them. Inside the loop you wrote count++; as written, every iteration reads and updates the same shared variable, so each increment depends on the previous iteration having finished. In the end, nothing in this code is parallelizable.

Some possible solutions:

  • Read all the lines of the file into an array beforehand, then iterate over the array in parallel, much as you intended to do.

  • Map the file into memory (the operating system provides functions for this), find the start and the length of each line, and store each pair of integers in an array. Then iterate over that array in parallel.

  • Create a producer thread that reads the file line by line and one or more consumer threads that process the lines read. I do not think OpenMP helps here, but the standard library has synchronization primitives that can.

I am assuming, of course, that processing each line is much more expensive than reading it from the file. That is unlikely, however: disk reads and writes are the bottleneck in most cases, and you cannot parallelize the disk.
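
A minimal sketch of the second suggestion, under a few assumptions: the whole file is read into a std::string as a portable stand-in for a memory mapping, and the per-line work is a placeholder that only checks whether a line is non-empty. The names read_all, index_lines and count_nonempty_lines are made up for this illustration. The sequential pass builds the (offset, length) array; after that the loop has a known trip count, so #pragma omp parallel for accepts it, and reduction(+:count) gives each thread a private counter that is summed at the end.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// (offset, length) of one line inside the buffer.
using LineSpan = std::pair<std::size_t, std::size_t>;

// Read an entire stream into memory (stand-in for a memory-mapped file).
std::string read_all(std::istream& in) {
    std::ostringstream ss;
    ss << in.rdbuf();
    return ss.str();
}

// Sequential pass: a line's start is only known once the previous
// line has been scanned, so this part cannot be split across threads.
std::vector<LineSpan> index_lines(const std::string& buf) {
    std::vector<LineSpan> lines;
    std::size_t begin = 0;
    while (begin < buf.size()) {
        std::size_t end = buf.find('\n', begin);
        if (end == std::string::npos)
            end = buf.size();  // last line has no trailing '\n'
        lines.emplace_back(begin, end - begin);
        begin = end + 1;
    }
    return lines;
}

// Parallel pass over the index. The "work" per line is hypothetical:
// it only tests whether the line is non-empty.
long count_nonempty_lines(const std::string& buf) {
    const std::vector<LineSpan> lines = index_lines(buf);
    const long n = static_cast<long>(lines.size());
    long count = 0;
    #pragma omp parallel for reduction(+:count)
    for (long i = 0; i < n; ++i) {
        if (lines[static_cast<std::size_t>(i)].second > 0)
            ++count;
    }
    return count;
}
```

Build with OpenMP enabled (for example -fopenmp on GCC or Clang); without it the pragma is ignored and the loop runs serially. Note that the plain line count is simply index_lines(buf).size(), which the sequential pass already provides, so the parallel loop only pays off when each line needs real work.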

        
    11.02.2014 / 18:16
    0

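    You can do this with "pure" C++:

    #include <fstream>
    #include <iostream>
    #include <string>

    int main()
    {
        std::ifstream exemplo("exemplo.txt");  // file to read from
        std::string linha;                     // receives each line
        int count = 0;
        // std::getline returns the stream, which tests false at end of file
        while (std::getline(exemplo, linha))
            ++count;
        std::cout << count << '\n';
    }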
    
        
    11.02.2014 / 18:02
    0

    Hello. Try the following: load the entire contents of the file into an array, then parallelize the operations on the array. Alternatively, try parallelizing the read itself with MPI-IO. I hope this helps.
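
One way to sketch the first suggestion, assuming the file fits in memory: read_lines and total_length are hypothetical helpers, and summing the line lengths stands in for whatever work each line really needs. Reading stays sequential; only the loop over the vector is split across threads.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>  // std::istringstream, handy for trying this without a file
#include <string>
#include <vector>

// Sequential: std::getline must finish one line before it can find the next.
std::vector<std::string> read_lines(std::istream& in) {
    std::vector<std::string> lines;
    for (std::string line; std::getline(in, line); )
        lines.push_back(line);
    return lines;
}

// Parallel: the vector's size is known up front, so OpenMP can divide
// the index range among threads and combine the partial sums.
long total_length(const std::vector<std::string>& lines) {
    const long n = static_cast<long>(lines.size());
    long total = 0;
    #pragma omp parallel for reduction(+:total)
    for (long i = 0; i < n; ++i)
        total += static_cast<long>(lines[static_cast<std::size_t>(i)].size());
    return total;
}
```

Passing a std::ifstream opened on the file to read_lines works the same way, and the number of lines is then just the size of the returned vector.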

        
    13.12.2014 / 12:58