I would like to collect profiling statistics from an MPI application.
To profile the application when it runs on a single machine, I can use the command:
perf record mpirun -np $NUMBER_OF_CORES app_name
However, when I use this command with the job distributed across more than one machine, the recorded events come only from the local machine, the one on which I launched the run.
To record every process, I created a script that runs perf inside each of the MPI processes:
#!/bin/bash
app=$1
perf record -m 512G -o "${app}-${RANDOM}.perf.data" ${app}
And I execute it as follows:
mpirun -np $NUMBER_OF_CORES perf_record.sh app_name
However, the generated files did not record any events:
The app_name-123.perf.data file has no samples!
1 - What am I doing wrong?
2 - Is there any other tool that is better suited for use with MPI applications?
Note: I'm really interested in the time spent in each function of the program, not in the time spent on communication.
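In case it helps frame an answer, here is an untested variant of the wrapper that I could also try. It is only a sketch under assumptions: it names each output file by MPI rank instead of $RANDOM, it assumes the launcher exports OMPI_COMM_WORLD_RANK (Open MPI) or PMI_RANK (MPICH/Hydra) and falls back to the PID otherwise, and it leaves the buffer size at perf's default instead of passing -m. The function name perf_output_name is hypothetical, introduced only for this illustration.

```bash
#!/bin/bash
# Build a per-process output file name from the rank variable set by the
# MPI launcher. OMPI_COMM_WORLD_RANK and PMI_RANK are assumptions about the
# launcher in use; the shell PID is the fallback when neither is set.
perf_output_name() {
    local app=$1
    local rank=${OMPI_COMM_WORLD_RANK:-${PMI_RANK:-$$}}
    # ${app##*/} strips any leading directory so the file lands in the cwd.
    echo "${app##*/}-${HOSTNAME:-unknown}-${rank}.perf.data"
}

# Run perf only when an application is given on the command line, so the
# file can also be sourced just to reuse the helper.
if [ "$#" -gt 0 ]; then
    exec perf record -o "$(perf_output_name "$1")" "$@"
fi
```

It would be launched the same way as the original script, for example: mpirun -np $NUMBER_OF_CORES perf_record.sh app_name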