MPI tutorial: take 3
Note: this is part 3 of a series on using MPI in Python. I suggest reading the previous part before continuing.
Refresher
In the previous part of this series, we continued trying to do some very simple parallel computing (without being “embarrassingly parallel”) using MPI. The task was simple: have several independent processes (procs) guess random integers from 1 to 10, scoring points for integers that haven’t been guessed yet. We achieved something close to this, but we weren’t able to de-sync the procs. Even when we made some procs slower than others, each still got the same number of guesses over the full computation. The solution is to step away from the usual results you get when you google how to use MPI (allreduce, gather, broadcast, etc.). Instead, we are going to use nonblocking MPI message passing.
Explicitly nonblocking communication in MPI
Primarily, our savior is comm.Iprobe(source=i), which returns a boolean depending on whether the given source has a message waiting for the calling proc. This is a built-in way to implement the workaround that we tried previously (recall our “message” boolean), but in a nonblocking way. It will not hang if there isn’t a message yet; it simply returns False and keeps going. With this tool, we can manually control the messages that flow in and out of proc 0 in a way that allows all procs to go at their own pace. Let’s use comm.Iprobe to retool our code:
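The core of the pattern can be sketched as two small helpers. This is only a minimal sketch, not the full guessing game: poll_guesses and check_for_update are hypothetical names, comm is assumed to be an mpi4py communicator such as MPI.COMM_WORLD, and messages are assumed to be sent with a plain comm.send.

```python
def poll_guesses(comm, sources):
    """Proc 0 side: collect at most one waiting message per source.

    Iprobe returns True only if a message from `src` has already
    arrived, so the recv that follows does not have to wait.
    Sources with nothing pending are simply skipped.
    """
    received = []
    for src in sources:
        if comm.Iprobe(source=src):
            received.append((src, comm.recv(source=src)))
    return received


def check_for_update(comm, current, source=0):
    """Worker side: pick up one pending list update from `source`.

    If nothing is waiting, the caller keeps its current list and
    carries on with its own loop instead of blocking.
    """
    if comm.Iprobe(source=source):
        return comm.recv(source=source)
    return current
```

In a real run, comm would come from mpi4py (from mpi4py import MPI, then comm = MPI.COMM_WORLD). Proc 0 would call poll_guesses(comm, range(1, comm.Get_size())) once per loop iteration, and every other proc would call check_for_update(comm, my_list) before making its next guess.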
One more catch!
Ok, so the script above still isn’t quite going to cut it. If you run it you will find that, while the loops can now happen independently, the local lists do not stay up to date with the newest iteration. This can be fixed by changing one word only. Before looking at the solution, maybe take some time to analyze what you think is happening. Run the code a few times and pay attention to what is printed…
Solution
Now that you have done that, here is the answer. All you need to do is change the if statement that checks comm.Iprobe for list updates from proc 0 into a while loop.
You see, the issue is that the list update messages from proc 0 queue up. If you only check once whether a message is waiting and then update the list a single time, you receive just the oldest message in the queue. The solution, then, is to completely clear out the queue each time you get to it. A while loop does this perfectly, because comm.Iprobe keeps returning True until there are no more messages in the queue. Once the loop exits, you have received the most recent list and are good to go!
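The difference between the two versions can be sketched with a pair of small helpers. This is a minimal sketch under stated assumptions, not the tutorial's actual script: read_one and read_latest are hypothetical names, comm is assumed to be an mpi4py communicator such as MPI.COMM_WORLD, and proc 0 is assumed to send the full list each time with a plain comm.send.

```python
def read_one(comm, current, source=0):
    """The 'if' version: consume at most one queued update.

    Messages from the same sender arrive in the order they were sent,
    so this returns the OLDEST pending list and leaves the rest queued.
    """
    if comm.Iprobe(source=source):
        current = comm.recv(source=source)
    return current


def read_latest(comm, current, source=0):
    """The 'while' version: drain the queue and keep the newest update.

    Each pass receives one message; the loop ends as soon as Iprobe
    reports that nothing else is waiting.
    """
    while comm.Iprobe(source=source):
        current = comm.recv(source=source)
    return current
```

If proc 0 has sent three updates since a slow proc last checked, read_one hands back the first of the three and stays two updates behind, while read_latest ends up holding the third.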