In the second part of this series, we tried to do some very simple parallel computing (without being “embarrassingly parallel”) using MPI. The task was simple: have several independent processes (procs) guess random integers from 1 to 10, scoring points for integers that haven’t been guessed yet. We achieved something kind of close to this, but we weren’t able to de-sync the procs. Basically, even after making some procs slower than others, each was still getting the same number of guesses over the full computation. An obvious reason for this is that the previous treatment was symmetric with respect to the procs: each was treated the same as all the others and, once per loop, we used comm.allreduce to sync up the lists.
So, let’s try something different. We will rewrite the code so that proc 0 is special. It will not be guessing, but will just serve as a hub for the other processes to send information to. We will assign proc 0 to be this middleman. Here is some updated code, based on where we left off:
Does this approach seem plausible to you? Take some time to make sure you understand what we are trying to do. Go ahead and try to run it with just 3 processes (it isn’t going to work, but try it anyway for the experience).
Why did we fail again?
You should have found that the program will hang unless every guessing process sends a message. The reason? MPI really, really, really needs a one-to-one correspondence between messages sent and messages received. So, when it comes to a line like comm.recv(source=1) and proc 1 hasn’t sent a message, proc 0 is just going to hang there until a message arrives. But this also stops the loop from progressing, which halts all the other processes, and so no message will ever come. The most straightforward solution is to have a boolean flag for each process that proc 0 can look at to see whether or not a message is incoming. Then, we can make proc 0 only call comm.recv for procs that actually have a message waiting. See below for the inclusion of this boolean flag.
Ok, now this is starting to look more like it! Give it a go. Can you get it to run? (This time, it should.)
Did we fail yet again?
Unfortunately, it looks like we did! Here is a sample of a run of this code:
They are still getting the same number of guesses, when one of the processes should be able to guess about two times faster than the other! The issue is that, despite conceptually disentangling proc 0 from the other procs, the message passing we are using is fundamentally what is called “blocking” communication. This means that, even on the very first loop, proc 0 cannot continue until it gets a message from every other proc. We have avoided the error by using the “message” boolean, but we have only kicked the parallelization can down the road, because the flag itself still synchronizes the loops: the loop cannot go on to the next iteration until proc 0 receives that boolean message from each process.
Another issue is the comm.bcast, which also requires that all the different processes “catch up” before being able to continue. This could be worked around as well, using more explicit comm.send and comm.recv calls; but again, it won’t solve the fundamental issue.
The solution is to use explicitly “nonblocking” communication instead. This is a set of MPI bindings that are structured to not hold up other processes. I think we have finally pinpointed the issue, so my next post will definitely be about how we can finally play the unfair guessing game we deserve!