Investigate whether we can support graceful loss of a small number of EventBuilders
In our current use of mpirun, if one DAQ process dies, there is a very good chance that the full MPI program will shut down.
We should investigate whether it is possible for a small number of EventBuilder processes to die while the rest of the system continues running. (Here "running" means "take data" or "continue the current run in progress".)
If we determine that this is possible, then it will entail two changes: modifying the MPI program so that individual process failures do not bring the full system down, and implementing a way to tell the BoardReaders that an EventBuilder has died and that they should no longer send Fragments to that EB.
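To make the BoardReader side of this concrete, here is a rough sketch of the bookkeeping it might need. Everything in it is an assumption for illustration: `EventBuilderTable`, `markDead`, and `nextDestination` are hypothetical names, not existing artdaq code, and the round-robin choice of destination is a simplification that may not match how Fragments are actually routed. It does not cover the MPI side (keeping the job alive after a rank fails, or detecting the failure in the first place).

```cpp
#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical helper, not part of artdaq: tracks which EventBuilder ranks
// are still alive and hands out destinations round-robin among them.
class EventBuilderTable {
public:
    explicit EventBuilderTable(std::vector<int> ranks)
        : ranks_(std::move(ranks)), alive_(ranks_.size(), true) {}

    // Called when the (assumed) "EventBuilder died" notification arrives.
    void markDead(int rank) {
        for (std::size_t i = 0; i < ranks_.size(); ++i) {
            if (ranks_[i] == rank) alive_[i] = false;
        }
    }

    // Returns the next live EventBuilder rank, or std::nullopt if none remain.
    std::optional<int> nextDestination() {
        for (std::size_t tried = 0; tried < ranks_.size(); ++tried) {
            const std::size_t i = next_;
            next_ = (next_ + 1) % ranks_.size();
            if (alive_[i]) return ranks_[i];
        }
        return std::nullopt;
    }

private:
    std::vector<int> ranks_;    // EventBuilder ranks, in sending order
    std::vector<bool> alive_;   // alive_[i] is false once ranks_[i] has died
    std::size_t next_ = 0;      // index of the next candidate destination
};
```

In this sketch a BoardReader would call `nextDestination()` before sending each Fragment and `markDead(rank)` when told that an EventBuilder is gone; an empty result would mean no EventBuilders are left and the run cannot continue.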
For now, I'm only going to estimate the investigation portion of this task. If we decide to go ahead with an implementation, then that work will need to be added, either in this Issue or in an additional one.