[aspect-devel] Aspect problem on stampede cluster

Robert Martin-Short rmartin-short at berkeley.edu
Mon Feb 29 10:02:15 PST 2016


Dear ASPECT development team,

I'm running ASPECT 1.3 on the TACC Stampede cluster via an XSEDE
allocation, but am encountering some problems when running parallel
simulations.

My runs always fail with errors that look like this:

[c492-604.stampede.tacc.utexas.edu:mpispawn_0][child_handler] MPI process
(rank: 7, pid: 103496) terminated with signal 9 -> abort job
[c492-604.stampede.tacc.utexas.edu:mpispawn_0][readline] Unexpected
End-Of-File on file descriptor 14. MPI process died?
[c492-604.stampede.tacc.utexas.edu:mpispawn_0][mtpmi_processops] Error
while reading PMI socket. MPI process died?
[c492-604.stampede.tacc.utexas.edu:mpirun_rsh][process_mpispawn_connection]
mpispawn_0 from node c492-604 aborted: MPI process error (1)
TACC: MPI job exited with code: 1
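
For reference when reading the first error line above, here is a minimal Python sketch that pulls out the rank, PID and signal number and maps the signal to its name. It is not part of ASPECT or the TACC tooling; the function name parse_termination and the regular expression are my own assumptions, based only on the line format shown in this log. On Linux, signal 9 is SIGKILL, which means the process was killed from outside rather than exiting on its own.

```python
import re
import signal

# First error line from the log above, joined back into a single line.
LOG_LINE = (
    "[c492-604.stampede.tacc.utexas.edu:mpispawn_0][child_handler] MPI process "
    "(rank: 7, pid: 103496) terminated with signal 9 -> abort job"
)

# Hypothetical pattern, written against the one line format seen in this log.
PATTERN = re.compile(r"\(rank: (\d+), pid: (\d+)\) terminated with signal (\d+)")


def parse_termination(line):
    """Return rank, pid and signal name from an mpispawn line, or None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    rank, pid, signum = (int(group) for group in match.groups())
    try:
        name = signal.Signals(signum).name
    except ValueError:
        # Signal number not known on this platform; keep the raw number.
        name = "signal {}".format(signum)
    return {"rank": rank, "pid": pid, "signal": name}


info = parse_termination(LOG_LINE)
print(info)
```

Run on a Linux machine, this reports the signal for rank 7 as SIGKILL.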

The parameter files I'm running are known to work on another cluster, and
the simulation proceeds when run in serial on Stampede, so this looks
like an MPI problem. Perhaps I just don't have the correct modules loaded?

Here is a list of my loaded module files:

  1) xalt/0.6   2) TACC   3) git/2.7.0   4) gcc/4.7.1   5) mvapich2/1.9a2
6) mkl/13.0.2.146   7) cmake/3.1.0--

If anyone has seen this problem before and could give me some advice, that
would be great!

Best wishes

Robert

Robert Martin-Short
Graduate Student
Department of Earth and Planetary Science
U.C. Berkeley