[CIG-SHORT] PETSc error when running Pylith on a cluster

Hongfeng Yang hyang at whoi.edu
Thu Feb 16 12:40:12 PST 2012


Each node on the cluster has 8 cores.
We ran 24 tests for each of 5 configurations (95 total runs):
1 node, 2 nodes, 4 nodes, 6 nodes, and 8 nodes.

The 1 node, 8 core jobs were 100% successful (24 passes).
The 2 node, 16 core jobs were 33% successful (8 passes).
The higher node/core count jobs all failed.

Attached is the stdout file.
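
As noted in the quoted reply below, the signal 15 message only says that a process was told to end, so the message for the real error should appear earlier in the log. A minimal sketch of one way to search a log for it follows. The function names earliest_errors and other_signals are hypothetical, and the sketch assumes that every PETSc message carries the "PETSC ERROR" prefix seen in the excerpt below.

```shell
#!/bin/sh
# Hypothetical helper: print the earliest PETSc error lines in a log,
# with line numbers. $1 = log file, $2 = how many lines to show (default 20).
earliest_errors() {
    grep -n 'PETSC ERROR' "$1" | head -n "${2:-20}"
}

# Hypothetical helper: print "Caught signal" lines for any signal other
# than 15 (Terminate). A rank listed here is more likely to be the one
# that failed first; ranks with signal 15 were stopped by someone else.
other_signals() {
    grep 'Caught signal number' "$1" | grep -v 'signal number 15 ' || true
}
```

For example, `earliest_errors ./$PBS_JOBID.log 40` would list the first 40 PETSc error lines of the job log. If other_signals prints nothing, the first failure may not have been reported through a PETSc signal handler at all.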

The full run command is the following:

/usr/mpi/gcc/openmpi-1.4.3/bin/mpirun --hostfile $PBS_NODEFILE -np $NPROCS  /home/username/pylith57/bin/mpinemesis --pyre-start /home/username/pylith57/bin:/home/username/pylith57/lib/python2.6/site-packages/pythia-0.8.1.12-py2.6.egg:/home/username/pylith57/lib/python2.6/site-packages/setuptools-0.6c9-py2.6.egg:/home/username/pylith57/lib/python2.6/site-packages/merlin-1.7.egg:/home/username/pylith57/lib/python2.6/site-packages:/home/username/pylith57/lib/python/site-packages:/home/username/pylith57/lib64/python/site-packages:/home/username/pylith57/lib/python/site-packages:/home/username/pylith57/lib64/python/site-packages:/home/username/pylith57/lib/python/site-packages:/home/username/pylith57/lib64/python/site-packages:/home/username/pylith57/src/pylith/examples/3d/hex8:/home/username/pylith57/lib/python26.zip:/home/username/pylith57/lib/python2.6/lib-dynload:/home/username/pylith57/lib/python2.6:/home/username/pylith57/lib/python2.6/plat-linux2:/home/username/pylith57/lib/python2.6/lib-tk:/home/username/pylith57/lib/python2.6/lib-old:/home/username/pylith57/lib64/python/site-packages:/home/username/pylith57/lib/python/site-packages:/home/username/pylith57/lib64/python/site-packages:/home/username/pylith57/lib/python/site-packages::/home/username/pylith57/lib/python/site-packages:/home/username/pylith57/lib64/python/site-packages:/home/username/pylith57/src/pylith/examples/3d/hex8:/home/username/pylith57/lib/python26.zip:/home/username/pylith57/lib/python2.6/plat-linux2:/home/username/pylith57/lib/python2.6/lib-tk:/home/username/pylith57/lib/python2.6/lib-old pythia mpi:mpistart pylith.apps.PyLithApp:PyLithApp step14.cfg --nodes=$NPROCS --petsc.start_in_debugger --launcher.dry --nodes=$NPROCS --macros.nodes=$NPROCS --macros.job.name= --macros.job.id=8403>&  ./$PBS_JOBID.log
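
Since the command relies on $NPROCS and $PBS_NODEFILE, it may help to record their values at the top of each job log. The sketch below is one way to do that. It assumes the PBS node file lists one host name per allocated core, which is common but depends on how the scheduler is configured. The helper names count_procs and job_header are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count MPI process slots as the number of lines
# in a PBS node file (assumes one line per allocated core).
count_procs() {
    # tr strips the padding that some wc implementations add
    wc -l < "$1" | tr -d ' '
}

# Hypothetical helper: print the process count and the number of
# distinct hosts found in the node file.
job_header() {
    nodefile=$1
    nprocs=$(count_procs "$nodefile")
    nnodes=$(sort -u "$nodefile" | wc -l | tr -d ' ')
    echo "NPROCS=$nprocs NODES=$nnodes"
}

# Only does something inside a PBS job, where PBS_NODEFILE is set.
if [ -n "${PBS_NODEFILE:-}" ] && [ -f "$PBS_NODEFILE" ]; then
    job_header "$PBS_NODEFILE"
fi
```

Placed in the job script before the mpirun line, this writes a single line such as NPROCS=... NODES=... to stdout, which makes it easier to match each log to one of the node configurations listed above.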


Thanks,

Hongfeng

On 02/16/2012 10:46 AM, Brad Aagaard wrote:
> Hongfeng,
>
> Please send everything that was written to stdout. Also please indicate
> what NPROCS is (how many processes you are using). It also helps when
> you state what command you entered on the command line so that we can
> see if we can reproduce what you did.
>
> The error message you list only indicates that one of the processes
> aborted because another process already aborted due to an error. The
> message associated with the real error should have been written earlier.
>
> Brad
>
>
> On 02/16/2012 07:26 AM, Hongfeng Yang wrote:
>> Hi All,
>>
>> The cluster is running CentOS 5.7. The options used to build PyLith were:
>>
>>
>> $HOME/src57/pylith/pylith-installer-1.6.2-0/configure \
>> --enable-python --with-make-threads=2 \
>> --with-petsc-options="--download-chaco=1 --download-ml=1 --download-f-blas-lapack=1 --with-debugging=yes" \
>> --prefix=$HOME/pylith
>>
>>
>> However, the following error message appears when running an example on
>> the cluster.
>>
>> [30]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>>
>>
>> So, we have successfully built debugging into petsc, but it is not enabled.
>>
>> Here is the full run command:
>>
>> /usr/mpi/gcc/openmpi-1.4.3/bin/mpirun --hostfile $PBS_NODEFILE -np $NPROCS  /home/username/pylith57/bin/mpinemesis --pyre-start /home/username/pylith57/bin:/home/username/pylith57/lib/python2.6/site-packages/pythia-0.8.1.12-py2.6.egg:/home/username/pylith57/lib/python2.6/site-packages/setuptools-0.6c9-py2.6.egg:/home/username/pylith57/lib/python2.6/site-packages/merlin-1.7.egg:/home/username/pylith57/lib/python2.6/site-packages:/home/username/pylith57/lib/python/site-packages:/home/username/pylith57/lib64/python/site-packages:/home/username/pylith57/lib/python/site-packages:/home/username/pylith57/lib64/python/site-packages:/home/username/pylith57/lib/python/site-packages:/home/username/pylith57/lib64/python/site-packages:/home/username/pylith57/src/pylith/examples/3d/hex8:/home/username/pylith57/lib/python26.zip:/home/username/pylith57/lib/python2.6/lib-dynload:/home/username/pylith57/lib/python2.6:/home/username/pylith57/lib/python2.6/plat-linux2:/home/username/pylith57/lib/python2.6/lib-tk:/home/username/pylith57/lib/python2.6/lib-old:/home/username/pylith57/lib64/python/site-packages:/home/username/pylith57/lib/python/site-packages:/home/username/pylith57/lib64/python/site-packages:/home/username/pylith57/lib/python/site-packages::/home/username/pylith57/lib/python/site-packages:/home/username/pylith57/lib64/python/site-packages:/home/username/pylith57/src/pylith/examples/3d/hex8:/home/username/pylith57/lib/python26.zip:/home/username/pylith57/lib/python2.6/plat-linux2:/home/username/pylith57/lib/python2.6/lib-tk:/home/username/pylith57/lib/python2.6/lib-old pythia mpi:mpistart pylith.apps.PyLithApp:PyLithApp step14.cfg --nodes=$NPROCS --petsc.start_in_debugger --launcher.dry --nodes=$NPROCS --macros.nodes=$NPROCS --macros.job.name= --macros.job.id=8403>&    ./$PBS_JOBID.log
>>
>> Here is the full error message which states that we are not in debugging mode:
>>
>> [35]PETSC ERROR: ---------------------  Stack Frames ------------------------------------
>> [30]PETSC ERROR: ------------------------------------------------------------------------
>> [30]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or the batch system) has told this process to end
>> [30]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>> [30]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind
>> [30]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
>> [30]PETSC ERROR: likely location of problem given in stack below
>> [30]PETSC ERROR: ---------------------  Stack Frames ------------------------------------
>>
>>
>>
>>
>> Could anyone help? Thanks!
>>
>> Hongfeng Yang
>>
> _______________________________________________
> CIG-SHORT mailing list
> CIG-SHORT at geodynamics.org
> http://geodynamics.org/cgi-bin/mailman/listinfo/cig-short
>


-- 
Postdoc Investigator
Woods Hole Oceanographic Institution
Dept. Geology and Geophysics
360 Woods Hole Rd, MS 24
Woods Hole, MA 02543

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: output.txt
Url: http://geodynamics.org/pipermail/cig-short/attachments/20120216/49411ecb/attachment-0001.txt 

