[CIG-SHORT] Problem attempting to run on multiple processors

Leif Strand leif at geodynamics.org
Mon Jun 18 12:09:44 PDT 2007


Oliver,

I apologize for that rather cryptic error message. It usually means that 
'mpirun' could not be found: exit status 127 is the shell's "command not 
found" code.
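
To see where the 127 comes from, here is a minimal sketch that runs a 
command that does not exist and prints its exit status. The command name 
'no_such_launcher_xyz' is made up for illustration only; it has nothing 
to do with PyLith.

```shell
# Run a nonexistent command in a child shell and capture its exit status.
# 'no_such_launcher_xyz' is a hypothetical name chosen so that it is not found.
status=0
sh -c 'no_such_launcher_xyz' 2>/dev/null || status=$?

# A POSIX shell reports 127 when the command cannot be found.
echo "exit status: $status"
```

The same status shows up in your log as "mpirun: exit 127".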

Naturally, you could alter your PATH so that it can find 'mpirun' in the 
context of the job. However, there is a good chance that 'mpirun' isn't 
even the right command to use on your cluster.
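
If you want to check what the job actually sees, something like the 
following could be placed in the script you hand to qsub. It is only a 
sketch: the helper name 'report_launcher' is my own, and 'mpirun' and 
'mpiexec' are just the two usual candidates, so your cluster may use a 
different name entirely.

```shell
# Print where a launcher resolves on the current PATH, or say it is missing.
# 'report_launcher' is a hypothetical helper, not part of PyLith or Pyre.
report_launcher() {
    if launcher_path=$(command -v "$1" 2>/dev/null); then
        echo "$1: $launcher_path"
    else
        echo "$1: not found on PATH"
    fi
}

# Check the two common launcher names from inside the job environment.
for launcher in mpirun mpiexec; do
    report_launcher "$launcher"
done
```

The PATH inside a batch job often differs from your login shell, so the 
answer from the job is the one that matters.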

So, I recommend creating the file ~/.pyre/pylith3d/pylith3d.cfg and 
inserting something similar to the following:

[pylith3d.launcher]
command = /full/path/to/mpirun -np ${nodes}
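
If you prefer to create the file from the shell, a sketch along these 
lines should do it. The '/full/path/to/mpirun' part is still a 
placeholder that you must replace, and the guard against overwriting an 
existing file is my own addition. Note the quoted 'EOF': it keeps the 
shell from expanding ${nodes}, which has to reach the file literally.

```shell
# Directory and file names as given above.
cfg_dir="${HOME}/.pyre/pylith3d"
cfg_file="${cfg_dir}/pylith3d.cfg"

mkdir -p "$cfg_dir"

if [ -e "$cfg_file" ]; then
    # Do not clobber a configuration that is already there.
    echo "$cfg_file already exists, leaving it alone"
else
    # Quoting the delimiter ('EOF') writes ${nodes} verbatim.
    cat > "$cfg_file" <<'EOF'
[pylith3d.launcher]
command = /full/path/to/mpirun -np ${nodes}
EOF
    echo "wrote $cfg_file"
fi
```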


Using a full pathname for 'mpirun' avoids the PATH environment problem. 
But more importantly, you will have to edit 'command' so that it 
produces the right command for your cluster. For example, if you are 
using MPICH2, you would replace 'mpirun' with 'mpiexec'. On many 
clusters, there is a special script to use (on ours it's "mpirun.lsf") 
and "-np ${nodes}" is omitted:

[pylith3d.launcher]
command = /opt/lsfhpc/6.2/linux2.6-glibc2.3-x86_64/bin/mpirun.lsf

To debug the "launcher" command, run PyLith as follows:

pylith3dapp.py pylith3d.cfg --launcher.dry

This will simply print the launcher command to the console, instead of 
actually running it.

--Leif


Oliver Boyd wrote:
>
> Hi Leif,
>
> It looks like when I reinstalled a few of the compute nodes, I 
> neglected to update gcc. I’ve now done this, and it appears to work 
> fine. Thanks for the hint.
>
> As an aside, could you tell me how I can use the SGE batch system? I 
> am not exactly sure why it doesn’t work. I use a program called qsub 
> to submit the job. I give it an argument which is a script containing 
> the command and a few options, e.g.
>
> > qsub pylith-1.sh
>
> where pylith-1.sh looks like
>
> #!/bin/bash
> #$ -cwd
> #$ -j y
> #$ -S /bin/bash
>
> pylith3dapp.py pylith3d.cfg
>
> The result of this attempt produces
>
> --pyre-start: mpirun: exit 127
>
> /usr/local/bin/pylith3dapp.py: 
> /data/Software/Store/pylith3d-0.8.2/pylith3d/pypylith3d: exit 1
>
> Oliver
>


