[CIG-MC] problem launching citcoms without Batch system

Robert Moucha rmoucha at gmail.com
Wed May 14 14:01:27 PDT 2008


Hi Leif,

Thanks for the quick response.  I tried your suggestion and added a
~/.pyre/CitcomS/CitcomS.cfg file containing:

[CitcomS.launcher]
command = mpirun -nolocal -np ${nodes}

It appears that I also need to pass the machine file to mpirun,
because the error I get now is:

Could not find enough machines for architecture LINUX
--pyre-start: mpirun: exit 1

LINUX is the cluster's default MPICH machine file, which contains only
the head node.  What variable name should I use for the machine file in
the CitcomS.cfg file?
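
In the meantime, one workaround I may try is hardcoding the machine file
in the command itself.  This is only a sketch: I am assuming the launcher
passes the rest of the command string through to mpirun unchanged, and
/path/to/mpirun.nodes is a placeholder for wherever the file actually
lives.

```cfg
[CitcomS.launcher]
; Assumption: everything after "mpirun" is handed to mpirun as written.
; /path/to/mpirun.nodes is a placeholder, not a real path.
command = mpirun -nolocal -np ${nodes} -machinefile /path/to/mpirun.nodes
```

Running again with "--launcher.dry" should show whether -machinefile
ends up in the printed mpirun line before I attempt a real run.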

Thanks again,
Rob

On Wed, May 14, 2008 at 4:32 PM, Leif Strand <leif at geodynamics.org> wrote:
> Hi Rob,
>
> I would start by adding "--launcher.dry" to your command-line arguments:
>
> citcoms example1.cfg mymachines4.cfg --solver.datadir=/state/partition1/test --launcher.dry
>
> This will print the 'mpirun' command used by CitcomS (without actually
> executing it).  It will look something like this:
>
> mpirun -np 4 /path/to/mpipycitcoms --pyre-start [...lots of arguments...]
>
> So, for example, if "-nolocal" is missing, you would then add the following
> to your ~/.pyre/CitcomS/CitcomS.cfg file:
>
> [CitcomS.launcher]
> command = mpirun -nolocal -np ${nodes}
>
> Once the 'mpirun' command looks right, go ahead and remove "--launcher.dry"
> to perform an actual run.
>
> --Leif
>
> Robert Moucha wrote:
>>
>> Hello all,
>>
>> Just wondering if anyone else has had a problem launching citcoms without
>> a batch system.  It appears that the job is launched only on the head
>> node.  I installed the latest version (3.0.2) and issued the following
>> command in a working directory (CitcomS-3.0.2/bin is on my PATH):
>>
>> $ citcoms example1.cfg mymachines4.cfg --solver.datadir=/state/partition1/test
>>
>> Cannot make new directory '/state/partition1/test'
>> Cannot make new directory '/state/partition1/test'
>> Cannot make new directory '/state/partition1/test'
>> Cannot make new directory '/state/partition1/test'
>> --pyre-start: mpirun: exit 8
>> /home/moucha/CitcomS-3.0.2/bin/citcoms:
>> /home/moucha/CitcomS-3.0.2/bin/pycitcoms: exit 1
>>
>> The file mymachines4.cfg contains:
>>
>> [CitcomS.launcher]
>> nodegen = c0-%g
>> nodelist = [1-4]
>>
>> The mpirun.nodes file has:
>>
>> c0-1
>> c0-2
>> c0-3
>> c0-4
>>
>> which is correct for our cluster.
>>
>> The above error makes sense, because I don't have write privileges to
>> the /state directory on the head node.  If I change the datadir parameter
>> to a directory that I have write access to on the head node, the
>> program runs without a problem (but only on the head node).
>> Incidentally, the following command runs without problem on the
>> compute nodes:
>>
>> mpirun -nolocal -np 4 -machinefile mpirun.nodes ~/CitcomS-3.0.2/bin/CitcomSRegional example1b.cfg
>>
>> I'm using Python 2.4.5, gcc-3.4.6, mpich 1.2.7p1 (can provide further
>> info if need be).
>>
>> Thanks
>> Rob
>> _______________________________________________
>> CIG-MC mailing list
>> CIG-MC at geodynamics.org
>> http://geodynamics.org/cgi-bin/mailman/listinfo/cig-mc
>
>

