[CIG-MC] Fwd: MPI_Isend error

Eh Tan tan2 at geodynamics.org
Tue Nov 17 17:02:38 PST 2009


Hi Magali,

How many processors are you using? If more than 100 processors are used,
you are seeing this bug:
http://www.geodynamics.org/pipermail/cig-mc/2008-March/000080.html
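
For reference, MPI_ERR_RANK means that the destination rank passed to the call lies outside the communicator's valid range. One way to narrow this down is to check each destination rank before it reaches MPI_Isend. The following is only a minimal sketch under that assumption: the helper names `rank_is_valid` and `count_invalid_ranks` are hypothetical and do not exist in CitcomCU, and the destination table stands in for whatever array the code actually uses.

```c
#include <stdio.h>

/*
 * Hypothetical helpers for illustration; they are not part of CitcomCU.
 * A destination rank is valid for a communicator of comm_size processes
 * when it lies in [0, comm_size). (MPI also accepts MPI_PROC_NULL as a
 * destination; this sketch does not handle that case.)
 */
static int rank_is_valid(int dest, int comm_size)
{
    return dest >= 0 && dest < comm_size;
}

/*
 * Scan a table of destination ranks and report every entry that MPI
 * would reject. Returns the number of invalid entries found.
 */
static int count_invalid_ranks(const int *dest, int n, int comm_size, FILE *log)
{
    int bad = 0;
    for (int i = 0; i < n; i++) {
        if (!rank_is_valid(dest[i], comm_size)) {
            if (log != NULL)
                fprintf(log, "entry %d: invalid destination rank %d (size %d)\n",
                        i, dest[i], comm_size);
            bad++;
        }
    }
    return bad;
}
```

In an MPI program, comm_size would come from MPI_Comm_size, and the check would run just before each MPI_Isend. Calling MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN) beforehand makes MPI return an error code instead of aborting, which may also help locate the failing call.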


Eh



Magali Billen wrote:
> One correction to the e-mail below: we've been compiling CitcomCU
> using openmpi on our old cluster, so the compiler on the new cluster
> is the same. The big difference is that the new cluster is about
> twice as fast as the 5-year-old cluster. This suggests that the
> change to a much faster cluster may have exposed an existing race
> condition in CitcomCU?
> Magali
>
>
> Begin forwarded message:
>
>> *From: *Magali Billen <mibillen at ucdavis.edu
>> <mailto:mibillen at ucdavis.edu>>
>> *Date: *November 17, 2009 4:23:45 PM PST
>> *To: *cig-mc at geodynamics.org <mailto:cig-mc at geodynamics.org>
>> *Subject: **[CIG-MC] MPI_Isend error*
>>
>> Hello,
>>
>> I'm using CitcomCU and am having a strange problem: the run either
>> hangs (no error, just doesn't go anywhere) or dies with an MPI_Isend
>> error (see below). I seem to recall having problems with the
>> MPI_Isend command and the lam-mpi version of MPI, but I've not had
>> any problems with MPICH-2.
>> On the new cluster we are compiling with openmpi instead of MPICH-2.
>>
>> The MPI_Isend error seems to occur during Initialization in the call
>> to the function mass_matrix, which then 
>> calls exchange_node_f20, which is where the call to MPI_Isend is.
>>
>> --snip--
>> ok14: parallel shuffle element and id arrays
>> ok15: construct shape functions
>> [farm.caes.ucdavis.edu:27041] *** An error occurred in MPI_Isend
>> [farm.caes.ucdavis.edu:27041] *** on communicator MPI_COMM_WORLD
>> [farm.caes.ucdavis.edu:27041] *** MPI_ERR_RANK: invalid rank
>> [farm.caes.ucdavis.edu:27041] *** MPI_ERRORS_ARE_FATAL (your MPI job
>> will now abort)
>>
>> Have these types of error occurred for other versions of Citcom
>> that use MPI_Isend? (It seems that CitcomS uses this command also.)
>> I'm not sure how to debug this error, especially since sometimes it
>> just hangs with no error.
>>
>> Any advice you have would be helpful,
>> Magali
>>
>>
>> -----------------------------
>> Associate Professor, U.C. Davis
>> Department of Geology/KeckCAVEs
>> Physical & Earth Sciences Bldg, rm 2129
>> Davis, CA 95616
>> -----------------
>> mibillen at ucdavis.edu <mailto:mibillen at ucdavis.edu>
>> (530) 754-5696
>> *-----------------------------*
>> *** Note new e-mail, building, office*
>> *    information as of Sept. 2009 ***
>> -----------------------------
>>
>> _______________________________________________
>> CIG-MC mailing list
>> CIG-MC at geodynamics.org <mailto:CIG-MC at geodynamics.org>
>> http://geodynamics.org/cgi-bin/mailman/listinfo/cig-mc
>

-- 
Eh Tan
Staff Scientist
Computational Infrastructure for Geodynamics
California Institute of Technology, 158-79
Pasadena, CA 91125
(626) 395-1693
http://www.geodynamics.org


