Hi Magali,

You can try to add the following to the subroutine
void parallel_domain_decomp1(struct All_variables *E) in Parallel_related.c:

----------------------------------------------------------------------------
	for(j = 0; j < E->parallel.nproc; j++)
		for(i = 0; i <= E->parallel.nproc; i++)
		{
			/* pin the message-tag arrays to small constant values */
			E->parallel.mst1[j][i] = 1;
			E->parallel.mst2[j][i] = 2;
			E->parallel.mst3[j][i] = 3;	/* mst3 assumed; the original note set mst2 twice here */
		}
----------------------------------------------------------------------------
<P style="MARGIN: 0cm 0cm 0pt" class=MsoPlainText><SPAN
style="mso-hansi-font-family: 宋体; mso-bidi-font-family: 宋体" lang=EN-US><FONT
face=Arial></FONT></SPAN> </P>
<P style="MARGIN: 0cm 0cm 0pt" class=MsoPlainText><SPAN
style="mso-hansi-font-family: 宋体; mso-bidi-font-family: 宋体" lang=EN-US><FONT
face=Arial>I'm not sure if it works, but I thought it deserve a try. This is a
machine-dependent issue. </FONT></SPAN></P>
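
For what it's worth, part of what makes such tag choices machine-dependent is
the tag ceiling: the MPI standard only guarantees message tags up to 32767,
and each implementation advertises its real limit through the MPI_TAG_UB
attribute. A minimal standalone sketch (not part of CitcomCU) to check that
limit on the new cluster:

----------------------------------------------------------------------------
#include <mpi.h>
#include <stdio.h>

/* Print the largest message tag this MPI implementation accepts.
   Build with mpicc and run under mpirun on the cluster in question. */
int main(int argc, char **argv)
{
    int flag;
    int *tag_ub;

    MPI_Init(&argc, &argv);

    /* MPI_TAG_UB is a predefined attribute cached on MPI_COMM_WORLD. */
    MPI_Comm_get_attr(MPI_COMM_WORLD, MPI_TAG_UB, &tag_ub, &flag);
    if (flag)
        printf("largest usable tag: %d\n", *tag_ub);

    MPI_Finalize();
    return 0;
}
----------------------------------------------------------------------------

If the tags the stock routine derives from processor indices exceed that
value, pinning them to small constants (as in the snippet above) stays
safely below the limit.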
<P style="MARGIN: 0cm 0cm 0pt" class=MsoPlainText><SPAN
style="mso-hansi-font-family: 宋体; mso-bidi-font-family: 宋体" lang=EN-US><FONT
face=Arial></FONT></SPAN> </P>
<P style="MARGIN: 0cm 0cm 0pt" class=MsoPlainText><SPAN
style="mso-hansi-font-family: 宋体; mso-bidi-font-family: 宋体" lang=EN-US><FONT
face=Arial>Good luck!</FONT></SPAN></P>
<P style="MARGIN: 0cm 0cm 0pt" class=MsoPlainText><SPAN
style="mso-hansi-font-family: 宋体; mso-bidi-font-family: 宋体" lang=EN-US><FONT
face=Arial></FONT></SPAN> </P>
<P style="MARGIN: 0cm 0cm 0pt" class=MsoPlainText><SPAN
style="mso-hansi-font-family: 宋体; mso-bidi-font-family: 宋体" lang=EN-US><FONT
face=Arial></FONT></SPAN> </P>
<P style="MARGIN: 0cm 0cm 0pt" class=MsoPlainText><SPAN
style="mso-hansi-font-family: 宋体; mso-bidi-font-family: 宋体" lang=EN-US><FONT
face=Arial>Jinshui Huang<BR>---------------------------------------<BR>School of
Earth and Space Sciences<BR>University of Science and Technology of
China<BR>Hefei, Anhui 230026,
China<BR>0551-3606781<BR>---------------------------------------</FONT></SPAN></P></DIV>

----- Original Message -----
From: Magali Billen <mibillen@ucdavis.edu>
To: Eh Tan <tan2@geodynamics.org>
Cc: cig-mc@geodynamics.org
Sent: Wednesday, November 18, 2009 10:23 AM
Subject: Re: [CIG-MC] Fwd: MPI_Isend error

Hello Eh,

This is a run on 8 processors. If I print the stack I get:

(gdb) bt
#0  0x00002b943e3c208a in opal_progress () from
    /share/apps/openmpisb-1.3/gcc-4.4/lib/libopen-pal.so.0
#1  0x00002b943def5c85 in ompi_request_default_wait_all () from
    /share/apps/openmpisb-1.3/gcc-4.4/lib/libmpi.so.0
#2  0x00002b943df229d3 in PMPI_Waitall () from
    /share/apps/openmpisb-1.3/gcc-4.4/lib/libmpi.so.0
#3  0x0000000000427ef5 in exchange_id_d20 ()
#4  0x00000000004166f3 in gauss_seidel ()
#5  0x000000000041884b in multi_grid ()
#6  0x0000000000418c44 in solve_del2_u ()
#7  0x000000000041b151 in solve_Ahat_p_fhat ()
#8  0x000000000041b9a1 in solve_constrained_flow_iterative ()
#9  0x0000000000411ca6 in general_stokes_solver ()
#10 0x0000000000409c21 in main ()

I've attached the version of Parallel_related.c that is used... I have not
modified it in any way from the CIG release of CitcomCU.

Luckily, there are commented fprintf statements in just that part of the
code... we'll continue to dig...

Oh, and just to eliminate the new cluster from suspicion, we downloaded,
compiled, and ran the CitcomS example1.cfg on the same cluster with the same
compilers, and there was no problem.

Maybe this is the sign that I'm supposed to finally switch from CitcomCU to
CitcomS... :-(
Magali

On Nov 17, 2009, at 5:02 PM, Eh Tan wrote:

> Hi Magali,
>
> How many processors are you using? If more than 100 processors are used,
> you are seeing this bug:
> http://www.geodynamics.org/pipermail/cig-mc/2008-March/000080.html
>
> Eh
>
> Magali Billen wrote:
<BLOCKQUOTE type="cite">One correction to the e-mail below, we've been
compiling CitcomCU<BR></BLOCKQUOTE>
<BLOCKQUOTE type="cite">using openmpi on our old<BR></BLOCKQUOTE>
<BLOCKQUOTE type="cite">cluster, so the compiler on the new cluster is the
same. The big<BR></BLOCKQUOTE>
<BLOCKQUOTE type="cite">difference is that the cluster<BR></BLOCKQUOTE>
<BLOCKQUOTE type="cite">is about twice as fast as the 5-year old cluster.
This suggests that<BR></BLOCKQUOTE>
<BLOCKQUOTE type="cite">this change to a much faster<BR></BLOCKQUOTE>
<BLOCKQUOTE type="cite">clsuter may have exposed an existing race
condition in CitcomCU??<BR></BLOCKQUOTE>
<BLOCKQUOTE type="cite">Magali<BR></BLOCKQUOTE>
<BLOCKQUOTE type="cite"><BR></BLOCKQUOTE>
<BLOCKQUOTE type="cite"><BR></BLOCKQUOTE>
<BLOCKQUOTE type="cite">Begin forwarded message:<BR></BLOCKQUOTE>
<BLOCKQUOTE type="cite"><BR></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">*From: *Magali Billen
<mibillen@ucdavis.edu<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE
type="cite"><mailto:mibillen@ucdavis.edu>><BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">*Date: *November 17, 2009 4:23:45 PM
PST<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">*To: *cig-mc@geodynamics.org
<mailto:cig-mc@geodynamics.org><BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">*Subject: **[CIG-MC] MPI_Isend
error*<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite"><BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">Hello,<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite"><BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">I'm using CitcomCU and am having a strange
problem with problem<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">either hanging (no error, just doesn't
<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">go anywhere) or it dies with an MPI_Isend error
(see below). I seem<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">to recall having problems with the MPI_Isend
<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">command and the lam-mpi version of mpi, but I've
not had any problems<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">with mpich-2.<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">On the new cluster we are compling with openmpi
instead of MPICH-2.<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite"><BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">The MPI_Isend error seems to occur during
Initialization in the call<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">to the function mass_matrix, which then
<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">calls exchange_node_f20, which is where the call
to MPI_Isend is.<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite"><BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">--snip--<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">ok14: parallel shuffle element and id
arrays<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">ok15: construct shape
functions<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">[farm.caes.ucdavis.edu:27041] *** An error
occurred in MPI_Isend<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">[farm.caes.ucdavis.edu:27041] *** on
communicator MPI_COMM_WORLD<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">[farm.caes.ucdavis.edu:27041] *** MPI_ERR_RANK:
invalid rank<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">[farm.caes.ucdavis.edu:27041] ***
MPI_ERRORS_ARE_FATAL (your MPI job<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">will now abort)<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite"><BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">Has this (or these) types of error occurred for
other versions of<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">Citcom using MPI_Isend (it seems that CitcomS
uses<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">this command also). I'm not sure how
to debug this error,<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">especially since sometimes it just hangs with no
error.<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite"><BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">Any advice you have would be
hepful,<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">Magali<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite"><BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite"><BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE
type="cite">-----------------------------<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">Associate Professor, U.C.
Davis<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">Department of
Geology/KeckCAVEs<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">Physical & Earth Sciences Bldg, rm
2129<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">Davis, CA 95616<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">-----------------<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">mibillen@ucdavis.edu
<mailto:mibillen@ucdavis.edu><BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">(530) 754-5696<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE
type="cite">*-----------------------------*<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">*** Note new e-mail, building,
office*<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">* information as of Sept. 2009
***<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE
type="cite">-----------------------------<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite"><BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE
type="cite">_______________________________________________<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">CIG-MC mailing list<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE type="cite">CIG-MC@geodynamics.org
<mailto:CIG-MC@geodynamics.org><BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite">
<BLOCKQUOTE
type="cite">http://geodynamics.org/cgi-bin/mailman/listinfo/cig-mc<BR></BLOCKQUOTE></BLOCKQUOTE>
<BLOCKQUOTE type="cite"><BR></BLOCKQUOTE>
<BLOCKQUOTE type="cite">-----------------------------<BR></BLOCKQUOTE>
<BLOCKQUOTE type="cite">Associate Professor, U.C. Davis<BR></BLOCKQUOTE>
<BLOCKQUOTE type="cite">Department of Geology/KeckCAVEs<BR></BLOCKQUOTE>
<BLOCKQUOTE type="cite">Physical & Earth Sciences Bldg, rm
2129<BR></BLOCKQUOTE>
<BLOCKQUOTE type="cite">Davis, CA 95616<BR></BLOCKQUOTE>
<BLOCKQUOTE type="cite">-----------------<BR></BLOCKQUOTE>
<BLOCKQUOTE type="cite">mibillen@ucdavis.edu
<mailto:mibillen@ucdavis.edu><BR></BLOCKQUOTE>
<BLOCKQUOTE type="cite">(530) 754-5696<BR></BLOCKQUOTE>
<BLOCKQUOTE type="cite">*-----------------------------*<BR></BLOCKQUOTE>
<BLOCKQUOTE type="cite">*** Note new e-mail, building,
office*<BR></BLOCKQUOTE>
<BLOCKQUOTE type="cite">* information as of Sept. 2009
***<BR></BLOCKQUOTE>
<BLOCKQUOTE type="cite">-----------------------------<BR></BLOCKQUOTE>
<BLOCKQUOTE type="cite"><BR></BLOCKQUOTE>
<BLOCKQUOTE
type="cite">------------------------------------------------------------------------<BR></BLOCKQUOTE>
<BLOCKQUOTE type="cite"><BR></BLOCKQUOTE>
<BLOCKQUOTE
type="cite">_______________________________________________<BR></BLOCKQUOTE>
<BLOCKQUOTE type="cite">CIG-MC mailing list<BR></BLOCKQUOTE>
<BLOCKQUOTE type="cite">CIG-MC@geodynamics.org<BR></BLOCKQUOTE>
<BLOCKQUOTE
type="cite">http://geodynamics.org/cgi-bin/mailman/listinfo/cig-mc<BR></BLOCKQUOTE>
<BLOCKQUOTE type="cite"><BR></BLOCKQUOTE><BR>-- <BR>Eh Tan<BR>Staff
Scientist<BR>Computational Infrastructure for Geodynamics<BR>California
Institute of Technology, 158-79<BR>Pasadena, CA 91125<BR>(626)
395-1693<BR>http://www.geodynamics.org<BR><BR>_______________________________________________<BR>CIG-MC
mailing
list<BR>CIG-MC@geodynamics.org<BR>http://geodynamics.org/cgi-bin/mailman/listinfo/cig-mc<BR></DIV></BLOCKQUOTE></DIV><BR>

-----------------------------
Associate Professor, U.C. Davis
Department of Geology/KeckCAVEs
Physical & Earth Sciences Bldg, rm 2129
Davis, CA 95616
-----------------
mibillen@ucdavis.edu
(530) 754-5696
-----------------------------
** Note new e-mail, building, office
   information as of Sept. 2009 **
-----------------------------

_______________________________________________
CIG-MC mailing list
CIG-MC@geodynamics.org
http://geodynamics.org/cgi-bin/mailman/listinfo/cig-mc