[CIG-SEISMO] SPECFEM3D Cartesian MPI failure due to gatherv_all_cr() in fault_solver_dynamic.f90

Junwei Huang jwhuang1982 at gmail.com
Sat Mar 14 13:38:21 PDT 2015


Hi Kangchen,
Thanks for your reply. I cloned the devel version again with git and the
problem still exists. I needed to modify some parts of the code to get it to
compile successfully with FC=gfortran and MPIFC=mpif90. Here is what I did.
1. in earth_chunk_all_Utils.f90,
-character text*80,cnlay*2,form*11
+character(len=10) text
+character(len=2) cnlay
+character(len=11) form
Otherwise, my gfortran compiler reports that text, cnlay, and form are not
defined.
2. in fault_scotch.f90, fault_solver_kinematic.f90,
fault_solver_dynamic.f90, fault_solver_common.f90,
fault_generate_database.f90
change all "../DATA/" to "./DATA/". Otherwise xgenerate_database and
xspecfem can't find the Par_file_faults file and run as if there were no
faults.
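
The path change in step 2 could be scripted roughly as follows. This is only a minimal sketch, assuming a POSIX shell with sed, that the affected files are passed in explicitly, and that every occurrence of ../DATA/ in them should be rewritten. The function name fix_data_paths is hypothetical and not part of SPECFEM3D.

```shell
#!/bin/sh
# Hypothetical helper (not part of SPECFEM3D): rewrite "../DATA/" to "./DATA/"
# in each file given as an argument, keeping a ".orig" backup next to it.
fix_data_paths() {
    for f in "$@"; do
        # Save the untouched file first so the change can be reverted.
        cp "$f" "$f.orig" || return 1
        # Replace every literal ../DATA/ with ./DATA/ (dots escaped for sed).
        sed 's|\.\./DATA/|./DATA/|g' "$f.orig" > "$f" || return 1
    done
}

# Example call, run from the directory that holds the files
# (that location is an assumption, it is not stated in this message):
#   fix_data_paths fault_scotch.f90 fault_solver_kinematic.f90 \
#       fault_solver_dynamic.f90 fault_solver_common.f90 \
#       fault_generate_database.f90
```

Note that a path such as ../../DATA/ would become .././DATA/ under this pattern, so the result is worth checking against the .orig backups before recompiling.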

After that I can run xspecfem on a single processor. With mpirun -np 2 or
more processors, I get the same MPI failure. I have attached the
output_solver and output_mesher files. Please let me know if I have any
settings wrong. Thanks.

On Fri, Mar 13, 2015 at 7:06 PM, Kangchen Bai <kbai at caltech.edu> wrote:

> Hi Junwei,
>
> We have tested the latest devel version of specfem3d with the fault solver
> and didn't run into the same problem.
> We did fix a bug in the generate_databases program hours ago, but it may not
> be relevant to this issue, which is found in the solver itself.
> So could you please provide more information, including your
> ./OUTPUT_FILES/output_solver.txt, output_mesher.txt, and stdout and stderr
> files, so that we can better diagnose your problem.
> Kangchen
>
-------------- next part --------------

 ******************************************
 *** Specfem3D MPI Mesher - f90 version ***
 ******************************************

 This is process            0
 There are            2  MPI processes
 Processes are numbered from 0 to            1

 There is a total of            2  slices

 NGLLX =            5
 NGLLY =            5
 NGLLZ =            5

 Shape functions defined by NGNOD =            8  control nodes
 Surface shape functions defined by NGNOD2D =            4  control nodes
 Beware! Curvature (i.e. HEX27 elements) is not handled by our internal mesher

 velocity model:   default 


 suppressing UTM projection

 no attenuation

 no anisotropy

 no oceans

 incorporating Stacey absorbing conditions

 using a CMTSOLUTION source


 using a Gaussian source time function


 **************************
 creating mesh in the model
 **************************

 external mesh points :        42240
 defined materials    :            1
 undefined materials  :            0
 total number of spectral elements:        37212
 absorbing boundaries: 
   xmin,xmax :          882         882
   ymin,ymax :          882         882
   bottom,top:         1772        1772
 total number of C-PML elements in the global mesh:            0
 number of MPI partition interfaces:            2

   minimum memory used so far     :    64.927567     MB per process
   minimum total memory requested :    301.67948150634766      MB per process

 create regions: 

   ...allocating arrays 
   ... reading            1  faults from file DATA/Par_file_faults
   ...setting up jacobian 
   ...indexing global points
   ... resetting up jacobian in fault domains
   ...preparing MPI interfaces 
      total MPI interface points:        30770
      total assembled MPI interface points:       30770
   ...setting up absorbing boundaries 
      absorbing boundary:
      total number of free faces =         1772
      total number of faces =         5300
   ...determining velocity model
               10  % time remaining:  3.91040852742714599E-007 s
               20  % time remaining:  3.53678073740143397E-007 s
               30  % time remaining:  3.14934544669187469E-007 s
               40  % time remaining:  2.62514008907355689E-007 s
               50  % time remaining:  2.61015730063034299E-007 s
               60  % time remaining:  2.55026482487403323E-007 s
               70  % time remaining:  2.10196207989413701E-007 s
               80  % time remaining:  1.52214806125626555E-007 s
               90  % time remaining:  8.19325334114913287E-008 s
              100  % time remaining:  2.20803043136808056E-010 s
   ...detecting acoustic-elastic-poroelastic surfaces 
      total acoustic elements   :           0
      total elastic elements    :       37212
      total poroelastic elements:           0
   ...element inner/outer separation 
      for overlapping of communications with calculations:
      percentage of   edge elements    5.0279312     %
      percentage of volume elements    94.972069     %
   ...element mesh coloring 
      use coloring =  F
   ...external binary models 
      no external binary model used 
   ...creating mass matrix 
   ...saving databases
   ...saving fault databases
   ...checking mesh resolution

 ********
 minimum and maximum number of elements
 and points in the CUBIT + SCOTCH mesh:

 NSPEC_global_min =        18417
 NSPEC_global_max =        18795
 NSPEC_global_max / NSPEC_global_min imbalance =    1.0205245      =    2.0524516      %
 NSPEC_global_sum =        37212

 NGLOB_global_min =      1214905
 NGLOB_global_max =      1245080
 NGLOB_global_max / NGLOB_global_min imbalance =    1.0248374      =    2.4837332      %
 NGLOB_global_sum =      2459985

 If you have elements of a single type (all acoustic, all elastic, all poroelastic, and without CPML)
 in the whole mesh, then there should be no significant imbalance in the above numbers.
 Otherwise, it is normal to have imbalance in elements and points because the domain decomposer
 compensates for the different cost of different elements by partitioning them unevenly among processes.
 ********


 ********
 Model: P velocity min,max =    6000.0000       6000.0000    
 Model: S velocity min,max =    3464.0000       3464.0000    
 ********

 *********************************************
 *** Verification of simulation parameters ***
 *********************************************

 *** Xmin and Xmax of the model =   -21000.000       21000.000    
 *** Ymin and Ymax of the model =   -21000.000       21000.000    
 *** Zmin and Zmax of the model =   -21000.000       0.0000000    

 *** Max GLL point distance =    438.15924    
 *** Min GLL point distance =    108.94147    
 *** Max/min ratio =    4.0219688    

 *** Max element size =    1365.3062    
 *** Min element size =    630.90759    
 *** Max/min ratio =    2.1640351    

 *** Minimum period resolved =   0.49267685    
 *** Maximum suggested time step =   9.07845609E-03

 Elapsed time for checking mesh resolution in seconds =   0.13074898719787598     

 min and max of topography included in mesh in m is    0.0000000000000000          0.0000000000000000     


 Repartition of elements:
 -----------------------

 total number of elements in mesh slice 0:        18795
 total number of points in mesh slice 0:      1245080

 total number of elements in entire mesh:        37212
 approximate total number of points in entire mesh (with duplicates on MPI edges):    2459985.0000000000     
 approximate total number of DOFs in entire mesh (with duplicates on MPI edges):    7379955.0000000000     

 total number of time steps in the solver will be:         4000

 using single precision for the calculations

 smallest and largest possible floating-point numbers are:   1.17549435E-38  3.40282347E+38


 Elapsed time for mesh generation and buffer creation in seconds =    30.266782045364380     
 End of mesh generation

 done

-------------- next part --------------

 **********************************************
 **** Specfem 3-D Solver - MPI version f90 ****
 **********************************************


 Fixing slow underflow trapping problem using small initial field

 There are            2  MPI processes
 Processes are numbered from 0 to            1

 There is a total of            2  slices

  NDIM =            3

  NGLLX =            5
  NGLLY =            5
  NGLLZ =            5

 using single precision for the calculations

 smallest and largest possible floating-point numbers are:   1.17549435E-38  3.40282347E+38

 velocity model:   default 

 total acoustic elements    :           0
 total elastic elements     :       37212
 total poroelastic elements :           0

 ********
 minimum and maximum number of elements
 and points in the CUBIT + SCOTCH mesh:

 NSPEC_global_min =        18417
 NSPEC_global_max =        18795
 NSPEC_global_max / NSPEC_global_min imbalance =    1.0205245      =    2.0524516      %
 NSPEC_global_sum =        37212

 NGLOB_global_min =      1214905
 NGLOB_global_max =      1245080
 NGLOB_global_max / NGLOB_global_min imbalance =    1.0248374      =    2.4837332      %
 NGLOB_global_sum =      2459985

 If you have elements of a single type (all acoustic, all elastic, all poroelastic, and without CPML)
 in the whole mesh, then there should be no significant imbalance in the above numbers.
 Otherwise, it is normal to have imbalance in elements and points because the domain decomposer
 compensates for the different cost of different elements by partitioning them unevenly among processes.
 ********


 ********
 Model: P velocity min,max =    6000.0000       6000.0000    
 Model: S velocity min,max =    3464.0000       3464.0000    
 ********

 *********************************************
 *** Verification of simulation parameters ***
 *********************************************

 *** Xmin and Xmax of the model =   -21000.000       21000.000    
 *** Ymin and Ymax of the model =   -21000.000       21000.000    
 *** Zmin and Zmax of the model =   -21000.000       0.0000000    

 *** Max GLL point distance =    438.15924    
 *** Min GLL point distance =    108.94147    
 *** Max/min ratio =    4.0219688    

 *** Max element size =    1365.3062    
 *** Min element size =    630.90759    
 *** Max/min ratio =    2.1640351    

 *** Minimum period resolved =   0.49267685    
 *** Maximum suggested time step =   9.07845609E-03

 *** for DT :   2.00000000000000004E-003
 *** Max stability for wave velocities =   0.11015089    

 Elapsed time for checking mesh resolution in seconds =   0.12752795219421387     
 ******************************************
 There is a total of            2  slices
 ******************************************

 no UTM projection:

 *************************************
  locating source            1
 *************************************

 source located in slice            1
                in element        17582
                in elastic domain

 using moment tensor source: 
   xi coordinate of source in that element:    1.0000000000000000     
   eta coordinate of source in that element:   -1.0000000000000000     
   gamma coordinate of source in that element:   -1.0000000000000000     


 Source time function is a Heaviside, convolve later

   half duration:   1.00000000000000002E-002  seconds

 magnitude of the source:
      scalar moment M0 =    1.5418313185948713       dyne-cm
   moment magnitude Mw =   -10.574641900550120     

   time shift:    0.0000000000000000       seconds

 original (requested) position of the source:

           latitude:    1000000.0000000000     
          longitude:    1000000.0000000000     

              x:    1000000.0000000000     
              y:    1000000.0000000000     
          depth:   -1000000.0000000000       km
 topo elevation:    0.0000000000000000     

 position of the source that will be used:

              x:    21000.000000000000     
              y:    21000.000000000000     
          depth:    0.0000000000000000       km
              z:    0.0000000000000000     

 error in location of the source:   1.00000096E+09  m

 *****************************************************
 *****************************************************
 ***** WARNING: source location estimate is poor *****
 *****************************************************
 *****************************************************

 maximum error in location of the sources:   1.00000096E+09  m


 Elapsed time for detection of sources in seconds =   1.96411609649658203E-002

 End of source detection - done


 there are            4  stations in file ./DATA/STATIONS
 saving            4  stations inside the model in file ./DATA/STATIONS_FILTERED
 excluding            0  stations located outside the model


 Total number of receivers =            4


 ********************
  locating receivers
 ********************

 reading receiver information from ./DATA/STATIONS_FILTERED file

 Station #           1 : SC.str12dp00    horizontal distance:     20.124611      km
 Station #           2 : SC.str-12dp00    horizontal distance:     37.589893      km
 Station #           3 : SC.str12dp75    horizontal distance:     20.124611      km
 Station #           4 : SC.str-12dp75    horizontal distance:     37.589893      km

 station #            1     SC    str12dp00
      original latitude:    3000.0000    
      original longitude:    12000.000    
      original x:    12000.000    
      original y:    3000.0000    
      original depth:    0.0000000      m
      horizontal distance:    20.124611    
      target x, y, z:    12000.000       3000.0000       0.0000000    
      closest estimate found:   1.81898940E-12  m away
      in slice            1  in element        17934
      at coordinates: 
      xi    =  -0.20647596969423271     
      eta   =  -0.91038011331253543     
      gamma =   -1.0000000000000000     
      x:    11999.999999999998     
      y:    3000.0000000000000     
      depth:    0.0000000000000000       m
      z:    0.0000000000000000     



 station #            2     SC    str-12dp00
      original latitude:    3000.0000    
      original longitude:   -12000.000    
      original x:   -12000.000    
      original y:    3000.0000    
      original depth:    0.0000000      m
      horizontal distance:    37.589893    
      target x, y, z:   -12000.000       3000.0000       0.0000000    
      closest estimate found:    0.0000000      m away
      in slice            1  in element        17927
      at coordinates: 
      xi    =  -0.66479133261928258     
      eta   =  -0.34482670791132286     
      gamma =   -1.0000000000000000     
      x:   -12000.000000000000     
      y:    3000.0000000000000     
      depth:    0.0000000000000000       m
      z:    0.0000000000000000     



 station #            3     SC    str12dp75
      original latitude:    3000.0000    
      original longitude:    12000.000    
      original x:    12000.000    
      original y:    3000.0000    
      original depth:    7500.0000      m
      horizontal distance:    20.124611    
      target x, y, z:    12000.000       3000.0000      -7500.0000    
      closest estimate found:   9.09494702E-13  m away
      in slice            1  in element        11795
      at coordinates: 
      xi    =  -0.21616340595760816     
      eta   =  -0.91051580812572375     
      gamma =  -1.51244101951658958E-020
      x:    12000.000000000000     
      y:    3000.0000000000000     
      depth:    7500.0000000000009       m
      z:   -7500.0000000000009     



 station #            4     SC    str-12dp75
      original latitude:    3000.0000    
      original longitude:   -12000.000    
      original x:   -12000.000    
      original y:    3000.0000    
      original depth:    7500.0000      m
      horizontal distance:    37.589893    
      target x, y, z:   -12000.000       3000.0000      -7500.0000    
      closest estimate found:   1.87497132E-12  m away
      in slice            1  in element        11788
      at coordinates: 
      xi    =  -0.66567125667187177     
      eta   =  -0.36437165260444360     
      gamma =   1.81898940354585809E-015
      x:   -12000.000000000002     
      y:    3000.0000000000005     
      depth:    7500.0000000000000       m
      z:   -7500.0000000000000     


 maximum error in location of all the receivers:   1.87497132E-12  m

 Elapsed time for receiver detection in seconds =   0.10065293312072754     

 End of receiver detection - done


 Total number of samples for seismograms =         4000

 found a total of            4  receivers in all the slices


 no attenuation

 no anisotropy

 no oceans

 no gravity

 no acoustic simulation

 incorporating elastic simulation

 no poroelastic simulation

 no movie simulation


 There is 1 fault in file DATA/Par_file_faults
 There is 1 fault in file DATA/Par_file_faults

 no gravity simulation


 Elapsed time for preparing timerun in seconds =   0.24629902839660645     

 time loop:

            time step:   2.00000009E-03  s
 number of time steps:         4000
 total simulated time:    8.0000000      seconds
 start time: -1.99999996E-02  seconds


