[CIG-SEISMO] SPECFEM3D: time per time step increasing during simulation

Brad Aagaard baagaard at usgs.gov
Tue Oct 7 14:10:55 PDT 2014


Dimitri and Martin,

Thanks for the hints. It looks like a problem with output, not denormal
floats. When I turned off output, the time per time step
became constant. The file server on the cluster was recently updated, so 
there might still be some kinks to work out there.

The denormal-handling flags are turned on, and adding -ftz with the Intel
compilers didn't make any difference, so the defaults appear to be fine for
that issue.
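
For reference, one quick way to check whether flush-to-zero (FTZ) and
denormals-are-zero (DAZ) are actually enabled at run time is to read the
SSE control register. A minimal sketch, assuming an x86 machine and the
standard MXCSR bit positions:

   #include <stdio.h>
   #include <xmmintrin.h>   /* _mm_getcsr() */

   int main(void)
   {
     unsigned int csr = _mm_getcsr();
     /* MXCSR bit 15 = FTZ (flush denormal results to zero),
        MXCSR bit  6 = DAZ (treat denormal inputs as zero)   */
     printf("FTZ: %s\n", (csr & 0x8000u) ? "on" : "off");
     printf("DAZ: %s\n", (csr & 0x0040u) ? "on" : "off");
     return 0;
   }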

Thanks,
Brad


On 10/07/2014 12:46 PM, Dimitri Komatitsch wrote:
>
> Hi Brad and Martin, Hi all,
>
> Yes, that comes from very slow gradual underflow, which is unfortunately
> on by default on some Intel processors; one must turn it off. SPECFEM has
> three options to handle that (you can use all of them, it does not hurt;
> in principle they are on by default, though, so I am surprised they
> seem to be off on your machine):
>
> 1/ compile with -ftz (flush-to-zero) for the Intel compiler; the
> -ffast-math option for gfortran sometimes works and sometimes does not,
> for some reason
>
> 2/ call a C function I wrote based on some routines I found on the Web
> (a sketch of this kind of routine follows after this list):
> https://github.com/geodynamics/specfem3d/blob/devel/src/shared/force_ftz.c
>
> 3/ set the initial field to some small value instead of zero before the
> time loop to avoid underflows; here is how we do it in SPECFEM (the flag
> is defined in setup/constants.h.in, and is on by default):
>
> ! on some processors it is necessary to suppress underflows
> ! by using a small initial field instead of zero
>    logical, parameter :: FIX_UNDERFLOW_PROBLEM = .true.
>
>    if(FIX_UNDERFLOW_PROBLEM) displ(:,:) = VERYSMALLVAL
>
> (where VERYSMALLVAL is 1.d-24 or so)
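>
> For option 2/, here is a minimal sketch of what such a flush-to-zero
> routine can look like on x86 with SSE (the actual force_ftz.c in the
> repository may differ in its details; the trailing underscore is the
> usual name-mangling convention so that Fortran can simply
> "call force_ftz()"):
>
>    /* sketch of a flush-to-zero helper callable from Fortran:
>       set the FTZ and DAZ bits of the SSE control register so that
>       denormal results and inputs are flushed / treated as zero */
>    #include <xmmintrin.h>   /* _MM_SET_FLUSH_ZERO_MODE     */
>    #include <pmmintrin.h>   /* _MM_SET_DENORMALS_ZERO_MODE */
>
>    void force_ftz_(void)
>    {
>      _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
>      _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);
>    }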
>
> Best regards,
> Dimitri.
>
> On 10/07/2014 08:01 PM, Martin van Driel wrote:
>> Dear Brad,
>>
>> I am not a SPECFEM user, but I wanted to mention that we saw a similar
>> effect in AxiSEM a while ago. We finally identified the cause:
>> denormal floats. Some reading:
>>
>> http://stackoverflow.com/questions/9314534/why-does-changing-0-1f-to-0-slow-down-performance-by-10x
>>
>>
>> When the wavefield is initialized with zeros, it goes through the regime
>> of denormal floats at the first rise of the P-wave. As the wavefront
>> expands, there are more and more denormal floats to handle. In AxiSEM,
>> once the P-wave had arrived at the antipode, the simulation went back to
>> normal speed. The difference in our case was a factor of three in
>> performance.
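>>
>> To see the effect in isolation, a small standalone test along these
>> lines (a rough sketch, not taken from AxiSEM; compile without
>> -ffast-math so the compiler does not flush the denormals away) shows
>> the slowdown once the operands fall into the denormal range:
>>
>>    #include <stdio.h>
>>    #include <time.h>
>>
>>    /* multiply-accumulate over an array whose entries are either
>>       normal (1.0f) or denormal (1e-40f); the denormal case is
>>       typically several times slower on CPUs that handle gradual
>>       underflow in microcode */
>>    static double run(float v)
>>    {
>>      enum { N = 1000, ITERS = 100000 };
>>      float a[N], s = 0.0f;
>>      for (int i = 0; i < N; i++) a[i] = v;
>>      clock_t t0 = clock();
>>      for (int it = 0; it < ITERS; it++)
>>        for (int i = 0; i < N; i++)
>>          s += a[i] * 0.5f;    /* denormal operand when v = 1e-40f */
>>      clock_t t1 = clock();
>>      volatile float sink = s; (void) sink;   /* keep the loops alive */
>>      return (double) (t1 - t0) / CLOCKS_PER_SEC;
>>    }
>>
>>    int main(void)
>>    {
>>      printf("normal   (1.0f)  : %.2f s\n", run(1.0f));
>>      printf("denormal (1e-40f): %.2f s\n", run(1e-40f));
>>      return 0;
>>    }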
>>
>> We found several solutions that change how denormal floats are treated.
>>
>> 1) Compiler flags: -ffast-math for gfortran, -ftz for ifort. Cray seems
>> to have flush-to-zero enabled by default.
>>
>> 2) Alternatively, the behaviour can be changed using "IA intrinsics", see
>>
>> https://software.intel.com/en-us/articles/how-to-avoid-performance-penalties-for-gradual-underflow-behavior
>>
>>
>> 3) A simplistic solution is to initialize the displacement with some
>> value that is just above the denormal range; for single precision, 1e-30
>> or 1e-35 worked, if I recall correctly.
>>
>> Our current solution in AxiSEM uses the IA intrinsics and essentially
>> consists of calling the function set_ftz() once at the very beginning of
>> the program:
>>
>> https://github.com/geodynamics/axisem/blob/master/SOLVER/ftz.c
>>
>> Hope this helps,
>> Martin
>>
>>
>> On 10/07/2014 06:39 PM, Brad Aagaard wrote:
>>> SPECFEM3D users and developers,
>>>
>>> I am finding that the average time per time step in a SPECFEM3D
>>> simulation is increasing as the simulation progresses:
>>>
>>>   Time step #          400
>>>   Time:   -1.002500      seconds
>>>   Elapsed time in seconds =    135.029711008072
>>>   Elapsed time in hh:mm:ss =    0 h 02 m 15 s
>>>   Mean elapsed time per time step in seconds =   0.337574277520180
>>>
>>>   Time step #          800
>>>   Time:  -2.4999999E-03  seconds
>>>   Elapsed time in seconds =    420.503839015961
>>>   Elapsed time in hh:mm:ss =    0 h 07 m 00 s
>>>   Mean elapsed time per time step in seconds =   0.525629798769951
>>>
>>>   Time step #         1200
>>>   Time:   0.9975000      seconds
>>>   Elapsed time in seconds =    854.967207908630
>>>   Elapsed time in hh:mm:ss =    0 h 14 m 14 s
>>>   Mean elapsed time per time step in seconds =   0.712472673257192
>>>
>>>   Time step #         1600
>>>   Time:    1.997500      seconds
>>>   Elapsed time in seconds =    1439.92759609222
>>>   Elapsed time in hh:mm:ss =    0 h 23 m 59 s
>>>   Mean elapsed time per time step in seconds =   0.899954747557640
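>>>
>>> (For reference, the incremental cost implied by the numbers above is
>>> roughly 0.34 s/step for steps 1-400, 0.71 s/step for steps 401-800,
>>> 1.09 s/step for steps 801-1200, and 1.46 s/step for steps 1201-1600,
>>> i.e. the cost per step appears to grow roughly linearly with the step
>>> number.)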
>>>
>>> This behavior seems very odd because I would expect the work per time
>>> step to be constant. The job is running on 4 compute nodes (32 cores
>>> total) and easily fits in memory. I don't see any anomalous behavior in
>>> the cluster diagnostics (CPU load, network traffic, etc.) that would be
>>> consistent with an increasing workload. I have forked off the git master
>>> branch to add my own seismic velocity model.
>>>
>>> Has this behavior been observed before?
>>>
>>> I can try turning off output to see if that isolates the problem. Does
>>> anyone have any other suggestions?
>>>
>>> Thanks,
>>> Brad Aagaard
>>
>


