[aspect-devel] Aspect hangs after several time steps

Timo Heister heister at clemson.edu
Wed Jan 27 00:36:19 PST 2016


Lev,

there are various issues with your setup. Please run in debug mode and
fix the problems. In particular, you are doing divisions by zero in
many places. Make sure you can run the same setup on a normal desktop
before thinking about running on a parallel cluster. I am listing the
first couple of problems I encountered:

1.
Additional Information:
Invalid character in field weak zone. Names of compositional fields
should consist of a combination of letters, numbers and underscores.

2.
Program received signal SIGFPE, Arithmetic exception.
0x00007fffd5e8de75 in aspect::MaterialModel::nz<3>::evaluate
(this=0x15746e0, in=..., out=...) at
/home/heister/Downloads/aspect-hang/timo2/nz.cc:77
77            double strainrate_E2 =
sqrt(0.5*(in.strain_rate[i][0][0]*in.strain_rate[i][0][0]+in.strain_rate[i][1][1]*in.strain_rate[i][1][1]+in.strain_rate[i][2][2]*in.strain_rate[i][2][2])
+ in.strain_rate[i][0][1]*in.strain_rate[i][0][1] +
in.strain_rate[i][0][2]*in.strain_rate[i][0][2] +
in.strain_rate[i][1][2]*in.strain_rate[i][1][2]);
(gdb) p in.strain_rate
$1 = std::vector of length 0, capacity 1

3.
Program received signal SIGFPE, Arithmetic exception.
0x00007fffd5e8e2a8 in aspect::MaterialModel::nz<3>::evaluate
(this=0x15746e0, in=..., out=...) at
/home/heister/Downloads/aspect-hang/timo2/nz.cc:101
101
exp((activation_energies[j]+activation_volumes[j]*in.pressure[i])/(nvs[j]*R*in.temperature[i]));
(gdb) p nvs
$1 = std::vector of length 4, capacity 4 = {1.5, 1.5, 3, 3}
(gdb) p j
$2 = 0
(gdb) p in.temperature
$6 = std::vector of length 1, capacity 1 = {0}


4.
Program received signal SIGFPE, Arithmetic exception.
0x00007fffd5e8e11f in aspect::MaterialModel::nz<3>::evaluate
(this=0x15746e0, in=..., out=...) at
/home/heister/Downloads/aspect-hang/timo2/nz.cc:87
87                double viscosity_MC =
(1/(1/(sigma_y/(2*strainrate_E2)+eta_min)+1/eta_max));
(gdb) p strainrate_E2
$1 = 0

5.
Program received signal SIGFPE, Arithmetic exception.
0x0000000000a62363 in
aspect::Simulator<3>::local_assemble_advection_system
(this=0x7fffffffb520, advection_field=..., viscosity_per_cell=...,
cell=..., scratch=..., data=...) at
/ssd/aspect-local/source/simulator/assembly.cc:2016
2016            data.local_rhs(i)
(gdb) p reaction_term
$2 = nan(0x4000000000000)
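
For problems 2 and 4, a guard along the following lines might be one way out. The gdb output above shows in.strain_rate with length 0, so nz.cc:77 reads past the end of the vector, and in problem 4 strainrate_E2 is 0, so sigma_y/(2*strainrate_E2) divides by zero. This is only a sketch under assumptions: Tensor3 is a hypothetical stand-in for the tensor type used in nz.cc, min_strain_rate is an assumed floor whose value you would have to choose for your model, and the function names are made up for illustration.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the symmetric strain-rate tensor used in nz.cc.
using Tensor3 = std::array<std::array<double, 3>, 3>;

// Assumed lower bound for the strain-rate invariant [1/s]; not from nz.cc.
constexpr double min_strain_rate = 1e-20;

// Same expression as nz.cc:77, written for a single tensor.
double strain_rate_invariant(const Tensor3 &e)
{
  return std::sqrt(0.5 * (e[0][0] * e[0][0] + e[1][1] * e[1][1] + e[2][2] * e[2][2])
                   + e[0][1] * e[0][1] + e[0][2] * e[0][2] + e[1][2] * e[1][2]);
}

// Returns the invariant at point i, bounded below by min_strain_rate.
// If no strain rate is available for point i (empty or too-short vector),
// the floor is returned instead of reading out of bounds.
double safe_strain_rate_invariant(const std::vector<Tensor3> &strain_rates,
                                  const std::size_t i)
{
  if (i >= strain_rates.size())
    return min_strain_rate;
  return std::max(strain_rate_invariant(strain_rates[i]), min_strain_rate);
}

// Same expression as nz.cc:87, with the strain rate floored before dividing.
double viscosity_MC_guarded(const double sigma_y, const double strainrate_E2,
                            const double eta_min, const double eta_max)
{
  const double edot = std::max(strainrate_E2, min_strain_rate);
  return 1.0 / (1.0 / (sigma_y / (2.0 * edot) + eta_min) + 1.0 / eta_max);
}
```

With a zero strain rate the guarded viscosity stays finite and is bounded above by eta_max, which is what the harmonic average in line 87 appears to be meant to achieve.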

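For problem 3, in.temperature is 0 in the gdb output, so the exponent in nz.cc:101 divides by zero. A check like the one below would at least turn the SIGFPE into a readable error. Again a sketch under assumptions: the function name is hypothetical, the value of R is assumed (nz.cc only shows the symbol), and inside ASPECT you would presumably use deal.II's Assert or AssertThrow rather than standard exceptions.

```cpp
#include <cmath>
#include <stdexcept>
#include <string>

// Gas constant in J/(mol K); assumed value, nz.cc only shows the symbol R.
constexpr double R = 8.314;

// Exponential factor from nz.cc:101 for one point and one phase, with the
// inputs checked before dividing and the result checked afterwards.
double arrhenius_factor(const double activation_energy,
                        const double activation_volume,
                        const double pressure,
                        const double nv,
                        const double temperature)
{
  // Written as !(x > 0) so that NaN inputs are rejected as well.
  if (!(temperature > 0.0))
    throw std::domain_error("temperature must be positive, got "
                            + std::to_string(temperature));
  if (!(nv > 0.0))
    throw std::domain_error("stress exponent must be positive, got "
                            + std::to_string(nv));

  const double factor =
    std::exp((activation_energy + activation_volume * pressure)
             / (nv * R * temperature));

  // Catch overflow to infinity before the value propagates further.
  if (!std::isfinite(factor))
    throw std::overflow_error("non-finite Arrhenius factor");
  return factor;
}
```

The same std::isfinite test is a cheap way to catch a non-finite value such as the reaction_term in problem 5 before it reaches the assembly.
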
On Tue, Jan 26, 2016 at 11:04 PM, Lev Karatun <lev.karatun at gmail.com> wrote:
> Hi,
>
> the simulation that was run on 1 core resumed after a while (which never
> happened before - sorry about the confusion), and produced an error on time
> step 103 (attached). The same simulation on 8 cores is still hanging. The
> simulation on 8 cores without free surface ran without problems.
>
> Best regards,
> Lev Karatun.
>
> 2016-01-26 14:04 GMT-05:00 Lev Karatun <lev.karatun at gmail.com>:
>>
>> Hi Timo,
>>
>> I have found a good setup that reproduces the problem (after fixing the
>> error that I had, that is). On both 8 cores and 1 core the simulation stops
>> on time step 59. I attached the necessary files.
>>
>> Best regards,
>> Lev Karatun.
>>
>> 2016-01-26 5:15 GMT-05:00 Lev Karatun <lev.karatun at gmail.com>:
>>>
>>> Hi Timo,
>>>
>>> I composed a lengthy email with answers to your questions, but then I
>>> realized I have a mistake in boundary conditions, so now I need a bit more
>>> time to explore if the fixed ones work correctly.
>>>
>>> Thank you for your help.
>>>
>>> Best regards,
>>> Lev Karatun.
>>>
>>> 2016-01-25 7:09 GMT-05:00 Timo Heister <timo.heister at gmail.com>:
>>>>
>>>> Lev,
>>>>
>>>> I'm not sure what could cause a hang like you observe. Your first goal
>>>> should be to make this problem reproducible as quickly as possible
>>>> with the smallest number of processors.
>>>>
>>>> > Disabling free surface (changing the top boundary to free-slip) makes
>>>> > the simulation hang after 1 time step
>>>>
>>>> This is great. How many cores do you need to have this happen? Can you
>>>> reduce it further? Does this happen with any of the included
>>>> cookbooks? If not, please share your input file and all details
>>>> (number of cores, when it hangs, etc.).
>>>>
>>>>
>>>> On Mon, Jan 25, 2016 at 12:01 PM, Lev Karatun <lev.karatun at gmail.com>
>>>> wrote:
>>>> > Hi,
>>>> >
>>>> > (I already reported the problem last year
>>>> > (http://lists.geodynamics.org/pipermail/aspect-devel/2015-June/000919.html),
>>>> > but at that moment it wasn't as severe. Now I have moved to
>>>> > higher-resolution models and it has gotten much worse.)
>>>> >
>>>> > The problem is that Aspect hangs after a (seemingly) random timestep
>>>> > without producing any error. I ran several tests in an attempt to
>>>> > narrow down the problem, and this is what I found:
>>>> > - Neither assigning a value to $TMP/$TMPDIR nor adding a "Number of
>>>> >   grouped files" parameter helps
>>>> > - Disabling the visualization postprocessor doesn't help
>>>> > - Running Aspect in development mode produced the same result: it
>>>> >   hangs without any error messages
>>>> > - Restarting the hanging simulation from a checkpoint helps partially:
>>>> >   it goes past the timestep at which it hung, but then hangs again
>>>> >   later
>>>> > - Disabling the free surface (changing the top boundary to free-slip)
>>>> >   makes the simulation hang after 1 time step
>>>> > - Changing the CFL number affects how long the simulation runs before
>>>> >   hanging: at 0.05 it hangs at timestep 13, at 0.1 at timestep 56, and
>>>> >   at 0.2 it ran without hanging up to time step 153 (with
>>>> >   visualization disabled, though)
>>>> >
>>>> > Could you please help me solve this problem? It's really hurting the
>>>> > progress of my research.
>>>> > Thanks in advance!
>>>> >
>>>> > Best regards,
>>>> > Lev Karatun.
>>>> >
>>>> > _______________________________________________
>>>> > Aspect-devel mailing list
>>>> > Aspect-devel at geodynamics.org
>>>> > http://lists.geodynamics.org/cgi-bin/mailman/listinfo/aspect-devel



-- 
Timo Heister
http://www.math.clemson.edu/~heister/

