[aspect-devel] Aspect hangs after several time steps

Timo Heister heister at clemson.edu
Tue Feb 23 05:32:57 PST 2016


Hey Lev,

running your test 135.prm I discovered a bug introduced over the last
few days, see https://github.com/geodynamics/aspect/pull/767

But with that fix, your test runs fine here. What version of ASPECT
are you using? What does "make test" report? Do other examples work
for you?


On Mon, Feb 22, 2016 at 5:59 PM, Lev Karatun <lev.karatun at gmail.com> wrote:
> Hi again,
>
> so I simplified the setup (prm file attached) to the point where I have a box
> with a single compositional field, tangential velocity b.c. for all 6 faces,
> constant temperature throughout the entire box - so I'm literally modelling
> a constant state of material with nothing happening. And the simulation
> fails with the error "The residual in the last step was nan". However, if I
> delete the compositional fields from the input file, the simulation runs
> fine.
> What could be causing this?
>
> Best regards,
> Lev Karatun.
>
> 2016-02-22 0:07 GMT-05:00 Lev Karatun <lev.karatun at gmail.com>:
>>
>> Hi Timo,
>>
>> I've simplified the model to the extreme, and still the solver doesn't
>> converge (The residual in the last step was nan). It's not even using any of
>> the plugins that I used, so I'm really confused. The cookbooks, however, run
>> without problems. (and my model in release mode runs successfully too)
>>
>> I'm trying to run the models on a lab computer, single core. I attached
>> the prm file that I'm using, could you have a look please?
>> Thanks in advance!
>>
>> Best regards,
>> Lev Karatun.
>>
>> 2016-02-12 9:10 GMT-05:00 Timo Heister <heister at clemson.edu>:
>>>
>>> Lev,
>>>
>>> your system setup (compiler, operating system, standard libraries)
>>> doesn't allow for the debugging trick to detect floating point
>>> exceptions. You could try a recent version of clang (or gcc if you
>>> were using clang) and see. Sorry, I don't know a sure way to get this
>>> enabled (I have no problems with it on Ubuntu 14.04 using gcc or
>>> clang).
>>>
>>> That said, convergence failures can have other causes that are not
>>> detectable using this technique. It is hard to tell, but you should
>>> probably simplify your test problem and, if it works, increase the
>>> complexity one step at a time. Do the included files like convection-box
>>> or shell 2d/3d work correctly on your machine (is it a cluster?)?
>>>
>>> Best,
>>> Timo
>>>
>>> On Fri, Feb 12, 2016 at 5:21 AM, Lev Karatun <lev.karatun at gmail.com> wrote:
>>> > Hi Wolfgang,
>>> >
>>> > I fixed the issues that Timo pointed out, but I'm still getting the
>>> > "did not converge" error. Is there a way for me to debug it the same
>>> > way Timo did? What am I missing? A compiler with C++11 support?
>>> > Something else?
>>> >
>>> > Best regards,
>>> > Lev Karatun.
>>> >
>>> > 2016-02-03 9:20 GMT-05:00 Wolfgang Bangerth <bangerth at tamu.edu>:
>>> >>
>>> >>
>>> >> Lev,
>>> >> this is complicated.
>>> >>
>>> >> You switched on floating point signals by hand, but your compiler support
>>> >> library has functions that trip this, so your program gets stopped in a
>>> >> place that is outside anyone's control (and where it doesn't actually
>>> >> matter much). This is why floating point exceptions were disabled
>>> >> automatically when you ran the ASPECT cmake script. In other words, to
>>> >> make progress, you need to disable floating point exceptions again.
>>> >>
>>> >> Of course, without being able to run FPE on your system, you won't be able
>>> >> to reproduce the issues Timo found on his system. But you don't have to
>>> >> find these issues -- he already did that for you. Just fix these places
>>> >> and you should be able to make progress.
>>> >>
>>> >> Best
>>> >>  W.
>>> >>
>>> >>
>>> >> On 02/01/2016 11:00 PM, Lev Karatun wrote:
>>> >>>
>>> >>> Hi Wolfgang,
>>> >>>
>>> >>> I've tried this, the last stack frame is:
>>> >>>
>>> >>>     #13 0x00000000012e50d5 in main (argc=2, argv=0x7fffffffe128) at
>>> >>>     /home/lev/aspect/aspect_debug_new/source/main.cc:513
>>> >>>
>>> >>>
>>> >>> Line 513 in main.cc is:
>>> >>>
>>> >>>     aspect::Simulator<3>::declare_parameters(prm);
>>> >>>
>>> >>>
>>> >>> It looks like it stops while trying to read the Start time line:
>>> >>>
>>> >>>     #10 0x00007ffff537a569 in dealii::ParameterHandler::declare_entry
>>> >>>         (this=0x7fffffffdc90, entry="Start time", default_value="0", pattern=...,
>>> >>>         documentation="The start time of the simulation. Units: Years if the 'Use
>>> >>>         years in output instead of seconds' parameter is set; seconds otherwise.")
>>> >>>         at /home/lev/distrib/dealii_debug_new/source/base/parameter_handler.cc:1628
>>> >>>
>>> >>>
>>> >>> but it's just "set Start time = 0", I never changed it.
>>> >>>
>>> >>> Full backtrace:
>>> >>>
>>> >>>     Program received signal SIGFPE, Arithmetic exception.
>>> >>>     __mpn_lshift () at ../sysdeps/x86_64/lshift.S:26
>>> >>>     26              movq    -8(%rsi,%rdx,8), %mm7
>>> >>>     (gdb) bt
>>> >>>     #0  __mpn_lshift () at ../sysdeps/x86_64/lshift.S:26
>>> >>>     #1  0x0000003db304a53e in ___printf_fp (fp=0x7fffffff71b0,
>>> >>>         info=0x7fffffff70b0, args=<value optimized out>) at printf_fp.c:483
>>> >>>     #2  0x0000003db30458a0 in _IO_vfprintf_internal (s=<value optimized out>,
>>> >>>         format=<value optimized out>, ap=<value optimized out>) at vfprintf.c:1640
>>> >>>     #3  0x0000003db306f752 in _IO_vsnprintf (string=0x7fffffff7420 "",
>>> >>>         maxlen=<value optimized out>, format=0x7fffffff74f0 "%.*g",
>>> >>>         args=0x7fffffff7310) at vsnprintf.c:120
>>> >>>     #4  0x0000003dbd87eb4f in std::__convert_from_v (__cloc=<value optimized out>,
>>> >>>         __out=0x7fffffff7420 "", __size=45, __fmt=0x7fffffff74f0 "%.*g")
>>> >>>         at /usr/src/debug/gcc-4.4.7-20120601/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/x86_64-redhat-linux/bits/c++locale.h:89
>>> >>>     #5  0x0000003dbd880f23 in std::num_put<char,
>>> >>>         std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_float<double>
>>> >>>         (this=0x3dbdaf22e0, __s=..., __io=..., __fill=32 ' ',
>>> >>>         __mod=<value optimized out>, __v=-1.7976931348623157e+308)
>>> >>>         at /usr/src/debug/gcc-4.4.7-20120601/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/locale_facets.tcc:980
>>> >>>     #6  0x0000003dbd881249 in std::num_put<char,
>>> >>>         std::ostreambuf_iterator<char, std::char_traits<char> > >::do_put
>>> >>>         (this=<value optimized out>, __s=..., __io=<value optimized out>,
>>> >>>         __fill=<value optimized out>, __v=<value optimized out>)
>>> >>>         at /usr/src/debug/gcc-4.4.7-20120601/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/locale_facets.tcc:1127
>>> >>>     #7  0x0000003dbd89487f in put (this=0x7fffffff7630,
>>> >>>         __v=-1.7976931348623157e+308)
>>> >>>         at /usr/src/debug/gcc-4.4.7-20120601/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/locale_facets.h:2390
>>> >>>     #8  std::basic_ostream<char, std::char_traits<char> >::_M_insert<double>
>>> >>>         (this=0x7fffffff7630, __v=-1.7976931348623157e+308)
>>> >>>         at /usr/src/debug/gcc-4.4.7-20120601/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/ostream.tcc:73
>>> >>>     #9  0x00007ffff5373468 in dealii::Patterns::Double::description
>>> >>>         (this=0x1f92620)
>>> >>>         at /home/lev/distrib/dealii_debug_new/source/base/parameter_handler.cc:290
>>> >>>     #10 0x00007ffff537a569 in dealii::ParameterHandler::declare_entry
>>> >>>         (this=0x7fffffffdc90, entry="Start time", default_value="0", pattern=...,
>>> >>>         documentation="The start time of the simulation. Units: Years if the 'Use
>>> >>>         years in output instead of seconds' parameter is set; seconds otherwise.")
>>> >>>         at /home/lev/distrib/dealii_debug_new/source/base/parameter_handler.cc:1628
>>> >>>     #11 0x000000000103fef2 in aspect::Parameters<3>::declare_parameters
>>> >>>         (prm=...)
>>> >>>         at /home/lev/aspect/aspect_debug_new/source/simulator/parameters.cc:88
>>> >>>     #12 0x000000000104e2e8 in aspect::Simulator<3>::declare_parameters
>>> >>>         (prm=...)
>>> >>>         at /home/lev/aspect/aspect_debug_new/source/simulator/parameters.cc:1283
>>> >>>     #13 0x00000000012e50d5 in main (argc=2, argv=0x7fffffffe128)
>>> >>>         at /home/lev/aspect/aspect_debug_new/source/main.cc:513
>>> >>>
>>> >>>
>>> >>>
>>> >>> Best regards,
>>> >>> Lev Karatun.
>>> >>>
>>> >>> 2016-02-01 21:34 GMT-05:00 Wolfgang Bangerth <bangerth at tamu.edu>:
>>> >>>
>>> >>>     On 02/01/2016 06:42 PM, Lev Karatun wrote:
>>> >>>
>>> >>>
>>> >>>         Could you please tell me what I'm doing wrong?..
>>> >>>         Thanks in advance.
>>> >>>
>>> >>>
>>> >>>     I don't think you *need* any of these packages. You may not get to see
>>> >>>     the exact location where the error happened in BLAS or some other
>>> >>>     low-level library, but ultimately you only want to know the place where
>>> >>>     you were in your own code. In other words, if the debugger stops at the
>>> >>>     location where the error happens, get a backtrace and go to the last
>>> >>>     stack frame in your code. Your task is then to find out what is wrong
>>> >>>     in that line.
>>> >>>
>>> >>>     Best
>>> >>>       W.
>>> >>>
>>> >>>     --
>>> >>>     ------------------------------------------------------------------------
>>> >>>     Wolfgang Bangerth               email: bangerth at math.tamu.edu
>>> >>>                                     www:   http://www.math.tamu.edu/~bangerth/
>>> >>>
>>> >>>     _______________________________________________
>>> >>>     Aspect-devel mailing list
>>> >>>     Aspect-devel at geodynamics.org
>>> >>>     http://lists.geodynamics.org/cgi-bin/mailman/listinfo/aspect-devel
>>> >>>
>>> >>>
>>> >>
>>> >>
>>> >> --
>>> >> ------------------------------------------------------------------------
>>> >> Wolfgang Bangerth               email: bangerth at math.tamu.edu
>>> >>                                 www:   http://www.math.tamu.edu/~bangerth/
>>> >
>>>
>>>
>>>
>>> --
>>> Timo Heister
>>> http://www.math.clemson.edu/~heister/
>>
>>
>



-- 
Timo Heister
http://www.math.clemson.edu/~heister/

