[CIG-SHORT] Pylith dies after thousands of time steps (convergence issue)

Tabrez Ali stali at purdue.edu
Thu Apr 30 06:11:09 PDT 2009


Brad

The solution at the last working step does converge and looks okay, but
then nothing happens and it dies. I am, however, experimenting with
time_step and will also try to use the debugger.

Btw, do you know if I can use --petsc.on_error_attach_debugger when the
job is submitted via PBS, or should I just run it interactively?

...
...
87 KSP Residual norm 3.579491816101e-07
88 KSP Residual norm 3.241876854223e-07
89 KSP Residual norm 2.836307394788e-07

[cli_0]: aborting job:
Fatal error in MPI_Wait: Error message texts are not available
[cli_1]: aborting job:
Fatal error in MPI_Wait: Error message texts are not available
[cli_3]: aborting job:
Fatal error in MPI_Wait: Error message texts are not available
[cli_2]: aborting job:
Fatal error in MPI_Wait: Error message texts are not available
mpiexec: Warning: tasks 0-3 exited with status 1.
--pyre-start: mpiexec: exit 1
/usr/rmt_share/scratch96/s/stali/pylith/bin/pylith: /usr/rmt_share/scratch96/s/stali/pylith/bin/nemesis: exit 1

Tabrez
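
A rough way to check a long ksp_monitor log for a growing residual is
sketched below. It assumes the log lines have the form shown in the
output above ("<iteration> KSP Residual norm <value>") and that the
iteration counter restarts with each new solve. The function names and
the growth_factor threshold are hypothetical and chosen only for
illustration.

```python
import re

# Matches monitor lines such as "87 KSP Residual norm 3.579491816101e-07".
KSP_LINE = re.compile(r"^\s*(\d+)\s+KSP Residual norm\s+([0-9.eE+-]+)\s*$")


def parse_ksp_residuals(log_text):
    """Return a list of (iteration, residual_norm) pairs found in the log."""
    pairs = []
    for line in log_text.splitlines():
        match = KSP_LINE.match(line)
        if match:
            pairs.append((int(match.group(1)), float(match.group(2))))
    return pairs


def split_solves(pairs):
    """Group pairs into separate solves; a new solve starts whenever the
    iteration number does not increase (assumed restart of the counter)."""
    solves = []
    for iteration, norm in pairs:
        if not solves or iteration <= solves[-1][-1][0]:
            solves.append([])
        solves[-1].append((iteration, norm))
    return solves


def residual_is_growing(pairs, growth_factor=10.0):
    """True if any residual exceeds the smallest earlier residual of the
    same solve by growth_factor (an arbitrary, hypothetical threshold)."""
    smallest = float("inf")
    for _, norm in pairs:
        if norm > growth_factor * smallest:
            return True
        smallest = min(smallest, norm)
    return False


sample = """\
87 KSP Residual norm 3.579491816101e-07
88 KSP Residual norm 3.241876854223e-07
89 KSP Residual norm 2.836307394788e-07
"""
for solve in split_solves(parse_ksp_residuals(sample)):
    print(solve[0][0], solve[-1][0], residual_is_growing(solve))
```

In practice one would read the whole job log into a string and look
for the first solve that gets flagged.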

On Apr 29, 2009, at 4:26 PM, Brad Aagaard wrote:

> Tabrez-
>
> You may want to set ksp_monitor=true so that you can see the residual.
> If the residual increases significantly, the solution is losing
> convergence. This can be alleviated a bit by using an absolute
> convergence tolerance (ksp_atol). You probably need a slightly smaller
> time step or slightly higher quality mesh (improve the aspect ratio of
> the most distorted cells).
>
> Brad
>
>
> On Wednesday 29 April 2009 1:13:21 pm Tabrez Ali wrote:
>> Brad
>>
>> I think you were right. The elastic problem worked out fine. I will
>> now try to play with the time step (for the viscous runs).
>>
>> Tabrez
>>
>> On Apr 29, 2009, at 1:19 PM, Brad Aagaard wrote:
>>> On Wednesday 29 April 2009 10:09:26 am Tabrez Ali wrote:
>>>> Also I don't see the error until ~9000 time steps with one set of
>>>> material properties but get the error at around the 4000th time
>>>> step with a different set of material properties (on the same mesh).
>>>
>>> This seems to indicate a time-integration stability issue. Does the
>>> one that has an error after 4000 time steps have a smaller Maxwell
>>> time? You might try running with purely elastic properties. If that
>>> works, then you may need to reduce your time step.
>
>
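
The link between Maxwell time and time step raised in the quoted
thread can be illustrated with a small calculation. This is only a
sketch: it uses the standard definition of the Maxwell relaxation time
(viscosity divided by shear modulus), and the material values and the
fraction of 0.2 are illustrative assumptions, not numbers from this
thread or a stability limit stated in it.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def maxwell_time(viscosity, shear_modulus):
    """Maxwell relaxation time in seconds: viscosity (Pa*s) / shear modulus (Pa)."""
    if viscosity <= 0 or shear_modulus <= 0:
        raise ValueError("viscosity and shear modulus must be positive")
    return viscosity / shear_modulus


def suggested_time_step(materials, fraction=0.2):
    """Time step as a fraction of the shortest Maxwell time among materials.

    materials is a list of (viscosity, shear_modulus) pairs.
    fraction is a hypothetical safety factor, not a documented limit.
    """
    shortest = min(maxwell_time(eta, mu) for eta, mu in materials)
    return fraction * shortest


# Illustrative (made-up) material sets: same shear modulus, different viscosity.
materials = [(1.0e18, 3.0e10), (1.0e19, 3.0e10)]
dt = suggested_time_step(materials)
print("suggested time step: %.3f years" % (dt / SECONDS_PER_YEAR))
```

The material with the lower viscosity has the shorter Maxwell time, so
it controls the suggested step. This is consistent with a run failing
earlier when the material properties give a smaller Maxwell time.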


