"Bill Broadley \"Roundup Issue Tracker\"" <issue_tracker@geodynamics.org> wrote:
>
> Bill Broadley <bill@cse.ucdavis.edu> added the comment:
>
> Walter Landry "Roundup Issue Tracker" wrote:
> > Walter Landry <walter@geodynamics.org> added the comment:
> >
> > Please uncomment the four lines near the end
> >
> > <!-- <param name="journal.info">True</param> -->
> > <!-- <param name="journal.debug">True</param> -->
> > <!-- <param name="journal-level.info">2</param> -->
> > <!-- <param name="journal-level.debug">2</param> -->
> >
> > That will let me know exactly where it is getting stuck. Also, does it work if
> > you only run with one processor?
> >
>
> I kind of expected some output from a fast machine (2.2 GHz Opteron) within
> 24 hours or so.
>
> After 275 hours on a 4 CPU 2.2 GHz opteron I got:
> mpirun -np 4 /share/apps/gale-1.2.2/bin/Gale `pwd`/input/benchmarks/extension.xml
> TimeStep = 1, Start time = 0 + 0 prev timeStep dt
> TimeStep = 1, Start time = 0 + 0 prev timeStep dt
> TimeStep = 1, Start time = 0 + 0 prev timeStep dt
> TimeStep = 1, Start time = 0 + 0 prev timeStep dt
> 3: In func SystemLinearEquations_NonLinearExecute: Failed to converge after 500 iterations.
> 0: In func SystemLinearEquations_NonLinearExecute: Failed to converge after 500 iterations.
> 2: In func SystemLinearEquations_NonLinearExecute: Failed to converge after 500 iterations.
> 1: In func SystemLinearEquations_NonLinearExecute: Failed to converge after 500 iterations.
> TimeStep = 2, Start time = 0 + 110.965 prev timeStep dt
> TimeStep = 2, Start time = 0 + 110.965 prev timeStep dt
> TimeStep = 2, Start time = 0 + 110.965 prev timeStep dt
> TimeStep = 2, Start time = 0 + 110.965 prev timeStep dt
> TimeStep = 3, Start time = 110.965 + 109.049 prev timeStep dt
> TimeStep = 3, Start time = 110.965 + 109.049 prev timeStep dt
> TimeStep = 3, Start time = 110.965 + 109.049 prev timeStep dt
> TimeStep = 3, Start time = 110.965 + 109.049 prev timeStep dt
>
> Top reports:
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 23215 bill 25 0 91880 23m 13m R 101 0.3 14235:05 Gale
> 23216 bill 25 0 92272 24m 13m R 101 0.3 14235:56 Gale
> 23218 bill 25 0 91928 23m 13m R 101 0.3 14224:13 Gale
> 23214 bill 25 0 92168 24m 13m R 99 0.3 14234:13 Gale
>
> Is that kind of performance expected?

Iterative solvers do not work for this problem. You should use a
direct solver. For a serial run, just add

  -pc_type lu -ksp_type preonly

to the command line. For a parallel run, you need to compile PETSc
with MUMPS and add

  -mat_type aijmumps -ksp_type preonly -pc_type lu

to the command line. You will get MUCH faster results that way.
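
As a rough sketch, the two invocations could be assembled as below. The
binary and input paths are copied from your run, and the solver options
are the ones given above. Placing the options after the input file is an
assumption on my part, and the script only prints the commands (a dry
run) so you can check them before launching anything.

```shell
#!/bin/sh
# Paths copied from the run quoted above; adjust for your installation.
GALE=/share/apps/gale-1.2.2/bin/Gale
INPUT="$(pwd)/input/benchmarks/extension.xml"

# Direct-solver options exactly as given in this message.
SERIAL_OPTS="-pc_type lu -ksp_type preonly"
PARALLEL_OPTS="-mat_type aijmumps -ksp_type preonly -pc_type lu"

# Assumption: the solver options are appended after the input file.
SERIAL_CMD="$GALE $INPUT $SERIAL_OPTS"
PARALLEL_CMD="mpirun -np 4 $GALE $INPUT $PARALLEL_OPTS"

# Dry run: print the commands instead of executing them.
echo "$SERIAL_CMD"
echo "$PARALLEL_CMD"
```

The parallel line will only work once PETSc has been rebuilt with MUMPS.
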
> Is the failed to converge message expected?

With an iterative solver, yes. Try running it with a direct solver
and let me know if you still get the same problem. It should complete
quickly (a few minutes), and, with the debugging output, you will always
see progress.

Cheers,
Walter Landry
walter@geodynamics.org