[aspect-devel] Fwd: error when writing checkpoint files

Magali Billen mibillen at ucdavis.edu
Thu Jul 5 14:15:46 PDT 2018


Hello Timo,

I can answer some of your questions and then I’ll need to ask our sys-admin some of the others.
1. All the graphical output works fine - I’ve had no problems.
2. Two files do get written to the output directory from the checkpointing: restart.mesh.info and restart.mesh  
3. The machine is my own cluster so I don’t think I have any quota or write access constraints (but I’ll check this since 
the operating system was just updated).  
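
As a rough way to check point 3, the sketch below creates, writes, syncs, reads back, and deletes a small scratch file in a given directory, and reports the name of the host it ran on. The function names check_dir_writable and report are hypothetical and not part of ASPECT. It only exercises ordinary file access with a tiny file, so it would not detect a nearly full quota, and it does not test MPI I/O; a pass here does not rule out the MPI error reported from p4est.

```python
import os
import socket
import tempfile


def check_dir_writable(directory, payload=b"checkpoint write test\n"):
    """Create, write, sync, read back, and delete a scratch file in `directory`.

    Returns (True, "") on success, or (False, reason) on failure.
    """
    try:
        fd, path = tempfile.mkstemp(prefix="write_test_", dir=directory)
    except OSError as err:
        return False, f"cannot create file: {err}"
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())  # force the data out to the filesystem
        with open(path, "rb") as handle:
            if handle.read() != payload:
                return False, "read-back mismatch"
    except OSError as err:
        return False, f"write or read failed: {err}"
    finally:
        try:
            os.remove(path)  # clean up the scratch file
        except OSError:
            pass
    return True, ""


def report(directory):
    """Return a one-line summary that names the host running the check."""
    ok, reason = check_dir_writable(directory)
    status = "OK" if ok else "FAILED (" + reason + ")"
    return f"{socket.gethostname()}: {directory}: {status}"
```

Printing report("output-checkpoint_test") from a script launched once per task through the scheduler (for example with srun, assuming Python 3 is available on the compute nodes) would show whether every node can write to the output directory.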

What is supposed to happen during checkpointing? Is each processor writing its own files to the output directory?
Is this handled differently than the visualization output, which for VTK output also writes individual files for each processor?

-Magali


> On Jul 5, 2018, at 10:08 PM, Timo Heister <heister at clemson.edu> wrote:
> 
> Magali,
> 
> sadly the error message is not terribly helpful. What kind of
> filesystem are you writing the output directory to? It might be that
> you don't have quota or write access from all nodes or no access to
> MPI I/O on that volume. Can you try using a dedicated parallel
> filesystem instead of your home directory (if this exists on that
> machine)? Does the graphical output until that point work correctly?
> On Thu, Jul 5, 2018 at 3:52 PM Magali Billen <mibillen at ucdavis.edu> wrote:
>> 
>> I created a fast, simpler test to make sure that this error does not have anything to do with my material model module.
>> And I still get the same error when trying to write a checkpoint file.
>> 
>> I’ve attached the parameter file and output file. This uses the simple material model in a 2d-box.
>> It runs for one time step and then tries to write the checkpointing files and aborts.  I’ve pasted the text from
>> the output file with errors below since it’s short.
>> 
>> -Magali
>> 
>> 
>>> more *.output
>> -----------------------------------------------------------------------------
>> -- This is ASPECT, the Advanced Solver for Problems in Earth's ConvecTion.
>> --     . version 2.1.0-pre (master, f5e6a41d5)
>> --     . using deal.II 9.0.0
>> --     . using Trilinos 12.10.1
>> --     . using p4est 2.0.0
>> --     . running in DEBUG mode
>> --     . running with 4 MPI processes
>> -- How to cite ASPECT: https://aspect.geodynamics.org/cite.html
>> -----------------------------------------------------------------------------
>> 
>> 
>> -----------------------------------------------------------------------------
>> The output directory <output-checkpoint_test/> provided in the input file appears not to exist.
>> ASPECT will create it for you.
>> -----------------------------------------------------------------------------
>> 
>> 
>> Number of active cells: 320 (on 4 levels)
>> Number of degrees of freedom: 4,500 (2,754+369+1,377)
>> 
>> *** Timestep 0:  t=0 years
>>   Solving temperature system... 0 iterations.
>>   Rebuilding Stokes preconditioner...
>>   Solving Stokes system... 13+0 iterations.
>> 
>> Number of active cells: 560 (on 5 levels)
>> Number of degrees of freedom: 8,034 (4,922+651+2,461)
>> 
>> *** Timestep 0:  t=0 years
>>   Solving temperature system... 0 iterations.
>>   Rebuilding Stokes preconditioner...
>>   Solving Stokes system... 15+0 iterations.
>> 
>> Number of active cells: 1,280 (on 6 levels)
>> Number of degrees of freedom: 18,215 (11,174+1,454+5,587)
>> 
>> *** Timestep 0:  t=0 years
>>   Solving temperature system... 0 iterations.
>>   Rebuilding Stokes preconditioner...
>>   Solving Stokes system... 14+0 iterations.
>> 
>> Number of active cells: 3,680 (on 7 levels)
>> Number of degrees of freedom: 51,050 (31,354+4,019+15,677)
>> 
>> *** Timestep 0:  t=0 years
>>   Solving temperature system... 0 iterations.
>>   Rebuilding Stokes preconditioner...
>>   Solving Stokes system... 15+0 iterations.
>> 
>> Number of active cells: 14,120 (on 8 levels)
>> Number of degrees of freedom: 190,061 (116,846+14,792+58,423)
>> 
>> *** Timestep 0:  t=0 years
>>   Solving temperature system... 0 iterations.
>>   Rebuilding Stokes preconditioner...
>>   Solving Stokes system... 16+0 iterations.
>> 
>>   Postprocessing:
>>     Writing graphical output: output-checkpoint_test/solution/solution-00000
>>     RMS, max velocity:        0.00254 m/year, 0.0136 m/year
>>     Temperature min/avg/max:  273 K, 1621 K, 1675 K
>> 
>> Number of active cells: 14,120 (on 8 levels)
>> Number of degrees of freedom: 190,061 (116,846+14,792+58,423)
>> 
>> Abort: MPI error
>> Abort: /share/apps/cig/dealii/dealii-9.0.0/install//tmp/unpack/p4est-2.0/src/p4est.c:3407
>> Abort
>> Abort: MPI error
>> Abort: /share/apps/cig/dealii/dealii-9.0.0/install//tmp/unpack/p4est-2.0/src/p4est.c:3407
>> Abort
>> Abort: MPI error
>> Abort: /share/apps/cig/dealii/dealii-9.0.0/install//tmp/unpack/p4est-2.0/src/p4est.c:3407
>> Abort
>> Abort: MPI error
>> Abort: /share/apps/cig/dealii/dealii-9.0.0/install//tmp/unpack/p4est-2.0/src/p4est.c:3407
>> Abort
>> SIGABRT received
>> SIGABRT received
>> SIGABRT received
>> [c12-11:01560] *** Process received signal ***
>> [c12-11:01560] Signal: Aborted (6)
>> [c12-11:01560] Signal code:  (-6)
>> [c12-11:01560] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x12890)[0x7f0930bdf890]
>> [c12-11:01560] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0xc7)[0x7f092fed6e97]
>> [c12-11:01560] [ 2] /lib/x86_64-linux-gnu/libc.so.6(abort+0x141)[0x7f092fed8801]
>> [c12-11:01560] [ 3] /share/apps/cig/dealii/dealii-9.0.0/install/p4est-2.0/DEBUG/lib/libsc-2.0.so(sc_abort_verbose+0x0)[0x7f092f925390]
>> [c12-11:01560] [ 4] /share/apps/cig/dealii/dealii-9.0.0/install/p4est-2.0/DEBUG/lib/libsc-2.0.so(sc_abort+0xd)[0x7f092f92517b]
>> [c12-11:01560] [ 5] /share/apps/cig/dealii/dealii-9.0.0/install/p4est-2.0/DEBUG/lib/libsc-2.0.so(sc_abort_verbosef+0x0)[0x7f092f925424]
>> [c12-11:01560] [ 6] /share/apps/cig/dealii/dealii-9.0.0/install/p4est-2.0/DEBUG/lib/libp4est-2.0.so(p4est_save_ext+0x688)[0x7f092fb7f2c2]
>> [c12-11:01560] [ 7] /share/apps/cig/dealii/dealii-9.0.0/install/p4est-2.0/DEBUG/lib/libp4est-2.0.so(p4est_save+0x2b)[0x7f092fb7ec37]
>> [c12-11:01560] [ 8] /share/apps/cig/dealii/dealii-9.0.0/install/deal.II-v9.0.0/lib/libdeal_II.g.so.9.0.0(_ZNK6dealii8parallel11distributed13TriangulationILi2ELi2EE4saveEPKc+0x35a)[0x7f093ab7ae74]
>> [c12-11:01560] [ 9] /home/billen/AspectProjects/aspect/build/aspect(_ZN6aspect9SimulatorILi2EE15create_snapshotEv+0x51e)[0x5567b6c96a9e]
>> [c12-11:01560] [10] /home/billen/AspectProjects/aspect/build/aspect(_ZN6aspect9SimulatorILi2EE22maybe_write_checkpointElSt4pairIbbE+0x75)[0x5567b6d04165]
>> [c12-11:01560] [11] /home/billen/AspectProjects/aspect/build/aspect(_ZN6aspect9SimulatorILi2EE3runEv+0x10c1)[0x5567b6ccca7b]
>> [c12-11:01560] [12] /home/billen/AspectProjects/aspect/build/aspect(_Z13run_simulatorILi2EEvRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEbb+0x117)[0x5567b68a4709]
>> [c12-11:01560] [13] /home/billen/AspectProjects/aspect/build/aspect(main+0x4df)[0x5567b6883b21]
>> [c12-11:01560] [14] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x7f092feb9b97]
>> [c12-11:01560] [15] /home/billen/AspectProjects/aspect/build/aspect(_start+0x2a)[0x5567b66ef6ea]
>> [c12-11:01560] *** End of error message ***
>> srun: error: c12-11: task 0: Aborted
>> srun: error: c12-11: tasks 1-3: Exited with exit code 1
>> 
>> Begin forwarded message:
>> 
>> From: Magali Billen <mibillen at ucdavis.edu>
>> Subject: error when writing checkpoint files
>> Date: July 5, 2018 at 5:40:37 PM GMT+2
>> To: aspect-devel at geodynamics.org
>> 
>> hello,
>> I have encountered an error when turning checkpointing on. The program aborts with errors related to Triangulation and p4est :-(
>> 
>> I have attached the output log and the parameter file.  I’ve pasted a snippet of the error below, which appears exactly at the timestep
>> when the checkpoint should be written. Only two checkpoint-related files get started in the output directory: restart.mesh and restart.mesh.info.
>> 
>> This uses my own module for the material model (I modified a version of visco_plastic.cc to read in and use some parameters; I didn’t
>> change anything related to output, at least not on purpose). The module works in all other ways in terms of normal visualization output.
>> And, based on the error messages, I don’t think this could be due to something in the module.
>> 
>> - Magali
>> 
>> 
>> SIGABRT received
>> SIGABRT received
>> [c12-16:24102] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x12890)[0x7f55406e4890]
>> [c12-16:24102] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0xc7)[0x7f553f9dbe97]
>> [c12-16:24102] [ 2] /lib/x86_64-linux-gnu/libc.so.6(abort+0x141)[0x7f553f9dd801]
>> [c12-16:24102] [ 3] /share/apps/cig/dealii/dealii-9.0.0/install/p4est-2.0/DEBUG/lib/libsc-2.0.so(sc_abort_verbose+0x0)[0x7f553f42a390]
>> [c12-16:24102] [ 4] /share/apps/cig/dealii/dealii-9.0.0/install/p4est-2.0/DEBUG/lib/libsc-2.0.so(sc_abort+0xd)[0x7f553f42a17b]
>> [c12-16:24102] [ 5] /share/apps/cig/dealii/dealii-9.0.0/install/p4est-2.0/DEBUG/lib/libsc-2.0.so(sc_abort_verbosef+0x0)[0x7f553f42a424]
>> [c12-16:24102] [ 6] /share/apps/cig/dealii/dealii-9.0.0/install/p4est-2.0/DEBUG/lib/libp4est-2.0.so(p4est_save_ext+0x688)[0x7f553f6842c2]
>> [c12-16:24102] [ 7] /share/apps/cig/dealii/dealii-9.0.0/install/p4est-2.0/DEBUG/lib/libp4est-2.0.so(p4est_save+0x2b)[0x7f553f683c37]
>> [c12-16:24102] [ 8] /share/apps/cig/dealii/dealii-9.0.0/install/deal.II-v9.0.0/lib/libdeal_II.g.so.9.0.0(_ZNK6dealii8parallel11distributed13TriangulationILi2ELi2EE4saveEPKc+0x35a)[0x7f554a67fe74]
>> [c12-16:24102] [ 9] /home/billen/AspectProjects/aspect/build/aspect(_ZN6aspect9SimulatorILi2EE15create_snapshotEv+0x51e)[0x5653d3c04a9e]
>> [c12-16:24102] [10] /home/billen/AspectProjects/aspect/build/aspect(_ZN6aspect9SimulatorILi2EE22maybe_write_checkpointElSt4pairIbbE+0x75)[0x5653d3c72165]
>> [c12-16:24102] [11] /home/billen/AspectProjects/aspect/build/aspect(_ZN6aspect9SimulatorILi2EE3runEv+0x10c1)[0x5653d3c3aa7b]
>> [c12-16:24102] [12] /home/billen/AspectProjects/aspect/build/aspect(_Z13run_simulatorILi2EEvRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEbb+0x117)[0x5653d3812709]
>> [c12-16:24102] [13] /home/billen/AspectProjects/aspect/build/aspect(main+0x4df)[0x5653d37f1b21]
>> [c12-16:24102] [14] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x7f553f9beb97]
>> [c12-16:24102] [15] /home/billen/AspectProjects/aspect/build/aspect(_start+0x2a)[0x5653d365d6ea]
>> [c12-16:24102] *** End of error message ***
>> 
>> ____________________________________________________________
>> Professor of Geophysics
>> Earth & Planetary Sciences Dept., UC Davis
>> Davis, CA 95616
>> 2129 Earth & Physical Sciences Bldg.
>> Office Phone: (530) 752-4169
>> http://magalibillen.faculty.ucdavis.edu
>> 
>> Currently on Sabbatical at Munich University (LMU)
>> Department of Geophysics (PST + 9 hr)
>> 
>> Avoid implicit bias - check before you submit:
>> http://www.tomforth.co.uk/genderbias/
>> ___________________________________________________________
>> 
>> 
>> 
>> _______________________________________________
>> Aspect-devel mailing list
>> Aspect-devel at geodynamics.org
>> http://lists.geodynamics.org/cgi-bin/mailman/listinfo/aspect-devel
> 
> -- 
> Timo Heister
> http://www.math.clemson.edu/~heister/
____________________________________________________________
Professor of Geophysics 
Earth & Planetary Sciences Dept., UC Davis
Davis, CA 95616
2129 Earth & Physical Sciences Bldg.
Office Phone: (530) 752-4169
http://magalibillen.faculty.ucdavis.edu

Currently on Sabbatical at Munich University (LMU)
Department of Geophysics (PST + 9 hr)

Avoid implicit bias - check before you submit: 
http://www.tomforth.co.uk/genderbias/
___________________________________________________________
