[aspect-devel] [cse.ucdavis.edu #13562] Fwd: Fwd: error when writing checkpoint files

Timo Heister heister at clemson.edu
Sun Jul 8 07:44:52 PDT 2018


Magali,

can you try setting "Number of grouped files" to 1 and see whether this
works when you run with 2 or more cores?
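
For reference, here is a minimal sketch of that change. It reuses the
Visualization subsection from the .prm file quoted further down, and only
the grouped-files value differs. My reading of the parameter is an
assumption you should check against the manual: 0 writes one file per MPI
rank without MPI IO, while 1 writes a single combined file per output step
through MPI IO.

```prm
subsection Postprocess
  subsection Visualization
    set Output format           = vtu
    # Assumption: 1 (instead of 0) groups all ranks into one file via MPI IO
    set Number of grouped files = 1
  end
end
```

If this run fails in the same way as checkpointing does, that would point
towards MPI IO. If it succeeds, the cause is more likely elsewhere.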

Note that it was just a guess on my end that MPI IO is the problem. It
might be something else we are triggering inside p4est.

On Sun, Jul 8, 2018 at 4:19 AM Magali Billen <mibillen at ucdavis.edu> wrote:
>
> Hi everyone,
>
> My cluster is really old (GigE…), and is perhaps one of a dying breed of “individual PI” clusters.
> So that problem is not fixable until I write a grant to add nodes to a larger, newer cluster with a more modern set-up.
>
> Bill - is MPI-IO enabled on Peleton? This cluster, or one like it, is what I would be buying nodes to add to.
>
> Wolfgang - your responses help solve another mystery: why VTU output works but checkpointing does not.
> I started my PRM files from a file that John Naliboff gave me (he was using my cluster with a visiting student),
> and in it the parameter “Number of grouped files” is set to zero (see below). I had not dug into what that meant,
> but now it's clear.
>
> Maybe the only related question is whether it is possible to create a similar parameter for checkpointing?
> If not, I guess that's just really strong motivation for me to write an IFR proposal quickly ;-) (and a proposal for time
> on a national lab machine).
>
> -Magali
>
> # INFORMATION ON OUTPUT TO BE CREATED
> subsection Postprocess
>   set List of postprocessors = visualization, velocity statistics, temperature statistics
>
>   subsection Visualization
>     set List of output variables      = density, viscosity, strain rate
>     set Output format                 = vtu
>     set Time between graphical output = 0.10e6
>     set Number of grouped files       = 0
>   end
> end
>
>
> > On Jul 8, 2018, at 5:22 AM, Wolfgang Bangerth <bangerth at colostate.edu> wrote:
> >
> > On 07/07/2018 06:45 AM, Magali Billen wrote:
> >> I’ll be at CIDER the last two weeks of July and I’ll try to talk to Rene in person about this issue and try to understand
> >> more about what options might exist. Since this is all handled by other libraries (p4est), there may be no real option. I
> >> don’t feel like I have the expertise or experience with Aspect to wade into this on my own. Maybe after talking with Rene,
> >> we can see about trying to compile p4est with MPI-IO and see what happens.
> >
> > p4est can be configured to disable MPI-IO. So that's a problem that can be solved. But deal.II also uses MPI-IO, here:
> >
> > https://github.com/dealii/dealii/blob/master/source/base/data_out_base.cc#L7286
> >
> > This deal.II function is called from essentially all ASPECT runs:
> >
> > https://github.com/geodynamics/aspect/blob/master/source/postprocess/visualization.cc#L594
> >
> > The default for the number of grouped files is 16, and I suspect that most people leave it as is -- so basically everyone ends up in the `else` branch in line 598.
> >
> > In other words, while I don't know whether people use checkpoint/restart frequently, pretty much everyone I know uses VTU output, and that uses MPI-IO. I can't really reconcile this, but it seems to suggest that MPI-IO must work for most of our users.
> >
> > Best
> > Wolfgang
> >
> >
> > --
> > ------------------------------------------------------------------------
> > Wolfgang Bangerth          email:                 bangerth at colostate.edu
> >                           www: http://www.math.colostate.edu/~bangerth/
> >
>
> ____________________________________________________________
> Professor of Geophysics
> Earth & Planetary Sciences Dept., UC Davis
> Davis, CA 95616
> 2129 Earth & Physical Sciences Bldg.
> Office Phone: (530) 752-4169
> http://magalibillen.faculty.ucdavis.edu
>
> Currently on Sabbatical at Munich University (LMU)
> Department of Geophysics (PST + 9 hr)
>
> Avoid implicit bias - check before you submit:
> http://www.tomforth.co.uk/genderbias/
> ___________________________________________________________
>


-- 
Timo Heister
http://www.math.clemson.edu/~heister/

