[aspect-devel] Progress in writing the mantle convection code Aspect

Eric Heien emheien at ucdavis.edu
Mon Oct 14 11:13:58 PDT 2013


On Oct 14, 2013, at 6:59 AM, Wolfgang Bangerth wrote:

> On 10/03/2013 04:58 PM, Eric Heien wrote:
>> I'll take an initial stab at a file format for reading particle data (as
>> opposed to an initializing function).  This will also be useful for
>> particle checkpointing, otherwise we would have to store lots of particle
>> data in the checkpoint file.
> 
> I think it would actually be simplest if we just put it into the checkpoint file. It needs to be stored somewhere anyway, and the checkpoint file already has a well-defined file format that is easy to use. Why not use it?

My concern is that the checkpoint file could get too big, and also (I believe) that the checkpoint file is written serially.  If you have 1e9 particles on 1e4 processors, there will be at least 30GB of particle data written sequentially from many points in the system, which may become a bottleneck during checkpointing.  For now I'll just write into the checkpoint file and see how it does on large runs.
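
(Back-of-the-envelope, assuming an 8-byte ID plus three 8-byte
coordinates and no extra properties per particle:

    1e9 particles * 32 bytes/particle = 32e9 bytes, or roughly 30GB

so 30GB is about the floor; any scalar properties only add to it.)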

>> My plan is to support ASCII and HDF5 formats
>> in the same way as is currently done - columns of ID, X, Y, Z, scalar0,
>> scalar1, etc. in a single file.  For now I'll just allow up to 3 scalars;
>> more than that will generate an error.  Can you think of any potential
>> problems with this?
> 
> You could define the number of scalars associated with each particle within the file somewhere at the top (or with each particle, if that were easier). I have no problem with it if we just error out if there are more than 3 properties, but I don't think this should be hard-coded in the file format.

Right, it wouldn't be in the file format.  But since we have to instantiate the templated particle classes in the code, we need some compile-time limit.  I'll assume a maximum of 3 properties for now; it will be easy to extend this in the code later.
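
Roughly what I have in mind (a minimal sketch -- the names here are
made up, not the actual Aspect classes):

    #include <array>

    // Each particle carries a compile-time number of scalar properties;
    // the file reader would check the column count against this limit.
    template <int dim, unsigned int n_properties>
    struct Particle
    {
      unsigned long long               id;
      std::array<double, dim>          position;
      std::array<double, n_properties> properties;
    };

    // Explicit instantiations up to the current limit of 3 properties;
    // raising the limit later just means adding lines here.
    template struct Particle<3,0>;
    template struct Particle<3,1>;
    template struct Particle<3,2>;
    template struct Particle<3,3>;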

>> Also, this scheme (and the one I already implemented) is still not
>> completely random for multiple processors.  This is because each processor
>> will need to generate an exact number of particles within the local
>> subdomain, but this isn't a huge problem.
> 
> How do you do it right now?

If you have N particles in total and processor i covers a fraction V_i of the total domain volume, then processor i creates N*V_i particles.  This produces a reasonably random-looking distribution, but statistically it is far from random: under truly uniform sampling each subdomain's particle count would be binomially distributed, whereas here every subdomain gets a fixed count.
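
In code form, the per-processor count is fully deterministic (a
simplified sketch, not the actual implementation):

    #include <mpi.h>

    // Each processor sums the volumes of its locally owned cells into
    // local_volume, then takes its proportional share of the N global
    // particles -- so every subdomain gets a fixed count.
    unsigned long
    local_particle_count (const double local_volume,
                          const unsigned long n_global_particles,
                          MPI_Comm comm)
    {
      double global_volume = 0;
      MPI_Allreduce (&local_volume, &global_volume, 1,
                     MPI_DOUBLE, MPI_SUM, comm);
      return static_cast<unsigned long>
             (n_global_particles * (local_volume / global_volume));
    }

A truly random placement would instead give each subdomain a
binomially distributed count, which is what the single-processor idea
further down would restore.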

>> One thing I'm confused about is the "place a particle in cell" line - if
>> the distribution is relatively smooth over a cell, then approximating it
>> with a uniform distribution is fine.  But if it changes significantly, I
>> don't see how to place particles within a cell without requiring the
>> cumulative distribution function.  Any thoughts about this?
> 
> It's a valid point, in particular if you want to restrict particles to a subdomain (i.e., your probability function is zero/one). But I think that if your mesh is not fine enough to adequately represent this, then you're going to be out of luck on other levels as well.
> 
> So my idea was to just evaluate the probability density at the cell center and assume it to be constant within each cell. We could, if we wanted to, improve this by doing a quadrature over the cell, but let's not worry too much about this right now.

That sounds fine, as long as we make sure the user knows about this approximation.  Actually, the original idea of generating particles on a single processor and sending them to the other processors would fix all of these problems, including the random seed issue, and would also ensure the results are identical regardless of the number of processors used.  I might try coding this up soon to see how complex it is; with the Generator class it will be easy to keep this separate from the other implementations.
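
For reference, here is how I picture the cell-center scheme in 1D (a
minimal sketch with made-up names, not Aspect code): weight each cell
by density(center) * volume, pick a cell in proportion to its weight,
then place the particle uniformly inside it.

    #include <cstddef>
    #include <random>
    #include <vector>

    // The user-supplied density is evaluated once at each cell center
    // and treated as constant across the cell -- exactly the
    // approximation discussed above.
    std::vector<double>
    place_particles (const std::vector<double> &cell_left,   // left edge per cell
                     const std::vector<double> &cell_width,  // width per cell
                     double (*density)(double),
                     const unsigned long n_particles,
                     std::mt19937 &rng)
    {
      // Cell weights: density(center) * cell volume.
      std::vector<double> weight (cell_left.size());
      for (std::size_t c = 0; c < cell_left.size(); ++c)
        weight[c] = density (cell_left[c] + 0.5 * cell_width[c])
                    * cell_width[c];

      // Draw a cell index proportional to its weight, then place the
      // particle uniformly within that cell.
      std::discrete_distribution<std::size_t> pick_cell (weight.begin(),
                                                         weight.end());
      std::uniform_real_distribution<double> unit (0.0, 1.0);

      std::vector<double> x (n_particles);
      for (unsigned long p = 0; p < n_particles; ++p)
        {
          const std::size_t c = pick_cell (rng);
          x[p] = cell_left[c] + unit (rng) * cell_width[c];
        }
      return x;
    }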

Thanks,

-Eric


