[aspect-devel] Assemble temperature system time
Wolfgang Bangerth
bangerth at math.tamu.edu
Wed Nov 6 17:26:36 PST 2013
On 11/06/2013 06:48 PM, Eric Heien wrote:
> Hello all,
>
> I’m doing some medium sized 3D box runs on TACC Stampede now (1e6 DOF, 32 cores) and I’ve noticed the Assemble temperature system timing is very large.
>
> +---------------------------------------------+------------+------------+
> | Total wallclock time elapsed since start    |  3.15e+03s |            |
> |                                             |            |            |
> | Section                         | no. calls |  wall time | % of total |
> +---------------------------------+-----------+------------+------------+
> | Assemble Stokes system          |       201 |       315s |        10% |
> | Assemble temperature system     |       201 |  2.16e+03s |        69% |
> | Build Stokes preconditioner     |         1 |      26.6s |      0.85% |
> | Build temperature preconditioner|       201 |      21.7s |      0.69% |
> | Solve Stokes system             |       201 |        90s |       2.9% |
> | Solve temperature system        |       201 |      13.9s |      0.44% |
> | Initialization                  |         2 |      1.51s |     0.048% |
> | Postprocessing                  |       201 |       310s |       9.8% |
> | Setup dof systems               |         1 |      15.7s |       0.5% |
> +---------------------------------+-----------+------------+------------+
>
> Does anyone have a suggestion for why this might be? I know some of the developers recently worked on improving the performance of this, which is why it seems odd it would take so much time. Before I dig into the reasons with a profiler, I was hoping someone might know what’s wrong. This was compiled with Intel compiler version 13.0.079, and uses deal.II r31565 and Aspect r2009.
To me, this looks a lot like you're in debug mode. For reference, the
simulation from which I created this movie
http://www.youtube.com/watch?v=_bKqU_P4j48
used the attached version of the 3d box convection input file. There,
timing looked like this:
+---------------------------------------------+------------+------------+
| Total wallclock time elapsed since start    |  5.88e+05s |            |
|                                             |            |            |
| Section                         | no. calls |  wall time | % of total |
+---------------------------------+-----------+------------+------------+
| Assemble Stokes system          |     12634 |  1.44e+05s |        24% |
| Assemble temperature system     |     12634 |  1.09e+05s |        19% |
| Build Stokes preconditioner     |      1098 |  5.96e+04s |        10% |
| Build temperature preconditioner|     12634 |  1.85e+04s |       3.2% |
| Solve Stokes system             |     12634 |  1.72e+05s |        29% |
| Solve temperature system        |     12634 |  5.18e+04s |       8.8% |
| Create snapshot                 |       252 |       203s |     0.035% |
| Initialization                  |         5 |     0.258s |   4.4e-05% |
| Postprocessing                  |     12631 |  6.15e+03s |         1% |
| Refine mesh structure, part 1   |       846 |  5.28e+03s |       0.9% |
| Refine mesh structure, part 2   |       846 |       475s |     0.081% |
| Setup dof systems               |       847 |  1.03e+04s |       1.7% |
+---------------------------------+-----------+------------+------------+
I think these percentages are more realistic. This was on 64 processors.
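One way to compare the two timing tables above is to relate the temperature assembly time to the temperature solve time in each run. The sketch below is plain arithmetic on the numbers quoted in this thread. The dictionary and function names are made up for illustration, and the ratio is only a rough heuristic, not an official test for debug mode.

```python
# Wall times in seconds, copied from the two timing tables in this thread.
eric = {
    "total": 3.15e3,
    "assemble_temperature": 2.16e3,
    "solve_temperature": 13.9,
}
wolfgang = {
    "total": 5.88e5,
    "assemble_temperature": 1.09e5,
    "solve_temperature": 5.18e4,
}

def assembly_share(run):
    """Fraction of total wall time spent assembling the temperature system."""
    return run["assemble_temperature"] / run["total"]

def assembly_to_solve_ratio(run):
    """How many times longer temperature assembly takes than the solve."""
    return run["assemble_temperature"] / run["solve_temperature"]

for name, run in (("Eric", eric), ("Wolfgang", wolfgang)):
    print(f"{name}: assembly share = {assembly_share(run):.0%}, "
          f"assembly/solve = {assembly_to_solve_ratio(run):.1f}")
```

The two runs use different settings (the attached file asks for very tight solver tolerances, for example), so the ratios are not strictly comparable. Still, an assembly time more than a hundred times the solve time is the kind of imbalance that would be consistent with unoptimized assembly code.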
Best
Wolfgang
--
------------------------------------------------------------------------
Wolfgang Bangerth email: bangerth at math.tamu.edu
www: http://www.math.tamu.edu/~bangerth/
-------------- next part --------------
set Resume computation = false
set Timing output frequency = 10
# At the top, we define the number of space dimensions we would like to
# work in:
set Dimension = 3
# There are several global variables that have to do with what
# time system we want to work in and what the end time is. We
# also designate an output directory.
set Use years in output instead of seconds = false
set End time = 1.0
set Output directory = output-x
# Then there are variables that describe the tolerance of
# the linear solver as well as how the pressure should
# be normalized. Here, we choose a zero average pressure
# at the surface of the domain (for the current geometry, the
# surface is defined as the top boundary).
set Linear solver tolerance = 1e-15
set Temperature solver tolerance = 1e-15
set Pressure normalization = surface
set Surface pressure = 0
# Then come a number of sections that deal with the setup
# of the problem to solve. The first one deals with the
# geometry of the domain within which we want to solve.
# The sections that follow all have the same basic setup
# where we select the name of a particular model (here,
# the box geometry) and then, in a further subsection,
# set the parameters that are specific to this particular
# model.
subsection Geometry model
set Model name = box
subsection Box
set X extent = 1
set Y extent = 1
set Z extent = 1
end
end
# The next section deals with the initial conditions for the
# temperature (there are no initial conditions for the
# velocity variable since the velocity is assumed to always
# be in a static equilibrium with the temperature field).
# There are a number of models, with the 'function' model being
# a generic one that allows us to enter the actual initial
# conditions in the form of a formula that can contain
# constants. We choose a linear temperature profile that
# matches the boundary conditions defined below plus
# a small perturbation:
subsection Initial conditions
set Model name = function
subsection Function
set Variable names = x,y,z
set Function constants = p=0.01, L=1, pi=3.1415926536, k=1
set Function expression = (1.0-z) - p*cos(k*pi*x/L)*sin(pi*z)*y^3
end
end
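To see what the function expression above produces, it can be evaluated outside of ASPECT. The following is a minimal sketch in Python that copies the constants and the formula from the parameter file. The function name initial_temperature and the sample points are chosen here purely for illustration.

```python
import math

# Constants as given under "Function constants" in the parameter file.
p, L, pi, k = 0.01, 1.0, 3.1415926536, 1.0

def initial_temperature(x, y, z):
    """Evaluate (1.0-z) - p*cos(k*pi*x/L)*sin(pi*z)*y^3."""
    return (1.0 - z) - p * math.cos(k * pi * x / L) * math.sin(pi * z) * y**3

# The perturbation vanishes (up to rounding in pi) at the bottom and top,
# so the profile matches the boundary values 1 and 0 there.
print(initial_temperature(0.3, 0.7, 0.0))  # bottom boundary
print(initial_temperature(0.3, 0.7, 1.0))  # top boundary

# The perturbation has magnitude p at x=0, y=1, z=0.5.
print(initial_temperature(0.0, 1.0, 0.5))
```

The factor y^3 is what makes the perturbation genuinely three-dimensional: it is zero on the y=0 face and grows towards y=1.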
# Then follows a section that describes the boundary conditions
# for the temperature. The model we choose is called 'box' and
# allows us to set a constant temperature on each side
# of the box geometry. In our case, we choose something that is
# heated from below and cooled from above. (As will be seen
# in the next section, the actual temperature prescribed here
# at the left and right does not matter.)
subsection Boundary temperature model
set Model name = box
subsection Box
set Bottom temperature = 1
set Left temperature = 0
set Right temperature = 0
set Top temperature = 0
end
end
# We then also have to prescribe several other parts of the model
# such as which boundaries actually carry a prescribed boundary
# temperature (as described in the documentation of the `box'
# geometry, in 3d boundaries 4 and 5 are the bottom and top boundaries)
# whereas all other parts of the boundary are insulated (i.e.,
# no heat flux through these boundaries; this is also often used
# to specify symmetry boundaries).
subsection Model settings
set Fixed temperature boundary indicators = 4,5
# The next parameters then describe on which parts of the
# boundary we prescribe a zero or nonzero velocity and
# on which parts the flow is allowed to be tangential.
# Here, all six sides of the box allow tangential
# unrestricted flow but with a zero normal component:
set Zero velocity boundary indicators =
set Prescribed velocity boundary indicators =
set Tangential velocity boundary indicators = 0,1,2,3,4,5
# The final part of this section describes whether we
# want to include adiabatic heating (from a small
# compressibility of the medium) or from shear friction,
# as well as the rate of internal heating. We do not
# want to use any of these options here:
set Include adiabatic heating = false
set Include shear heating = false
set Radiogenic heating rate = 0
end
# The following two sections describe first the
# direction (vertical) and magnitude of gravity and the
# material model (i.e., density, viscosity, etc). We have
# discussed the settings used here in the introduction to
# this cookbook in the manual already.
subsection Gravity model
set Model name = vertical
subsection Vertical
set Magnitude = 1e16 # = Ra / Thermal expansion coefficient
end
end
subsection Material model
set Model name = simple # default:
subsection Simple model
set Reference density = 1
set Reference specific heat = 1
set Reference temperature = 0
set Thermal conductivity = 1
set Thermal expansion coefficient = 1e-10
set Viscosity = 1
end
end
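The comment on the gravity magnitude says it equals Ra divided by the thermal expansion coefficient. A short sketch of that arithmetic, using the standard definition of the Rayleigh number and the values set in this file, is given below. The variable names are illustrative, and the temperature difference and layer height are taken from the boundary temperatures (1 and 0) and the Z extent (1) above.

```python
import math

# Values from the Gravity model and Material model sections.
gravity = 1e16
density = 1.0
specific_heat = 1.0
thermal_conductivity = 1.0
thermal_expansion = 1e-10
viscosity = 1.0

# Values from the geometry and boundary temperature sections.
delta_T = 1.0 - 0.0   # bottom temperature minus top temperature
height = 1.0          # Z extent of the box

# Thermal diffusivity kappa = k / (rho * c_p).
kappa = thermal_conductivity / (density * specific_heat)

# Ra = rho * g * alpha * delta_T * h^3 / (eta * kappa)
rayleigh = (density * gravity * thermal_expansion * delta_T * height**3
            / (viscosity * kappa))

print(f"Ra = {rayleigh:.3g}")
```

With every other coefficient equal to one, the Rayleigh number is simply the product of the gravity magnitude and the thermal expansion coefficient, which is why the file sets the magnitude to Ra / Thermal expansion coefficient.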
# The settings above all pertain to the description of the
# continuous partial differential equations we want to solve.
# The following section deals with the discretization of
# this problem, namely the kind of mesh we want to compute
# on. We here start from a mesh with three global refinement
# steps, add three initial adaptive refinement steps, and then
# adapt the mesh every 15 time steps.
subsection Mesh refinement
set Initial global refinement = 3
set Initial adaptive refinement = 3
set Time steps between mesh refinement = 15
set Additional refinement times = 0.003
end
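To get a feeling for the mesh sizes these settings imply, the cell counts can be estimated by hand. The sketch below assumes that the box starts out as a single coarse cell and that adaptive refinement never exceeds the sum of the global and initial adaptive levels. Both are assumptions made here for illustration and should be checked against the ASPECT version in use.

```python
dim = 3
initial_global_refinement = 3
initial_adaptive_refinement = 3

def cells_uniform(level, dim):
    """Cells in a single-cell box after `level` uniform refinements.

    Each refinement splits every cell into 2^dim children.
    """
    return (2 ** dim) ** level

# Mesh after the global refinement steps only.
coarse = cells_uniform(initial_global_refinement, dim)

# Upper bound: every cell refined to the finest assumed level.
max_level = initial_global_refinement + initial_adaptive_refinement
upper_bound = cells_uniform(max_level, dim)

print(f"cells after global refinement: {coarse}")
print(f"upper bound at level {max_level}: {upper_bound}")
```

An adaptively refined mesh will normally sit well below the upper bound, since only part of the domain is refined to the finest level.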
# The final part is to specify what ASPECT should do with the
# solution once computed at the end of every time step. The
# process of evaluating the solution is called `postprocessing'
# and we choose to compute velocity and temperature statistics,
# statistics about the heat flux through the boundaries of the
# domain, and to generate graphical output files for later
# visualization. These output files are created every time
# a time step crosses time points separated by 0.0001. Given
# our start time (zero) and final time (1.0) this means that
# we will obtain up to about 10,000 output files.
subsection Postprocess
set List of postprocessors = velocity statistics, temperature statistics, heat flux statistics, visualization
subsection Visualization
set Time between graphical output = 0.0001
end
end
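The number of graphical output files follows from the end time and the output interval. A small sketch of that arithmetic with the values set in this file (End time = 1.0, Time between graphical output = 0.0001) is shown below. It assumes one file per crossed output time, so it gives an upper bound that is only reached if the time steps are shorter than the interval and the run reaches the end time.

```python
start_time = 0.0
end_time = 1.0
time_between_graphical_output = 0.0001

# Number of output intervals that fit between start and end time.
# round() guards against floating-point error in the division.
max_outputs = round((end_time - start_time) / time_between_graphical_output)

print(f"at most about {max_outputs} graphical output files")
```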
subsection Checkpointing
# The number of timesteps between performing checkpoints. If 0 and time
# between checkpoint is not specified, checkpointing will not be performed.
# Units: None.
set Steps between checkpoint = 50
end
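As a rough cross-check, the checkpoint interval can be compared with the timing table in the message above, which reports 12634 calls to the assembly and solve sections and 252 calls to "Create snapshot". The sketch assumes that a checkpoint is written purely by step count, once every 50 time steps, and that each assembly call corresponds to one time step.

```python
# Numbers taken from the timing table in the message above.
time_steps = 12634
reported_snapshots = 252

# Setting from the Checkpointing section of the parameter file.
steps_between_checkpoint = 50

# One checkpoint for every full block of 50 time steps.
expected_snapshots = time_steps // steps_between_checkpoint

print(f"expected {expected_snapshots}, reported {reported_snapshots}")
```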