[cig-commits] commit: Started working on tpv13 figures.

Mercurial hg at geodynamics.org
Wed Apr 11 18:25:38 PDT 2012


changeset:   100:bc90ffdf91b7
tag:         tip
user:        Brad Aagaard <baagaard at usgs.gov>
date:        Wed Apr 11 18:24:55 2012 -0700
files:       faultRup.tex figs/tpv13_geometry.tex
description:
Started working on tpv13 figures.


diff -r bb3fc212dd01 -r bc90ffdf91b7 faultRup.tex
--- a/faultRup.tex	Tue Apr 10 17:52:37 2012 -0700
+++ b/faultRup.tex	Wed Apr 11 18:24:55 2012 -0700
@@ -5,7 +5,6 @@
 \usepackage{array}
 \usepackage{rotating}
 \usepackage[centertags]{amsmath}
-
 
 % :SUBMIT: if draft, comment out this line
 \usepackage{graphics}
@@ -440,10 +439,10 @@ Thus, the Jacobian of the entire system 
     \end{array} \right).
 \end{equation}
 Note that the terms in $\mathbf{N_{n^+}}$ and $\mathbf{N_{n^-}}$ are
-identical, but they refer to degrees of freedom on the positive and
+identical, but they refer to degrees of freedom (DOF) on the positive and
 negative sides of the fault, respectively. Consequently, in practice
 we compute the terms for the positive side of the fault and assemble
-the terms into the appropriate degrees of freedom for both sides of
+the terms into the appropriate DOF for both sides of
 the fault. Hence, we compute
 \begin{equation}
   \label{eqn:jacobian:constraint:code}
@@ -459,10 +458,10 @@ with the Jacobian of the entire system t
       \mathbf{0} & \mathbf{L}_p & -\mathbf{L}_p & \mathbf{0} 
     \end{array} \right),
 \end{equation}
-where $n$ denotes degrees of freedom not associated with the fault,
-$n^-$ denotes degrees of freedom associated with the negative side of
-the fault, $n^+$ denotes degrees of freedom associated with the
-positive side of the fault, and $p$ denotes degrees of freedom
+where $n$ denotes DOF not associated with the fault,
+$n^-$ denotes DOF associated with the negative side of
+the fault, $n^+$ denotes DOF associated with the
+positive side of the fault, and $p$ denotes DOF
 associated with the Lagrange multipliers.
 
 % ------------------------------------------------------------------
@@ -612,7 +611,7 @@ current estimate of slip. We then comput
 current estimate of slip. We then compute the increment in fault slip
 corresponding to this perturbation in the Lagrange multipliers
 assuming deformation is limited to vertices on the fault. That is, we
-consider only the degrees of freedom associated with the fault
+consider only the DOF associated with the fault
 interface when computing how a perturbation in the Lagrange
 multipliers corresponds to a change in fault slip. In terms of the
 general form of a linear system of equations ($\mathbf{A} \mathbf{u} =
@@ -636,7 +635,7 @@ equation~(\ref{eqn:saddle:point:code}) h
   \mathbf{b}_p \\
   \end{pmatrix},
 \end{equation}
-where $n^+$ and $n^-$ refer to the degrees of freedom associated with
+where $n^+$ and $n^-$ refer to the DOF associated with
 the positive and negative sides of the fault,
 respectively. Furthermore, we can ignore the terms $\mathbf{b}_{n^+}$
 and $\mathbf{b}_{n^-}$ because they remain constant as we change the
@@ -661,7 +660,7 @@ required to match the fault constitutive
 required to match the fault constitutive model will be
 poor. Similarly, in rare cases in which the fault slip extends across
 the entire domain, deformation extends much farther from the fault and
-the estimate derived using only the fault degrees of freedom will be
+the estimate derived using only the fault DOF will be
 poor. In order to make this iterative procedure more robust so that it
 works well across a wide variety of fault constitutive models, we add
 a small enhancement to the iterative procedure.
@@ -728,7 +727,7 @@ equation~(\ref{eqn:fault:disp}), we must
 equation~(\ref{eqn:fault:disp}), we must represent the displacement on
 both sides of the fault for any vertex on the fault. One option is to
 designate ``fault vertices'' which possess twice as many displacement
-degrees of freedom \citep{Aagaard:etal:BSSA:2001}. However, this
+DOF \citep{Aagaard:etal:BSSA:2001}. However, this
 requires storing the global variable indices by cell rather than by
 vertex or adding special fault metadata to the vertices, significantly
 increasing storage costs and/or index lookup costs. 
@@ -932,7 +931,7 @@ elasticity, we have found AMG preconditi
 elasticity, we have found AMG preconditioners provide substantially
 faster convergence than the Additive Schwarz method. We combine the
 field split preconditioner with the AMG preconditioner, such that we
-precondition the degrees of freedom for each global coordinate axis
+precondition the DOF for each global coordinate axis
 independently. Table~\ref{tab:iterates} shows the number of iterates
 required to solve a problem with prescribed slip on three faults
 \brad{Setup solver test for weak scaling computations or use
@@ -964,7 +963,7 @@ eliminating the off-diagonal terms so th
 eliminating the off-diagonal terms so that the Jacobian is diagonal
 permits use of a much faster solver. With a diagonal Jacobian the
 number of operations required for the solve is proportional to the
-number of degrees of freedom, and the memory requirements are greatly
+number of DOF, and the memory requirements are greatly
 reduced by storing the diagonal of the matrix as a vector rather
 than as a sparse matrix. However, the block structure of our Jacobian
 matrix, with the fault slip constraints occupying off-diagonal blocks,
@@ -1061,7 +1060,7 @@ residual is
 \end{equation}
 Solving the first row of equation~(\ref{eqn:lumped:jacobian:residual})
 for the increment in the solution and accounting for the structure of
-$\mathbf{L}$ as we write the expressions for degrees of freedom on
+$\mathbf{L}$ as we write the expressions for DOF on
 each side of the fault, we have
 \begin{gather}
   \mathbf{du}_{n^+} = 
@@ -1103,7 +1102,7 @@ is zero,
 \end{equation}
 Because $\mathbf{K}$ and $\mathbf{L}$ are comprised of diagonal
 blocks, this expression for the updates to the solution are local to
-the degrees of freedom attached to the fault and the Lagrange
+the DOF attached to the fault and the Lagrange
 multipliers.
 
 % Spontaneous rupture model and lumped Jacobian
@@ -1132,46 +1131,107 @@ We compare the relative performance of t
 We compare the relative performance of the various preconditioners
 discussed in section~\ref{sec:solver:quasistatic} for quasi-static
 problems using a static simulation with three vertical, strike-slip
-faults. Using multiple, intersecting faults provides a more complex
-test of the preconditioner compared with a single fault due to the
-loose coupling in tractions (Lagrange multipliers) among the faults.
+faults. Using multiple, intersecting faults involves multiple saddle
+points, so it provides a more thorough test of the preconditioner
+compared to a single fault with a single saddle point.
 Figure~\ref{fig:solvertest:geometry} shows the geometry of the faults
-embedded in the domain and Table~\ref{tab:solvertest:parameters} gives the
-parameters used in the simulation. We apply Dirichlet boundary conditions on two
-lateral sides with 2.0 m of shearing motion and no motion
-perpendicular to the boundary. We also apply a Dirichlet boundary
-condition to the bottom of the domain to prevent vertical motion. We
-prescribe uniform slip on the three faults with zero slip along the
-buried edges. 
+embedded in the domain and Table~\ref{tab:solvertest:parameters} gives
+the parameters used in the simulation. We apply Dirichlet boundary
+conditions on two lateral sides with 2.0 m of shearing motion and no
+motion perpendicular to the boundary. We also apply a Dirichlet
+boundary condition to the bottom of the domain to prevent vertical
+motion. We prescribe uniform slip on the three faults with zero slip
+along the buried edges.
 
-
-
-\begin{itemize}
-\item Description (geometry, BC, fault slip)
-\item Parameters (physical properties)
-\item Meshing (cell types)
-\end{itemize}
+We generate both hexahedral and tetrahedral meshes using CUBIT
+\citep{cubit} and construct the meshes so that the problem size (number
+of DOF) for the two different cell types (hexahedra and tetrahedra) is
+nearly the same. The suite of simulations examines
+problem sizes increasing by about a factor of two from $1.78\times
+10^5$ DOF to $2.14\times 10^7$ DOF. The
+corresponding discretization sizes range from 1520 m to 392 m for the
+hexahedral meshes and 1744 m to 456 m for the tetrahedral meshes.
+Figure~\ref{fig:solvertest:mesh} shows the 1744 m resolution
+tetrahedral mesh. As we will see in
+Section~\ref{sec:verification:quasi-static}, the hexahedral mesh for a
+given resolution is more accurate, so the errors in the solution for
+each pair of meshes are significantly larger for the tetrahedral mesh.
 
 \subsection{Preconditioner Performance}
 
-\begin{itemize}
-\item Simulation parameters (discretization, nprocs, pc)
-\item Iterations, explanation
-\end{itemize}
+We characterize preconditioner performance in terms of the number of
+iterations required for the residual to reach a convergence tolerance
+and the sensitivity of the number of iterations to the problem
+size. An ideal preconditioner would yield a small, constant number of
+iterations independent of problem size. However, for complex problems
+such as elasticity with fault slip and potentially nonuniform physical
+properties, ideal preconditioners may not exist. Hence, we seek a
+preconditioner that provides a minimal increase in the number of
+iterations as the problem size increases, so that we can efficiently
+simulate quasi-static crustal deformation related to faulting and
+post-seismic and interseismic deformation.
+
+For this benchmark of preconditioner performance, we examine the
+number of iterations required for convergence using the PETSc Additive
+Schwarz (ASM), field split (with and without our custom
+preconditioner), and Schur complement preconditioners discussed in
+section~\ref{sec:solver:quasistatic}. We characterize the dependence
+on problem size using serial simulations (we examine parallel scaling
+for the best preconditioner in the next section) and the three
+lowest-resolution hexahedral and tetrahedral meshes in our suite, with
+the results summarized in
+Table~\ref{tab:solvertest:preconditioner:iterates}.
+
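+As a rough sketch of how these preconditioner choices map onto solver
+settings (illustrative only; the option names below are standard PETSc
+options accessed through petsc4py, not our exact configuration files,
+and the custom fault block preconditioner is attached separately, not
+shown here):
+\begin{verbatim}
+# Sketch only: generic PETSc options for the preconditioners compared here.
+from petsc4py import PETSc
+
+opts = PETSc.Options()
+opts["ksp_type"] = "gmres"      # Krylov method used in all cases
+opts["ksp_rtol"] = 1.0e-8       # relative convergence tolerance
+# Additive Schwarz (ASM) baseline:
+#   opts["pc_type"] = "asm"
+# Field split with algebraic multigrid on the displacement blocks:
+opts["pc_type"] = "fieldsplit"
+opts["pc_fieldsplit_type"] = "multiplicative"   # or "additive"
+opts["fieldsplit_0_pc_type"] = "ml"             # ML algebraic multigrid
+\end{verbatim}
+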
+The family of field split preconditioners using algebraic multigrid
+methods minimizes the increase in the number of iterations with
+problem size. For these preconditioners the number of iterations
+increases by only about 20\% for a fourfold increase in the number of
+DOF, compared to 60\% for the ASM preconditioner. Within the family of
+field split preconditioners, the one with multiplicative composition
+minimizes the number of iterations. Adding the custom preconditioner
+for the fault block (Lagrange multipliers) greatly accelerates the
+convergence, yielding a further 80\% reduction in the number of
+iterations required for convergence.
+
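+The percentages quoted above follow directly from
+Table~\ref{tab:solvertest:preconditioner:iterates}; as a minimal check
+(a sketch in Python using the Hex8 iteration counts from the table):
+\begin{verbatim}
+# Iteration counts for the Hex8 meshes at problem sizes S1 and S4 (a
+# factor of four increase in DOF), taken from the preconditioner table.
+iterates = {
+    "ASM":                      (184, 298),
+    "FieldSplit (add)":         (304, 370),
+    "FieldSplit (mult)":        (245, 293),
+    "FieldSplit (mult,custom)": (51, 62),
+}
+for name, (s1, s4) in iterates.items():
+    print(name, "growth: %.0f%%" % (100.0 * (s4 - s1) / s1))
+# Reduction from adding the custom fault block preconditioner (S1 column):
+print("custom reduction: %.0f%%"
+      % (100.0 * (1.0 - iterates["FieldSplit (mult,custom)"][0]
+                  / iterates["FieldSplit (mult)"][0])))
+\end{verbatim}
+This yields roughly 62\% growth for ASM, 20\% for the multiplicative
+field split, and a 79\% reduction from the custom fault block
+preconditioner, consistent with the values quoted above.
+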
+\matt{Add additional comments, explanation of performance}
 
 \subsection{Parallel Scaling Performance}
 
-\begin{itemize}
-\item Setup of scaling (discretization, nprocs)
-\item Description of scaling (solver steps, no output)
-\item discussion
-\end{itemize}
+We evaluate the parallel performance via a weak scaling
+criterion. That is, we run simulations on various numbers of
+processors/cores, increasing the problem size in proportion to the
+number of processors so that the workload (number of DOF) per core
+remains the same. Ideally, the time for the various stages of the
+simulation should be independent of the number of processors/cores.
+For this performance benchmark we use the entire suite of hexahedral
+and tetrahedral meshes described earlier, which range in size from
+$1.78\times 10^5$ DOF to $2.14\times 10^7$ DOF. In each of these
+simulations, we employ the field split algebraic multigrid
+preconditioner with multiplicative composition and the custom fault
+block preconditioner. We ran the simulations on a Beowulf cluster
+composed of 24 compute nodes connected by QDR InfiniBand, where each
+compute node consists of two quad-core Intel Xeon E5620 processors
+with 24 GB of RAM. Simulations using eight or fewer cores ran on a
+single compute node. Thus, in addition to algorithm bottlenecks,
+runtime performance is potentially impeded by core/memory affinity,
+memory bandwidth, and communication among compute nodes.
+
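+As a minimal sketch of the weak scaling setup (the per-core workload
+and runtimes below are illustrative placeholders, not measured
+values):
+\begin{verbatim}
+# Weak scaling: the problem size grows with the core count so that the
+# workload (DOF) per core stays fixed; ideally the runtime stays constant.
+dof_per_core = 1.78e5                    # assumed: smallest mesh on one core
+cores = [1, 2, 4, 8, 16, 32, 64, 128]
+problem_sizes = [p * dof_per_core for p in cores]   # ~1.8e5 to ~2.3e7 DOF
+
+def weak_scaling_efficiency(runtimes):
+    """Efficiency relative to the single-core run: T(1) / T(p)."""
+    return [runtimes[0] / t for t in runtimes]
+
+runtimes = [10.2, 10.3, 10.4, 10.6, 10.9, 11.3, 11.9, 12.8]  # hypothetical
+for p, eff in zip(cores, weak_scaling_efficiency(runtimes)):
+    print("%4d cores: efficiency %.2f" % (p, eff))
+\end{verbatim}
+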
+\brad{Update this after tuning solver}%
+Figure~\ref{fig:solvertest:scaling} illustrates the excellent parallel
+performance for the finite-element assembly routines (reforming the
+Jacobian sparse matrix and computing the residual) with somewhat poor
+parallel performance for setting up the preconditioner and performing
+the solve.  The finite-element assembly routines achieve weak scaling
+with negligible effects from the cluster architecture. The solver, on
+the other hand, shows a significant increase in runtime \ldots
+\matt{Can we get better solver scaling?}%
 
 % ------------------------------------------------------------------
 \section{Code Verification Benchmarks}
 \label{sec:verification:benchmarks}
 
 \subsection{Quasi-static}
+\label{sec:verification:quasi-static}
 
 \begin{itemize}
 \item Savage and Prescott
@@ -1187,12 +1247,13 @@ buried edges.
 \end{itemize}
 
 \subsection{Dynamic}
+\label{sec:verification:dynamic}
 
 \begin{itemize}
 \item Spontaneous rupture benchmark: TPV12, TPV13
   \begin{itemize}
   \item dipping fault, depth dependent stresses, super-shear rupture,
-    Drucker-Prage elastoplastic model
+    Drucker-Prager elastoplastic model
   \item Not ideal due to discontinuities in spatial variation of parameters
   \item 2-D, compare quad4 and tri3
   \item 3-D, only tet4 (complex geometry)
@@ -1308,14 +1369,6 @@ MGK acknowledges partial support from NS
 
 
 \begin{figure}
-  ADD diagram of solvertest problem
-  %\noindent\includegraphics{figs/solvertest_geometry}
-  \caption{Diagram of geometry and boundary conditions for the
-    performance benchmark.}
-  \label{fig:solvertest:geometry}
-\end{figure}
-
-\begin{figure}
   \noindent\includegraphics{figs/solvertest_geometry}
   \caption{Geometry of problem used in quasi-static performance
     benchmark. Dirichlet boundary conditions prescribe a horizontal
@@ -1326,6 +1379,16 @@ MGK acknowledges partial support from NS
     motion on the two other faults. The faults extend down to a depth
     of 12.0 km.}
   \label{fig:solvertest:geometry}
+\end{figure}
+
+\begin{figure}
+  \noindent\includegraphics[width=84mm]{figs/solvertest_mesh}
+  \caption{Tetrahedral finite-element mesh with a uniform
+    discretization size of 1744 m for the performance benchmark. The
+    colors correspond to the volumes in the CUBIT geometry that are
+    separated by the fault surfaces and the boundary between the upper
+    crust and lower crust.}
+  \label{fig:solvertest:mesh}
 \end{figure}
 
 \begin{figure}
@@ -1428,25 +1491,41 @@ MGK acknowledges partial support from NS
   (RL) slip on the middle fault and left-lateral (LL) slip on the end faults.}
 \end{table}
 
+
 \begin{table}
-\caption{\brad{REDO}Performance of Krylov Solvers on a problem with 3 faults. Problem 1 (P1) has 25000 unknowns and 645
-  constraints, whereas the larger Problem 2 (P2) has 191,000 unknowns and 2241 constraints. The relative
-  solver tolerance is $10^{-8}$, and the preconditioners used were Additive Schwarz Method (ASM), FieldSplit (FS),
-  Incomplete Cholesky (ICC), algebraic multigrid (ML), Schur complement, least-square commutator (LSC), and a custom
-  preconditioner for the fault problem.}
-\label{tab:iterates}
+\caption{Preconditioner Performance\tablenotemark{a}}
+\label{tab:solvertest:preconditioner:iterates}
 \centering
-\begin{tabular}{llrr}
-KSP   & PC              & P1 Iterates & P2 Iterates \\ % Use refine.cfg for P2
-\hline
-GMRES & ASM/Shifted ICC &          77 &      $>500$ \\ % asm.cfg
-GMRES & FS/ML           &      $>500$ &      $>500$ \\ % fieldsplit.cfg
-GMRES & FS/Schur/ML     &          62 &         135 \\ % schur.cfg
-GMRES & FS/LSC/ML       &          53 &         108 \\ % lsc.cfg
-GMRES & FS/ML/Custom    &          37 &          49 \\ % fieldsplit.cfg custompc.cfg
+\begin{tabular}{lcrrr}
+  \hline
+  Preconditioner & Cell & \multicolumn{3}{c}{Problem Size} \\
+     &      & S1 & S2 & S4 \\
+  \hline
+  ASM
+    & Tet4 & 239 & 287 & 434 \\
+    & Hex8 & 184 & 236 & 298 \\
+  FieldSplit (add)
+    & Tet4 & 416 & 482 & 499 \\
+    & Hex8 & 304 & 329 & 370 \\
+  FieldSplit (mult)
+    & Tet4 & 341 & 390 & 396 \\
+    & Hex8 & 245 & 261 & 293 \\
+  FieldSplit (mult,custom)
+    & Tet4 & 62 & 69 & 77 \\
+    & Hex8 & 51 & 57 & 62 \\
+  \hline
 \end{tabular}
+\tablenotetext{a}{Number of iterations for Additive Schwarz (ASM),
+  field split (additive, multiplicative, and multiplicative with
+  custom fault block preconditioner), and Schur complement preconditioners for
+  tetrahedral and hexahedral discretizations and three problem sizes
+  (S1 with $1.8\times 10^5$ DOF, S2 with $3.5\times
+  10^5$ DOF, and S4 with $6.9\times 10^5$ DOF). The field split
+  preconditioner with multiplicative composition and the custom fault
+  block preconditioner yields good performance, with only a fraction
+  of the iterates required by the other preconditioners and a small
+  increase with problem size.}
 \end{table}
-
 
 
 % ------------------------------------------------------------------
diff -r bb3fc212dd01 -r bc90ffdf91b7 figs/tpv13_geometry.tex
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/figs/tpv13_geometry.tex	Wed Apr 11 18:24:55 2012 -0700
@@ -0,0 +1,113 @@
+\documentclass{article}
+\usepackage[none]{hyphenat}
+\usepackage{helvet}
+\renewcommand{\familydefault}{phv}
+
+\makeatletter
+\renewcommand{\large}{\@setfontsize\large{10}{9}}
+\renewcommand{\normalsize}{\@setfontsize\normalsize{8}{9.5}}
+\makeatother
+
+\usepackage{tikz}
+
+\begin{document}
+\pagestyle{empty}
+
+\input{figcolors}
+
+\usetikzlibrary{shapes,calc,3d}%
+
+\colorlet{colfault}{ltred}
+\colorlet{colpoint}{blue}
+\colorlet{colbox}{ltslate}
+
+\tikzstyle{fault}=[fill=colfault,line width=0.75pt,opacity=0.7]
+\tikzstyle{hypocenter}=[line width=0.5pt]
+
+
+
+\tikzstyle{point}=[color=colpoint,line width=2.0pt]
+
+\tikzstyle{label}=[color=black,font=\bfseries\large,sloped]
+
+\tikzstyle{boxstyle}=[fill=colbox,opacity=0.4,draw=black,line join=round,shade]
+\tikzstyle{boxshade}=[top color=mdslate,bottom color=ltslate]
+
+\tikzstyle{slice}=[color=black,dashed,line width=1.0pt]
+
+\tikzstyle{dimension}=[color=black,line width=0.75pt,sloped]
+
+\begin{tikzpicture}[>=latex,line width=0.75pt,
+  x  = {(-0.9659cm,+0.25882cm)},
+  y  = {(-0.5cm,-0.5cm)},
+  z  = {(0cm,1cm)}]
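+% Scale: 1 unit = 10 km (domain spans 48 km x 64 km x 36 km).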
+
+% DOMAIN
+% bottom 
+\begin{scope}[canvas is yx plane at z=-3.6]
+  \path[boxstyle,boxshade] (-3.2,-2.775) rectangle (+3.2,+2.025);
+\end{scope}
+% back
+\begin{scope}[canvas is zx plane at y=-3.2]
+  \path[boxstyle,boxshade] (-3.6,-2.775) rectangle (0.0,+2.025);
+\end{scope}
+% left 
+\begin{scope}[canvas is zy plane at x=+2.025]
+  \path[boxstyle,boxshade] (-3.6,-3.2) rectangle (0.0,+3.2);
+\end{scope}
+
+% SLICE
+\draw[slice] (+2.025,0.0,0.0) -- (+2.025,0.0,-3.6) -- (-2.775,0.0,-3.6);
+
+
+% FAULT
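+% 30 km (along strike) x 15 km (down dip) plane dipping 60 degrees.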
+\draw[fault]
+ (+0.0,+1.5,0.0) --
+ (-0.75,+1.5,-1.299) --
+ (-0.75,-1.5,-1.299) --
+ (+0.0,-1.5,0.0) --
+ cycle ;
+% hypocenter
+\draw[hypocenter]
+ (-0.675,+0.15,-1.1691) --
+ (-0.675,-0.15,-1.1691) --
+ (-0.525,-0.15,-0.9093) --
+ (-0.525,+0.15,-0.9093) --
+ cycle ;
+
+% FAULT SLICE
+%\draw[slice] (0.0,0.0,0.0) -- (-0.75,0.0,-1.299);
+
+% POINTS
+% ADD STUFF HERE
+
+% Fault dimensions
+\draw[dimension,<->] (-0.85,-1.5,-1.399) -- (-0.85,+1.5,-1.399) node[dimension,midway,below] {30 km};
+\draw[dimension,<->] (-0.75,-1.7,-1.299) -- (0.0,-1.7,0) node[dimension,midway,above] {15 km};
+
+% DOMAIN
+% right
+\begin{scope}[canvas is zy plane at x=-2.775]
+  \path[boxstyle,boxshade] (-3.6,-3.2) rectangle (0.0,+3.2);
+\end{scope}
+% front
+\begin{scope}[canvas is zx plane at y=+3.2]
+  \path[boxstyle,boxshade] (-3.6,-2.775) rectangle (0.0,+2.025);
+\end{scope}
+% top
+\begin{scope}[canvas is yx plane at z=+0.0]
+  \path[boxstyle,boxshade] (-3.2,-2.775) rectangle (+3.2,+2.025);
+\end{scope}
+
+% SLICE
+\draw[slice] (-2.775,0.0,-3.6) -- (-2.775,0.0,0.0) -- (+2.025,0.0,0.0);
+
+% Dimensions
+\draw[dimension,<->] (-2.775,-3.2,-3.8) -- (-2.775,+3.2,-3.8) node[dimension,midway,below] {64 km};
+\draw[dimension,<->] (+2.025,+3.2,-3.8) -- (-2.775,+3.2,-3.8) node[dimension,midway,below] {48 km};
+\draw[dimension,<->] (+2.225,+3.2,-3.6) -- (+2.225,+3.2,+0.0) node[dimension,midway,above] {36 km};
+
+
+
+\end{tikzpicture}
+\end{document}


