[cig-commits] r4671 - mc/3D/CitcomS/trunk/doc/manual
tan2 at geodynamics.org
Fri Sep 29 17:56:33 PDT 2006
Author: tan2
Date: 2006-09-29 17:56:33 -0700 (Fri, 29 Sep 2006)
New Revision: 4671
Modified:
mc/3D/CitcomS/trunk/doc/manual/citcoms.lyx
Log:
Updated Chapter 3 "Running CitComS.py" for the .cfg input format, also
explained how to launch a parallel job. Some TODO items left.
Modified: mc/3D/CitcomS/trunk/doc/manual/citcoms.lyx
===================================================================
--- mc/3D/CitcomS/trunk/doc/manual/citcoms.lyx 2006-09-30 00:40:29 UTC (rev 4670)
+++ mc/3D/CitcomS/trunk/doc/manual/citcoms.lyx 2006-09-30 00:56:33 UTC (rev 4671)
@@ -1,4 +1,4 @@
-#LyX 1.4.1 created this file. For more info see http://www.lyx.org/
+#LyX 1.4.2 created this file. For more info see http://www.lyx.org/
\lyxformat 245
\begin_document
\begin_header
@@ -86,7 +86,7 @@
\begin_layout Author
California Institute of Technology
\newline
-Version 2.0.2
+Version 2.1.0
\end_layout
\begin_layout Date
@@ -192,7 +192,7 @@
\begin_layout Standard
The manual was written for the usage of CitComS.py on a variety of different
platforms.
- CitComS has run on shared memory computers (Sun, Hewlett-Packard, SGI,
+ CitComS.py has run on shared memory computers (Sun, Hewlett-Packard, SGI,
and IBM), commercial distributed memory machines (Intel and Cray/SGI),
and Beowulf implementations (including Beowulfs at the California Institute
of Technology's Center for Advanced Computing Research and Seismological
@@ -512,6 +512,7 @@
In the summer of 2005, as part of the 2.0.1 release, the CIG replaced the
old build procedure with the GNU Build System.
A subsequent release, version 2.0.2, could compile and run on 64-bit systems.
+ (TODO: history of v2.1)
\end_layout
\begin_layout Section
@@ -582,8 +583,8 @@
\begin_layout Standard
Pyre provides a simulation framework that includes solver integration and
coupling, uniform access to facilities, and integrated visualization.
- The framework offers a way to add new solvers to CitComS and to fine-tune
- CitComS simulations.
+ The framework offers a way to add new solvers to CitComS.py and to fine-tune
+ CitComS.py simulations.
Future versions of this documentation will cover coupled simulations generated
using via an "exchanger."
\end_layout
@@ -615,8 +616,7 @@
\end_layout
\begin_layout Caption
-Pyre\InsetSpace ~
-Architecture.
+Pyre Architecture.
The integration framework is a set of cooperating abstract services.
\end_layout
@@ -626,21 +626,21 @@
\end_layout
\begin_layout Standard
-Developers have created Pyre classes for CitComS to facilitate simulation
+Developers have created Pyre classes for CitComS.py to facilitate simulation
setup.
However, they are not independent classes in a strict sense.
 They still share the same underlying data structure and their functionality
is not divided clearly.
- CitComS was not designed to be object-oriented and to make it so would
+ CitComS.py was not designed to be object-oriented and to make it so would
require significant investment of effort with little return.
However, the lack of object-oriented features does not hinder the coupling
- of CitComS with other solvers.
+ of CitComS.py with other solvers.
\end_layout
\begin_layout Standard
-This version of CitComS "attaches" to Pyre via the use of bindings.
- They are included with CitComS, eliminating the need for users to write
+This version of CitComS.py "attaches" to Pyre via the use of bindings.
+ They are included with CitComS.py, eliminating the need for users to write
or alter them.
\end_layout
@@ -1500,6 +1500,7 @@
PYTHONPATH
\family default
.
+ (Change the path name accordingly if your Python version is not 2.3.x.)
Finally, on some obscure Unix systems, you may need to add
\family typewriter
PREFIX/lib
@@ -1658,10 +1659,6 @@
\begin_layout Standard
-\end_layout
-
-\begin_layout Standard
-
\family typewriter
\InsetSpace ~
\InsetSpace ~
@@ -1685,10 +1682,6 @@
\end_layout
\begin_layout Standard
-
-\end_layout
-
-\begin_layout Standard
By default,
\family typewriter
configure
@@ -1981,6 +1974,10 @@
\end_layout
\begin_layout Section
+\begin_inset LatexCommand \label{sec:Batch-System-Configuration}
+
+\end_inset
+
Batch System Configuration
\end_layout
@@ -2387,7 +2384,7 @@
\end_layout
\begin_layout LyX-Code
-$ export PYTHONPATH=$HOME/cig/lib/python2.4/site-packages:$PYTHONPATH
+$ export PYTHONPATH=$HOME/cig/lib/python2.3/site-packages:$PYTHONPATH
\end_layout
\begin_layout LyX-Code
@@ -2439,10 +2436,6 @@
Running CitComS.py
\end_layout
-\begin_layout Standard
-
-\end_layout
-
\begin_layout Section
Using CitComS.py
\end_layout
@@ -2457,48 +2450,67 @@
\family typewriter
/usr/local/bin
\family default
-),
+), the program
\family typewriter
-citcomsregional.sh
+citcoms
\family default
- is used for regional models and
+ is used for running both the regional and full spherical models.
+ The program
\family typewriter
-citcomsfull.sh
+citcoms
\family default
- for full spherical models.
- The program
-\family typewriter
- citcomsregional.sh
-\family default
- can be executed without any command line options and will run with default
- parameters;
-\family typewriter
-citcomsfull.sh
-\family default
- requires at least 12 processors to run.
- To do so, you must specify the
-\family typewriter
-launcher
-\family default
- parameters, which are called "facilities."
+ can be executed without any command line options and will run a regional
+ model with default parameters.
+ It can also run a full spherical model if correct parameters are set (see
+
+\begin_inset LatexCommand \vref{sec:Cookbook-1:-Global}
+
+\end_inset
+
+for an example).
\end_layout
\begin_layout Standard
-Since you will likely want to specify the parameters of your CitComS runs,
- you will need to alter both computational details (such as the number of
- time steps) and controlling parameters specific to your problem (such as
- the Rayleigh number).
- Most of the properties you will set using CitComS.py have identical names
- as the parameters for the old CitComS.
-
+On input, CitComS.py needs numerous parameters specified (see Appendix
+\begin_inset LatexCommand \vref{cha:Appendix-A:-Facilities,}
+
+\end_inset
+
+ for the full list).
+ All parameters have sensible default values.
+ Since you will likely want to specify the parameters of your CitComS.py
+ runs, you will need to alter both computational details (such as the number
+ of time steps) and controlling parameters specific to your problem (such
+ as the Rayleigh number).
+ These input parameters, or properties in the Pyre terminology, are grouped
+ under several Pyre components.
+ You can think
\end_layout
+\begin_layout Standard
+Most of the properties you will set using CitComS.py have the same names as
+ the parameters of the old CitComS, which are described in
+\begin_inset LatexCommand \vref{cha:Appendix-A:-Facilities,}
+
+\end_inset
+
+.
+\end_layout
+
\begin_layout Section
-Changing Parameters
+Changing Parameters
\end_layout
\begin_layout Standard
-Pyre has the following syntax to change parameters from their default values.
+(TODO: precedence of various input methods)
+\end_layout
+
+\begin_layout Subsection
+From the Command Line
+\end_layout
+
+\begin_layout Standard
+Pyre has the following syntax to change properties from their default values.
To change the value of a property of a component, use:
\end_layout
@@ -2516,13 +2528,49 @@
\end_layout
\begin_layout Standard
-A different component can be attached to a facility by:
+Each facility has a default component attached to it.
+ A different component can be attached to a facility by:
\end_layout
\begin_layout LyX-Code
--[facility]=[new_component]
\end_layout
+\begin_layout Subsection
+In a .cfg File
+\end_layout
+
+\begin_layout Standard
+Entering all those parameters via the command line involves the risk of
+ typographical errors, which can lead to undesired results.
+ So, you may find it easier to write a brief input file that contains the
+ parameters.
+ Since v2.1, the input parameters can be saved in a .cfg file.
+ The file has a format similar to that of a Windows INI file.
+ The file is composed of one or more sections.
+ Each section looks like:
+\end_layout
+
+\begin_layout LyX-Code
+[CitcomS.component1.component2]
+\newline
+# this is a comment
+\newline
+property1 = value1
+\newline
+property2
+ = value2
+\end_layout
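The section layout described above can be explored with a minimal Python sketch. It uses the standard-library configparser module purely to illustrate the INI-like structure; this is an assumption made for illustration and is not how Pyre itself reads .cfg files. The helper name read_properties is hypothetical.

```python
import configparser

# Sample input in the layout described above (values taken from example0.cfg).
CFG_TEXT = """
[CitcomS.controller]
# this is a comment
monitoringFrequency = 5

[CitcomS.solver.mesher]
nodex = 17
nodey = 17
"""


def read_properties(text):
    """Return {section name: {property: value}} with all values as strings."""
    parser = configparser.ConfigParser()
    # Keep property names case-sensitive (configparser lowercases by default).
    parser.optionxform = str
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


properties = read_properties(CFG_TEXT)
for section, props in properties.items():
    print(section, props)
```

Each section name spells out the path of components, so a property such as nodex under [CitcomS.solver.mesher] corresponds to the command line option --solver.mesher.nodex.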
+
+\begin_layout Subsection
+System or Personal Default Parameters in a .pml File
+\end_layout
+
+\begin_layout Standard
+(TODO: where is the system-wide pml file, where is the personal file, and
+ the syntax of it?)
+\end_layout
+
\begin_layout Section
Coordinate System and Mesh
\end_layout
@@ -2534,11 +2582,11 @@
\emph on
theta
\emph default
- is the colatitude measured from the north pole,
+ (or x) is the colatitude measured from the north pole,
\emph on
fi
\emph default
- is the east longitude, and z is the radius.
+ (or y) is the east longitude, and z is the radius.
\emph on
theta
@@ -2548,8 +2596,11 @@
fi
\emph default
are in units of radians.
- Figure 3.1 shows the organization of the mesh in a regional problem.
-
+ Figure 3.1(TODO: Sue, please please update '3.1' to an automatic tag) shows
+ the organization of the mesh in a regional problem.
+ The numbering of the nodes is z-direction first, then x-direction, then
+ y-direction.
+ This numbering convention is used for the input and output data.
\end_layout
\begin_layout Standard
@@ -2557,7 +2608,7 @@
placement H
wide false
sideways false
-status collapsed
+status open
\begin_layout Description
\begin_inset Graphics
@@ -2569,9 +2620,7 @@
\end_layout
\begin_layout Caption
-Global\InsetSpace ~
-Node\InsetSpace ~
-Numbering.
+Global Node Numbering.
Left: Global node numbering starts at the base of arrow A (theta
\size scriptsize
min
@@ -2620,45 +2669,21 @@
\end_layout
\begin_layout Section
-A Simple CitComS Test
+Uniprocessor Example
\end_layout
\begin_layout Standard
CitComS runs similarly in full spherical or regional modes.
For the purpose of this example, you will perform a test run of the regional
version.
- Instructions for using the full version can be found at
-\begin_inset LatexCommand \vref{sec:Cookbook-1:-Global}
-
-\end_inset
-
-.
+ Execute the following on the command line:
\end_layout
-\begin_layout Standard
-Execute the following on the command line:
-\end_layout
-
-\begin_layout Paragraph
-\begin_inset LatexCommand \label{sub:Uniprocessor-CitComS-from}
-
-\end_inset
-
-Example: Uniprocessor CitComS from the Command Line
-\end_layout
-
\begin_layout LyX-Code
-citcomsregional.sh --steps=10
+$ citcoms --steps=10 --controller.monitoringFrequency=5 --solver.datafile=example0
+ --solver.mesher.nodex=17 --solver.mesher.nodey=17
\end_layout
-\begin_layout LyX-Code
---controller.monitoringFrequency=5 --solver.datafile=example
-\end_layout
-
-\begin_layout LyX-Code
---solver.mesher.nodex=17 --solver.mesher.nodey=17
-\end_layout
-
\begin_layout Standard
This runs a default convection problem in a regional domain for 10 time
steps and with a mesh of 17
@@ -2670,318 +2695,461 @@
\end_inset
9 nodal points.
- See Chapter
-\begin_inset LatexCommand \vref{cha:Postprocessing-and-Graphics}
-
-\end_inset
-
- for instructions to immediately visualize your results with OpenDX.
+ The model results are written to files
+\family typewriter
+example0.*
+\family default
+ with an interval of 5 time steps.
+
\end_layout
-\begin_layout Section
-Multiprocessor Example
-\end_layout
-
\begin_layout Standard
-On input, CitComS needs numerous parameters specified (see Appendix A for
- full list).
- Most parameters have sensible default values.
- Depending on those input parameters, CitComS could also require numerous
- input files, such as velocity boundary conditions which could be used for
- time dependent, plate tectonic problems.
- Finally, CitComS will potentially output a large number of files which
- will be spread throughout the file systems contained in the individual
- nodes of your Beowulf.
- This means that you will have to organize your directories carefully when
- running CitComS so that you can find your way through these files as well
- as use a post-processing program contained with this distribution.
- (In the example that follows, these output files would be placed in
+Instead of writing the input parameters on the command line, you can put
+ them in a .cfg file.
+ The CitComS.py source that you downloaded includes an
\family typewriter
-/scratch/username
+examples
\family default
- with a filename prefix
+ directory.
+ In this directory, you will find a .cfg file equivalent to the previous example,
+
\family typewriter
-example1
+example0.cfg
\family default
- on each individual node.)
+.
+ You can run the model using:
\end_layout
-\begin_layout Standard
-It is possible to input the parameters relevant to your test run from the
- command line:
+\begin_layout LyX-Code
+$ citcoms
+\family typewriter
+example0.cfg
\end_layout
\begin_layout Paragraph
-Example: Multiprocessor CitComS from the Command Line
+Example: Uniprocessor CitComS.py, example0.cfg
\end_layout
\begin_layout LyX-Code
-$ citcomsregional.sh --launcher.nodes=4
-\end_layout
+[CitcomS]
+\newline
+steps = 10
+\newline
-\begin_layout LyX-Code
---launcher.nodegen="n%03d" --launcher.nodelist=[101-102]
-\end_layout
+\newline
+[CitcomS.controller]
+\newline
+monitoringFrequency = 5
+\newline
-\begin_layout LyX-Code
---solver.mesher.nprocx=2 --solver.mesher.nprocy=2
+\newline
+[CitcomS.solver]
+\newline
+datafile = example0
+\newline
+
+\newline
+[CitcomS.solver.mesher]
+\newline
+nodex = 17
+\newline
+nodey = 17
\end_layout
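The correspondence between the command line options of the previous example and the sections of example0.cfg can be sketched in Python as follows. This is only an illustration of the naming convention, not Pyre code; the function names options_to_sections and render_cfg are hypothetical.

```python
def options_to_sections(options, root="CitcomS"):
    """Group --component.path.property=value options by .cfg section name."""
    sections = {}
    for option in options:
        name, value = option.lstrip("-").split("=", 1)
        # Everything before the last dot names the component path.
        *components, prop = name.split(".")
        section = ".".join([root] + components)
        sections.setdefault(section, {})[prop] = value
    return sections


def render_cfg(sections):
    """Format the grouped properties in the .cfg layout shown above."""
    blocks = []
    for section, props in sections.items():
        lines = ["[%s]" % section]
        lines.extend("%s = %s" % (name, value) for name, value in props.items())
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks)


options = [
    "--steps=10",
    "--controller.monitoringFrequency=5",
    "--solver.datafile=example0",
    "--solver.mesher.nodex=17",
    "--solver.mesher.nodey=17",
]
cfg_text = render_cfg(options_to_sections(options))
print(cfg_text)
```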
-\begin_layout LyX-Code
---solver.datafile="/scratch/username/example1"
+\begin_layout Section
+Multiprocessor Example
\end_layout
\begin_layout Standard
-However, entering all those parameters via the command line involves the
- risk of typographical errors, which can lead to undesired results.
- So, you may find it easier to write a brief shell script that contains
- the parameters.
- The file should have the suffix
+To run this example, you should be on a Beowulf cluster with four or more
+ processors or at a supercomputing center, and in the directory that contains
+ the input file, in this case, the
\family typewriter
-.sh
-\family default
-.
- When you downloaded the CitComS source, this included an
-\family typewriter
examples
\family default
directory.
- In this directory, you will find a script called
-\family typewriter
-example1.sh
-\family default
-.
+ CitComS.py has been used extensively in both environments, on up to several
+ hundred processors.
+ How to run a multiprocessor CitComS.py model depends on your hardware and
+ software settings.
+ For example, it depends on whether a batch system is used, what the computers
+ in the cluster are named, and how the file system is organized.
+ This section will lead you through the different settings of a parallel
+ environment.
\end_layout
\begin_layout Paragraph
-Example: Simple Regional CitComS Shell Script, example1.sh
+Example: Multiprocessor CitComS.py, example1.cfg
\end_layout
\begin_layout LyX-Code
-#!/bin/sh
+[CitcomS]
\newline
+steps = 71
+\newline
-\end_layout
+\newline
+[CitcomS.controller]
+\newline
+monitoringFrequency = 10
+\newline
-\begin_layout LyX-Code
-citcomsregional.sh
-\backslash
+\newline
+[CitcomS.solver]
+\newline
+datafile = example1
+\newline
+\newline
+[CitcomS.solver.mesher]
+\newline
+nprocx = 2
+\newline
+nprocy = 2
+\newline
+nodex = 17
+\newline
+nodey
+ = 17
+\newline
+nodez = 9
+
\end_layout
-\begin_layout LyX-Code
-
-\backslash
-
+\begin_layout Standard
+This example uses 2 processors in colatitude (x-direction), 2 in longitude
+ (y-direction), and 1 in the radial direction (z-direction), i.e., it uses
+ 4 processors in total.
+ In addition, there will be 17 nodes in x, 17 nodes in y, and 9 nodes in
+ z.
+ The model will run for 71 time steps and output the results every 10 time
+ steps.
\end_layout
-\begin_layout LyX-Code
---steps=71
-\backslash
-
+\begin_layout Standard
+It is important to realize that within the example script (and in the finite
+ element method, FEM) the term "node" refers to the mesh points defining
+ the corners of the elements.
+ In
+\family typewriter
+example1.cfg
+\family default
+, this is indicated with:
\end_layout
\begin_layout LyX-Code
-
-\backslash
-
+nodex = 17
+\newline
+nodey = 17
+\newline
+nodez = 9
\end_layout
-\begin_layout LyX-Code
---controller.monitoringFrequency=10
-\backslash
-
+\begin_layout Standard
+These quantities refer to the total number of FEM nodes in a given direction
+ for the complete problem; for the example, it works out that within
+ a given processor there will be 9 x 9 x 9 nodes.
 Note that in the x-direction (or y) there are 17 nodes for the entire problem,
 and one node is shared between the two processors.
 This shared node is duplicated in two adjacent processors.
+ Unfortunately, "nodes" sometimes also refers to the individual computers
+ that make up the Beowulf cluster.
+ In the example scripts, the number of processors in each direction is indicated with:
\end_layout
\begin_layout LyX-Code
-
-\backslash
-
+nprocx = 2
+\newline
+nprocy = 2
\end_layout
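The arithmetic behind the 9 x 9 x 9 figure can be written out as a short Python sketch. The formula is inferred from the example above (17 nodes split between 2 processors with one shared, duplicated node) and is not taken from the CitComS.py source; the function name nodes_per_processor and the divisibility check are assumptions.

```python
def nodes_per_processor(total_nodes, nproc):
    """Nodes held by one processor along one direction.

    Adjacent processors share (duplicate) the node on their common boundary,
    so the elements, not the nodes, are divided evenly among processors.
    """
    elements = total_nodes - 1
    if elements % nproc != 0:
        # Assumption: the element count must divide evenly among processors.
        raise ValueError("(total_nodes - 1) must be divisible by nproc")
    return elements // nproc + 1


# Values from example1.cfg; one processor in the radial direction.
local_x = nodes_per_processor(17, 2)
local_y = nodes_per_processor(17, 2)
local_z = nodes_per_processor(9, 1)
print(local_x, local_y, local_z)
```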
-\begin_layout LyX-Code
---launcher.nodes=4
-\backslash
-
+\begin_layout Subsection
+File Systems and Output Formats
\end_layout
-\begin_layout LyX-Code
---launcher.nodegen="n%03d"
-\backslash
-
+\begin_layout Standard
+CitComS.py will potentially output a large number of ASCII files.
+ This means that you will have to organize your directories carefully when
+ running CitComS.py so that you can find your way through these files as
+ well as use a post-processing program included with this distribution.
+
\end_layout
-\begin_layout LyX-Code
---launching.nodelist=[101-102,101-102]
-\backslash
+\begin_layout Standard
+You might have a local hard disk on every machine in the Beowulf cluster.
+ Each hard disk is mounted locally to the machine.
+ This scenario is referred to as a local file system in this section.
+ Or you might be using some kind of parallel file system on your cluster
+ (e.g., NFS, GPFS, PVFS, to name a few), which is mounted on all machines
+ on the cluster.
+ Usually your home directory is mounted on a parallel file system.
+ The local file system is usually more cost- and time-efficient than the
+ parallel file system.
\end_layout
-\begin_layout LyX-Code
-
-\backslash
-
+\begin_layout Standard
+If you want CitComS.py to write its output to the local hard disks, you need
+ to have a common directory structure for all the local hard disks.
+ For example, if the directory
+\family typewriter
+/scratch/username
+\family default
+ (you should replace
+\family typewriter
+username
+\family default
+ with your own username) exists on all local hard disks, you can run the example
+ script with:
\end_layout
\begin_layout LyX-Code
---solver.datafile=/scratch/username/example1
-\backslash
+$ citcoms
+\family typewriter
+example1.cfg
+\family default
+ --solver.datafile=/scratch/username/example1
+\end_layout
+\begin_layout Standard
+The additional command line option will override the
+\family typewriter
+datafile
+\family default
+ property in the input file.
+ Then, the output files would be placed in
+\family typewriter
+/scratch/username
+\family default
+ with a filename prefix
+\family typewriter
+example1
+\family default
+ on each individual machine.
\end_layout
-\begin_layout LyX-Code
-
-\backslash
-
+\begin_layout Standard
+If you want CitComS.py to write its output to the parallel file system, you
+ have several choices.
+ You can run the example script with:
\end_layout
\begin_layout LyX-Code
---solver.mesher.nprocx=2
-\backslash
-
+$ citcoms
+\family typewriter
+example1.cfg
+\family default
+ --solver.datafile=/home/username/example1
\end_layout
-\begin_layout LyX-Code
---solver.mesher.nprocy=2
-\backslash
-
+\begin_layout Standard
+Then, the output files would be placed in your home directory with a filename
+ prefix
+\family typewriter
+example1
+\family default
+.
+ The potential problem with this approach is that the directory
+\family typewriter
+/home/username
+\family default
+ will be flooded with hundreds or even tens of thousands of files if you are
+ running a model using several tens of processors for thousands of time steps.
+ Alternatively, you can have each processor write its output to its own
+ directory, named after its MPI rank.
+ You can run the example script with:
\end_layout
\begin_layout LyX-Code
---solver.mesher.nodex=17
-\backslash
-
+$ citcoms
+\family typewriter
+example1.cfg
+\family default
+ --solver.datadir=/home/username --solver.output.output_format=ascii
\end_layout
-\begin_layout LyX-Code
---solver.mesher.nodey=17
-\backslash
+\begin_layout Standard
+You will see four new directories
+\family typewriter
+/home/username/0
+\family default
+,
+\family typewriter
+/home/username/1
+\family default
+,
+\family typewriter
+/home/username/2
+\family default
+, and
+\family typewriter
+/home/username/3
+\family default
+.
+ The processor with MPI rank 0 will write its output to
+\family typewriter
+/home/username/0
+\family default
+ with a filename prefix
+\family typewriter
+example1
+\family default
+ (defined by the property
+\family typewriter
+datafile
+\family default
+ inside
+\family typewriter
+example1.cfg
+\family default
+) and so on.
\end_layout
-\begin_layout LyX-Code
---solver.mesher.nodez=9
-\backslash
-
+\begin_layout Standard
+The last choice is the most powerful one.
+ Instead of writing many ASCII files, CitComS.py can write its results to
+ a single HDF5 (Hierarchical Data Format) file.
+ The HDF5 file takes less disk space than the ASCII files in total and doesn't
+ require additional post-processing to be visualized in OpenDX.
+ In order to use this feature, you need to compile CitComS.py with the parallel
+ HDF5 library (TODO: link to chapter 2).
+ You can run the example script with:
\end_layout
\begin_layout LyX-Code
-# End of file
+$ citcoms
+\family typewriter
+example1.cfg
+\family default
+ --solver.datafile=/home/username/example1 --solver.output.output_format=hdf5
\end_layout
\begin_layout Standard
-This is a basic problem with 2 processors in colatitude (x-coordinate),
- 2 in longitude (y-direction), and 1 in the radial (z-direction).
- In addition, there will be 17 nodes in x, 17 nodes in y, and 9 nodes in
- z.
-\end_layout
-
-\begin_layout Standard
-It is important to realize that within the code (and in finite element analysis)
- the term "node" refers to the mesh points defining the corners of the elements.
- Unfortunately, "nodes" also refers to the individual computers which make
- up the Beowulf PC cluster.
- In
+Then, the output file will be
\family typewriter
-example1.sh
+/home/username/example1.h5
\family default
-, this is indicated with:
+.
+ See Chapter (TODO: link to hdf5 chapter) for more information on how to
+ work with HDF5 output.
\end_layout
-\begin_layout LyX-Code
---solver.mesher.nprocx=2
-\backslash
-
+\begin_layout Subsection
+Launchers
\end_layout
-\begin_layout LyX-Code
---solver.mesher.nprocy=2
-\backslash
-
+\begin_layout Standard
+If you have used MPI before, you will know that
+\family typewriter
+mpirun
+\family default
+ requires several command line options to launch a parallel job.
+ Or, if you have used a batch system, you will know that the batch
+ system requires you to write a script to launch a job.
+ Fortunately, launching a parallel CitComS.py job is simplified by the
+\family typewriter
+launcher
+\family default
+ component.
\end_layout
-\begin_layout LyX-Code
---solver.mesher.nodex=17
-\backslash
-
+\begin_layout Subsubsection
+Using MPICH or LAM/MPI
\end_layout
-\begin_layout LyX-Code
---solver.mesher.nodey=17
-\backslash
-
+\begin_layout Standard
+You need to specify the machines on which the job will run.
+ Suppose the machines on your cluster are named: n001, n002, \SpecialChar \ldots{}
+, etc., and
+ you want to run the job on machines n001, n003, n004, and n005 (maybe n002
+ is down for the moment).
+ To run the example, invoke the following:
\end_layout
\begin_layout LyX-Code
---solver.mesher.nodez=9
-\backslash
-
+$ citcoms example1.cfg --launcher.nodegen="n%03d" --launcher.nodelist=[1,3-5]
\end_layout
\begin_layout Standard
-These quantities refer to the total number of FEM nodes in a given direction
- for the complete problem and for the example it works out that within a
- given processor there will be 9 x 9 x 9 nodes.
- Note that in the x-direction (or y) that for the entire problem there are
- 17 nodes and there is one node shared between two processors.
- This shared node is duplicated in two adjacent processors.
- The global numbering of FEM nodes is z-direction first, then x-direction,
- then y-direction.
- This numbering convention is used for the input and output data.
-\end_layout
-
-\begin_layout Standard
-In order to run this example you should be on your front end and be in the
- directory in which the input file is located, in this case, the
+The
\family typewriter
-examples
+launcher.nodegen
\family default
- directory.
- To run the test example, invoke the following:
+ property is a printf-style format string, used in conjunction with
+\family typewriter
+launcher.nodelist
+\family default
+ to generate the list of machine names.
+ The
+\family typewriter
+launcher.nodelist
+\family default
+ property is a comma-separated list of machine numbers (or ranges of numbers) in square brackets.
\end_layout
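How the two properties combine can be illustrated with a small Python sketch that reproduces the machine names from the example above (n001, n003, n004, and n005). This is not the launcher's actual code; the function name expand_nodelist and the parsing details are assumptions made for illustration.

```python
def expand_nodelist(nodegen, nodelist):
    """Expand a list such as "[1,3-5]" into names using a printf-style format."""
    names = []
    for item in nodelist.strip("[]").split(","):
        # "3-5" is an inclusive range; a bare "1" is a single machine.
        first, _, last = item.partition("-")
        start = int(first)
        stop = int(last) if last else start
        names.extend(nodegen % number for number in range(start, stop + 1))
    return names


machines = expand_nodelist("n%03d", "[1,3-5]")
print(machines)
```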
-\begin_layout LyX-Code
-$ ./example1.sh
-\end_layout
-
\begin_layout Standard
-Once you have completed your run, you will notice that it generates a file
- called
+If you are using MPICH, you will notice that a machine file
\family typewriter
mpirun.nodes
\family default
-.
- It will contain a list of the nodes where CitComS has run.
- You may want to login to one of the nodes to check that the data appears
- to have been properly generated.
- You should also check the log file for any error messages.
-
+ is generated.
+ It will contain a list of the nodes where CitComS.py has run.
+ Save this machine file, as it will be useful in the post-processing step.
\end_layout
\begin_layout Standard
-If you receive a message such as
+If you are using LAM/MPI, no machine file is generated.
+ In addition, you can pass additional environment variables to LAM/MPI via:
\end_layout
\begin_layout LyX-Code
+$ citcoms example1.cfg --launcher.environments=[env_var1,env_var2]
+\end_layout
-\family typewriter
-bash: --solver.mesher.nodez=9: command not found
+\begin_layout Subsubsection
+Using a Batch System
\end_layout
\begin_layout Standard
-chances are your script is missing a line continuation or has an extra space.
- However, most of the time CitComS will just run using the default values
- -- even if it finds a typographical error such as "nnode" instead of "node."
- So, along with double-checking your parameters as you type them in, you
- should check your results to be sure they make sense.
-
+First, you should follow the instructions in Section
+\begin_inset LatexCommand \vref{sec:Batch-System-Configuration}
+
+\end_inset
+
+.
+ Once the configuration is done, you can run the example simply by:
\end_layout
+\begin_layout LyX-Code
+$ citcoms example1.cfg
+\end_layout
+
+\begin_layout Subsection
+Monitoring Your Jobs
+\end_layout
+
\begin_layout Standard
-Following your successful run, you will want to retrieve the output files
+Once launched, CitComS.py will print the progress of the model to the standard
+ error stream (stderr).
+ Usually, stderr is directed to your terminal so that you can monitor
+ the progress.
+ On some systems, stderr is redirected to a file.
+ In any case, the progress is always saved in a log file by the rank-0 processor.
+ You should check the log file for any error messages.
+ You may also want to log in to one of the machines to check that the data
+ appears to have been properly generated.
+ Following your successful run, you will want to retrieve the output files
from all the nodes and process them so they can be visualized with the
- visualization program OpenDX and proceed to the next chapter.
+ visualization program OpenDX and proceed to the chapter
+\begin_inset LatexCommand \vref{cha:Postprocessing-and-Graphics}
+
+\end_inset
+
+.
\end_layout
\begin_layout Section
@@ -3133,67 +3301,55 @@
\end_layout
\begin_layout Standard
-These output files need to be post-processed before you can do the visualization.
- There are two scripts and both can post-process (retrieve and combine)
- CitComS output:
+When you execute a CitComS run, your input parameters will be saved in a
+ file
\family typewriter
-autocombine.py
+PID.parameters
\family default
- and
+, where
\family typewriter
-batchcombine.py
+PID
\family default
-.
- It is recommened to use
+ is a five-digit number for the process ID.
+ This file contains most of the input parameters, which can be useful for
+ archiving and post-processing.
+ (XXX: TODO)
+\end_layout
+
+\begin_layout Standard
+These output files need to be post-processed before you can do the visualization.
+ There is a script that can post-process (retrieve and combine) CitComS
+ output:
\family typewriter
autocombine.py
\family default
- over
-\family typewriter
-batchcombine.py
-\family default
-, as the former is easier to use.
-
+.
+ The script will retrieve CitComS data to the current directory and combine
+ the output into a few files.
\end_layout
\begin_layout Standard
-To use
+Using
\family typewriter
autocombine.py
\family default
-, you have to run CitComS.py in a slightly different way.
- CitComS.py prints out the input parameters to the screen.
- That printout can be redirected to a file, which can be used later to retrieve
- input parameters.
- This is done with:
+, you can retrieve and combine data of time-step 10 by:
\end_layout
\begin_layout LyX-Code
-$ my_citcoms.py_script > input_param
+$ autocombine.py mpirun.nodes PID.parameters 10
\end_layout
\begin_layout Standard
-Then, you can retrieve and combine data of step 10 by using:
-\end_layout
-
-\begin_layout LyX-Code
-$ autocombine.py mpirun.nodes input_param 10
-\end_layout
-
-\begin_layout Standard
This reads the MPI machinefile (
\family typewriter
mpirun.nodes
\family default
) and CitComS.py input parameters (
\family typewriter
-input_param
+PID.parameters
\family default
-), then calls
-\family typewriter
-batchcombine.py
-\family default
- to do the actual job.
+), then calls other scripts to do the actual job.
The general usage for
\family typewriter
autocombine.py
@@ -3212,130 +3368,22 @@
\end_layout
\begin_layout Standard
-If you do not have the input parameters saved to a file, you have to use
-
+If your Beowulf cluster uses ssh (rather than rsh) to access the computation
+ nodes, you have to manually edit
\family typewriter
-batchcombine.py
+PREFIX/bin/batchpaste.sh
\family default
- to retrieve and combine the data.
- The general usage for
-\family typewriter
-batchcombine.py
-\family default
- is:
-\end_layout
-
-\begin_layout LyX-Code
-$ batchcombine.py [machinefile] [modeldir] [modelname]
-\backslash
-
-\end_layout
-
-\begin_layout LyX-Code
-[timestep] [nodex] [nodey] [nodez] [n-surf-cap]
-\backslash
+ and replace 'rsh' with 'ssh' in the script.
\end_layout
-\begin_layout LyX-Code
-[nprocx] [nprocy] [nprocz]
-\end_layout
-
\begin_layout Standard
-where
-\end_layout
-
-\begin_layout Itemize
-
+If your cluster uses a parallel file system and all of your data resides
+ in one directory (e.g.,
\family typewriter
-machinefile
-\family default
- is the file with a list of all the machines you are using, which under
- normal circumstances are the same as the launching nodes.
-\end_layout
-
-\begin_layout Itemize
-
-\family typewriter
-modeldir
-\family default
- is the scratch directory where all your files are located.
- This should correspond to the first part of the
-\family typewriter
-datafile
-\family default
- entry, e.g.,
-\family typewriter
/scratch/username
\family default
-.
-
-\family typewriter
-modelname
-\family default
- is the model prefix for example
-\family typewriter
-test_case
-\end_layout
-
-\begin_layout Itemize
-
-\family typewriter
-timestep
-\family default
- is the time step you want to export.
- This corresponds to the last term in the filename of the velocity file.
-\end_layout
-
-\begin_layout Itemize
-
-\family typewriter
-nodex
-\family default
-,
-\family typewriter
-nodey
-\family default
-,
-\family typewriter
-nodez
-\family default
- are the same values as those used in the input file.
-\end_layout
-
-\begin_layout Itemize
-
-\family typewriter
-n-surf-cap
-\family default
- is equal to 1 for regional CitComS and 12 for full CitComS.
-\end_layout
-
-\begin_layout Itemize
-
-\family typewriter
-nprocx
-\family default
-,
-\family typewriter
-nprocy
-\family default
-,
-\family typewriter
-nprocz
-\family default
- are the same values as those used in the input file.
-\end_layout
-
-\begin_layout Standard
-If your Beowulf cluster uses ssh (rather than rsh) to access the computation
- nodes, you have to manually edit
-\family typewriter
-bin/batchpaste.sh
-\family default
- and replace 'rsh' to 'ssh' in the file.
- If your cluster forbids access to the computation nodes, you can still
- post-process your output files (if they all reside in one directory) by
+) in the parallel file system, you can post-process your output files by
following the instructions in the next section.
\end_layout
@@ -3345,10 +3393,6 @@
\family typewriter
autocombine.py
\family default
- or
-\family typewriter
-batchcombine.py
-\family default
has been run, you should end up with 2 files (or 24 files for full CitComS)
that resemble these:
\end_layout
@@ -3407,26 +3451,6 @@
[step1] [step2 or more ...]
\end_layout
-\begin_layout Standard
-or
-\end_layout
-
-\begin_layout LyX-Code
-$ batchcombine.py localhost [modeldir] [modelname] [timestep]
-\backslash
-
-\end_layout
-
-\begin_layout LyX-Code
-[nodex] [nodey] [nodez] [n-surf-cap]
-\backslash
-
-\end_layout
-
-\begin_layout LyX-Code
-[nprocx] [nprocy] [nprocz]
-\end_layout
-
\begin_layout Section
Using OpenDX for Regional Visualization
\end_layout
@@ -3531,11 +3555,11 @@
field.
Then, edit the filename with the cap number replaced by
\family typewriter
-%d
+%02d
\family default
, e.g.,
\family typewriter
- test-case.cap%d.10.general
+ test-case.cap%02d.10.general
\family default
.) In the pulldown menu, select
\family typewriter
@@ -3564,7 +3588,7 @@
placement H
wide false
sideways false
-status open
+status collapsed
\begin_layout Standard
\begin_inset Graphics
@@ -3670,9 +3694,7 @@
\end_inset
-Cookbook 1: Global Model Using
-\family sans
-fullcitcoms
+Cookbook 1: Global Model
\end_layout
\begin_layout Subsection
@@ -3785,9 +3807,9 @@
\end_layout
\begin_layout Itemize
-the location of the output, e.g.,
+the prefix of the output filenames, e.g.,
\family typewriter
- /scratch/username/cookbook1
+ cookbook1
\end_layout
\begin_layout Itemize
@@ -3818,12 +3840,8 @@
, that is actually the default value.
\end_layout
-\begin_layout Standard
-
-\end_layout
-
\begin_layout LyX-Code
---solver.datafile="/scratch/username/cookbook1"
+--solver.datafile="cookbook1"
\backslash
\end_layout
@@ -3858,10 +3876,6 @@
, it is run by typing
\end_layout
-\begin_layout Standard
-
-\end_layout
-
\begin_layout LyX-Code
$ ./cookbook1.sh
\end_layout
@@ -3929,7 +3943,7 @@
\end_layout
\begin_layout LyX-Code
---solver.datafile="/scratch/username/Test/cookbook1"
+--solver.datafile="cookbook1"
\backslash
\end_layout
@@ -4075,12 +4089,7 @@
\begin_layout Standard
You would like to solve for thermal convection with velocity boundary
conditions on the top surface within a given region of a sphere.
- As with
-\begin_inset LatexCommand \vref{sub:Uniprocessor-CitComS-from}
-
-\end_inset
-
-, you will need to modify a script that calls the regional version of CitComS,
+ You will need to modify a script that calls the regional version of CitComS,
\family typewriter
citcomsregional.sh
@@ -4400,7 +4409,7 @@
\end_layout
\begin_layout LyX-Code
---solver.datafile=/scratch/username/cookbook2
+--solver.datafile=cookbook2
\backslash
\end_layout
@@ -4580,7 +4589,7 @@
\end_layout
\begin_layout LyX-Code
---solver.datafile=/scratch/username/cookbook3
+--solver.datafile=cookbook3
\backslash
\end_layout
@@ -4736,7 +4745,7 @@
\end_layout
\begin_layout LyX-Code
---solver.datafile=/scratch/username/cookbook3
+--solver.datafile=cookbook3
\backslash
\end_layout
@@ -5070,7 +5079,7 @@
\end_layout
\begin_layout LyX-Code
---solver.datafile=/scratch/username/cookbook4
+--solver.datafile=cookbook4
\backslash
\end_layout
@@ -5379,10 +5388,6 @@
starting age for the model:
\end_layout
-\begin_layout Standard
-
-\end_layout
-
\begin_layout LyX-Code
--solver.param.vel_bound_file=$PWD/velocity/vel.dat
\backslash
@@ -5493,7 +5498,7 @@
\end_layout
\begin_layout LyX-Code
---solver.datafile=/scratch/username/cookbook5
+--solver.datafile=cookbook5
\backslash
\end_layout
@@ -6066,7 +6071,7 @@
\begin_layout Standard
\begin_inset Tabular
-<lyxtabular version="3" rows="8" columns="2">
+<lyxtabular version="3" rows="9" columns="2">
<features islongtable="true">
<column alignment="left" valignment="top" leftline="true" width="1.75in">
<column alignment="left" valignment="top" leftline="true" rightline="true" width="3.5in">
@@ -6075,7 +6080,7 @@
\begin_inset Text
\begin_layout Standard
-datafile=""
+datadir="."
\end_layout
\end_inset
@@ -6086,18 +6091,44 @@
\begin_layout Standard
\family typewriter
-datafile
+datadir
\family default
- controls the location of output files and the prefix of their names.
- In this case the directory /scratch/ must be created a priori.
- Files such as test.xxx, e.g.,
+ controls the location of output files.
+
\end_layout
+\end_inset
+</cell>
+</row>
+<row topline="true">
+<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
+\begin_inset Text
+
\begin_layout Standard
+datafile="regtest"
+\end_layout
+\end_inset
+</cell>
+<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
+\begin_inset Text
+
+\begin_layout Standard
+
\family typewriter
-\size small
-datafile="/scratch/test"
+datafile
+\family default
+ controls the prefix of output filenames.
+ Output files are named accordingly, e.g., regtest.xxx.
+ The prefix cannot contain the
+\begin_inset Quotes sld
+\end_inset
+
+/
+\begin_inset Quotes srd
+\end_inset
+
+ character.
\end_layout
\end_inset
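
The datafile changes in this revision split one path-like setting into an output directory (datadir, default ".") and a filename prefix (datafile, which cannot contain "/"). The following is a minimal sketch of how an old-style value could be converted to the new pair. The helper name split_datafile is hypothetical and is not part of CitComS.py; it only illustrates the mapping implied by the diff.

```python
import posixpath


def split_datafile(old_value):
    """Split an old-style datafile value into (datadir, datafile).

    Hypothetical helper for illustration only; not part of CitComS.py.
    """
    # Everything before the last "/" is the directory, the rest is the prefix.
    datadir, prefix = posixpath.split(old_value)
    if not prefix:
        raise ValueError("no filename prefix in %r" % (old_value,))
    # A value with no directory part maps to the new default datadir, ".".
    return (datadir or ".", prefix)


print(split_datafile("/scratch/username/cookbook1"))
```

Under this assumption, an old setting such as --solver.datafile="/scratch/username/cookbook1" would correspond to a datadir of /scratch/username and a datafile of cookbook1, while a value that already has no directory part keeps the default datadir.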