[cig-commits] r4671 - mc/3D/CitcomS/trunk/doc/manual

tan2 at geodynamics.org
Fri Sep 29 17:56:33 PDT 2006


Author: tan2
Date: 2006-09-29 17:56:33 -0700 (Fri, 29 Sep 2006)
New Revision: 4671

Modified:
   mc/3D/CitcomS/trunk/doc/manual/citcoms.lyx
Log:
Updated Chapter 3 "Running CitComS.py" for the .cfg input format, also
explained how to launch a parallel job. Some TODO items left.
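
Note on the launcher text added below: the new "Launchers" subsection describes how launcher.nodegen (a printf-style format string) and launcher.nodelist combine to produce machine names. A minimal sketch of that expansion follows. It is an illustration only, not the Pyre launcher code: the function name expand_nodelist is hypothetical, and treating a range such as 3-5 as inclusive is an assumption inferred from the example in the manual text (n001, n003, n004, n005).

```python
def expand_nodelist(nodegen, nodelist):
    """Expand a bracketed list such as "[1,3-5]" into machine names.

    nodegen  -- printf-style format string, e.g. "n%03d"
    nodelist -- comma-separated numbers or inclusive ranges in square brackets
    """
    names = []
    for item in nodelist.strip("[]").split(","):
        item = item.strip()
        if not item:
            continue
        if "-" in item:
            # Assumed: "3-5" means 3, 4 and 5 (both ends included).
            first, last = (int(part) for part in item.split("-"))
            numbers = range(first, last + 1)
        else:
            numbers = [int(item)]
        # Apply the format string to every machine number.
        names.extend(nodegen % n for n in numbers)
    return names


print(expand_nodelist("n%03d", "[1,3-5]"))
```

Under these assumptions, the option pair --launcher.nodegen="n%03d" --launcher.nodelist=[1,3-5] from the manual corresponds to the four machines named in the text.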


Modified: mc/3D/CitcomS/trunk/doc/manual/citcoms.lyx
===================================================================
--- mc/3D/CitcomS/trunk/doc/manual/citcoms.lyx	2006-09-30 00:40:29 UTC (rev 4670)
+++ mc/3D/CitcomS/trunk/doc/manual/citcoms.lyx	2006-09-30 00:56:33 UTC (rev 4671)
@@ -1,4 +1,4 @@
-#LyX 1.4.1 created this file. For more info see http://www.lyx.org/
+#LyX 1.4.2 created this file. For more info see http://www.lyx.org/
 \lyxformat 245
 \begin_document
 \begin_header
@@ -86,7 +86,7 @@
 \begin_layout Author
 California Institute of Technology
 \newline
-Version 2.0.2
+Version 2.1.0
 \end_layout
 
 \begin_layout Date
@@ -192,7 +192,7 @@
 \begin_layout Standard
 The manual was written for the usage of CitComS.py on a variety of different
  platforms.
- CitComS has run on shared memory computers (Sun, Hewlett-Packard, SGI,
+ CitComS.py has run on shared memory computers (Sun, Hewlett-Packard, SGI,
  and IBM), commercial distributed memory machines (Intel and Cray/SGI),
  and Beowulf implementations (including Beowulfs at the California Institute
  of Technology's Center for Advanced Computing Research and Seismological
@@ -512,6 +512,7 @@
  In the summer of 2005, as part of the 2.0.1 release, the CIG replaced the
  old build procedure with the GNU Build System.
  A subsequent release, version 2.0.2, could compile and run on 64-bit systems.
+ (TODO: history of v2.1)
 \end_layout
 
 \begin_layout Section
@@ -582,8 +583,8 @@
 \begin_layout Standard
 Pyre provides a simulation framework that includes solver integration and
  coupling, uniform access to facilities, and integrated visualization.
- The framework offers a way to add new solvers to CitComS and to fine-tune
- CitComS simulations.
+ The framework offers a way to add new solvers to CitComS.py and to fine-tune
+ CitComS.py simulations.
  Future versions of this documentation will cover coupled simulations generated
  using an "exchanger."
 \end_layout
@@ -615,8 +616,7 @@
 \end_layout
 
 \begin_layout Caption
-Pyre\InsetSpace ~
-Architecture.
+Pyre Architecture.
  The integration framework is a set of cooperating abstract services.
 \end_layout
 
@@ -626,21 +626,21 @@
 \end_layout
 
 \begin_layout Standard
-Developers have created Pyre classes for CitComS to facilitate simulation
+Developers have created Pyre classes for CitComS.py to facilitate simulation
  setup.
  However, they are not independent classes in a strict sense.
  They still share the same underlying data structure and their functionality
  is not divided clearly.
- CitComS was not designed to be object-oriented and to make it so would
+ CitComS.py was not designed to be object-oriented and to make it so would
  require significant investment of effort with little return.
  However, the lack of object-oriented features does not hinder the coupling
- of CitComS with other solvers.
+ of CitComS.py with other solvers.
  
 \end_layout
 
 \begin_layout Standard
-This version of CitComS "attaches" to Pyre via the use of bindings.
- They are included with CitComS, eliminating the need for users to write
+This version of CitComS.py "attaches" to Pyre via the use of bindings.
+ They are included with CitComS.py, eliminating the need for users to write
  or alter them.
 \end_layout
 
@@ -1500,6 +1500,7 @@
 PYTHONPATH
 \family default
 .
+ (Change the path name accordingly if your Python version is not 2.3.x.)
  Finally, on some obscure Unix systems, you may need to add 
 \family typewriter
 PREFIX/lib
@@ -1658,10 +1659,6 @@
 
 \begin_layout Standard
 
-\end_layout
-
-\begin_layout Standard
-
 \family typewriter
 \InsetSpace ~
 \InsetSpace ~
@@ -1685,10 +1682,6 @@
 \end_layout
 
 \begin_layout Standard
-
-\end_layout
-
-\begin_layout Standard
 By default, 
 \family typewriter
 configure
@@ -1981,6 +1974,10 @@
 \end_layout
 
 \begin_layout Section
+\begin_inset LatexCommand \label{sec:Batch-System-Configuration}
+
+\end_inset
+
 Batch System Configuration
 \end_layout
 
@@ -2387,7 +2384,7 @@
 \end_layout
 
 \begin_layout LyX-Code
-$ export PYTHONPATH=$HOME/cig/lib/python2.4/site-packages:$PYTHONPATH
+$ export PYTHONPATH=$HOME/cig/lib/python2.3/site-packages:$PYTHONPATH
 \end_layout
 
 \begin_layout LyX-Code
@@ -2439,10 +2436,6 @@
 Running CitComS.py
 \end_layout
 
-\begin_layout Standard
-
-\end_layout
-
 \begin_layout Section
 Using CitComS.py
 \end_layout
@@ -2457,48 +2450,67 @@
 \family typewriter
 /usr/local/bin
 \family default
-), 
+), the program 
 \family typewriter
-citcomsregional.sh
+citcoms
 \family default
- is used for regional models and 
+ is used for running both the regional and full spherical models.
+ The program 
 \family typewriter
-citcomsfull.sh
+citcoms
 \family default
- for full spherical models.
- The program
-\family typewriter
- citcomsregional.sh
-\family default
- can be executed without any command line options and will run with default
- parameters; 
-\family typewriter
-citcomsfull.sh
-\family default
- requires at least 12 processors to run.
- To do so, you must specify the 
-\family typewriter
-launcher
-\family default
- parameters, which are called "facilities."
+ can be executed without any command line options and will run a regional
+ model with default parameters.
+ It can also run a full spherical model if the correct parameters are set (see
+ 
+\begin_inset LatexCommand \vref{sec:Cookbook-1:-Global}
+
+\end_inset
+
+for an example).
 \end_layout
 
 \begin_layout Standard
-Since you will likely want to specify the parameters of your CitComS runs,
- you will need to alter both computational details (such as the number of
- time steps) and controlling parameters specific to your problem (such as
- the Rayleigh number).
- Most of the properties you will set using CitComS.py have identical names
- as the parameters for the old CitComS.
- 
+On input, CitComS.py needs numerous parameters specified (see Appendix 
+\begin_inset LatexCommand \vref{cha:Appendix-A:-Facilities,}
+
+\end_inset
+
+ for a full list).
+ All parameters have sensible default values.
+ Since you will likely want to specify the parameters of your CitComS.py
+ runs, you will need to alter both computational details (such as the number
+ of time steps) and controlling parameters specific to your problem (such
+ as the Rayleigh number).
+ These input parameters, or properties in the Pyre terminology, are grouped
+ under several Pyre components.
+ You can think 
 \end_layout
 
+\begin_layout Standard
+Most of the properties you will set using CitComS.py have the same names
+ as the parameters of the old CitComS, which are described in 
+\begin_inset LatexCommand \vref{cha:Appendix-A:-Facilities,}
+
+\end_inset
+
+.
+\end_layout
+
 \begin_layout Section
-Changing Parameters
+Changing Parameters 
 \end_layout
 
 \begin_layout Standard
-Pyre has the following syntax to change parameters from their default values.
+(TODO: precedence of various input methods)
+\end_layout
+
+\begin_layout Subsection
+From the Command Line
+\end_layout
+
+\begin_layout Standard
+Pyre has the following syntax to change properties from their default values.
  To change the value of a property of a component, use:
 \end_layout
 
@@ -2516,13 +2528,49 @@
 \end_layout
 
 \begin_layout Standard
-A different component can be attached to a facility by:
+Each facility has a default component attached to it.
+ A different component can be attached to a facility by:
 \end_layout
 
 \begin_layout LyX-Code
 --[facility]=[new_component] 
 \end_layout
 
+\begin_layout Subsection
+In a .cfg File
+\end_layout
+
+\begin_layout Standard
+Entering all those parameters via the command line involves the risk of
+ typographical errors, which can lead to undesired results.
+ So, you may find it easier to write a brief input file that contains the
+ parameters.
+ Since v2.1, the input parameters can be saved in a .cfg file.
+ The file has a format similar to that of a Windows INI file.
+ It is composed of one or more sections.
+ Each section looks like:
+\end_layout
+
+\begin_layout LyX-Code
+[CitcomS.component1.component2]
+\newline
+# this is a comment
+\newline
+property1 = value1
+\newline
+property2
+ = value2
+\end_layout
+
+\begin_layout Subsection
+System or Personal Default Parameters in a .pml File
+\end_layout
+
+\begin_layout Standard
+(TODO: where is the system-wide pml file, where is the personal file, and
+ the syntax of it?)
+\end_layout
+
 \begin_layout Section
 Coordinate System and Mesh
 \end_layout
@@ -2534,11 +2582,11 @@
 \emph on
 theta
 \emph default
- is the colatitude measured from the north pole, 
+ (or x) is the colatitude measured from the north pole, 
 \emph on
 fi
 \emph default
- is the east longitude, and z is the radius.
+ (or y) is the east longitude, and z is the radius.
  
 \emph on
 theta
@@ -2548,8 +2596,11 @@
 fi
 \emph default
  are in units of radians.
- Figure 3.1 shows the organization of the mesh in a regional problem.
- 
+ Figure 3.1(TODO: Sue, please please update '3.1' to an automatic tag) shows
+ the organization of the mesh in a regional problem.
+ The numbering of the nodes is z-direction first, then x-direction, then
+ y-direction.
+ This numbering convention is used for the input and output data.
 \end_layout
 
 \begin_layout Standard
@@ -2557,7 +2608,7 @@
 placement H
 wide false
 sideways false
-status collapsed
+status open
 
 \begin_layout Description
 \begin_inset Graphics
@@ -2569,9 +2620,7 @@
 \end_layout
 
 \begin_layout Caption
-Global\InsetSpace ~
-Node\InsetSpace ~
-Numbering.
+Global Node Numbering.
  Left: Global node numbering starts at the base of arrow A (theta
 \size scriptsize
 min
@@ -2620,45 +2669,21 @@
 \end_layout
 
 \begin_layout Section
-A Simple CitComS Test
+Uniprocessor Example
 \end_layout
 
 \begin_layout Standard
 CitComS runs similarly in full spherical or regional modes.
  For the purpose of this example, you will perform a test run of the regional
  version.
- Instructions for using the full version can be found at 
-\begin_inset LatexCommand \vref{sec:Cookbook-1:-Global}
-
-\end_inset
-
-.
+ Execute the following on the command line:
 \end_layout
 
-\begin_layout Standard
-Execute the following on the command line:
-\end_layout
-
-\begin_layout Paragraph
-\begin_inset LatexCommand \label{sub:Uniprocessor-CitComS-from}
-
-\end_inset
-
-Example: Uniprocessor CitComS from the Command Line
-\end_layout
-
 \begin_layout LyX-Code
-citcomsregional.sh --steps=10 
+$ citcoms --steps=10 --controller.monitoringFrequency=5 --solver.datafile=example0
+ --solver.mesher.nodex=17 --solver.mesher.nodey=17
 \end_layout
 
-\begin_layout LyX-Code
---controller.monitoringFrequency=5 --solver.datafile=example 
-\end_layout
-
-\begin_layout LyX-Code
---solver.mesher.nodex=17 --solver.mesher.nodey=17
-\end_layout
-
 \begin_layout Standard
 This runs a default convection problem in a regional domain for 10 time
  steps and with a mesh of 17 
@@ -2670,318 +2695,461 @@
 \end_inset
 
  9 nodal points.
- See Chapter 
-\begin_inset LatexCommand \vref{cha:Postprocessing-and-Graphics}
-
-\end_inset
-
- for instructions to immediately visualize your results with OpenDX.
+ The model results are written to files 
+\family typewriter
+example0.*
+\family default
+ with an interval of 5 time steps.
+ 
 \end_layout
 
-\begin_layout Section
-Multiprocessor Example
-\end_layout
-
 \begin_layout Standard
-On input, CitComS needs numerous parameters specified (see Appendix A for
- full list).
- Most parameters have sensible default values.
- Depending on those input parameters, CitComS could also require numerous
- input files, such as velocity boundary conditions which could be used for
- time dependent, plate tectonic problems.
- Finally, CitComS will potentially output a large number of files which
- will be spread throughout the file systems contained in the individual
- nodes of your Beowulf.
- This means that you will have to organize your directories carefully when
- running CitComS so that you can find your way through these files as well
- as use a post-processing program contained with this distribution.
- (In the example that follows, these output files would be placed in 
+Instead of writing the input parameters on the command line, you can put
+ them in a .cfg file.
+ When you downloaded the CitComS.py source, this included an 
 \family typewriter
-/scratch/username
+examples
 \family default
- with a filename prefix 
+ directory.
+ In this directory, you will find a .cfg file equivalent to the previous example,
+ 
 \family typewriter
-example1
+example0.cfg
 \family default
- on each individual node.)
+.
+ You can run the model using:
 \end_layout
 
-\begin_layout Standard
-It is possible to input the parameters relevant to your test run from the
- command line:
+\begin_layout LyX-Code
+$ citcoms 
+\family typewriter
+example0.cfg
 \end_layout
 
 \begin_layout Paragraph
-Example: Multiprocessor CitComS from the Command Line
+Example: Uniprocessor CitComS.py, example0.cfg
 \end_layout
 
 \begin_layout LyX-Code
-$ citcomsregional.sh --launcher.nodes=4 
-\end_layout
+[CitcomS]
+\newline
+steps = 10
+\newline
 
-\begin_layout LyX-Code
---launcher.nodegen="n%03d" --launcher.nodelist=[101-102] 
-\end_layout
+\newline
+[CitcomS.controller]
+\newline
+monitoringFrequency = 5
+\newline
 
-\begin_layout LyX-Code
---solver.mesher.nprocx=2 --solver.mesher.nprocy=2 
+\newline
+[CitcomS.solver]
+\newline
+data
+file = example0
+\newline
+
+\newline
+[CitcomS.solver.mesher]
+\newline
+nodex = 17
+\newline
+nodey = 17
 \end_layout
 
-\begin_layout LyX-Code
---solver.datafile="/scratch/username/example1"
+\begin_layout Section
+Multiprocessor Example
 \end_layout
 
 \begin_layout Standard
-However, entering all those parameters via the command line involves the
- risk of typographical errors, which can lead to undesired results.
- So, you may find it easier to write a brief shell script that contains
- the parameters.
- The file should have the suffix 
+To run this example, you should be on a Beowulf cluster with four or more
+ processors, or at a supercomputing center, and be in the directory
+ in which the input file is located, in this case, the 
 \family typewriter
-.sh
-\family default
-.
- When you downloaded the CitComS source, this included an 
-\family typewriter
 examples
 \family default
  directory.
- In this directory, you will find a script called 
-\family typewriter
-example1.sh
-\family default
-.
+ CitComS.py has been used extensively in both environments, on up to several
+ hundred processors.
+ How to run a multiprocessor CitComS.py model depends on your hardware and
+ software settings.
+ For example, it depends on whether a batch system is used, what the
+ computers in the cluster are named, and how the file system is organized.
+ This section will lead you through the different settings of a parallel
+ environment.
  
 \end_layout
 
 \begin_layout Paragraph
-Example: Simple Regional CitComS Shell Script, example1.sh 
+Example: Multiprocessor CitComS.py, example1.cfg 
 \end_layout
 
 \begin_layout LyX-Code
-#!/bin/sh
+[CitcomS]
 \newline
+steps = 71
+\newline
 
-\end_layout
+\newline
+[CitcomS.controller]
+\newline
+monitoringFrequency = 10
+\newline
 
-\begin_layout LyX-Code
-citcomsregional.sh 
-\backslash
+\newline
+[CitcomS.solver]
+\newline
+dat
+afile = example1
+\newline
 
+\newline
+[CitcomS.solver.mesher]
+\newline
+nprocx =  2
+\newline
+nprocy =  2
+\newline
+nodex  = 17
+\newline
+nodey
+  = 17
+\newline
+nodez  =  9                                                      
+                          
 \end_layout
 
-\begin_layout LyX-Code
-
-\backslash
-
+\begin_layout Standard
+This example uses 2 processors in colatitude (x-direction), 2 in longitude
+ (y-direction), and 1 in radius (z-direction), i.e.,
+ it uses 4 processors in total.
+ In addition, there will be 17 nodes in x, 17 nodes in y, and 9 nodes in
+ z.
+ The model will run for 71 time steps and output the results every 10 time
+ steps.
 \end_layout
 
-\begin_layout LyX-Code
---steps=71 
-\backslash
-
+\begin_layout Standard
+It is important to realize that within the example script (and in the finite
+ element method, FEM) the term "node" refers to the mesh points defining
+ the corners of the elements.
+ In 
+\family typewriter
+example1.cfg
+\family default
+, this is indicated with:
 \end_layout
 
 \begin_layout LyX-Code
-
-\backslash
- 
+nodex  = 17
+\newline
+nodey  = 17
+\newline
+nodez  =  9 
 \end_layout
 
-\begin_layout LyX-Code
---controller.monitoringFrequency=10 
-\backslash
-
+\begin_layout Standard
+These quantities refer to the total number of FEM nodes in a given direction
+ for the complete problem, and for the example it works out that within
+ a given processor there will be 9 x 9 x 9 nodes.
+ Note that in the x-direction (or y) there are 17 nodes for the entire
+ problem, with one node shared between the two processors (9 + 9 - 1 = 17).
+ This shared node is duplicated in the two adjacent processors.
+ Unfortunately, "nodes" sometimes also refers to the individual computers
+ which make up the Beowulf cluster.
+ In the example scripts, the number of processors is indicated with:
 \end_layout
 
 \begin_layout LyX-Code
-
-\backslash
-
+nprocx =  2
+\newline
+nprocy =  2
 \end_layout
 
-\begin_layout LyX-Code
---launcher.nodes=4 
-\backslash
-
+\begin_layout Subsection
+File Systems and Output Formats
 \end_layout
 
-\begin_layout LyX-Code
---launcher.nodegen="n%03d" 
-\backslash
-
+\begin_layout Standard
+CitComS.py will potentially output a large number of ASCII files.
+ This means that you will have to organize your directories carefully when
+ running CitComS.py so that you can find your way through these files as
+ well as use a post-processing program included with this distribution.
+ 
 \end_layout
 
-\begin_layout LyX-Code
---launching.nodelist=[101-102,101-102] 
-\backslash
+\begin_layout Standard
+You might have a local hard disk on every machine in the Beowulf cluster.
+ Each hard disk is mounted locally to the machine.
+ This scenario is referred to as a local file system in this section.
+ Or you might be using some kind of parallel file system on your cluster
+ (e.g., NFS, GPFS, PVFS, to name a few), which is mounted on all machines
+ on the cluster.
+ Usually your home directory is mounted on a parallel file system.
+ The local file system is usually more cost- and time-efficient than the
+ parallel file system.
  
 \end_layout
 
-\begin_layout LyX-Code
-
-\backslash
-
+\begin_layout Standard
+If you want CitComS.py to write its output to the local hard disks, you need
+ to have a common directory structure for all the local hard disks.
+ For example, if the directory 
+\family typewriter
+/scratch/username
+\family default
+ (you should substitute 
+\family typewriter
+username
+\family default
+ with your own username) exists on all local hard disks, you can run the example
+ script with:
 \end_layout
 
 \begin_layout LyX-Code
---solver.datafile=/scratch/username/example1 
-\backslash
+$ citcoms 
+\family typewriter
+example1.cfg
+\family default
+ --solver.datafile=/scratch/username/example1
+\end_layout
 
+\begin_layout Standard
+The additional command line option will override the 
+\family typewriter
+datafile
+\family default
+ property in the input file.
+ Then, the output files would be placed in 
+\family typewriter
+/scratch/username
+\family default
+ with a filename prefix 
+\family typewriter
+example1
+\family default
+ on each individual machine.
 \end_layout
 
-\begin_layout LyX-Code
-
-\backslash
-
+\begin_layout Standard
+If you want CitComS.py to write its output to the parallel file system, you
+ have several choices.
+ You can run the example script with:
 \end_layout
 
 \begin_layout LyX-Code
---solver.mesher.nprocx=2 
-\backslash
-
+$ citcoms 
+\family typewriter
+example1.cfg
+\family default
+ --solver.datafile=/home/username/example1
 \end_layout
 
-\begin_layout LyX-Code
---solver.mesher.nprocy=2 
-\backslash
- 
+\begin_layout Standard
+Then, the output files would be placed in your home directory with a filename
+ prefix 
+\family typewriter
+example1
+\family default
+.
+ The potential problem with this approach is that the directory 
+\family typewriter
+/home/username
+\family default
+ will be flooded with hundreds or even tens of thousands of files if you are
+ running a model using several tens of processors for thousands of time steps.
+ Alternatively, you can have each machine write its output to its own directory,
+ according to its MPI rank.
+ You can run the example script with:
 \end_layout
 
 \begin_layout LyX-Code
---solver.mesher.nodex=17 
-\backslash
- 
+$ citcoms 
+\family typewriter
+example1.cfg
+\family default
+ --solver.datadir=/home/username --solver.output.output_format=ascii
 \end_layout
 
-\begin_layout LyX-Code
---solver.mesher.nodey=17 
-\backslash
+\begin_layout Standard
+You will see four new directories 
+\family typewriter
+/home/username/0
+\family default
+, 
+\family typewriter
+/home/username/1
+\family default
+, 
+\family typewriter
+/home/username/2
+\family default
+, and 
+\family typewriter
+/home/username/3
+\family default
+.
+ The processor of MPI rank 0 will write its output in 
+\family typewriter
+/home/username/0
+\family default
+ with a filename prefix 
+\family typewriter
+example1
+\family default
+ (defined by the property 
+\family typewriter
+datafile
+\family default
+ inside 
+\family typewriter
+example1.cfg
+\family default
+) and so on.
  
 \end_layout
 
-\begin_layout LyX-Code
---solver.mesher.nodez=9 
-\backslash
-
+\begin_layout Standard
+The last choice is the most powerful one.
+ Instead of writing many ASCII files, CitComS.py can write its results to
+ a single HDF5 (Hierarchical Data Format) file.
+ The HDF5 file takes less disk space than all the ASCII files combined and does
+ not require additional post-processing to be visualized in OpenDX.
+ To use this feature, you need to compile CitComS.py with the parallel
+ HDF5 library (TODO: link to chapter 2).
+ You can run the example script with:
 \end_layout
 
 \begin_layout LyX-Code
-# End of file
+$ citcoms 
+\family typewriter
+example1.cfg
+\family default
+ --solver.datafile=/home/username/example1 --solver.output.output_format=hdf5
 \end_layout
 
 \begin_layout Standard
-This is a basic problem with 2 processors in colatitude (x-coordinate),
- 2 in longitude (y-direction), and 1 in the radial (z-direction).
- In addition, there will be 17 nodes in x, 17 nodes in y, and 9 nodes in
- z.
-\end_layout
-
-\begin_layout Standard
-It is important to realize that within the code (and in finite element analysis)
- the term "node" refers to the mesh points defining the corners of the elements.
- Unfortunately, "nodes" also refers to the individual computers which make
- up the Beowulf PC cluster.
- In 
+Then, the output file will be 
 \family typewriter
-example1.sh
+/home/username/example1.h5
 \family default
-, this is indicated with:
+.
+ See Chapter (TODO: link to hdf5 chapter) for more information on how to
+ work with HDF5 output.
 \end_layout
 
-\begin_layout LyX-Code
---solver.mesher.nprocx=2 
-\backslash
-
+\begin_layout Subsection
+Launchers
 \end_layout
 
-\begin_layout LyX-Code
---solver.mesher.nprocy=2 
-\backslash
- 
+\begin_layout Standard
+If you have used MPI before, you will know that 
+\family typewriter
+mpirun
+\family default
+ requires several command line options to launch a parallel job.
+ Or, if you have used a batch system, you will know that it
+ requires you to write a script to launch a job.
+ Fortunately, launching a parallel CitComS.py job is simplified by the 
+\family typewriter
+launcher
+\family default
+ component.
 \end_layout
 
-\begin_layout LyX-Code
---solver.mesher.nodex=17 
-\backslash
- 
+\begin_layout Subsubsection
+Using MPICH or LAM/MPI
 \end_layout
 
-\begin_layout LyX-Code
---solver.mesher.nodey=17 
-\backslash
-
+\begin_layout Standard
+You need to specify which machines the job will run on.
+ Suppose the machines on your cluster are named: n001, n002, \SpecialChar \ldots{}
+, etc., and
+ you want to run the job on machines n001, n003, n004, and n005 (maybe n002
+ is down for the moment).
+ To run the example, invoke the following:
 \end_layout
 
 \begin_layout LyX-Code
---solver.mesher.nodez=9 
-\backslash
- 
+$ citcoms example1.cfg --launcher.nodegen="n%03d" --launcher.nodelist=[1,3-5]
 \end_layout
 
 \begin_layout Standard
-These quantities refer to the total number of FEM nodes in a given direction
- for the complete problem and for the example it works out that within a
- given processor there will be 9 x 9 x 9 nodes.
- Note that in the x-direction (or y) that for the entire problem there are
- 17 nodes and there is one node shared between two processors.
- This shared node is duplicated in two adjacent processors.
- The global numbering of FEM nodes is z-direction first, then x-direction,
- then y-direction.
- This numbering convention is used for the input and output data.
-\end_layout
-
-\begin_layout Standard
-In order to run this example you should be on your front end and be in the
- directory in which the input file is located, in this case, the 
+The 
 \family typewriter
-examples
+launcher.nodegen
 \family default
- directory.
- To run the test example, invoke the following:
+ property is a printf-style format string, used in conjunction with 
+\family typewriter
+launcher.nodelist
+\family default
+ to generate the list of machine names.
+ The 
+\family typewriter
+launcher.nodelist
+\family default
+ property is a comma-separated list of machine numbers (or ranges) in square brackets.
 \end_layout
 
-\begin_layout LyX-Code
-$ ./example1.sh
-\end_layout
-
 \begin_layout Standard
-Once you have completed your run, you will notice that it generates a file
- called 
+If you are using MPICH, you will notice that a machine file 
 \family typewriter
 mpirun.nodes
 \family default
-.
- It will contain a list of the nodes where CitComS has run.
- You may want to login to one of the nodes to check that the data appears
- to have been properly generated.
- You should also check the log file for any error messages.
- 
+ is generated.
+ It will contain a list of the nodes where CitComS.py has run.
+ Save this machine file, as it will be useful in the post-processing step.
 \end_layout
 
 \begin_layout Standard
-If you receive a message such as 
+If you are using LAM/MPI, no machine file is generated.
+ In addition, you can pass additional environment variables to LAM/MPI via:
 \end_layout
 
 \begin_layout LyX-Code
+$ citcoms example1.cfg --launcher.environments=[env_var1,env_var2]
+\end_layout
 
-\family typewriter
-bash: --solver.mesher.nodez=9: command not found
+\begin_layout Subsubsection
+Using a Batch System
 \end_layout
 
 \begin_layout Standard
-chances are your script is missing a line continuation or has an extra space.
- However, most of the time CitComS will just run using the default values
- -- even if it finds a typographical error such as "nnode" instead of "node."
- So, along with double-checking your parameters as you type them in, you
- should check your results to be sure they make sense.
- 
+First, you should follow the instructions in Section 
+\begin_inset LatexCommand \vref{sec:Batch-System-Configuration}
+
+\end_inset
+
+.
+ Once the configuration is done, you can run the example simply by:
 \end_layout
 
+\begin_layout LyX-Code
+$ citcoms example1.cfg
+\end_layout
+
+\begin_layout Subsection
+Monitoring Your Jobs
+\end_layout
+
 \begin_layout Standard
-Following your successful run, you will want to retrieve the output files
+Once launched, CitComS.py will print the progress of the model to the standard
+ error stream (stderr).
+ Usually, stderr is directed to your terminal so that you can monitor
+ the progress.
+ On some systems, stderr is redirected to a file.
+ In any case, the progress is always saved in a log file by the rank-0 processor.
+ You should check the log file for any error messages.
+ You may also want to log in to one of the machines to check that the data
+ appears to have been properly generated.
+ Following your successful run, you will want to retrieve the output files
  from all the nodes and process them so they can be visualized with the
- visualization program OpenDX and proceed to the next chapter.
+ visualization program OpenDX and proceed to Chapter 
+\begin_inset LatexCommand \vref{cha:Postprocessing-and-Graphics}
+
+\end_inset
+
+.
 \end_layout
 
 \begin_layout Section
@@ -3133,67 +3301,55 @@
 \end_layout
 
 \begin_layout Standard
-These output files need to be post-processed before you can do the visualization.
- There are two scripts and both can post-process (retrieve and combine)
- CitComS output: 
+When you execute a CitComS run, your input parameters will be saved in a
+ file 
 \family typewriter
-autocombine.py
+PID.parameters
 \family default
- and 
+, where 
 \family typewriter
-batchcombine.py
+PID
 \family default
-.
- It is recommened to use 
+ is a five-digit number for the process ID.
+ This file contains most of the input parameters, which can be useful for
+ archiving and post-processing.
+ (XXX: TODO) 
+\end_layout
+
+\begin_layout Standard
+These output files need to be post-processed before you can do the visualization.
+ There is a script that can post-process (retrieve and combine) CitComS
+ output: 
 \family typewriter
 autocombine.py
 \family default
- over 
-\family typewriter
-batchcombine.py
-\family default
-, as the former is easier to use.
- 
+.
+ The script will retrieve CitComS data to the current directory and combine
+ the output into a few files.
 \end_layout
 
 \begin_layout Standard
-To use 
+Using 
 \family typewriter
 autocombine.py
 \family default
-, you have to run CitComS.py in a slightly different way.
- CitComS.py prints out the input parameters to the screen.
- That printout can be redirected to a file, which can be used later to retrieve
- input parameters.
- This is done with: 
+, you can retrieve and combine the data of time step 10 with: 
 \end_layout
 
 \begin_layout LyX-Code
-$ my_citcoms.py_script > input_param 
+$ autocombine.py mpirun.nodes PID.parameters 10 
 \end_layout
 
 \begin_layout Standard
-Then, you can retrieve and combine data of step 10 by using: 
-\end_layout
-
-\begin_layout LyX-Code
-$ autocombine.py mpirun.nodes input_param 10 
-\end_layout
-
-\begin_layout Standard
 This reads the MPI machinefile (
 \family typewriter
 mpirun.nodes
 \family default
 ) and CitComS.py input parameters (
 \family typewriter
-input_param
+PID.param
 \family default
-), then calls 
-\family typewriter
-batchcombine.py
-\family default
- to do the actual job.
+eters), then calls other scripts to do the actual job.
  The general usage for 
 \family typewriter
 autocombine.py
@@ -3212,130 +3368,22 @@
 \end_layout
 
 \begin_layout Standard
-If you do not have the input parameters saved to a file, you have to use
- 
+If your Beowulf cluster uses ssh (rather than rsh) to access the computation
+ nodes, you have to manually edit 
 \family typewriter
-batchcombine.py
+PREFIX/bin/batchpaste.sh
 \family default
- to retrieve and combine the data.
- The general usage for 
-\family typewriter
-batchcombine.py
-\family default
- is: 
-\end_layout
-
-\begin_layout LyX-Code
-$ batchcombine.py [machinefile] [modeldir] [modelname] 
-\backslash
-
-\end_layout
-
-\begin_layout LyX-Code
-[timestep] [nodex] [nodey] [nodez] [n-surf-cap] 
-\backslash
+ and replace 'rsh' with 'ssh' in the script.
  
 \end_layout
 
-\begin_layout LyX-Code
-[nprocx] [nprocy] [nprocz] 
-\end_layout
-
 \begin_layout Standard
-where 
-\end_layout
-
-\begin_layout Itemize
-
+If your cluster uses a parallel file system and all of your data resides
+ in one directory (e.g., 
 \family typewriter
-machinefile
-\family default
- is the file with a list of all the machines you are using, which under
- normal circumstances are the same as the launching nodes.
-\end_layout
-
-\begin_layout Itemize
-
-\family typewriter
-modeldir
-\family default
- is the scratch directory where all your files are located.
- This should correspond to the first part of the 
-\family typewriter
-datafile
-\family default
- entry, e.g., 
-\family typewriter
 /scratch/username
 \family default
-.
- 
-\family typewriter
-modelname
-\family default
- is the model prefix for example 
-\family typewriter
-test_case
-\end_layout
-
-\begin_layout Itemize
-
-\family typewriter
-timestep
-\family default
- is the time step you want to export.
- This corresponds to the last term in the filename of the velocity file.
-\end_layout
-
-\begin_layout Itemize
-
-\family typewriter
-nodex
-\family default
-, 
-\family typewriter
-nodey
-\family default
-, 
-\family typewriter
-nodez
-\family default
- are the same values as those used in the input file.
-\end_layout
-
-\begin_layout Itemize
-
-\family typewriter
-n-surf-cap
-\family default
- is equal to 1 for regional CitComS and 12 for full CitComS.
-\end_layout
-
-\begin_layout Itemize
-
-\family typewriter
-nprocx
-\family default
-, 
-\family typewriter
-nprocy
-\family default
-, 
-\family typewriter
-nprocz
-\family default
- are the same values as those used in the input file.
-\end_layout
-
-\begin_layout Standard
-If your Beowulf cluster uses ssh (rather than rsh) to access the computation
- nodes, you have to manually edit 
-\family typewriter
-bin/batchpaste.sh
-\family default
- and replace 'rsh' to 'ssh' in the file.
- If your cluster forbids access to the computation nodes, you can still
- post-process your output files (if they all reside in one directory) by
+) in the parallel file system, you can post-process your output files by
  following the instructions in the next section.
  
 \end_layout
@@ -3345,10 +3393,6 @@
 \family typewriter
 autocombine.py
 \family default
- or 
-\family typewriter
-batchcombine.py
-\family default
  has been run, you should end up with 2 files (or 24 files for full CitComS)
  that resemble these: 
 \end_layout
@@ -3407,26 +3451,6 @@
 [step1] [step2 or more ...] 
 \end_layout
 
-\begin_layout Standard
-or 
-\end_layout
-
-\begin_layout LyX-Code
-$ batchcombine.py localhost [modeldir] [modelname] [timestep] 
-\backslash
-
-\end_layout
-
-\begin_layout LyX-Code
-[nodex] [nodey] [nodez] [n-surf-cap] 
-\backslash
- 
-\end_layout
-
-\begin_layout LyX-Code
-[nprocx] [nprocy] [nprocz] 
-\end_layout
-
 \begin_layout Section
 Using OpenDX for Regional Visualization
 \end_layout
@@ -3531,11 +3555,11 @@
  field.
  Then, edit the filename with the cap number replaced by 
 \family typewriter
-%d
+%02d
 \family default
 , e.g.,
 \family typewriter
- test-case.cap%d.10.general
+ test-case.cap%02d.10.general
 \family default
 .) In the pulldown menu, select
 \family typewriter
@@ -3564,7 +3588,7 @@
 placement H
 wide false
 sideways false
-status open
+status collapsed
 
 \begin_layout Standard
 \begin_inset Graphics
@@ -3670,9 +3694,7 @@
 
 \end_inset
 
-Cookbook 1: Global Model Using 
-\family sans
-fullcitcoms
+Cookbook 1: Global Model
 \end_layout
 
 \begin_layout Subsection
@@ -3785,9 +3807,9 @@
 \end_layout
 
 \begin_layout Itemize
-the location of the output, e.g.,
+the prefix of the output filenames, e.g.,
 \family typewriter
- /scratch/username/cookbook1
+ cookbook1
 \end_layout
 
 \begin_layout Itemize
@@ -3818,12 +3840,8 @@
 , that is actually the default value.
 \end_layout
 
-\begin_layout Standard
-
-\end_layout
-
 \begin_layout LyX-Code
---solver.datafile="/scratch/username/cookbook1" 
+--solver.datafile="cookbook1" 
 \backslash
  
 \end_layout
@@ -3858,10 +3876,6 @@
 , it is run by typing 
 \end_layout
 
-\begin_layout Standard
-
-\end_layout
-
 \begin_layout LyX-Code
 $ ./cookbook1.sh
 \end_layout
@@ -3929,7 +3943,7 @@
 \end_layout
 
 \begin_layout LyX-Code
---solver.datafile="/scratch/username/Test/cookbook1" 
+--solver.datafile="cookbook1" 
 \backslash
 
 \end_layout
@@ -4075,12 +4089,7 @@
 \begin_layout Standard
 You would like to solve for thermal convection and with velocity boundary
  conditions on the top surface within a given region of a sphere.
- As with 
-\begin_inset LatexCommand \vref{sub:Uniprocessor-CitComS-from}
-
-\end_inset
-
-, you will need to modify a script that calls the regional version of CitComS,
+ You will need to modify a script that calls the regional version of CitComS,
  
 \family typewriter
 citcomsregional.sh
@@ -4400,7 +4409,7 @@
 \end_layout
 
 \begin_layout LyX-Code
---solver.datafile=/scratch/username/cookbook2 
+--solver.datafile=cookbook2 
 \backslash
 
 \end_layout
@@ -4580,7 +4589,7 @@
 \end_layout
 
 \begin_layout LyX-Code
---solver.datafile=/scratch/username/cookbook3 
+--solver.datafile=cookbook3 
 \backslash
 
 \end_layout
@@ -4736,7 +4745,7 @@
 \end_layout
 
 \begin_layout LyX-Code
---solver.datafile=/scratch/username/cookbook3 
+--solver.datafile=cookbook3 
 \backslash
 
 \end_layout
@@ -5070,7 +5079,7 @@
 \end_layout
 
 \begin_layout LyX-Code
---solver.datafile=/scratch/username/cookbook4 
+--solver.datafile=cookbook4 
 \backslash
  
 \end_layout
@@ -5379,10 +5388,6 @@
  starting age for the model: 
 \end_layout
 
-\begin_layout Standard
-
-\end_layout
-
 \begin_layout LyX-Code
 --solver.param.vel_bound_file=$PWD/velocity/vel.dat 
 \backslash
@@ -5493,7 +5498,7 @@
 \end_layout
 
 \begin_layout LyX-Code
---solver.datafile=/scratch/username/cookbook5 
+--solver.datafile=cookbook5 
 \backslash
 
 \end_layout
@@ -6066,7 +6071,7 @@
 
 \begin_layout Standard
 \begin_inset Tabular
-<lyxtabular version="3" rows="8" columns="2">
+<lyxtabular version="3" rows="9" columns="2">
 <features islongtable="true">
 <column alignment="left" valignment="top" leftline="true" width="1.75in">
 <column alignment="left" valignment="top" leftline="true" rightline="true" width="3.5in">
@@ -6075,7 +6080,7 @@
 \begin_inset Text
 
 \begin_layout Standard
-datafile=""
+datadir="."
 \end_layout
 
 \end_inset
@@ -6086,18 +6091,44 @@
 \begin_layout Standard
 
 \family typewriter
-datafile
+datadir
 \family default
- controls the location of output files and the prefix of their names.
- In this case the directory /scratch/ must be created a priori.
- Files such as test.xxx, e.g., 
+ controls the location of output files.
+ 
 \end_layout
 
+\end_inset
+</cell>
+</row>
+<row topline="true">
+<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
+\begin_inset Text
+
 \begin_layout Standard
+datafile="regtest"
+\end_layout
 
+\end_inset
+</cell>
+<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
+\begin_inset Text
+
+\begin_layout Standard
+
 \family typewriter
-\size small
-datafile="/scratch/test"
+datafile
+\family default
+ controls the prefix of output filenames.
+ For example, this setting produces files such as regtest.xxx.
+ The prefix cannot contain the 
+\begin_inset Quotes sld
+\end_inset
+
+/
+\begin_inset Quotes srd
+\end_inset
+
+ character.
 \end_layout
 
 \end_inset
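
Note on the datadir/datafile split documented in the table hunk above: the sketch below is only an illustration of how a location (datadir) and a filename prefix (datafile) could be combined into an output path, and why a "/" in the prefix is rejected. The helper name output_path and the "prefix.suffix" naming pattern are assumptions made for this example; they are not taken from the CitComS.py source.

```python
import os


def output_path(datadir, datafile, suffix):
    """Hypothetical helper: join an output directory and a filename prefix.

    datadir  -- directory that holds the output files (e.g. ".")
    datafile -- filename prefix only (e.g. "regtest"); must not contain "/"
    suffix   -- remainder of the filename (the "xxx" in regtest.xxx)
    """
    # The manual states that the prefix cannot contain "/", because the
    # directory part now belongs in datadir.
    if "/" in datafile:
        raise ValueError("datafile is a filename prefix and cannot contain '/'")
    return os.path.join(datadir, "%s.%s" % (datafile, suffix))


if __name__ == "__main__":
    # New-style settings: datadir="." and datafile="regtest"
    print(output_path(".", "regtest", "xxx"))
    # Old-style value such as "/scratch/test" is rejected as a prefix
    try:
        output_path(".", "/scratch/test", "xxx")
    except ValueError as err:
        print("rejected:", err)
```

Under this reading, an old setting like datafile="/scratch/test" would be expressed as datadir="/scratch" together with datafile="test".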


