These scalability tests were run using CitcomS 3.2.0 with the default configuration. The mesh for these tests is a regional cap with 129×129×129 nodes, so the total number of velocity unknowns is 129^3 × 3 ≈ 6.4 million. The model is run for 11 time steps, and the result reported is the total wall clock time. Each node on this cluster has two Xeon 5680 series 3.33 GHz hex-core processors with a 12 MB unified L3 cache and 24 GB RAM, for a total of 12 cores per node. The interconnect is QDR InfiniBand.
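The problem size quoted above can be checked with a few lines of arithmetic. This is only a sketch of the count stated in the text (three velocity components per mesh node); the variable names are illustrative and do not come from CitcomS.

```python
# Mesh dimensions of the regional cap used in these tests.
nodes_per_side = 129
velocity_components = 3  # one unknown per velocity component at each node

total_nodes = nodes_per_side ** 3
velocity_unknowns = total_nodes * velocity_components

print(f"{total_nodes:,} nodes, {velocity_unknowns:,} velocity unknowns")
# prints "2,146,689 nodes, 6,440,067 velocity unknowns"
```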
|Partition||Total Procs||Wall Time (sec)||Speedup||Scalability|
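The Speedup and Scalability columns are not defined on this page. A minimal sketch of how such columns are commonly derived from wall times follows, assuming speedup is the baseline wall time divided by the wall time of a run, and scalability is that speedup divided by the increase in processor count (parallel efficiency). Both definitions and the function name `speedup_table` are assumptions, and the timings in the usage lines are placeholders, not measurements from these tests.

```python
def speedup_table(wall_times):
    """Build (procs, wall_time, speedup, efficiency) rows.

    wall_times maps total processor count -> wall time in seconds.
    The run with the fewest processors is used as the baseline.
    """
    base_procs = min(wall_times)
    base_time = wall_times[base_procs]
    rows = []
    for procs in sorted(wall_times):
        speedup = base_time / wall_times[procs]
        # Efficiency: achieved speedup relative to the ideal (linear) speedup.
        efficiency = speedup / (procs / base_procs)
        rows.append((procs, wall_times[procs], speedup, efficiency))
    return rows


if __name__ == "__main__":
    # Placeholder timings for illustration only.
    placeholder = {1: 100.0, 2: 50.0, 4: 40.0}
    for procs, seconds, speedup, efficiency in speedup_table(placeholder):
        print(f"{procs:>5} {seconds:>10.1f} {speedup:>8.2f} {efficiency:>8.1%}")
```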
The input file is available here: input.sample.zip (2 KB, uploaded by Denise Kwong). It is currently configured for 1×1×1 processors; to use a different processor division, change the nprocx, nprocy, and nprocz parameters. You must also create a folder named "scratch" in the working directory for the output files. The input file is written for the non-Python version of CitcomS, located at CitcomS-3.2.0/bin/CitcomSRegional.
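The preparation steps above can be scripted. The sketch below assumes the input file stores each setting on its own line as `name=value` (for example `nprocx=1`) and that the total processor count for a regional run is the product of the three divisions. The helper names `set_partition` and `prepare_run` are hypothetical and not part of CitcomS.

```python
import re
from pathlib import Path


def set_partition(input_text, nprocx, nprocy, nprocz):
    """Return input_text with the nprocx/nprocy/nprocz values replaced.

    Assumes 'name=value' lines; raises if a parameter is missing.
    """
    values = {"nprocx": nprocx, "nprocy": nprocy, "nprocz": nprocz}
    for key, value in values.items():
        pattern = rf"^(\s*{key}\s*=\s*)\d+"
        input_text, count = re.subn(
            pattern, rf"\g<1>{value}", input_text, flags=re.MULTILINE
        )
        if count == 0:
            raise ValueError(f"{key} not found in input file")
    return input_text


def prepare_run(input_path, nprocx, nprocy, nprocz, workdir="."):
    """Create the scratch folder, update the input file, return total procs."""
    # Output files are written to "scratch" in the working directory.
    (Path(workdir) / "scratch").mkdir(parents=True, exist_ok=True)
    path = Path(input_path)
    path.write_text(set_partition(path.read_text(), nprocx, nprocy, nprocz))
    return nprocx * nprocy * nprocz
```

For example, `prepare_run("input.sample", 2, 2, 3)` would rewrite the three parameters and return 12, the number of MPI processes to request when launching CitcomSRegional. The launch command itself depends on the MPI installation and the cluster's job scheduler.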