Project

General

Profile

CluckComputingCluster

Timescales and SL6/SL5 handling

short-term

Through Dec 2012
Dedicated grunts on SL5 (we can use them whenever we want)
Cluck running SL6
Grunt1 can be used as a place to build the software products we need (special product builds such as compilers)
Grunts2-5 can be used for development of code (for the retreat and the period immediately afterwards)
Grunts2-5 can be used for running multi-node tests (when we need them for this purpose)

mid-term

Jan-May is the time frame.
SL6/SL5 mixed environment.
Batch system controls access to grunt nodes.
SL5 builds will need to be done from cluck (VM or other) or oink.
Oink can also be used for building products and for SL5 development.

long-term

After the mixed-environment period, all nodes run SL6.
Batch system controls access to grunt nodes.

Virtual machines (or virtual environment)

The mixed environment causes trouble for us. Ron will try to set up a "virtual root" system that enables SL5 compatibility under SL6. If this cannot be done in a short time (~4 hours), then a VM should be used instead.

If a VM is started, use the KVM hypervisor built into the kernel. Ron thinks the userspace component is QEMU. If a VM is used, should we allocate 16 GB of RAM?

This virtual environment can be used for building SL5 applications.
The virtual machines will also need to run the RTEMS OS eventually.
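A minimal sketch of starting such a VM with KVM/QEMU and the 16 GB of RAM mentioned above. The disk image path and CPU count are placeholders, not decisions made here; this is a command fragment that needs the actual hypervisor host to run.

```shell
# Hypothetical invocation of the KVM-enabled QEMU binary on cluck:
#   -m 16384   : 16 GB of RAM, per the note above
#   -smp 4     : virtual CPU count (assumption)
#   -hda ...   : SL5 guest disk image (hypothetical path)
#   -nographic : serial console only, no graphical display
qemu-kvm -m 16384 -smp 4 \
    -hda /mnt/disk1/vm/sl5-build.img \
    -net nic -net user -nographic
```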

Home directory handling

It is important that all SL5 development nodes see the same home and installation directory structure, to make building and running easy, especially for MPI and other multi-node jobs. Tools such as R, Python, and Ruby expect a home directory to be associated with a specific OS; the structure described here satisfies that need.

short/mid-term

Grunts will all have the same home directory, an SL5 home directory area.
This home directory will be anchored at the virtual environment on cluck and mounted on the grunts for our group (cluck users). In other words, cluck hosts the home directories.
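One way the arrangement above could be realized is an NFS export from cluck, mounted identically on each grunt. A sketch only; the export options are assumptions, and the paths follow the mount-point list later in these notes.

```shell
# Server side (cluck). Hypothetical /etc/exports entry:
#   /mnt/disk1/grunt/home   grunt*(rw,sync,no_root_squash)
# Re-export after editing:
exportfs -ra

# Client side (each grunt). One-off mount; an /etc/fstab entry
# would make it persistent:
mount -t nfs cluck:/mnt/disk1/grunt/home /home
```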

long-term

The cluck home directory will be shared on the grunts. There will be a transition period to make the older SL5 home area data available.

Mount points (cluck origin)

/home (/mnt/disk1/grunt/home)

/products (/products)
UPS area: products that can be built either as relocatable packages or in place

/opt (/mnt/disk1/grunt/opt)
vendor products, such as Intel, AMD, nVidia, Sun/Oracle

/usr/local (/mnt/disk1/grunt/local)
Things that do not really make sense to be versioned under /products
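The mapping above could be captured as /etc/fstab entries on each grunt. A configuration sketch, assuming NFS and default mount options; nothing here has been decided beyond the path pairs listed above.

```shell
# Hypothetical /etc/fstab entries on a grunt, one per mount point above:
# cluck:/mnt/disk1/grunt/home    /home        nfs  defaults  0 0
# cluck:/products                /products    nfs  defaults  0 0
# cluck:/mnt/disk1/grunt/opt     /opt         nfs  defaults  0 0
# cluck:/mnt/disk1/grunt/local   /usr/local   nfs  defaults  0 0
```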

Tools and products

Need all libraries and tools to build art on the grunts.

  • TBB
  • mvapich
  • openMPI

Connectivity

Test IB/MPI throughput to be sure it is correct from grunt to grunt (NetPIPE).
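The NetPIPE check could look like the following; a sketch assuming NetPIPE's MPI driver (NPmpi) is built against the MPI in use, with two grunt hostnames picked for illustration.

```shell
# Run NetPIPE's MPI benchmark between two grunts over the IB fabric.
# -np 2          : one rank per node for a point-to-point test
# -host ...      : the two nodes under test (placeholders)
# -o ...         : NetPIPE output file with the bandwidth/latency table
mpirun -np 2 -host grunt2,grunt3 ./NPmpi -o np.grunt2-grunt3.out
# Compare the peak bandwidth in the output against the expected IB link
# rate to confirm the fabric and MPI stack are configured correctly.
```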

Building our source code - the way we will do development

Development for SL6 on cluck.

Development for SL5 on grunts or oink.

Task list and documentation

Perhaps use MOTD to describe use of the systems?
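If MOTD is used, a draft could be as simple as the following. The wording and machine roles are taken from the short-term notes above; the file path for the draft is arbitrary, and installing it to /etc/motd would be a sysadmin step.

```shell
# Draft an MOTD describing system usage (content is a sketch).
cat > /tmp/motd.draft <<'EOF'
Welcome to the Cluck computing cluster.
  cluck      : SL6 development (interactive)
  grunt1     : SL5 product builds (compilers, etc.)
  grunts2-5  : SL5 development and multi-node tests
EOF
# A sysadmin would then install it:  sudo cp /tmp/motd.draft /etc/motd
```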

  • Complete the changes to the mount locations from cluck to the grunts
  • Build TBB on the grunts
  • Index the instructions for accessing code, packages, and machines, and send out links

short-term

Need a page that explains how to use the grunts and cluck for development.

mid/long-term

Need a page that explains how to use PBS to allocate nodes and run jobs.
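Such a page might start from a job script like this. A sketch only: the queue name, node counts, and application binary are placeholders, not names that exist yet.

```shell
#!/bin/sh
# Hypothetical PBS job script for a multi-node run on the grunts.
#PBS -N mpi_test
#PBS -q cet_dev            # development queue (assumed name)
#PBS -l nodes=4:ppn=8      # 4 grunts, 8 processes per node (assumption)
#PBS -l walltime=01:00:00
cd $PBS_O_WORKDIR
mpirun -np 32 ./my_mpi_app  # hypothetical application binary
```

Submitted with `qsub job.pbs`; PBS then handles the node allocation described above.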

Batch queues

Can we allocate a grunt as interactive and still allow other developers to access it?
We need access to the grunts from three queues.

  1. accelerator simulation runs (lowest priority access)
  2. ADS "production runs" (Geant4 performance) (medium priority access)
  3. development needs of CET (high priority access)
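The interactive-allocation question above maps onto PBS's interactive mode. A sketch, assuming the queue names follow the three-queue split listed here (the actual names are not yet defined).

```shell
# Request an interactive session on one grunt through the assumed
# high-priority CET development queue (-I = interactive in PBS).
qsub -I -q cet_dev -l nodes=1:ppn=8 -l walltime=04:00:00
```

Whether a second developer can share that node depends on whether the node is allocated exclusively, which is a scheduler-configuration choice still to be made.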