Cheng-Yang Tan, 05/12/2017 11:17 AM


LOCO in C++

This is a translation of A. Petrenko's implementation of LOCO. This implementation removes TCL, sdds* commandline programs and octave as requirements. Either Elegant or MADX is used as the tracking program.
The source code can be retrieved by issuing the following command:
git clone http://cdcvs.fnal.gov/projects/booster_loco_cpp 

Then download and install the following files:
  • loco configuration file
    The loco.cfg file should be renamed .loco and put into your home directory.
  • headers.zip
    The headers.zip file should be unzipped into a headers directory. Its location can be specified in .loco.
  • common.zip
    An example set of common files that should be unzipped into a commons directory. Its location can be specified in .loco.

Speed

MacBook Pro 2.7 GHz i7

The C++ Elegant and MADX LOCO implementations have been benchmarked with a MacBook Pro 2.7 GHz i7.

Tracking with Elegant

The results are:
  • 107 s per break point on the ramp with C++ Elegant LOCO.
  • 176 s per break point on the ramp with TCL Elegant LOCO.

i.e. C++ Elegant LOCO takes 39% less time per break point than TCL Elegant LOCO.

Tracking with MADX

The results are:
  • 136 s per break point on the ramp with C++ MADX LOCO.

i.e. C++ MADX LOCO takes 27% longer per break point than C++ Elegant LOCO.

INTEL Xeon GHz (CLX55)

  • C++ Elegant LOCO running on CLX55 is very slow compared to the i7 above. It takes 375 s = 6.25 min per break point! It is even slower than TCL Elegant LOCO running on the i7!
  • C++ MADX LOCO takes 146 s per break point. This is just 7% slower than on the i7.

This means that the Elegant that was compiled on CLX is the bottleneck and not the CPU.
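
The percentages quoted in this section follow directly from the times per break point. A minimal sketch of the arithmetic is given below; the helper name percent_time_change is hypothetical and is not part of fitit.

```cpp
#include <cmath>

// Hypothetical helper (not part of fitit): percentage change in run time of
// t_sec relative to the reference time t_ref_sec, rounded to the nearest
// integer. A negative result means t_sec is shorter than the reference.
long percent_time_change(double t_ref_sec, double t_sec)
{
    return std::lround(100.0 * (t_sec - t_ref_sec) / t_ref_sec);
}

// Times per break point quoted in this section (seconds):
//   TCL Elegant on i7   176     C++ Elegant on i7   107
//   C++ MADX on i7      136     C++ MADX on CLX55   146
//
// percent_time_change(176.0, 107.0)  -> -39  (C++ Elegant vs TCL Elegant)
// percent_time_change(107.0, 136.0)  ->  27  (C++ MADX vs C++ Elegant)
// percent_time_change(136.0, 146.0)  ->   7  (CLX55 vs i7, C++ MADX)
```

Note that the 39% figure is a reduction in run time. Expressed as a ratio, 176 s / 107 s means the TCL version takes about 1.6 times as long.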

CPU benchmarks

Interesting, but not too relevant now that there is a MADX version of LOCO, because MADX LOCO shows only a 7% drop in speed on the Xeon compared to the i7.

As a check as to why the i7 is so much faster than the Xeon, their CPU marks can be compared and they are:

i.e. the i7 is about 5 times faster than the Xeon in CPU marks, while for C++ Elegant LOCO the i7 is faster than the Xeon by 3.5 times.

Requirements

Builds on MacOSX 10.8.4 and on LINUX on the CLX cluster are supported.

Compiling on MacOSX

The requirements for building fitit:
  • python (comes pre-installed)
  • Apple Accelerate Framework
    Apple's implementation of BLAS and LAPACK used for SVD (comes pre-installed).
  • git
  • gnuplot
    Use MacPorts to install:
     port install gnuplot
  • Boost library
    Use MacPorts to install:
     port install boost
  • Boost.Process 0.5
    This is the beta version of the process library and needs to be downloaded and installed separately.
  • GNU Scientific Library
    Use MacPorts to install:
     port install gsl

Compiling on LINUX (CLX cluster)

The requirements for building fitit:
  • python (comes pre-installed)
  • git (comes pre-installed)
  • gnuplot (comes pre-installed)
  • Boost library (CLX has a very old version of Boost and so a new version has to be installed)
    Download and install Boost.
    CLX does not have bzip2 installed, so after bootstrap.sh is run, Boost must be compiled with the -sNO_BZIP2=1 option:
    ./bootstrap.sh --prefix=/export/home1/cytan
    ./b2 -sNO_BZIP2=1 install

    where the argument of the --prefix option is the base directory under which the include directory is installed.
    There is a bug in boost/python/ssize_t.hpp. Comment out the following typedef in the file:
    //typedef Py_ssize_t ssize_t;  <----- comment this out
    ssize_t const ssize_t_max = PY_SSIZE_T_MAX;
    ssize_t const ssize_t_min = PY_SSIZE_T_MIN;

  • Boost.Process 0.5
    This is the beta version of the process library and needs to be downloaded and installed separately.
  • GNU Scientific Library
  • LAPACK
    Download LAPACK, but use ATLAS (see below) to install it.
  • ATLAS
    Download and install ATLAS.
configure with

mkdir atlas; cd atlas;
../configure -b 64 -D c -DPentiumCPS=2327.550 --prefix=/export/home1/cytan/loco_new/atlas  --with-netlib-lapack-tarfile=/export/home1/cytan/loco_new/lapack-3.4.2.tgz

where you must change the prefix and with-netlib-lapack-tarfile options to the location that you desire.

Unfortunately ATLAS configure tries to be smarter than the user and refuses to configure unless CPU throttling is turned off. And the option that disables this check has itself been disabled!
Therefore, in general, if you are not the superuser, you are out of luck. But there is a workaround:
edit the file CONFIG/src/config.c, search for ProbeCPUThrottle() and add the following line:

int ProbeCPUThrottle(int verb, char *targarg, enum OSTYPE OS, enum ASMDIA asmb)
{
  return 0;   // <--- add this line to override throttle check!

and everything should configure.

  • MADX is already installed on the CLX cluster.
  • Elegant for CLX LINUX is not available as a binary because CLX LINUX is still 32 bit. Therefore, Elegant has to be built from scratch.
    Deprecated: I am no longer maintaining Elegant on CLX, and the CLX LINUX has been updated to 64 bit.
    The following files are required to build Elegant:
  • SDDS.3.0.tar.gz
  • epics.base.configure.tar.gz
  • epics.extensions.configure.tar.gz
  • elegant.25.2.1.tar.gz
  • defns.rpn
which are all available on the APS server
Run
/epics/base/startup/EpicsHostArch

to set up your EPICS_* environment variables. Check that they make sense!
More work to get elegant compiled:
  • Soft links to directories built in epics/extensions and epics/base need to be made.
  • -lcurses must be added to Makefile.OAG at
     # section on LAPACK
      ifneq ($(LAPACK), 0)
        ifdef WIN32
          USR_CPPFLAGS += -DLAPACK -I$(LAPACK_INCLUDE)
          USR_LIBS += lapack
        else
          # there is no lapack header file
          OP_SYS_LDLIBS += -llapack -lblas -lcurses <--- add -lcurses HERE!
    #       -lgfortran
          USR_CPPFLAGS += -DLAPACK 
        endif
    

It is much easier to get a working copy of Elegant from me than to do your own compile!

Building

Edit the Makefile to put in the appropriate paths if necessary. The Makefile has been set up so that it knows whether it is run on a Mac or on the CLX cluster.

Run

 make

fitit should be created. Move fitit to your local path.

A pre-built version of fitit for MacOSX or CLX LINUX can be requested from the author.

.loco configuration file

The .loco file contains configuration parameters about where some files and directories live. Edit this file to reflect your run environment. Default values are used if these options are not set.
An example of the contents of a .loco file is

[disp2sdds]
gnufont=/Library/Fonts/Arial.ttf
plotdir=./plots
dispheader=/Users/cytan/expt/booster/loco_new/common/DispHeader.sdds

[orm2sdds]
gnufont=/Library/Fonts/Arial.ttf
plotdir=./plots
ormheader=/Users/cytan/expt/booster/loco_new/common/ORMHeader.sdds

[fitit]
gnufont=/Library/Fonts/Arial.ttf

plots_dir=./plots
common_dir=./common
headers_dir=/Users/cytan/expt/booster/loco_new/common
work_dir=/tmp

lattice=machine.lte
orm_ele=ORM.ele
twiss_ele=twiss.ele

mlattice=booster.madx

magnet_settings=MagnetSettings.txt
rposfile=deltaRPOS.txt
machine_params_sdds=machine_parameters.sdds
params2vary_sdds=parameters2vary.sdds
machine_params_vs_t=machine_params_vs_t.sdds
qsdqsf_fname=qsdqsf.dat

elegant=/Users/cytan/bin/elegant
madx=/Users/cytan/bin/madx
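
To illustrate how a file laid out like this maps onto option names such as fitit.plots_dir (the form used in the fitit help output), here is a minimal, stdlib-only sketch of an INI-style reader. It is not the fitit implementation: the function names read_loco_config, read_loco_config_text and get_or are hypothetical, and treating lines that start with # as comments is an assumption.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Strip leading and trailing spaces, tabs and carriage returns.
static std::string trim(const std::string& s)
{
    const char* ws = " \t\r";
    const std::string::size_type b = s.find_first_not_of(ws);
    if (b == std::string::npos) return "";
    const std::string::size_type e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Hypothetical reader (not the fitit implementation): parse an INI-style
// stream into a map keyed by "section.key", e.g. "fitit.plots_dir".
std::map<std::string, std::string> read_loco_config(std::istream& in)
{
    std::map<std::string, std::string> cfg;
    std::string section, line;
    while (std::getline(in, line)) {
        line = trim(line);
        if (line.empty() || line[0] == '#') continue;     // blank line or assumed comment
        if (line.front() == '[' && line.back() == ']') {  // [section] header
            section = trim(line.substr(1, line.size() - 2));
            continue;
        }
        const std::string::size_type eq = line.find('=');
        if (eq == std::string::npos) continue;            // not key=value, skip it
        cfg[section + "." + trim(line.substr(0, eq))] = trim(line.substr(eq + 1));
    }
    return cfg;
}

// Convenience wrapper for configuration text that is already in memory.
std::map<std::string, std::string> read_loco_config_text(const std::string& text)
{
    std::istringstream in(text);
    return read_loco_config(in);
}

// Look up a key and fall back to a default value when it is not set.
std::string get_or(const std::map<std::string, std::string>& cfg,
                   const std::string& key, const std::string& fallback)
{
    const auto it = cfg.find(key);
    return it == cfg.end() ? fallback : it->second;
}
```

With this sketch, get_or(cfg, "fitit.headers_dir", "./headers") returns the value from the file if the key is present and "./headers" otherwise, which mirrors the statement that default values are used for options that are not set.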

Running

Required environment variable for Elegant

In order for Elegant to run, the RPN_DEFNS environment variable must point to the defns.rpn file, which can be downloaded here.
Add this to your .bashrc file with the appropriate path:

export RPN_DEFNS=/export/home1/cytan/loco_new/common/defns.rpn
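
A missing or mistyped RPN_DEFNS is easy to overlook, so a small pre-flight check can be useful before launching a long run. The sketch below is only an illustration; check_rpn_defns is a hypothetical helper and is not part of fitit or Elegant.

```cpp
#include <cstdlib>
#include <fstream>
#include <string>

// Hypothetical pre-flight check (not part of fitit): returns an empty string
// if RPN_DEFNS is set and names a file that can be opened for reading,
// otherwise a short message describing the problem.
std::string check_rpn_defns()
{
    const char* path = std::getenv("RPN_DEFNS");
    if (path == nullptr || *path == '\0')
        return "RPN_DEFNS is not set";
    std::ifstream f(path);
    if (!f.good())
        return std::string("cannot read ") + path;
    return "";
}
```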

Checking the configuration

fitit -h

The output should look something like this

fitit version 1.0-37-g6f0192c-38
config file: /Users/cytan/.loco
         gnuplot font = /Library/Fonts/Arial.ttf
         plots dir = ./plots
         common dir = ./common
         headers dir = /Users/cytan/expt/booster/loco_new/common
         results dir = ./results
         work dir = /tmp
         magnet settings = MagnetSettings.txt
         rpos file = deltaRPOS.txt
         qsdqsf file = qsdqsf.dat
         lattice = booster.madx
         machine params file = machine_parameters.sdds
         parameters to vary file = parameters2vary.sdds
MADX is the tracking program
         MADX = /Users/cytan/bin/madx

command line options and their DEFAULT values

generic options :
  -h [ --help ]                         this message
  -A [ --ormA ] arg                     ORM input file. Either one file with 
                                        both H & V data or two files separated 
                                        by , or space. H first, V second
  -d [ --dfile ] arg                    dispersion input file name
  -r [ --rfactor ] arg (=1 1)           rfactors. dresp rfactor first, loco 
                                        rfactor second
  -s [ --smin ] arg (=0.0025000000000000001)
                                        smin (s): start time for processing the
                                        ORM data
  -S [ --Sthreshold ] arg (=0.14999999999999999)
                                        Sthreshold: singular value threshold
  -t [ --trange ] arg (=3 30)           start and stop time in ms (separated by
                                        a space) for fitting the ramp data
  -c [ --config ] arg (=/Users/cytan/.loco)
                                        LOCO configuration file
  -E [ --elegant ]                      use elegant as tracking program
  -u [ --useoffset ]                    use magnet current offsets in 
                                        Magnet.txt
  -n [ --nfit_iter ] arg (=3)           number of LOCO fit iterations
  -D [ --debug ]                        enable debugging
  -v [ --version ]                      print version

config file options:
  --fitit.gnufont arg                   gnuplot font
  --fitit.plots_dir arg (=./plots)      plot directory
  --fitit.common_dir arg (=./common)    common directory
  --fitit.results_dir arg (=./results)  results directory
  --fitit.headers_dir arg (=./headers)  headers directory
  --fitit.work_dir arg (=/tmp)          work directory for temporary files
  --fitit.rposfile arg (=deltaRPOS.txt) rpos ramp filename
  --fitit.elegant arg                   elegant path
  --fitit.madx arg                      madx path
  --fitit.bad_bpm_list arg (=bad_bpm_list.dat)
                                        bad bpm list
  --fitit.bad_corrector_list arg (=bad_corrector_list.dat)
                                        bad corrector list
  --fitit.lattice arg (=machine.lte)    elegant lattice file
  --fitit.magnet_settings arg (=MagnetSettings.txt)
                                        magnet settings up the ramp file
  --fitit.orm_ele arg (=ORM.ele)        elegant ORM setup file
  --fitit.twiss_ele arg (=twiss.ele)    elegant twiss setup file
  --fitit.mlattice arg (=booster.madx)  madx lattice file
  --fitit.machine_params_sdds arg (=machine_parameters.sdds)
                                        initial machine element calibrations
  --fitit.params2vary_sdds arg (=parameters2vary.sdds)
                                        parameters to vary and their step 
                                        change
  --fitit.qsdqsf_fname arg (=qsdqsf.dat)
                                        QL and QS K values in madx format

Creating machine_params_vs_t.sdds, twiss_vs_t.twi and optics_vs_t.sdds

The requirements are that you have the following files:

  • in the directory where you are going to perform LOCO:
    • An ORM file: e.g. 28MARCH_ORM_ALL.TXT
    • A Disp file: e.g. 28MARCH_DISP_1.TXT
  • in the common directory:
    • deltaRPOS.txt -- the RPOS ramp file used for dispersion measurements
    • MagnetSettings.txt -- the ramp file that is the source file for the creation of optics.sdds
    • bad_bpm_list.dat -- the bad bpm list
    • bad_corrector_list.dat -- the bad corrector list
    • parameters2vary.sdds -- parameters to vary and their step values. Unlike LOCO TCL, this file is never overwritten.
    • machine_parameters.sdds -- the initial values of the BPM calibrations, tilts etc.
  • MADX input files that are required in the common directory:
    • booster.madx -- parent file
    • booster.ele -- magnet definitions
    • booster.seq -- lattice file
    • qsdqsf.dat -- QL and QS K values in MADX format
    • DC.dat -- the strength of the DC elements

The latest versions of the MADX input files can be downloaded from here

  • If you plan to run Elegant, the following input files are required in the common directory:
  • machine.lte -- the Booster lattice in Elegant format
  • ORM.ele -- Elegant options and set up file for calculating the theoretical orbit response.
  • twiss.ele -- Elegant options and set up file for calculating twiss.twi

An example of the files in the common directory can be found in common.zip

The files that are generated in the results directory are:

  • machine_params_vs_t.sdds file that is the LOCO output.
  • twiss_vs_t.twi file in sdds format. It is in binary sdds if Elegant is used and in ascii format if MADX is used.
  • optics_vs_t.sdds file is created from MagnetSettings.txt after interpolation.
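
The source does not say which interpolation scheme is applied to MagnetSettings.txt. Purely as an illustration of interpolating a ramp table at a requested time, the sketch below assumes linear interpolation between neighbouring table points; interp_linear is a hypothetical helper and is not the fitit implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not the fitit implementation): linearly interpolate a
// table of points (t[i], y[i]) at time tq. The times in t must be strictly
// increasing. Outside the table the nearest end value is returned.
double interp_linear(const std::vector<double>& t,
                     const std::vector<double>& y, double tq)
{
    assert(!t.empty() && t.size() == y.size());
    if (tq <= t.front()) return y.front();
    if (tq >= t.back()) return y.back();
    // Index of the first table time strictly greater than tq. Because of the
    // two checks above, 1 <= hi < t.size() always holds here.
    const std::size_t hi = std::upper_bound(t.begin(), t.end(), tq) - t.begin();
    const std::size_t lo = hi - 1;
    const double w = (tq - t[lo]) / (t[hi] - t[lo]);  // weight in [0, 1)
    return y[lo] + w * (y[hi] - y[lo]);
}
```

For a table with times {0, 10, 20} and values {1, 3, 7}, interp_linear returns 2 at t = 5 and 5 at t = 15.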

By default MADX is the tracking program unless the -E option is given on the commandline and then Elegant is used instead.

If the -D option is given on the commandline
  • all the plot files are deposited in the plots directory (unless the location has been changed in .loco). The plots directory is automatically created by fitit.

NOTE: There are so many plots generated that the program slows down tremendously. Only recommended if the plots are really necessary!
Todo: number of plots to save will be added as an option.

For example:

fitit -A 28MARCH_ORM_ALL.TXT -d 28MARCH_DISP_1.TXT -u -n 2

will, after 2 fit iterations (default is 3), create the machine_params_vs_t.sdds, twiss_vs_t.twi and optics_vs_t.sdds files in the results directory.

Future Work

The goals are:
  • Clean up the code so that there are no longer any hardcoded paths in the source. DONE!
  • Replace Elegant with MADX. DONE!
  • Port implementation to LINUX. DONE!
  • Speed up the program by using distributed programming. This can be easily done with MPI programming on a Rocks Cluster or something similar. DONE!

Only the final step remains, i.e. the addition of distributed programming.

Differences in results between TCL LOCO and C++ LOCO

The source of the differences in results between TCL LOCO and C++ LOCO boils down to how posN works in Mathematica and in ORM.cpp or Disp.cpp.

For the creation of the ORM files in TCL LOCO, a Mathematica program is used which contains a function called posN[]. Within posN[] is a Mathematica function called SortBy[].
For C++ LOCO, a function called posN() in ORM.cpp or Disp.cpp uses STL sort().

Both sorts order a list of pairs by the second entry of each pair, which is the standard deviation of a group of data. The first entry is the index of the group that gives that standard deviation.
For example, using SortBy[] on the list

{{1, 0.0627807}, {2, 0.0627807}, {3, 0.313951}}

Mathematica sorts the second entry to give
{{2, 0.0627807}, {1, 0.0627807}, {3, 0.313951}}

and returns the index 2 back to the caller.

However, when SortBy[] is used on the following list

{{1, 0.0365138}, {2, 0.0421625}, {3, 0.0365138}}

the result is
{{1, 0.0365138}, {3, 0.0365138}, {2, 0.0421625}}

and the index 1 is returned back to the caller.

Therefore, the first index returned by SortBy[] is unpredictable if there are identical standard deviations.
The same is true for STL sort(), which does not guarantee the relative order of equal elements, so the first index returned by SortBy[] and sort() may not match.

This is the reason why the machine_params_vs_t.sdds file is different when generated by TCL LOCO and C++ LOCO for the same input files.
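
One way to make the C++ side reproducible is to break ties explicitly in the comparison function, so that the result no longer depends on how the sort algorithm orders equal elements. The sketch below is a hypothetical variant and not the code in ORM.cpp or Disp.cpp; the name pos_min_sigma and the choice of the smaller group index as the tie-break are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// A pair of (group index, standard deviation).
typedef std::pair<int, double> IndexedSigma;

// Hypothetical variant of posN() (not the code in ORM.cpp or Disp.cpp):
// return the index of the group with the smallest standard deviation.
// Ties are broken by the smaller group index, so the result does not depend
// on how std::sort happens to order equal elements.
int pos_min_sigma(std::vector<IndexedSigma> v)
{
    assert(!v.empty());
    std::sort(v.begin(), v.end(),
              [](const IndexedSigma& a, const IndexedSigma& b) {
                  if (a.second != b.second) return a.second < b.second;
                  return a.first < b.first;  // explicit tie-break on the index
              });
    return v.front().first;
}
```

With this rule both example lists above return index 1. This makes the C++ result deterministic, but it does not by itself reproduce the index chosen by Mathematica, which returned 2 for the first list.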