Within the ongoing investigation of the Higgs boson at the CMS detector, part of the LHC at CERN, the Higgs production in association with two top quarks allows measuring the Higgs coupling strength to top quarks. As the Higgs boson is too short-lived to be detected itself, it has to be reconstructed from its decay products.
TauRoast searches for cases where the Higgs boson decays to two tau leptons. Since tau leptons are also very short-lived, they are not observed directly but through the decay products they generate. The analysis must therefore search for detector events whose signature of decay products is compatible with both hadronic tau and top decays. Properties of such events are used to distinguish the events of interest (Higgs decays) from all other events and are also used in further statistical analysis.
For more information on the code and data sources of TauRoast, please check here.
Hardware Architecture: X86_64; Kernel: Linux 2.6.18; OS: RedHat 5.10
CPU Cores: 64; Memory Space: 125GB; Disk Space: 204GB
To figure out the underlying file dependencies and execution environment, Parrot can record the names of all files accessed during the execution of a program; this is enabled with the --name-list dependencylist option. Each time a filename is resolved by the Parrot name resolver, it is recorded in the dependencylist file. The type of the system call that accessed the file is also passed to the name resolver and recorded alongside the filename.
The command used to generate the dependency list for TauRoast is as follows:
% parrot_run --name-list namelist.full /bin/bash ~/script-v4.sh
The source code of script-v4.sh is here.
After executing this command, all accessed file names are recorded in the file namelist.full. Each line has the format filename|system-call-type; for example, /usr/bin/ls|stat means that the file /usr/bin/ls was accessed using the stat system call.
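To get a feel for the filename|system-call-type format, the following is a minimal sketch that summarizes a name list with standard tools. The sample entries and the file name namelist.sample are made up for illustration, and the sketch assumes that no file name contains a | character.

```shell
#!/bin/sh
# Illustrative only: build a tiny sample in the filename|system-call-type format.
# These entries are invented for the example, not taken from a real run.
workdir=$(mktemp -d)
cat > "$workdir/namelist.sample" <<'EOF'
/usr/bin/ls|stat
/usr/bin/ls|open
/etc/passwd|open
/usr/bin/ls|stat
EOF

# Count how many records were logged for each system call type.
syscall_counts=$(awk -F'|' '{ n[$2]++ } END { for (s in n) print s, n[s] }' \
    "$workdir/namelist.sample" | sort)
echo "$syscall_counts"

# List the distinct file names, ignoring how they were accessed.
unique_files=$(cut -d'|' -f1 "$workdir/namelist.sample" | sort -u)
echo "$unique_files"

rm -rf "$workdir"
```

The same two pipelines can be pointed at namelist.full to see which system calls dominate and how many distinct files the program touched.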
To repeat the step, you need to use the cctools source code under the following commit id: ca9d3c38c6e8c105a18bc50869c985242d1e84fa
For more information on parrot_run, please check here.
The namelist file created above contains duplicate entries, because a file may be accessed multiple times during the execution of a program. To shrink the namelist file, remove the duplicates. For example:
% sort -u namelist.full > namelist
First, run the following command to write all environment variables into a file named env-list. Each line corresponds to one environment variable, in the format <name>=<value>.
% env > env-list
Then convert env-list into the format setenv <name> "<value>" with the following command:
% env-process.sh -p env-list
env-process.sh creates a file named env-setting, which lists the environment variables. Each line corresponds to one environment variable, in the format setenv <name> "<value>".
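The conversion from <name>=<value> to setenv <name> "<value>" can be sketched with a single sed substitution. This is a hypothetical stand-in to show the idea, not the actual source of env-process.sh; the sample variables are invented, and values that contain double quotes or span several lines are not handled.

```shell
#!/bin/sh
# Hypothetical stand-in for the conversion step; NOT the real env-process.sh.
workdir=$(mktemp -d)
cat > "$workdir/env-list" <<'EOF'
HOME=/root
PATH=/usr/bin:/bin
LS_OPTS=--color=auto
EOF

# Split each line at the first '=' only, so values that contain '=' stay intact.
sed -e 's/^\([^=]*\)=\(.*\)$/setenv \1 "\2"/' "$workdir/env-list" \
    > "$workdir/env-setting"

env_setting=$(cat "$workdir/env-setting")
echo "$env_setting"

rm -rf "$workdir"
```

The setenv form is tcsh syntax, so the resulting file is meant to be read with source from inside tcsh.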
Create a package including all the dependencies, common mountlist, and the environment file through the following command:
% /bin/bash package-utility.sh --namelist namelist --env env-setting --path /tmp/package
package-utility.sh copies all the files listed in the namelist file into /tmp/package while preserving their directory paths, copies env-setting into the package, and creates a file called common-mountlist that lists all the mount points not included in the package.
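Copying files "without messing up the directory paths" means that each file ends up under the package root at its original absolute path. The loop below is a minimal sketch of that idea, assuming a name list in the filename|system-call-type format. It is not the actual package-utility.sh, and the directories src and package are temporary stand-ins for the real filesystem and /tmp/package.

```shell
#!/bin/sh
# Hypothetical sketch of the copy step; NOT the real package-utility.sh.
workdir=$(mktemp -d)
src="$workdir/src"        # stands in for the original filesystem
pkg="$workdir/package"    # stands in for /tmp/package
mkdir -p "$src/etc" "$src/usr/bin" "$pkg"
echo "config" > "$src/etc/app.conf"
echo "binary" > "$src/usr/bin/tool"

# A small name list in the filename|system-call-type format (invented entries).
printf '%s|open\n' "$src/etc/app.conf" "$src/usr/bin/tool" "$src/missing" \
    > "$workdir/namelist"

copied=0
while IFS='|' read -r path syscall; do
    # Skip entries that do not exist or are not regular files.
    [ -f "$path" ] || continue
    # Recreate the parent directory under the package root, then copy the file.
    mkdir -p "$pkg$(dirname "$path")"
    cp -p "$path" "$pkg$path"
    copied=$((copied + 1))
done < "$workdir/namelist"
echo "copied $copied files"

# The file is now available at <package root><original absolute path>.
packaged_conf=$(cat "$pkg$src/etc/app.conf")

rm -rf "$workdir"
```

A real packaging tool also has to deal with directories, symbolic links, and special files, which this sketch deliberately leaves out.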
The package format of this version:
(a) All the accessed directories and files will be copied into the package, like /etc, /bin, /lib and so on.
(b) common-mountlist: records all the mount points which are not included in the package, such as /proc, /dev, /sys.
(c) env-setting: the list of environment variables. Each line corresponds to one environment variable, in the format setenv <name> "<value>".
The size of the package is: 21GB.
The package can be distributed in the format of TAR or TGZ.
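As a small illustration of the two formats, the sketch below archives a stand-in package directory once as TAR and once as TGZ. The directory contents are invented for the example; only the name package-hep is taken from this page.

```shell
#!/bin/sh
# Illustrative only: archive a tiny stand-in for the package directory.
workdir=$(mktemp -d)
mkdir -p "$workdir/package-hep"
echo 'setenv HOME "/root"' > "$workdir/package-hep/env-setting"

# Plain TAR; -C switches directory first so the archive stores relative paths.
tar -cf "$workdir/package-hep.tar" -C "$workdir" package-hep
# TGZ is the same archive compressed with gzip (-z).
tar -czf "$workdir/package-hep.tgz" -C "$workdir" package-hep

# Confirm that both archives contain the environment file.
tar_entries=$(tar -tf "$workdir/package-hep.tar" | grep -c 'env-setting$')
tgz_entries=$(tar -tzf "$workdir/package-hep.tgz" | grep -c 'env-setting$')
echo "tar: $tar_entries, tgz: $tgz_entries"

rm -rf "$workdir"
```

For a package of this size, the TGZ variant trades extra CPU time during packing and unpacking for a smaller download.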
If wget is not yet installed on the virtual machine, first install it through the following command:
% yum -y install wget
Download the package through the following command:
Suppose you put this tar file under /root
% cd /root
% tar xvf package-hep.tar
After this, the path of your package is: /root/package-hep
Generate the mountlist through the following command. The -p parameter must be the path of your package; the -m parameter is the location of the mountlist which is determined by you.
% /bin/bash /root/package-hep/repeat-hep.sh -p /root/package-hep -m /root/mountlist
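A Parrot mountlist maps paths seen by the program to the paths that actually serve them. The sketch below shows one plausible way to build such a file from the package path and common-mountlist. It is not the actual repeat-hep.sh, and it rests on two assumptions: that common-mountlist holds one path per line, and that each mountlist line has the form "<path seen by the program> <real path>".

```shell
#!/bin/sh
# Hypothetical sketch of mountlist generation; NOT the real repeat-hep.sh.
workdir=$(mktemp -d)
pkg="$workdir/package-hep"        # stands in for /root/package-hep
mountlist="$workdir/mountlist"    # stands in for /root/mountlist
mkdir -p "$pkg"

# Assumed common-mountlist layout: one host path per line.
printf '%s\n' /proc /dev /sys > "$pkg/common-mountlist"

# Redirect the root directory into the package ...
printf '/ %s\n' "$pkg" > "$mountlist"
# ... keep the package itself reachable under its real path ...
printf '%s %s\n' "$pkg" "$pkg" >> "$mountlist"
# ... and pass the excluded mount points through to the host unchanged.
while IFS= read -r mount_point; do
    [ -n "$mount_point" ] && printf '%s %s\n' "$mount_point" "$mount_point" >> "$mountlist"
done < "$pkg/common-mountlist"

mountlist_lines=$(wc -l < "$mountlist" | tr -d ' ')
passthrough=$(grep -c '^/proc /proc$' "$mountlist")
cat "$mountlist"

rm -rf "$workdir"
```

With a file like this, a lookup of /etc/passwd inside Parrot would be served from the package copy, while /proc, /dev and /sys would still come from the host.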
To repeat the experiment, cctools must be installed on your machine. Here, I installed cctools into ~/cctools and added the path to $PATH.
#enter into tcsh
% /bin/tcsh
#set the environment variable which is inside the package and called `env-setting`
% source /root/package-hep/env-setting
#repeat the experiment with the help of parrot; -m parameter: the mountlist file you generated in step 4; -l: must be an absolute path, made up of the absolute path of the package directory followed by lib64/ld-linux-x86-64.so.2
% ~/cctools/bin/parrot_run -m /root/mountlist -w /afs/crc.nd.edu/user/h/hmeng -l /root/package-hep/lib64/ld-linux-x86-64.so.2 /bin/tcsh ~/script-v4.sh