h1. Analysis Getting Started

These instructions are specific to the analyzer and SBS-offline installations existing under /work/halla/sbs, maintained by Andrew Puckett.

{{toc}}

h1. How to Reach the SBS Work Directory

* Log in to the JLab ifarm
** https://scicomp.jlab.org/docs/getting_started
* The SBS work directory is located at /work/halla/sbs
** Create a directory for yourself here with @mkdir username@ (see the sketch after this list)
** If you do not have permission, contact Ole Hansen (ole@jlab.org) and ask to be added to the SBS user group.
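
A minimal sketch of these steps; the host names are assumptions following the scicomp getting-started guide linked above, and @username@ is a placeholder for your own CUE account:

<pre>
# Log in to the JLab gateway, then hop to an interactive farm node
# (host names per the scicomp getting-started guide; username is a placeholder)
ssh username@login.jlab.org
ssh ifarm

# Create your own directory under the SBS work disk
cd /work/halla/sbs
mkdir username
</pre>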

h1. Setting up Environments

* If you want to set up your own personal analyzer, see https://github.com/JeffersonLab/analyzer (a build sketch follows below)
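
A minimal sketch of such a personal build, assuming the CMake-based procedure described in the analyzer README (the install prefix and job count are illustrative):

<pre>
# Build a personal copy of the Hall A analyzer (Podd).
# See the README at https://github.com/JeffersonLab/analyzer for the
# authoritative instructions; the install prefix here is an example.
git clone https://github.com/JeffersonLab/analyzer.git
cd analyzer
cmake -B build -S . -DCMAKE_INSTALL_PREFIX=$HOME/analyzer-install
cmake --build build -j4
cmake --install build
</pre>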

h1. Getting Files from Cache

* All raw EVIO files from GMn are on tape at @/mss/halla/sbs@
* Cached EVIO files are located at @/cache/halla/sbs@
* To stage files from tape to cache, see the documentation at https://scicomp.jlab.org/docs/node/586
** For example, to get all EVIO splits for run @runnumber@ in GMn to cache, execute @jcache get /mss/halla/sbs/raw/*runnumber*@ (a concrete example follows this list)
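
A concrete sketch of the cache request; the run number 13444 is purely illustrative:

<pre>
# Request that all EVIO splits of run 13444 be staged from tape to /cache
jcache get /mss/halla/sbs/raw/*13444*
</pre>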

h1. Setting up the SBS Replay

* If you do not plan to make changes to the replay code, you can simply use the versions located at @/work/halla/sbs/SBS_OFFLINE@ and @/work/halla/sbs/SBS_REPLAY@
* The SBS analysis code is hosted at https://github.com/JeffersonLab/SBS-offline and https://github.com/JeffersonLab/SBS-replay
** If you plan to make your own changes to the analysis, you can copy the GitHub versions into your own work directory

h2. SBS Installation

* Follow the README instructions at https://github.com/JeffersonLab/SBS-offline to install SBS-offline (a build sketch follows this list).
* After installing, there should be a directory @install/run_replay_here@.
* Inside it there should be one file named @.rootrc@ (it is a hidden file).
** Wherever you run the replay, this file must be present to load the SBS-offline libraries. Either run your replays here, or move the @.rootrc@ file to the new destination.
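
A minimal sketch of the installation and the @.rootrc@ step, assuming the CMake-based build described in the SBS-offline README (all paths are illustrative):

<pre>
# Build and install SBS-offline (see the README for the authoritative
# procedure; the analyzer must already be set up in your environment)
git clone https://github.com/JeffersonLab/SBS-offline.git
cd SBS-offline
cmake -B build -S . -DCMAKE_INSTALL_PREFIX=$PWD/install
cmake --build build -j4
cmake --install build

# Copy the hidden .rootrc to wherever you plan to run replays
cp install/run_replay_here/.rootrc /path/to/my/replay/dir/
</pre>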

h2. SBS Replay Environments

* The following lines (csh/tcsh syntax) should be used in a script to define where the data and output files are located:
<pre>
setenv SBS_REPLAY path-to-your-replay/SBS-replay
setenv DB_DIR $SBS_REPLAY/DB
setenv DATA_DIR /cache/mss/halla/sbs/raw
setenv OUT_DIR path-to-your-volatile/rootfiles
setenv LOG_DIR path-to-your-volatile/logs
setenv ANALYZER_CONFIGPATH $SBS_REPLAY/replay
</pre>
* @DATA_DIR@ tells the replay where the EVIO files are.
* @OUT_DIR@ tells the replay where to put the output ROOT files.
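
If your shell is bash rather than csh/tcsh, the equivalent setup uses @export@; the placeholder paths are the same as above:

<pre>
# bash equivalent of the csh/tcsh setup above (paths are placeholders)
export SBS_REPLAY=path-to-your-replay/SBS-replay
export DB_DIR=$SBS_REPLAY/DB
export DATA_DIR=/cache/mss/halla/sbs/raw
export OUT_DIR=path-to-your-volatile/rootfiles
export LOG_DIR=path-to-your-volatile/logs
export ANALYZER_CONFIGPATH=$SBS_REPLAY/replay
</pre>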

h2. Running the SBS Replay

* The main replay script, with all detectors, is @replay_gmn.C@, located at https://github.com/JeffersonLab/SBS-replay/tree/master/replay
* An example of running it from a shell script can be found at @/work/halla/sbs/puckett/GMN_ANALYSIS/run_GMN_swif2.csh@ (an illustrative interactive invocation follows this list)
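
For a quick interactive test, the replay script is run through the analyzer. The argument list below is an assumption for illustration only; check @replay_gmn.C@ itself for its actual parameter list:

<pre>
# Run interactively from a directory containing the .rootrc described above.
# The arguments (run number, event count) are illustrative; see
# replay_gmn.C for the actual parameter list.
analyzer -b -q 'replay_gmn.C(13444,-1)'
</pre>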

h1. Working example scripts for GMN analysis on the batch farm

The most efficient and convenient way to analyze GMN data on the batch farm is the swif2 system. A general overview of swif2 is available from the computer center's documentation, "here":https://scicomp.jlab.org/cli/create.html. The first step in using swif2 is setting up a "workflow" under your CUE account with "swif2 create", as documented "here":https://scicomp.jlab.org/cli/create.html. Once you have created a workflow, it can be used to launch jobs on the batch farm. The general command-line reference for swif2 can be found "here":https://scicomp.jlab.org/cli/swif.html. Working example scripts to launch GMN replay jobs on the batch farm can be found at

<pre>
/work/halla/sbs/puckett/GMN_ANALYSIS/launch_GMN_replay_swif2.sh
/work/halla/sbs/puckett/GMN_ANALYSIS/run_GMN_swif2.sh
</pre>

These scripts both refer to directories and workflows that are specific to the "puckett" user account on the farm. They should be viewed as templates and examples for you to copy to your own work disk area and develop into your own scripts and workflows. The first of the two scripts takes just two arguments: a run number and a maximum segment number. The proper usage is:

<pre>
./launch_GMN_replay_swif2.sh runnum maxsegment
</pre>

Here "runnum" refers to the CODA run number and "maxsegment" is the highest EVIO file segment number to be replayed.

The script will create one batch job per EVIO file (assuming the file exists in /mss/halla/sbs/raw) and add it to the workflow "puckett_GMN_analysis".

So, for example, if run 99999 has the following file segments:

<pre>
/mss/halla/sbs/raw/e1209019_99999.evio.0.0
/mss/halla/sbs/raw/e1209019_99999.evio.0.1
/mss/halla/sbs/raw/e1209019_99999.evio.0.2
/mss/halla/sbs/raw/e1209019_99999.evio.0.3
/mss/halla/sbs/raw/e1209019_99999.evio.0.4
/mss/halla/sbs/raw/e1209019_99999.evio.0.5
/mss/halla/sbs/raw/e1209019_99999.evio.0.6
/mss/halla/sbs/raw/e1209019_99999.evio.0.7
/mss/halla/sbs/raw/e1209019_99999.evio.0.8
/mss/halla/sbs/raw/e1209019_99999.evio.0.9
</pre>

Then @./launch_GMN_replay_swif2.sh 99999 9@ will create one job for each file in the list above, segments 0 through 9 inclusive. In other words, the second command-line argument is the segment number of the last file to be replayed, NOT the total number of segments; since segment numbering always starts at zero, the number of jobs is generally Njobs = maxsegment + 1.
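
The same arithmetic in a small shell illustration (this loop only prints the file names; the real script creates one swif2 job per file):

<pre>
# Files implied by runnum=99999, maxsegment=9.
# Segments run 0..maxsegment inclusive, so Njobs = maxsegment + 1 = 10.
runnum=99999
maxsegment=9
for seg in $(seq 0 $maxsegment); do
    echo /mss/halla/sbs/raw/e1209019_${runnum}.evio.0.${seg}
done
</pre>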

After executing this script, if the workflow is not already running, tell it to start releasing jobs to the batch farm with:

<pre>
swif2 run puckett_GMN_analysis
</pre>

For your own workflow, you would replace "puckett_GMN_analysis" with the name of your swif2 workflow.
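
If you have not yet created a workflow of your own, it is set up with "swif2 create" as described in the documentation linked above. A minimal sketch, following the same positional style as the @swif2 run@ command above (the workflow name is a placeholder; check the create documentation for the exact option syntax):

<pre>
# Create a new swif2 workflow under your CUE account (name is a placeholder)
swif2 create username_GMN_analysis
</pre>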

The second script (run_GMN_swif2.sh) takes care of setting up the environment on the farm node, actually executing the analyzer, and copying the output files of the replay job to an appropriate directory on /volatile. You do not need to call this script directly; it is called with appropriate arguments by the swif2 jobs created by the first script.
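
Once jobs have been submitted, the workflow can be monitored from the command line. A minimal sketch, assuming the swif2 subcommands from the command-line reference linked above (the workflow name is again a placeholder):

<pre>
# Summarize workflow state: job counts, running/succeeded/failed totals
swif2 status username_GMN_analysis
</pre>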