Revision 15 (Richard Trotta, 04/30/2019 03:53 PM) → Revision 16/18 (Richard Trotta, 11/24/2019 03:09 PM)
h1. Analysis How-To

{{>toc}}

h2. How should I analyze data?

* Doing things locally will always be your best option for actual analysis.
** Fork a repo of hallc_replay_kaonlt %{color:red}and% UTIL_KAONLT for your own custom version that you can play with !fork.png!
* Once you have forked the repo, clone hallc_replay_kaonlt to a local directory

<pre><code class="bash">
$USER> git clone https://github.com/USER/hallc_replay_kaonlt.git
</code></pre>

* Now the tricky part: UTIL_KAONLT is a submodule of hallc_replay_kaonlt, so some intermediate steps are needed

<pre><code class="bash">
$USER/hallc_replay> git submodule update --init --recursive
</code></pre>

** Check .gitmodules to make sure the submodule is listed

<pre><code class="bash">
[submodule "UTIL_KAONLT"]
    path = UTIL_KAONLT
    url = https://github.com/USER/UTIL_KAONLT
    branch = <branchname>
</code></pre>

<pre><code class="bash">
$USER/hallc_replay> git submodule update --recursive --remote
</code></pre>

* If HEAD is detached (check with git branch -a)

<pre><code class="bash">
$USER/hallc_replay> cd UTIL_KAONLT
$USER/hallc_replay/UTIL_KAONLT> git branch -a
* (HEAD detached at ###)
</code></pre>

* Check if HEAD is really detached

<pre><code class="bash">
$USER/hallc_replay/UTIL_KAONLT> git symbolic-ref HEAD
fatal: ref HEAD is not a symbolic ref
</code></pre>

<pre><code class="bash">
$USER/hallc_replay/UTIL_KAONLT> git remote update
Fetching origin
</code></pre>

* Change to the master branch

<pre><code class="bash">
$USER/hallc_replay/UTIL_KAONLT> git checkout master
Switched to branch 'master'
Your branch is up-to-date with 'origin/master'
</code></pre>

* Pull and check the branch again; everything should be set!
<pre><code class="bash">
$USER/hallc_replay/UTIL_KAONLT> git pull
Already up-to-date
$USER/hallc_replay/UTIL_KAONLT> git branch -a
* master
</code></pre>

* Now that we have our repo locally, we should set it up to pull from the “main” JeffersonLab version
* First check your remote “origin” repo (this is where you will push to)

<pre><code class="bash">
$USER/hallc_replay> git remote -v
origin  https://github.com/USER/hallc_replay_kaonlt.git (fetch)
origin  https://github.com/USER/hallc_replay_kaonlt.git (push)
</code></pre>

* Next, let's set up the “upstream”, which is the JeffersonLab repo (DO NOT push HERE)

<pre><code class="bash">
$USER/hallc_replay> git remote add upstream https://github.com/JeffersonLab/hallc_replay_kaonlt.git
$USER/hallc_replay> git remote -v
origin  https://github.com/USER/hallc_replay_kaonlt.git (fetch)
origin  https://github.com/USER/hallc_replay_kaonlt.git (push)
upstream  https://github.com/JeffersonLab/hallc_replay_kaonlt.git (fetch)
upstream  https://github.com/JeffersonLab/hallc_replay_kaonlt.git (push)
</code></pre>

* You will not be able to push to upstream unless you’re Stephen or me, so don’t worry too much. Just be cautious.
* A similar procedure can be performed with UTIL_KAONLT
* Finally, let's talk about branches. Let’s add the develop branch to our local system…
** First create a branch locally called develop and change to it

<pre><code class="bash">
$USER/hallc_replay> git branch develop
$USER/hallc_replay> git checkout develop
M  UTIL_KAONLT
M  UTIL_OL
Switched to branch 'develop'
</code></pre>

* Now simply pull develop

<pre><code class="bash">
$USER/hallc_replay> git pull origin develop
</code></pre>

* To create a new branch you must first create it on GitHub !newbranch.png!
* Then simply repeat the branch setup steps described above

h2. Navigating hallc_replay_lt

* The GitHub repository can be found "here":https://github.com/JeffersonLab/hallc_replay_lt
* Attached is a summary of all directories and files in hallc_replay_lt [%{background:yellow}ONGOING%] attachment:hallc_replay_outline.pdf

h2. Getting .dat files from tape

* If the .dat file for a particular run is not in /cache/hallc/spring17/raw, follow the instructions below...
** In /cache/hallc/spring17/raw type:

<pre>
> jcache get /mss/hallc/spring17/raw/<YourRawFile>.dat
</pre>

** This will take a little while to process. You can check the status of your request by typing:

<pre>
> jcache pendingRequest <JlabUserName>
</pre>

*More information on using jcache can be found at* https://scicomp.jlab.org/docs/

h2. Replaying

* Before we can analyze we must replay. This should be done on the farm to save yourself time and local CPU effort. The easiest way is to do a batch job submission, but this comes with some prep work.
* Before a batch submission, I highly encourage two preliminary steps
## Do all debugging of replays locally; once this works, move to the farm
## Once on the farm you have two options: your ifarm version or our group environment (discussed in the Group environment section). This is for final debugging, to make sure everything works on the farm; then you can submit a batch job. Save the ROOT files in /volatile/hallc/c-kaonlt/<USER> (note: volatile is NOT backed up)
* There is a batch script that I created and that Stephen has changed, with the help of Brad, to make sure it will not mess things up. Again, I highly recommend the two steps above before moving on to this script, or you will be wasting time and resources.
* Your final batch job submissions can be saved directly to tape.

h2. Group environment

* You can do replays under your farm directory or you can use our group environment.
* We have set up a group environment with a version of our repo that currently mimics the cdaq as closely as possible (although an updated hcana is used).
** This group environment is under /u/group/c-kaonlt
** I have made a directory USERS which you can use for personal replay scripts and environments. DO NOT change any replays that are not under USERS without contacting Stephen or me first.
** There is an hcana already set up here; use this for any group replays. If you would like to use a different version of hcana, please use your farm directory. If I find an hcana in USERS I will destroy it.
* You may have issues with hcana; make sure you are in the JLab software environment version 2.1
** source /site/12gev_phys/softenv.csh 2.1 (or .sh if using bash)
* The group environment has a 100 GB quota and is backed up. This means two important things…
## DO NOT save ROOT files here! Ever!
## It’s backed up, so it's good for important calibration work (wink, wink)
* Upon the request of Stephen, any improper use of this environment will incur a penalty of one beer/bottle of single malt or an owl shift (depending upon severity).

h2. Writing to tape

* For information on writing to tape, read https://scicomp.jlab.org/docs/write-through-cache
* In your batch script, specify OUTPUT_FILE:/cache/hallc/kaonlt/USER/ROOTfiles/
** Material in /cache is automatically copied to tape after some time if it is static
** Small files (~1 MB) will not be backed up on tape
** Once copied to tape, you can view the tape stub (NOT the file itself) under /mss/hallc/kaonlt/…
** The tape does not handle overwriting well, so if you submit a job you must create a new "pass" directory…
*** -->jput ... file.root /mss/hallc/kaonlt/USER/ROOTfiles/pass1/
** The tape has FAR more space than we could get through, so do not worry about "filling" it
** Write to tape once you're happy with your code... just do it correctly

h2. Few more words of warning

* Do not write analysis to tape unless you are 100% certain it works correctly (and you don't want to repeat it very soon).
* For farm jobs, some info is included below
** See https://scicomp.jlab.org/docs/text_command_file for info on commands
** Do not set CPU above 1 (it will slow your job down in the queue, and hcana is single threaded anyway so you gain nothing)
** Farm/Auger project: c-kaonlt
** For TEMPORARY output, write to volatile (/volatile/hallc/c-kaonlt/USER); this space is NOT backed up!
** Specify the FULL path to this in your symbolic link
** Make sure the relevant directories are created
* You can use our work environment (/work/hallc/kaon), but this is not backed up and I will not be setting up an environment similar to group there. It’s a good place to put personal scripts if you don’t want to take up space in your farm directory.

h2. Analysis Starting Points

* Start from this page: https://redmine.jlab.org/projects/podd/wiki/Workshop2018
* In general, items with a (*) on that page denote an interactive tutorial component.
** If you log in to Bluejeans via https://jlab.bluejeans.com/ then you can view the recorded video sessions.
* Review the following presentations first:
** Farm Use and Computing Resources Tips and Tricks -- Brad Sawatzky
** Overview & Update of the Hall C Analyzer -- Eric Pooser
** Eric's talk from the 2019 Hall C Winter Collab meeting is also quite useful: https://www.jlab.org/indico/event/296/session/11/contribution/12/material/slides/0.pdf
* Then move on to the Tuesday 'Hall C' sessions, starting with the git howto:
** Effective Git use (*) -- Steve Wood
** Folks really do need to understand how to work with git. They should follow up on the tutorials/howtos Steve mentions in his talk before moving on.
* If you want a space to work, log on to the ifarm and create a directory under /group/c-kaonlt/Users/. For example:

<pre><code class="bash">
> cd /group/c-kaonlt/Users/
> mkdir trottar ## change 'trottar' to your username
> cd trottar ## change 'trottar' to your username
</code></pre>

* If you want to follow along with the interactive sessions, then you can get the VM (virtualbox) setup working. Most of the interactive 'howtos' are in the Tuesday afternoon session. Ole's talks are likely a good place to start with the VM. The video recordings may be of particular use here.
** 14:30 Reading and processing trees (part2) (*) -- Ole Hansen
* Next-level analysis would include working through the 'Hall C Analysis' talks in the Tuesday morning session. They cover calibration tasks that we will need to perform ourselves. They also outline how the various detector maps, cut files, and def files interact. (This is the kind of information I would like to get consolidated in a set of new 'howto' pages on the wiki.)
* All the software and steps outlined on the "old" pages will match what we will do very closely, so it is *not* wasted time to start with the above documentation.
* For those of you who can get started on a Detector Calibration procedure, you can start editing tasks in: https://redmine.jlab.org/projects/kltexp/wiki/Analysis_Tasks
** Start by selecting the task (i.e. detector) of interest
** If you have files/scripts that are not already online as part of a git repo, or similar, then you can upload a copy in "tasks":https://redmine.jlab.org/projects/kltexp/wiki/Analysis_Tasks (as stated above)
** Be sure to put a description in there too that says who you are and where you got the scripts from.
* Note that all of the calibration steps have already been done by existing groups in Hall C! Talk to folks on the previous experiments (if that isn't you) and have them show you the documentation and notes they already have. Start with just getting those notes linked to this page. If different groups have different procedures, you can provide links/references to both.
* Clean up/consolidation will be the next step.
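
The "pass" directory rule from the Writing to tape section can be sketched as a small shell helper. This is only an illustrative sketch under stated assumptions: next_pass_dir is a hypothetical function name, not part of hallc_replay_lt, UTIL_KAONLT, or any JLab tool, and it assumes pass directories are named pass1, pass2, and so on under a base directory that you can list from your shell. It only prints a directory name; it does not create anything and does not call jput.

```bash
#!/usr/bin/env bash
# Hypothetical helper (an assumption for illustration, not a JLab tool):
# print the first unused "passN" directory under a base directory, so a
# new job never writes into an existing pass directory.
next_pass_dir() {
    local base="${1:?usage: next_pass_dir <base-directory>}"
    base="${base%/}"          # drop one trailing slash, if present
    local n=1
    # Count upward until passN does not exist yet.
    while [ -e "${base}/pass${n}" ]; do
        n=$((n + 1))
    done
    printf '%s\n' "${base}/pass${n}"
}
```

For example, calling next_pass_dir on a ROOTfiles directory that already contains pass1 and pass2 prints the same path ending in pass3. If there is a gap in the numbering, the first unused number is returned, so check the result before pointing a batch job at it.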