Extracting files from SAM

by Greg Landsberg - Last update on 22-Nov-2005

Introduction

Here are instructions on how to extract (stage) files from SAM. They are applicable to data, MC, or any generic SAM query. In order to extract files you will need to know the following:

  1. The name of the SAM project that was used to create the dataset you want. This may be either your private query or a standard query from the Common Sample Group. For example, to extract the CAF root files for a particular MC run corresponding to the fixed p17.08.01 data, use CSG_CAF-MCv1-XXXXX as the project name, where XXXXX is the MC request ID. To stage the files of a particular CAF skim fixed with reco version p17.07.00, use CSG_CAF_SKIM_v3, where SKIM is the name of the skim, e.g. 2EMhighpt.
  2. The working directory to which you would like to copy the staged files. Note that on some systems (e.g., clued0) the SAM cache is distributed locally among the machines in the cluster, so it is not visible cluster-wide. You are therefore best off copying files to your working directory as soon as they are staged on a particular node and placed in its local cache. Make sure that there is enough free space in the working directory. If you omit the name of the directory, files will be staged but not copied.
  3. The maximum number of files you want to stage. If the skim you are working with is very large and you only need the first few files, this is a good way to reduce the output and processing time. If this parameter is not set, a maximum of 1000 files will be staged.
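
The naming patterns from item 1 and the free-space check from item 2 can be sketched in Python as follows. This is only an illustration: the function names are made up for this example and are not part of the sam_query script, and the request ID used in the usage lines is hypothetical.

```python
import shutil


def caf_mc_project(request_id):
    """Project name for the CAF root files of an MC request (fixed p17.08.01 data)."""
    return f"CSG_CAF-MCv1-{request_id}"


def caf_skim_project(skim):
    """Project name for a CAF skim fixed with reco version p17.07.00."""
    return f"CSG_CAF_{skim}_v3"


def free_gigabytes(directory):
    """Free space, in GB, on the file system that holds `directory`."""
    return shutil.disk_usage(directory).free / 1e9


if __name__ == "__main__":
    print(caf_skim_project("2EMhighpt"))   # skim name taken from the text above
    print(caf_mc_project("12345"))         # 12345 is a hypothetical request ID
    print(f"{free_gigabytes('.'):.1f} GB free in the current directory")
```

How much free space is enough depends on the size of the dataset, which this sketch does not try to estimate.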

Usage

The script, sam_query, is located on the clued0 cluster in the ~gll/sam/ directory. You can run it from your own work space. The script submits a SAM batch job; upon its completion your files will be copied to the working directory. Once the SAM job is finished, you will have to clean up by hand the SAM batch job log files in your home directory and the getroot_*.py* files in your working directory (or in the current directory if no working directory was given).
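
The manual clean-up of the getroot_*.py* files could be done with a small helper like the one below. It is a sketch, not part of sam_query: the function names are hypothetical, and it deliberately leaves the SAM batch job log files alone because their names are not specified here. Run it only after the SAM job has finished.

```python
import glob
import os


def leftover_getroot_files(directory="."):
    """List the getroot_*.py* files that a SAM job left in `directory`."""
    pattern = os.path.join(directory, "getroot_*.py*")
    return sorted(glob.glob(pattern))


def remove_leftovers(directory=".", dry_run=True):
    """Return the leftover files; delete them only when dry_run is False."""
    files = leftover_getroot_files(directory)
    if not dry_run:
        for path in files:
            os.remove(path)
    return files


if __name__ == "__main__":
    # Dry run by default: only report what would be removed.
    for path in remove_leftovers("."):
        print("would remove", path)
```

The pattern matches only names that begin with getroot_, so files such as _getroot.py and __getroot.py are not touched.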

The syntax is as follows:

% csh ~gll/sam/sam_query project_name [working_directory] [max_files]
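
For use from a Python driver, the same command line can be assembled as an argument list. This is a minimal sketch under two assumptions: that the arguments are strictly positional, so max_files can only be given together with a working directory, and that the script lives at ~gll/sam/sam_query as on clued0. The function name and the example directory are hypothetical.

```python
import os


def sam_query_command(project_name, working_directory=None, max_files=None):
    """Build the argument list for: csh sam_query project [workdir] [max_files]."""
    if max_files is not None and working_directory is None:
        # Assumption: the third argument cannot be given without the second.
        raise ValueError("max_files requires a working directory")
    # Expand ~gll here because no shell will do it when the list is executed.
    script = os.path.expanduser("~gll/sam/sam_query")
    command = ["csh", script, project_name]
    if working_directory is not None:
        command.append(working_directory)
    if max_files is not None:
        command.append(str(max_files))
    return command


if __name__ == "__main__":
    # /work/mydir is a hypothetical working directory.
    print(" ".join(sam_query_command("CSG_CAF_2EMhighpt_v3", "/work/mydir", 10)))
```

The resulting list can be passed to subprocess.run(..., check=True) on a machine where SAM is set up.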

Limitations

The script should be portable to other clusters, provided the script file locations in the script code are changed appropriately. The script uses two auxiliary files, _getroot.py and __getroot.py, located in the same directory as the script. You may have to set the environment variable SAM_STATION on your local cluster if it is not set by the setup sam command.
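
When running on another cluster, it may help to check SAM_STATION before submitting anything. The helper below is a hypothetical sketch and not part of sam_query; it only reads the environment and does not know which station name is right for a given cluster.

```python
import os


def sam_station(environ=os.environ):
    """Return the value of SAM_STATION, or raise if it is unset or empty."""
    station = environ.get("SAM_STATION", "").strip()
    if not station:
        raise RuntimeError(
            "SAM_STATION is not set; run 'setup sam' or set it by hand"
        )
    return station


if __name__ == "__main__":
    try:
        print("Using SAM station:", sam_station())
    except RuntimeError as error:
        print(error)
```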

Topic revision: r2 - 2006-09-18 - SergioLietti
 
