h1. FIFE Data Handling

h2. Overview

This is a synopsis of the full data handling documents:
* "(2014) FIFE Data Architecture":http://cd-docdb.fnal.gov/cgi-bin/ShowDocument?docid=5180
* "(2011) Intensity Frontier Computing Model and GRID Tools":http://cd-docdb.fnal.gov/cgi-bin/ShowDocument?docid=4082

h2. Kinds of data access

p. Computing jobs need to access various kinds of data, outlined here.

# Executables/libraries -- jobs need access to the actual code that will run.
# Conditions data -- calibration information, beam status information, etc. are generally kept in a database, and jobs need a way to access these data without overloading the databases.
# Input files -- should be obtained from a SAM-like data handling system that provides files in an order that can be retrieved efficiently, and transferred in a manner that does not pollute caches.
# Output files -- should be returned from the job, possibly to a location where they can be automatically registered in the data handling system.
# Logging/Monitoring -- information about job status should be communicated back to a central location to assist with monitoring.

h2. Storage resources

This is an executive summary of data resources, with some common characteristics.

For illustration, we refer to a project named *<code>hypot</code>*.

| RESOURCE | Net capacity | Net data rate | File size | Access limits | Interfaces | Comments |
| Bluearc app | Few TB | 0.5 GB/sec | any | none for common files | NFS /hypot/app, /grid/fermiapp/hypot | For executables, libraries, small common files |
| Bluearc data | 240 TB per volume | 0.5 GB/sec | 1 MB block | 5 files at once per project | NFS /hypot/data, /grid/data/hypot, FTP | For unmanaged project and cache files; use ifdh cp on the grid |
| dCache | 3 PB | Multi GB/sec | 1 MB block | automatic, hundreds? | NFS (SLF6+), dccp, webdav, FTP, xroot, etc. | For managed files; non-scratch files backed to Enstore |
| Enstore | 10+ PB | Multi GB/sec | 2+ GB | access via dCache | dCache | |

h3. DO

Use ifdh cp or fetch to move data to and from local disk on worker nodes (see the example after this list)
* <50 GB of local disk per job
* See the Auxiliary File task force for advice on highly shared files
* ifdh also works on OSG
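
A minimal sketch of this pattern, assuming the hypothetical *<code>hypot</code>* dCache scratch area described below (paths and file names are illustrative only):

   @ifdh cp /pnfs/hypot/data/scratch/users/$USER/input.root ./input.root     # stage input onto local worker disk@
   @ifdh cp ./result.root /pnfs/hypot/data/scratch/users/$USER/result.root   # copy the output back when the job is done@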

Use dCache for managed and high-throughput files
* archival - /pnfs/hypot/data
* scratch  - /pnfs/hypot/data/scratch/users/...
* directly available to SLF 6.4+ clients, with NFS 4.1
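
On an SLF 6.4+ node with the NFS 4.1 mount, these areas can also be used interactively like ordinary directories; a sketch with hypothetical paths (recall that existing dCache files cannot be modified in place):

   @ls -l /pnfs/hypot/data/scratch/users/$USER/        # dCache scratch appears as a normal directory@
   @cp /pnfs/hypot/data/myfile.root /tmp/myfile.root   # read a managed file through the mount@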

Use Bluearc for temporary user analysis files (project disk)
* /hypot/data

h3. DO NOT

Directly write or read Bluearc /hypot/data from grid jobs
* Limited disk heads per array, O(10s)
* Limited bandwidth, O(1 GByte/sec)
* Direct access by grid jobs at best slows everyone down drastically, producing alarms, idle grid slots, and sad interactive users.
* At worst this can crash the Bluearc servers.

Try to edit or rewrite dCache files; it won't work.

h2. Interfaces

Where possible, use web interfaces that can take advantage of grid squid caches and similar infrastructure.

| Data Type     | Tool           |
| Executables   | CVMFS          |
| Conditions    | NuConDB        |
| File metadata | samweb         |
| Input         | ifdh           |
| Output        | ifdh/FTS       |
| Logging       | ifdh/numsg     |
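
As an illustration of how these tools fit together (the file name is hypothetical, and this assumes the standard samweb and ifdh command-line clients are installed), one might look a file up in SAM before fetching it:

   @samweb get-metadata hypot_run1234_reco.root   # file metadata from SAM@
   @samweb locate-file hypot_run1234_reco.root    # where the file currently lives@

The location returned by SAM can then be copied from with ifdh cp, using one of the access methods described below.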

h2. [[FermiGridBlue|Fermigrid Bluearc Unmount Task Force]]

There have been ongoing issues with Bluearc overloads due to accidental direct access to Bluearc file systems from Fermigrid jobs. A short-term (Sep/Oct 2014) [[FermiGridBlue|Fermigrid Bluearc Unmount Task Force]] is preparing plans for eliminating these overloads.

h2. Access Methods to dCache for Interactive use

There are several access methods for interactive use of dCache files, including DCap, dccp, srm, and gridftp. Currently gridftp is the preferred method, and it is the default for our "ifdh cp" utility, which is the recommended tool for getting files in and out of dCache for experimenters.

h3. gridftp

Gridftp is the underlying file transfer mechanism used by SRM. Using it directly avoids some of the copy connection overhead imposed by SRM.

The ifdh utility, in the ifdhc ups product, is the recommended tool for doing gridftp copies for Fermilab experiments, and gridftp is currently the default transfer mechanism for copies in and out of dCache.

   @ifdh cp /pnfs/nova/scratch/users/mengel/test.txt /tmp/localfile.txt@

One can also give a full gsiftp: URI specifying the gridftp server, for example:

   @gsiftp://fndca1.fnal.gov/scratch@
   @gsiftp://fg-besman1.fnal.gov/grid/data@
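
Such a URI can also be used directly as the source or destination of an ifdh copy. For instance, for the scratch file from the earlier example (note the shortened path, explained in the paragraph below):

   @ifdh cp gsiftp://fndca1.fnal.gov/scratch/users/mengel/test.txt /tmp/localfile.txt@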

Note that our current dCache configuration hides the first four components of the /pnfs/fnal.gov/usr/<experiment-name>/... path when you do gridftp access (assuming the grid proxy you are using is mapped in the usual fashion).

h3. nfs v4.1

On an NFS 4.1-mounted filesystem you can do anything you normally do except modify file content.

   @mount -v -t nfs4 -o minorversion=1 localhost:/pnfs /pnfs/fs@

You can then use ordinary commands such as cp and rm.
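
For example (paths purely illustrative, following the hypothetical *<code>hypot</code>* areas above):

   @cp /pnfs/hypot/data/scratch/users/$USER/test.txt /tmp/test.txt   # read a file through the mount@
   @rm /pnfs/hypot/data/scratch/users/$USER/old_result.root          # remove a scratch file you own@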

For more information, please see:
https://srm.fnal.gov/twiki/bin/view/DcacheCorner/DcacheFAQ

h3. Webdav 

Web Distributed Authoring and Versioning (WebDAV) is an extension of the Hypertext Transfer Protocol (HTTP) that allows users to create and modify web content. Many operating systems provide built-in client support for WebDAV. To browse the namespace and download data, point a web browser at https://fndca4a.fnal.gov:2880. (This is read only.)

To access the data, the user needs to generate a grid certificate proxy like so:

$ grid-proxy-init
Your identity: /DC=org/DC=doegrids/OU=People/CN=Dmitry Litvintsev 257737
Enter GRID pass phrase for this identity:
Creating proxy .......................................... Done
Your proxy is valid until: Tue Feb 12 04:37:20 2013

Use curl commands like the following to put and get data through the WebDAV door.

Example of put:

$ curl -L --capath /etc/grid-security/certificates \
       --cert /tmp/x509up_u8637 -T /etc/fstab \
       https://fndca4a.fnal.gov:2880/fermigrid/volatile/fermilab/litvinse/curl.txt

Example of get:

$ curl -L --capath /etc/grid-security/certificates \
       --cert /tmp/x509up_u8637 \
       https://fndca4a.fnal.gov:2880/fermigrid/volatile/fermilab/litvinse/curl.txt \
       -o curl1.txt

More information is available at:
http://cd-docdb.fnal.gov/cgi-bin/RetrieveFile?docid=5050;filename=webdav.pdf;version=2

h3. DCap

DCap provides POSIX-like open, create, read, write, and lseek functions to the dCache storage. In addition there are some specific functions for setting the debug level, getting error messages, and binding the library to a network interface. The DCap protocol requires specification of the dCache server host, port number, and domain, in addition to the inclusion of "/usr" ahead of the storage group designation in the PNFS path. Its structure is shown here:

   @dcap://<serverHost>:<port>/pnfs/<storage_group>/usr/<filePath>@

See http://www-dcache.desy.de/manuals/libdcap.html for usage information.

h3. dccp

The dccp command provides cp-like functionality on the PNFS file system and has the following syntax:

   @dccp [options] source_file [destination_file]@
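
For instance, a sketch with a hypothetical *<code>hypot</code>* path, on a node where PNFS is mounted the traditional way rather than via NFS 4.1 (see the note below):

   @dccp /pnfs/hypot/data/scratch/users/$USER/test.root /tmp/test.root   # copy a file out of dCache with dccp@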

The options and command usage are described at http://www-dcache.desy.de/manuals/dccp.html. Note that on systems where PNFS is mounted via NFS 4.1, dccp will not work properly. In that case, just use cp or ifdh cp.

h3. srmcp

SRM is middleware for managing storage resources on a grid. The SRM implementation within dCache manages the dCache/Enstore system. It provides functions for file staging and pinning, transfer protocol negotiation, and transfer URL resolution.

The ifdh utility, in the ifdhc ups product, is the recommended tool for doing SRM copies for Fermilab experiments. SRM is not currently the default protocol for ifdh cp, so you need to specify it with a --force option to use it:

   @ifdh cp --force=srm /pnfs/nova/scratch/users/mengel/test.txt /tmp/localfile.txt@

You can also give a full SRM protocol URI for the remote file specification, which requires the SRM server host, port number, and domain. For the fnal.gov domain, the inclusion of "/usr" ahead of the storage group designation in the PNFS path is also required. Its structure is shown here:

   @srm://<serverHost>:<portNumber>/service/path?SFN=/<root of fileSystem>/<storage_group>[/usr]/<filePath>@

The first two examples are for the fnal.gov domain, the third for cern.ch:

   @srm://fndca1.fnal.gov:8443/srm/managerv2?SFN=/pnfs/fnal.gov/usr/nova/scratch@
   @srm://cdfdca1.fnal.gov:8443/srm/managerv2?SFN=/pnfs/fnal.gov/usr/cdfen/filesets/<filePath>@
   @srm://wacdr002d.cern.ch:9000/castor/cern.ch/user/<filePath>@

For details, please see:
http://www.fnal.gov/docs/products/enstore/enstore_may04/usingdcache.html#8346