Data file creation and subsequent data flow

Extracted from Brett's 8/2/2016 mail message [and a follow-up message from Maxim]:

Currently we are indeed preferring the design based on an Xrootd "SAN".
If there are any concerns, let's discuss them.

To maybe help drive questions for EOS experts, here is a rough run-through
of the actions I'm envisioning occurring on each file/stream as it goes
through DAQ/Buffer/FTS/EOS. In sequence:

1) DAQ opens Xrootd connection (with the central Xrootd redirector)
2) DAQ streams data into redirected Xrootd server
3) DAQ closes file/connection
4) Xrootd raises event announcing closed file
[Maxim adds this note: To add a tiny bit of precision to bullet "4", current thinking (as recommended to me by Andy Hanushevsky) is to use "ofs.notify" with appropriate "ofs.notifymsg"]
5) We handle this event by running a SAM metadata producer program
6) We send an HTTP POST to FTS to register the data and metadata files (steps 5-6 are sketched below)
[Maxim adds this note: the HTTP idea in "6" was proposed by Robert]
7) FTS initiates Buffer->EOS transfer over XRootd
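
A minimal sketch of steps 1-3, assuming the XRootD Python bindings
(pyxrootd) are available on the DAQ/buffer node; the redirector host and
destination path are placeholders, and error handling is reduced to
assertions:

    # Steps 1-3: stream a block of data into the Xrootd "SAN" through the
    # central redirector. Host name and path below are placeholders.
    from XRootD import client
    from XRootD.client.flags import OpenFlags

    REDIRECTOR = "root://xrootd-redirector.example.cern.ch"   # placeholder
    DEST_PATH = "/buffer/np04/raw/run000001_0001.dat"         # placeholder

    f = client.File()

    # 1) open via the central redirector, which sends us to a data server
    status, _ = f.open(REDIRECTOR + "/" + DEST_PATH, OpenFlags.NEW)
    assert status.ok, status.message

    # 2) stream data as it is built; a single dummy block stands in here
    offset = 0
    for block in [b"\x00" * (1024 * 1024)]:    # DAQ would yield real event data
        status, _ = f.write(block, offset)
        assert status.ok, status.message
        offset += len(block)

    # 3) close the file/connection; on the server side this is what raises
    #    the "file closed" event of step 4
    status, _ = f.close()
    assert status.ok, status.message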

During the spill we expect about 25 of these sequences to be in
operation concurrently/asynchronously.
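
For steps 5 and 6, a sketch of the handler run on the "file closed"
notification, assuming the HTTP interface to FTS proposed by Robert; the
FTS URL and the metadata fields are illustrative placeholders, not the
actual SAM schema or FTS API:

    # Steps 5-6: build a metadata record for the closed file and register
    # the data and metadata with FTS over HTTP.
    import json
    import os
    import urllib.request

    FTS_URL = "http://fts.example.fnal.gov:8080/fts/register"   # placeholder

    def handle_file_closed(path):
        # 5) produce SAM-style metadata for the newly closed file
        metadata = {
            "file_name": os.path.basename(path),
            "file_size": os.path.getsize(path),
            "checksum": "adler32:00000000",   # placeholder value
        }
        # 6) register the data file and its metadata with FTS via HTTP POST
        body = json.dumps({"file": path, "metadata": metadata}).encode("utf-8")
        req = urllib.request.Request(
            FTS_URL, data=body, headers={"Content-Type": "application/json"})
        with urllib.request.urlopen(req) as resp:
            return resp.status   # e.g. 200 on successful registration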


EOS and Castor questions

This list was supplied to Tanya for her August 2016 CERN visit.

The expectation is that there are multiple DAQ event builder nodes on which
files will be constructed on local file systems. Each file will have
associated metadata (possibly/additionally embedded within the file) and
checksum information. Alternatively, there may be one or more aggregation
nodes where raw data is collected. The main point is that there will be
storage local to the protoDUNE detector(s) that will buffer data and metadata.
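
As an illustration of the per-file checksum information, a sketch of an
adler32 checksum computed chunk-by-chunk over a buffered file (adler32 is
assumed here because EOS and Castor commonly use it for transfer
validation; the path is a placeholder):

    # adler32 of a (possibly multi-GB) raw-data file, computed in chunks so
    # the whole file never has to be held in memory.
    import zlib

    def adler32_of(path, chunk_size=8 * 1024 * 1024):
        value = 1   # adler32 of the empty string
        with open(path, "rb") as f:
            for block in iter(lambda: f.read(chunk_size), b""):
                value = zlib.adler32(block, value)
        return "%08x" % (value & 0xFFFFFFFF)

    # e.g. adler32_of("/buffer/np04/raw/run000001_0001.dat")   # placeholder path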

The rate at which raw data is produced is estimated to be as large as
20 Gbit/s. This is then the rate at which we are looking to move raw
data from local buffers to EOS, from EOS to Castor, and from EOS to
external sites (specifically Fermilab).
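
For orientation, the volumes implied by that peak rate (simple unit
conversion, using the ~25 concurrent streams mentioned above as an
assumption about how the rate is split):

    # Back-of-the-envelope numbers for a 20 Gbit/s peak rate.
    peak_bytes_per_s = 20e9 / 8            # 2.5e9 B/s, i.e. 2.5 GB/s
    per_stream = peak_bytes_per_s / 25     # ~100 MB/s per concurrent stream
    per_hour_tb = peak_bytes_per_s * 3600 / 1e12   # ~9 TB per hour at full rate
    print(peak_bytes_per_s, per_stream, per_hour_tb)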

These files need to be transferred to CERN EOS. The expected mechanism
is via 3rd-party transfers managed by the Fermilab FTS service. There
may be a single FTS server, or an FTS server on each of the event builder
nodes; but as the transfers are 3rd party, the number and location of
such servers should be irrelevant.
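
To make the 3rd-party aspect concrete, a sketch of the kind of transfer
FTS would drive, expressed here with xrdcp's third-party-copy option
invoked from Python; host names and paths are placeholders, and the real
FTS mechanics may differ:

    # Third-party XRootD copy: the data flows directly from the buffer node
    # to EOS, not through the machine issuing the command.
    import subprocess

    SRC = "root://np04-buffer.example.cern.ch//buffer/np04/raw/run000001_0001.dat"
    DST = "root://eos.example.cern.ch//eos/experiment/np04/raw/run000001_0001.dat"

    # "--tpc only" requests a pure third-party copy (fail rather than fall
    # back to routing the data through this host).
    subprocess.run(["xrdcp", "--tpc", "only", SRC, DST], check=True)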

The transfers may use any of the supported protocols, but specifically
we are interested in xrootd or gridftp. The questions for EOS and Castor
experts are then:
- What are the "doors" available into EOS?
  - Which protocols?
  - Which capabilities are available from remote (e.g. not FUSE-mounted)
    systems, e.g. "cp", "ls", "mv", "stat", ...?
  - If "ls" (e.g. via xrdls) is allowed, what rates are allowed?
- Is there a single target address, or multiple?
- How is the target path specified?
- How does EOS handle checksums, particularly in validating transfers?
- What limits bandwidth?
- Similar questions for transfers out of EOS, particularly:
  - How can checksums be utilized?
  - What limits bandwidth?
- How does a copy to Castor get invoked?
  - Can this be invoked from a 3rd party (i.e. from a system with neither
    EOS nor Castor "mounted")?
- How can one check that a file is successfully on tape?
- What are the expectations for "cleaning up" EOS?