Feature #10046

Read adler32 checksums directly from dCache

Added by Robert Illingworth about 5 years ago. Updated almost 3 years ago.

Status: Feedback
Priority: Normal
Target version:
Start date: 09/03/2015
Due date:
% Done: 100%
Estimated time: (Total: 0.00 h)
Duration:

Description

dCache supports accessing the checksum through a pseudo-file such as .(get)(/pnfs/...)(checksum). So if the input directory is in pnfs space, we can read the checksum directly from that pseudo-file and add it to the metadata. If the file already has metadata, we should instead compare the db checksum to the actual one and reject the file if there's a mismatch.

Since dzero/cdf don't support adler32 checksums, it should be possible to disable this feature.
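The add-or-compare behaviour described above could be sketched roughly as follows. This is only an illustration in Python 3: the function name `reconcile_adler32`, the `ChecksumMismatch` exception, the `enabled` flag and the `"adler32"` metadata key are all hypothetical and are not taken from the actual code base.

```python
class ChecksumMismatch(Exception):
    """Raised when the declared and actual adler32 values differ (hypothetical)."""


def reconcile_adler32(metadata, actual, enabled=True):
    """Add the dCache adler32 to metadata, or check it against an existing value.

    metadata: dict that may already hold a hex string under "adler32"
              (the key name is an assumption made for this sketch).
    actual:   hex string read from dCache.
    enabled:  set to False for experiments that do not use adler32.
    """
    if not enabled:
        return metadata  # feature switched off: leave the metadata untouched

    updated = dict(metadata)  # work on a copy, not the caller's dict
    declared = updated.get("adler32")
    if declared is None:
        # No checksum recorded yet: take the one dCache reports.
        updated["adler32"] = actual.lower()
    elif int(declared, 16) != int(actual, 16):
        # Compare numerically so case and leading zeros do not matter.
        raise ChecksumMismatch(
            "db checksum %s does not match dCache checksum %s" % (declared, actual)
        )
    return updated
```

Comparing the values as integers rather than as strings is a choice made for this sketch, so that differences in case or zero-padding are not reported as mismatches.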


Subtasks

Feature #10047: Compare enstore and adler32 checksums | Resolved | Robert Illingworth

History

#1 Updated by Robert Illingworth about 5 years ago

  • Assignee set to Dennis Box
  • Target version set to 818

#2 Updated by Dennis Box almost 5 years ago

  • Status changed from Assigned to Feedback
  • Assignee changed from Dennis Box to Robert Illingworth

I have pushed some changes to the origin/10046 git branch and reassigned this ticket to Robert for feedback.

changes:

config.py:readDCacheChecksums()
    Looks for an entry 'read-dcache-checksums' in the [main] section of the config file. If the entry is True or absent, the method returns True. If the entry is False, the method returns False.
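A minimal sketch of that lookup, assuming Python 3's standard `configparser` (the real config.py may be organised quite differently, and the function name `read_dcache_checksums` is only illustrative):

```python
import configparser


def read_dcache_checksums(config):
    """Return the 'read-dcache-checksums' flag from [main], defaulting to True."""
    # fallback=True covers both a missing entry and a missing [main] section.
    return config.getboolean("main", "read-dcache-checksums", fallback=True)


# Example: an experiment without adler32 support turns the feature off.
example = configparser.ConfigParser()
example.read_string("[main]\nread-dcache-checksums = False\n")
feature_enabled = read_dcache_checksums(example)
```

Here `feature_enabled` ends up False, while a config file with no such entry would give True.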
pnfs.py:getEnstoreFileDetails()
    New optional parameter, readFromDCache=True. If true, reads the adler32 checksum from the dCache metadata and puts it in the dictionary that the method returns.
    UPDATE: after talking with Robert before I hit 'send', I see this needs work: I was splitting off ADLER32 from ADLER32:some_value.
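One way to handle the ADLER32:some_value form mentioned in the UPDATE is sketched below. It assumes the pseudo-file yields a single `ALGORITHM:value` string; the helper names are hypothetical, and the caller is expected to supply the pseudo-file path in whatever form the dCache mount requires.

```python
def parse_dcache_checksum(raw):
    """Split a string such as 'ADLER32:some_value' into (algorithm, value)."""
    algorithm, separator, value = raw.strip().partition(":")
    if not separator or not algorithm or not value:
        raise ValueError("unexpected checksum format: %r" % raw)
    # Lower-case both parts so later comparisons are case-insensitive.
    return algorithm.lower(), value.lower()


def read_dcache_checksum(pseudo_path):
    """Read and parse the checksum pseudo-file at pseudo_path."""
    with open(pseudo_path) as handle:
        return parse_dcache_checksum(handle.read())
```

Keeping the parsing separate from the file access means the string handling can be unit-tested without a pnfs mount.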
filestate.py:_checkLabel()
    Calls getEnstoreFileDetails() with the config.readFromDCache() value.
filestate.py:_handlePnfsResults()
    Flow of control used to switch on whether getEnstoreFileDetails() returned an empty dictionary or not. Since the enstore (layer4) and dCache (get) metadata can now both be read, the returned dictionary can have one entry in it if layer4 is unreadable. Flow of control now switches on whether the number of entries in the dictionary is greater than 1.
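The changed test in _handlePnfsResults() amounts to something like the sketch below. The helper name and the dictionary keys in the examples are hypothetical; only the "more than one entry" rule comes from the description above.

```python
def has_enstore_details(details):
    """Return True if the details dictionary holds more than one entry.

    A dictionary with a single entry is treated as "layer4 unreadable",
    because the dCache checksum lookup alone can contribute one entry.
    """
    return len(details) > 1


# Hypothetical dictionaries as the lookup might return them.
layer4_and_dcache = {"bfid": "some_bfid", "crc": "some_crc", "adler32": "some_value"}
dcache_only = {"adler32": "some_value"}
nothing_read = {}
```

With these examples, only `layer4_and_dcache` passes the check. The rule relies on the dCache lookup contributing at most one entry to the dictionary.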
New directory python/test, with new files:
    test_config.py    Tests readFromDCache() use of the 'read-dcache-checksums' entry
    test_config.ini   Input file for test_config.py
    test_pnfs.py      Tests that layer4 and dCache metadata are being read correctly
Usage of the tests:
    export CONFIG_FILE=`pwd`/test_config.ini   # or whatever file you want
    trial test_config.py
    export PNFS_TEST_FILE=/pnfs/some/file/that/is/readable/and/in/enstore
    trial test_pnfs.py
NB: both of the test files have only one test_(something) method in their suite. Adding a second test to test_pnfs.py confused the reactor. I am working on figuring out why; unit testing is in general a good thing.

Questions:

#3 Updated by Robert Illingworth almost 3 years ago

  • Target version changed from 818 to v6_0_0

