Read adler32 checksums directly from dCache
dCache supports accessing the checksum through a pseudo file like
.(get)(/pnfs/...)(checksum). If the input directory is in pnfs space, we can read the checksum directly from this pseudo file and add it to the metadata. If the file already has metadata, we should compare the db checksum to the actual one and reject the file if there is a mismatch.
Since dzero/cdf don't support adler32 checksums, it should be possible to disable this feature.
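A minimal sketch of the idea, not the implementation on any branch. It assumes the pseudo file lives in the same directory as the data file and takes the bare file name inside the second pair of parentheses. The function names and the example path are hypothetical.

```python
import posixpath


def dcache_checksum_path(file_path):
    """Build the pseudo-file path that exposes the checksum of file_path.

    Assumption: the pseudo file is addressed as
    <directory>/.(get)(<file name>)(checksum).
    """
    directory, name = posixpath.split(file_path)
    return posixpath.join(directory, ".(get)(%s)(checksum)" % name)


def checksums_agree(db_adler32, dcache_adler32):
    """Compare two hex adler32 strings, ignoring case and leading zeros."""
    return int(db_adler32, 16) == int(dcache_adler32, 16)


# Hypothetical path, for illustration only.
print(dcache_checksum_path("/pnfs/exp/data/file.root"))
```

Comparing the values as integers rather than as strings avoids rejecting a file only because one side dropped a leading zero or used upper case.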
#2 Updated by Dennis Box over 5 years ago
- Status changed from Assigned to Feedback
- Assignee changed from Dennis Box to Robert Illingworth
I have pushed some changes to origin/10046 git branch, and re-assigned this ticket to Robert for feedback.
config.py:readDCacheChecksums() Looks for an entry 'read-dcache-checksums' in the [main] section
of the config file. If the entry is True or absent, the method returns True.
If the entry is False, the method returns False.
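The default-to-True behaviour can be sketched as follows. This is an illustration using the Python 3 standard library configparser, not the code in config.py; the function name read_dcache_checksums is hypothetical.

```python
import configparser


def read_dcache_checksums(parser):
    """Return the 'read-dcache-checksums' setting, defaulting to True.

    The fallback covers both a missing option and a missing [main] section.
    """
    return parser.getboolean("main", "read-dcache-checksums", fallback=True)


disabled = configparser.ConfigParser()
disabled.read_string("[main]\nread-dcache-checksums = False\n")

absent = configparser.ConfigParser()
absent.read_string("[main]\n")

print(read_dcache_checksums(disabled))  # False
print(read_dcache_checksums(absent))    # True
```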
pnfs.py:getEnstoreFileDetails() New optional parameter, readFromDCache=True. If true,
reads the adler32 checksum from the dcache metadata
and puts it in the dictionary that the method returns.
UPDATE: after talking with Robert before I hit 'send', I see this
needs work: I was splitting off ADLER32 from ADLER32:some_value
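One way to handle the ADLER32:some_value form is to keep both the type and the value, keyed by type. This is only a sketch under assumptions: that several checksums, if present, are comma-separated, and that the value is hex text. The function name is hypothetical.

```python
def parse_dcache_checksums(text):
    """Parse 'TYPE:value[,TYPE:value...]' into a {type: value} dict.

    Types and values are lower-cased so later comparisons are uniform.
    """
    checksums = {}
    for item in text.strip().split(","):
        if not item.strip():
            continue
        kind, sep, value = item.partition(":")
        if not sep:
            raise ValueError("unexpected checksum entry: %r" % item)
        checksums[kind.strip().lower()] = value.strip().lower()
    return checksums


print(parse_dcache_checksums("ADLER32:0A1B2C3D\n"))
```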
filestate.py:_checkLabel() Calls getEnstoreFileDetails() with config.readFromDCache() value
filestate.py:_handlePnfsResults(): Flow of control used to switch on whether getEnstoreFileDetails() returned
an empty dictionary or not. Since enstore (layer4) and dcache (get)
metadata can both now be read, the returned dictionary can have one entry
in it if layer4 is unreadable. Flow of control now switches on whether the number
of entries in the dictionary is greater than 1.
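The changed test can be illustrated with a small sketch. The dictionary keys and values below are hypothetical and only stand in for whatever getEnstoreFileDetails() actually returns.

```python
def layer4_was_read(details):
    """True when the dictionary holds more than the dCache checksum alone."""
    return len(details) > 1


# Hypothetical results: layer4 unreadable, so only the checksum is present.
only_checksum = {"adler32": "0a1b2c3d"}

# Hypothetical results: layer4 readable, so enstore fields are present too.
with_layer4 = {"adler32": "0a1b2c3d", "volume": "VOL001", "size": 1024}

print(layer4_was_read(only_checksum))  # False
print(layer4_was_read(with_layer4))    # True
```

The old emptiness test would treat the checksum-only dictionary as a successful layer4 read, which is why the threshold moves to more than one entry.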
new directory python/test
new files test_config.py Tests readFromDCache() use of 'read-dcache-checksums'
test_config.ini input file for test_config.py
test_pnfs.py Tests that layer4 and dcache metadata are being read correctly
Usage of the tests:
export CONFIG_FILE=`pwd`/test_config.ini (or whatever file you want)
NB both of the test files only have one test_(something) method in their suite. Adding a second
test to test_pnfs.py confused the reactor. I am working on figuring out why, since unit testing is in
general a good thing.