Review Request #6135: Data Management Workflow Umbrella Task
Can't check checksum in /pnfs/scratch, breaks some storage methods
dCache doesn't allow direct file reads, so we can't compute checksums for files written to /pnfs/minerva/scratch.
This breaks the checks we make before putting files into the dropbox.
We either need to re-enable checksum validation or eliminate all possible race conditions from production.
Right now, when multiple copies of a file are produced, /pnfs/ keeps the first version while the metadata on bluearc comes from the last.
1) Declare metadata on the farm node. This will force the first copy to have correct metadata; erase the output if the declaration fails. Will require substantial testing.
2) Find a way to do checksums on /pnfs/. It's dangerous not to be able to validate transfers anyway. Switch to a different checksum that does work? Have ifdh write the checksum someplace?
3) Put in a robust locking system to prevent multiple submissions. We have this for keepup (it works 99.9% of the time), but failures still occur occasionally.
4) Put a unique timestamp in each filename. This opens us up to multiple copies with the same content, so we will need to add a checker to prevent such duplicates.
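As a rough illustration of option 3, here is a minimal sketch of a per-file submission lock built on an atomic mkdir. It is not the existing keepup locking; LOCK_ROOT, acquire_lock and release_lock are hypothetical names, and the atomicity assumption should be verified on whatever filesystem actually holds the locks.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: one lock directory per output name.
# LOCK_ROOT and both function names are illustrative, not part of any existing tool.
LOCK_ROOT="${LOCK_ROOT:-/tmp/submit-locks}"

acquire_lock() {
    # $1 is a plain name without slashes, e.g. the output file's basename.
    # mkdir either creates the directory or fails, so only one caller can win.
    mkdir -p "$LOCK_ROOT" && mkdir "$LOCK_ROOT/$1.lock" 2>/dev/null
}

release_lock() {
    # Remove the lock directory once the submission has finished or failed.
    rmdir "$LOCK_ROOT/$1.lock" 2>/dev/null
}
```

A submission script would call acquire_lock before submitting and skip the job if it returns non-zero. A stale lock left by a crashed job would still need manual or time-based cleanup, which this sketch does not attempt.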
#1 Updated by Gabriel Perdue over 5 years ago
Do files in dCache already have checksums?
There is a command-line workaround that streams the file through a named pipe:
mknod /tmp/fifo$$ p
samweb file-checksum /tmp/fifo$$ &
ifdh cp "$1" /tmp/fifo$$ &
wait
rm -f /tmp/fifo$$
This recipe is from Marc Mengel, but we may want something built directly into samweb that offers a more unified interface.
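Until something like that exists, the named-pipe recipe could be wrapped in a small function. The sketch below is only an assumption about how such a wrapper might look: stream_checksum, COPY_CMD and SUM_CMD are hypothetical names, and the defaults (cp and cksum) are local stand-ins so the function can be tried without ifdh or samweb. It also cleans up the pipe and avoids hanging when the copy fails before opening it.

```shell
#!/usr/bin/env bash
# Sketch: checksum a file by streaming it through a named pipe.
# COPY_CMD and SUM_CMD default to local stand-ins; on site one would set
# COPY_CMD="ifdh cp" and SUM_CMD="samweb file-checksum" as in the recipe above.
COPY_CMD="${COPY_CMD:-cp}"
SUM_CMD="${SUM_CMD:-cksum}"

stream_checksum() {
    local src="$1" dir fifo reader status
    dir="$(mktemp -d)" || return 1
    fifo="$dir/fifo"
    mkfifo "$fifo" || { rm -rf "$dir"; return 1; }
    # Reader runs in the background and blocks until the writer opens the pipe.
    # The variables are left unquoted on purpose so "ifdh cp" splits into words.
    $SUM_CMD "$fifo" > "$dir/sum" &
    reader=$!
    if $COPY_CMD "$src" "$fifo"; then
        wait "$reader"
        status=$?
        cat "$dir/sum"
    else
        # The copy failed: stop the reader in case it is still waiting on the pipe.
        kill "$reader" 2>/dev/null
        wait "$reader" 2>/dev/null
        status=1
    fi
    rm -rf "$dir"
    return "$status"
}
```

Called as stream_checksum followed by a file path, it prints whatever the checksum command reports for the pipe and returns non-zero if either the copy or the checksum step fails.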