Feature #7021

Evil feature -- match blah-clued0:/some/path and use rsync...

Added by Marc Mengel over 5 years ago. Updated about 5 years ago.

Status: Closed
Priority: Normal
Assignee:
Target version:
Start date: 09/16/2014
Due date:
% Done: 80%
Estimated time: 0.50 h
Spent time:
Duration:

Description

To make life easier for the Dzero data preservation effort, we want ifdh
to recognize paths like
foo-clued0:/some/path
and use "rsync" to copy to them, and also to set
CPN_LOCK_GROUP=dzero/foo-clued0
so that each clued0 host used this way gets its own lock group.
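
The mapping from destination path to lock group described above can be sketched as follows. This is only an illustration: clued0_lock_group is a hypothetical helper, not a function from ifdh, and it assumes the destination has no user@ prefix and no .fnal.gov suffix.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not taken from ifdh: returns "dzero/<host>" for a
// destination like "foo-clued0:/some/path", or an empty string otherwise.
// Assumes there is no "user@" prefix in front of the host name.
std::string clued0_lock_group(const std::string &dest) {
    const std::string marker = "-clued0:";
    std::string::size_type pos = dest.find(marker);
    if (pos == std::string::npos) {
        return std::string();
    }
    // Keep everything up to and including "-clued0", dropping the ':'.
    std::string host = dest.substr(0, pos + marker.size() - 1);
    return "dzero/" + host;
}

// Small usage check for the example path from the description.
void check_lock_group_examples() {
    assert(clued0_lock_group("foo-clued0:/some/path") == "dzero/foo-clued0");
    assert(clued0_lock_group("/some/local/path").empty());
}
```

The returned string is what would be placed in CPN_LOCK_GROUP before running the copy.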

History

#1 Updated by Marc Mengel over 5 years ago

  • Description updated (diff)
  • Assignee set to Marc Mengel
  • Target version set to v1_5_1
  • % Done changed from 0 to 80
  • Estimated time set to 0.50 h

This change is in ee42ab3. It shouldn't affect other code much, and it
only works for ifdh cp (not ifdh ls, etc.).

If we need the other tools (ifdh ls, etc.), we could add that too.

#2 Updated by Marc Mengel over 5 years ago

Here's how a quick test looks, after setting up a lock area for nautilus-clued0.

<novagpvm02> IFDH_DEBUG=1 ifdh cp verify_pnfs_checksum.py nautilus-clued0:/tmp/verify_pnfs_checksum.py 
ifdh constructor: _baseuri is 'http://samweb.fnal.gov:8480/sam/nova/api'
entering ifdh::cp( verify_pnfs_checksum.py nautilus-clued0:/tmp/verify_pnfs_checksum.py adding cwd to verify_pnfs_checksum.py
parent of /tmp/verify_pnfs_checksum.py is /tmp
parent of /tmp/verify_pnfs_checksum.py is /tmp
local_access(/tmp , 4) -- local returning 0
In clued0 hack case..
Got addrs...
Saw ipv4 for Fermilab: onsite
LOCK - Wed Sep 17 15:14:57 UTC 2014 lock  /grid/data/dzero/nautilus-clued0/LOCK/LOCKS/20140917.15:14:57.0.novagpvm02.28482.mengel.mengel
running: rsync /tmp/verify_pnfs_checksum.py nautilus-clued0:/tmp/verify_pnfs_checksum.py 
ifdh cp: transferred: 2570 bytes in 4.0339 seconds 
LOCK - Wed Sep 17 15:14:58 UTC 2014 freed /grid/data/dzero/nautilus-clued0/LOCK/LOCKS/20140917.15:14:57.0.novagpvm02.28482.mengel.mengel
is_directory(/tmp/verify_pnfs_checksum.py) -> 0

#3 Updated by Kenneth Herner over 5 years ago

This would indeed be a great feature. Two points for additional improvements:

1) So that files are owned by the user as opposed to the group account, the rsync command should be something like rsync my_files ${GRID_USER}@foo-clued0:/some/path

2) Some D0 users copy back to so-called "/prj_root" areas, which are disks mounted as /projects disks on bluearc (and some dedicated D0 NAS machines). So the output host may not always be foo-clued0, but could be d0srv123, for example. Maybe the D0 code could stick something like "D0:" in the front of the path, and that would trigger the rsync bit?
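
Point 1 could be sketched as below. with_grid_user is a hypothetical helper used only for illustration, not part of ifdh; it assumes the user name (for example the value of GRID_USER) is passed in by the caller, and it leaves destinations that already carry a user@ prefix untouched.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not taken from ifdh: prepend "user@" to an rsync
// destination such as "foo-clued0:/some/path" unless one is already there.
std::string with_grid_user(const std::string &dest, const std::string &user) {
    std::string::size_type colon = dest.find(':');
    std::string::size_type at = dest.find('@');
    // An '@' counts as a user prefix only if it appears before the first ':'.
    bool has_user = at != std::string::npos &&
                    (colon == std::string::npos || at < colon);
    if (user.empty() || has_user) {
        return dest;
    }
    return user + "@" + dest;
}

// Small usage check.
void check_with_grid_user_examples() {
    assert(with_grid_user("foo-clued0:/some/path", "mengel") ==
           "mengel@foo-clued0:/some/path");
    assert(with_grid_user("ken@foo-clued0:/some/path", "mengel") ==
           "ken@foo-clued0:/some/path");
}
```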

#4 Updated by Marc Mengel over 5 years ago

Went a little fancier:

bool
is_dzero_node_path( std::string path ) {
    // It could be a clued0 node, or it could be a d0srv node;
    // the d0srv is either at the front of the path, or has
    // user@ in front of it.
    return path.find("-clued0:") != std::string::npos ||
           path.find("-clued0.fnal.gov:") != std::string::npos ||
           (path.find("d0srv") == 0 && path.find(':') != std::string::npos) ||
           (path.find("d0srv") == path.find("@") + 1 &&
            path.find(':') != std::string::npos);
}

so it should match d0srvxxx: nodes, and blah-clued0: nodes, with or without
fnal.gov on the end.
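
For reference, here is a self-contained check of that predicate against a few sample paths. The function body is the one posted above; the sample paths and the check_dzero_examples name are made up for illustration. One detail worth noting: when the path has no "@", find("@") returns std::string::npos, and npos + 1 wraps around to 0, so the last clause then reduces to the "d0srv at the front" case.

```cpp
#include <cassert>
#include <string>

// Same logic as the function posted in this comment.
bool is_dzero_node_path(std::string path) {
    return path.find("-clued0:") != std::string::npos ||
           path.find("-clued0.fnal.gov:") != std::string::npos ||
           (path.find("d0srv") == 0 && path.find(':') != std::string::npos) ||
           (path.find("d0srv") == path.find("@") + 1 &&
            path.find(':') != std::string::npos);
}

// Illustrative sample paths; none of them come from a real test suite.
void check_dzero_examples() {
    assert(is_dzero_node_path("nautilus-clued0:/tmp/x"));
    assert(is_dzero_node_path("nautilus-clued0.fnal.gov:/tmp/x"));
    assert(is_dzero_node_path("d0srv123:/prj_root/x"));
    assert(is_dzero_node_path("mengel@d0srv123:/prj_root/x"));
    assert(!is_dzero_node_path("/pnfs/nova/scratch/file"));
    assert(!is_dzero_node_path("d0srv123"));  // no ':' so not a remote path
}
```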

#5 Updated by Marc Mengel over 5 years ago

...or do you Really Want the D0: on the front? So it would be

D0:nautilus-clued0:/some/path

We could do that.

#6 Updated by Kenneth Herner over 5 years ago

My thinking with the D0: in the front was that it protects against some node name we haven’t thought of. If anything changes in the future we wouldn’t have to change ifdhc, just the cpn lock pools.

Ken
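
A rough sketch of how the "D0:" marker discussed in the last two comments might be handled. The thread does not show this being implemented, so both helper names below are hypothetical; the idea is only that anything starting with "D0:" is treated as a D0 rsync target and the marker is stripped before the remainder is passed on.

```cpp
#include <cassert>
#include <string>

// Hypothetical helpers for the proposed "D0:" marker; not taken from ifdh.
bool has_d0_prefix(const std::string &path) {
    // compare() on the first three characters; shorter paths never match.
    return path.compare(0, 3, "D0:") == 0;
}

std::string strip_d0_prefix(const std::string &path) {
    return has_d0_prefix(path) ? path.substr(3) : path;
}

// Small usage check with the example path from comment #5.
void check_d0_prefix_examples() {
    assert(has_d0_prefix("D0:nautilus-clued0:/some/path"));
    assert(strip_d0_prefix("D0:nautilus-clued0:/some/path") ==
           "nautilus-clued0:/some/path");
    assert(!has_d0_prefix("nautilus-clued0:/some/path"));
}
```

With a marker like this, a new host naming scheme would only need a matching cpn lock pool, as comment #6 points out, rather than a change to the host-name matching code.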

#7 Updated by Marc Mengel about 5 years ago

  • Status changed from New to Closed
