Integrating Experiments into FIFE » History » Version 143
Neha Sharma, 12/08/2015 04:43 PM
- Table of contents
- Integrating Experiments into FIFE
- Summarized SNOW Questions (from the process below)
- Starting FIFE Integration with a new experiment.
- VO creation (authz)
- Request group creation in VOMS, Request Role creation in VOMS
- Request access to fifemon
- Request creation of unix group and users (to Account management group)
- Add GUMS mapping for associated VOMS/GUMS Roles
- Add support for the new VO in FermiGrid (Gatekeepers, fifebatch ecosystem, Workernodes)
- Add GRATIA mapping for the new VO
- Request number of job slots
- Optional: Request access to publish software on OASIS/CVMFS
- Optional: Request SAM access --- other DM
- Request BlueArc Storage
- Request data Bluearc area mount/setup on Bestman server
- Request dCache scratch area/ Archive Area (Enstore)
- Optional: Request Interactive Node
- Other stuff (from Art's list)
- Build Node
- Users requesting membership
- Experiments Actively Integrating Services
Integrating Experiments into FIFE¶
For each experiment that is actively integrating FIFE services, a wiki page lists the primary FIFE representative, the experiment representative, and the services being integrated. The page should give a quick overview of the current status, ongoing challenges, test results, completed milestones, and target dates for integration.
See the appended document for details of the on-boarding process: "Best Practices to onboard new experiments to run on OSG"
Summarized SNOW Questions (from the process below)¶
- Full Name of Group, Organization or Experiment [eg Large Hadron Collider]
- Will you have collaborators that are not FNAL badged? [to create subgroup in fnal VO or own voms server]
- Virtual Organization Name [one word, no special characters, ideally keep it under 8 chars, this will be $VO, this will be your VO and Unix group name]
- Contacts: Group Manager, Operations Contact, Security Contact, Spokesperson, VO Manager
- How much dedicated storage do you need (if >0=trigger request to SPPM)? [in TB, this is dCache - Dmitry will provide link to an explanation page of the shared storage properties&usecase]
- How much Tape Storage do you need? (if >0=trigger request to SPPM) [in TB, this is Enstore]
- Do you require an Interactive Node? If so, how many users expected?
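The constraints on the Virtual Organization Name question above (one word, no special characters, ideally under 8 characters) can be checked before the request is filed. The following is a minimal sketch only: the wiki does not define "special characters", so the pattern below (ASCII letters and digits, starting with a letter) is an assumption, and `check_vo_name` is a hypothetical helper, not part of any FIFE tool.

```python
import re

# Assumption: "one word, no special characters" is read as ASCII letters and
# digits only, starting with a letter. The wiki does not state the exact rule.
VO_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9]*")

# "ideally keep it under 8 chars" is a recommendation, not a hard limit.
RECOMMENDED_MAX_LENGTH = 8


def check_vo_name(name):
    """Return a list of problems with a proposed VO name (empty if none)."""
    problems = []
    if not VO_NAME_PATTERN.fullmatch(name):
        problems.append("must be one word made of letters and digits")
    if len(name) >= RECOMMENDED_MAX_LENGTH:
        problems.append("ideally shorter than 8 characters")
    return problems


if __name__ == "__main__":
    for candidate in ("genie", "Large Hadron Collider"):
        print(candidate, "->", check_vo_name(candidate) or "ok")
```

Since the same name becomes both $VO and the Unix group name, catching a bad name at this stage avoids renaming accounts later.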
Notification sent after all this is done¶
- You're good to go, users can request membership through SNOW Catalog Item X.
- Point them to FIFE wiki for installing SW, there it will say something about CVMFS (& link to SNOW catalog request)
- Instructions (Dmitry's link) on how to access the shared ~200TB volatile scratch area (/pnfs/fnal.gov/usr/$VO/scratch/), with a link to IFDH/FTS/SAM docs
- Instructions on how to submit a job via their interactive node
Other SNOW tasks¶
- FEF should create a new SNOW Catalog Item to Request new VO and access to OASIS/CVMFS
VO registration in OIM¶
Note: This task is not related to SNOW and is not needed if your experiment is a project (like genie)
- Liaison (or someone on his/her behalf) should register the VO in OIM
Starting FIFE Integration with a new experiment.¶
VO creation (authz)¶
Request group creation in VOMS, Request Role creation in VOMS¶
Open SNOW request: https://fermi.service-now.com/ --> Service Catalog --> Scientific Computing --> Setup Computing Support for an Experiment Request
- Need to modify the voms1:/opt/gcso/voms/VOMS_fermilab_VO_add_group.sh script to add the new VO (will need to know the submit node script name).
- This should also include adding the VO in the SNOW Item Catalog Affiliation\Experiment Computing Account Request
- Steve tells me it needs to ask for approval from CS Board, so task should be like a 'change' that requests the approval. This may be delegated to Gabriele Garzoglio already...
Non-fermilab VOs (we are ALREADY doing this for CDF and D0)¶
For those VOs that are outside the Fermilab VO (eg LBNE) we should adapt the 'add user' script so that it uses "voms-admin create-user" to create the user in the VO (and signs the AUP for them, as they've already signed the Fermilab Computing policy). We can approve the user's creation in VOMS because it was already approved in the Fermilab 'add user' process (via SNOW).
These VOs will have 2 ways to get users added:
1. User joins the VO from Fermilab via SNOW: we run the script that adds the user to the VO's voms server, adding the KCA
2. User joins the VO from outside Fermilab (no fermilab ID for that user): user goes directly to the VOMS webpage and the VO admin (someone from the experiment, probably spokesperson/validator) approves the request.
Request access to fifemon¶
Open SNOW request: https://fermi.service-now.com/ --> Service Catalog --> Scientific Computing --> Create a New Scientific Computing Request
TODO: Joe is creating a procedure for this...
Request creation of unix group and users (to Account management group)¶
Request creation of the VO group (today done with a 'General request' and assign to 'Account management group', eg RITM0103277):
- $VO
And request that all these users be created and belong to the $VO group:
- ${VO} [This username is just because the password generator is looking for the group from the username... this should be improved and then we won't need this user anymore, check with Neha she's fixing it 09/2014]
- ${VO}ana [Analysis role generic user]
- ${VO}pro [Production role generic user]
- ${VO}gli [Internal for glidein/pilot jobs, may be used when submitting opportunistic and takes the 'pilot' Role]
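The account list above follows a fixed naming pattern, so it can be generated from the VO name when preparing the request to the Account management group. This is a minimal sketch: `generic_accounts` is a hypothetical helper and `myvo` is a placeholder VO name, neither of which comes from the original procedure.

```python
# Suffixes and purposes as listed above; the empty suffix is the bare ${VO} user.
GENERIC_ACCOUNT_SUFFIXES = {
    "": "placeholder user required by the password generator",
    "ana": "Analysis role generic user",
    "pro": "Production role generic user",
    "gli": "glidein/pilot jobs (takes the 'pilot' Role)",
}


def generic_accounts(vo):
    """Map each generic username for the given VO to its purpose."""
    return {vo + suffix: purpose
            for suffix, purpose in GENERIC_ACCOUNT_SUFFIXES.items()}


if __name__ == "__main__":
    # "myvo" is a hypothetical VO name used only for illustration.
    for username, purpose in generic_accounts("myvo").items():
        print(username, "-", purpose)
```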
Add GUMS mapping for associated VOMS/GUMS Roles¶
When the VO is not within the fermilab VO: https://cdcvs.fnal.gov/redmine/projects/grid_and_cloud_computing_operations/wiki/GUMS#Support-of-new-VOMS-server
Adding support for a subgroup is not yet documented; Neha will do it soon: https://cdcvs.fnal.gov/redmine/projects/grid_and_cloud_computing_operations/wiki/GUMS#Supporting-a-new-subgroup-of-a-VOMS-server-we-already-support
Add support for the new VO in FermiGrid (Gatekeepers, fifebatch ecosystem, Workernodes)¶
- Add ${VO} group: https://cdcvs.fnal.gov/redmine/projects/grid_and_cloud_computing_operations/wiki/Password_Generator#Adding-new-group-on-Worker-nodes
- Setup group quota and priority factor for new ${VO} group: https://cdcvs.fnal.gov/redmine/projects/grid_and_cloud_computing_operations/wiki/FermiGrid_condor_clusters
Already automated steps:
- Gatekeepers: Once GUMS maps the newly created users, the password generator will add the new users within ~24h (https://cdcvs.fnal.gov/redmine/projects/grid_and_cloud_computing_operations/wiki/Password_Generator#Pushing-new-password-file)
NOTE: when adding new users to a VO they get pulled in fifebatch by the keytab cron, every 2 hours. This will need to be a 'push' thing when creating the new user.
Add GRATIA mapping for the new VO¶
We need to submit one job; this creates the new entry in GRATIA. Then we (GCSO) need to go into GRATIA and do the 'standard mapping'. We get these requests every now and then (they typically come from Tanya; look at old SNOW requests).
If not big enough to be a VO (hence a project), follow the instructions here: https://cdcvs.fnal.gov/redmine/projects/grid_and_cloud_computing_operations/wiki/ProjectName_change_request
If a VO, follow the instructions here: https://cdcvs.fnal.gov/redmine/projects/grid_and_cloud_computing_operations/wiki/VOName_change_request
Request number of job slots¶
Open SNOW request: https://fermi.service-now.com/ --> Service Catalog --> Scientific Computing --> Increased Job Slots or Disk Space on FermiGrid Request
This task should be reviewed and improved so that, instead of triggering a 'send an email' step and waiting for the answer, it requests approval from the people responsible (Margaret can provide the list: the fermigrid-allocations list of people as approvers).
By default an experiment has 0 slots (batch, with surplus enabled = opportunistic).
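To illustrate what "0 slots with surplus enabled" could look like on an HTCondor pool, the sketch below renders a group quota fragment using the standard HTCondor accounting-group settings GROUP_NAMES, GROUP_QUOTA_<group> and GROUP_ACCEPT_SURPLUS_<group>. Everything specific to FermiGrid here is an assumption: the `group_<vo>` naming, the `myvo` placeholder, and the idea that the production configuration is laid out this way. The linked FermiGrid_condor_clusters page remains the authoritative procedure.

```python
def group_quota_fragment(vo, slots=0, accept_surplus=True):
    """Render an HTCondor accounting-group fragment for a new VO.

    The "group_" prefix is a hypothetical naming convention, not taken
    from the FermiGrid configuration.
    """
    group = "group_" + vo
    lines = [
        # Append the new group to whatever groups are already defined.
        "GROUP_NAMES = $(GROUP_NAMES), " + group,
        # Dedicated quota: 0 slots until an allocation is approved.
        "GROUP_QUOTA_%s = %d" % (group, slots),
        # Surplus lets the group use idle slots (opportunistic running).
        "GROUP_ACCEPT_SURPLUS_%s = %s" % (group, "True" if accept_surplus else "False"),
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # "myvo" is a placeholder VO name.
    print(group_quota_fragment("myvo"))
```

With a quota of 0 and surplus accepted, a group owns no dedicated slots but can still run when other groups leave capacity idle, which matches the default described above.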
Optional: Request access to publish software on OASIS/CVMFS¶
The process differs depending on whether the repository will be for a group large enough to be a Virtual Organization (VO) registered with OSG (even if it is a subgroup under the fermilab VO) or for a project too small to be a VO.
OASIS/CVMFS process for a VO¶
Create a new SNOW 'Generic request' for 'Scientific Server Support' and ask for a new opensciencegrid.org CVMFS repository hosted at FNAL.
Required info:
- VO name if it is for a VO or project name if it is for a project
- Repository name - usually it is the VO name or project name in lower case, but must be at least derived from them
- Kerberos id of authorized person (that person can add more people to the .k5login later), this can be 'liaison' or requestor
- Disk space required to start and an estimate for the next year
The person handling this request should request approval from the SPPM list of people (same as fermigrid job slots) before creating the repository.
Please see the OASIS/CVMFS section of the Introduction to FIFE and Component Services documentation for more details about how OASIS/CVMFS works and how to maintain the software repository.
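As a small illustration of the repository name item in the required info above, the sketch below derives a lower-case repository name from a VO or project name. It assumes the fully qualified name takes the form `<name>.opensciencegrid.org`, mirroring the fermilab.opensciencegrid.org repository mentioned on this page, and that clients see it under /cvmfs/ as is usual for CVMFS. `cvmfs_repository` and `MyVO` are hypothetical; the actual name is agreed in the SNOW request.

```python
OSG_DOMAIN = "opensciencegrid.org"


def cvmfs_repository(name):
    """Return (repository name, client mount path) for a VO or project name.

    Assumption: the repository is simply the lower-cased name under the
    opensciencegrid.org domain; the real name may only be derived from it.
    """
    repository = name.lower() + "." + OSG_DOMAIN
    return repository, "/cvmfs/" + repository


if __name__ == "__main__":
    # "MyVO" is a placeholder name used only for illustration.
    repository, mount_path = cvmfs_repository("MyVO")
    print(repository)
    print(mount_path)
```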
OASIS/CVMFS process for a small project that is not a VO¶
Create a new SNOW 'Generic request' for 'FIFE Support' and ask to have the project's software added to the fermilab.opensciencegrid.org CVMFS repository.
The person handling the request will discuss what needs to be done next. Typically files first get published in /grid/fermiapp and then are automatically synced from there to CVMFS. If the project will need its own allocation for running jobs on the Open Science Grid, the person handling the request will make sure that the project is registered in the OSG Information Management system as a fermilab VO project.
Optional: Request SAM access --- other DM¶
Per Michael Gheith's instructions, we should open a SNOW request to 'REX-DH-Support' asking for:
- 'SAM support for ${VO}'
- 'FTS support for ${VO}'
- 'IFDH support for ${VO}'
SAM support team will follow up with discussions afterwards.
Request BlueArc Storage¶
GOAL: storage for SW development, used from the interactive nodes.
Open SNOW request: https://fermi.service-now.com/ --> Service Catalog --> Scientific Computing --> BlueArc Storage Request
see https://fermi.service-now.com/kb_view.do?sysparm_article=KB0010929 for example request.
This should be mounted on the interactive node.
List of requirements:
- if-nas-0.fnal.gov:/${VO}/app (this is the VO-specific area to build the SW, mounted RW on interactive nodes and RO on WorkerNodes). Request by default (1TB). This should trigger a SNOW ticket to FEF to mount this NFS area RO on the WorkerNodes and RW on the Interactive node.
By default no data bluearc (this should come as special requests from SPPM, all data should go to dCache):
- blue3.fnal.gov:/minos/data (project disk for user analysis files. Strictly scratch and accessed from interactive nodes and WN. Art says this is a must but GCSO does not want this area to exist on WNs... Discussion needed)
- blue2:/fermigrid-data (logs for bluearc data areas)
Request data Bluearc area mount/setup on Bestman server¶
This is needed only if an experiment has a data area on Bluearc and would like to use the Bestman server for data transfers
Open SNOW request: https://fermi.service-now.com/ --> Service Catalog --> Scientific Computing --> Create a New Scientific Computing Request
see RITM0118309 for example request.
Request dCache scratch area/ Archive Area (Enstore)¶
- By default any VO will get the scratch area in /pnfs/fnal.gov/usr/${VO}/scratch, need to open ticket to 'Storage Service' providing:
- VO name ($VO)
- GID ($VO)
- UID ($VO)
- This directory should be group writable
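The information for the 'Storage Service' ticket listed above can be collected in one place before opening it. The sketch below is only an illustration: the field names and the `scratch_area_request` helper are hypothetical and do not correspond to an actual SNOW form, while the path pattern and the use of $VO for both GID and UID come from the list above.

```python
def scratch_area_request(vo):
    """Collect the details to provide when requesting the default scratch area."""
    return {
        "assignment_group": "Storage Service",
        "vo_name": vo,
        "gid": vo,  # the group is named after the VO
        "uid": vo,  # the bare ${VO} user owns the area
        "path": "/pnfs/fnal.gov/usr/%s/scratch" % vo,
        "group_writable": True,
    }


if __name__ == "__main__":
    # "myvo" is a placeholder VO name.
    for field, value in scratch_area_request("myvo").items():
        print("%s: %s" % (field, value))
```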
If the VO requires non-volatile storage then Open SNOW request: https://fermi.service-now.com/ --> Service Catalog --> Scientific Data Storage and Access --> dCache Storage Request
You will need to provide:
- Approximate Total Amount of Space
- Number of users that will access the system
- Protocols expected to use
- If you want to archive your data to tape
If answering that they need storage: Allocation needs to be approved by fermigrid-allocations@fnal.gov (list of people), we should get it automated before ticket creation to dCache ops.
Probably we should never trigger: Open SNOW request: https://fermi.service-now.com/ --> Service Catalog --> Scientific Data Storage and Access --> Add dCache Users Request
select users by using search
We need to work out how to keep user creation, group mapping (GUMS) and dCache configuration in sync.
- By default request dCache access to the shared 10TB volatile scratch area (not sure if automated or not... ask Dmitry). This is ONLY true if you get mapped to 'fermigrid' user (now all users are mapped like this unless they're a specific VO, so they all share this area).
- Now there are '2 ways' of doing this. The first: http://fndca3a.fnal.gov:2288/pools/list/PoolManager//FermigridVolPools/spaces + /pnfs/fs/usr/fermigrid/volatile/*. This is 10TB, and it seems a directory per VO 'needs' to be created. This is seen as a 'weird thing' and does not appear to be well understood.
- 'FIFE way': http://fndca3a.fnal.gov:2288/pools/list/PoolManager//PublicScratchPools/spaces + /pnfs/fnal.gov/usr/*/scratch. This is a shared ~200TB pool that each VO can get access to. The PNFS access rights are correct here (each VO has its own directory), but the dCache pool is shared and removes files when the pool is full.
Optional: Request Interactive Node¶
GOAL: users will build code, test SW and submit jobs to the grid (typically via jobsub client)
Open SNOW request: https://fermi.service-now.com/ --> Service Catalog.
"Create a New Scientific Computing Request"
Note: if your experiment is not under "Select Experiment List" use "General .
Categorization: Scientific Computing -- Interactive Computing
You will need to provide the following information in this ticket:
- Experiment Name ($VO)
- Experiment Liaison [Should look at liaisons web page for experiment, if not ${VO Manager}]
- How many cores (1), memory (3)? (will get 20GB of scratch disk)
- What OS do you prefer? SL6
- What packages do you need installed on this server? krb5-fermi-getcert, osg-ca-certs, osg-client, jobsub (via ups/upd) (there's a list of standard stuff GCSO needs to tell)
- What areas of bluearc should be mounted? /grid/fermiapp, /grid/data, /$VO/app, dCache shared scratch
- How many users will there be?
- Do you need a group account? By default add: ${VO}, ${VO}ana, ${VO}pro, ${VO}gli
- Will you need cvmfs client? Default Yes
Users will use AFS home directories by default
dCache NFSv4.1 should be mounted here so that experiments do NOT need an if-admin machine (nor if-gridftp).
- Ed/Tyler proposal: to implement this requirement we could use something Marc Mengel wrote a while back that allows a certain user 'su' as any other user (without the need to be in the .k5login of each user).
Ask if they already have one and, if so, which (so that GCSO can do the mapping). They may not do development or job submission @ FNAL.
Needs approval from fermigrid-allocations@fnal.gov
Other stuff (from Art's list)¶
Add Redmine project¶
Create docdb instance¶
Create ECL instance¶
This is a logbook
Build Node¶
Ask Glenn about this, should this be driven by the 'create a new VO' workflow? Not for now (09/2015)
Users requesting membership¶
- Request user assignment to VOMS Role/Groups (This is for users to request membership, once all this process is done!)
Open SNOW request: https://fermi.service-now.com/ --> Service Catalog --> Accounts --> Affiliation\Experiment Computing Account Request
- Optional: Request Cloud Account on FermiCloud (GOAL: people testing their SW before they move it into CVMFS)
Open SNOW request: https://fermi.service-now.com/ --> Service Catalog --> Scientific Computing --> FermiCloud Account Request
!!Ask Neha what happens with a new user and fifebatch keytabs...