May-08-2019 » History » Version 2

Parag Mhashilkar, 05/08/2019 04:42 PM

h1. May-08-2019

Slides: https://indico.fnal.gov/event/17326/

----

h3. Present

Margaret Votava - Project Sponsor/SCS Quadrant Head
Parag Mhashilkar - Project Lead
Marco Mambelli - Technical Lead
Dennis Box - Project Member
Lorena Lobato - Project Member
Marco Mascheroni - Project Member/OSG Factory Operations
Burt Holzman - HEPCloud Project Sponsor
Steve Timm - HEPCloud Technical Advisor
Antonio Perez-Calero Yzquierdo - CMS
Brian Lin - OSG Software
Jeff Dost - OSG Factory Operations
Marian Zvada - OSG Operations
Tanya Levshina - FIFE
Ken Herner - FIFE
Joe Boyd - FIFE
Mike Kirby - FIFE
Edgar Hernandez - OSG/GLOW

----

h3. Communication

* Stakeholders/admins should pay close attention to the release notes for changes to the HTCondor configuration. This is especially important if the admins do not use the configuration shipped with the glideinwms RPMs but instead maintain their own version managed by Puppet/Chef.

----

h3. Support

----

h3. Project Management

* Next Stakeholders meeting on July 10, 2019. https://indico.fnal.gov/event/17328/
* Burt: Is the list on the roadmap slide ordered? Working with HPC sites without a network connection is important to HEPCloud.
** Mambelli: It's not ordered. The work depends on HTCondor providing a mechanism that lets GlideinWMS make HTCondor work at those sites.

----

h3. Roadmap

https://cdcvs.fnal.gov/redmine/projects/glideinwms/wiki/RoadmapSummary

----

h3. Technical

* Pilots not terminating correctly on certain sites
** The fix in 3.4.5 results in HTCondor terminating correctly, but there are some lingering processes at Purdue, and this is impacting CMS jobs. Marco is working with the Purdue admins.

* Singularity Discussion
** The Singularity changes in 3.4.5 are related to the OSG and CMS Singularity wrappers.

* Shared Port Discussion
** Edgar: Which HTCondor daemons does the shared port configuration affect?
*** Mambelli: Sometimes the schedd and sometimes the collector. In 3.4.5, we allow HTCondor to use the shared port based on what you specify in the config. In past versions, the schedd was not using the shared port daemon.
** Edgar: Does that mean we do not need secondary collectors, or just secondary ports?
*** Mambelli: We do not need secondary ports but still need secondary collectors.
*** Antonio: CMS is already using the shared port. The secondary collectors are configured to use the shared port. The backup is using secondary ports.
** Complete migration to the shared port needs close coordination with the admins.

* Singularity Discussion
** Edgar: Does Singularity work over the WAN?
*** Mambelli: We have some setup, but any suggestions would be useful. It's not easy because of site firewall issues. Singularity started by HTCondor will work if using HTCondor 8.8. For condor_ssh_to_job to work, you need to start Singularity in unprivileged mode, since HTCondor is not started as root.
*** James: We are encouraging T2 sites in USCMS to move to unprivileged Singularity. Caltech is completely done and Purdue is on the way. It requires Red Hat 7.6.
*** Steve: Will it break other users who have Singularity scripts?
**** Edgar: It should not. To get unprivileged Singularity, you can use the one available in CVMFS.
**** Mambelli: The glidein uses Singularity from CVMFS behind the scenes if needed. The work was done with HTCondor support. There is a kernel version requirement, and you need to enable the option that allows unprivileged Singularity invocation.
**** Edgar: This is very cool. One way to do it would be for the pilot to advertise whether it is using privileged or unprivileged Singularity.

* Python 2 -> Python 3 Discussion
** Edgar: As long as OSG ships GlideinWMS 3.4 and supports that OSG version, we need to support a Python 2 based version.
** Brian: OSG 3.5 will drop support for RHEL 6. Regular support for the 3.4 version will continue for 6 months once 3.5 is out. OSG will support EL7, which defaults to Python 2.7.
** Edgar: Migrating GlideinWMS from one machine to another is difficult. This also applies to all services. We are also restricted to supporting certain versions; for example, LIGO is starting a run and we can't touch the software until the middle of next year.

* Brian: What's the support model for GlideinWMS 3.5 and 3.6?
** Mambelli: GlideinWMS 3.5 will go into upcoming and become 3.6.
** Brian: OSG 3.5 will be available by the end of summer, and that starts the timer for 3.4. GlideinWMS needs to support the one in 3.4.
** Parag: The GlideinWMS version support model is similar to HTCondor's. If the factory-frontend communication protocol does not change, we can support older and newer frontends, provided the factory is on the latest release. However, older frontends may not have access to newer features.

* Discontinuing support for GT2/GT4
** Jeff: There might be some lingering entries in the factory config. These sites have not been working for a while now, and the site admins have been notified. These sites should not be an issue.

* Discontinuing support for glexec
** Steve: DUNE is still forced to use glexec at some site in Europe, and they are working with the admins to stop using it. It will be a couple of months before the changes take effect.

* Antonio: Do you have info on the monitoring?
** Edgar: We have a prototype solution for the factory sending info to OSG GRACC. We need to have a student spend more time on it and merge the code into GlideinWMS, and then we need to understand how to do this in the frontend.
** Mambelli: We had a student who did this last summer and who will be available this summer. The work is on a branch.

----

h3. ACTION ITEMS

* Marco Mambelli
** Will send info on the ticket related to "Pilots not terminating at certain sites" - #22509
** Will look at the comments on the GlideinWMS release notes made by Brian Lin and get back to him
** Will get back to Edgar on the topic of Singularity and WAN
** Will investigate having the pilot advertise whether it can use privileged or unprivileged Singularity (good idea). Need to open a ticket with more details
** Will coordinate monitoring-related tasks with Edgar based on the work done by students