Stakeholders Meeting May-13-2020

Indico: https://indico.fnal.gov/event/17338/
(Slides)
Edgar's presentation


Present

Marco Mambelli - Project Lead (room)
Bruno Coimbra - Project Member (room)
Marco Mascheroni - Project Member/OSG Factory Operations
Dennis Box - developer
James Letts - CMS
Kevin Lennon - CMS
Jeff Dost - OSG Factory Operations
Krista - HEPCloud Technical Lead
Marian Zvada - OSG Operations
Edgar Hernandez - OSG/GLOW
Joe Boyd -
Tanya Levshina - Operations
Stu Fuess - Fermilab
Andrew Norman - HEPCloud Technical Lead
Steve Timm - HEPCloud Technical Advisor
Edgar Fajardo
Aashray Arora
Brian Lin - OSG


New Format
  • Status update
  • Communications
  • Important changes
  • Project Progress and Roadmap
  • Stakeholders Reports or Presentations
  • Roundtable
  • Open discussion

Main presentation from slides (including developers spotlight)

Repeating Important changes
  • 3.6.2 Requires HTCondor Python bindings
  • Scheduled for 3.6.3
    • Drop Python 2.6 support
    • TAR files distribution
  • Scheduled for 3.7.1
    • Drop GlExec support
  • Planned for 3.7.2
    • Default to shared port in User Collector

These changes have already been sent to the mailing list. This was the last chance to request changes. Stakeholders agreed to all of them.

Questions during the presentation:
  • Edgar asked whether condor_chirp will be linked into the Singularity image from the host machine. It will.
  • Marco Mas. pointed out that CMS will wait for 3.6.3 to test the new condor_chirp
  • Edgar asked whether the CMS global pool is using the Singularity scripts. Marco Mas. confirmed that it is.
  • James and Marco Mas. found that the current frontend queries have a significant impact on production. The impact is more evident with many groups.
  • Tanya asked if WLCG Tokens would also be supported. Marco Mam. confirmed.
  • Tanya asked for more details on the plans for LCF support in this release. There is no support in this release. A summer student will start working on it. Streamlining will come later.
  • James said that CMS has to go over their old Redmine tickets (clean up and get priorities straight).
Questions on Edgar’s talk:
  • Brian asked Edgar whether they are planning to distribute the containerized frontend. Edgar said that they do not have any plans yet, but it would be a nice future step.
  • Steve asked if you have to restart the frontend container when you change a certificate. Edgar said that it is required and pointed out that he recreates the frontend containers once a week. It is easier to delete and restart than reconfigure.
    • Farruck did similar work at Fermilab with OKD (aka OpenShift) instead of k8s and pods.
    • Frequent restarts may cause problems: Glideins in flight may fail if httpd is not running. Edgar thinks the downtime is short enough that there are no problems.
    • State is persistent, with a 1 GB limit; cleanup scripts may need to be added.
  • Jeff brought up concerns about disk usage. Edgar suggested having some cron jobs to clean up the files from time to time.
  • Marco Mas. pointed out that deploying services as containers will also impact development. He gave the example that HA for frontends might no longer be a requirement, since it would be handled naturally by the pods. Marco Mambelli noted that HA also provides more redundancy (pods are all at one site, with the same FS for persistence).
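
As an illustration of the periodic cleanup Edgar suggested for the frontend's persistent storage, here is a minimal sketch of a script that a cron job could run. It was not presented at the meeting: the function name, the directory, and the retention period are all hypothetical, and a real deployment would need to decide which files are safe to delete.

```python
import os
import time
from pathlib import Path


def remove_old_files(root, max_age_days, now=None):
    """Delete regular files under `root` last modified more than
    `max_age_days` ago. Returns the sorted list of removed paths."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    removed = []
    # Materialize the listing first so deletions do not disturb the walk.
    for path in list(Path(root).rglob("*")):
        # Only touch regular files; skip directories and symlinks.
        if path.is_symlink() or not path.is_file():
            continue
        if path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(str(path))
    return sorted(removed)


if __name__ == "__main__":
    # Hypothetical location and retention period; adjust to the deployment.
    target = os.environ.get("CLEANUP_DIR", "/tmp/frontend-cleanup-example")
    if os.path.isdir(target):
        for name in remove_old_files(target, max_age_days=7):
            print("removed", name)
```

Such a script could be scheduled daily from cron or as a Kubernetes CronJob. Deleting by modification time is the simplest policy; a size-based policy might fit the 1 GB limit better.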

Final Questions/Comments

No further questions/comments.

The next stakeholders meeting is in two months, on the 8th of July.