Weekly Meeting Notes

Wednesdays, 10am central

-----
TEMPLATE
h2. July , 2020

Attending:

  • Releases
  • Developers
    • Marco Mascheroni
    • Dennis
    • Bruno
    • LeRayah
    • Mirica
    • Namratha
    • Marco Mambelli

TODO:
-----

July 29, 2020

Attending: Marco Mambelli, Marco Mascheroni, Dennis Box, Bruno Coimbra, LeRayah Neely-Brown, Namratha Urs, (second part) Dave Dykstra

  • Releases
    • 3.6.3
      • RC3 in OSG development
      • Dennis sent pull request to OSG w/ documentation changes
    • 3.7.1
      • Dennis fixing Fact-Fe token ticket
    • p3
      • python3 version, Factory almost working, mostly str/bytes conflicts
      • sort from a list: python2 requires 2 functions, python3 only one
      • lib/cleanupSupport sort function
      • / is no longer integer division in Python 3 (use // for the integer result)
  • Developers
    • Marco Mascheroni
      • Followed up w/ Edgar: Frequency of enabling/disabling is once a year, so it's OK to have some monitoring transient where it's incorrect
      • Opened a couple of issues: documenting XML monitoring, bug found by operations: global limits in the entry are not respected when you have multiple frontends (it may happen because the submit processes are in parallel)
    • Bruno
      • python3
    • Dennis
      • Feedback 24561, token communication broken
    • LeRayah
      • poster and presentation
    • Namratha
      • integration in GWMS code base
      • git repo in the ticket
    • Marco Mambelli
      • working on Python3 version
  • Dave
    • Singularity 3.6.1: fixed some uncommon bugs
      • Marco Mascheroni will test 3.6.1, there are some major changes in 3.6.x
      • When using privileged Singularity w/ Docker there is some change to do in the configuration. Dave will warn Tony
    • Token
      • htgettoken, utility to fetch tokens from Vault
      • Dave will send an email about the Vault server and htgettoken, Dennis will do some test
      • Dave will work on a Vault credmon. One option is to give it a long-lived Vault token (supertoken) that can read everybody's access tokens. A more secure option is to let a user request a longer-lived token: it is more secure because it can access only that user's tokens.
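
The p3 items above (str/bytes conflicts, the sort change, and the meaning of /) can be illustrated with a minimal Python 3 sketch. This is one reading of the sort note (a two-argument comparison function in Python 2 versus a one-argument key function in Python 3). The data and function names are made up for illustration and are not taken from lib/cleanupSupport.

```python
from functools import cmp_to_key


def compare_by_mtime(a, b):
    """Python 2 style comparison: takes two items, returns <0, 0 or >0."""
    return (a[1] > b[1]) - (a[1] < b[1])


# Hypothetical (file name, modification time) pairs, for illustration only
files = [("b.log", 30), ("a.log", 10), ("c.log", 20)]

# Python 2 allowed files.sort(compare_by_mtime); Python 3 only accepts key=
by_cmp = sorted(files, key=cmp_to_key(compare_by_mtime))  # wrap the old function
by_key = sorted(files, key=lambda item: item[1])          # one-argument key

# "/" is true division in Python 3; "//" keeps the Python 2 integer behavior
half_true = 7 / 2    # 3.5
half_floor = 7 // 2  # 3

# str and bytes no longer mix implicitly: decode or encode explicitly
raw = b"glidein"
text = raw.decode("utf-8")
```

cmp_to_key is the quickest way to keep an existing comparison function working; a plain key function is usually simpler when the ordering depends on a single field.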

July 22, 2020

Attending: Marco Mambelli, Marco Mascheroni, Bruno Coimbra, Mirica Yancey, LeRayah Neely-Brown, Namratha Urs

  • Releases
    • 3.6.3
      • Bruno working on RC
      • Marco Mascheroni is touching base w/ Edgar about pull request and effect on monitoring
    • 3.7.1
      • Factory-Frontend ticket communication in feedback
      • Ready for RC
    • p3
      • Fixing code and pylint scripts
  • Developers
    • Marco Mascheroni
      • Operations work - follow up on how much time the pilot is spending in validation and how many validations are failing
        • 20% of jobs at Purdue were not matching
      • see TODOs below
    • Bruno
      • Working on Python3 conversion, fixed upgrade and reconfig for Factory
        • Learned how to use pydebug, to attach to a running module - proposed for the group code review
    • LeRayah
      • Working on the testing application
      • Working on composing the Poster and paper
    • Mimi
      • Debugging and adding to the script
      • Move into GWMS once sure it is not working
      • Started writing the slides and paper
    • Namratha
      • Working on the feature: using cvmfsexec and mountrepo
    • Marco Mambelli
      • working on Python3 version
  • TODO:
    • Marco Mascheroni and Bruno will write a wiki document about remote python debugging techniques (breakpoints, dealing w/ fork and multiprocesses)
    • Add a ticket about the possibility of not spawning to ease debug

July 15, 2020

Attending: Marco Mambelli, Marco Mascheroni, Dennis Box, Bruno Coimbra, Mirica Yancey, LeRayah Neely-Brown, Namratha Urs, (second part) Dave Dykstra

  • Releases
    • 3.6.3
      • Ready for RC
      • Dennis sent pull request to OSG w/ documentation changes
    • 3.7.1
      • Factory-Frontend ticket communication in feedback
      • Ready for RC
    • p3
      • Fixing code and pylint scripts
  • Developers
    • Marco Mascheroni
      • working on OSG autoconf - improvements mentioned on stakeholders slides - will be merging to 3.6.3
      • will complete the testing for Edgar's pull request
    • Bruno
      • Working on Python3 conversion, fixed upgrade and reconfig for Factory
        • Learned how to use pydebug, to attach to a running module - proposed for the group code review
    • Dennis
      • completed token documentation
      • Re-reading the RPM generation
    • LeRayah
      • Incorporating feedback from Farruck, Maria, and Namratha
    • Mimi
      • Testing the application in a local directory, will test in GlideinMonitor
    • Namratha
      • Working on the feature: using cvmfsexec and mountrepo
    • Marco Mambelli
      • working on Python3 version
  • Dave
    • singularity 3.6.0 released in EPEL, added digital signatures on image files.
    • cvmfsexec documentation updated
      • Richard Jones using CVMFSEXEC on his site.
    • tokens
      • htgettoken to retrieve from Vault
      • OIDC (OpenID Connect), designed for web systems, adapted for the command line
      • Vault, secret manager already supporting OIDC protocol, there will be one at Fermilab
        • Regular flow works well w/ web browser as client, does a call back to a server (the client needs to start a little web server)
        • Device flow makes the user go to the web browser but only once. After approval all is automated, the command line will continue
        • htgettoken will use a Vault token (lasting for a week), a grid token (refresh token, lasting for a week) and access tokens used for access
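
The device flow described above (the user approves once in a browser, then the command line continues on its own) can be sketched as follows. This is not htgettoken code: FakeAuthServer is a hypothetical stand-in for the identity provider so that the example runs without a network, and the token values are placeholders. The field names (device_code, user_code, verification_uri, interval, authorization_pending) follow the OAuth 2.0 device authorization grant.

```python
import time


class FakeAuthServer:
    """Stand-in for an OIDC provider; approves after a fixed number of polls."""

    def __init__(self, polls_before_approval=2):
        self.polls_left = polls_before_approval

    def start_device_flow(self):
        # A real provider returns these from its device authorization endpoint
        return {"device_code": "dev-123", "user_code": "ABCD-EFGH",
                "verification_uri": "https://example.org/device", "interval": 0}

    def poll_token(self, device_code):
        # Until the user approves in the browser, the answer is "pending"
        if self.polls_left > 0:
            self.polls_left -= 1
            return {"error": "authorization_pending"}
        return {"access_token": "access-xyz", "refresh_token": "refresh-xyz"}


def device_flow(server, max_polls=10):
    """Ask the user to approve once in a browser, then poll until tokens arrive."""
    grant = server.start_device_flow()
    print(f"Open {grant['verification_uri']} and enter {grant['user_code']}")
    for _ in range(max_polls):
        reply = server.poll_token(grant["device_code"])
        if "access_token" in reply:
            return reply
        if reply.get("error") != "authorization_pending":
            raise RuntimeError(f"device flow failed: {reply}")
        time.sleep(grant["interval"])
    raise TimeoutError("user did not approve in time")


tokens = device_flow(FakeAuthServer())
```

Unlike the regular flow, nothing here needs a local web server for a callback, which is why the device flow suits command line tools.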

TODO: thread about tokens for pilots. Meeting next Thursday


July 8, 2020

Stakeholders meeting


July 1, 2020

Attending: Marco Mambelli, Marco Mascheroni, Dennis Box, Bruno Coimbra, Mirica Yancey, LeRayah Neely-Brown, Namratha Urs, (second part) Dave Dykstra, Brian Lin

  • Releases
    • 3.6.3
      • Bruno finishing tar-ball install elimination (ticket feedback, finishing documentation changes)
      • A couple of tickets in feedback to be merged
      • Bruno will do RC on Thursday or Friday
      • Dennis to send pull request to OSG w/ documentation changes
    • 3.7.1
      • All blocking tickets are in feedback ready to be merged
      • Dennis working on the Factory-Frontend ticket communication (not a release-blocking ticket)
    • p3
      • Code changes applied, being tested
  • Developers
    • Marco Mascheroni
      • Release work
      • See TODOs
    • Dennis
      • v3.7 tickets, token and documentation work mentioned above
    • Bruno
      • tarball removal ticket
    • LeRayah
      • Reviewing how jobs are submitted, look at the job submission file
      • Tests of jobs submissions changing universe, ...
    • Mirica
      • Adding unit tests to the anonymization script, looking into Python logging
    • Namratha
      • Working on 24546 (addition to CVMFS): testing in the WN environment
    • Marco Mambelli
      • Frontend output to alternative collector
      • Working on python3 migration
  • Reminders and notes
    • Add google-style docstrings when you edit code
    • Bigger code changes could be reserved for Python3 migration, e.g. use of yapf or removing blanks at the end of lines (to limit changes confusing for git blame)
    • yapf or blank removal could be added to commit hook (Marco Mascheroni has one)
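
As a reminder of what a Google-style docstring looks like, here is a small sketch. The function count_idle and its arguments are invented for the example and are not part of the GWMS code.

```python
def count_idle(jobs, status="idle"):
    """Count the jobs that are in a given status.

    Args:
        jobs (list): Dictionaries describing jobs, each with a "status" key.
        status (str): Status to count. Defaults to "idle".

    Returns:
        int: Number of jobs whose status matches.

    Raises:
        KeyError: If a job dictionary has no "status" key.
    """
    return sum(1 for job in jobs if job["status"] == status)
```
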
TODO:
  • Marco Mascheroni
    • will work on both pull requests from Edgar
    • will follow-up and test why the osg-wn tarball setup is needed even if the RPM is installed (Marco Mambelli suspects some interference by the environment modifications done in the CMS sw - LD_LIBRARY_PATH, ...)

No news about the OSG Singularity wrapper, Mats and Edgar did not join


June 24, 2020

Attending: Marco Mambelli, Marco Mascheroni, Dennis Box, Bruno Coimbra, Mirica Yancey, LeRayah Neely-Brown, Namratha Urs, Brian Lin

  • Releases
    • 3.6.3
      • Bruno documenting tar-ball install elimination.
      • Bruno will complete and do RC by end of the Week
    • 3.7.1
      • TODO Define tickets
        • Meeting w/ Dennis, blockers: [#24544, #24561, #24565, #23278]; if possible the Frontend/Factory token communication [#24448]
        • Then Dennis will do RC w/ Marco
  • Developers
    • Bruno
      • Mostly busy w/ shift and HEPCloud
    • LeRayah
      • running jobs and talking about the design
    • Mimi
      • complete user name recognition in condor logs
      • planning on improving it and adding IP addresses
    • Namratha
      • adding the ability to use CVMFS
      • testing CVMFSExec
    • Mascheroni
      • Tickets in feedback
      • Will work on merging pull request
    • Marco Mambelli
      • Work on python3 migration (prep work, tests w/ git): initial version will have the old dir structure
      • Frontend output to alternative collector

June 17, 2020

Attending: Marco Mambelli, Marco Mascheroni, Dennis Box, Bruno Coimbra, Mirica Yancey, LeRayah Neely-Brown, Namratha Urs, Dave Dykstra, Mats Rynge

  • Releases
    • 3.6.3
      • Bruno documenting tar-ball install elimination. Marco will send feedback on code
      • Bruno will do RC on Thursday
    • 3.7.1
      • fixed some bugs w/ 3.7 (tarball creation, setup)
  • Developers
    • Marco Mascheroni
      • Thread w/ Shasad (images, certificates and osg-wn-client)
      • Fixes for OSG autoconf
    • Bruno
      • working on documentation for tarball removal
      • Will do Release Candidate
    • LeRayah
      • Hiccup w/ Factory, now all works
      • Will give feedback on documents
    • Mirica
      • GlideinMonitoring in the
      • Troubleshooting jobs on Factory and Frontend
      • Notes for documents feedback
      • Initial version of a script to remove IP addresses
    • Namratha
      • Installation and configuration of Factory and Frontend. Making config correctly, reached out to Dennis and Bruno. Proxy not renewed. Finishing setup
      • MARCO TODO: Do ticket CVMFSExec
      • Working on documentation feedback
    • Marco Mambelli
      • Frontend secondary output
      • Fix tar ball generation in 3.7

Edgar is interested in using tokens for the startd connecting to the collector

TODO (Marco Mambelli):
  • priority to the script
  • changes in the frontend

-----
TEMPLATE
h2. June 10, 2020

Attending:

  • Releases
  • Developers
    • Marco Mascheroni
    • Dennis
    • Bruno
    • LeRayah
    • Mirica
    • Namratha
    • Marco Mambelli

TODO:


June 3, 2020

Attending: Marco Mambelli, Marco Mascheroni, Dennis Box, Bruno Coimbra, Mirica Yancey, LeRayah Neely-Brown, Namratha Urs, Dave Dykstra, Mats Rynge

  • Releases
    • 3.6.3
      • Bruno making progress on tar-ball install elimination. Factory completed, Frontend by tomorrow
      • Marco Mascheroni
      • Bruno will do RC on Friday
    • 3.7.1
      • 24250 in review
      • 22413, 22378 condor config, should be closed today.
  • Developers
    • Marco Mascheroni
    • Dennis
      • Working mainly on jobsub
    • Bruno
      • Dropping of tar-ball installation
    • LeRayah
      • installed factory and frontend, will start working on the project
    • Mirica
      • Install of Factory and Frontend
      • Research on anonymization techniques
    • Namratha Urs
      • Joined the team, getting onboard
    • Marco Mambelli
      • Frontend submitting to a secondary collector
      • Singularity wrapper script
  • Dave
    • Singularity 3.6 closer, 4th RC.
      • significant changes to the environment variable handling
        • TODO: Marco Mascheroni and Mats Rynge will test the new Singularity: /cvmfs/oasis.opensciencegrid.org/mis/singularity/3.6.0~rc.4/bin/singularity
    • CVMFS 2.7.3 tagged next week
      • ducc - CVMFS tool to publish a container
      • Quick replication of pads in CVMFS (reuse already published containers) will be in CVMFS 2.8
  • Singularity discussion
    • Marco
      • The script is not ready yet
      • Discussion w/ LIGO yesterday to move towards HTCondor invocation of Singularity
        • Drivers: condor_ssh_to_job, simplify the GWMS/VO software, HTCondor people do not like wrappers, no progress in the past year
        • Two roads:
          • GWMS script will use condor and allow VO scripts
          • Use directly condor and no wrapper. LIGO will test this way (Edgar, LIGO person, HTCondor person)
    • Mats Rynge
      • About the LIGO initiative
        • Does not see the point
        • VOs, LIGO especially currently ask for much extra stuff
        • anyway the more we can simplify the better
      • waiting on the script to run in Singularity
      • working on things he wants to test before signing off
    • Dave asked about CVMFSExec
      • no progress on it yet

May 27, 2020

Attending: Marco Mambelli, Marco Mascheroni, Dennis Box, Bruno Coimbra, Mirica Yancey, LeRayah Neely-Brown

  • Releases
    • 3.6.3
      • Bruno making progress on tar-ball install elimination. Working on Factory
      • Marco Mascheroni did a hot patch to reduce the Factory load. It considers grid jobs completed as stage-out instead of the current unknown that was included in the idle (see below).
      • Bruno will do RC on Monday
    • 3.7.1
      • 24250 - was breaking the CI, fixed, fixed problems and behaves more correctly, almost ready for review
      • 22413, 22378 condor config, should be closed today.
  • Developers
    • Marco Mascheroni
      • Troubleshoot load issue on the factory, mostly due to condor
        • Trouble filling Syracuse: condor is not seeing immediately the changes on the CE (stageouts are blocking) and the Factory handles badly unknown grid job status. Some jobs in completed state were categorized as unknown, preventing job completion because counted as idle. The patch is considering them as running/stageout.
        • Running jobs with grid job status completed were classified as unknown (hash status in glidefactorylib was not considering completed -> was unknown, returning 1100) Unknown was counted as idle, now added to the running/stageout.
        • TODO Will create ticket
      • Will finish other 3.6.3 tickets
    • Dennis
      • 3.7.1 work above
      • 24448 - condor token auth between Factory and Frontends. Condor Week provided good info (besides the renaming, they improved the documentation of tokens). How tokens are handled w/ regex in the mapfile came in handy for the ticket. Hopefully completed this week.
        • Default - condor operator, accept all token from this offsite machine, IP. Other authentications/exchanges may be better. TODO: Will touch base w/ Brian Lin
    • LeRayah
      • Did the orientation ticket, is installing GWMS
      • Deciding how to proceed for the project
    • Mirica
      • Working on orientation, getting into the VM still problematic.
      • Some work w/ TARGET material changed the PowerPoint, updating the site
    • Marco Mambelli
      • Creating intern tickets, and administrative tasks
      • Script for running VO scripts in Singularity
      • Frontend publishing on a different collector
  • Follow-up w/ Mirica, TODO:
    • Will post on Slack advertising her availability for support
    • Will check daily the slack channel to answer questions, especially on the Python Notebooks
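
The Factory load issue discussed above (grid jobs in completed state falling through to unknown and being counted as idle) can be illustrated with a simplified sketch. The lookup tables and the status names are hypothetical and do not reproduce the glidefactorylib code; only the 1100 value for an unknown status comes from the notes.

```python
from collections import Counter

UNKNOWN_CODE = 1100  # value mentioned in the notes for an unrecognized status

# Hypothetical lookup tables, before and after the hot patch
GRID_STATUS_BEFORE = {"pending": "idle", "active": "running"}
GRID_STATUS_AFTER = {**GRID_STATUS_BEFORE, "completed": "stageout"}


def classify(grid_status, table):
    """Return the category for a grid job status, or UNKNOWN_CODE if unmapped."""
    return table.get(grid_status.lower(), UNKNOWN_CODE)


def count_categories(grid_statuses, table):
    """Count jobs per category; unknown jobs are counted as idle, as described."""
    counts = Counter()
    for status in grid_statuses:
        category = classify(status, table)
        counts["idle" if category == UNKNOWN_CODE else category] += 1
    return counts


jobs = ["PENDING", "ACTIVE", "COMPLETED", "COMPLETED"]
before = count_categories(jobs, GRID_STATUS_BEFORE)  # completed jobs look idle
after = count_categories(jobs, GRID_STATUS_AFTER)    # completed jobs are stageout
```

With the first table the two completed jobs inflate the idle count; with the second they are accounted for as stageout.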

May 20, 2020

Attending: Marco Mambelli, Marco Mascheroni, Dennis Box, Bruno Coimbra, Brian Lin, Mirica Yancey, LeRayah Neely-Brown
Joining for the wrapper discussion (10:30): Dave Dykstra, Mats Rynge

  • Releases
    • 3.6.3 - move tickets, release in 3 weeks
    • 3.7 - Marco Mascheroni ran sleep jobs on CMS ITB dev
    • 3.7.1 - Locked on the Token ticket, almost ready for review (unit test failing)
  • Guests
    • Brian Lin
      • Interested in Frontend-Factory token auth progress, will touch base w/ Dennis
  • Developers
    • Dennis
      • working on CI problem for 3.7.1
      • Working on token communication between Frontend and Factory
    • Bruno
      • Working on Tar-ball removal, expected 3.6.3
    • Marco Mascheroni
      • Tickets from last week
      • Tickets in 3.7 on how to handle attributes
    • Marco Mambelli
      • interns onboarding
      • ticket for Frontend output on different collector
      • singularity wrapper for scripts
      • Worked w/ Thomas to restart the python3 migration effort
    • Mirica
      • on-boarding
    • LeRayah
      • on-boarding
TODO: Marco Mascheroni will send an email about the WMAgent requirement for condor_chirp
  • It is a WMAgent bug, will be fixed but will take months, so should be supported
    • missing or empty CONDOR_CONFIG will cause WMAgent not to use condor_chirp
    • w/ CONDOR_CONFIG it will look at the condor_chirp location there
    • a fake CONDOR_CONFIG w/ no condor_chirp location (e.g. only a comment) or a CONDOR_CONFIG with unreachable condor_chirp location, will cause WMAgent to look for condor_chirp in the PATH. This is what we want.
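
A minimal sketch of the workaround in the last bullet: provide a CONDOR_CONFIG that contains only a comment, so that condor_chirp ends up being looked for in the PATH. This does not reproduce WMAgent's own logic. The helper names make_fake_condor_config and find_chirp_in_path are hypothetical.

```python
import os
import shutil
import tempfile


def make_fake_condor_config(directory):
    """Write a CONDOR_CONFIG holding only a comment (no condor_chirp location)."""
    path = os.path.join(directory, "condor_config")
    with open(path, "w") as config:
        config.write("# placeholder: defines no condor_chirp location\n")
    return path


def find_chirp_in_path(search_path=None):
    """Look up condor_chirp with a PATH-style search; None if it is absent."""
    return shutil.which("condor_chirp", path=search_path)


workdir = tempfile.mkdtemp()
fake_config = make_fake_condor_config(workdir)
# Environment to hand to the job: CONDOR_CONFIG is set but points to no chirp
env = dict(os.environ, CONDOR_CONFIG=fake_config)
```
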

TODO: Check w/ stakeholders if developers version could be python3 (after the first python3 version is reliable, use OSG contrib in the meantime)

  • Dave
    • Singularity
      • Greg C, Sylabs CEO and Singularity originator, started a new not-for-profit company, ctrl-cmd (control command), to make it easier to work w/ grants, retaining technical leadership of Singularity. Sylabs is for-profit and will support itself. Both companies will support Singularity.
      • New GitHub project hpc-ng, for singularity and other HPC projects
      • some delay with 3.6.0
    • CVMFS 2.7.3 next month
      • CVMFS integration

May 13, 2020

Stakeholders meeting, see: Stakeholders_Meeting_May-13-2020


May 6, 2020

Attending: Marco Mambelli, Marco Mascheroni, Dennis Box, Bruno Coimbra, Jeff Dost, Brian Lin
Joining for the wrapper discussion: Dave Dykstra, Mats Rynge

  • Releases
    • 3.6.2
    • 3.6.3 - move tickets, release in 3 weeks
    • 3.7.1 - Locked on the Token ticket, almost ready for review
  • Guests
    • Jeff - There is a RedHat 8 test site - are we ready (worker nodes)?
      TODO: Marco will send email to J Dost, B Lin
  • Developers
    • Dennis
      • Review of CI ticket
      • Documentation on SciToken to CE
      • Working on HTCondor token between Factory and Frontend
        • Condor ping between the 2 machines works,
    • Marco Mascheroni
      • Finish the ticket about cleaning up old reconfigure
      • Adding pre and post reconfigure hooks
      • 2 more things would like in 3.6.3:
        • Fix for draining glideins
        • Adding limits to OSG autoconf
    • Marco Mambelli
      • Sphinx documentation
      • Checking releases
    • Bruno
      • Mainly HEPCloud
    • Marco Mambelli
      • Singularity tickets
      • Planning for 3.6.3 release
      • Preparing for Summer Interns
  • Singularity wrapper discussion
    • Mats and Edgar
      • will test on OSG with an "ITB" group
      • will start separating the parts of the wrapper that run inside vs outside singularity
    • Marco
      • Will prepare a helper script to allow scripts ("files" in the configuration) to run w/ the default (for the group) singularity image
    • On a first instance the ability to run at startup will be OK
    • HTCondor will work on allowing to run startd cron jobs in Singularity
    • Overarching ticket and specific sub tickets to track the work
TODO
  • Marco Mambelli will prepare the tickets to track this activity

April 22, 2020

Attending: Marco Mambelli, Dennis Box, Bruno Coimbra, Marco Mascheroni, Dave Dykstra, Brian Lin

  • Dave
    • New functionality for CVMFS synccvmfs started through Singularity
    • CVMFSexec has been improved
    • CVMFS 2.7.2 tagged, released on Monday
    • Singularity 3.6 tagged, will be released in a couple of weeks
  • Releases
    • 3.6.2 is in OSG production
    • 3.7 is in OSG upcoming-testing and will be soon in upcoming
      • TODO: Send note about the file to touch in Factory to Tim Theisen and M.Mascheroni
      • Marco Mascheroni will test 3.7 on CERN ITB dev (Factory and Frontend)
    • 3.7.2 with tokens between Factory and FE will be out in about 2 months (asked by B.Lin)
  • Developers
    • Bruno
      • condor_chirp added to the PATH, will work in Singularity [#] . Make 2 tickets. Inside and outside singularity
      • Tickets review
    • Dennis
      • Sidetracked w/ Jobsub stuff
      • Will work on GWMS starting later this week
      • Tested 3.7 but not the logging yet
    • Marco Mascheroni
      • Baby is doing well besides not sleeping at night (that is OK)!
      • The cache should be already cleared when the Frontend is started [#]
        • Marco Mambelli will try to verify better
      • Singularity wrapper
        • Patch to clear pythonpath
        • source osgui from cvmfs - should be probably sourced by the CMS script, Marco Mambelli will look into it.
      • PIC has no outbound connectivity, requires 8.9.7. Problem in testing: Constant attribute in the entry is overwritten by the Frontend. The global attribute is not constant. There should be a warning at least. Will troubleshoot.
      • Glidein in downtime: is a time in the past an oversight or is the machine still down? (both could be OK depending on interpretation)
        • TODO Mascheroni: will check at the SI meeting which could be the correct policy
      • X509 ticket troubleshooting: forgot to merge the change
        • TODO: put an automatic process to avoid in the future: easy, low overhead for developers
    • Marco Mambelli
      • Troubleshoot w/ Edgar the schedd problem - red herring
      • Update the Singularity wrapper w/ OSG changes
      • Manage better the environment cleanup

TODO:


April 8, 2020

Marco Mambelli, Dennis Box, Bruno Coimbra,

Marco Mascheroni on paternity leave. Welcome Sofia!

  • Releases
    • 3.6.2 in OSG production
    • 3.7
  • Dave
    • CVMFS 2.7.2 available
    • New Singularity next month. Message today from CEO Greg that will grow the team.
      • Current environment handling: -e for clean environment, variables w/ SINGULARITYENV_VARNAME -> VARNAME
      • There is a Singularity discussion in GitHub thinking about redoing how variables are handled: https://github.com/sylabs/singularity/issues/5040
    • RH 7.8 is going to have unprivileged fuse mount in user namespaces (like RH8). Released and beta SL. This way they will support Podman.
  • Developers
    • Marco Mascheroni
    • Dennis
      • [#24285] - Problem in the tarball generator, Those files should not be on RH6 tarballs
      • Scitoken stuff
    • Bruno
      • Worked on HEPCloud this week
    • Marco Mambelli
      • Release 3.7
      • Test singularity wrapper for CMS w/ Marco Mascheroni
      • Remove PYTHONPATH from singularity jobs
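
The environment handling Dave described (a clean environment plus prefixed variables that are passed with the prefix stripped) can be modeled with a short sketch. It assumes the prefix is spelled SINGULARITYENV_ and ignores the defaults Singularity sets on its own; container_env is a hypothetical helper, not a Singularity API.

```python
PREFIX = "SINGULARITYENV_"


def container_env(host_env, clean=True):
    """Model which host variables reach the container (simplified)."""
    if clean:
        # With a clean environment, ordinary host variables are not inherited
        env = {}
    else:
        env = {k: v for k, v in host_env.items() if not k.startswith(PREFIX)}
    # Prefixed variables are passed in either case, with the prefix removed
    for name, value in host_env.items():
        if name.startswith(PREFIX):
            env[name[len(PREFIX):]] = value
    return env


host = {"PYTHONPATH": "/host/lib",
        "SINGULARITYENV_X509_USER_PROXY": "/tmp/proxy"}
inside = container_env(host)
```

In this model a host PYTHONPATH reaches the job only when the environment is not cleaned, which is the leak the PYTHONPATH item above is about.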

April 1, 2020

Marco Mambelli, Dennis Box, Bruno Coimbra, Marco Mascheroni

  • Release
  • Developers
    • Marco Mascheroni
    • Dennis
      • HTCondor CE working with scitoken
    • Bruno
    • Marco Mambelli

March 25, 2020

Marco Mambelli, Dennis Box, Bruno Coimbra, Marco Mascheroni, Dave Dykstra

  • Release
    • 3.6.2
      • Mascheroni has it installed on UCSD, CERN/-ITB, and CMS ITB FE
      • Test singularity
        • Not clearing PYTHONPATH
        • OSGUI missing on some sites, failing because GFAL is not finding plugins
        • arc command missing, the singularity startup script ends up calling recursively itself w/ a lot of stderr messages (3GB) (will work w/ Dave Dykstra)
        • condor_chirp could not be found (command -v condor_chirp) - assign ticket to Bruno
    • 3.7
  • Dave
    • Singularity, developers call, plugin to run unprivileged also for building images and running image files, using the Unix kernel library, can do fuse mount unprivileged also in old OS. Plugin support may be in 3.6 (July), plugin may be available later.
    • Investigating vault. vault plugin to store oauth2 secrets (from puppet labs). Store a refresh token as secret and each time you read it gives a regular token.
      • it is a heavier service, not to be deployed on glideins, but could be used if installed on nodes
  • Developers
    • Marco Mascheroni
      • CRIC ticket will be separated and
      • Now Factory ops are more proactive in providing condor tarball; each time there is a reconfig there is a new version of the tarball. We need a better handling of condor tarball. New copy is done also if the file is the same
        • TODO: Marco Mascheroni will create a ticket
    • Dennis
    • Bruno
    • Marco Mambelli
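
One possible way to avoid the redundant condor tarball copies mentioned above is to compare checksums before copying. The sketch below is only an illustration of that idea, not the reconfig code; file_digest and copy_if_changed are hypothetical helpers, and a throwaway file stands in for the tarball.

```python
import hashlib
import os
import shutil
import tempfile


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_if_changed(source, destination):
    """Copy source to destination only if the content differs; True if copied."""
    if os.path.exists(destination) and file_digest(source) == file_digest(destination):
        return False
    shutil.copy2(source, destination)
    return True


# Small demonstration with a throwaway file standing in for the condor tarball
demo_dir = tempfile.mkdtemp()
src = os.path.join(demo_dir, "condor.tgz")
dst = os.path.join(demo_dir, "condor_copy.tgz")
with open(src, "wb") as handle:
    handle.write(b"tarball bytes")
first_copy = copy_if_changed(src, dst)   # destination missing: copied
second_copy = copy_if_changed(src, dst)  # same content: skipped
```
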

March 18, 2020

Marco Mambelli, Dennis Box, Bruno Coimbra, Marco Mascheroni

  • Release
    • 3.6.2
      • RC2
      • Test
    • Release 3.7
      • Ticket reviewed, ready, RC
  • Dave
    • New Singularity RC 3.5.3, mostly bug fixes
    • Version in CVMFS is 3.5.2, many sites have still 3.4.x
    • Expecting new CVMFS release (MaxMind GeoIP, still free but you need a license)
  • Developers
    • Marco Mascheroni
      • Found a missing os.path.realpath() in entry-ls
      • OSG collector autoconf limit added
      • Reported to a face to face all-hands
      • Have a section in the 99-local-ini file to configure the pilot (will be used by CRIC generate entry configuration)
      • Right now we are depending on having an entry in the factory to run manual submit
      • Add hooks on the reconfigure script - will open a ticket
    • Dennis
      • Install new CE that works w/ SciToken
      • Transporting the SciToken to the Factory that is using it for the authentication
    • Bruno
      • Testing
    • Marco Mambelli
      • Testing, release RC
      • Ticket about handling properly module setup (MODULE_USE)

TODO MM - Follow up w/ Sakib about the module function/ env variable (today)


March 11, 2020

Stakeholders meeting


March 4, 2020

Marco Mambelli, Dennis Box, Bruno Coimbra, Marco Mascheroni

  • Release
    • The following tickets should be included in 3.6.2
      • Marco Masc has a new ticket not yet defined about reducing Frontend queries (caching the schedd info)
      • CRIC, may go to the next release. There are lib and cfg directories. The content of lib should go in the GWMS lib and util.py should get a more specific name. Cfg file should start w/ a template/default and go in the /etc directory, possibly consider yaml not to add a new configuration language and mix code w/ configuration
    • Release 3.7
      • Ticket reviewed, ready
    • Cutting releases
      • Bruno and Dennis to try the whole process
  • Developers
    • Marco Mascheroni (via email)
    • Dennis
      • Ticket for the release
      • Progress w/ scitoken authentication - condor maps token to identities via mapfile, getting close to have the full chain on the Frontend
        • Authenticating schedd w/ collector using token. Probably that step will not be in the final architecture, the authentication is w/ CE
      • Will start automated tests for 3.6.2.rc2
    • Bruno
      • Ticket reviews
      • Started working on dropping tarball
    • Marco Mambelli
      • Fixing release script
      • Reviewing tickets
      • Cut 3.6.2.rc2
  • TODO
    • Everyone
      • provide stakeholders slide to Marco Mambelli
    • Marco Mambelli
      • Check w/ Margaret about stakeholders meeting calendar
      • Collect slides and send for review to developers

February 26, 2020

Marco Mambelli, Dennis Box, Bruno Coimbra, Brian Lin, Dave Dykstra

  • Dave
    • Singularity 3.5.3 out, will be in the last OSG 3.4 release and OSG 3.5
    • CVMFS: pushing for wider adoption of EGI configuration instead of CERN one, so OSG repositories are available
      • uBoone is the first osg.storage.org replicated in Europe, others read directly from FNAL or BNL
      • A monitor to log accesses is being added (to know which repos are used in Europe)
    • Tokens: HashiCorp (same company as Vagrant) Vault, general-purpose storage for secrets, including refresh tokens (e.g. for AWS). Dave is testing it.
  • Release
    • The following tickets should be included in 3.6.2
      • All tickets currently in Feedback/Accepted
      • CRIC, would like to include all the changes and scripts that are used in production( there has been meeting w/ Jeff and Edita about future directions)
    • Release 3.7
      • Ticket reviewed, ready
  • Developers
    • Marco Mascheroni (via email)
      • worked on the proxy and CRIC tickets
      • working on CHEP proceedings this week
    • Dennis
      • Will check and integrate the feedback.
      • Will send an email to Brian and Brian because the current solution is using sudo
      • Did a demo using sci-token authentication. Was able to authenticate to the CE
    • Bruno
      • reviewing Dennis ticket
    • Marco Mambelli
      • GPU monitoring in Singularity
      • Work on #23451, Allow a Frontend to run in parallel w/o affecting the Factory
      • Will cut soon a RC

February 19, 2020

Marco Mambelli, Dennis Box, Bruno Coimbra, Brian Lin, Dave Dykstra

  • Dave
    • Singularity 3.5.3 soon to fix filesystem bug, 3.5.2 is the currently recommended one (OASIS has still 3.4.2, will change)
    • CVMFS bug fix in the client for stashcache type operation, w/ xrootd redirector, was not retrying multiple servers
  • Brian Lin:
    • Interested in an update on token support and offered OSG support
  • Release
    • The following tickets should be included in 3.6.2
      • All tickets currently in Feedback/Accepted
      • CRIC, would like to include all the changes and scripts that are used in production( there has been meeting w/ Jeff and Edita about future directions)
  • Developers
    • Marco Mascheroni
      • worked on the proxy and CRIC tickets
      • working on CHEP proceedings this week
    • Dennis
      • The ticket about condor token-auth will be ready today
    • Bruno
      • condor_chirp replacements, former pychirp, ready for feedback
    • Marco Mambelli
      • Work on #23451, Allow a Frontend to run in parallel w/o affecting the Factory
  • TODO
    • Show to Dennis and Bruno the release procedures

February 12, 2020

Marco Mambelli, Dennis Box, Bruno Coimbra, Brian Lin, Dave Dykstra

  • Dave
    • Singularity 3.5.3 soon to fix filesystem bug, 3.5.2 is the currently recommended one (OASIS has still 3.4.2, will change)
    • CVMFS bug fix in the client for stashcache type operation, w/ xrootd redirector, was not retrying multiple servers
  • Brian Lin:
    • Interested in an update on token support and offered OSG support
  • Release
    • The following tickets should be included in 3.6.2
      • All tickets currently in Feedback/Accepted
      • CRIC, would like to include all the changes and scripts that are used in production( there has been meeting w/ Jeff and Edita about future directions)
  • Developers
    • Marco Mascheroni
      • worked on the proxy and CRIC tickets
      • working on CHEP proceedings this week
    • Dennis
      • The ticket about condor token-auth will be ready today
    • Bruno
      • condor_chirp replacements, former pychirp, ready for feedback
    • Marco Mambelli
      • Work on #23451, Allow a Frontend to run in parallel w/o affecting the Factory
  • TODO
    • Show to Dennis and Bruno the release procedures

February 5, 2020

Marco Mambelli, Dennis Box, Bruno Coimbra, Marco Mascheroni

  • Release
    • 3.6.2 RC1 available
    • Waiting for Bruno and Dennis logins at UW to do next release candidates
  • Developers
    • Marco Mascheroni
      • CHEP paper
      • Feedback of tickets applied
      • SI meeting, pressure about the monitoring
        • 3 factories, have to check in 3 different places
        • breakdown per entry, group by site
    • Dennis
      • The ticket about condor token-auth will be ready soon
    • Bruno
      • changes done in pychirp
      • will be working on tokens
    • Marco Mambelli
      • Work on #23451, Allow a Frontend to run in parallel w/o affecting the Factory
      • worked on tickets feedback
      • RC release and updated release documents
AOB-TODO
  • queries to the collector, reducing the load
    Frontend is doing 1400 queries/hr, mostly to find the schedd name
    changing to cache the results of condor_status for the schedd name
  • Antonio working on configuring resources w/o outbound connectivity,
    condor version using file system instead of network
    Use the one already installed in the WN instead of the one we ship
    patched condor_startup to use a local condor instead of the one we ship
  • Facilitate CMS-Kevin collaboration
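
The caching idea in the collector-load item above can be sketched as a small time-limited cache. This is only an illustration under assumptions, not the GWMS implementation: `TTLCache`, `fetch`, and the 300-second default lifetime are hypothetical names and values, and `fetch` stands in for whatever function performs the actual condor_status query.

```python
import time


class TTLCache:
    """Remember lookup results for a fixed number of seconds.

    Hypothetical helper: `fetch` is any callable that performs the
    expensive query (e.g. a collector lookup) for a given key.
    """

    def __init__(self, fetch, ttl_seconds=300.0, clock=time.monotonic):
        self._fetch = fetch      # function called on a cache miss
        self._ttl = ttl_seconds  # how long a cached result stays valid
        self._clock = clock      # injectable clock, useful for testing
        self._entries = {}       # key -> (time stored, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]      # still fresh: no new query is issued
        value = self._fetch(key)  # missing or expired: query again
        self._entries[key] = (now, value)
        return value
```

With a cache like this, repeated lookups of the same schedd name within the lifetime cost one query instead of one per call. The lifetime is a trade-off between collector load and how quickly a changed schedd is noticed.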
    -----

January 29, 2020

Marco Mambelli, Dennis Box, Bruno Coimbra, Marco Mascheroni

  • Release
    • The following tickets should be included in 3.6.2
      • All tickets currently in Feedback/Accepted
      • CRIC, would like to include all the changes and scripts that are used in production (there has been a meeting w/ Jeff and Edita about future directions)
  • Developers
    • Marco Mascheroni
      • worked on the proxy and CRIC tickets
      • working on CHEP proceedings this week
    • Dennis
      • The ticket about condor token-auth will be ready today
    • Bruno
      • condor_chirp replacements, former pychirp, ready for feedback
    • Marco Mambelli
      • Work on #23451, Allow a Frontend to run in parallel w/o affecting the Factory
  • TODO
    • Show to Dennis and Bruno the release procedures

January 22, 2020

Marco Mambelli, Dennis Box, Bruno Coimbra, Brian Linn

  • Release
    • The following tickets should be included in 3.6.2
      • Extend the proxy lifetime copying the renewed proxy
      • CRIC, would like to include all the changes and scripts that are used in production (there has been a meeting w/ Jeff and Edita about future directions)
      • pychirp (condor_chirp replacement)
  • Developers
    • Marco Mascheroni
      • worked on the proxy and CRIC tickets
      • Will be getting some information about monitoring to help Antonio
    • Dennis
      • Solved problem w/ CI (need to fix the merge conflict, then it is OK)
      • Playing w/ instructions w/ sci-tokens but they have not been working so far
    • Bruno
      • Working on shipping pychirp, including it in release
    • Marco Mambelli
      • Work on #23451, Allow a Frontend to run in parallel w/o affecting the Factory

January 15, 2020

Marco Mambelli, Dennis Box, Bruno Coimbra, Brian Linn, Dave Dykstra

  • Dave
    • Singularity 3.5.3 soon to fix filesystem bug, 3.5.2 is the currently recommended one (OASIS still has 3.4.2, will change)
    • CVMFS bug fix in the client for stashcache type operation, w/ xrootd redirector, was not retrying multiple servers
  • Brian Linn:
    • Discussion about the CE
  • Developers
    • Marco Mascheroni
      • Finished the work about held glideins
      • Improvement on manual submit glidein, will open the ticket
    • Dennis
      • Progress on token-auth, using separate passwords on each token #23092
    • Bruno
      • Working on shipping pychirp, startup was expecting to run the wrapper
    • Marco Mambelli
      • Work on Use of Python constructs like context managers #22470
      • Work on #23451, Allow a Frontend to run in parallel w/o affecting the Factory
  • Brian and Marco discussing hosted CE 34 configuration

January 8, 2020

Marco Mambelli, Dennis Box, Bruno Coimbra, Brian Linn

  • Releases
    • 3.6.1 in OSG production
      • ITB factories are 3.6.1
      • 3.6.1 Singularity scripts applied to 3.4.6 production factories
    • 3.6.2 in 2 weeks
    • 3.7 in 2 weeks
      • No updates on 3.7
      • Dennis for token-auth
      • Marco for Leonardo's log and glidein_startup changes
  • Brian is interested in an update on the token integration
    • token-auth for daemon (glideins) will be in 3.7
    • Dennis and Marco contacted Andrea Ceccanti and have been experimenting w/ wlcg-token but no GWMS development
    • Brian Lin will let us know about the workflow to get a token for sci-token submission
  • Developers
    • Marco Mascheroni
      • Worked on #23340, avoid releasing glideins
      • Discussion w/ Edgar about running HTCondor 8.9 in glideins, will open ticket
      • Will complete some 3.6.2 work this week and will be on vacation the following 2 weeks
    • Marco Mambelli
      • Troubleshoot and work on #22245, the problem is CONDOR env variables not being used by python bindings
      • Troubleshooting frontend problem w/ Steve
      • Packaging GlideinMonitor
    • Dennis
      • 23467, CI changes: works in command line, failing in nightly CI. TODO: check and follow-up w/ Vito
      • tokens: tested new condor 8.9.5, fixes some problems noted before, changes not completely understood
    • Bruno
      • Learning more about GWMS code
      • Working on pychirp, #21711, will be adding the script to the Factory scripts
    • Marco
      • Working on Frontend reporting to separate collector, #23451
      • Closed some other tickets
      • Cleanup and organization of wiki and meetings