Weekly Meeting Notes

Wednesdays, 10am central

November 25, 2020

Attending: Marco Mambelli, Marco Mascheroni, Bruno Coimbra, Diego Davila, Shreyas Bhat

  • Containers
    • Frontend (Diego)
      • Production Frontend deployed in production cluster in Madison (from Edgar's student)
      • Image ready to commit (once there
      • Some environment variables will be added to the documentation: which repos to clone for the validation scripts, ...
      • Work on separating the Web server
      • Documentation will go in the README file
    • Factory (Shreyas)
      • Talking w/ Jeff, next week they will compare notes and start moving forward
      • The plan is to use Tiger at Madison for the development
      • Some info about how secrets are handled is in the Tiger documentation: https://github.com/opensciencegrid/tiger-osg-config
  • Releases
    • Production and ITB OSG are both 3.7.1
      • Will do a full CMS test. Other problems delaying this: HTC 8.9.9 negotiator bug, patching gridmanager
    • Idea to make a new production release 3.8
    • 3.9
      • The "Multiple classads" problem seems to have gone away
  • Developers
    • Marco Mascheroni
      • Enabled intelligent removal of glideins in entries (tracking) to stop micro-managing the limits. Test w/ Vanderbilt, FNAL, PIC, UCSD. Removing only 10 pilots at a time; maybe this should be tied to the difference (a percentage) to limit the delay
      • Will work on pull requests
      • Will work on cvmfsexec
    • Bruno
      • Tracking down the change that fixed the behavior in 3.9
    • Marco Mambelli
      • Review tickets
      • GlExec removal
      • Moving SL6 VMs, including gwms-web
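
The removal note above contrasts a fixed batch of 10 pilots per cycle with a batch sized as a percentage of the excess. A minimal sketch of the two policies follows. The function names, the 25% fraction, and the floor of 10 are illustrative assumptions, not GlideinWMS code or settings:

```python
import math


def fixed_batch(excess, batch=10):
    """Remove at most `batch` pilots per cycle, whatever the excess is."""
    return max(0, min(excess, batch))


def proportional_batch(excess, fraction=0.25, minimum=10):
    """Remove a fraction of the excess per cycle, but never fewer than `minimum`.

    `fraction` and `minimum` are hypothetical tuning knobs.
    """
    if excess <= 0:
        return 0
    return min(excess, max(minimum, math.ceil(excess * fraction)))


def cycles_to_drain(excess, policy):
    """Count the cycles a policy needs to remove all the excess pilots."""
    cycles = 0
    while excess > 0:
        excess -= policy(excess)
        cycles += 1
    return cycles


if __name__ == "__main__":
    for name, policy in (("fixed", fixed_batch), ("proportional", proportional_batch)):
        print(name, cycles_to_drain(1000, policy))
```

With a large excess the proportional policy needs far fewer cycles, which is the delay the note is concerned about. For a small excess both policies behave the same.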

TODO:


November 18, 2020

Attending: Brian Lin, Marco Mambelli, Marco Mascheroni, Bruno Coimbra, Dennis Box, Mats Rynge, Dave Dykstra, Namratha Urs

  • Releases
    • 3.6.5 in production, all OK
    • 3.7.1 in OSG factory and ITB frontend
      • Bug where the Frontend is creating tokens also for "default" entries and it takes 4sec per token/entry. This is not acceptable for production Factories w/ many entries. The time may have been caused by a misconfiguration of the CAs. Option to have a single token for all entries or one per trust domain.
      • Need a different way to decide about the token: for "default" it is not clear which version of condor it corresponds to
      • token name should use the entry_name (glidein_site may not be defined)
    • 3.9 - python3, workable, Marco Mambelli started merge of 3.7.1
  • Developers
    • Marco Mascheroni
      • Busy w/ talk about CRAB in analysis tools workshop (100s people)
      • testing 3.7.1
      • Support for Ubuntu testing: manual_glidein_startup is the one in the factory, files are downloaded from there and should be updated
    • Dennis
      • troubleshooting 3.7.1
    • Bruno
      • Busy w/ DE
    • Marco Mambelli
      • Added Singularity override path
      • Fixed script for Ubuntu (Marco Mascheroni did another change, not communicated yet)
      • Merging of 3.7.1 in 3.9 and conflict resolution
  • Dave
    • new singularity 3.7.0 RC1, Marco Mascheroni tested it in ITB and is OK
    • CVMFS
      • Occasionally crashes at UChicago
      • One of the images created for CMS has a problem. The CVMFS project tool is not preserving the timestamp, and wrong timestamps cause Python to reject the .pyc/.pyo files
  • Singularity Discussion
    • scripts in Singularity feature not tested yet by Mats
    • updated frontend to 3.7.1
    • Problem w/ negotiator when condor 8.9.9, will send email
    • How much of a problem is it to assume that CVMFS is not in /cvmfs?
      • This should be done in a generic way (was done at TACC)

November 4, 2020

Attending: Brian Lin, Marco Mambelli, Marco Mascheroni, Bruno Coimbra, Dennis Box, Mats Rynge, Dave Dykstra, Namratha Urs

  • Releases
    • 3.6.5 in production, all OK
    • 3.7.1 in osg-upcoming-development
      • Started testing
      • OSG ITB 3.7.1.rc9
    • 3.9 - python3, workable, merge 3.7.1 once it is in osg-testing
  • Developers
    • Marco Mascheroni
      • OSG autoconf fails if a CE is in the whitelist but down, i.e. not in the OSG collector (FactoryOps)
      • Started working on cleanup script
      • Site using glidein in a vacuum. Glidein_singularity_use to required
    • Dennis
      • Tagged and pushed 3.7.1 to osg-upcoming-development
        • started automatic tests
        • committed ansible scripts
    • Bruno
      • Troubleshooting problem w/ DE
    • Marco Mambelli
      • Cleanup script
      • Support tickets for FNAL Singularity images
  • Dave
    • Singularity 3.6.4
    • Install Vault tutorial, send an email to Dave if interested
  • Singularity Discussion
    • Mats will check w/ Edgar and run a 3.7.1 Frontend
    • How to handle the case of missing images?
      • Is the image missing because of a wrong path or because CVMFS is failing?
      • If the image is missing, is it logged in the HTCondor error file or in stderr? 1. is causing the wrapper to fail and HTCondor retries the job 2. is causing

October 28, 2020

Attending: Diego Davila, Jeff Dost, Edgar Fajardo, Brian Lin, Marco Mambelli, Marco Mascheroni, Bruno Coimbra, Dennis Box, Maria

  • Containers discussion:
    • Shreyas started to upload software
      • will talk to Marco or Jeff about a basic implementation
    • Farrukh maintains the FE container for Fermilab
    • Edgar maintains the containers for OSG
    • Edgar, Farrukh, and Diego (UCSD) will be meeting tomorrow
    • Repositories
      • TODO: Marco will add Diego (ddavila0) to GitHub and Docker Hub. Done
      • The Docker Hub glideinwms members are limited to 3: Marco, Shreyas, Diego. They will link the GitHub repos for auto-builds
    • Discussion/Suggestions
      Roadblock:
      • Edgar: each frontend has a set of validation scripts that need to be the latest version provided by the VOs
        • Currently there is a script at the beginning of the container that does a git clone
        • Would be nice if GWMS provided a mechanism - way to download setup scripts, e.g. given an env variable
        • TODO: Marco will open a Redmine ticket
  • Releases
    • 3.6.5 in production, all OK
    • 3.7.1.rc9 out in a couple of days
      • will include 3.6.5 fixes
      • Documentation updates, fixed bug on tokens copy
      • Make sure that will include merged HTC 8.9 fix
    • 3.9 - python3, workable
  • Developers
    • Bruno
      • new deployment 3.7 to test tokens
      • working w/ LeRayah. Having a FE replica to replicate the environment
    • Dennis
      • merged ticket in branch_v3_7
      • wrote ansible scripts to upgrade and downgrade FE, will add them to the repo
    • Marco Mascheroni
      • some more testing of intelligent pilot removal found a bug [#25113]
      • if the request hits a factory limit, it should still use the requested, not the factory limit
      • need some more testing w/ multiple VOs
      • idle pilots in the FE, may be
    • Marco Mambelli
      • Working on singularity support for test scripts

October 21, 2020

Attending: Jeff Dost, Brian Lin, Marco Mambelli, Marco Mascheroni, Bruno Coimbra, Dennis Box, Dave Dykstra

  • Releases
    • 3.6.5 in OSG production, all OK
    • 3.7.1.rc9 out soon
      • ID tokens enabled by default
    • 3.9 - python3, workable
  • Dave
    • singularity 3.6.4 - unsquash fs could be squashed (cif image or downloaded image or container build based on another untrusted image)
      • htgettoken updated in vault. Fermilab populating cilogon w/ people and scopes (DUNE scopes). Retrievable via htgettoken and Vault. Can restrict the list and paths
      • disabling fuse mount in 3.6 is currently broken
    • CVMFS, new version handling better configuration
      • SL6 regression test failing, mounting in EL6, no access to CVMFS mounted in SL7
  • Developers
    • Bruno
      • 3.7.1 testing: Pay attention to install CA certificates on both Frontend and Factory
      • Working w/ LeRayah - making some progress, would be useful to have a "fake frontend"
    • Marco Mascheroni
      • New Frontend, security asked to use https -> it is not possible because we need caching
      • CMS enabled glidein tracking. Documentation may be inverting the action:
        • idle removing only the idle
        • wait is removing also the waiting in the queue
      • working on feature to execute scripts at the end of the Glidein
    • Dennis
    • Marco Mambelli

October 14, 2020

Attending: Jeff Dost, Brian Lin, Marco Mambelli, Marco Mascheroni, Bruno Coimbra, Dennis Box

TODO: link to ticket to Mascheroni about CMS overriding

  • Releases
    • 3.6.5 in OSG production, already on CERN factory, all OK
      • known issue w/ HTCondor 8.9.x, not used for production, anyway fix ready will be in 3.6.6
    • 3.7.1.rc9 out in a couple of days
      • will include 3.6.5 fixes
      • Documentation updates, fixed bug on tokens copy
      • Make sure that will include merged HTC 8.9 fix
    • 3.9 - python3, workable
  • Brian
  • Jeff
    • One site w/ hosted CE. User jobs are using too much disk
      • condor disk variable, gets into the config, but by the time it runs it gets overridden w/ total disk amount
      • reserved disk, subtracts from total but is not OK
      • TODO open a ticket
  • Developers
    • Bruno
      • 3.7.1 testing
      • Working w/ LeRayah
    • Marco Mascheroni
      • started work on cleanup script, will ping Namratha
      • discussion on why the script patch is needed
    • Dennis
      • Condor 8.9 not playing nicely w/ pilot [#25068]. Those lines are needed for HTC 8.9, will do nothing in 8.8
      • 3.7.1 almost ready for 3.7.1.rc9
    • Marco Mambelli
      • Working on ability to run test scripts in Singularity [#21885], will work on adding the ability to have VO scripts in Singularity before the job

October 7, 2020

Attending: Dave Dykstra, Marco Mambelli, Marco Mascheroni, Bruno Coimbra, Dennis Box

  • Dave
    • Singularity 3.6.3 tomorrow in OSG production
    • TODO: Marco will restart meetings w/Mats in 2 weeks
  • Brian Lin
    • Tim Theisen will follow up on GlideinWMS in OSG release - done
    • OSG 3.6 - deprecating OSG env variables, check if GlideinWMS is using them. TODO: will send email
  • Releases
    • 3.6.5 - tests all OK, will be released in OSG tomorrow
    • 3.7.1
      • Bruno has been testing 3.7.1.rc8, done some fixes
      • Dennis should release later today w/ the fixes
    • 3.9.1
      • waiting for 3.9.1 to merge token fixes
  • Developers
    • Bruno
      • Testing 3.7.1
      • Working w/ LeRayah on her Capstone, submitting custom Jobs
    • Dennis
      • Debug scitoken w/ Bruno
      • Will work on the ticket about problems w/ the new HTCondor version in 3.6.x (less urgent since HTCondor 8.10 -> 9.0 has been postponed to January)
    • Marco Mascheroni
      • Working on validation of 3.6.5
      • Fixing a couple of issues in OSG_autoconf and gfdiff
      • Manual glidein startup to use the config directory
      • TODO will check PRs
      • TODO feedback report for CMS
      • TODO CMS will use more GPU autodiscovery
      • TODO Will write to Namratha
    • Marco Mambelli

TODO:


September 30, 2020

Attending: Brian Lin, Marco Mambelli, Marco Mascheroni, Bruno Coimbra, Dennis Box

  • Releases
    • 3.6.4
      • TODO: Marco Mambelli will troubleshoot a problem w/ some sites (MIT, ...)
    • 3.7.1 release candidate RC8, the announcement did not work
      • Ran smoke test and tested instructions, interoperability test. Had to restart condor. Condor status was claiming that was working, but had to restart to get things working. 8.8.10+3.6.4 / 8.8.8+3.6.2 - 8.9.8+3.7.1
      • 3.7.x running on UCSD ITB
        • TODO: Dennis will update the testing infrastructure in the wiki by adding fermicloud127, token CE, accepts both self-signed and WLCG
    • 3.9.0, in osg-contrib, some known bug, will be fixed once 3.7.1 is released
  • Brian Lin presented the new authentication model proposed for HTCondor-CEs using the OSG Collector
    • will allow getting away w/o host certificates or DNS entries
    • the collector is the trusted intermediary between the factory (submitter) and the CE
    • QUESTION from Marco Mambelli: will this overhead be handled in HTCondor (some internal submission loop, storing the idtokens and getting them when needed, only the collector needs to be added to the config) or will this be handled externally by the submitter that has to do the steps and store and renew idtokens? Of course the 1st solution would be preferred
  • Developers
    • Marco Mascheroni
      • Troubleshoot 3.6.4
      • Will start working w/ Namratha
    • Dennis
      • The email did not go out. Outlook has an error
    • Bruno
      • Working on 3.9.0. The new 3.9 factory accumulates multiple classads (same CE). The number prefixing the classad (RSA ID?) keeps changing
      • Working w/ LeRayah
      • TODO: will
    • Marco Mambelli
      • Tested 3.6.4 and 3.9.0
      • Reorganized tickets
      • Added Github actions for unit tests, pylint, pycodestyle, BATS

TODO:


September 23, 2020

Attending: Dave Dykstra, Marco Mambelli, Marco Mascheroni, Bruno Coimbra, Dennis Box

  • Dave
    • Singularity, 2 new releases, current unprivileged user namespaces or 3.5.2, 3.6.2, soon 3.6.3
      • Nebraska using unprivileged user namespaces w/ Docker, disabled setuid,
      • Suggested to disable network namespaces (increases security, known exploit in default CentOS8)
      • Here is a summary of the Singularity use: http://mmascher.web.cern.ch/mmascher/singularity_versions.txt
        $ condor_status -af SINGULARITY_VERSION GWMS_SINGULARITY_MODE GWMS_SINGULARITY_PATH GLIDEIN_Entry_Name | sort | uniq -c
             48 2.5.0-dist privileged /usr/bin/singularity CMS_T2_PK_NCP_pcncp04
           1511 2.5.2-dist privileged /usr/bin/singularity CMS_T3_FR_IPNL_lyo07
            154 2.6.0-dist privileged /usr/bin/singularity CMS_T2_TR_METU_Cox01
              2 2.6.1-dist privileged /bin/singularity CMSHTPC_T1_ES_PIC_ce13-multicore
              2 2.6.1-dist privileged /bin/singularity CMSHTPC_T1_ES_PIC_ce14-multicore
             70 2.6.1-dist privileged /bin/singularity CMSHTPC_T3_US_PuertoRico_UPRM
             74 2.6.1-dist privileged /bin/singularity CMS_T2_US_Purdue_hadoop_condce
           2473 2.6.1-dist privileged /usr/bin/singularity CMSHTPC_T1_RU_JINR_ce01_mcore12
           1635 2.6.1-dist privileged /usr/bin/singularity CMSHTPC_T1_RU_JINR_ce02_mcore12
            540 2.6.1-dist privileged /usr/bin/singularity CMSHTPC_T1_UK_RAL_arc_ce01_multicore_el7
            423 2.6.1-dist privileged /usr/bin/singularity CMSHTPC_T1_UK_RAL_arc_ce02_multicore_e
        
  • Releases
    • 3.6.4 released, Mambelli did the smoke test, Dennis will do automated tests
    • 3.7.1 Last week hackathon helped flush several problems, 3.7.1.rc8 expected today or tomorrow
      • OSG ITB has 3.7.1.rc5
      • Still there is no provision to update sci-tokens when they expire
    • 3.9.0 released, works w/ GSI, smoke test OK, problems w/ tokens and in monitoring
      • Bruno found also a bug in the credential handling
  • Developers
    • Marco Mascheroni
      • Running manual_glidein_submit on the HEPCloud factory found and fixed a bug [#]. TODO: will make ticket
      • Will work on integrating Namratha's work - TODO: Mambelli will assign the ticket
    • Dennis
      • Mainly busy w/ Hackathon, fixing 3.7.1 as discussed above
    • Bruno
      • TODO: will open tickets for 3.9 as discussed
      • LeRayah using GlideinWMS as Capstone project, manual submission to test GWMS
    • Marco Mambelli
      • Python3 bug fixes
      • Released 3.6.4
      • Fixing tickets for completed and future releases (3.6.x, 3.7.x, 3.9.x)
      • Work w/ Thomas and Mirica for presentation at HTCondor Fall Week
TODO:
  • ResourceSlot used to discover GPUs:
    <attr name="GLIDEIN_Resource_Slots" const="True" glidein_publish="True" job_publish="False" parameter="True" publish="True" type="string" value="GPUs,1,type=main"/>
    

September 16, 2020

Attending: Marco Mambelli, Marco Mascheroni, Bruno Coimbra, (Dennis Box is attending the AuthN/Z hackathon)

  • Releases
    • 3.6.4
      • to be released before the weekend, once condor_chirp is fixed
      • fixed the variable w/ quoting
    • 3.7.1, RC6 - out
    • 3.9 python3
      • fixed the last bugs, Frontend can request glideins, to test and release
  • Developers
    • Marco Mascheroni
      • In HTC 8.8.10, there is a fix for sharedport_port, was ignored. By default it was the same as the collector.
        collector.use_shared_port was false
        #-- Collectors are behind shared port starting in HTCondor 8.4
        # Disable the use of shared port by collector
        COLLECTOR_USES_SHARED_PORT=False
        # In HTCondor 8.6 this seems to be needed as well (otherwise the collector uses shared port)
        COLLECTOR.USE_SHARED_PORT=False
        
        • These settings could become True or the lines could be commented/removed (True is the default)
      • There is a CMS request to auto-discover GPUs: GPU AUTO
      • Frontend operators are interested in knowing "Why some jobs are not matching"
        • There are some lines in the frontend to dump profiling expressions (commented). Working on a tool to tell if a cluster can match
        • Marco Mambelli suggested checking the existing Frontend tools, some do the same
      • CMS interested in being early adopter for DE, Dec/Jan time frame, will announce at the stakeholders meeting
    • Bruno
      • Factory, problem w/ the VM
      • Fixed bug: calling explicitly .next --> .__next__()
    • Marco Mambelli
      • Fixed Frontend bug in advertising
TODO:
  • Marco Mascheroni will open tickets for the requests reported
  • Marco Mascheroni will contact Marco Mambelli about using GLIDEIN_Resource_Slots for GPU AUTO
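
The `.next` to `.__next__()` fix reported by Bruno above reflects the Python 3 iterator protocol, where the method is named `__next__` and callers should use the built-in `next()`. A minimal sketch follows; the `Countdown` class is a made-up example, not project code:

```python
class Countdown:
    """Iterator that counts down from `start` to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        # Python 3 looks for __next__; Python 2 looked for next.
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

    # Alias needed only while the same class must also run under Python 2.
    next = __next__


it = Countdown(3)
first = next(it)   # the built-in works on both interpreters, unlike it.next()
rest = list(it)
```

Calling the built-in `next(it)` instead of the method avoids hard-coding either spelling at the call site.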

September 2, 2020

Attending: Marco Mambelli, Marco Mascheroni, Bruno Coimbra, Dennis Box

  • Releases
    • 3.6.4, add OSG autoconf improvements, waiting for condor_chirp fix
    • 3.7.1
      • RC4, update the token documentation
      • fixed the automatic ID-token if it can (glidein), for Factory and Frontend you need to generate a token
    • 3.9 (ex v3p3), RC1, unit tests OK but not working
  • Developers
    • Mascheroni
      • Started officially as CMS Level2 SI manager, only 25% on GWMS
      • Troubleshooting condor_pychirp
        • pythonpath not set
        • directory not found
      • Test entry and glidein_startup_vrapper: http://gfactory-itb-1.opensciencegrid.org/vacuum/
        • This could be moved to GWMS, a post-reconfigure script, could be documented in the Factory, you don't need to query the factory, find out entry name, ... reduces the work for the admin.
      • tickets in 3.7 related to how we handle attributes
    • Dennis
      • 3.7.1 RC4 - reading the trust model document
    • Marco Mambelli
      • working on Python3
    • Bruno
      • working on Python3

TODO: Prepare slides for next week's stakeholders meeting


August 19, 2020

Attending: Marco Mambelli, Marco Mascheroni, Bruno Coimbra, Namratha Urs, Brian Lin, Joe Boyd

  • Releases
    • 3.6.3 was released on Monday and will be released in OSG 3.4 and OSG 3.5
      • Next 3.6.x may be included only in the OSG 3.4 development repo w/o being officially released and the CMS Factory can install it from there
    • 3.7.1.rc1 released yesterday, smoke tests are OK, initial tests needed
      • Brian Lin - Kubernetes Hackathon, one subject could be the Factory
    • 3.p3. certificate problem. Factory can open and authenticate CE certificate:
      • RSA Key that cannot be parsed, it comes in a classad
      • a reconfigure is not fixing it
  • Brian Lin
    • trust model paper, look at factory specific bits
    • TODO: Marco will read the document, Brian available for meeting to discuss it
  • Developers
    • Bruno
      • Worked on v3p3
    • Dennis
      • did 3.7.1 RC1 will test more
    • Marco Mascheroni
      • more changes to OSG Autoconf
      • 3.6.3 in ITB CMS. Initial tests OK, will test more condor_chirp
    • Namratha
      • Last week, completing the integration
    • Marco Mambelli
      • New CI test scripts
      • P3 migration

August 12, 2020

Attending: Marco Mambelli, Marco Mascheroni, Bruno Coimbra, Namratha Urs, Brian Lin, Dave Dykstra

  • Dave
    • Singularity 3.6.1 new default
      • TODO: Namratha will open an issue request to select the platform to build cvmfsexec
    • Default configuration for CERN VM (when not coming from OSG and EGI) based more on config repository like OSG and EGI
    • Tokens
      • pull request for Vault to work better w/ device flow besides code flow
        • code flow: in OIDC, intended for a browser that starts the request (to the issuer, to the IdP, then back to the issuer and to the original browser it started from). Vault was starting a web browser from the command line client. Not practical
        • device flow: intended for a set-top box (no browser in it); it provides a URL that you open in a browser, then you get a code for the device, which periodically checks the issuer
        • Dave is working on a hybrid flow that would modify the current code flow not to redirect to the client
  • Brian OSG CE registry
    • admin register w/ OSG central collector
    • authenticating the CE, the CE would give to the Factory permission to submit
    • TODO: continue offline the discussion about trust model whether include
  • Releases
    • 3.6.3
      • RC3 in OSG development
    • 3.7.1
      • no news Dennis working on a ticket
    • p3
      • Factory fully working, up to submitting glidein
      • fixes in glidein_startup_script were committed w/ a conflict line still in it.
  • Developers
    • Bruno
      • working on python3
    • Marco Mascheroni
      • came back from vacation, worked on osg autoconfig, some changes added for migration to SLATE CE to be documented and committed
    • Namratha
      • CVMFS feature for GWMS: completed changes and verification
      • working on testing the scripts and packaging different distributions depending on platform
      • TODO: look at platform names to use
    • Mambelli
      • worked on python3 and CI tests

August 5, 2020

Attending: Marco Mambelli, Bruno Coimbra, Mirica Yancey, LeRayah Neely-Brown, Namratha Urs

  • Releases
    • 3.6.3
      • RC3 in OSG development
      • Dennis sent pull request to OSG w/ documentation changes
    • 3.7.1
      • Dennis fixing Fact-Fe token ticket
    • p3
      • Working on Factory, pylint and tests
  • Developers
    • Bruno
      • Python 3 migration
    • Mirica
      • Last week, working on wrapping up the project
    • LeRayah
      • Last week, working on wrapping up the project
    • Namratha
      • Refactoring of her code
    • Marco Mambelli
      • Python3 migration

July 29, 2020

Attending: Marco Mambelli, Marco Mascheroni, Dennis Box, Bruno Coimbra, LeRayah Neely-Brown, Namratha Urs, (second part) Dave Dykstra

  • Releases
    • 3.6.3
      • RC3 in OSG development
      • Dennis sent pull request to OSG w/ documentation changes
    • 3.7.1
      • Dennis fixing Fact-Fe token ticket
    • p3
      • python3 version, Factory almost working, mostly str/bytes conflicts
      • sort from a list: python2 requires 2 functions, python3 only one
      • lib/cleanupSupport sort function
      • / is no longer integer division
  • Developers
    • Marco Mascheroni
      • Followed up w/ Edgar: Frequency of enabling/disabling is once a year, so it's OK to have some monitoring transient where it's incorrect
      • Opened a couple of issues: documenting XML monitoring, bug found by operations: global limits in the entry are not respected when you have multiple frontends (it may happen because the submit processes are in parallel)
    • Bruno
      • python3
    • Dennis
      • Feedback 24561, token communication broken
    • LeRayah
      • poster and presentation
    • Namratha
      • integration in GWMS code base
      • git repo in the ticket
    • Marco Mambelli
      • working on Python3 version
  • Dave
    • Singularity 3.6.1: fixed some uncommon bugs
      • Marco Mascheroni will test 3.6.1, there are some major changes in 3.6.x
      • When using privileged Singularity w/ Docker there is some change to do in the configuration. Dave will warn Tony
    • Token
      • htgettoken, utility to fetch tokens from Vault
      • Dave will send an email about the Vault server and htgettoken, Dennis will do some test
      • Dave will work on a Vault credmon. One option is to give it a long-lived Vault token (supertoken) that can read everybody's access tokens. A more secure option is to let a user request a longer-lived token; it is more secure because it can access only the tokens from that user.
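
The Python 3 porting notes under "p3" above (sorting and division) can be illustrated as follows. This is one reading of the sort note, assuming it refers to Python 3 dropping the two-argument `cmp` comparator in favor of a one-argument `key` function; the file names and ages are made up and are not taken from lib/cleanupSupport:

```python
from functools import cmp_to_key


def compare_by_age(a, b):
    """Python 2 style comparator: takes two items, returns negative, zero or positive."""
    return (a[1] > b[1]) - (a[1] < b[1])


files = [("c.log", 30), ("a.log", 10), ("b.log", 20)]

# Python 3 removed the cmp= argument: either wrap the old comparator...
sorted_with_cmp = sorted(files, key=cmp_to_key(compare_by_age))
# ...or replace it with a key function that takes a single item.
sorted_with_key = sorted(files, key=lambda item: item[1])

# "/" is true division in Python 3; use "//" where an integer result is expected.
true_div = 7 / 2
floor_div = 7 // 2
```

Using `//` explicitly keeps the old behavior in places such as index or chunk-size computations, where a float would raise a TypeError.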

July 22, 2020

Attending: Marco Mambelli, Marco Mascheroni, Bruno Coimbra, Mirica Yancey, LeRayah Neely-Brown, Namratha Urs

  • Releases
    • 3.6.3
      • Bruno working on RC
      • Marco Mascheroni is touching base w/ Edgar about pull request and effect on monitoring
    • 3.7.1
      • Factory-Frontend ticket communication in feedback
      • Ready for RC
    • p3
      • Fixing code and pylint scripts
  • Developers
    • Marco Mascheroni
      • Operation work - follow up on how much time the pilot is spending in validation and how many failing validations there are
        • 20% of jobs at Purdue were not matching
      • see TODOs below
    • Bruno
      • Working on Python3 conversion, fixed upgrade and reconfig for Factory
        • Learned how to use pydebug, to attach to a running module - proposed for the group code review
    • LeRayah
      • Working on the testing application
      • Working on composing the Poster and paper
    • Mimi
      • Debugging and adding to the script
      • Move into GWMS once sure it is not working
      • Started writing the slides and paper
    • Namratha
      • Working on the feature: using cvmfsexec and mountrepo
    • Marco Mambelli
      • working on Python3 version
  • TODO:
    • Marco Mascheroni and Bruno will write a wiki document about remote python debugging techniques (breakpoints, dealing w/ fork and multiprocesses)
    • Add a ticket about the possibility of not spawning to ease debug

July 15, 2020

Attending: Marco Mambelli, Marco Mascheroni, Dennis Box, Bruno Coimbra, Mirica Yancey, LeRayah Neely-Brown, Namratha Urs, (second part) Dave Dykstra

  • Releases
    • 3.6.3
      • Ready for RC
      • Dennis sent pull request to OSG w/ documentation changes
    • 3.7.1
      • Factory-Frontend ticket communication in feedback
      • Ready for RC
    • p3
      • Fixing code and pylint scripts
  • Developers
    • Marco Mascheroni
      • working on OSG autoconf - improvements mentioned on stakeholders slides - will be merging to 3.6.3
      • will complete the testing for Edgar's pull request
    • Bruno
      • Working on Python3 conversion, fixed upgrade and reconfig for Factory
        • Learned how to use pydebug, to attach to a running module - proposed for the group code review
    • Dennis
      • completed token documentation
      • Re-reading the RPM generation
    • LeRayah
      • Incorporating feedback from Farrukh, Maria, and Namratha
    • Mimi
      • Testing the application in a local directory, will test in GlideinMonitor
    • Namratha
      • Working on the feature: using cvmfsexec and mountrepo
    • Marco Mambelli
      • working on Python3 version
  • Dave
    • singularity 3.6.0 released in EPEL, added digital signatures on image files.
    • cvmfsexec documentation updated
      • Richard Jones using CVMFSEXEC on his site.
    • tokens
      • htgettoken to retrieve from Vault
      • OIDC (OpenID Connect), designed for web systems, adapted for the command line
      • Vault, secret manager already supporting OIDC protocol, there will be one at Fermilab
        • Regular flow works well w/ web browser as client, does a call back to a server (the client needs to start a little web server)
        • Device flow makes the user go to the web browser but only once. After approval all is automated, the command line will continue
        • htgettoken will use a Vault token (lasting for a week), a grid token (refresh token, lasting for a week) and access tokens used for access

TODO: thread about tokens for pilots. Meeting next Thursday


July 8, 2020

Stakeholders meeting


July 1, 2020

Attending: Marco Mambelli, Marco Mascheroni, Dennis Box, Bruno Coimbra, Mirica Yancey, LeRayah Neely-Brown, Namratha Urs, (second part) Dave Dykstra, Brian Lin

  • Releases
    • 3.6.3
      • Bruno finishing tar-ball install elimination (ticket feedback, finishing documentation changes)
      • A couple of tickets in feedback to be merged
      • Bruno will do RC on Thursday or Friday
      • Dennis to send pull request to OSG w/ documentation changes
    • 3.7.1
      • All blocking tickets are in feedback ready to be merged
      • Dennis working on the Factory-Frontend ticket communication (not a release-blocking ticket)
    • p3
      • Code changes applied, being tested
  • Developers
    • Marco Mascheroni
      • Release work
      • See TODOs
    • Dennis
      • v3.7 tickets, token and documentation work mentioned above
    • Bruno
      • tarball removal ticket
    • LeRayah
      • Reviewing how jobs are submitted, look at the job submission file
      • Tests of jobs submissions changing universe, ...
    • Mirica
      • Adding unit tests to the anonymization script, looking into Python logging
    • Namratha
      • Working on 24546 (addition to CVMFS): testing in the WN environment
    • Marco Mambelli
      • Frontend output to alternative collector
      • Working on python3 migration
  • Reminders and notes
    • Add google-style docstrings when you edit code
    • Bigger code changes could be reserved for Python3 migration, e.g. use of yapf or removing blanks at the end of lines (to limit changes confusing for git blame)
    • yapf or blank removal could be added to commit hook (Marco Mascheroni has one)
TODO:
  • Marco Mascheroni
    • will work on both pull requests from Edgar
    • will follow-up and test why the osg-wn tarball setup is needed even if the RPM is installed (Marco Mambelli suspects some interference by the environment modifications done in the CMS sw - LD_LIBRARY_PATH, ...)

No news about the OSG Singularity wrapper, Mats and Edgar did not join


June 24, 2020

Attending: Marco Mambelli, Marco Mascheroni, Dennis Box, Bruno Coimbra, Mirica Yancey, LeRayah Neely-Brown, Namratha Urs, Brian Lin

  • Releases
    • 3.6.3
      • Bruno documenting tar-ball install elimination.
      • Bruno will complete and do RC by end of the Week
    • 3.7.1
      • TODO Define tickets
        • Meeting w/ Dennis, blockers: [#24544, #24561, #24565, #23278]; if possible the Frontend/Factory token communication [#24448]
        • Then Dennis will do RC w/ Marco
  • Developers
    • Bruno
      • Mostly busy w/ shift and HEPCloud
    • LeRayah
      • running jobs and talking about the design
    • Mimi
      • complete user name recognition in condor logs
      • planning on improving it and adding IP addresses
    • Namratha
      • adding the ability to use CVMFS
      • testing CVMFSExec
    • Mascheroni
      • Tickets in feedback
      • Will work on merging pull request
    • Marco Mambelli
      • Work on python3 migration (prep work, tests w/ git): initial version will have the old dir structure
      • Frontend output to alternative collector

June 17, 2020

Attending: Marco Mambelli, Marco Mascheroni, Dennis Box, Bruno Coimbra, Mirica Yancey, LeRayah Neely-Brown, Namratha Urs, Dave Dykstra, Mats Rynge

  • Releases
    • 3.6.3
      • Bruno documenting tar-ball install elimination. Marco will send feedback on code
      • Bruno will do RC on Thursday
    • 3.7.1
      • fixed some bugs w/ 3.7 (tarball creation, setup)
  • Developers
    • Marco Mascheroni
      • Thread w/ Shasad (images, certificates and osg-wn-client)
      • Fixes for OSG autoconf
    • Bruno
      • working on documentation for tarball removal
      • Will do Release Candidate
    • LeRayah
      • Hiccup w/ Factory, now all works
      • Will give feedback on documents
    • Mirica
      • GlideinMonitoring in the
      • Troubleshooting jobs on Factory and Frontend
      • Notes for documents feedback
      • Initial version of a script to remove IP addresses
    • Namratha
      • Installation and configuration of Factory and Frontend. Making config correctly, reached out to Dennis and Bruno. Proxy not renewed. Finishing setup
      • MARCO TODO: Do ticket CVMFSExec
      • Working on documentation feedback
    • Marco Mambelli
      • Frontend secondary output
      • Fix tar ball generation in 3.7

Edgar is interested in using tokens for the startd connecting to the collector

TODO (Marco Mambelli):
  • priority to the script
  • changes in the frontend

-----
TEMPLATE
h2. June 10, 2020

Attending:

  • Releases
  • Developers
    • Marco Mascheroni
    • Dennis
    • Bruno
    • LeRayah
    • Mirica
    • Namratha
    • Marco Mambelli

TODO:


June 3, 2020

Attending: Marco Mambelli, Marco Mascheroni, Dennis Box, Bruno Coimbra, Mirica Yancey, LeRayah Neely-Brown, Namratha Urs, Dave Dykstra, Mats Rynge

  • Releases
    • 3.6.3
      • Bruno making progress on tar-ball install elimination. Factory completed, Frontend by tomorrow
      • Marco Mascheroni
      • Bruno will do RC on Friday
    • 3.7.1
      • 24250 in review
      • 22413, 22378 condor config, should be closed today.
  • Developers
    • Marco Mascheroni
    • Dennis
      • Working mainly on jobsub
    • Bruno
      • Dropping of tar-ball installation
    • LeRayah
      • installed factory and frontend, will start working on the project
    • Mirica
      • Install of Factory and Frontend
      • Research on anonymization techniques
    • Namratha Urs
      • Joined the team, getting onboard
    • Marco Mambelli
      • Frontend submitting to a secondary collector
      • Singularity wrapper script
  • Dave
    • Singularity 3.6 closer, 4th RC.
      • significant changes to the environment variable handling
        • TODO: Marco Mascheroni, Mats Rynge will test the new Singularity: /cvmfs/oasis.opensciencegrid.org/mis/singularity/3.6.0~rc.4/bin/singularity
    • CVMFS 2.7.3 tagged next week
      • ducc - CVMFS tool to publish a container
      • Quick replication of pads in CVMFS (reuse already published containers) will be in CVMFS 2.8
  • Singularity discussion
    • Marco
      • The script is not ready yet
      • Discussion w/ LIGO yesterday to move towards HTCondor invocation of Singularity
        • Drivers: condor_ssh_to_job, simplify the GWMS/VO software, HTCondor people do not like wrappers, no progress in the past year
        • Two roads:
          • GWMS script will use condor and allow VO scripts
          • Use directly condor and no wrapper. LIGO will test this way (Edgar, LIGO person, HTCondor person)
    • Mats Rynge
      • About the LIGO initiative
        • Does not see the point
        • VOs, LIGO especially, currently ask for a lot of extra stuff
        • anyway the more we can simplify the better
      • waiting on the script to run in Singularity
      • working on things he wants to test before signing off
    • Dave asked about CVMFSExec
      • no progress on it yet

May 27, 2020

Attending: Marco Mambelli, Marco Mascheroni, Dennis Box, Bruno Coimbra, Mirica Yancey, LeRayah Neely-Brown

  • Releases
    • 3.6.3
      • Bruno making progress on tar-ball install elimination. Working on Factory
      • Marco Mascheroni did a hot patch to reduce the Factory load. It considers grid jobs completed as stage-out instead of the current unknown that was included in the idle (see below).
      • Bruno will do RC on Monday
    • 3.7.1
      • 24250 - was breaking the CI, now fixed; other problems were fixed too and it behaves more correctly, almost ready for review
      • 22413, 22378 condor config, should be closed today.
  • Developers
    • Marco Mascheroni
      • Troubleshoot load issue on the factory, mostly due to condor
        • Trouble filling Syracuse: condor is not immediately seeing the changes on the CE (stageouts are blocking) and the Factory handles unknown grid job statuses badly. Some jobs in completed state were categorized as unknown and counted as idle, which prevented job completion. The patch considers them as running/stageout.
        • Running jobs with grid job status completed were classified as unknown (the status hash in glidefactorylib did not consider completed, so the status was unknown, returning 1100). Unknown was counted as idle; it is now added to the running/stageout.
        • TODO Will create ticket
      • Will finish other 3.6.3 tickets
    • Dennis
      • 3.7.1 work above
      • 2448 - condor token auth between factory and frontends. Condor week provided good info (besides the renaming they improved the documentation of tokens). How tokens are handled w/ regex in the mapfile came in handy for the ticket. Hopefully completed this week.
        • Default - condor operator, accept all tokens from this offsite machine, IP. Other authentications/exchanges may be better. TODO: Will touch base w/ Brian Lin
    • LeRayah
      • Did the orientation ticket, is installing GWMS
      • Deciding how to proceed for the project
    • Mirica
      • Working on orientation, getting into the VM still problematic.
      • Some work w/ TARGET material changed the PowerPoint, updating the site
    • Marco Mambelli
      • Creating intern tickets, and administrative tasks
      • Script for running VO scripts in Singularity
      • Frontend publishing on a different collector
  • Follow-up w/ Mirica, TODO:
    • Will post on Slack advertising her availability for support
    • Will check the Slack channel daily to answer questions, especially on the Python Notebooks
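
The reclassification described in Marco Mascheroni's Factory load notes above can be sketched as follows. This is not the glidefactorylib code: the function names, the grid status labels and the two bucket names are assumptions made for illustration. Only the 1100 "unknown" code and the idle versus running/stageout counting come from the notes.

```python
UNKNOWN_CODE = 1100  # value quoted in the notes for an unrecognized grid job status

# Hypothetical grid job status labels and the bucket each one is counted in.
BUCKET_BY_GRID_STATUS = {
    "PENDING": "idle",
    "ACTIVE": "running",
    "STAGE_OUT": "running",
}


def bucket_for(grid_status, patched):
    """Return the bucket for one glidein, or UNKNOWN_CODE if not recognized."""
    if patched and grid_status == "COMPLETED":
        # Patched behavior: a completed grid job is treated as stage-out.
        return "running"
    return BUCKET_BY_GRID_STATUS.get(grid_status, UNKNOWN_CODE)


def count_buckets(grid_statuses, patched=True):
    """Count glideins per bucket; unknown statuses are counted as idle."""
    counts = {"idle": 0, "running": 0}
    for grid_status in grid_statuses:
        bucket = bucket_for(grid_status, patched)
        if bucket == UNKNOWN_CODE:
            bucket = "idle"
        counts[bucket] += 1
    return counts
```

Without the patch a completed glidein inflates the idle count, so the Factory believes the entry still has pending glideins; with the patch it is counted with running/stageout.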

May 20, 2020

Attending: Marco Mambelli, Marco Mascheroni, Dennis Box, Bruno Coimbra, Brian Lin, Mirica Yancey, LeRayah Neely-Brown
Joining for the wrapper discussion (10:30): Dave Dykstra, Mats Rynge

  • Releases
    • 3.6.3 - move tickets, release in 3 weeks
    • 3.7 - Marco Mascheroni ran sleep jobs on CMS ITB dev
    • 3.7.1 - Locked on the Token ticket, almost ready for review (unit test failing)
  • Guests
    • Brian Lin
      • Interested in Frontend-Factory token auth progress, will touch base w/ Dennis
  • Developers
    • Dennis
      • working on CI problem for 3.7.1
      • Working on token communication between Frontend and Factory
    • Bruno
      • Working on Tar-ball removal, expected 3.6.3
    • Marco Mascheroni
      • Tickets from last week
      • Tickets in 3.7 on how to handle attributes
    • Marco Mambelli
      • interns onboarding
      • ticket for Frontend output on different collector
      • singularity wrapper for scripts
      • Worked w/ Thomas to restart the python3 migration effort
    • Mirica
      • on-boarding
    • LeRayah
      • on-boarding
TODO: Marco Mascheroni will send an email about the WMAgent requirement for condor_chirp
  • It is a WMAgent bug, will be fixed but will take months, so should be supported
    • missing or empty CONDOR_CONFIG will cause WMAgent not to use condor_chirp
    • w/ CONDOR_CONFIG it will look at the condor_chirp location there
    • a fake CONDOR_CONFIG w/ no condor_chirp location (e.g. only a comment) or a CONDOR_CONFIG with unreachable condor_chirp location, will cause WMAgent to look for condor_chirp in the PATH. This is what we want.
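
The three cases above can be summarized in a small sketch. It models the behavior as described in these notes and is not WMAgent code: the function name and parameters are hypothetical, and how the condor_chirp location is derived from CONDOR_CONFIG is not covered in the notes, so the caller passes it in as `configured_location`.

```python
import os
import shutil


def find_condor_chirp(condor_config, configured_location=None,
                      is_executable=None, search_path=shutil.which):
    """Return the condor_chirp to use, or None if it would not be used.

    Args:
        condor_config (str): value of CONDOR_CONFIG (None or "" if unset).
        configured_location (str): condor_chirp path derived from the config,
            None if the config gives no location (hypothetical input).
        is_executable (callable): check that a path is usable.
        search_path (callable): lookup of a command name in the PATH.
    """
    if is_executable is None:
        def is_executable(path):
            return os.path.isfile(path) and os.access(path, os.X_OK)
    if not condor_config:
        # Missing or empty CONDOR_CONFIG: condor_chirp is not used at all.
        return None
    if configured_location and is_executable(configured_location):
        # CONDOR_CONFIG points to a reachable condor_chirp.
        return configured_location
    # Fake config or unreachable location: fall back to the PATH.
    return search_path("condor_chirp")
```

The last branch is the one the notes call "what we want": a placeholder CONDOR_CONFIG makes the lookup fall through to whatever condor_chirp is in the PATH.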

TODO: Check w/ stakeholders if developers version could be python3 (after the first python3 version is reliable, use OSG contrib in the meantime)

  • Dave
    • Singularity
      • Company behind Singularity: Greg C, Sylabs CEO and Singularity creator, started a new not-for-profit company, ctrl-cmd (control command), to make it easier to work w/ grants, while retaining technical leadership of Singularity. Sylabs is for-profit and will support itself. Both companies will be supporting Singularity.
      • New GitHub project hpc-ng, for singularity and other HPC projects
      • some delay with 3.6.0
    • CVMFS 2.7.3 next month
      • CVMFS integration

May 13, 2020

Stakeholders meeting, see: Stakeholders_Meeting_May-13-2020


May 6, 2020

Attending: Marco Mambelli, Marco Mascheroni, Dennis Box, Bruno Coimbra, Jeff Dost, Brian Lin
Joining for the wrapper discussion: Dave Dykstra, Mats Rynge

  • Releases
    • 3.6.2
    • 3.6.3 - move tickets, release in 3 weeks
    • 3.7.1 - Locked on the Token ticket, almost ready for review
  • Guests
    • Jeff - There is a RedHat 8 test site - are we ready (worker nodes)?
      TODO: Marco will send email to J Dost, B Lin
  • Developers
    • Dennis
      • Review of CI ticket
      • Documentation on SciToken to CE
      • Working on HTCondor token between Factory and Frontend
        • Condor ping between the 2 machines works,
    • Marco Mascheroni
      • Finish the ticket about cleaning up old reconfigure
      • Adding pre and post reconfigure hooks
      • 2 more things would like in 3.6.3:
        • Fix for draining glideins
        • Adding limits to OSG autoconf
    • Marco Mambelli
      • Sphinx documentation
      • Checking releases
    • Bruno
      • Mainly HEPCloud
    • Marco Mambelli
      • Singularity tickets
      • Planning for 3.6.3 release
      • Preparing for Summer Interns
  • Singularity wrapper discussion
    • Mats and Edgar
      • will test on OSG with an "ITB" group
      • will start separating the parts of the wrapper that run inside vs outside singularity
    • Marco
      • Will prepare a helper script to allow scripts ("files" in the configuration) to run w/ the default (for the group) singularity image
    • On a first instance the ability to run at startup will be OK
    • HTCondor will work on allowing to run startd cron jobs in Singularity
    • Overarching ticket and specific sub tickets to track the work
TODO
  • Marco Mambelli will prepare the tickets to track this activity

April 22, 2020

Attending: Marco Mambelli, Dennis Box, Bruno Coimbra, Marco Mascheroni, Dave Dykstra, Brian Lin

  • Dave
    • New functionality for CVMFS synccvmfs started through Singularity
    • CVMFSexec has been improved
    • CVMFS 2.7.2 tagged, released on Monday
    • Singularity 3.6 tagged, will be released in a couple of weeks
  • Releases
    • 3.6.2 is in OSG production
    • 3.7 is in OSG upcoming-testing and will be soon in upcoming
      • TODO: Send note about the file to touch in Factory to Tim Theisen and M.Mascheroni
      • Marco Mascheroni will test 3.7 on CERN ITB dev (Factory and Frontend)
    • 3.7.2 with tokens between Factory and FE will be out in about 2 months (asked by B.Lin)
  • Developers
    • Bruno
      • condor_chirp added to the PATH, will work in Singularity [#] . Make 2 tickets. Inside and outside singularity
      • Tickets review
    • Dennis
      • Sidetracked w/ Jobsub stuff
      • Will work on GWMS starting later this week
      • Tested 3.7 but not the logging yet
    • Marco Mascheroni
      • Baby is doing well besides not sleeping at night (that is OK)!
      • The cache should be already cleared when the Frontend is started [#]
        • Marco Mambelli will try to verify better
      • Singularity wrapper
        • Patch to clear pythonpath
        • source osgui from cvmfs - should be probably sourced by the CMS script, Marco Mambelli will look into it.
      • PIC has no outbound connectivity, requires 8.9.7. Problem in testing: Constant attribute in the entry is overwritten by the Frontend. The global attribute is not constant. There should be a warning at least. Will troubleshoot.
      • Glidein in downtime: is an end time in the past an oversight or is the machine still down (both could be OK depending on interpretation)
        • TODO Mascheroni: will check at the SI meeting which could be the correct policy
      • X509 ticket troubleshooting: forgot to merge the change
        • TODO: put an automatic process to avoid in the future: easy, low overhead for developers
    • Marco Mambelli
      • Troubleshoot w/ Edgar the schedd problem - red herring
      • Update the Singularity wrapper w/ OSG changes
      • Manage better the environment cleanup

TODO:


April 8, 2020

Marco Mambelli, Dennis Box, Bruno Coimbra,

Marco Mascheroni on paternity leave. Welcome Sofia!

  • Releases
    • 3.6.2 in OSG production
    • 3.7
  • Dave
    • CVMFS 2.7.2 available
    • New Singularity next month. Message today from CEO Greg that will grow the team.
      • Current environment handling: -e (--cleanenv) for a clean environment; a variable set as SINGULARITYENV_VARNAME on the host becomes VARNAME in the container
      • There is a Singularity discussion on GitHub thinking about redoing how variables are handled: https://github.com/sylabs/singularity/issues/5040
    • RH 7.8 is going to have unprivileged fuse mount in user namespaces (like RH8). Released and beta SL. This way they will support Podman.
  • Developers
    • Marco Mascheroni
    • Dennis
      • [#24285] - Problem in the tarball generator, Those files should not be on RH6 tarballs
      • Scitoken stuff
    • Bruno
      • Worked on HEPCloud this week
    • Marco Mambelli
      • Release 3.7
      • Test singularity wrapper for CMS w/ Marco Mascheroni
      • Remove PYTHONPATH from singularity jobs
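
The environment handling Dave described above can be modeled with a short sketch. It does not invoke Singularity; it only mimics the prefix rule, assuming the prefix is spelled SINGULARITYENV_ as in the Singularity documentation. The function name is hypothetical, and variables that Singularity sets on its own inside the container are ignored here.

```python
PREFIX = "SINGULARITYENV_"


def container_environment(host_env, clean=True):
    """Return a simplified view of the environment seen inside the container.

    Args:
        host_env (dict): environment of the process invoking singularity.
        clean (bool): True models the clean environment option (-e).
    """
    if clean:
        container_env = {}
    else:
        # Simplification: everything except the prefixed variables is inherited.
        container_env = {name: value for name, value in host_env.items()
                         if not name.startswith(PREFIX)}
    for name, value in host_env.items():
        if name.startswith(PREFIX) and len(name) > len(PREFIX):
            # SINGULARITYENV_VARNAME on the host becomes VARNAME inside.
            container_env[name[len(PREFIX):]] = value
    return container_env
```

In this model a host PYTHONPATH leaks into the job only when the clean option is not used, which is the situation the PYTHONPATH removal item is about.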

April 1, 2020

Marco Mambelli, Dennis Box, Bruno Coimbra, Marco Mascheroni

  • Release
  • Developers
    • Marco Mascheroni
    • Dennis
      • HTCondor CE working with scitoken
    • Bruno
    • Marco Mambelli

March 25, 2020

Marco Mambelli, Dennis Box, Bruno Coimbra, Marco Mascheroni, Dave Dykstra

  • Release
    • 3.6.2
      • Mascheroni has it installed on UCSD, CERN/-ITB, and CMS ITB FE
      • Test singularity
        • Not clearing PYTHONPATH
        • OSGUI missing on some sites, failing because GFAL is not finding plugins
        • arc command missing, the singularity startup script ends up calling itself recursively w/ a lot of stderr messages (3GB) (will work w/ Dave Dykstra)
        • condor_chirp could not be found (command -v condor_chirp) - assign ticket to Bruno
    • 3.7
  • Dave
    • Singularity, developers call, plugin to run unprivileged also for building images and running image files, using the Unix kernel library, can do fuse mount unprivileged also in old OS. Plugin support may be in 3.6 (July), plugin may be available later.
    • Investigating vault. vault plugin to store oauth2 secrets (from puppet labs). Store a refresh token as secret and each time you read it gives a regular token.
      • it is a heavier service, not to be deployed on glideins, but could be used if installed on nodes
  • Developers
    • Marco Mascheroni
      • CRIC ticket will be separated and
      • Now Factory ops are more proactive in providing condor tarball; each time there is a reconfig there is a new version of the tarball. We need a better handling of condor tarball. New copy is done also if the file is the same
        • TODO: Marco Mascheroni will create a ticket
    • Dennis
    • Bruno
    • Marco Mambelli

March 18, 2020

Marco Mambelli, Dennis Box, Bruno Coimbra, Marco Mascheroni

  • Release
    • 3.6.2
      • RC2
      • Test
    • Release 3.7
      • Ticket reviewed, ready, RC
  • Dave
    • New Singularity RC 3.5.3, mostly bug fixes
    • Version in CVMFS is 3.5.2, many sites have still 3.4.x
    • Expecting new CVMFS release (MaxMind GeoIP, still free but you need a license)
  • Developers
    • Marco Mascheroni
      • Found a missing os.path.realpath() in entry-ls
      • OSG collector autoconf limit added
      • Reported to a face to face all-hands
      • Have a section in the 99-local-ini file to configure the pilot (will be used by CRIC generate entry configuration)
      • Right now we are depending on having an entry in the factory to run manual submit
      • Add hooks on the reconfigure script - will open a ticket
    • Dennis
      • Install new CE that works w/ SciToken
      • Transporting the SciToken to the Factory that is using it for the authentication
    • Bruno
      • Testing
    • Marco Mambelli
      • Testing, release RC
      • Ticket about handling properly module setup (MODULE_USE)

TODO MM - Follow up w/ Sakib about the module function/ env variable (today)


March 11, 2020

Stakeholders meeting


March 4, 2020

Marco Mambelli, Dennis Box, Bruno Coimbra, Marco Mascheroni

  • Release
    • The following tickets should be included in 3.6.2
      • Marco Masc has a new ticket not yet defined about reducing Frontend queries (caching the schedd info)
      • CRIC, may go to the next release. There are lib and cfg directories. The content of lib should go in the GWMS lib and util.py should get a more specific name. Cfg file should start w/ a template/default and go in the /etc directory, possibly consider yaml not to add a new configuration language and mix code w/ configuration
    • Release 3.7
      • Ticket reviewed, ready
    • Cutting releases
      • Bruno and Dennis to try the whole process
  • Developers
    • Marco Mascheroni (via email)
    • Dennis
      • Ticket for the release
      • Progress w/ scitoken authentication - condor maps token to identities via mapfile, getting close to have the full chain on the Frontend
        • Authenticating schedd w/ collector using token. Probably that step will not be in the final architecture, the authentication is w/ CE
      • Will start automated tests for 3.6.2.rc2
    • Bruno
      • Ticket reviews
      • Started working on dropping tarball
    • Marco Mambelli
      • Fixing release script
      • Reviewing tickets
      • Cut 3.6.2.rc2
  • TODO
    • Everyone
      • provide stakeholders slide to Marco Mambelli
    • Marco Mambelli
      • Check w/ Margaret about stakeholders meeting calendar
      • Collect slides and send for review to developers

February 26, 2020

Marco Mambelli, Dennis Box, Bruno Coimbra, Brian Lin, Dave Dykstra

  • Dave
    • Singularity 3.5.3 out, will be in the last OSG 3.4 release and OSG 3.5
    • CVMFS: pushing for wider adoption of EGI configuration instead of CERN one, so OSG repositories are available
      • uBoone is the first osg.storage.org replicated in Europe, others read directly from FNAL or BNL
      • A monitor to log accesses is being added (to know which repos are used in Europe)
    • Tokens: HashiCorp (the same company as Vagrant) Vault, general-purpose storage for secrets, including refresh tokens (e.g. for AWS). Dave is testing it.
  • Release
    • The following tickets should be included in 3.6.2
      • All tickets currently in Feedback/Accepted
      • CRIC, would like to include all the changes and scripts that are used in production( there has been meeting w/ Jeff and Edita about future directions)
    • Release 3.7
      • Ticket reviewed, ready
  • Developers
    • Marco Mascheroni (via email)
      • worked on the proxy and CRIC tickets
      • working on CHEP proceedings this week
    • Dennis
      • Will check and integrate the feedback.
      • Will send an email to Brian and Brian because the current solution is using sudo
      • Did a demo using sci-token authentication. Was able to authenticate to the CE
    • Bruno
      • reviewing Dennis ticket
    • Marco Mambelli
      • GPU monitoring in Singularity
      • Work on #23451, Allow a Frontend to run in parallel w/o affecting the Factory
      • Will cut soon a RC

February 19, 2020

Marco Mambelli, Dennis Box, Bruno Coimbra, Brian Lin, Dave Dykstra

  • Dave
    • Singularity 3.5.3 soon to fix filesystem bug, 3.5.2 is the currently recommended one (OASIS has still 3.4.2, will change)
    • CVMFS bug fix in the client for stashcache type operation, w/ xrootd redirector, was not retrying multiple servers
  • Brian Lin:
    • Interested in an update on token support and offered OSG support
  • Release
    • The following tickets should be included in 3.6.2
      • All tickets currently in Feedback/Accepted
      • CRIC, would like to include all the changes and scripts that are used in production( there has been meeting w/ Jeff and Edita about future directions)
  • Developers
    • Marco Mascheroni
      • worked on the proxy and CRIC tickets
      • working on CHEP proceedings this week
    • Dennis
      • The ticket about condor token-auth will be ready today
    • Bruno
      • condor_chirp replacements, former pychirp, ready for feedback
    • Marco Mambelli
      • Work on #23451, Allow a Frontend to run in parallel w/o affecting the Factory
  • TODO
    • Show to Dennis and Bruno the release procedures

February 12, 2020

Marco Mambelli, Dennis Box, Bruno Coimbra, Brian Lin, Dave Dykstra

  • Dave
    • Singularity 3.5.3 soon to fix filesystem bug, 3.5.2 is the currently recommended one (OASIS has still 3.4.2, will change)
    • CVMFS bug fix in the client for stashcache type operation, w/ xrootd redirector, was not retrying multiple servers
  • Brian Lin:
    • Interested in an update on token support and offered OSG support
  • Release
    • The following tickets should be included in 3.6.2
      • All tickets currently in Feedback/Accepted
      • CRIC, would like to include all the changes and scripts that are used in production( there has been meeting w/ Jeff and Edita about future directions)
  • Developers
    • Marco Mascheroni
      • worked on the proxy and CRIC tickets
      • working on CHEP proceedings this week
    • Dennis
      • The ticket about condor token-auth will be ready today
    • Bruno
      • condor_chirp replacements, former pychirp, ready for feedback
    • Marco Mambelli
      • Work on #23451, Allow a Frontend to run in parallel w/o affecting the Factory
  • TODO
    • Show to Dennis and Bruno the release procedures

February 5, 2020

Marco Mambelli, Dennis Box, Bruno Coimbra, Marco Mascheroni

  • Release
    • 3.6.2 RC1 available
    • Waiting for Bruno and Dennis logins at UW to do next release candidates
  • Developers
    • Marco Mascheroni
      • CHEP paper
      • Feedback of tickets applied
      • SI meeting, pressure about the monitoring
        • 3 factories, have to check in 3 different places
        • breakdown per entry, group by site
    • Dennis
      • The ticket about condor token-auth will be ready soon
    • Bruno
      • changes done in pychirp
      • will be working on tokens
    • Marco Mambelli
      • Work on #23451, Allow a Frontend to run in parallel w/o affecting the Factory
      • worked on tickets feedback
      • RC release and updated release documents
AOB-TODO
  • queries to the collector, reducing the load
    Frontend is doing 1400 queries/hr, mostly to find the schedd name
    changing to cache the results of condor_status for the schedd name
  • Antonio working on configuring resources w/o outbound connectivity,
    condor version using file system instead of network
    Use the one already installed in the WN instead of the one we ship
    patched condor_startup to use a local condor instead of the one we ship
  • Facilitate CMS-Kevin collaboration
    -----
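
The caching idea in the first AOB item could be sketched as a small time-based cache. This is only an illustration of the approach, not the Frontend change: the class name, the `fetch` callable (standing in for the condor_status query) and the time-to-live value are all assumptions.

```python
import time


class TTLCache:
    """Cache query results for a fixed time to reduce collector queries."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch        # function doing the real (expensive) query
        self._ttl = ttl_seconds    # how long a cached result stays valid
        self._clock = clock        # injectable clock, useful for testing
        self._entries = {}         # key -> (expiry time, cached value)

    def get(self, key):
        """Return the cached value for key, querying only if missing or expired."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = self._fetch(key)
        self._entries[key] = (now + self._ttl, value)
        return value
```

With a time-to-live of a few minutes, repeated lookups of the same schedd within one Frontend cycle hit the collector once instead of every time.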

January 29, 2020

Marco Mambelli, Dennis Box, Bruno Coimbra, Marco Mascheroni

  • Release
    • The following tickets should be included in 3.6.2
      • All tickets currently in Feedback/Accepted
      • CRIC, would like to include all the changes and scripts that are used in production( there has been meeting w/ Jeff and Edita about future directions)
  • Developers
    • Marco Mascheroni
      • worked on the proxy and CRIC tickets
      • working on CHEP proceedings this week
    • Dennis
      • The ticket about condor token-auth will be ready today
    • Bruno
      • condor_chirp replacements, former pychirp, ready for feedback
    • Marco Mambelli
      • Work on #23451, Allow a Frontend to run in parallel w/o affecting the Factory
  • TODO
    • Show to Dennis and Bruno the release procedures

January 22, 2020

Marco Mambelli, Dennis Box, Bruno Coimbra, Brian Lin

  • Release
    • The following tickets should be included in 3.6.2
      • Extend the proxy lifetime copying the renewed proxy
      • CRIC, would like to include all the changes and scripts that are used in production( there has been meeting w/ Jeff and Edita about future directions)
      • pychirp (condor_chirp replacement)
  • Developers
    • Marco Mascheroni
      • worked on the proxy and CRIC tickets
      • Will be getting some monitoring information to help Antonio
    • Dennis
      • Solved problem w/ CI (need to fix the merge conflict, then it is OK)
      • Playing w/ instructions w/ sci-tokens but have not been working so far
    • Bruno
      • Working on shipping pychirp, including it in release
    • Marco Mambelli
      • Work on #23451, Allow a Frontend to run in parallel w/o affecting the Factory

January 15, 2020

Marco Mambelli, Dennis Box, Bruno Coimbra, Brian Lin, Dave Dykstra

  • Dave
    • Singularity 3.5.3 soon to fix filesystem bug, 3.5.2 is the currently recommended one (OASIS has still 3.4.2, will change)
    • CVMFS bug fix in the client for stashcache type operation, w/ xrootd redirector, was not retrying multiple servers
  • Brian Lin:
    • Discussion about the CE
  • Developers
    • Marco Mascheroni
      • Finished the work about held glideins
      • Improvement on manual submit glidein, will open the ticket
    • Dennis
      • Progress on token-auth, using separate passwords on each token #23092
    • Bruno
      • Working on shipping pychirp, startup was expecting to run the wrapper
    • Marco Mambelli
      • Work on Use of Python constructs like context managers #22470
      • Work on #23451, Allow a Frontend to run in parallel w/o affecting the Factory
  • Brian and Marco discussing hosted CE 34 configuration

January 8, 2020

Marco Mambelli, Dennis Box, Bruno Coimbra, Brian Lin

  • Releases
    • 3.6.1 in OSG production
      • ITB factories are 3.6.1
      • 3.6.1 Singularity scripts applied to 3.4.6 production factories
    • 3.6.2 in 2 weeks
    • 3.7 in 2 weeks
      • No updates on 3.7
      • Dennis for token-auth
      • Marco for Leonardo's log and glidein_startup changes
  • Brian is interested in an update over the token integration
    • token-auth for daemon (glideins) will be in 3.7
    • Dennis and Marco contacted Andrea Ceccanti and have been experimenting w/ wlcg-token but no GWMS development
    • Brian Lin, will let us know about the workflow to get a token for sci-token submission
  • Developers
    • Marco Mascheroni
      • Worked on #23340, avoid releasing glideins
      • Discussion w/ Edgar about running HTCondor 8.9 in glideins, will open ticket
      • Will complete some 3.6.2 work this week and will be on vacation the following 2 weeks
    • Marco Mambelli
      • Troubleshoot and work on #22245, the problem is CONDOR env variables not being used by python bindings
      • Troubleshooting frontend problem w/ Steve
      • Packaging GlideinMonitor
    • Dennis
      • 23467, CI changes: works in command line, failing in nightly CI. TODO: check and follow-up w/ Vito
      • tokens: tested new condor 8.9.5, fixes some problems noted before, changes not completely understood
    • Bruno
      • Learning more about GWMS code
      • Working on pychirp, #21711, will be adding the script to the Factory scripts
    • Marco
      • Working on Frontend reporting to separate collector, #23451
      • Closed some other tickets
      • Cleanup and organization of wiki and meetings