Bug #6529

ClassAd UpdateSequenceNumber increasing too fast

Added by Igor Sfiligoi over 5 years ago. Updated over 5 years ago.

Status: Closed
Priority: Normal
Assignee: Parag Mhashilkar
Category: -
Target version:
Start date: 06/20/2014
Due date:
% Done: 0%
Estimated time:
First Occurred:
Occurs In:
Stakeholders:
Duration:

Description

While debugging lost classads in the Factory collector, we discovered that the UpdateSequenceNumber generation is broken.

That number gets increased at a global level, instead of at a per-classad level.
So two successive classads with the same name will have it increased by ~200, instead of by exactly 1.

This makes monitoring the loss statistics impossible using Condor tools.
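
To illustrate the difference described above, here is a minimal sketch, not the actual Factory code. The class names `GlobalSequence` and `PerAdSequence` and the helper `deltas` are hypothetical and exist only for this example. It compares a single shared counter with one counter per classad Name and reports the step each classad sees between two advertise rounds.

```python
from collections import defaultdict


class GlobalSequence:
    """Problematic pattern: one counter shared by every classad."""

    def __init__(self):
        self._last = 0

    def next_for(self, name):
        # The name is ignored, so every classad advances the same counter.
        self._last += 1
        return self._last


class PerAdSequence:
    """Intended pattern: an independent counter for each classad Name."""

    def __init__(self):
        self._last = defaultdict(int)

    def next_for(self, name):
        self._last[name] += 1
        return self._last[name]


def deltas(seq, names, rounds=2):
    """Advertise every name once per round and return, for each name,
    the step between its last two sequence numbers."""
    last = {}
    step = {}
    for _ in range(rounds):
        for name in names:
            value = seq.next_for(name)
            if name in last:
                step[name] = value - last[name]
            last[name] = value
    return step


names = ["entry_a", "entry_b", "entry_c"]  # hypothetical classad names
print(deltas(GlobalSequence(), names))  # each step equals len(names)
print(deltas(PerAdSequence(), names))   # each step equals 1
```

With a shared counter the step seen by each classad grows with the number of classads advertised per round, so a gap larger than 1 cannot be told apart from a lost update.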

History

#1 Updated by Igor Sfiligoi over 5 years ago

Here is the output for two consecutive condor_advertise calls, for one random classad:

grep -e '^Name ' -e '^UpdateSequenceNumber ' gfi_adm_gf_353279066_5156.140620_084435.140620_084440 |grep -B 1 CMSHTPC_T3_US_Omaha_tusker@v3_0@SDSC
UpdateSequenceNumber = 29748
Name = "CMSHTPC_T3_US_Omaha_tusker@v3_0@SDSC" 

grep -e '^Name ' -e '^UpdateSequenceNumber ' gfi_adm_gf_353279299_5156.140620_084828.140620_084834 |grep -B 1 CMSHTPC_T3_US_Omaha_tusker@v3_0@SDSC 
UpdateSequenceNumber = 29797
Name = "CMSHTPC_T3_US_Omaha_tusker@v3_0@SDSC" 

And here you can see the (filtered) content of one condor_advertise:

grep -e '^Name ' -e '^UpdateSequenceNumber ' gfi_adm_gf_353279066_5156.140620_084435.140620_084440 |tail -20
UpdateSequenceNumber = 29739
Name = "CMS_T2_ES_IFCA_gridce01@v3_0@SDSC" 
UpdateSequenceNumber = 29740
Name = "CMS_T2_DE_RWTH_grid-ce@v3_0@SDSC" 
UpdateSequenceNumber = 29741
Name = "CMS_T2_US_Wisconsin_cms01@v3_0@SDSC" 
UpdateSequenceNumber = 29742
Name = "CMS_T2_UK_SGrid_Bristol_lcgce04_medium@v3_0@SDSC" 
UpdateSequenceNumber = 29743
Name = "CMS_T2_EE_Estonia_ce3_main@v3_0@SDSC" 
UpdateSequenceNumber = 29744
Name = "CMS_T3_TW_NTU_HEP_grid5@v3_0@SDSC" 
UpdateSequenceNumber = 29745
Name = "CMS_T2_UK_SGrid_Bristol_lcgce03_medium@v3_0@SDSC" 
UpdateSequenceNumber = 29746
Name = "CMS_T2_US_Caltech_cit@v3_0@SDSC" 
UpdateSequenceNumber = 29747
Name = "Engage_US_MWT2_iut2@v3_0@SDSC" 
UpdateSequenceNumber = 29748
Name = "CMSHTPC_T3_US_Omaha_tusker@v3_0@SDSC" 
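
The manual grep comparison above could be automated along these lines. This is only a sketch: the function names `parse_sequence_numbers` and `sequence_deltas` are hypothetical, and it assumes that the UpdateSequenceNumber line precedes the Name line of the same classad, as in the filtered output shown in this comment.

```python
import re

_SEQ_RE = re.compile(r'^UpdateSequenceNumber\s*=\s*(\d+)$')
_NAME_RE = re.compile(r'^Name\s*=\s*"([^"]*)"$')


def parse_sequence_numbers(lines):
    """Map each classad Name to the UpdateSequenceNumber seen just before it."""
    result = {}
    pending = None
    for raw in lines:
        line = raw.strip()
        match = _SEQ_RE.match(line)
        if match:
            pending = int(match.group(1))
            continue
        match = _NAME_RE.match(line)
        if match and pending is not None:
            result[match.group(1)] = pending
            pending = None
    return result


def sequence_deltas(first, second):
    """Return the sequence number step for every Name present in both dumps."""
    return {name: second[name] - first[name]
            for name in first.keys() & second.keys()}


# The two samples quoted above for the same classad.
first = parse_sequence_numbers([
    'UpdateSequenceNumber = 29748',
    'Name = "CMSHTPC_T3_US_Omaha_tusker@v3_0@SDSC" ',
])
second = parse_sequence_numbers([
    'UpdateSequenceNumber = 29797',
    'Name = "CMSHTPC_T3_US_Omaha_tusker@v3_0@SDSC" ',
])
print(sequence_deltas(first, second))
# {'CMSHTPC_T3_US_Omaha_tusker@v3_0@SDSC': 49}
```

For these two samples the step is 49, where a per-classad counter would give 1.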

#2 Updated by Burt Holzman over 5 years ago

  • Priority changed from High to Normal

#3 Updated by Parag Mhashilkar over 5 years ago

  • Target version changed from v3_2_6 to v3_2_7

#4 Updated by Parag Mhashilkar over 5 years ago

  • Status changed from New to Feedback
  • Assignee changed from Parag Mhashilkar to Marco Mambelli

Fixed in v3/6529. Please review.

#5 Updated by Marco Mambelli over 5 years ago

  • Assignee changed from Marco Mambelli to Parag Mhashilkar

Change looks OK.
The only observation is that "advertizeGFCCounter" could be given a clearer name, such as "advertizeGFClientCounter", to reduce confusion with "advertizeGFCounter", but the name was already there before this change. On the other hand, there are only 5 occurrences in case you want to change it. Up to you.

#6 Updated by Parag Mhashilkar over 5 years ago

  • Status changed from Feedback to Resolved

Merged to branch_v3_2 and master

#7 Updated by Parag Mhashilkar over 5 years ago

  • Status changed from Resolved to Closed
