Bug #17119

Factory job stats are empty

Added by Marco Mambelli over 2 years ago. Updated about 2 years ago.

Status: Closed
Priority: Immediate
Category: -
Target version:
Start date: 07/07/2017
Due date:
% Done: 0%
Estimated time:
First Occurred:
Occurs In:
Stakeholders: Factory Ops
Duration:
Description

The Factory is no longer collecting and advertising job stats.

The following error (or something similar) is visible in the entry log:

[2017-07-06 13:40:57,743] ERROR: glideFactoryEntryGroup:532: Error writing stats for entry 'ITB_FC_CE2':
Traceback (most recent call last):
  File "/usr/sbin/glideFactoryEntryGroup.py", line 528, in iterate
    entry.writeStats()
  File "/usr/lib/python2.6/site-packages/glideinwms/factory/glideFactoryEntry.py", line 684, in writeStats
    self.gflFactoryConfig.log_stats.write_file(monitoringConfig=self.monitoringConfig)
  File "/usr/lib/python2.6/site-packages/glideinwms/factory/glideFactoryMonitoring.py", line 981, in write_file
    self.get_xml_total(indent_tab=xmlFormat.DEFAULT_TAB, leading_tab=xmlFormat.DEFAULT_TAB) + "\n" +
  File "/usr/lib/python2.6/site-packages/glideinwms/factory/glideFactoryMonitoring.py", line 863, in get_xml_data
    data = self.get_data_summary()
  File "/usr/lib/python2.6/site-packages/glideinwms/factory/glideFactoryMonitoring.py", line 856, in get_data_summary
    completed_stats = self.get_completed_stats(entered_list)
  File "/usr/lib/python2.6/site-packages/glideinwms/factory/glideFactoryMonitoring.py", line 709, in get_completed_stats
    enle_jobs_duration = enle_condor_stats['Total']['secs']
KeyError: 'Total'
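
The last frame of the traceback can be reproduced in isolation. This is only a minimal sketch, assuming the stats dictionary ends up without a 'Total' section; `total_secs` is a hypothetical helper and not part of glideFactoryMonitoring.py.

```python
# Hypothetical data: assume the parsed stats contain no 'Total' section.
enle_condor_stats = {}


def total_secs(condor_stats):
    """Return the 'Total' duration in seconds, or None if the section is missing."""
    total = condor_stats.get('Total')
    if total is None:
        return None
    return total.get('secs')


# The direct lookup used in the traceback raises KeyError on such data.
try:
    enle_condor_stats['Total']['secs']
    direct_lookup_failed = False
except KeyError:
    direct_lookup_failed = True
```

A guarded lookup like this would only hide the symptom; the stats would still be missing until the marker itself is fixed.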

0001-fixed-stats-start-marker.patch (847 Bytes) Marco Mambelli, 07/07/2017 04:39 AM

History

#1 Updated by Marco Mambelli over 2 years ago

  • Status changed from Assigned to Feedback
  • Assignee changed from Marco Mambelli to Marco Mascheroni

The marker for the start of the Stats section was corrupted by a previous commit.
Fixed in branch v3/17119.
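
To illustrate why a corrupted marker leaves the stats empty, here is a small sketch. The pattern is an assumption made for illustration; the actual expression in factoryLogParser.py is not shown in this ticket. The two sample lines follow the marker forms used in the sed command in comment #3.

```python
import re

# Assumed pattern, for illustration only: single spaces around the marker text.
STATS_START = re.compile(r"^=== Stats of (\S+) ===$")

good_line = "=== Stats of main ==="      # marker as the parser expects it
bad_line = "===   Stats of main   ==="   # marker with the extra spaces


def section_name(line):
    """Return the section name if the line is a Stats start marker, else None."""
    m = STATS_START.match(line)
    return m.group(1) if m else None
```

With a strict pattern like this, the line with extra spaces is not recognized, so nothing after it would be parsed as stats.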

#2 Updated by Marco Mambelli over 2 years ago

Here is a patch.
In RPM installs, the file to patch is /var/lib/gwms-factory/web-base/condor_startup.sh
Run an upgrade after patching.

#3 Updated by Marco Mambelli over 2 years ago

To correct the log files with the wrong marker you can run:

cd /var/log/gwms-factory/client/
# Rewrite the corrupted marker in place; sed keeps a .bck copy of each file
find . -name 'job.*.out' -exec sed -i.bck 's|===   Stats of \(..*\)   ===|=== Stats of \1 ===|' {} +
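
Before or after running the command above, it may help to list which log files still contain the wrong marker. This is a sketch in Python 3, not a GlideinWMS tool; `files_with_bad_marker` is a hypothetical helper, and the regular expression mirrors the left-hand side of the sed expression.

```python
import fnmatch
import os
import re

# Same three-space marker that the sed expression rewrites.
BAD_MARKER = re.compile(r"===   Stats of (.+)   ===")


def files_with_bad_marker(root):
    """Return sorted paths of job.*.out files under root containing the bad marker."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not fnmatch.fnmatchcase(name, "job.*.out"):
                continue
            path = os.path.join(dirpath, name)
            # Tolerate stray non-UTF-8 bytes in job output.
            with open(path, "r", errors="replace") as fh:
                if any(BAD_MARKER.search(line) for line in fh):
                    hits.append(path)
    return sorted(hits)
```

For example, `files_with_bad_marker("/var/log/gwms-factory/client/")` should return an empty list once all files have been corrected.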

#4 Updated by Dennis Box over 2 years ago

  • Assignee changed from Marco Mascheroni to Dennis Box

#5 Updated by Dennis Box over 2 years ago

Tested the change; it fixes the problem. I had to go back to 3.2.18 several times to see why it stopped working. That led me to check all the other parse expressions in factoryLogParser.py against the quoting in condor_startup.sh. The line in the patch was the only one broken by double quoting, which is normally good practice.

OK to merge to v3_2_20
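
The effect of double quoting on the marker can be sketched as follows. This is an assumption about the mechanism, consistent with the sed command in comment #3; the actual line in condor_startup.sh is not reproduced here. The hypothetical `echo` helper uses `shlex.split` to mimic shell word splitting.

```python
import shlex


def echo(command_args):
    """Mimic `echo`: shell-split the argument string, then join words with one space."""
    return " ".join(shlex.split(command_args))


# Unquoted: the shell splits on runs of whitespace, so extra spaces collapse.
unquoted = echo('===   Stats of main   ===')

# Double-quoted: the whole string is one word, so the extra spaces survive.
quoted = echo('"===   Stats of main   ==="')
```

Under this assumption, adding quotes changes the emitted marker from single-space to three-space form, which a strict parse expression no longer matches.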

#6 Updated by Dennis Box over 2 years ago

  • Status changed from Feedback to Resolved
  • Assignee changed from Dennis Box to Marco Mambelli

#7 Updated by Marco Mambelli over 2 years ago

  • Stakeholders updated (diff)

#8 Updated by Marco Mambelli over 2 years ago

  • Status changed from Resolved to Closed

#9 Updated by Marco Mambelli about 2 years ago

  • Stakeholders updated (diff)

