Feature #2667

Upgrading glideinwms with changes to some of the rrds

Added by Parag Mhashilkar almost 9 years ago. Updated over 8 years ago.

Assignee: Douglas Strain

We need a good way to upgrade the frontend and factory without creating a new instance when there are changes to the rrds. This is important for rpm-based installations, which have no default means of creating multiple instances and which reuse the same monitoring/staging area across upgrades.

Currently there are two proposals:

1) Provide a standalone tool as needed during a release to update the rrds
Pros: Simple enough
Cons: If the admin forgets to run it, monitoring will be broken and no information will be gathered until this is fixed.

2) Provide a tool that checks the rrd schema and updates it as necessary. Either integrate it with the upgrade/config process or run it every time the factory/frontend starts.
Pros: Since validating and updating the schema is automated, the necessary checks will run by default and errors will be caught beforehand. The tool will be part of the release/testing process, avoiding the risk of releasing a broken glideinwms version.

Cons: More complex than 1)


#1 Updated by Igor Sfiligoi almost 9 years ago

The 3rd option is to not make schema changes, of course;
i.e. never change an existing file, just create a new one when you need more attributes.

The drawback of this is that it increases the disk IO on updates, and possibly makes the code a bit more complex (both for writing and for reading).

My 2c,

#2 Updated by Burt Holzman almost 9 years ago

  • Assignee set to Douglas Strain

#3 Updated by Douglas Strain almost 9 years ago

Status update on this ticket:

I have written (standalone) python functions to:
1) read the existing rrds and compare them to the existing schema
2) in case of missing attributes, [rrdtool dump, then] add the missing attributes to the xml [then rrdtool restore]

I still need to go back and integrate these with the existing code base and clean them up a bit.
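
For illustration only, the comparison step could be sketched roughly as below. This is not the code from the branch: the function names, the sample dump and the attribute names AttrA/AttrB are made up for the example (ReqNew is simply borrowed as a sample name), and the sketch assumes the usual `rrdtool dump` XML layout, where each top-level <ds> element carries a <name> padded with spaces.

```python
import subprocess
import xml.etree.ElementTree as ET


def dump_rrd_xml(rrd_path):
    """Return the XML text printed by `rrdtool dump <file>` (requires rrdtool)."""
    return subprocess.check_output(["rrdtool", "dump", rrd_path]).decode()


def ds_names(xml_text):
    """List the attribute (data source) names declared in a dump."""
    root = ET.fromstring(xml_text)
    # Only direct <ds> children of the root carry a <name>; names are space padded.
    return [ds.findtext("name").strip() for ds in root.findall("ds")]


def missing_attributes(xml_text, expected):
    """Return the expected names that are absent from the dump, in schema order."""
    present = set(ds_names(xml_text))
    return [name for name in expected if name not in present]


# Hypothetical, heavily trimmed dump used only to exercise the functions.
SAMPLE_DUMP = """<rrd>
  <version>0003</version>
  <step>300</step>
  <ds><name> AttrA </name><type> GAUGE </type></ds>
  <ds><name> AttrB </name><type> GAUGE </type></ds>
  <rra><cf>AVERAGE</cf><database><row><v>NaN</v><v>NaN</v></row></database></rra>
</rrd>"""

print(missing_attributes(SAMPLE_DUMP, ["AttrA", "AttrB", "ReqNew"]))
```

On a real file, the text from dump_rrd_xml() would be passed in place of SAMPLE_DUMP.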

I think I am going to go with option 2:
- During reconfig (or maybe startup), check the rrds against the known schema
- If they do not match, error out with a message
- The user can either delete them or run a tool to update them.

I have not committed to svn yet, so stay tuned for more updates next week.

#4 Updated by Douglas Strain almost 9 years ago

This is done in branch_v2plus_rrd

Both factory and frontend now check monitor rrd files vs known rrd schema
- If errors are found, they are printed out with an ERROR message:
ERROR: total/state_Unmatched/Status_Attributes.rrd missing attribute ReqNew
Run with fix_rrd option to update errors
WARNING: back up your existing rrds before auto-fixing rrds
- For missing attributes, you can reconfig with fix_rrd to auto-add attrs
This adds attributes on to the end of the rrd row (last index)
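
As a rough sketch of what "adding to the end of the row" means on the XML dump (not the committed code): the new attribute gets a definition after the last existing one, one more slot in each cdp_prep block, and one more value at the end of every row. The function name and sample dump are hypothetical. Cloning the last existing definition is a shortcut chosen for the example so that no field values have to be invented; a real tool would likely reset the cloned state fields, and whether `rrdtool restore` accepts the result as-is is an assumption.

```python
import copy
import xml.etree.ElementTree as ET


def append_attribute(xml_text, new_name):
    """Return dump XML with one extra attribute added as the last column."""
    root = ET.fromstring(xml_text)
    children = list(root)
    definitions = [child for child in children if child.tag == "ds"]

    # Clone the last definition so type/heartbeat match an existing attribute.
    new_ds = copy.deepcopy(definitions[-1])
    new_ds.find("name").text = " %s " % new_name
    root.insert(children.index(definitions[-1]) + 1, new_ds)

    for rra in root.findall("rra"):
        prep = rra.find("cdp_prep")
        if prep is not None and len(prep):
            # One consolidation slot per attribute; copy the last one as-is.
            prep.append(copy.deepcopy(prep[-1]))
        for row in rra.iter("row"):
            # The new attribute has no history, so every old row gets NaN.
            ET.SubElement(row, "v").text = "NaN"
    return ET.tostring(root, encoding="unicode")


# Hypothetical, trimmed dump with two attributes and two stored rows.
SAMPLE = """<rrd>
  <version>0003</version>
  <step>300</step>
  <ds><name> AttrA </name><type> GAUGE </type></ds>
  <ds><name> AttrB </name><type> GAUGE </type></ds>
  <rra>
    <cf>AVERAGE</cf>
    <cdp_prep><ds><value>NaN</value></ds><ds><value>NaN</value></ds></cdp_prep>
    <database>
      <row><v>1.0</v><v>2.0</v></row>
      <row><v>3.0</v><v>4.0</v></row>
    </database>
  </rra>
</rrd>"""

print(append_attribute(SAMPLE, "ReqNew"))
```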

Commit numbers are:
commit:5d437ab Feature 2667: Reconfig factory/frontend checks rrd schema, add -fix_rrd option
commit:ba3aad2 Feature 2667: Reconfig of factory now checks rrd schema
commit:a943698 Feature 2667: Initial commit of verify_rrd function and global variables

I still need to port this change over to master branch, but I think I am done in v2plus.

#5 Updated by Douglas Strain almost 9 years ago

  • Status changed from New to Resolved

I have also transferred the changes to the master branch in branch_master_rrd. These changes are now ready for review.

#6 Updated by Burt Holzman almost 9 years ago

  • Status changed from Resolved to Feedback

#7 Updated by Parag Mhashilkar almost 9 years ago

  • Target version set to v2_7_x

#8 Updated by Parag Mhashilkar almost 9 years ago

I reviewed the code. It looks good, but I have a couple of comments:

  • When we ask the admin to back up the rrds, we should either give them the list of files to back up or create a backup automatically and tell the admin where the files were backed up. This will simplify the admin's life and reduce human error.
  • When you do the rrd restore after adding new data stores, is the process atomic? What if someone presses ctrl-c by accident? To mitigate this, maybe we can restore to a temporary file and then move it over the actual file. The move is atomic, and with this extra step the admin doesn't have to restore from backup if the restore crashed or was cancelled by mistake.

#9 Updated by Douglas Strain almost 9 years ago

  • Status changed from Feedback to Resolved

I have addressed Parag's comments. Also, I have taken care of the case where the rrd is missing. I also added checks for a few rrd files that I missed the first time around.

These changes have been merged into branch_v2plus and master

#10 Updated by Parag Mhashilkar almost 9 years ago

  • Target version changed from v2_7_x to v2_6

#11 Updated by Parag Mhashilkar over 8 years ago

  • Status changed from Resolved to Closed
