Milestone #9351

Resolve Database issues

Added by Nicholas Peregonow about 4 years ago. Updated about 4 years ago.

Status:
New
Priority:
Normal
Start date:
07/02/2015
Due date:
07/09/2015
% Done:

0%

Estimated time:
Duration: 8

Description

We currently have a database issue with replication from gratiadb02 -> gratiadb03, and queries are taking 10-12 hours to complete.

Once these are resolved, we will need to forward data again between gratia systems, and then proceed with testing.

I don't have an ETA for when these issues will be fixed, so I am setting the timeframe to 7/23; this will be adjusted as work is done.


Related issues

Precedes (5 days) Gratia - Task #8981: Point gratiaweb-itb to the new database | Assigned | 07/15/2015 | 07/16/2015

Precedes (10 days) Gratia - Bug #8978: Gratia Backlog Information Extremly behind | Accepted | 07/20/2015 | 08/01/2015

History

#1 Updated by Nicholas Peregonow about 4 years ago

  • Precedes Task #8981: Point gratiaweb-itb to the new database added

#2 Updated by Nicholas Peregonow about 4 years ago

From Svetlana

David is still working on recovering the slave. The issue: we restored a fresh backup from the master onto gratiadb03, but the slave does not see some tables even though the physical files are on disk. The command mysql> desc tablename
does show the table description, but a select from the table fails with "table ... does not exist".

David is currently looking for an explanation of and a solution to this situation; we definitely need to understand this behavior in MySQL v5.6.

#3 Updated by Nicholas Peregonow about 4 years ago

  • Related to Bug #8978: Gratia Backlog Information Extremly behind added

#4 Updated by Nicholas Peregonow about 4 years ago

  • Related to deleted (Bug #8978: Gratia Backlog Information Extremly behind)

#5 Updated by Nicholas Peregonow about 4 years ago

  • Precedes Bug #8978: Gratia Backlog Information Extremly behind added

#6 Updated by Gerard Bernabeu Altayo about 4 years ago

  • Due date changed from 07/23/2015 to 07/09/2015

#7 Updated by Nicholas Peregonow about 4 years ago

Databases have been properly synced between gratiadb02/db03

From svetlana

Hi Nick,

With RDX's help we finally recovered the slave on gratiadb03; it is now in sync and ready for users.

We need to discuss clustering all 3 nodes (gratiadb01, 02, 03), which would provide H/A for MySQL in an extreme case like the one we had this time.

On gratiadb01 (dev) we have already installed the Galera software, which involved re-installing the MySQL libs. There is no impact on the existing MySQL server yet; this only prepares the Galera software.
The major requirements for the MySQL databases would be that every table has a PK and, since Galera replication currently supports InnoDB tables only, that the few remaining MyISAM tables are dealt with.

Any questions, let us know.

Thanks a lot to Keith and David for hard work recovering the databases on gratiadb03.

Svetlana.
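
The two Galera requirements mentioned above (a PK on every table, InnoDB only) could be checked roughly as follows. This is only a sketch: the SQL string is one possible information_schema query and is not executed here, and the function name and sample rows are hypothetical, not taken from the real gratia schema.

```python
# One possible query to collect engine and primary-key information per table.
# It is shown for reference only; this sketch does not connect to MySQL.
GALERA_CANDIDATES_SQL = """
SELECT t.table_schema, t.table_name, t.engine,
       c.constraint_name IS NOT NULL AS has_pk
FROM information_schema.tables AS t
LEFT JOIN information_schema.table_constraints AS c
       ON c.table_schema = t.table_schema
      AND c.table_name = t.table_name
      AND c.constraint_type = 'PRIMARY KEY'
WHERE t.table_type = 'BASE TABLE'
  AND t.table_schema NOT IN ('mysql', 'information_schema', 'performance_schema')
"""


def galera_blockers(rows):
    """Return {"schema.table": [reasons]} for tables failing either requirement.

    rows: iterable of (schema, table, engine, has_pk) tuples, for example the
    result set of GALERA_CANDIDATES_SQL.
    """
    blockers = {}
    for schema, table, engine, has_pk in rows:
        reasons = []
        if (engine or "").lower() != "innodb":
            reasons.append(f"engine is {engine}, not InnoDB")
        if not has_pk:
            reasons.append("no primary key")
        if reasons:
            blockers[f"{schema}.{table}"] = reasons
    return blockers


# Hypothetical sample rows for illustration only.
sample = [
    ("gratia", "TableA", "InnoDB", 1),
    ("gratia", "TableB", "MyISAM", 1),
    ("gratia", "TableC", "InnoDB", 0),
]
report = galera_blockers(sample)
for name, reasons in sorted(report.items()):
    print(name, "->", "; ".join(reasons))
```

Tables that show up in the report would need an engine conversion or an added key before clustering is attempted.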

#8 Updated by Nicholas Peregonow about 4 years ago

Gratiadb03 appears to be far behind gratiadb02. When I run the queries below and compare the results, they show we are 700975 seconds (8 days) and 8,024,518 records behind. Filed incident INC000000569640.

What I ran on gratiadb02

mysql> select * from JobUsageRecord_Meta where dbid = (select max(dbid) from JobUsageRecord_Meta) limit 1;
+------------+-----------------------------+---------------------+-----------------------+-----------------+----------------------+-----------------------+----------------------+------+---------------------+------+------------------+-----------------------------+---------+-----------------+----------------------------------+
| dbid       | recordId                    | CreateTime          | CreateTimeDescription | RecordKeyInfoId | RecordKeyInfoContent | ProbeName             | ProbeNameDescription | Grid | ServerDate          | md5  | ReportedSiteName | ReportedSiteNameDescription | probeid | GridDescription | md5v2                            |
+------------+-----------------------------+---------------------+-----------------------+-----------------+----------------------+-----------------------+----------------------+------+---------------------+------+------------------+-----------------------------+---------+-----------------+----------------------------------+
| 1708834146 | atlas-net2.bu.edu:12081.385 | 2015-06-25 07:43:04 | NULL                  | NULL            | NULL                 | sge:atlas-net2.bu.edu | NULL                 | OSG  | 2015-07-10 18:21:37 | NULL | NET2             | NULL                        |    1756 | NULL            | 90FF4CCB05FD9F84BCE85BB3FF64D851 |
+------------+-----------------------------+---------------------+-----------------------+-----------------+----------------------+-----------------------+----------------------+------+---------------------+------+------------------+-----------------------------+---------+-----------------+----------------------------------+
1 row in set (0.00 sec)

mysql>

And when I check on gratiadb03 I get

mysql> select * from JobUsageRecord_Meta where dbid = (select max(dbid) from JobUsageRecord_Meta) limit 1;
+------------+----------------------------------+---------------------+-----------------------+-----------------+----------------------+-------------------------------+----------------------+------+---------------------+------+------------------+-----------------------------+---------+-----------------+----------------------------------+
| dbid       | recordId                         | CreateTime          | CreateTimeDescription | RecordKeyInfoId | RecordKeyInfoContent | ProbeName                     | ProbeNameDescription | Grid | ServerDate          | md5  | ReportedSiteName | ReportedSiteNameDescription | probeid | GridDescription | md5v2                            |
+------------+----------------------------------+---------------------+-----------------------+-----------------+----------------------+-------------------------------+----------------------+------+---------------------+------+------------------+-----------------------------+---------+-----------------+----------------------------------+
| 1700813307 | submit-3.chtc.wisc.edu:9247.2875 | 2015-06-19 17:00:42 | NULL                  | NULL            | NULL                 | condor:submit-3.chtc.wisc.edu | NULL                 | OSG  | 2015-07-07 20:29:35 | NULL | CHTC             | NULL                        |    2512 | NULL            | 640AC1871830D2FAE52ADDCAD1B70BE6 |
+------------+----------------------------------+---------------------+-----------------------+-----------------+----------------------+-------------------------------+----------------------+------+---------------------+------+------------------+-----------------------------+---------+-----------------+----------------------------------+
1 row in set (0.00 sec)

mysql>
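
For reference, the gap between the two result rows above can be computed along these lines. This is a minimal sketch that only uses the dbid and ServerDate values pasted in this comment; the function name is made up for illustration. The figures quoted at the top of this comment may have been taken at a different moment or from a different source (for example the replication status), so the output here need not match them, and a dbid difference equals a record count only if the ids are dense.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Values copied from the two result rows in this comment.
master = {"dbid": 1708834146, "ServerDate": "2015-07-10 18:21:37"}   # gratiadb02
replica = {"dbid": 1700813307, "ServerDate": "2015-07-07 20:29:35"}  # gratiadb03


def gap(master_row, replica_row):
    """Return (dbid difference, ServerDate difference in whole seconds)."""
    dbid_gap = master_row["dbid"] - replica_row["dbid"]
    delta = (datetime.strptime(master_row["ServerDate"], FMT)
             - datetime.strptime(replica_row["ServerDate"], FMT))
    return dbid_gap, int(delta.total_seconds())


dbid_gap, seconds_gap = gap(master, replica)
print(f"dbid gap: {dbid_gap}, ServerDate gap: {seconds_gap} s "
      f"({seconds_gap / 86400:.1f} days)")
```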

#9 Updated by Nicholas Peregonow about 4 years ago

Currently from the ticket

Ok, we identified about 40 tables in the gratia_itb database whose metadata is corrupted.
We discarded the tablespace for each of these tables, made a fresh backup of those tables only, and restored it on gratiadb03.
The lag is still large, but replication is moving forward and should catch up with the master very soon.

Thanks,
Svetlana.
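
The ticket does not say exactly how the tables were discarded and restored. As one possible illustration, assuming the InnoDB transportable-tablespace route, the per-table statements could be generated like this. The table names and helper functions are hypothetical; the statements are only printed, not run, and the table data files would have to be copied into place between the DISCARD and the IMPORT.

```python
def quote_ident(name):
    """Backtick-quote a MySQL identifier, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def tablespace_statements(schema, tables):
    """Yield (table, discard_sql, import_sql) for each table name."""
    for table in tables:
        target = f"{quote_ident(schema)}.{quote_ident(table)}"
        yield (
            table,
            f"ALTER TABLE {target} DISCARD TABLESPACE;",
            f"ALTER TABLE {target} IMPORT TABLESPACE;",
        )


# Hypothetical table names; the real list of about 40 tables is not in this ticket.
plan = list(tablespace_statements("gratia_itb", ["ExampleTable1", "ExampleTable2"]))
for table, discard_sql, import_sql in plan:
    print(discard_sql)
    print(f"-- copy the data file for {table} into place here")
    print(import_sql)
```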


