Bug #11340

MemoryTracker and TimeTracker file sqlite db crash

Added by Herbert Greenlee about 4 years ago. Updated over 3 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: Application
Target version:
Start date: 01/04/2016
Due date:
% Done: 100%
Estimated time: 2.00 h
Spent time:
Occurs In:
Scope: Internal
Experiment: MicroBooNE
SSI Package: art
Duration:

Description

I get the following error when running MemoryTracker or TimeTracker service out of bluearc nfs-mounted working areas on uboonegpvmXX.

%MSG
%MSG-s ArtException: lar 04-Jan-2016 15:38:38 CST JobSetup
cet::exception caught in art
---- OtherArt BEGIN
ServiceCreation
---- SQLExecutionError BEGIN
database is locked
---- SQLExecutionError END
cet::exception caught during construction of service type art::MemoryTracker:
---- OtherArt END
%MSG

This problem is related to the sqlite disk file created by Memory/TimeTracker. I can work around the crash by creating a symbolic link to a local disk file.

$ ln -sf /tmp/mem.db mem.db
$ ln -sf /tmp/time.db time.db

It would be nice if Memory/TimeTracker could be taught not to crash. Alternatively, if there is a way to configure our gpvms or bluearc nfs mounts so that they support whatever unsupported feature sqlite is using (e.g. record locking, or whatever), then please instruct us so that we can request it (i.e. tell us the magic words to put in a snow ticket).

Associated revisions

Revision 6aaaaaa5 (diff)
Added by Kyle Knoepfel about 4 years ago

Further tweaks to issue #11340 fix: forbid user-specified URI

Revision 1ede0336 (diff)
Added by Kyle Knoepfel about 4 years ago

Further tweaks to issue #11340 fix: forbid user-specified URI

Revision 140dd177 (diff)
Added by Kyle Knoepfel about 4 years ago

Further tweaks to issue #11340 fix: forbid user-specified URI

History

#1 Updated by Kyle Knoepfel about 4 years ago

  • Status changed from New to Assigned
  • Assignee set to Kyle Knoepfel

#2 Updated by Kyle Knoepfel about 4 years ago

  • Estimated time set to 2.00 h

I can reproduce the problem, and its source is understood.

Explanation

By default, SQLite uses locking mechanisms to ensure that multiple processes/threads can read or write to the same database file in a safe manner. However, nfs is notorious for having deficient implementations of the locking mechanisms that SQLite (and other applications) may depend upon. SQLite does not attempt to solve nfs's locking deficiencies, and in fact, the documentation states upfront that using SQLite on an nfs-mounted system can be problematic.

Yes, a "database is locked" error is emitted during art execution. The incompatibility between nfs and SQLite, however, can be gleaned simply by attempting to read a database file at the command-line on (e.g.) /uboone/app/ (nfs version 3):

# on nfs (finds no tables)
$ sqlite3 memoryTracker.db
SQLite version 3.8.10.2 2015-05-20 18:17:19
Enter ".help" for usage hints.
sqlite> .tables
sqlite>

Whereas for a local filesystem:

# local file system
$ sqlite3 memoryTracker.db 
SQLite version 3.8.10.2 2015-05-20 18:17:19
Enter ".help" for usage hints.
sqlite> .tables
EventInfo ModuleInfo Summary

It is conceivable that nfs version 4 has fixed some of these problems, but confirming that would need more investigation, and the same behavior may well be observed.

Bottom line: the nfs locking mechanisms are deficient, which makes using SQLite on an nfs system fragile whenever locking is enabled.
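For readers who have not seen the error outside of art, the sketch below provokes the same "database is locked" message on an ordinary local disk. It does not reproduce the nfs failure itself. It only shows the locking protocol SQLite relies on, by having one connection hold an exclusive lock while a second one tries to read. The function name `provoke_locked_error` and the single-column `Summary` table are made up for this illustration.

```python
import os
import sqlite3
import tempfile


def provoke_locked_error(db_path):
    """Return the error text a second connection sees while the first holds the lock.

    Hypothetical helper for illustration only; returns None if no error occurs.
    """
    # isolation_level=None selects autocommit mode, so the explicit
    # BEGIN/ROLLBACK statements below are the only transaction control.
    holder = sqlite3.connect(db_path, isolation_level=None)
    holder.execute("CREATE TABLE IF NOT EXISTS Summary (name TEXT)")
    holder.execute("BEGIN EXCLUSIVE")  # take and keep the exclusive file lock

    # timeout=0: fail immediately instead of waiting for the lock to clear.
    other = sqlite3.connect(db_path, timeout=0, isolation_level=None)
    try:
        other.execute("SELECT count(*) FROM Summary").fetchall()
        return None
    except sqlite3.OperationalError as err:
        return str(err)
    finally:
        other.close()
        holder.execute("ROLLBACK")  # release the lock
        holder.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        print(provoke_locked_error(os.path.join(workdir, "mem.db")))
```

On a healthy local filesystem the lock is released as soon as the first connection finishes. The failure reported in this ticket is the case where the filesystem's lock handling misbehaves, so the same error appears even though no other process is using the file.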

The solution

It is possible to tell SQLite to turn the locking mechanisms off so that database files can be read and written on both local disks and nfs-mounted ones. We intend to include the fix in the next art 1.17 release. Until then, however, you may need to resort to Herb's workaround, or refrain from enabling MemoryTracker and TimeTracker on nfs systems.
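As a rough illustration of what turning locking off can look like, the sketch below opens a database through an SQLite URI filename with the documented `nolock=1` query parameter. This ticket does not say which mechanism the art fix uses, so this should not be read as the art implementation. The helper names `connect_without_locking` and `record_summary`, and the two-column layout given to the `Summary` table, are assumptions made for the example.

```python
import os
import sqlite3
import tempfile
from pathlib import Path


def connect_without_locking(db_path):
    """Open db_path with SQLite file locking disabled (hypothetical helper)."""
    # Path.as_uri() needs an absolute path and percent-encodes special characters.
    uri = Path(db_path).resolve().as_uri() + "?nolock=1"
    # uri=True makes the sqlite3 module treat the name as a URI filename.
    return sqlite3.connect(uri, uri=True)


def record_summary(db_path, rows):
    """Write (name, value) rows to a Summary table and read them back."""
    conn = connect_without_locking(db_path)
    try:
        # Column layout is an assumption for this sketch, not art's schema.
        conn.execute("CREATE TABLE IF NOT EXISTS Summary (name TEXT, value REAL)")
        conn.executemany("INSERT INTO Summary VALUES (?, ?)", rows)
        conn.commit()
        return conn.execute(
            "SELECT name, value FROM Summary ORDER BY name"
        ).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        print(record_summary(os.path.join(workdir, "time.db"), [("total", 12.5)]))
```

With locking disabled, nothing stops two processes from writing the same file at once, so this approach is only reasonable when a single process owns the database file, as is the case for a per-job monitoring file.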

#3 Updated by Kyle Knoepfel about 4 years ago

  • Status changed from Assigned to Resolved
  • % Done changed from 0 to 100

Implemented with commit art:39e8db03. Will sync with the 1.17, 1.18 and develop branches.

#4 Updated by Kyle Knoepfel about 4 years ago

  • Category set to Application
  • Target version set to 1.17.06
  • SSI Package art added
  • SSI Package deleted ()

#5 Updated by Kyle Knoepfel over 3 years ago

  • Status changed from Resolved to Closed

