Feature #18541

Log actual dCache door used in transferred event

Added by Kevin Retzke over 1 year ago. Updated 9 months ago.

Status: Closed
Priority: Normal
Assignee:
Target version:
Start date: 12/13/2017
Due date:
% Done: 100%
Estimated time:
Duration:

Description

Digging through the ifdh logs in Elasticsearch we're finding two particular pain points:

1. Unable to determine which dCache door was actually used
2. Difficulty correlating "starting_transfer" and "transferred" events.

I think one solution for both would be to add the source and destination to the "transferred" event, showing what was actually transferred.
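To illustrate the request, here is a minimal sketch of what a "transferred" record carrying source and destination could look like, and how the door could then be read back out of it. The JSON layout, the field names, and the `door01.example.org` hostname are all assumptions for illustration; this is not the actual ifdh log format.

```python
import json
from urllib.parse import urlsplit


def transferred_event(source, destination):
    """Build a hypothetical 'transferred' record (field names are assumed)."""
    record = {"event": "transferred", "source": source, "destination": destination}
    return json.dumps(record, sort_keys=True)


def door_of(line):
    """Return the host of whichever end of the transfer is a remote URL."""
    record = json.loads(line)
    for field in ("destination", "source"):
        host = urlsplit(record[field]).hostname
        if host is not None:
            return host
    return None  # both ends were local paths


# Hypothetical upload from a local file to a specific door.
line = transferred_event(
    "/tmp/f.root", "gsiftp://door01.example.org:2811/pnfs/f.root"
)
print(door_of(line))
```

With the resolved URL in the record itself, the door becomes a field that can be searched on directly rather than something inferred from neighbouring messages.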

History

#1 Updated by Marc Mengel about 1 year ago

  • Status changed from New to Resolved
  • Target version set to v2_3_0
  • % Done changed from 0 to 90

Making a stab at this for v2_3_0. Unfortunately, the layer that does the door rotation doesn't know which command it's in; it just string-replaces in the command being retried. So I'm logging the command being run in every case, which will log the underlying "rm", "cp", "ls", etc. calls with the full replaced door names.
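A rough sketch of that approach, for readers following along: a retry layer that only sees a command string, substitutes a door name into it, and logs whatever it is about to run. The function name, the `DOOR` placeholder, and the hostnames are hypothetical; the real ifdh code is not shown in this ticket and may differ in every detail.

```python
import logging

log = logging.getLogger("transfer_sketch")


def run_with_door_rotation(command, placeholder, doors, run):
    """Try `command` against each door in turn.

    `run` is any callable that takes the resolved command string and
    returns True on success. The layer knows nothing about what the
    command does; it only replaces text and logs the result.
    """
    for door in doors:
        resolved = command.replace(placeholder, door)
        log.info("running: %s", resolved)  # logs the full replaced door name
        if run(resolved):
            return resolved
    return None  # every door failed


# Usage with a fake runner that only succeeds on the second door.
attempts = []


def fake_run(cmd):
    attempts.append(cmd)
    return "door02" in cmd


result = run_with_door_rotation(
    "cp /tmp/f gsiftp://DOOR/pnfs/f",
    "DOOR",
    ["door01.example.org", "door02.example.org"],
    fake_run,
)
```

Because the logging sits in the substitution layer, every retried command is recorded, whether it is a copy, a listing, or a removal.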

#2 Updated by Kevin Retzke about 1 year ago

Thanks Marc. So this will add another log message with the actual command?

I'm a bit concerned about how we can correlate all the log messages for a given transfer, since it's not unusual for several messages for a job to have the same timestamp, which can mix up the message order. Maybe that's on us to ensure strict ordering or combine the messages somehow...
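One possible way to keep strict ordering, sketched under the assumption that the sender can attach a per-process counter to each message: sort on the timestamp first and the counter second, so messages with identical timestamps keep their emission order. The record layout and field names here are hypothetical.

```python
import itertools

_seq = itertools.count()  # per-process counter, assumed to be added by the sender


def stamp(message, timestamp, job_id):
    """Attach a sequence number so equal timestamps can still be ordered."""
    return {"job": job_id, "ts": timestamp, "seq": next(_seq), "msg": message}


def in_order(records):
    """Sort by job, then timestamp, then sequence number as the tie-breaker."""
    return sorted(records, key=lambda r: (r["job"], r["ts"], r["seq"]))


# Two messages from one job with the same timestamp, received out of order.
records = [
    stamp("starting_transfer", 100, "j1"),
    stamp("transferred", 100, "j1"),
]
shuffled = list(reversed(records))
ordered = [r["msg"] for r in in_order(shuffled)]
```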

I see that we do need to revisit the log parsing, since the "starting transfer" and "finished transfer" messages aren't getting parsed into usable fields (I guess they're relatively new?). Also, the "transferred" message, which covers a single ifdh command that may include multiple transfers, may not be what we really want to look at.

#3 Updated by Marc Mengel about 1 year ago

  • Target version deleted (v2_3_0)

Well, I'll leave this change in for v2_3_0, because the larsoft folks need to cut this version to build with the clang++ compiler on macOS...

Maybe we should get together and decide what the logging should look like, and we'll add sufficient plumbing in the code to log it properly.

#4 Updated by Marc Mengel about 1 year ago

  • Status changed from Resolved to Under Discussion
  • % Done changed from 90 to 10

#5 Updated by Marc Mengel 9 months ago

  • Target version set to v2_3_4

#6 Updated by Marc Mengel 9 months ago

  • % Done changed from 10 to 100

#7 Updated by Marc Mengel 9 months ago

  • Status changed from Under Discussion to Closed

