TeamMeeting2015-04-29

Paola is documenting the Requirements wiki with Katherine Lato.

Discussed initial release plans: version 1 will be targeted at just the OPOS group;
no experimenter interface screens yet, etc. Focus on:
  • tracking jobs
  • triaging jobs

Divvying up design work for v1

Database: Steve White
GUI: Michael Gheith, Paola
Job Info: Diesburg
Monitoring Info: Mengel

Another discussion of job log retention: the plan is for v1 to
rely on jobsub_client's 2-week retention; longer term we may want to
standardize archiving logs for experiments in their dCache area.

Other Notes

Discussed what will be in the releases. The plan is to start with a smaller scope and grow it later.
-First release will contain basics for OPOS group.
-Second release will contain more complex features.

How do we get information for the triage screens? Logs go in the fifemon logs directory.

Need the jobsub people to hold logs longer. Currently retention is just two weeks.

Briefly touched base on technology stack:
-Apache Server
-CherryPy
-SQLAlchemy
-Template Engine (mako, jinja, ...)

Users aren't supposed to use condor directly. Only jobsub should talk to condor.
We may need to hammer out how to get job information from the jobsub people. Maybe via a web API?

Talked about logs. We shouldn't have our own document DB, as we would be duplicating efforts with the jobsub people.
Probably dump them on tape, and use SAM for long-term log storage?
Mu2e is already managing all their logs.
The OPOS group would only need logs going back 3 weeks.

Want to keep all the log files for a campaign until all of its tasks are finished. People will not look at them until the last task in the campaign is finished.
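
The campaign retention rule above could be sketched roughly as follows. This is only an illustration, not an agreed design: the Task record, the can_purge_campaign_logs function, and the choice to reuse the 3-week OPOS figure as a review window after the last task finishes are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Sequence


@dataclass
class Task:
    """Hypothetical record for one task in a campaign."""
    name: str
    finished_at: Optional[datetime] = None  # None means the task is not finished


def can_purge_campaign_logs(
    tasks: Sequence[Task],
    now: datetime,
    keep_after_finish: timedelta = timedelta(weeks=3),  # assumed review window
) -> bool:
    """Return True only if every task is finished and the review window has passed."""
    if not tasks:
        # Nothing is known about the campaign, so keep the logs to be safe.
        return False
    finish_times = [t.finished_at for t in tasks]
    if any(ft is None for ft in finish_times):
        # At least one task is still outstanding: keep all logs for the campaign.
        return False
    last_finish = max(finish_times)
    return now - last_finish >= keep_after_finish
```

With this rule, a campaign with even one unfinished task never has its logs purged, regardless of how old the earlier tasks' logs are.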