UPDATED: 22FEB2016 (A.Norman)

Reporting Procedures

During NOvA operations and maintenance, problems will arise with the data acquisition (DAQ) and computing systems. When one occurs, use the following reporting procedures to document it and to escalate it through the proper groups so that it is resolved correctly.

The DAQ technical escalation chain (summarized) is:
  • Shifter Reporting
  • DAQ On-call support personnel
  • DAQ group general support (via partner system)
  • DAQ Senior support personnel
  • DAQ Management
  • Pageable professional support
  • Non-pageable professional support

There is also a management chain that is escalated in parallel to the technical escalation:

  • DAQ Management
  • NOvA Operations Management
  • NOvA Collaboration Management
  • Lab Management
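
As an illustration only, the two chains above can be written down as ordered lists, with a small helper that looks up the stage that follows a given one. The stage names are copied from the lists; the names TECHNICAL_CHAIN, MANAGEMENT_CHAIN and next_stage are hypothetical and are not part of any NOvA tool.

```python
from typing import List, Optional

# Stage names copied from the escalation lists above.
# The variable and function names are hypothetical, for illustration only.
TECHNICAL_CHAIN: List[str] = [
    "Shifter Reporting",
    "DAQ On-call support personnel",
    "DAQ group general support (via partner system)",
    "DAQ Senior support personnel",
    "DAQ Management",
    "Pageable professional support",
    "Non-pageable professional support",
]

MANAGEMENT_CHAIN: List[str] = [
    "DAQ Management",
    "NOvA Operations Management",
    "NOvA Collaboration Management",
    "Lab Management",
]


def next_stage(chain: List[str], current: str) -> Optional[str]:
    """Return the stage that follows `current`, or None at the end of the chain."""
    index = chain.index(current)  # raises ValueError for an unknown stage
    if index + 1 < len(chain):
        return chain[index + 1]
    return None


if __name__ == "__main__":
    print(next_stage(TECHNICAL_CHAIN, "Shifter Reporting"))
    print(next_stage(MANAGEMENT_CHAIN, "Lab Management"))
```

Note that "DAQ Management" appears in both lists, which is where the parallel management chain starts.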

This document describes the procedures for advancing between stages.
The following procedures are tailored to the individuals who are interacting with the systems.

NOvA Shifters (Location: NOvA Control Room)

When a NOvA shifter encounters a problem with the DAQ or computing they should:

  • Read the basic troubleshooting page for the DAQ (put link here)

If the problem persists:

  • Open an issue on the DAQ Redmine issue tracking system. Include a description of the problem and any relevant error messages or screenshots. (link to instructions for submitting an issue)
  • Call the run coordinator for guidance. Tell the run coordinator the issue number that you submitted. The run coordinator will assess the problem and direct you to the appropriate "expert on call" for the system that you are having problems with.
    • The run coordinator will assign the issue to the appropriate "expert on call"
  • Call the "expert on call" for the subsystem that the run coordinator identified.
    • Tell the expert the issue number that you submitted.
    • Work with the expert as he/she resolves the issue
    • If the issue can be resolved, the expert will resolve and close it.

If the expert is UNABLE to resolve the problem:

  • The expert will escalate or reassign the issue
    • If the issue is internal to the NOvA subsystem (e.g. it requires a different DAQ expert who was not on call)
      • The expert will reassign the issue to the appropriate party through the Redmine ticket system
      • The expert will contact the appropriate party
    • If the issue is external to NOvA and associated with a supported service (e.g. networking)
      • The expert on call (or run coordinator) will evaluate the situation and escalate the problem
      • If the problem is a critical/pageable event, the expert will call the FNAL service desk at (630) 840-2345
      • If the problem is not a pageable event, the expert will open a Service Now Incident ticket and include all relevant information
      • The expert on call will update the Redmine issue with the Service Now Ticket number and close the Redmine ticket.
      • The expert on call will make a NOvA logbook entry relating to the incident.
  • The expert will call the NOvA control room and alert them that the incident has been escalated.
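
The branching in the steps above can be sketched as a small function that returns the actions in order. This is a reading of the procedure, not an existing tool: the names Problem and escalation_steps are hypothetical, and the sketch assumes that the Redmine update and the logbook entry apply to both the pageable and the non-pageable external case, which the list above does not state explicitly.

```python
from dataclasses import dataclass
from typing import List

# Phone number as given in the procedure above.
SERVICE_DESK_PHONE = "(630) 840-2345"


@dataclass
class Problem:
    """Hypothetical description of a problem the expert could not resolve."""
    external: bool  # True if external to NOvA and tied to a supported service
    pageable: bool  # True if critical/pageable; only used when external is True


def escalation_steps(problem: Problem) -> List[str]:
    """Return the ordered actions for the expert, following the list above."""
    steps: List[str] = []
    if not problem.external:
        # Internal to the NOvA subsystem: hand over to another expert.
        steps.append("Reassign the issue to the appropriate party in Redmine")
        steps.append("Contact the appropriate party")
    else:
        if problem.pageable:
            steps.append(f"Call the FNAL service desk at {SERVICE_DESK_PHONE}")
        else:
            steps.append("Open a Service Now Incident ticket with all relevant information")
        # Assumption: these two steps apply to both external cases.
        steps.append("Update the Redmine issue with the Service Now ticket number and close it")
        steps.append("Make a NOvA logbook entry relating to the incident")
    # In every case the control room is told about the escalation.
    steps.append("Call the NOvA control room and report that the incident has been escalated")
    return steps


if __name__ == "__main__":
    for step in escalation_steps(Problem(external=True, pageable=False)):
        print(step)
```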

NOvA Technical Personnel (Location: Ash River or FNAL)

When a NOvA technical person encounters a problem with the DAQ or computing they should:

  • Read the basic troubleshooting page for Technical systems (put link here)

If the problem persists:

  • Call the NOvA run coordinator for guidance. Describe the problem you are having to the run coordinator.
    • The run coordinator will assess the problem
    • If the problem relates to an internal NOvA system
      • The run coordinator will open a Redmine issue relating to the problem and assign it to the "expert on call" for the system that is affected.
      • The run coordinator will contact the "expert on call" for the system that is having problems.
      • The expert on call will work on the problem
    • If the problem relates to an external system (networking, etc.)
      • The run coordinator will
      • The run coordinator will open a Service Now Ticket.