Service Registry

There are several overlapping requirement sets that can be met with a single service registry web service; to wit:

  1. Letting SAM services find each other
  2. Helping liaisons, admins, and users relate network downtimes to SAM services
  3. SAM At A Glance monitoring functionality

This document will briefly outline each of these needs, and then discuss an interface which would support them.

Letting SAM services find each other

As the various SAM components become web services rather than CORBA services, they need either static configuration data telling them where the other services are, or a way to determine those locations at runtime. Past experience tells us that repeating this location information across multiple configuration files is error-prone. We propose instead a single well-known registry service, which will let SAM components look up the URLs associated with a given experiment, tier (e.g. production or development), service type, and name.

SAM services would then need to register at start-up, so that the registry knows which host each one runs on and which URLs reach it.
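
The register-then-look-up pattern described above can be sketched in a few lines of Python. This is only an in-memory illustration, not the proposed web interface: the class names, method names, and the example experiment, service, and URL values are all hypothetical, and a real registry would sit behind HTTP calls.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ServiceKey:
    """Identifies one service; the fields mirror the lookup keys in the text."""
    experiment: str
    tier: str          # e.g. "production" or "development"
    service_type: str
    name: str


@dataclass
class Registry:
    """Hypothetical in-memory stand-in for the registry web service."""
    entries: dict = field(default_factory=dict)

    def register(self, key, urls):
        # A restarting service simply overwrites its previous entry.
        self.entries[key] = dict(urls)

    def lookup(self, experiment, tier, service_type, name=None):
        # With no name given, return every service of that type.
        return {
            key.name: urls
            for key, urls in self.entries.items()
            if (key.experiment, key.tier, key.service_type)
            == (experiment, tier, service_type)
            and (name is None or key.name == name)
        }


registry = Registry()
registry.register(
    ServiceKey("exp1", "production", "station", "central"),
    {"service": "http://host.example.org:8080/station"},
)
found = registry.lookup("exp1", "production", "station")
```

Keying on all four fields keeps production and development instances of the same service from shadowing each other.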

Helping folks relate network downtimes to SAM services

Assuming a registry of SAM services also includes the subnet of the host that each service resides upon, this registry could provide, given a list of subnets that will be offline, a list of services which will be affected.
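
As a rough sketch of that matching step, the Python standard library's ipaddress module can compare registered subnets against the ones going offline. The function name, the dictionary layout, the service names, and the use of CIDR notation are assumptions made for illustration only.

```python
import ipaddress


def affected_services(service_subnets, offline_subnets):
    """Return the names of services whose subnet overlaps an offline subnet.

    service_subnets maps a service name to the subnet it registered, in
    CIDR notation (an assumed format, e.g. "131.225.110.0/24").
    """
    offline = [ipaddress.ip_network(subnet) for subnet in offline_subnets]
    return sorted(
        name
        for name, subnet in service_subnets.items()
        if any(ipaddress.ip_network(subnet).overlaps(net) for net in offline)
    )


# Hypothetical registrations; only the first subnet comes from this document.
registered = {
    "station-a": "131.225.110.0/24",
    "db-server": "192.0.2.0/24",
}
hit = affected_services(registered, ["131.225.110.0/24"])
```

Testing for overlap rather than equality means an outage announced for a larger block would still catch services registered on a smaller subnet inside it.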

SAM At A Glance

Assuming each service also registers a "ping" URL, which can be used to check whether the service is alive, and a "status" URL, which gives detailed current status, a separate process could periodically poll the hosts and the "ping" URLs and update the current status of each service. A web page report very similar to our current SAM At A Glance page could then be provided on demand from the web service.
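
One polling pass over the "ping" URLs might look like the following sketch. It assumes a "ping" URL counts as alive when a plain HTTP GET returns a 2xx status, and the "up" and "down" labels and both function names are illustrative rather than part of the proposal.

```python
import http.client
import urllib.error
import urllib.request


def http_ok(url, timeout=5.0):
    """Return True if a GET on url answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection failures, HTTP errors, and malformed URLs all count as down.
        return False


def poll_once(ping_urls, check=http_ok):
    """Map each service name to "up" or "down" based on its "ping" URL."""
    return {
        name: "up" if check(url) else "down"
        for name, url in ping_urls.items()
    }
```

Passing the check function as a parameter lets the polling loop be exercised without touching the network, and a scheduler or cron job would simply call poll_once at the chosen interval.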

Proposed Service definition

To provide this service, the following web calls would be needed

Each service should, if possible, post at least a "ping", "status", and "service" URL; others can be added as needed.

Usage by various clients

SAM Services

Services starting up would call

Then they would look up any services they need to use:

SAM At A Glance

Folks wanting the SAM At A Glance display would call

Network Downtime lookups

Folks wanting to know what would be affected by a downtime on the 131.225.110 subnet would call

Service Pingers

A service pinger would first call:

to get a full dump, then it would go through all the hosts (both the hosts where the services run, and
any hosts from their URLs) and ping them, and report their status with

and then, for services whose hosts were awake, it would call the "ping" URLs and, for each one, report its status with

(reporting "unknown" if the host was not pingable)
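
The control flow of one pinger pass, as described above, could be sketched as follows. The registry calls themselves are left out; host_alive and ping_ok are hypothetical callables standing in for the host ping and the "ping" URL check, and the "up" and "down" labels are assumptions (only "unknown" comes from the text).

```python
from urllib.parse import urlsplit


def pinger_pass(services, host_alive, ping_ok):
    """Run one pass over services, a dict of name -> (host, ping_url).

    Returns (host_status, service_status).
    """
    # Collect both the hosts the services run on and the hosts in their URLs.
    hosts = set()
    for host, ping_url in services.values():
        hosts.add(host)
        url_host = urlsplit(ping_url).hostname
        if url_host:
            hosts.add(url_host)
    host_status = {host: bool(host_alive(host)) for host in sorted(hosts)}

    service_status = {}
    for name, (host, ping_url) in services.items():
        if not host_status[host]:
            # Host not pingable: the service state cannot be determined.
            service_status[name] = "unknown"
        else:
            service_status[name] = "up" if ping_ok(ping_url) else "down"
    return host_status, service_status
```

Checking hosts first avoids waiting on HTTP timeouts for services whose machines are already known to be unreachable.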