Collect service performance stats every iteration for Factory & Frontend
The Frontend performs several tasks in each iteration, such as:
- Query WMS collector for factory classads
- Query schedds for jobs
- Compute min_idle and max_running per group
- Create different types of classads
- Advertise classads
The same applies to the different tasks the Factory performs during each iteration.
Currently, we do not gather enough timing info to understand how much time each of these tasks takes. We can only infer it by looking at the logs. Collecting and logging this info at the end of each iteration would help identify potential problems and debug issues that appear only in production at large scale. These stats can also be used to identify potential bottlenecks and to compare performance across multiple versions.
Once we have this information logged, we can look at advertising it or adding it to the monitoring for performance plots.
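
A minimal Python sketch of what per-task timing could look like, to make the request concrete. It is not taken from the existing code: `IterationTimer`, `run_iteration`, the logger name, and the task labels are all hypothetical, and the `time.sleep` calls only stand in for the real work.

```python
import logging
import time
from contextlib import contextmanager

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("iteration_stats")


class IterationTimer:
    """Accumulates wall-clock time per named task within one iteration."""

    def __init__(self):
        self.timings = {}

    @contextmanager
    def task(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the elapsed time even if the task raised an exception.
            elapsed = time.perf_counter() - start
            self.timings[name] = self.timings.get(name, 0.0) + elapsed

    def summary(self):
        """Return a single log line with the total and per-task times."""
        total = sum(self.timings.values())
        parts = ["%s=%.3fs" % (name, secs) for name, secs in self.timings.items()]
        return "iteration timing: total=%.3fs %s" % (total, " ".join(parts))


def run_iteration(timer):
    """Placeholder iteration; each sleep stands in for one Frontend task."""
    with timer.task("query_factory_classads"):
        time.sleep(0.001)
    with timer.task("query_schedds"):
        time.sleep(0.001)
    with timer.task("compute_min_idle_max_running"):
        time.sleep(0.001)
    with timer.task("create_classads"):
        time.sleep(0.001)
    with timer.task("advertise_classads"):
        time.sleep(0.001)
    # Log once, at the end of the iteration.
    logger.info(timer.summary())
```

A fresh `IterationTimer` would be created for each iteration, so every logged line covers exactly one pass. Keeping the raw numbers in the `timings` dict, separate from the log formatting, leaves room to advertise them or feed them to monitoring later.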