Feature #17349

job reporters should split updates into multiple queues by task_id (odd/even, modulo 3, etc).

Added by Marc Mengel about 3 years ago. Updated about 3 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Target version:
Start date:
08/03/2017
Due date:
% Done:

100%

Estimated time:
Scope:
Internal
Experiment:
-
Stakeholders:
Duration:

Description

Basically, the database code locks Task rows, so multiple bulk updates that involve the same task_ids get serialized. Bulk updates with distinct task_ids, however, can run in parallel. Since most of our updates carry task_ids (except currently Completed updates, which could too if we got task_ids along with the job_ids when we fetch active_jobs), we could really improve our throughput.

So:
  • update the active_jobs code to report jobsub_job_ids and task_ids
  • update jobsub_q_scraper to pass those task_ids in when marking jobs completed
  • add a number of queues (nqueues) to the job_reporter setup and the assorted reporter config files
  • file job updates into queue (task_id % nqueues)
  • give each queue its own bulk-reporting thread.

Then we have truly parallel update reporting, since jobs filed in different queues never contend for the same Task rows.
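The fan-out described above can be sketched roughly as below. This is a minimal illustration, not the actual job_reporter code: the names (NQUEUES, enqueue_update, bulk_reporter) and the shape of the update dicts are assumptions, and the "bulk report" step just collects a batch instead of hitting the database.

```python
import queue
import threading

NQUEUES = 3  # in the ticket, nqueues comes from the reporter config files

queues = [queue.Queue() for _ in range(NQUEUES)]

def enqueue_update(update):
    """File each job update in queue (task_id % NQUEUES), so updates for
    the same task_id always land in the same queue and never race on
    that Task's row lock across queues."""
    queues[update["task_id"] % NQUEUES].put(update)

def bulk_reporter(q, results):
    """One bulk-reporting thread per queue: drain the queue into a batch
    and report it in one go (here we just record the batch)."""
    batch = []
    while True:
        item = q.get()
        if item is None:          # shutdown sentinel
            break
        batch.append(item)
    results.append(batch)         # stand-in for the real bulk DB update

# Demo: updates with distinct task_ids spread across the queues, so the
# per-queue bulk reports can proceed in parallel instead of serializing.
results = []
threads = [threading.Thread(target=bulk_reporter, args=(q, results))
           for q in queues]
for t in threads:
    t.start()
for task_id in (10, 11, 12, 13):
    enqueue_update({"task_id": task_id, "jobsub_job_id": f"job{task_id}"})
for q in queues:
    q.put(None)
for t in threads:
    t.join()
```

Note that task_ids 10 and 13 hash to the same queue (both ≡ 1 mod 3), so their updates stay ordered relative to each other, while 11 and 12 are handled by other threads concurrently.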

History

#1 Updated by Marc Mengel about 3 years ago

  • Description updated (diff)

#3 Updated by Marc Mengel about 3 years ago

  • Status changed from New to Work in progress

#4 Updated by Marc Mengel about 3 years ago

  • Status changed from Work in progress to Resolved
  • % Done changed from 40 to 100

This has been running in development for some time.

#5 Updated by Anna Mazzacane about 3 years ago

  • Status changed from Resolved to Closed
