Task #14468

Monitoring/OnDemand: Check network rates on squid on way up

Added by Steven Timm over 4 years ago. Updated over 3 years ago.

Look at the CPU and network rates of the squid servers. When we get to 10 instances, are they pegged, and do we really need more?
Use iftop, and also the network monitoring in the Google console.
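As a scriptable complement to watching iftop interactively, the per-interface rate can also be estimated from the kernel byte counters. The following is a minimal sketch, not something used in this ticket: it assumes a Linux host, the interface name `eth0` (as reported by iftop in the captures), and hypothetical helper names `parse_proc_net_dev`, `rates_mbit`, and `sample`.

```python
import time


def parse_proc_net_dev(text):
    """Return {interface: (rx_bytes, tx_bytes)} from /proc/net/dev content."""
    counters = {}
    for line in text.splitlines()[2:]:  # the first two lines are column headers
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        # Field 0 is received bytes; field 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def rates_mbit(before, after, seconds):
    """Turn two (rx_bytes, tx_bytes) samples into (rx, tx) rates in Mbit/s."""
    rx = (after[0] - before[0]) * 8 / seconds / 1e6
    tx = (after[1] - before[1]) * 8 / seconds / 1e6
    return rx, tx


def sample(interface="eth0", interval=10.0):
    """Read the counters twice, `interval` seconds apart (interface is assumed)."""
    with open("/proc/net/dev") as f:
        first = parse_proc_net_dev(f.read())[interface]
    time.sleep(interval)
    with open("/proc/net/dev") as f:
        second = parse_proc_net_dev(f.read())[interface]
    return rates_mbit(first, second, interval)
```

Calling `sample()` on a squid instance would return the receive and transmit rates averaged over the interval, which can be logged while the instance group scales up.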


#1 Updated by Neha Sharma over 4 years ago

All squid instance groups were running at their max capacity.

CPU utilization graphs are as follows:

Here is some captured iftop output:

Instance group sc-demo-us-central1-a is resizing: 9 instances, scaling to 10.

[root@sc-demo-us-central1-a-g58f sekhrineha]# hostname
[root@sc-demo-us-central1-a-g58f sekhrineha]#

246Mb                  491Mb                  737Mb                  982Mb            1.20Gb
└──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┴────────────────────── => condor-c0ff430d-7907-47b7-8489-984b9a85fa7 46.7Mb 47.0Mb 46.7Mb
<= 184Kb 150Kb 270Kb => condor-d999a297-9953-472d-a567-e489ac50520 39.7Mb 41.4Mb 40.9Mb
<= 453Kb 242Kb 139Kb
sc-demo-us-central1-a-g58f.c.fermilab-poc. => 162Kb 110Kb 129Kb
<= 39.1Mb 27.4Mb 30.6Mb => condor-7ca4bd1b-9ae0-459f-beb0-0920ab525c3 27.0Mb 21.2Mb 5.29Mb
<= 82.3Kb 69.3Kb 17.7Kb => condor-cd148d85-bb84-468d-ad16-27ab0e49c3d 51.8Mb 15.5Mb 3.89Mb
<= 183Kb 57.7Kb 14.4Kb => condor-9312edaa-15c5-4474-91e3-9bb4115bc33 0b 15.5Mb 4.97Mb
<= 0b 40.5Kb 14.2Kb => condor-8624ca52-db0e-4746-b2b6-b0cc393a0f2 0b 12.8Mb 3.21Mb
<= 0b 34.1Kb 8.53Kb => condor-74a240ef-34cc-4d63-b4cc-6536ff83e57 37.0Kb 11.6Mb 2.94Mb
<= 3.99Kb 66.2Kb 18.8Kb => condor-d24c5cc4-2db8-4d58-a094-a962383b068 48.5Mb 9.69Mb 6.30Mb
<= 123Kb 24.6Kb 19.2Kb => condor-aefef505-5ddf-408b-b8e7-5d71ea095b1 7.65Mb 9.10Mb 3.29Mb
<= 27.9Kb 27.5Kb 12.1Kb => condor-d842cd7d-5525-4ea3-ac9a-69c1d18d01e 0b 6.01Mb 17.5Mb
<= 0b 70.7Kb 180Kb
sc-demo-us-central1-a-g58f.c.fermilab-poc. => 0b 63.3Kb 205Kb
<= 0b 5.96Mb 17.3Mb => condor-3ac7e2b2-ced0-4754-96af-ce5e607a783 28.8Mb 5.82Mb 6.63Mb
<= 88.1Kb 22.7Kb 26.5Kb => condor-9f4c3ed1-6006-446c-b726-fa2a87e09d4 24.9Mb 5.04Mb 5.94Mb
<= 75.6Kb 21.4Kb 31.1Kb => condor-aaea7e18-af84-4a30-8105-925658b75aa 0b 3.56Mb 2.16Mb
<= 0b 9.21Kb 5.67Kb => condor-6e9f7759-aa70-441c-bbfa-b3df924d479 265Kb 3.26Mb 5.95Mb
<= 17.0Kb 15.4Kb 26.2Kb => condor-c681914f-fed0-44c9-9af5-62b6f158104 0b 1.08Mb 277Kb
<= 0b 10.4Kb 2.61Kb => condor-6ffd331e-adff-41cf-8aa7-3304c32b5ca 160Kb 191Kb 5.14Mb
<= 11.6Kb 12.1Kb 18.7Kb
TX: cum: 0.98GB peak: 280Mb rates: 276Mb 209Mb 189Mb
RX: 247MB 185Mb 40.4Mb 34.3Mb 49.1Mb
TOTAL: 1.22GB 439Mb 317Mb 244Mb 238Mb
246Mb                  491Mb                  737Mb                  982Mb            1.20Gb
└──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┴────────────────────── => condor-2acffa68-9a54-4716-9875-0cb48fa0922 0b 13.1Mb 3.27Mb
<= 0b 35.8Kb 8.94Kb => condor-d5196f16-2b77-400f-8faf-88f91b5bda0 0b 10.3Mb 15.2Mb
<= 0b 29.2Kb 736Kb => condor-8624ca52-db0e-4746-b2b6-b0cc393a0f2 0b 5.09Mb 2.23Mb
<= 0b 13.1Kb 5.77Kb => condor-0833a1f1-8cd9-4976-b121-a4438ac626e 0b 4.11Mb 2.01Mb
<= 0b 12.0Kb 6.21Kb => condor-c0ff430d-7907-47b7-8489-984b9a85fa7 0b 3.20Mb 35.1Mb
<= 0b 35.4Kb 380Kb
sc-demo-us-central1-a-g58f.c.fermilab-poc. => 12.1Kb 47.6Kb 160Kb
<= 136Kb 1.12Mb 43.9Mb => condor-d842cd7d-5525-4ea3-ac9a-69c1d18d01e 183Kb 896Kb 2.81Mb
<= 4.89Kb 23.1Kb 30.6Kb
sc-demo-us-central1-a-g58f.c.fermilab-poc. => 6.04Kb 30.4Kb 42.3Kb
<= 180Kb 884Kb 2.79Mb => condor-9db6244c-73e6-4cba-b728-5430d335e7b 322Kb 174Kb 2.08Mb
<= 23.7Kb 12.2Kb 14.3Kb => condor-9f4c3ed1-6006-446c-b726-fa2a87e09d4 90.7Kb 171Kb 126Kb
<= 5.89Kb 12.0Kb 8.74Kb => condor-6ffd331e-adff-41cf-8aa7-3304c32b5ca 89.9Kb 139Kb 2.53Mb
<= 5.68Kb 8.99Kb 17.4Kb => condor-6e9f7759-aa70-441c-bbfa-b3df924d479 77.6Kb 79.5Kb 73.8Kb
<= 4.19Kb 7.38Kb 5.58Kb => condor-aefef505-5ddf-408b-b8e7-5d71ea095b1 82.8Kb 75.5Kb 58.3Kb
<= 4.80Kb 5.33Kb 3.90Kb => condor-279e6a74-0c03-41ac-92c3-db90f90a949 0b 59.7Kb 97.9Kb
<= 0b 4.40Kb 7.11Kb => condor-74a240ef-34cc-4d63-b4cc-6536ff83e57 95.1Kb 46.9Kb 51.4Kb
<= 6.49Kb 3.57Kb 4.11Kb => condor-ab2e673b-8d77-4e1c-8bfe-eefe729d314 11.8Kb 46.6Kb 3.30Mb
<= 1.69Kb 3.32Kb 16.6Kb => condor-d8c978df-6ae2-4bdb-8d04-a367943071d 32.9Kb 40.0Kb 57.1Kb
<= 2.30Kb 3.06Kb 4.30Kb => condor-3ac7e2b2-ced0-4754-96af-ce5e607a783 12.1Kb 30.3Kb 84.4Kb
<= 1.69Kb 1.60Kb 5.70Kb
TX: cum: 1.83GB peak: 165Mb rates: 1.07Mb 37.7Mb 89.6Mb
RX: 743MB 122Mb 398Kb 2.21Mb 48.1Mb
TOTAL: 2.56GB 263Mb 1.46Mb 39.9Mb 138Mb

[sekhrineha@sc-demo-us-central1-f-f81i ~]$ sudo su
[root@sc-demo-us-central1-f-f81i sekhrineha]# iftop
interface: eth0
IP address is:
MAC address is: 42:01:0a:ffffff80:05:fffffff9
[root@sc-demo-us-central1-f-f81i sekhrineha]#

sc-demo-us-central1-f-f81i.c.fermilab-poc. => 35.9Kb 118Kb 110Kb
<= 509Kb 41.0Mb 27.4Mb => condor-f937e377-2318-40d6-90eb-652e724531f 0b 33.4Mb 30.9Mb
<= 0b 153Kb 194Kb => condor-4600ed45-9a5c-4825-84bf-2653ee47535 285Kb 17.7Mb 7.12Mb
<= 14.4Kb 98.6Kb 61.2Kb => condor-c321b271-84cb-4e01-831e-5c99b8f7df9 0b 10.4Mb 3.47Mb
<= 0b 136Kb 45.3Kb => condor-ccd20b50-1377-4e46-a458-fc634ad8441 52.2Kb 9.39Mb 4.38Mb
<= 5.68Kb 65.9Kb 48.1Kb => condor-d942d30a-12c7-4569-8013-07427bc55d0 29.3Mb 5.87Mb 1.96Mb
<= 59.1Kb 11.9Kb 3.98Kb => condor-b4a252c8-0020-4fff-8743-3ff3839d2aa 0b 5.87Mb 1.96Mb
<= 0b 11.3Kb 3.76Kb => condor-d07f1b08-3117-478a-a265-66125e3aadc 0b 5.52Mb 1.84Mb
<= 0b 65.1Kb 21.7Kb => condor-d120f50e-4e55-4cf3-ad91-640c48d68b3 25.5Mb 5.09Mb 2.98Mb
<= 58.4Kb 11.7Kb 6.74Kb => condor-fc71e4f2-a9fa-47fc-b431-4a3f8182490 24.3Mb 5.03Mb 5.66Mb
<= 77.6Kb 26.7Kb 24.9Kb => condor-d4fdbc68-3652-4b96-859f-0ae2c55fac1 74.1Kb 4.70Mb 3.57Mb
<= 4.40Kb 35.3Kb 32.8Kb => condor-e4210a6f-3e4f-4ca9-8cf8-c2d2411f550 0b 4.13Mb 1.38Mb
<= 0b 40.3Kb 13.4Kb => condor-efa1ea0b-f4e5-4bbc-8dad-3c95840dea6 0b 3.37Mb 8.65Mb
<= 0b 39.2Kb 103Kb
sc-demo-us-central1-f-f81i.c.fermilab-poc. => 68.8Kb 68.4Kb 67.7Kb
<= 390Kb 320Kb 360Kb => condor-0d3ddf07-5242-455f-889d-44e24a75725 94.8Kb 181Kb 199Kb
<= 29.6Kb 29.7Kb 29.4Kb => condor-d604d196-fa15-436e-82c9-e8d59ce98a0 328Kb 171Kb 193Kb
<= 30.3Kb 29.4Kb 29.1Kb => condor-1f5cc2e2-eb1c-444d-a7c3-ba37a99037f 0b 49.6Kb 64.7Kb
<= 0b 2.38Kb 4.57Kb => condor-adb43335-fc0e-4cda-b4c6-3ebc7b79eea 82.4Kb 40.3Kb 60.3Kb
<= 6.70Kb 3.44Kb 5.02Kb
TX: cum: 385MB peak: 186Mb rates: 80.1Mb 111Mb 103Mb
RX: 107MB 94.5Mb 1.18Mb 42.1Mb 28.5Mb

[root@sc-demo-us-central1-a-skfj sekhrineha]# iftop
interface: eth0
IP address is:
MAC address is: 42:01:0a:ffffff80:07:42
[root@sc-demo-us-central1-a-skfj sekhrineha]# hostname
[root@sc-demo-us-central1-a-skfj sekhrineha]#

246Mb                      491Mb                      737Mb                      982Mb                1.20Gb
└─────────────────────────┴──────────────────────────┴──────────────────────────┴──────────────────────────┴────────────────────────── => condor-95531532-adcf-4af5-8c52-0bc3278032fa.c.fermil 64.4Mb 69.6Mb 66.4Mb
<= 815Kb 571Kb 441Kb => condor-123995c8-5ae3-4661-8577-befeb708adfa.c.fermil 90.9Mb 56.0Mb 57.0Mb
<= 334Kb 1.44Mb 644Kb
sc-demo-us-central1-a-skfj.c.fermilab-poc.internal => 776Kb 701Kb 649Kb
<= 65.7Mb 42.6Mb 17.3Mb => condor-9d68fd73-1924-4b9f-914a-e6c372276f56.c.fermil 16.7Mb 23.1Mb 5.77Mb
<= 41.5Kb 58.6Kb 14.7Kb => condor-f7736585-a13c-4c49-a7f2-46af891a3203.c.fermil 0b 18.8Mb 4.71Mb
<= 0b 59.6Kb 14.9Kb => condor-9e694d2b-f8fc-4d82-b778-fad229845246.c.fermil 12.2Mb 16.9Mb 4.22Mb
<= 257Kb 120Kb 29.9Kb => condor-728385be-842e-47ee-b46b-7118ace63c3e.c.fermil 75.3Mb 15.1Mb 3.77Mb
<= 431Kb 86.1Kb 21.5Kb => condor-cfbc3cca-eeab-4dcd-869f-412ce5c6a773.c.fermil 73.3Mb 14.7Mb 3.66Mb
<= 380Kb 76.0Kb 19.0Kb => condor-02d2182a-7bb4-4dce-a823-7f7e9da2ba11.c.fermil 43.3Kb 7.46Mb 1.89Mb
<= 4.19Kb 42.8Kb 12.8Kb => condor-b232d4fd-9e60-4d20-9c98-cdce62a50459.c.fermil 0b 1.45Mb 480Kb
<= 0b 977Kb 308Kb => condor-36b79e58-2191-4cab-8c9c-f43c991803cb.c.fermil 0b 1.45Mb 480Kb
<= 0b 977Kb 308Kb => condor-2a4f6ed3-8aae-4332-a731-7ba5f12683ff.c.fermil 1.54Mb 1.58Mb 2.97Mb
<= 231Kb 232Kb 248Kb => condor-5ce5d03a-96d7-4474-8b29-803c0645e865.c.fermil 1.27Mb 1.53Mb 4.64Mb
<= 228Kb 231Kb 249Kb => condor-d847e66e-1170-4cd8-b6c0-cfe234ba644b.c.fermil 452Kb 90.3Kb 22.6Kb
<= 923Kb 185Kb 46.1Kb => condor-2259c964-ca8e-45e8-8773-ed8cbafe211f.c.fermil 109Kb 163Kb 231Kb
<= 29.9Kb 30.1Kb 30.8Kb
sc-demo-us-central1-a-skfj.c.fermilab-poc.internal => 32.7Kb 34.2Kb 35.7Kb
<= 92.7Kb 146Kb 214Kb => condor-5acab265-56fc-4500-b3be-b32ee53c3724.c.fermil 0b 21.6Kb 5.41Kb
<= 0b 1.51Kb 386b => condor-054b9cf1-af10-4db5-b4f6-12cf00d6ca36.c.fermil 0b 13.6Kb 3.39Kb
<= 0b 947b 237b
sc-demo-us-central1-a-skfj.c.fermilab-poc.internal => 7.98Kb 12.6Kb 10.1Kb
<= 208b 291b 1.90Kb => condor-d47677ab-6090-4f6e-b79a-d78b79adf8df.c.fermil 0b 3.06Kb 784b
<= 0b 1.83Kb 468b
sc-demo-us-central1-a-skfj.c.fermilab-poc.internal => 0b 3.49Kb 893b
<= 0b 1.32Kb 338b
sc-demo-us-central1-a-skfj.c.fermilab-poc.internal => 0b 3.47Kb 1.08Kb
<= 0b 1.30Kb 413b => condor-cdbdfc75-f2ec-4c45-ba0c-02253e8d9d6a.c.fermil 0b 2.19Kb 560b
<= 0b 1.29Kb 331b => condor-d842cd7d-5525-4ea3-ac9a-69c1d18d01ed.c.fermil 0b 1.34Kb 343b
<= 0b 866b 217b
TX: cum: 1.72GB peak: 568Mb rates: 337Mb 229Mb 174Mb
RX: 553MB 69.4Mb 69.4Mb 47.7Mb 19.9Mb
TOTAL: 2.26GB 575Mb 406Mb 276Mb 194Mb

At this time (11:26 am), network bytes out is ~64 MB/s and CPU utilization is 2%.
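The console figure above is in bytes per second, while the iftop summary lines are in bits per second, so the two need a common unit before they can be compared. The sketch below is one way to do that. It assumes decimal prefixes (K = 1000), which may not match how iftop scales its units, and the function names are hypothetical. The ticket does not say whether the ~64 MB/s figure is per instance or per group, so the example only aligns units.

```python
import re

# Assumption: decimal prefixes; iftop may scale by 1024 instead.
_PREFIX = {"": 1, "K": 1_000, "M": 1_000_000, "G": 1_000_000_000}


def parse_rate_bits(token):
    """Convert an iftop rate token such as '337Mb' or '0b' to bits per second."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([KMG]?)b", token)
    if match is None:
        raise ValueError(f"not an iftop rate: {token!r}")
    return float(match.group(1)) * _PREFIX[match.group(2)]


def parse_tx_rates(line):
    """Extract the three averaged rates (2 s, 10 s, 40 s) from a 'TX:' line."""
    _, _, tail = line.partition("rates:")
    return [parse_rate_bits(token) for token in tail.split()]


def mbytes_to_mbits(mbytes_per_s):
    """Convert MB/s (console units) to Mb/s (iftop units)."""
    return mbytes_per_s * 8


tx_line = "TX: cum: 1.72GB peak: 568Mb rates: 337Mb 229Mb 174Mb"
tx_rates_mbit = [rate / 1e6 for rate in parse_tx_rates(tx_line)]
console_mbit = mbytes_to_mbits(64)  # 64 MB/s is 512 Mb/s
```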

#2 Updated by Steven Timm over 4 years ago

  • Tracker changed from Milestone to Task
  • Status changed from New to Work in progress
  • % Done changed from 0 to 90

#3 Updated by Steven Timm almost 4 years ago

  • Status changed from Work in progress to Resolved

We looked at this. It turned out that our selected scale-up rate was very aggressive; we never came close to saturating the network of any of our on-demand squid servers. We scaled up at 9 MB/s and could have scaled at 5x that and been just fine.
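To make the effect of the threshold concrete, here is a rough sketch of how the per-instance target changes the group size. It assumes the 9 MB/s figure is a per-instance target and that the group is sized as total traffic divided by that target; the 90 MB/s load is a hypothetical number chosen for illustration, not a measurement from this ticket.

```python
import math


def instances_needed(total_mb_per_s, target_mb_per_s_per_instance):
    """Smallest group size that keeps each instance at or below the target."""
    return max(1, math.ceil(total_mb_per_s / target_mb_per_s_per_instance))


ACTUAL_TARGET = 9.0                  # MB/s, the scale-up rate that was used
RELAXED_TARGET = ACTUAL_TARGET * 5   # MB/s, five times the rate that was used

hypothetical_load = 90.0             # MB/s aggregate, illustrative only

at_actual = instances_needed(hypothetical_load, ACTUAL_TARGET)
at_relaxed = instances_needed(hypothetical_load, RELAXED_TARGET)
print(f"target {ACTUAL_TARGET} MB/s: {at_actual} instances")
print(f"target {RELAXED_TARGET} MB/s: {at_relaxed} instances")
```

Under these assumptions, the same load that fills 10 instances at the 9 MB/s target would need only 2 at the relaxed target.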

#4 Updated by Steven Timm over 3 years ago

  • Status changed from Resolved to Closed
