Task #8994

Describe observed load distribution in dCache/EOS

Added by Gerard Bernabeu Altayo over 4 years ago. Updated over 4 years ago.

Status: Resolved
Priority: Normal
Start date: 06/01/2015
Due date: 06/01/2015
% Done: 0%
Estimated time: 2.00 h
Spent time:
Duration: 1

Description

We need an up-to-date picture of the load on our dCache disk, which is where most of the upcoming disk purchase will go.

I'll use billinginfo data for this.

History

#1 Updated by Gerard Bernabeu Altayo over 4 years ago

Copy&pasting the data below.

Statistics for a whole month:
billing=> select count(*),isnew as Writes from billinginfo where datestamp > '2015-05-01 00:00:00' and datestamp < '2015-06-01 00:00:00' group by isnew;
count | writes
----------+--------
93334370 | f
421845 | t
(2 rows)
billing=> select count (distinct substring(cellname from 3 for 10)) as pools from billinginfo where datestamp > '2015-05-01 00:00:00' and datestamp < '2015-06-01 00:00:00';
pools
-------
189
(1 row)

Statistics for 24h:

[root@cmsdcacheadmindisk ~]# psql -U enstore billing
billing=> select count(*),isnew as Writes from billinginfo where datestamp > '2015-05-29 00:00:00' and datestamp < '2015-05-30 00:00:00' group by isnew;
count | writes
---------+--------
6835183 | f
27796 | t
(2 rows)
billing=> select count (distinct substring(cellname from 3 for 10)) as pools from billinginfo where datestamp > '2015-05-29 00:00:00' and datestamp < '2015-05-30 00:00:00';
pools
-------
189
(1 row)

Statistics for a 2h subset within this 24h:

billing=> select count (distinct substring(cellname from 3 for 10)) as pools from billinginfo where datestamp > '2015-05-29 00:00:00' and datestamp < '2015-05-29 02:00:00';
pools
-------
189
(1 row)

billing=> select count(*),isnew as Writes from billinginfo where datestamp > '2015-05-29 00:00:00' and datestamp < '2015-05-29 02:00:00' group by isnew;
count | writes
--------+--------
424345 | f
3206 | t
(2 rows)
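
The three windows above reuse the same two queries with different timestamps. A minimal Python sketch that generates them for an arbitrary window is below. The function name `billing_queries` is hypothetical, and the sketch assumes a half-open window (`>=` start, `<` end), whereas the queries pasted above use a strict `>` on the start.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"


def billing_queries(start, end):
    """Return (read/write count query, active pool count query) for [start, end)."""
    # Assumption: half-open window; the ticket's queries use a strict '>' on start.
    window = (
        f"datestamp >= '{start.strftime(FMT)}' "
        f"and datestamp < '{end.strftime(FMT)}'"
    )
    counts = (
        "select count(*), isnew as writes from billinginfo "
        f"where {window} group by isnew;"
    )
    pools = (
        "select count(distinct substring(cellname from 3 for 10)) as pools "
        f"from billinginfo where {window};"
    )
    return counts, pools


if __name__ == "__main__":
    # The busy 2h window used above.
    for query in billing_queries(datetime(2015, 5, 29), datetime(2015, 5, 29, 2)):
        print(query)
```

The printed statements can be pasted into the same psql session as above.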

#2 Updated by Gerard Bernabeu Altayo over 4 years ago

This basically means we see:

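Reads per write ratio -- Monthly: ~220 r/w , Daily: ~245 r/w , 2H window: ~132 r/w
Writes that happened per pool -- Monthly: 2231 , Daily: 147 , 2H window: 17

So to me the test should try to do 9 writes and 2205 reads per pool per hour (17 writes per pool in the busy 2h window is about 9 per hour, and 9 writes x 245 reads per write = 2205 reads).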

Note that for the 2h period I cherry-picked a busy one.
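
For anyone who wants to re-check the arithmetic, here is a small Python sketch that recomputes the ratios from the counts pasted in update #1. The derivation of the hourly target (2h writes per pool halved and rounded up, multiplied by the daily ratio rounded down) is my assumption about how 9 and 2205 were obtained; the ticket does not spell it out. Function names are hypothetical.

```python
import math

POOLS = 189  # distinct pools seen in every window

# (reads, writes) = counts for isnew = f and isnew = t in each window
WINDOWS = {
    "month": (93334370, 421845),
    "day": (6835183, 27796),
    "2h": (424345, 3206),
}


def reads_per_write(reads, writes):
    return reads / writes


def writes_per_pool(writes, pools=POOLS):
    return writes / pools


def hourly_target():
    """Assumed derivation: busy 2h window for writes, daily ratio for reads."""
    _, writes_2h = WINDOWS["2h"]
    writes_per_hour = math.ceil(round(writes_per_pool(writes_2h)) / 2)
    daily_ratio = math.floor(reads_per_write(*WINDOWS["day"]))
    return writes_per_hour, writes_per_hour * daily_ratio


if __name__ == "__main__":
    for name, (reads, writes) in WINDOWS.items():
        print(f"{name}: {reads_per_write(reads, writes):.1f} reads/write, "
              f"{writes_per_pool(writes):.1f} writes/pool")
    print("hourly target (writes, reads):", hourly_target())
```

The exact quotients differ slightly from the rounded figures quoted above, which were approximated by hand.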

At any given time a single pool has between 100 and 250 transfers running simultaneously. With our current setup this means up to 500 parallel transfers per SATABeast, because we plug 2 servers into each SATABeast.

The average CMS file is 2GB in size so the test files should be this big.

The average read speed per transfer varies widely, ranging from 50KB/s to 160MB/s - http://cmsdcacheadmindisk.fnal.gov:2288/context/transfers.html.

I looked at it like this:

-bash-4.1$ wget http://cmsdcacheadmindisk.fnal.gov:2288/context/transfers.txt
-bash-4.1$ grep cmsstor332 transfers.txt | awk '{print $(NF-1) }' | grep -v 0.0 | sort -n
25.197428116568872
25.319155538568484
25.925857235277146
27.793103515543084
27.89962186095695
28.1236499201468
28.31715738352416
28.460839338167776
29.41281881078067
29.82311478396275
31.13171515356833
32.39627927216549
37.90351744167948
38.94710979259364
39.47692832996922
43.686879810139075
43.916283552114415
47.025192936648075
47.81642710117434
50.329962092671295
50.708222882825666
50.95335423354342
51.77061376317421
54.224675324675324
69.04834345838292
118.1722733215935
122.78111610741102
151.88523293193575
183.17334805902811
195.81805164790995
207.20143005236437
207.70383693045562
224.76358490566037
226.07915276367328
229.5018703425525
230.67730475457398
238.70421974522293
245.45797631048387
261.1458160085953
261.7351737818738
263.33495555780866
263.79698775598655
264.3983033593485
266.6659488384234
269.17173812308306
270.46109886346613
270.9629031901488
271.04433751503694
275.5541608243752
279.67244145757275
281.20669789124173
289.31141456582634
293.1462581159585
298.3754226362774
303.1792306746958
306.61962852195813
309.48159034039
319.52898568699925
323.37598680035535
325.42922117743257
328.737663879426
332.17367897962566
332.9716184196971
334.44543538603887
355.9166990183573
358.5254437154174
393.3147106369221
402.43756736946494
409.8671995829351
412.34756052689676
412.4622239374112
425.62876600658944
437.14349201834534
439.1088802264111
443.57308503162335
444.1632507447245
456.67897372879776
463.02838286275744
467.77075489636815
470.3084310630893
490.4576786088414
501.63833547219826
519.3876314610577
525.3652809292837
551.0725700812156
907.056925634248
1063.553566598552
1464.8777582332284
1483.8671328671328
1588.5668772121078
1727.3544231408716
1733.21798024244
1768.8829967634422
1788.8812813843647
1804.6819198513435
1810.8915322176172
1828.9043415559584
1849.985113199332
1900.9773211172417
1923.3149519973044
2027.7623406439793
2188.6024159663866
2268.7917074763304
2307.1337889493802
2337.9268414815688
2493.288440792909
2503.55507136565
2520.764529698247
2582.148799974071
2583.2870167408473
2585.0568803979286
2592.664951472253
2614.633492300314
2730.824071365557
2736.7067113157527
2802.8834469310627
2820.9515554262216
2821.745870332224
2843.6985315992556
2854.7389902964883
2862.7237026484463
2902.3869974348618
2905.845672909786
2949.827852398524
2954.969497011573
2971.524351261739
2996.8559760045628
3035.8560913393426
3674.2835436392274
3860.202642093236
3941.73753172281
3958.458788249571
4031.6576184930523
4466.624425856118
21270
21966
23916
128930

Note these values are in KB/s. They could be summarized (I've normalized the counts so that it is easier and maybe more realistic) as 250 transfers with:

- 2 full speed ~0.5%
- 2 with high speed 25MB/s ~0.5%
- 55 with medium speed 1-5 MB/s ~ 22%
- 68 slow 0.1-0.9 MB/s ~ 27%
- 123 very slow transfers around 50KB/s ~ 50%
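
The grouping above can be reproduced roughly with a short Python sketch like the one below. The bucket boundaries are assumptions picked to match the classes in the summary, since exact cut-offs were not recorded, and the function names are hypothetical. `parse_rates` mirrors the awk step (second-to-last whitespace-separated field of lines mentioning the host) but simply drops zero rates instead of reproducing the `grep -v 0.0` pattern match.

```python
from collections import Counter

# Upper bounds in KB/s. Assumed cut-offs; the ticket gives no exact boundaries.
BUCKETS = [
    (100.0, "very slow (~50 KB/s)"),
    (1000.0, "slow (0.1-0.9 MB/s)"),
    (10000.0, "medium (1-5 MB/s)"),
    (100000.0, "high (~25 MB/s)"),
]
FULL_SPEED = "full speed"


def parse_rates(lines, host):
    """Return sorted non-zero rates (KB/s) from lines that mention host."""
    rates = []
    for line in lines:
        fields = line.split()
        if host not in line or len(fields) < 2:
            continue
        try:
            rate = float(fields[-2])  # same column as awk's $(NF-1)
        except ValueError:
            continue  # skip header or malformed lines
        if rate > 0:
            rates.append(rate)
    return sorted(rates)


def classify(rate_kbs):
    for upper, label in BUCKETS:
        if rate_kbs < upper:
            return label
    return FULL_SPEED


def histogram(rates):
    return Counter(classify(rate) for rate in rates)


if __name__ == "__main__":
    # A few values copied from the listing above.
    sample = [25.197428116568872, 54.224675324675324, 118.1722733215935,
              551.0725700812156, 907.056925634248, 4466.624425856118,
              21270, 23916, 128930]
    for label, count in histogram(sample).items():
        print(f"{count:3d}  {label}")
```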

Assuming a solution like the SATABeasts we have right now, and given all this data, I'd make the benchmark such that it does:

- 603 parallel 2GB data file transfers per server at a ratio of 200 reads per write.
- 3 writes should be at full speed (no cap, as much as the system can deliver)
- 600 reads should be capped as follows:
-- 5 reads at full speed (no cap, as much as the system can deliver)
-- 5 reads capped at 25MB/s (125MB/s total)
-- 130 reads capped at 3MB/s (390MB/s total)
-- 160 reads capped at 0.5MB/s (80MB/s total)
-- 300 reads capped at 0.05MB/s (15MB/s total)

This totals 610MB/s in capped reads; the rest of the available capacity should be used by the non-capped transfers.
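
As a cross-check of the proposed mix, the profile can be written down as data and summed. This is only a sketch; the `TransferClass` type and helper names are hypothetical and not part of any benchmark tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TransferClass:
    kind: str                   # "read" or "write"
    count: int                  # number of parallel transfers
    cap_mb_s: Optional[float]   # per-transfer cap in MB/s, None = uncapped


PROFILE = [
    TransferClass("write", 3, None),
    TransferClass("read", 5, None),
    TransferClass("read", 5, 25.0),
    TransferClass("read", 130, 3.0),
    TransferClass("read", 160, 0.5),
    TransferClass("read", 300, 0.05),
]


def transfer_count(kind=None):
    """Number of parallel transfers, optionally restricted to one kind."""
    return sum(c.count for c in PROFILE if kind is None or c.kind == kind)


def capped_read_bandwidth():
    """Aggregate MB/s if every capped read runs exactly at its cap."""
    return sum(c.count * c.cap_mb_s for c in PROFILE
               if c.kind == "read" and c.cap_mb_s is not None)


if __name__ == "__main__":
    reads, writes = transfer_count("read"), transfer_count("write")
    print("parallel transfers:", transfer_count())
    print("reads per write:", reads // writes)
    print("capped read bandwidth (MB/s):", round(capped_read_bandwidth(), 2))
```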

#3 Updated by Gerard Bernabeu Altayo over 4 years ago

  • Due date set to 06/01/2015
  • Status changed from New to Resolved
  • Estimated time set to 2.00 h

Email sent with the summary:

from: Gerard Bernabeu
to: Amitoj G Singh
cc: Stuart C Fuess, David J Fagan, David A Mason, Timothy J Kasza, Stanley J Naymola
date: Mon, Jun 1, 2015 at 5:26 PM

Closing task.

#4 Updated by Gerard Bernabeu Altayo over 4 years ago

For the record, we do have loads higher than that (~1100 parallel transfers, mostly slow reads), right now:

[root@cmsstor334 ~]# lsof | grep -c /storage/data
1438
[root@cmsstor334 ~]# netstat -puta | grep -c java
1148
[root@cmsstor334 ~]# dstat
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read writ| recv send| in out | int csw
1 1 97 0 0 0| 90M 3910k| 0 0 | 0 0 | 13k 9958
5 4 90 0 0 1| 75M 0 |2487k 259M| 0 0 | 66k 62k
5 4 89 0 0 1| 54M 0 |2764k 314M| 0 0 | 70k 61k
4 3 92 0 0 1| 36M 0 |2344k 243M| 0 0 | 62k 57k
5 4 89 1 0 1| 71M 17k|2623k 293M| 0 0 | 67k 59k^C
[root@cmsstor334 ~]# iotop -b -n1 | grep -c java
2008

[root@cmsstor334 ~]# iotop

Total DISK READ: 90.28 M/s | Total DISK WRITE: 0.00 B/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
63628 be/4 root 2.66 M/s 0.00 B/s 0.00 % 2.98 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk3Domain
63652 be/4 root 1297.71 K/s 0.00 B/s 0.00 % 2.61 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk3Domain
63659 be/4 root 838.72 K/s 0.00 B/s 0.00 % 2.34 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk3Domain
44683 be/4 root 8.67 M/s 0.00 B/s 0.00 % 2.12 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk2Domain
63651 be/4 root 3.03 M/s 0.00 B/s 0.00 % 1.57 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk3Domain
44699 be/4 root 402.85 K/s 0.00 B/s 0.00 % 1.54 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk2Domain
63655 be/4 root 1271.30 K/s 0.00 B/s 0.00 % 1.38 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk3Domain
63656 be/4 root 6.41 M/s 0.00 B/s 0.00 % 1.36 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk3Domain
44686 be/4 root 6.60 K/s 0.00 B/s 0.00 % 1.21 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk2Domain
63636 be/4 root 102.36 K/s 0.00 B/s 0.00 % 0.72 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk3Domain
63645 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.39 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk3Domain
63649 be/4 root 4.90 M/s 0.00 B/s 0.00 % 0.22 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk3Domain
63657 be/4 root 4.90 M/s 0.00 B/s 0.00 % 0.18 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk3Domain
63637 be/4 root 3.14 M/s 0.00 B/s 0.00 % 0.13 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk3Domain
63646 be/4 root 709.94 K/s 0.00 B/s 0.00 % 0.12 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk3Domain
63638 be/4 root 429.27 K/s 0.00 B/s 0.00 % 0.10 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk3Domain
63654 be/4 root 3.30 M/s 0.00 B/s 0.00 % 0.09 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk3Domain
44693 be/4 root 825.52 K/s 0.00 B/s 0.00 % 0.09 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk2Domain
44263 be/4 root 3.30 M/s 0.00 B/s 0.00 % 0.07 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk1Domain
63658 be/4 root 600.98 K/s 0.00 B/s 0.00 % 0.07 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk3Domain
44261 be/4 root 3.83 M/s 0.00 B/s 0.00 % 0.07 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk1Domain
44319 be/4 root 845.33 K/s 0.00 B/s 0.00 % 0.07 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk1Domain
44677 be/4 root 4.93 M/s 0.00 B/s 0.00 % 0.05 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk2Domain
44317 be/4 root 3.30 M/s 0.00 B/s 0.00 % 0.04 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk1Domain
44678 be/4 root 211.33 K/s 0.00 B/s 0.00 % 0.04 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk2Domain
63650 be/4 root 3.30 K/s 0.00 B/s 0.00 % 0.02 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk3Domain
44684 be/4 root 3.24 M/s 0.00 B/s 0.00 % 0.02 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk2Domain
44320 be/4 root 3.30 K/s 0.00 B/s 0.00 % 0.01 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk1Domain
44689 be/4 root 6.60 K/s 0.00 B/s 0.00 % 0.01 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk2Domain
63634 be/4 root 4.91 M/s 0.00 B/s 0.00 % 0.01 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk3Domain
35581 be/4 root 4.95 M/s 0.00 B/s 0.00 % 0.00 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk2Domain
44262 be/4 root 3.30 M/s 0.00 B/s 0.00 % 0.00 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk1Domain
44680 be/4 root 2.45 M/s 0.00 B/s 0.00 % 0.00 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk2Domain
44691 be/4 root 1284.50 K/s 0.00 B/s 0.00 % 0.00 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk2Domain
44692 be/4 root 845.33 K/s 0.00 B/s 0.00 % 0.00 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk2Domain
44694 be/4 root 1690.66 K/s 0.00 B/s 0.00 % 0.00 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk2Domain
44695 be/4 root 452.38 K/s 0.00 B/s 0.00 % 0.00 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk2Domain
63633 be/4 root 1542.06 K/s 0.00 B/s 0.00 % 0.00 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk3Domain
63648 be/4 root 3.21 M/s 0.00 B/s 0.00 % 0.00 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk3Domain
63653 be/4 root 1238.27 K/s 0.00 B/s 0.00 % 0.00 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk3Domain
57344 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.~g.dcache.boot.BootLoader start w-cmsstor334-disk-disk1Domain
1 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % init
2 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [kthreadd]
3 rt/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [migration/0]


