Aug 2018 rgang iperf at pdune

Using rgang_iperf

First, I had to patch and build a version of iperf that
changed: listen( mSettings->mSock, 5 )
to: listen( mSettings->mSock, 160 )

So, this iperf (~ron/bin/iperf) needs to be used.

The value 160 is based on each server in the simulation getting connections from 8 * 20 = 160 "flows".
I also ran:

rgang np04-srv-0\{01-4} 'sudo sysctl -w net.core.somaxconn=160'

because 160 is greater than the default somaxconn of 128, and the kernel silently caps the listen() backlog at somaxconn.
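
The backlog arithmetic above can be sketched as follows. This is only an illustration: the function names are made up for this note, and the values 8, 20 and 128 are the ones quoted above.

```shell
# Hypothetical helpers (not part of rgang or iperf).

# Connections one server must be able to queue: BR nodes * flows per node.
required_backlog() {
    echo $(( $1 * $2 ))
}

# The kernel uses min(requested backlog, net.core.somaxconn).
effective_backlog() {
    if [ "$1" -gt "$2" ]; then echo "$2"; else echo "$1"; fi
}

need=$(required_backlog 8 20)
echo "needed backlog:            $need"
echo "effective, somaxconn=128:  $(effective_backlog "$need" 128)"
echo "effective, somaxconn=160:  $(effective_backlog "$need" 160)"
```

With somaxconn left at 128, the patched listen() call would still be limited to 128, which is why both changes are needed.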

First, see if a basic rgang of rgang works:

rgang np04-srv-0\{01} 'PATH=~/bin:~/script:$PATH; rgang  np04-srv-0\{11} uptime'

Then see if a full rgang of rgang works:

rgang -n2 np04-srv-0\{01-4} 'PATH=~/bin:~/script:$PATH; rgang --combine  np04-srv-0\{11-18} uptime'

Then try a simple rgang_iperf.sh:

rgang -n2 np04-srv-0\{01} 'PATH=~/bin:~/script:$PATH; '\
'rgang_iperf.sh np04-srv-0\{11-18} --servers=2 --time=10 --len=3M --sndbuf=6M -P2'

Then try the full rgang_iperf.sh:

rgang -n2 np04-srv-0\{01-4} 'PATH=~/bin:~/script:$PATH; '\
'rgang_iperf.sh np04-srv-0\{11-18,11-18} --servers=2 --time=20 --len=3M --sndbuf=6M,60M -P20'

The "full" rgang_iperf tries to emulate 20 BRs on each of the 8 BR nodes sending to 2 EBs on each of the 4 EB nodes.
Note: the BR nodes are specified twice because the script divides them among the 2 servers (--servers=2).

Each BR node will have 160 socket buffers active. If each has room for 20 events (in case an EB hangs just after sending 20 tokens), then the total room is approx. 3M * 20 * 160 = 9.6 GB. That, perhaps, is too much? (But what good is memory if you can't use it?)
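
A quick check of that estimate, as a sketch only. The helper name is hypothetical, and the socket count assumes each of the 20 BRs on a node connects to all 2 * 4 = 8 EBs, which is how the 160 above appears to be derived.

```shell
# Hypothetical helper: total send-buffer room on one BR node, in MB.
# Arguments: event size (MB), events buffered per socket, active sockets.
br_node_buffer_mb() {
    echo $(( $1 * $2 * $3 ))
}

sockets=$(( 20 * 2 * 4 ))                      # 20 BRs * 2 EBs * 4 EB nodes
total_mb=$(br_node_buffer_mb 3 20 "$sockets")  # 3M events, 20 per socket
echo "sockets per BR node: $sockets"
echo "total buffer room:   ${total_mb} MB"
```

The result, 9600 MB, is the 9.6 GB quoted above.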

In the noisy output below, the 2 columns to note are "_Gb/s_" and "snd(K)". You can
see that when the snd buffers are 6M (requested; 12M allocated), the performance is nice, but when 60M sndbuffers
are specified, the performance is not so good.
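
The snd(K) values in the output (12288 and 122880) are twice the --sndbuf values given on the command line (6M and 60M). That is consistent with Linux doubling the value passed to setsockopt(SO_SNDBUF). A small sketch of the arithmetic, assuming "M" means 1024 * 1024 bytes and snd(K) is in units of 1024 bytes; allocated_sndbuf_k is a hypothetical helper name:

```shell
# Hypothetical helper: expected snd(K) value for a requested send buffer
# given in MiB, assuming the kernel doubles the requested size.
allocated_sndbuf_k() {
    echo $(( 2 * $1 * 1024 ))
}

echo "requested  6M -> snd(K) $(allocated_sndbuf_k 6)"
echo "requested 60M -> snd(K) $(allocated_sndbuf_k 60)"
```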

The one significant difference between this simulation and the real system is that,
with the simulation, there is no throttling by a trigger. However, if we are supposed to handle a burst at 100 Hz, a significant portion of the buffers will become full, so maybe the simulation is not so bad. Further investigation seems warranted.

/nfs/home/ron
np04-srv-014 :^) rgang -n2 np04-srv-0\{01-4} 'PATH=~/bin:~/script:$PATH; ''rgang_iperf.sh np04-srv-0\{11-18,11-18} --servers=2 --time=20 --len=3M --sndbuf=6M,60M -P20'

- - - - - - - - - - - - - - np04-srv-001 - - - - - - - - - - - - - -
Sat Aug 25 21:18:42 CEST 2018: RX:              on
Sat Aug 25 21:18:42 CEST 2018: RX:              on
Sat Aug 25 21:18:42 CEST 2018: bond0: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST>  mtu 1500
Sat Aug 25 21:18:42 CEST 2018: BW:20000Mb minRTT:0.066ms linkBW*delay: 82500 Bytes
/proc/sys/net/core/rmem_max:212992
/proc/sys/net/core/netdev_max_backlog:1000
/proc/sys/net/ipv4/tcp_moderate_rcvbuf:1
/proc/sys/net/ipv4/tcp_adv_win_scale:1
/proc/sys/net/ipv4/tcp_app_win:31
/proc/sys/net/ipv4/tcp_rmem:4096        87380   6291456
#____________date____________ _Gb/s_ errs drop ovrun frame __rmt_retrans__ flows inflight(K) rcv(K) snd(K) rcalc(K)
Sat Aug 25 21:19:08 CEST 2018  19.16    0    0     0     0 3504978 3504976   2   320   982112.0  3072.0  12288 3069.1
Sat Aug 25 21:19:43 CEST 2018  10.28    0    0     0     0 3561484 3561484   0   320   982112.0  3072.0 122880 3069.1

- - - - - - - - - - - - - - np04-srv-002 - - - - - - - - - - - - - -
Sat Aug 25 21:18:42 CEST 2018: RX:              on
Sat Aug 25 21:18:42 CEST 2018: RX:              on
Sat Aug 25 21:18:42 CEST 2018: bond0: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST>  mtu 1500
Sat Aug 25 21:18:42 CEST 2018: BW:20000Mb minRTT:0.061ms linkBW*delay: 76250 Bytes
/proc/sys/net/core/rmem_max:212992
/proc/sys/net/core/netdev_max_backlog:1000
/proc/sys/net/ipv4/tcp_moderate_rcvbuf:1
/proc/sys/net/ipv4/tcp_adv_win_scale:1
/proc/sys/net/ipv4/tcp_app_win:31
/proc/sys/net/ipv4/tcp_rmem:4096        87380   6291456
#____________date____________ _Gb/s_ errs drop ovrun frame __rmt_retrans__ flows inflight(K) rcv(K) snd(K) rcalc(K)
Sat Aug 25 21:19:08 CEST 2018  19.12    0    0     0     0 3505678 3505676   2   320   982112.0  3072.0  12288 3069.1
Sat Aug 25 21:19:43 CEST 2018  15.03    0    0     0     0 3537330 3537330   0   320   982112.0  3072.0 122880 3069.1

- - - - - - - - - - - - - - np04-srv-003 - - - - - - - - - - - - - -
Sat Aug 25 21:18:42 CEST 2018: RX:              on
Sat Aug 25 21:18:42 CEST 2018: RX:              on
Sat Aug 25 21:18:42 CEST 2018: bond0: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST>  mtu 1500
Sat Aug 25 21:18:42 CEST 2018: BW:20000Mb minRTT:0.049ms linkBW*delay: 61250 Bytes
/proc/sys/net/core/rmem_max:212992
/proc/sys/net/core/netdev_max_backlog:1000
/proc/sys/net/ipv4/tcp_moderate_rcvbuf:1
/proc/sys/net/ipv4/tcp_adv_win_scale:1
/proc/sys/net/ipv4/tcp_app_win:31
/proc/sys/net/ipv4/tcp_rmem:4096        87380   6291456
#____________date____________ _Gb/s_ errs drop ovrun frame __rmt_retrans__ flows inflight(K) rcv(K) snd(K) rcalc(K)
Sat Aug 25 21:19:08 CEST 2018  18.81    0    0     0     0 3530582 3530580   2   320   982112.0  3072.0  12288 3069.1
Sat Aug 25 21:19:43 CEST 2018  11.35    0    0     0     0 3581768 3581768   0   320   982112.0  3072.0 122880 3069.1

- - - - - - - - - - - - - - np04-srv-004 - - - - - - - - - - - - - -
Sat Aug 25 21:18:42 CEST 2018: RX:              on
Sat Aug 25 21:18:42 CEST 2018: RX:              on
Sat Aug 25 21:18:42 CEST 2018: bond0: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST>  mtu 1500
Sat Aug 25 21:18:42 CEST 2018: BW:20000Mb minRTT:0.069ms linkBW*delay: 86250 Bytes
/proc/sys/net/core/rmem_max:212992
/proc/sys/net/core/netdev_max_backlog:1000
/proc/sys/net/ipv4/tcp_moderate_rcvbuf:1
/proc/sys/net/ipv4/tcp_adv_win_scale:1
/proc/sys/net/ipv4/tcp_app_win:31
/proc/sys/net/ipv4/tcp_rmem:4096        87380   6291456
#____________date____________ _Gb/s_ errs drop ovrun frame __rmt_retrans__ flows inflight(K) rcv(K) snd(K) rcalc(K)
Sat Aug 25 21:19:05 CEST 2018  18.89    0    0     0     0 3479252 3479250   2   320   982112.0  3072.0  12288 3069.1
Sat Aug 25 21:19:43 CEST 2018  12.03    0    0     0     0 3410672 3410672   0   320   982112.0  3072.0 122880 3069.1

--2018-08-25_21:19:44--

We can decrease the retransmissions by setting small-ish rcvbufs (when no rcvbuf is set, the kernel tries to auto-tune it):

/nfs/home/ron
np04-srv-014 :^) rgang -n2 np04-srv-0\{01-4} 'PATH=~/bin:~/script:$PATH; ''rgang_iperf.sh np04-srv-0\{11-18,11-18} --servers=2 --time=20 --len=3M --rcvbuf=9K --sndbuf=6M,60M -P20'

- - - - - - - - - - - - - - np04-srv-001 - - - - - - - - - - - - - -
Sat Aug 25 21:29:04 CEST 2018: RX:              on
Sat Aug 25 21:29:04 CEST 2018: RX:              on
Sat Aug 25 21:29:04 CEST 2018: bond0: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST>  mtu 1500
Sat Aug 25 21:29:04 CEST 2018: BW:20000Mb minRTT:0.066ms linkBW*delay: 82500 Bytes
/proc/sys/net/core/rmem_max:212992
/proc/sys/net/core/netdev_max_backlog:1000
/proc/sys/net/ipv4/tcp_moderate_rcvbuf:1
/proc/sys/net/ipv4/tcp_adv_win_scale:1
/proc/sys/net/ipv4/tcp_app_win:31
/proc/sys/net/ipv4/tcp_rmem:4096        87380   6291456
#____________date____________ _Gb/s_ errs drop ovrun frame __rmt_retrans__ flows inflight(K) rcv(K) snd(K) rcalc(K)
Sat Aug 25 21:29:26 CEST 2018  19.91    0    0     0     0 14942 14934   8   320     1952.0     9.0  12288 6.1
Sat Aug 25 21:29:54 CEST 2018   7.32    0    0     0     0 37979 37975   4   320     1952.0     9.0 122880 6.1

- - - - - - - - - - - - - - np04-srv-002 - - - - - - - - - - - - - -
Sat Aug 25 21:29:04 CEST 2018: RX:              on
Sat Aug 25 21:29:04 CEST 2018: RX:              on
Sat Aug 25 21:29:04 CEST 2018: bond0: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST>  mtu 1500
Sat Aug 25 21:29:04 CEST 2018: BW:20000Mb minRTT:0.070ms linkBW*delay: 87500 Bytes
/proc/sys/net/core/rmem_max:212992
/proc/sys/net/core/netdev_max_backlog:1000
/proc/sys/net/ipv4/tcp_moderate_rcvbuf:1
/proc/sys/net/ipv4/tcp_adv_win_scale:1
/proc/sys/net/ipv4/tcp_app_win:31
/proc/sys/net/ipv4/tcp_rmem:4096        87380   6291456
#____________date____________ _Gb/s_ errs drop ovrun frame __rmt_retrans__ flows inflight(K) rcv(K) snd(K) rcalc(K)
Sat Aug 25 21:29:26 CEST 2018  20.00    0    0     0     0 14940 14934   6   320     1952.0     9.0  12288 6.1
Sat Aug 25 21:29:54 CEST 2018   6.21    0    0     0     0 38640 38636   4   320     1952.0     9.0 122880 6.1

- - - - - - - - - - - - - - np04-srv-003 - - - - - - - - - - - - - -
Sat Aug 25 21:29:04 CEST 2018: RX:              on
Sat Aug 25 21:29:04 CEST 2018: RX:              on
Sat Aug 25 21:29:04 CEST 2018: bond0: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST>  mtu 1500
Sat Aug 25 21:29:04 CEST 2018: BW:20000Mb minRTT:0.073ms linkBW*delay: 91250 Bytes
/proc/sys/net/core/rmem_max:212992
/proc/sys/net/core/netdev_max_backlog:1000
/proc/sys/net/ipv4/tcp_moderate_rcvbuf:1
/proc/sys/net/ipv4/tcp_adv_win_scale:1
/proc/sys/net/ipv4/tcp_app_win:31
/proc/sys/net/ipv4/tcp_rmem:4096        87380   6291456
#____________date____________ _Gb/s_ errs drop ovrun frame __rmt_retrans__ flows inflight(K) rcv(K) snd(K) rcalc(K)
Sat Aug 25 21:29:26 CEST 2018  19.85    0    0     0     0 14936 14928   8   320     1952.0     9.0  12288 6.1
Sat Aug 25 21:29:54 CEST 2018   6.90    0    0     0     0 40028 40024   4   320     1952.0     9.0 122880 6.1

- - - - - - - - - - - - - - np04-srv-004 - - - - - - - - - - - - - -
Sat Aug 25 21:29:04 CEST 2018: RX:              on
Sat Aug 25 21:29:04 CEST 2018: RX:              on
Sat Aug 25 21:29:04 CEST 2018: bond0: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST>  mtu 1500
Sat Aug 25 21:29:04 CEST 2018: BW:20000Mb minRTT:0.069ms linkBW*delay: 86250 Bytes
/proc/sys/net/core/rmem_max:212992
/proc/sys/net/core/netdev_max_backlog:1000
/proc/sys/net/ipv4/tcp_moderate_rcvbuf:1
/proc/sys/net/ipv4/tcp_adv_win_scale:1
/proc/sys/net/ipv4/tcp_app_win:31
/proc/sys/net/ipv4/tcp_rmem:4096        87380   6291456
#____________date____________ _Gb/s_ errs drop ovrun frame __rmt_retrans__ flows inflight(K) rcv(K) snd(K) rcalc(K)
Sat Aug 25 21:29:26 CEST 2018  19.92    0    0     0     0 14942 14934   8   320     1952.0     9.0  12288 6.1
Sat Aug 25 21:29:54 CEST 2018   8.32    0    0     0     0 39460 39456   4   320     1952.0     9.0 122880 6.1

--2018-08-25_21:29:55--
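
To compare runs without reading the full output, the interesting columns can be pulled out with a small awk filter. This is a sketch, not part of rgang_iperf.sh: summarize_rows is a hypothetical name, and it assumes data rows have exactly 19 whitespace-separated fields, with _Gb/s_ in field 7, the first of the values under __rmt_retrans__ in field 12, and snd(K) second from last, as in the output above.

```shell
# Print "Gb/s  first-retrans-value  snd(K)" for each data row on stdin.
# Header, /proc and "RX: on" lines have a different field count and are skipped.
summarize_rows() {
    awk 'NF == 19 && $7 ~ /^[0-9]+\.[0-9]+$/ { print $7, $12, $(NF-1) }'
}

# Example using two rows copied from the np04-srv-001 output above.
summarize_rows <<'EOF'
#____________date____________ _Gb/s_ errs drop ovrun frame __rmt_retrans__ flows inflight(K) rcv(K) snd(K) rcalc(K)
Sat Aug 25 21:29:26 CEST 2018  19.91    0    0     0     0 14942 14934   8   320     1952.0     9.0  12288 6.1
Sat Aug 25 21:29:54 CEST 2018   7.32    0    0     0     0 37979 37975   4   320     1952.0     9.0 122880 6.1
EOF
```

The same function can be placed at the end of a pipeline that carries the rgang output, or fed a saved copy of it.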