netgrind
Section: Netgrind Manual (1)
Updated: January 2012
NAME
netgrind - network performance measurement tool
SYNOPSIS
netgrind [general options] [flow options]
DESCRIPTION
netgrind is a distributed network performance measurement tool. Using the netgrind controller, tests can be set up between hosts running netgrindd(1), the netgrind daemon.
Netgrind keeps track of a distributed application's traffic. It shows which node contacts which other nodes and measures the traffic between them. Additionally, netgrind can perform extensive TCP tests. On systems running the Linux kernel, netgrind collects and reports the TCP metrics returned by the TCP_INFO socket option, such as the size of the congestion window or the RTT.
Each test connection is called a flow, and netgrindd has to be running on every host that is supposed to be a flow endpoint. The netgrind controller only sets up the flows and gathers and prints the results; it does not itself participate in the test. Mixing 32-bit and 64-bit systems can, however, lead to unexpected results and is therefore not recommended.
OPTIONS
General information
Two important groups of options are the general options and the flow options. As the name suggests, general options apply globally and potentially affect all flows. Flow-specific options only apply to the subset of flows selected using the -F option.
Miscellaneous
- -h
-
Show help and exit.
- -h [s|g]
-
Show additional help for socket options and traffic generation.
- -v
-
Print version information and exit.
General options
- -b mean1,mean2,mean3
- -b lwr_bound1,upr_bound1,lwr_bound2,upr_bound2,lwr_bound3,upr_bound3
-
Perform an Anderson-Darling test for exponential distribution using the given means or, if lower and upper bounds are given, for uniform distribution within those bounds. Test data is always generated from the first flow.
- -c -begin,-end,-thrpt,-transac,+blocks,-rtt,-iat,-kernel,+status
-
Comma-separated list of column groups to display in the output. Prefix a column group with + to show it or with - to hide it.
- -d
-
Increase debugging verbosity. Add the option multiple times to be even more verbose. Only available if netgrind was configured with ./configure --enable-debug.
- -e PREFIX
-
Prepend PREFIX to log and dump filename (default: "flowlog-").
- -f file
-
When using the -Z option to start a passive measurement, the -f option can supply an IP mapping file. This file is used when the test nodes have different network interfaces for the testing connection and the data connection. The testing connection is the one carrying the data of the measured application; the data connection is the one used for netgrind's own overhead data. The mapping file should have the following format (one line for each node):
test-if-ip:data-if-ip
Additionally, a whitelist is created from the nodes in this file, so that a passive measurement only discovers nodes listed in it.
- -i #.#
-
Reporting interval in seconds (default: 0.05s).
- -l NAME
-
Use log filename NAME instead. If not specified the current time is used for the filename.
- -m
-
Report throughput in 2**20 bytes/second (MiB/s) (default: 10**6 bit/sec (Mbit/s)).
- -n #
-
Number of test flows (default: 1).
- -o
-
Overwrite existing log files (default: don't).
- -p
-
Print numerical values instead of symbolic values (like INT_MAX) (default: off).
- -q
-
Be quiet, do not log to screen (default: off).
- -r #
-
Use random seed # for traffic generation (default: read from /dev/urandom).
- -w
-
Write output to logfile (default: off).
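The mapping file format described for the -f option (one test-if-ip:data-if-ip pair per line) can be illustrated with a small Python sketch. This is not netgrind's own parser; the function name parse_ip_mapping is hypothetical, and the handling of blank lines and the restriction to IPv4 are assumptions.

```python
import ipaddress


def parse_ip_mapping(text):
    """Parse 'test-if-ip:data-if-ip' lines into a mapping and a whitelist.

    Hypothetical helper for illustration only; netgrind's real parser
    may treat blank lines or malformed input differently.
    """
    mapping = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        test_ip, sep, data_ip = line.partition(":")
        if not sep:
            raise ValueError(f"line {lineno}: expected test-if-ip:data-if-ip")
        # Validate both sides as IPv4 addresses. The colon separator would
        # be ambiguous for IPv6, so only IPv4 is handled in this sketch.
        test_ip = str(ipaddress.IPv4Address(test_ip.strip()))
        data_ip = str(ipaddress.IPv4Address(data_ip.strip()))
        mapping[test_ip] = data_ip
    # Only nodes listed in the file may be discovered.
    whitelist = set(mapping)
    return mapping, whitelist


example = """192.168.9.2:192.168.10.2
192.168.9.3:192.168.10.3
192.168.9.4:192.168.10.4
"""
mapping, whitelist = parse_ip_mapping(example)
print(mapping["192.168.9.3"])
```

The dictionary translates a discovered test-interface address into the address the daemon listens on, and the set of keys plays the role of the whitelist.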
Flow options
All flows have two endpoints, a source and a destination. The distinction between source and destination endpoints only affects connection establishment. When starting a flow the destination endpoint listens on a socket and the source endpoint connects to it. For the actual test this makes no difference, both endpoints have exactly the same capabilities. Data can be sent in either direction and many settings can be configured individually for each endpoint.
Some of these options take the flow endpoint as an argument. It is denoted by 'x' in the option syntax. 'x' needs to be replaced with either 's' for the source endpoint, 'd' for the destination endpoint or 'b' for both endpoints. To specify different values for the two endpoints, separate them by a comma.
Example: -T s=5,d=10
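The endpoint notation can be made concrete with a short Python sketch that expands an argument such as s=5,d=10 or b=5 into one value per endpoint. The function name parse_endpoint_values is hypothetical and the code only covers simple single-value options like -T; options whose values themselves contain commas (for example -G) would need a different split.

```python
def parse_endpoint_values(spec):
    """Expand 's=5,d=10' or 'b=5' into a dict keyed by 's' and 'd'.

    Illustrative only: assumes each comma-separated part has the form
    endpoint=value and that values contain no commas.
    """
    values = {}
    for part in spec.split(","):
        endpoint, sep, value = part.partition("=")
        if not sep or endpoint not in ("s", "d", "b"):
            raise ValueError(f"bad endpoint specification: {part!r}")
        # 'b' stands for both endpoints.
        targets = ("s", "d") if endpoint == "b" else (endpoint,)
        for target in targets:
            values[target] = value
    return values


print(parse_endpoint_values("s=5,d=10"))
print(parse_endpoint_values("b=8192"))
```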
- -A x
-
Activate minimal request and response size for RTT calculation (same as -G p=C,24).
- -B x=#
-
Set requested sending buffer in bytes.
- -C x
-
Stop flow if it is experiencing local congestion.
- -D x=DSCP
-
DSCP value for type-of-service IP header byte.
- -E x
-
Enumerate bytes in payload (default: don't).
- -F #[,#]*
-
Comma-separated list of flows.
Flow options following this option apply only to the specified flows. Useful in combination with -n to set specific options for certain flows.
Numbering starts with 0, so -F 1 refers to the second flow.
All flow options before the first -F apply to all flows.
- -G x=[q|p|g],[C|E|P|N|U],#1,(#2)
-
Activate stochastic traffic generation and set parameters according to the used distribution.
- -H x=HOST[/RPCADDRESS[:PORT]]
-
Test from/to HOST. Optional argument is the address and port of the RPC server.
An endpoint that isn't specified is assumed to be 127.0.0.1.
- -L
-
Call connect() on test socket immediately before starting to send data (late connect).
If not specified the test connection is established in the preparation phase before the test starts.
- -N
-
Call shutdown() after test flow is scheduled to end.
- -M x
-
Dump traffic using libpcap.
- -O x=OPT
-
Set specific socket options on test socket.
For a list of supported socket options see the -h s option.
- -P x
-
Do not iterate through select() to continue sending in case block size did not suffice to fill sending queue (pushy).
- -Q
-
Summarize only, skip interval reports (quiet).
- -R x=#.#[z|k|M|G][b|y|B][p|P]
-
Rate limiting. Send data at the specified rate per second, where:
z = 2**0, k = 2**10, M = 2**20, G = 2**30.
b = bits per second (default), y = bytes per second, B = blocks per second.
p = periodic (default), P = Poisson distributed.
- -T x=#.#
-
Set flow duration, in seconds (default: s=10,d=0).
- -U #
-
Set application buffer size (default: 8192).
Truncates values if used with stochastic traffic generation.
- -S x=#
-
Set block size, same as -G s=q,C,#.
- -W x=#
-
Set requested receiver buffer (advertised window) in bytes.
- -X file
-
Set a script file to be run on node discovery in -Z passive mode. The script must be located on the machine running the netgrindd daemon; it is invoked on that node as soon as the node is discovered in a passive netgrind measurement.
- -Y x=#.#
-
Set initial delay before the host starts to send data.
- -Z s=#.#.#.#(/#.#.#.#)
-
Start a passive measurement. A passive measurement starts at a given node (s=test-if-ip(/data-if-ip)) and watches this node's connections. New outgoing or incoming connections trigger netgrind to include the respective other node in the measurement and to watch its connections as well. To limit the IP range for node discovery, see the -f option.
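To illustrate the multipliers and units listed for the -R rate limiting option, the following Python sketch converts a rate string into bytes per second. It is not netgrind's parser: the function name rate_to_bytes_per_second is hypothetical, treating an omitted multiplier as 1 is an assumption, and the block size needed for the B unit is passed in explicitly because the manual does not say which value netgrind uses.

```python
import re

# Multipliers as listed for -R; an empty multiplier is assumed to mean 1.
_MULTIPLIER = {"": 1, "z": 2**0, "k": 2**10, "M": 2**20, "G": 2**30}
# number, optional multiplier, optional unit, optional send pattern
_RATE_RE = re.compile(r"^(\d+(?:\.\d+)?)([zkMG]?)([byB]?)([pP]?)$")


def rate_to_bytes_per_second(spec, block_size=None):
    """Convert a rate such as '1Mb' or '2ky' into bytes per second.

    The trailing p/P (periodic or Poisson) is accepted but ignored,
    because it affects timing, not the mean rate.
    """
    match = _RATE_RE.match(spec)
    if match is None:
        raise ValueError(f"cannot parse rate: {spec!r}")
    number, multiplier, unit, _pattern = match.groups()
    rate = float(number) * _MULTIPLIER[multiplier]
    if unit in ("", "b"):  # bits per second is the default unit
        return rate / 8
    if unit == "y":  # bytes per second
        return rate
    # unit == "B": blocks per second, needs a block size in bytes
    if block_size is None:
        raise ValueError("block_size is required for the B unit")
    return rate * block_size


print(rate_to_bytes_per_second("1Mb"))
print(rate_to_bytes_per_second("10zB", block_size=100))
```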
Traffic Generation Options
-G x=[q|p|g],[C|U|E|N|L|P|W],#1,(#2)
Activate stochastic traffic generation and set parameters for the chosen distribution.
Use the distribution for one of the following flow parameters:
q request size (in bytes)
p response size (in bytes)
g request interpacket gap (in s)
Possible distributions:
C constant (param 1: value, param 2: not used)
U uniform (param 1: min, param 2: max)
E exponential (param 1: lambda - lifetime, param 2: not used)
N normal (param 1: mu - mean value, param 2: sigma_square - variance)
P pareto (param 1: k - shape, param 2: x_min - scale)
W weibull (param 1: lambda - scale, param 2: k - shape)
L lognormal (param 1: zeta - mean value, param 2: sigma - std dev)
Advanced distributions like Weibull are only available if netgrind is compiled with libgsl support.
-U # specifies a cap for the calculated request and response sizes. It is needed because values drawn from the advanced distributions are unbounded, while the buffer size must be known in advance (it is not needed for constant values or the uniform distribution). Values outside the bounds are recalculated until a valid result occurs, but at most 10 times (after that the bound value is used).
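The retry rule for out-of-bounds values can be sketched in a few lines of Python. This is an illustration under assumptions, not netgrind's code: the name draw_bounded is hypothetical, and because the manual does not say which bound is used after the tenth failed attempt, the sketch takes the bound nearest to the last drawn value.

```python
import random


def draw_bounded(sample, lower, upper, max_tries=10):
    """Draw from sample() until the value lies within [lower, upper].

    After max_tries failed draws the last value is clamped to the
    nearest bound (an assumption about which bound is meant).
    """
    value = lower
    for _ in range(max_tries):
        value = sample()
        if lower <= value <= upper:
            return value
    return min(max(value, lower), upper)


# Example: normally distributed response size with mean 2000 bytes and
# variance 50, capped at 32000 bytes as with -U 32000.
rng = random.Random(42)
size = draw_bounded(lambda: rng.normalvariate(2000, 50 ** 0.5), 1, 32000)
print(size)
```

Passing the sampler as a callable keeps the retry logic independent of the chosen distribution.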
EXAMPLES
- netgrind
-
Default settings, same as netgrind -H b=127.0.0.1 -T s=10,d=0
- netgrind -H s=host1,d=host2
-
Start bulk TCP transfer with host1 as source and host2 as destination endpoint. Both endpoints need to be running the netgrind daemon. The default flow options are used, with a flow of 10 seconds duration with data sent from the source to the destination endpoint.
- netgrind -H s=host1,d=host2 -T s=0,d=10
-
Same as the above but instead with a flow sending data for 10 seconds from the destination to the source endpoint.
- netgrind -f mapping-file -Z s=192.168.9.3/192.168.10.3
-
Example for the mapping-file:
192.168.9.2:192.168.10.2
192.168.9.3:192.168.10.3
192.168.9.4:192.168.10.4
Assume we have a mesh network where the mesh nodes' wireless interfaces work in the 192.168.9.0/24 net and their wired interfaces in the 192.168.10.0/24 net. We choose one node (possibly one with a good connection to many others) on which we start the passive measurement. Netgrind's node discovery will recognize new nodes only by their mesh IP, but the nodes listen for netgrind connections only on their wired IP. With the mapping file, the discovered mesh IP can automatically be translated into the wired IP of the respective node.
- netgrind -f mapping-file -X /path/to/script -Z s=192.168.9.3/192.168.10.3
-
The same as above, but the supplied script (located on the mesh nodes) is run immediately on the start-node, and on new nodes as soon as they are discovered by netgrind.
- netgrind -n 2 -F 0 -H s=192.168.0.1,d=192.168.0.69 -F 1 -H s=10.0.0.1,d=10.0.0.2
-
Set up two flows: the first flow between 192.168.0.1 and 192.168.0.69, the second flow between 10.0.0.1 and 10.0.0.2.
- netgrind -p -H s=10.0.0.100/192.168.1.100,d=10.0.0.101/192.168.1.101 -A s
-
Set up one flow between 10.0.0.100 and 10.0.0.101 and use the 192.168.1.x IP addresses for configuration. Activate minimal response for RTT calculation and show numerical values.
- netgrind -G s=q,C,400 -G s=p,N,2000,50 -G s=g,U,0.005,0.01 -U 32000
-
q,C,400
use constant request size of 400 bytes
p,N,2000,50
use normally distributed response size with mean 2000 bytes and variance 50
g,U,0.005,0.01
use uniformly distributed interpacket gap with min 0.005 s and max 0.01 s
-U 32000
truncate block sizes at 32 kbytes (needed for normal distribution)
Traffic Generation Scenarios
The following examples demonstrate how traffic generation can be used. They have been incorporated in different tests for netgrind and have proven meaningful. However, as Internet traffic is diverse, there is no guarantee that they are appropriate in every situation.
- Request Response Style (HTTP)
- This scenario is based on the work in http://www.3gpp2.org/Public_html/specs/C.R1002-0_v1.0_041221.pdf
- netgrind -r 42 -M s -G s=q,C,350 -G s=p,L,9055,115.17 -U 100000
- -r 42
- Use random seed 42 to make measurements reproducible
- -M s
- Dump traffic on sender side
- -G s=q,C,350
- Use constant request size of 350 bytes
- -G s=p,L,9055,115.17
- Use lognormal distribution with mean 9055 and variance 115.17 for response size
- -U 100000
- Truncate response at 100 kbytes
- For this scenario we recommend focusing on RTT (lower values are better) and network transactions/s (higher values are better) as metrics.
-
- Interactive Session (Telnet)
- This scenario emulates a telnet session.
- netgrind -G s=q,U,40,10000 -G s=p,U,40,10000 -O b=TCP_NODELAY
- -G s=q,U,40,10000 -G s=p,U,40,10000
- Use uniformly distributed request and response sizes between 40 bytes and 10 kilobytes
- -O b=TCP_NODELAY
- Set socket option TCP_NODELAY as used by telnet applications.
- For this scenario RTT (lower is better) and Network Transactions/s are useful metrics (higher is better).
-
- Rate Limited (Streaming Media)
- This scenario emulates a video stream transfer with a bitrate of 800 kbit/s.
- netgrind -G s=q,C,800 -G s=g,N,0.008,0.001
- Use normally distributed interpacket gap with mean 0.008 and a small variance (0.001). In conjunction with a request size of 800 bytes, an average bitrate of approx. 800 kbit/s is achieved. The variance is added to emulate a variable bitrate as used in today's video codecs.
- For this scenario the IAT (lower is better) and minimal throughput (higher is better) are interesting metrics.
-
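The bitrate quoted for the streaming scenario follows directly from the request size and the mean interpacket gap. A minimal Python check, with the hypothetical helper name mean_bitrate_kbit and kbit taken as 1000 bits:

```python
def mean_bitrate_kbit(request_size_bytes, mean_gap_seconds):
    """Average bitrate in kbit/s for one request per mean gap."""
    bits_per_request = request_size_bytes * 8
    return bits_per_request / mean_gap_seconds / 1000


# 800-byte requests sent on average every 0.008 s,
# which comes out at roughly 800 kbit/s.
print(round(mean_bitrate_kbit(800, 0.008)))
```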
OUTPUT COLUMNS
- #
-
The endpoint, either S for source or D for destination.
- ID
-
The numerical flow identifier.
- begin and end
-
The boundaries of the measuring interval in seconds. The time shown is the elapsed time since receiving the RPC message to start the test, from the daemon's point of view.
Application layer metrics
- through
-
The transmitting goodput of the flow endpoint during this measurement interval, measured in Mbit/s (default) or MiB/s (-m).
- transac
-
The number of successfully received response blocks per second (we call it network transactions/s).
- requ/resp
-
The number of request and response blocks sent during this measurement interval (column disabled by default).
- IAT and RTT
-
The one-way and two-way block (application layer) delays, namely the block inter-arrival time (IAT) and the block round-trip time (RTT). For both delays the minimum and maximum values encountered in that interval are displayed in addition to the arithmetic mean. If no block acknowledgement arrived during the report interval, inf is displayed (for example when no responses are sent; if in doubt, try -A s).
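The two units available for the through column differ in both base and scale: the default counts bits in powers of ten, while -m counts bytes in powers of two. A small Python sketch of the conversion, assuming goodput is simply payload bytes divided by the interval length (the function name goodput is hypothetical):

```python
def goodput(payload_bytes, interval_seconds, mebibytes=False):
    """Goodput for one reporting interval.

    Returns Mbit/s (10**6 bit/s) by default, or MiB/s (2**20 bytes/s)
    when mebibytes is True, mirroring the -m option.
    """
    if mebibytes:
        return payload_bytes / interval_seconds / 2**20
    return payload_bytes * 8 / interval_seconds / 10**6


# The same 2**20 bytes transferred in one second, in both units.
print(goodput(2**20, 1.0))
print(goodput(2**20, 1.0, mebibytes=True))
```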
Kernel metrics (TCP_INFO)
- cwnd (tcpi_cwnd)
-
Size of TCP congestion window in number of segments. All TCP specific metrics are obtained from the Linux kernel through the TCP_INFO socket option at the end of every reporting interval.
- ssth (tcpi_snd_ssthresh)
-
The slowstart threshold of the sender in number of segments.
- uack (tcpi_unacked) and sack (tcpi_sacked)
-
Statistics about the number of unacknowledged and selectively acknowledged segments.
- lost (tcpi_lost)
-
Number of segments assumed lost at the end of the reporting interval.
- retr (tcpi_retrans)
-
Number of unacknowledged retransmitted segments.
- tret (tcpi_retransmits)
-
Number of retransmissions of the same segment due to a retransmission timeout.
- fack (tcpi_fackets)
-
Number of segments between SND.UNA and the highest selectively acknowledged sequence number.
- reor (tcpi_reordering)
-
Segment reordering metric. The Linux kernel can detect and cope with reordering without loss of performance if the distance a segment gets displaced does not exceed the reordering metric.
- rtt (tcpi_rtt) and rttvar (tcpi_rttvar)
-
TCP round-trip time and its variance given in ms.
- rto (tcpi_rto)
-
The retransmission timeout given in ms.
- bkof (tcpi_backoff)
-
Number of backoffs.
- ca state (tcpi_ca_state)
-
Internal state of congestion control state machine as implemented in the Linux kernel. Can be one of open, disorder, cwr, recovery or loss.
-
- Open
-
is the normal state. It indicates that there are no issues with the connection.
- Disorder
-
is similar to Open but is entered upon receiving duplicate ACKs or selective acknowledgements, as special attention might be needed in the near future.
- CWR
-
is entered when the size of the congestion window got lowered due to receiving an ICMP Source Quench message or a notification from Explicit Congestion Notification (ECN).
- Recovery
-
indicates that the congestion window got lowered and a segment is fast-retransmitted.
- Loss
- is entered if the RTO expires. Again the size of the congestion window got lowered in this state.
-
- smss and pmtu
-
Sender maximum segment size and path maximum transmission unit in bytes.
Internal netgrind state (only enabled in debug builds)
- status
-
The state of the flow inside netgrind for diagnostic purposes. It is a tuple of two values, the first for sending and the second for receiving. Ideally the states of both the source and destination endpoints of a flow should be symmetrical but since they are not synchronized they may not change at the same time. The possible values are:
-
- c
-
Direction completed sending/receiving.
- d
-
Waiting for initial delay.
- f
-
Fault state.
- l
-
Active state, nothing yet transmitted or received.
- n
-
Normal activity, some data got transmitted or received.
- o
- Flow has zero duration in that direction, no data is going to be exchanged.
-
SEE ALSO
netgrindd(1), netgrind-stop(1)
This document was created by man2html, using the manual pages.
Time: 09:47:55 GMT, January 31, 2012


