Friday, October 28, 2011

Wednesday, October 5, 2011

Bandwidth, Latency and Throughput

Here is what I understand about these (very basic) things. Imagine that the communication is happening between point A and point B of a wire; one may also think of these as two points on a road, which may be easier to picture.
  1. Bandwidth = The MAXIMUM amount of data (or number of cars) that can pass a certain mark (say point A) per unit of time, e.g. 1 sec. For networks it is measured in bits/sec. E.g. if a road can carry at most 50 cars/min, then the bandwidth (bw) of the road is 50 cars/min. Note that bandwidth can vary across different sections of the network pipe, and that having more "lanes" usually means having more bandwidth. If the road bw is 50 cars/min but currently only 30 cars/min are passing, then we say that the "throughput" of the road is 30 cars/min.
  2. Delay or Latency = The time taken for some bits (or cars) to travel, or be retrieved over, some "length" of the pipe (e.g. from A to B).
  3. Bandwidth-Delay product (BDP) = Imagine some packets/cars are trying to move from A to B. The delay between A and B is, say, 30 seconds. Now let's say we measure at point A that the bandwidth of the pipe is 50 packets/sec (or 50 cars/sec). Then the number of packets that will be on the wire/road by the time the 1st packet/car reaches point B is called the Bandwidth-Delay product. In our case it is 50 x 30 = 1500 packets (or cars), i.e. 1500 packets will have been pumped into the network by the time the first one reaches point B (assuming no change in network/road conditions meanwhile). Note that in this situation the section between A and B is fully occupied with packets/cars, and if we try to put in additional packets they would have to be queued or dropped. This is why the BDP is roughly the largest useful cwnd (outstanding packets) for TCP. For TCP the delay used is normally the round-trip time, since the sender has to wait for acknowledgements.
  4. Flow =  RFC 2722 defines traffic flow as "an artificial logical equivalent to a call or connection".
  5. Flow Rate = Rate at which data is delivered for a particular flow, in bits/sec. Sometimes called "flow bandwidth" or "effective bandwidth". 
  6. Throughput = The rate at which packets are received by the measurement apparatus at the destination (point B in our example) for the flow in question, measured in bps. In a lossless pipe, throughput is the same as the flow rate. In a lossy medium, throughput can be less than the flow rate (as packets may be dropped). Throughput depends on the bandwidth as well as the latency. People have also run experiments on how "power" affects "throughput" for a given data rate (e.g. in wireless networks). People also measure the rate of change of throughput, in bps/s.
  7. Goodput = Application level Throughput.
  8. Makespan = The time difference between the start of a job and its end, measured in seconds. E.g. when we download something from the internet, the time the download takes is the makespan of the schedule that allowed the download. For a given amount of data, high goodput means low makespan.
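
To make the arithmetic in the list above concrete, here is a minimal Python sketch. The function names are my own, and the download size and goodput in the last example are made-up numbers for illustration only; the 50, 30 and 1500 figures are the ones used in the definitions.

```python
def bandwidth_delay_product(bandwidth_per_sec: float, delay_sec: float) -> float:
    """Amount of data in flight when the pipe is full: bandwidth x delay."""
    return bandwidth_per_sec * delay_sec


def utilization(throughput_per_unit: float, bandwidth_per_unit: float) -> float:
    """Fraction of the maximum capacity that is actually being used."""
    return throughput_per_unit / bandwidth_per_unit


def makespan_sec(size_bits: float, goodput_bps: float) -> float:
    """Time to finish a transfer of size_bits at a steady application-level rate."""
    return size_bits / goodput_bps


# BDP example from the list: 50 packets/sec with a 30 sec delay.
print(bandwidth_delay_product(50, 30))        # 1500 packets in flight

# Road example from the list: 30 cars/min on a road that can carry 50 cars/min.
print(utilization(30, 50))                    # 0.6 of the bandwidth is used

# Hypothetical download: 80 Mbit at a goodput of 8 Mbit/s.
print(makespan_sec(80_000_000, 8_000_000))    # 10.0 seconds
```

Doubling the goodput in the last call halves the makespan, which is the "high goodput means low makespan" relation for a fixed amount of data.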
Caching: For a given bandwidth of the pipe/road, if the delay to the destination is smaller, then the first packet reaches point B sooner, and the response gets back to point A sooner too. Thus a smaller delay (which also means a smaller BDP) means that the same amount of data is retrieved in less time, i.e. more bytes are retrieved per time interval with a cache than without one. Basically there is less "idle" time in the schedule.
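
A rough way to see the caching effect is a very simplified request/response model: wait one round trip, then receive the payload at full bandwidth. This is only a sketch under that assumption (no TCP slow start, no loss), and the object size, bandwidth and round-trip times below are hypothetical numbers picked for illustration.

```python
def fetch_time_sec(size_bits: float, bandwidth_bps: float, rtt_sec: float) -> float:
    """One request/response: one round trip of idle waiting, then the payload."""
    return rtt_sec + size_bits / bandwidth_bps


def effective_throughput_bps(size_bits: float, bandwidth_bps: float, rtt_sec: float) -> float:
    """Bits retrieved divided by the total time it took to retrieve them."""
    return size_bits / fetch_time_sec(size_bits, bandwidth_bps, rtt_sec)


SIZE_BITS = 1_000_000        # hypothetical 1 Mbit object
BANDWIDTH_BPS = 10_000_000   # hypothetical 10 Mbit/s pipe, same in both cases

# Same pipe, but the nearby cache has a much smaller round-trip time.
from_origin = effective_throughput_bps(SIZE_BITS, BANDWIDTH_BPS, rtt_sec=0.2)
from_cache = effective_throughput_bps(SIZE_BITS, BANDWIDTH_BPS, rtt_sec=0.02)

print(f"origin: {from_origin / 1e6:.2f} Mbit/s")
print(f"cache:  {from_cache / 1e6:.2f} Mbit/s")
```

The bandwidth is identical in both calls; only the idle waiting time changes, and with it the number of bits retrieved per second.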

Similarly, if we increase the bandwidth but keep the latency the same, we can potentially retrieve more data per second, thus increasing throughput.
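
The word "potentially" matters here. A common simplification (an assumption of this sketch, not a full TCP model) is that a sender with a fixed window of outstanding data can move at most one window per round trip, so throughput is capped by both the bandwidth and window/RTT. The window size, RTT and bandwidths below are hypothetical.

```python
def window_limited_throughput_bps(bandwidth_bps: float, window_bits: float, rtt_sec: float) -> float:
    """Throughput is capped by the pipe and by one window of data per round trip."""
    return min(bandwidth_bps, window_bits / rtt_sec)


RTT_SEC = 0.1              # hypothetical round-trip time, kept the same throughout
WINDOW_BITS = 2_000_000    # hypothetical window of outstanding data

for bandwidth_bps in (10_000_000, 20_000_000, 40_000_000):
    rate = window_limited_throughput_bps(bandwidth_bps, WINDOW_BITS, RTT_SEC)
    print(f"bandwidth {bandwidth_bps / 1e6:.0f} Mbit/s -> throughput {rate / 1e6:.0f} Mbit/s")
```

In this model, going from 10 to 20 Mbit/s doubles the throughput, but going from 20 to 40 Mbit/s changes nothing, because the window (2 Mbit per 0.1 sec) is then smaller than the bandwidth-delay product of the pipe.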


Ping for measuring throughput: I have seen on several pages such as this that using Ping for measuring bandwidth is a fallacy. I think that under conditions where the bandwidth is not changing much, we can use Ping for measuring throughput, i.e. I don't agree with the author of that article that Ping cannot be used for flow bandwidth estimation. Of course one cannot know the bandwidth from the flow bandwidth, as the flow bandwidth may be affected by routing (e.g. the priority of the ping packet in the scheduler).
Some other folks have used Ping with varying packet sizes to see how the packet size affects ping throughput. Bandwidth estimation is an active research area, e.g. see this and this.
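
Here is a sketch of how the varying-packet-size idea could work, under strong assumptions that I am making up for illustration: a single symmetric link, an echo reply that carries the same payload back, and no queueing, so that RTT = base delay + 2 x payload bits / bandwidth. The samples are synthetic, generated from that model rather than measured, and the function names are my own. Real ping data would be noisier and may be skewed by the scheduler priority issue mentioned above.

```python
def ping_throughput_bps(payload_bytes: int, rtt_sec: float) -> float:
    """The payload goes out and comes back once per RTT, so 2 x payload bits move."""
    return 2 * payload_bytes * 8 / rtt_sec


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_bandwidth_bps(samples):
    """samples: (payload_bytes, rtt_sec) pairs. Returns (bandwidth_bps, base_rtt_sec)."""
    sizes = [size for size, _ in samples]
    rtts = [rtt for _, rtt in samples]
    slope, intercept = fit_line(sizes, rtts)   # slope is seconds per payload byte
    return 16 / slope, intercept               # 2 directions x 8 bits per byte


# Synthetic samples from the assumed model (NOT real measurements).
TRUE_BANDWIDTH_BPS = 1_000_000
BASE_RTT_SEC = 0.05
SAMPLES = [(size, BASE_RTT_SEC + 2 * size * 8 / TRUE_BANDWIDTH_BPS)
           for size in (64, 256, 512, 1024, 1400)]

for size, rtt in SAMPLES:
    print(f"{size:5d} bytes: ping throughput {ping_throughput_bps(size, rtt) / 1e3:.1f} kbit/s")

bandwidth_bps, base_rtt_sec = estimate_bandwidth_bps(SAMPLES)
print(f"estimated bandwidth {bandwidth_bps / 1e6:.2f} Mbit/s, base RTT {base_rtt_sec * 1e3:.1f} ms")
```

In this model the ping throughput grows with the packet size, because the fixed base delay is spread over more bits, while the slope of RTT against size is what reflects the bandwidth.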

Also, here is a list of other performance tools useful for measuring effective bandwidth or throughput: http://www.caida.org/tools/taxonomy/perftaxonomy.xml, and