[e2e] More 1st principles ...

Black_David at emc.com
Fri Apr 30 14:01:49 PDT 2004


Regarding high-speed networking, Jon Crowcroft asked:

> but why are we considering this as very important? how many 
> times does a TCP session actually  witness a link with the
> characteristics cited (and tested against) in the real world,
> e.g., 100ms, 1-10Gbps?, without ANY OTHER sessions present?

By that time, things have gotten to the point of being clearly
unacceptable; the problems start well before one session is
consuming a multi-Gbps link.  Jon's use of "ANY OTHER" is
revealing, as storage systems are coping with TCP's excessive
backoff at large window sizes in the same fashion as grid FTP
- via multiple parallel TCP sessions.  The underlying problem
is that TCP's segment size has not scaled up with window size,
so while the MD portion of AIMD scales with window size, the
AI does not - parallel sessions knock the MD penalty back
down towards the segment size that hasn't scaled up.
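
To put rough numbers on the scaling argument above, here is a small
sketch. It assumes textbook Reno-style AIMD (halve the window on a loss,
then add one segment per RTT), a 1460-byte segment, and a single loss
that hits only one flow; the function names are made up for illustration
and none of this models a particular TCP stack or product.

```python
def window_segments(link_bps, rtt_s, mss_bytes):
    """Segments in flight needed to fill the link (bandwidth-delay product / MSS)."""
    return link_bps * rtt_s / (8 * mss_bytes)


def single_loss_cost(link_bps, rtt_s, mss_bytes, flows):
    """Aggregate effect of one loss when the link is shared equally by `flows` sessions.

    Returns (fraction of aggregate window lost, seconds to recover it).
    Assumption: only the one flow that saw the loss backs off.
    """
    total = window_segments(link_bps, rtt_s, mss_bytes)
    per_flow = total / flows
    drop = per_flow / 2          # MD: proportional to that flow's window
    recovery_rtts = drop / 1.0   # AI: one segment per RTT, independent of window
    return drop / total, recovery_rtts * rtt_s


if __name__ == "__main__":
    # Assumed example path: 1 Gbps, 100 ms RTT, 1460-byte segments.
    for n in (1, 10):
        frac, secs = single_loss_cost(1e9, 0.100, 1460, n)
        print(f"{n:2d} flow(s): lose {frac:.1%} of aggregate window, "
              f"recover in about {secs:.0f} s")
```

Under those assumptions one session needs several minutes to recover
from a single loss, while ten parallel sessions lose a tenth as much and
recover ten times faster, which is the effect the parallel-session
workaround exploits.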

> in the test cases, we often see people spending some time at
> places like CERN, SLAC, and so forth, waiting for a new
> optical link to be commissioned, before they get to be
> able to run their experiment - how long does one have to
> wait before the link routinely has 100s or 1000s of flows
> on it? at which point why are we trying to get 100% of the 
> capacity in less than the normal time

In the storage world, we expect to saturate 1-10Gbps links
used for storage replication with far fewer flows (10s at most).

> another side to this motivational chasm seems to me to be: if 
> we have a really really large file to transfer, does
> it matter if we have to wait 100s of RTTs before we get to 
> near line rate? frankly, if it's a matter of a GRID FTP
> to move a bunch of astro, or HEP or genome data, then there's 
> going to be hours if not days of CPU time at the far
> end anyhow, so a few 10s of seconds to get up to line rate
> is really neither here nor there (and there are of
> course more than 1 HEP physicist going to be waiting for LHC
> data, and more than one geneticist looking at genome
> data, so again, what is the SHARE of the link we are
> targeting to get to?)?

The implied assumption that the only use for high-speed
networking is "ftp on steroids" (large file transfer) is
*very* wrong.  I'll avoid dropping product names, but I
invite Jon and anyone else who suffers from this
misimpression to explore emc.com to understand what storage
replication products exist *today*, and obtain some idea of
what they're used for.  Storage replication connects
storage systems that have significantly less patience
than HEP physicists when data doesn't arrive.  OTOH,
I think storage replication is somewhat behind the leading
edge of the grid community in bandwidth requirements - while
we don't always need the multi-Gbps bandwidth one hears
about in the grid world, we can put it to use, and a T3
is a relatively slow link.

Thanks,
--David
----------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 176 South St., Hopkinton, MA  01748
+1 (508) 293-7953             FAX: +1 (508) 293-7786
black_david at emc.com        Mobile: +1 (978) 394-7754
----------------------------------------------------