[e2e] Re: Are you interested in TOEs and related issues

Sunay Tripathi Sunay.Tripathi at eng.sun.com
Tue Mar 9 16:58:45 PST 2004


> Sunay Tripathi <Sunay.Tripathi at eng.sun.com> writes:
> 
> > 1) Extra processor(s) buried in the TOE for networking processing which is
> >    hidden from the kernel and leaves the host CPU to do more application
> >    related work. Saves the cost of licences for application which take
> >    number of CPU into account (oracle is one such application cited).
> 
> One would hope TCP design would not be guided by Oracle licensing quirks.

I don't think we are designing TCP here. We are just discussing an
implementation of TCP and obviously costs do play a factor in any
implementation. 

> > 2) On low end (1-2 CPU) x86 based machines, cost of adding a processor
> >    is much higher than adding a TOE (I personally haven't verified this).
> 
> It's not obvious why this should necessarily be the case, given that
> it is likely that there will continue to be quite a bit more
> general-purpose CPUs made than TCP offload engines.
> 
> > 3) For the up and coming 10Gb NICs, TOE will help saturate the link. Some
> >    vendors assert that TOE will be required to support 10Gb NICs.
> 
> Right now, one can do almost 8Gb/s with a single TCP stream over 10GE
> (let's say 5 or 6Gb/s with more common hardware).  The limiting factor
> is currently host bus speed.  There's nothing (except for compression
> on the bus side, but it should be clear we're not going there) that an
> offload engine can do about host bus speed.  By the time host busses
> faster than 10Gb/s are commonly available, pretty routine x86 box
> should handle 10GE saturation.

We have host busses today which can handle a gigabyte-plus per second
on the backplane. InfiniBand-based machines are already hitting that,
and there are other technologies (that I can't talk about yet) which
will make the host bus issues irrelevant.

> > 4) Performance reasons. Just the LSO aspect of TOE (sending large chunks of
> >    data and letting the TOE split it up in mss size pieces) and ack
> >    coalescing gives a pretty good boost (our own prototypes indicates that
> >    this is true). The gains are by optimizing data movement and not by
> >    offloading protocol processing.
> 
> Here I would agree with the unnamed TOE vendor whom you're
> paraphrasing (and with Jerry Chu's comments in this thread).  Ethernet
> frame size (1500 bytes or even 9kB) is very small; we'd be in a better
> world if we could specify the MTU in units of time (the number of CRC
> bits would have to scale up as well, of course).  TCP offload engines
> could be a kludge that would help to work around this deficiency in
> Ethernet.
> 
> Note that since even at 10Gb/s one CPU is enough to saturate the link
> with 9kB packets, the need for this work-around is not at all
> pressing.  Given the potential harmful effects (undetected errors on
> the host bus, difficulty in patching stale or buggy TCP code, etc.),
> one would probably be better served by concentrating on the deployment
> of jumbo frames.  The investment to support jumbo frames has largely
> already been made, so why not extract all we can from it first?

Sure, jumbo frames do deserve attention. It's being discussed right now
in the IETF, but I am not sure 9k frames would be enough to saturate a
10Gb NIC until processor speeds also scale significantly. Note that
it's not just the ability to saturate the link; you also need to
have some CPU free to do real work as well.
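
To put rough numbers on the per-packet load, here is a back-of-the-envelope
sketch. It assumes a nominal 10Gb/s line rate and ignores Ethernet framing
overhead (preamble, inter-frame gap, headers), so the figures are upper
bounds on packet rate and not measurements. The function names are mine,
for illustration only.

```python
# Rough per-packet time budget on a 10Gb/s link.
# Assumption: nominal line rate, no framing overhead counted.

LINK_BITS_PER_SEC = 10_000_000_000  # 10Gb/s


def packets_per_second(frame_bytes, link_bps=LINK_BITS_PER_SEC):
    """Packets per second needed to fill the link with frames of this size."""
    return link_bps / (frame_bytes * 8)


def ns_per_packet(frame_bytes, link_bps=LINK_BITS_PER_SEC):
    """Nanoseconds available per packet if the link is kept full."""
    return 1e9 / packets_per_second(frame_bytes, link_bps)


if __name__ == "__main__":
    for size in (1500, 9000):
        print(f"{size:5d}-byte frames: "
              f"{packets_per_second(size):12.0f} packets/s, "
              f"{ns_per_packet(size):7.0f} ns per packet")
```

Under these assumptions the host has about 1.2 microseconds per 1500-byte
frame and about 7.2 microseconds per 9k frame, and that budget has to cover
protocol processing, data movement and whatever the application itself
needs to do.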

Cheers,
Sunay

> 
> -- 
> Stanislav Shalunov		http://www.internet2.edu/~shalunov/
> 


-- 
Sunay Tripathi
Solaris Kernel Networking,
Sun MicroSystems Inc.

email: sunay at eng.sun.com		 Phone:	650-786-6007 (W)





