[e2e] GRID Network Research

Wed May 8 14:23:21 PDT 2002

Hi Jon

   What does GRID stand for, and what is its web site?

   Hesham

-----Original Message-----
From: Jon Crowcroft [mailto:Jon.Crowcroft at cl.cam.ac.uk]
Sent: Wednesday, May 08, 2002 9:11 AM
To: end2end-interest at postel.org
Subject: [e2e] GRID Network Research

the GGF (see usual web site) are interested in input on this:
(we will also generate a doc about the 10 things grid people wish
network people knew!)

comments to me!

 cheers

   jon

Outline Draft for:
Top ten things network engineers wish grid programmers knew

TBA:
Top ten things grid programmers wish network engineers knew 

Jon Crowcroft et al...26/4/2002

Abstract 

This is a draft contribution for a document for the GHPNRG 
(see http://www.cs.wm.edu/~lowekamp/ghpnrg.html
amongst other places)
which is meant to list topics that the network community is
working on and is sometimes asked alarming questions about by
folks who make intensive (and quite well educated) use of
networks, such as Global GRID Forum people.

It is currently a list of topics and references. I might expand
the list, and it certainly needs lots of explanatory text. It
might be neat if this was from the GNT!

1/ Congestion Control (contrariwise: see QoS)

Slow Start
	Is this always necessary? No, but beware of ISPs who
mandate it, and if you think you can use less-than-recent history rather
than recent measurements, look at the Congestion Manager and TCP
PCB state sharing work first!

Congestion Control

	This is not optional in a non-QoS network (which is just
about any network) - adaptation is mandatory.

AIMD and Equation Based

	AIMD is not the only solution to a fair, convergent
	control rule for congestion avoidance and control. Other
	solutions are around - rate based, using loss or ECN feedback,
	these can work to be TCP fair without generating the
	characteristic sawtooth.
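
To make the sawtooth concrete, here is a minimal sketch of a single AIMD
sender. The function name, the fixed `capacity` threshold standing in for
a loss or ECN signal, and the parameter values are all illustrative
assumptions, not anything taken from a real TCP stack.

```python
def aimd_trace(capacity, rounds, increase=1.0, decrease=0.5, start=1.0):
    """Return the window of one AIMD sender, sampled once per round trip."""
    window = start
    trace = []
    for _ in range(rounds):
        trace.append(window)
        if window > capacity:
            # Congestion signalled (loss or ECN mark): multiplicative decrease.
            window = max(1.0, window * decrease)
        else:
            # No congestion signal: additive increase of one unit per round.
            window += increase
    return trace


trace = aimd_trace(capacity=16, rounds=40)
for w in trace:
    # Crude text plot: the bar lengths show the sawtooth shape.
    print("#" * int(w))
```

An equation based sender would instead aim at the long-run average of
this trace and hold a smoother rate.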

	Assumptions and errors
	Most _connections_ do not behave like the Padhye
	equation, but most bytes are shipped on a small number of
	connections, and those do - c.f. Mice and Elephants.
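
For reference, a sketch of the Padhye et al. throughput equation in the
simplified form used by TFRC (RFC 5348). The function and parameter names
are assumptions of this sketch, and the defaults b=1 and t_RTO = 4*RTT
are the simplifications that RFC suggests, so treat the result as a rough
estimate for a long-lived (elephant) flow only.

```python
from math import sqrt


def tcp_throughput(s, rtt, p, b=1, t_rto=None):
    """Estimated steady-state TCP throughput in bytes per second.

    s     -- segment size in bytes
    rtt   -- round trip time in seconds
    p     -- loss event rate, 0 < p <= 1
    b     -- packets acknowledged per ACK
    t_rto -- retransmission timeout in seconds (defaults to 4 * rtt)
    """
    if not 0 < p <= 1:
        raise ValueError("loss event rate must be in (0, 1]")
    if t_rto is None:
        t_rto = 4 * rtt
    denominator = (rtt * sqrt(2 * b * p / 3)
                   + t_rto * (3 * sqrt(3 * b * p / 8)) * p * (1 + 32 * p ** 2))
    return s / denominator


# Example: 1460 byte segments, 100 ms RTT, 1% loss event rate.
print(tcp_throughput(1460, 0.1, 0.01))
```

A short-lived mouse rarely leaves slow start, which is one reason most
connections do not fit this curve.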

	The jury is still out on whether there are non-greedy TCP flows
	(ones that do not have infinite sources of data at any moment).

RMT and Unicast

	Reliable Multicast Transport protocols (PGM, ALC) mainly use
	a variety of techniques to mimic TCP.

Mobile and Congestion Control
	Mobile nodes experience temporary indications of
	loss AND congestion during a hand-off. People have
	proposed mechanisms for indicating whether these are
	"true" or chimera.

Economics, Fairness etc

	Congestion control results in an approximately fair
	distribution of bottleneck bandwidth - this may not be
	great if you paid more to get a fat pipe to the net.
	But you are probably nearer the core and have
	every right to ask the ISP to upgrade their bottlenecks anyhow,
	and the people that paid less should be bottlenecked at THEIR
	access links in that case. So?

http://www.psc.edu/networking/tcp_friendly.html

2/ Routing

Priorities for good routing system design are:

Fast Forwarding
	Packet classification and switched routers have come a
long way recently - we are unlikely in the software world to beat
the h/w in core routers, but we can compete nicely in access
devices - certainly, there is no reason why a small cluster
couldn't make a good 10Gbps router - but there's every reason why
a PCI bus machine maxes out at 1Gbps!
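
The 1Gbps figure falls out of simple arithmetic. The sketch below
assumes a classic 32 bit PCI bus at a nominal 33 MHz, and further assumes
that a forwarded packet crosses the shared bus twice (NIC to memory, then
memory to NIC). Both the function names and the two-crossings model are
assumptions of this sketch, and arbitration overhead is ignored.

```python
def pci_peak_bps(width_bits=32, clock_hz=33_000_000):
    """Theoretical peak transfer rate of a PCI bus in bits per second."""
    return width_bits * clock_hz


def forwarding_ceiling_bps(bus_bps, crossings=2):
    """Upper bound on forwarding rate if each packet crosses the bus
    `crossings` times (assumed: once inbound, once outbound)."""
    return bus_bps / crossings


bus = pci_peak_bps()
print(bus, forwarding_ceiling_bps(bus))
```

So the raw bus tops out at about 1Gbps, and a software router built on
it would see roughly half of that at best under these assumptions.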

Faster Convergence

	Routers and links fail. The job of OSPF/ISIS and BGP is
to find the alternate paths quickly - in reality they take a
while to converge. IGPs take a while (despite being mainly link
state nowadays) because link failure detection is NOT obvious -
sometimes you have to count missed HELLO packets (since some
links don't generate an explicit clock). BGP convergence is a
joke. But there are smart people on the case.
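
Counting missed HELLOs puts a floor on detection time. A minimal sketch,
assuming a neighbour is declared down once no HELLO has been heard for
hello_interval * dead_multiplier seconds. The class and method names are
hypothetical, and the 10 s / 4x values are only in the style of common
OSPF settings.

```python
class NeighbourMonitor:
    """Declare a neighbour down after a run of missed HELLO packets."""

    def __init__(self, hello_interval, dead_multiplier):
        # Time with no HELLO after which the neighbour is considered dead.
        self.dead_interval = hello_interval * dead_multiplier
        self.last_heard = None

    def hello_received(self, now):
        """Record the arrival time (in seconds) of a HELLO."""
        self.last_heard = now

    def is_down(self, now):
        """True once the dead interval has elapsed since the last HELLO."""
        if self.last_heard is None:
            return False  # never heard from, so never considered up
        return now - self.last_heard >= self.dead_interval


monitor = NeighbourMonitor(hello_interval=10, dead_multiplier=4)
monitor.hello_received(0)
monitor.hello_received(10)   # the link fails just after this HELLO
print(monitor.is_down(49), monitor.is_down(50))
```

With these values a failure goes unnoticed for up to 40 seconds before
the routing protocol even starts to reconverge.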

theory and practice

Most of the problems with implementing routing protocols are those
of classic distributed (p2p/autonomous) algorithms: dealing with
bugs in other people's implementations - it takes a good
programmer about 3 months to do a full OSPF. It then takes around
3 years to put in all the defences.

Better (multi-path, multi-metric) routing

Equal-cost multipath OSPF and QOSPF
have been dreamt up - are they used a lot? Multipath in limited
cases appears to work quite well. Multimetric relies on good
understanding of traffic engineering and economics, and to date,
hasn't seen the light of day. Note also that, in terrestrial tier
one networks, end-to-end delays are approaching transmission
delays, so asking for a delay (or jitter) bound is getting fairly
pointless - asking for a throughput guarantee is a good idea, but
doesn't need clever routing!

Does MPLS Help?  No, not one bit.  

Policies are hard - BGP allows one to express unilateral
policies to the planet. This is cute (the same idea could be used
for policy management of other resources like CPUs in the GRID).
However, it results in difficulties in computing global choices
(esp. multihoming) - there are fixes.

http://www.potaroo.net/
http://www.telstra.net/gih
NANOG

3/ Packet Sizes

Go faster LANs have always pushed the MTU up - since ATM LANs
(remember the Fore ASX-100) we tried 9280 byte packets, and
enjoyed things. But the GRID is global, so the MTU is that of the
weakest link. Most stuff is on 100BaseT somewhere on the path,
so we aren't likely to see more than the occasional special case
non-1500 byte path. However, with path MTU discovery, we get that
auto-magically.
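
A toy model of how path MTU discovery (RFC 1191) converges on the weakest
link. It assumes every router that cannot forward a probe reports its
next-hop MTU back to the sender. The function names and the example MTU
values are illustrative assumptions, and no packets are actually sent.

```python
def discover_path_mtu(link_mtus):
    """Simulate path MTU discovery over a list of per-link MTUs.

    Returns (path_mtu, probes_sent). Each probe is a packet of the
    current estimate with Don't Fragment set; the first link that is too
    small "replies" with its own MTU and the sender lowers its estimate.
    """
    estimate = link_mtus[0]          # start from the first-hop MTU
    probes = 0
    while True:
        probes += 1
        for mtu in link_mtus:
            if estimate > mtu:
                estimate = mtu       # "fragmentation needed" feedback
                break
        else:
            return estimate, probes  # the probe crossed every link


def tcp_mss(mtu, ip_header=20, tcp_header=20):
    """TCP payload per packet for IPv4 with no IP or TCP options."""
    return mtu - ip_header - tcp_header


mtu, probes = discover_path_mtu([9180, 4470, 1500, 9180])
print(mtu, probes, tcp_mss(mtu))
```

One 1500 byte hop anywhere on the path is enough to pull the whole
connection down to a 1460 byte MSS.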

Multicast MSS is a real problem:)

Sub-IP packet size is a consideration - some systems (ATM) break
packets into tiny little pieces, then apply various layer 2
schemes to these pieces (e.g. rate/congestion control) - most of
these are anathema to good performance.
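
Two back-of-envelope costs of chopping packets into cells, sketched for
ATM AAL5 (8 byte trailer, 48 byte cell payload, 53 byte cells). The
function names are hypothetical, any LLC/SNAP encapsulation is left out
unless passed in, and the loss calculation assumes cells are dropped
independently, which real switches do not guarantee.

```python
CELL_SIZE = 53       # bytes on the wire per ATM cell
CELL_PAYLOAD = 48    # payload bytes per cell
AAL5_TRAILER = 8     # bytes appended to each packet before padding


def aal5_cells(packet_bytes, encap_bytes=0):
    """Number of cells needed to carry one packet over AAL5."""
    total = packet_bytes + encap_bytes + AAL5_TRAILER
    return -(-total // CELL_PAYLOAD)   # ceiling division


def wire_efficiency(packet_bytes, encap_bytes=0):
    """Fraction of transmitted bytes that are the original packet."""
    return packet_bytes / (aal5_cells(packet_bytes, encap_bytes) * CELL_SIZE)


def packet_loss_probability(cell_loss, cells):
    """Chance a packet is lost when any one of its cells is dropped
    (assumes independent cell loss)."""
    return 1 - (1 - cell_loss) ** cells


n = aal5_cells(1500)
print(n, wire_efficiency(1500), packet_loss_probability(0.001, n))
```

A layer 2 scheme that drops individual cells therefore throws away whole
packets, which is part of why cell-level control hurts IP performance.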

http://www.nlanr.net/NA/Learn/packetsizes.html
http://www.faqs.org/rfcs/rfc1191.html
etc

4/ Overlays

Overlays and P2P (e.g. Pastry, CAN, Tapestry)
are becoming commonplace - the routing overlay du jour is
probably RON from MIT - these (at best) are an auto-magic way of
configuring a set of Tunnels (IPinIP, GRE etc). I.e. they build
you VPNs.

P2P systems are slightly different - they do content sharing and have
cute index/search/replication strategies varying from
mind-numbingly stupid (napster, gnutella) to very cute (CAN,
Pastry). They have problems with
locality and metrics, so are not the tool for the job for low
latency file access. In trying to mitigate this, they (and
overlay routing substrates) use ping and pathchar to try to find
proximal nodes:

	limitations of Ping/Pathchar
	Convergence when not native  (errors/confidence)

Peer-to-Peer: Harnessing the Power of Disruptive Technologies
Edited by Andy Oram, March 2001,  0-596-00110-X

5/ QoS (contrariwise: see Congestion Control)

QoS - would be a nice thing

Parameters typically include
	Throughput
	Delay
	Availability
	Some people add security/integrity
	Some people also mention loss...

Threats
	Theft and Denial of Service

Protection is really what people want - if I send x bps
to site S, what y bps will be received, and how much later (d)?

	To guarantee y=x and minimise d, you need
	Admission Control (so we are not sharing as we would if
		we adapted under congestion control)
	Scheduling (so we do not experience arbitrary queueing
		delays)
Re-routing may also need to be controlled, and pre-empted
alternate routes (also known, unfortunately, as protection paths)
may be needed if we want QoS to include availability as well as
throughput guarantees and delay bounds.
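
The admission control half of that can be sketched in a few lines: a
link only accepts a new reservation if the sum of reserved rates stays
within capacity, so admitted flows never have to share. The class and
method names are hypothetical, and real schemes also have to handle
signalling, burst sizes and multiple hops.

```python
class ReservedLink:
    """Peak-rate admission control on a single link."""

    def __init__(self, capacity_bps):
        self.capacity_bps = capacity_bps
        self.reserved = {}   # flow id -> reserved rate in bps

    def admit(self, flow_id, rate_bps):
        """Reserve rate_bps for flow_id if it fits, else refuse."""
        if flow_id in self.reserved:
            raise ValueError("flow already admitted")
        if sum(self.reserved.values()) + rate_bps > self.capacity_bps:
            return False     # would oversubscribe the link
        self.reserved[flow_id] = rate_bps
        return True

    def release(self, flow_id):
        """Give back the reservation held by flow_id, if any."""
        self.reserved.pop(flow_id, None)


link = ReservedLink(capacity_bps=100_000_000)
print(link.admit("a", 60_000_000))   # fits
print(link.admit("b", 50_000_000))   # refused, would exceed capacity
link.release("a")
print(link.admit("b", 50_000_000))   # fits once "a" has gone
```

The refusal is the point: under congestion control flow "b" would have
been let in and everyone would have slowed down instead.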

Network Structure
	"edge", "core", etc is a myth - in the global net the
average traffic path includes 7 ASs - most inter-domain traffic
traverses heavily used Internet Exchange points (e.g. London)
where capacity only just about matches demand, whereas core
networks are often "over-provisioned" (UK academic net now runs at
<5% utilisation).

Aggregation is a technique to scale traffic management for QoS -
by only managing classes of aggregates of flows, we get to reduce
the state and signaling/management overhead for it. VPNs/tunnels
of course are aggregation techniques, as are things that treat
packets differently based on subfields like DSCP, port etc.

SLAs are around already despite QoS not being widespread - however,
SLAs are only intra-ISP to my knowledge (some Internet Exchanges
offer SLAs but end-to-end SLAs are as scarce as dragons).

Economics - are important here again as you can imagine!

An Engineering Approach to Computer Networking
Keshav, 1997, Addison-Wesley Pub Co; ISBN: 0201634422 
or
Internet QoS: Architectures and Mechanisms for Quality of Service
by Zheng Wang, 2001, Morgan Kaufmann Publishers; ISBN: 1558606084

6/ Multicast

Tier 1 routing works. Most ISPs run core native multicast.

Interdomain only just limps (it's getting better...)
	MSDP Problems
	App Relay Solutions

RMT - we have some candidate protocols for reliable multicast -
nothing as solid as 1988 TCP quite yet tho.

Address Allocation and Directories are not great yet, hence
beacons and so on.

Access networks are in bad shape...e.g.
	DSLAMs don't do IGMP snooping
	Cable doesn't do IGMP snooping
	Dialup can't hack it at all

Does IPv6 Help (don't laugh!) - yes it might!

Developing IP Multicast Networks: The Definitive Guide to
Designing and Deploying CISCO IP Multicast Networks
by Beau Williamson, 2000, Cisco Press; ISBN: 157870077
and
Multicast Communication: Protocols, Programming, and Applications
by Ralph Wittmann, Martina Zitterbart
Morgan Kaufmann Publishers; ISBN: 1558606459 

7/ Operating Systems

Linux, Solaris etc...there's a lot we could say here - lots of
things can and should be configured

zero copy stack - we'd all like this - zero copy receive is hard;
RDMA is not obviously the answer

Interrupts (self selecting NICs) - we should minimise these if we
want TCP to go to 10Gbps on a reasonable processor - there are
nice techniques
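
The scale of the problem is easy to work out. The sketch below assumes
one interrupt per packet as the naive baseline and uses interrupt
coalescing (one interrupt per batch of packets) as an example of such a
technique. Function names and the batch size of 32 are illustrative
assumptions, and Ethernet framing overhead is ignored.

```python
def packets_per_second(link_bps, packet_bytes):
    """Packet arrival rate on a fully loaded link."""
    return link_bps / (packet_bytes * 8)


def interrupts_per_second(link_bps, packet_bytes, packets_per_interrupt=1):
    """Interrupt rate if the NIC raises one interrupt per batch."""
    return packets_per_second(link_bps, packet_bytes) / packets_per_interrupt


naive = interrupts_per_second(10_000_000_000, 1500)
coalesced = interrupts_per_second(10_000_000_000, 1500,
                                  packets_per_interrupt=32)
print(round(naive), round(coalesced))
```

Batching trades a little latency for a large cut in per-packet CPU
overhead, which is the trade a 10Gbps receiver has to make.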

socket buffer considerations - there are lots!
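
The first of them is usually the bandwidth-delay product: a TCP
connection cannot fill a path unless its buffers hold at least one round
trip's worth of data. A minimal sketch using the standard socket options
SO_SNDBUF and SO_RCVBUF. The helper names are hypothetical, and the
kernel may clamp or adjust whatever is requested, so the helper returns
what was actually granted.

```python
import socket


def bdp_bytes(bandwidth_bps, rtt_ms):
    """Bandwidth-delay product in bytes (integer arithmetic)."""
    return bandwidth_bps * rtt_ms // 8000


def request_buffers(sock, nbytes):
    """Ask for nbytes of send and receive buffer on sock.

    Returns the (send, receive) sizes the OS reports afterwards, which
    may differ from the request.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, nbytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, nbytes)
    return (sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF),
            sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))


# A 1 Gbps path with a 100 ms round trip needs about 12.5 MB in flight.
print(bdp_bytes(1_000_000_000, 100))
```

To use it, call request_buffers(sock, bdp_bytes(...)) on a TCP socket
before connecting, then compare the returned sizes with the request.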

protection and scheduling domains - if we could get away from OSs
that confused these, life would be easier!

W Richard Stevens, TCP/IP Illustrated, All Volumes.
AND
Understanding the Linux Kernel,
D.P. Bovet and M. Cesati, O'Reilly, 2001, 
ISBN 0-596-00002-2

8/ Layer 2 Considerations

layer 2 NBMA nets - lots - a pain

layer 2 shared media nets - were decreasing due to switched ether,
now increasing due to wireless.

switching and routing re-cursed - layer 2 switching and routing
usually makes life HARDER for the IP engineer.

flow and congestion control re-cursed - layer 2 reliability and
flow control almost ALWAYS make life worse for the IP and TCP
engineer.

signaling (implicit, explicit) is just painful.

802.11 - in its glory:
http://www.apple.com/ibook/wireless.html

General discussion of slow lossy links:
http://www.ietf.org/html.charters/pilc-charter.html

WAP horrors - see web for many stories

GPRS - see:
http://www.cl.cam.ac.uk/Research/SRG/netos/coms/index.html

Other end of "Spectrum", see
http://www.cis.ohio-state.edu/~jain/refs/opt_refs.htm
(includes Raj Jain's own list of hot topics!)

9/ Light v. Heavyweight Protocols

Header prediction and packet templates make
code complexity a lot lower in the common case, even for a big
protocol like TCP or SCTP.

"User space" v. kernel myths - in this author's experience it is
really worth getting people to put transports into the kernel -
reasons include independent failure of application and protocol
as well as good control of end system resources. It ain't that
hard and user space will almost never be as fast.

Computer Networks, A Systems Approach
Peterson and Davie, 
Morgan Kaufmann, 1996, ISBN 1-55860-368-9
(2nd ed. too)

10/ Macroscopic Traffic and System Considerations

Self similarity, so?
	traffic is self similar (i.e. arrivals are not i.i.d.) -
	this doesn't actually matter much (there is a horizon effect)

traffic phase effects
	p2p (IP router, multiparty applications etc)
	have a tendency (like clocks on a wooden door, or
	fireflies in the Mekong delta) to synchronise - this is
	a bad thing

flash crowds
	e.g. publication of a new genome result followed
	by simultaneous database searches with similar queries from
	lots of different places...

Asymmetry

	Many things in the net are asymmetric - see ADSL lines,
see client-server, master-slave, see most NAT boxes. See BGP
paths. Beware - assumptions about symmetry (e.g. deriving 1 way
delay from RTT) are often wildly wrong. Asymmetry also breaks all
kinds of middle box snooping behaviour.
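
How wrong the RTT/2 shortcut can be is a one-liner. The delays in this
sketch are made-up illustrative values in milliseconds and the function
names are hypothetical.

```python
def one_way_estimate(rtt):
    """The usual shortcut: assume the path is symmetric."""
    return rtt / 2


def estimate_error(forward_delay, reverse_delay):
    """Estimated minus true forward delay when both are derived from RTT."""
    rtt = forward_delay + reverse_delay
    return one_way_estimate(rtt) - forward_delay


# Forward path 10 ms, reverse path 90 ms (say, a slow or congested uplink).
print(estimate_error(10, 90))
# Symmetric path for comparison.
print(estimate_error(50, 50))
```

The estimate is only exact when the two directions happen to match, and
nothing in an RTT measurement tells you whether they do.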

The Art of Computer Systems Performance Analysis
Raj Jain, 1991, Wiley, ISBN 0-471-50336-3

Web Protocols and Practice
B. Krishnamurthy & J. Rexford,
Addison Wesley, 2001, ISBN 0-201-710885

Security Engineering, 
Ross Anderson, 2001 Wiley & Sons; ISBN: 0471389226 

------------------------------------
Global Reference:
ACM CCR 25th Anniversary Edition, 
ACM SIGCOMM CCR, Volume 25, No.1 January 1995, 
ISSN #: 0146-4833
http://www.acm.org/sigcomm/ccr/archive/ccr-toc/ccr-toc-95.html
