[e2e] Extracting No. of packets or bytes in a router buffer

Matt Mathis mathis at psc.edu
Fri Dec 22 11:09:43 PST 2006


Another approach is to get accurate time stamps of ingress/egress packets and
use the difference in the time stamps to compute effective queue depths.  The
NLANR PMA team was building a "router clamp", an "octopus" designed to get
traces from all interfaces of a busy Internet2 core router.  I have since lost
track of the details. Google "router clamp pma" for clues.
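
For concreteness, here is a rough sketch of that arithmetic in Python,
assuming the ingress and egress traces have already been matched per
packet (the field names t_in, t_out and size are illustrative, not from
any particular capture tool): a packet is "in the router" from its
ingress timestamp until its egress timestamp, so the effective queue
depth at any instant is just the number of such intervals covering it.

    from dataclasses import dataclass

    @dataclass
    class Pkt:
        t_in: float    # timestamp at the ingress interface (seconds)
        t_out: float   # timestamp at the egress interface
        size: int      # bytes on the wire

    def occupancy_series(pkts):
        """Return (time, packets_queued, bytes_queued) at every event."""
        events = []
        for p in pkts:
            events.append((p.t_in, +1, +p.size))
            events.append((p.t_out, -1, -p.size))
        events.sort()

        series, npkts, nbytes = [], 0, 0
        for t, dp, db in events:
            npkts += dp
            nbytes += db
            series.append((t, npkts, nbytes))
        return series

    # Two overlapping packets -> a peak effective depth of 2 packets.
    trace = [Pkt(0.000, 0.003, 1500), Pkt(0.001, 0.005, 1500)]
    for t, n, b in occupancy_series(trace):
        print(f"{t:.3f}s  {n} pkts  {b} bytes")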

I basically don't believe queue depths measured by any other means, because
there are so many cascaded queues in a typical modern router.  I point out
that most NICs have short queues right at the wire, along with every DMA
engine, bus arbiter, etc.

Claiming that an internal software instrument accurately represents the true
aggregate queue depth for the router is equivalent to asserting that none of
the other potential bottlenecks in the router have any queued packets. If they
never have queued packets, why did the HW people bother with the silicon?   I
conclude there is always potential for packets to be queued out of scope of
the software instruments.

It's a long story, but I have first-hand experience with one of these cases:
my external measurement of the maximum queue size was only half of the design
size, because the "wrong" bottleneck dominated.

Good luck,
--MM--
-------------------------------------------
Matt Mathis      http://www.psc.edu/~mathis
Work:412.268.3319    Home/Cell:412.654.7529
-------------------------------------------
Evil is defined by mortals who think they know
"The Truth" and use force to apply it to others.

On Wed, 20 Dec 2006, Lynne Jolitz wrote:

> Fred has very accurately and enjoyably answered the hardware question. But it gets more complicated when you consider transport-level processing in hardware, because the staging of data between the bus and application memory involves buffering too, as well as the contention and reordering buffers used in the processing of transport-level protocols.
>
> Even more complicated is the case of multiple transport interfaces in, say, a blade server, where the buffering of the blade server's frame may be significant - you might be combining blade elements with different logic that stages them onto a very high bandwidth (10 Gbit or greater) output technology, where the line between switching and the point where channels from the transport layer merge becomes a bit blurred.
>
> The upshot is that, given all the elements involved, it is hard to tell when something leaves the buffer, but it is always possible to tell when something *enters* the output buffer. All stacks track the outbound packet count, and obviously you can determine the rate by sampling the counters. But confirming how much has yet to make it through the depth of buffering will be a very difficult exercise, as Fred notes. It may be the case that the rules are very different from one packet to the next (e.g. very different dwell times in the buffers - we don't always have non-preemptive buffering).
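>
> A rough sketch of that counter-based sampling, for concreteness (the
> counter width and the 100 ms period are assumptions; any monotonic
> outbound packet counter the stack exposes would do):
>
>     COUNTER_BITS = 32
>     WRAP = 1 << COUNTER_BITS
>
>     def rates_from_samples(samples):
>         """samples: list of (timestamp_seconds, counter_value).
>         Returns (midpoint_time, packets_per_second) per interval,
>         tolerating at most one counter wrap between samples."""
>         rates = []
>         for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
>             delta = (c1 - c0) % WRAP
>             rates.append(((t0 + t1) / 2.0, delta / (t1 - t0)))
>         return rates
>
>     # e.g. counter read every 100 ms, with one wrap in the first interval:
>     print(rates_from_samples([(0.0, 4294967290), (0.1, 4), (0.2, 1204)]))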
>
> Lynne Jolitz
>
> ----
> We use SpamQuiz.
> If your ISP didn't make the grade try http://lynne.telemuse.net
>
> > -----Original Message-----
> > From: end2end-interest-bounces at postel.org
> > [mailto:end2end-interest-bounces at postel.org]On Behalf Of Fred Baker
> > Sent: Wednesday, December 13, 2006 12:17 PM
> > To: Craig Partridge
> > Cc: end2end-interest at postel.org
> > Subject: Re: [e2e] Extracting No. of packets or bytes in a router buffer
> >
> >
> > You're talking about ifOutQLen. It was originally proposed in RFC
> > 1066 (1988) and deprecated in the Interfaces Group MIB (RFC 1573,
> > 1994). The reason it was deprecated is not documented, but the
> > fundamental issue is that it is non-trivial to calculate and is very
> > ephemeral.
> >
> > The big issue in calculating it is that it is rarely exactly one
> > queue. Consider a simple case on simple hardware available in 1994.
> >
> >     +----------+ |
> >     |          | |
> >     |  CPU     +-+
> >     |          | |
> >     +----------+ | BUS
> >                  |
> >     +----------+ | +---------+
> >     |          | +-+ LANCE   |
> >     |          | | +---------+
> >     |  DRAM    +-+
> >     |          | | +---------+
> >     |          | +-+ LANCE   |
> >     +----------+ | +---------+
> >
> > I'm using the term "bus" in the most general possible sense - some
> > way for the various devices to get to the common memory. This gets
> > implemented many ways.
> >
> > The AMD 7990 LANCE chip was and is a common Ethernet implementation.
> > It has in front of it a ring in which one can describe up to 2^N
> > messages (0 <= N <= 7) awaiting transmission. The LANCE has no idea
> > at any given time how many messages are waiting - it only knows
> > whether it is working on one right now or is idle, and when switching
> > from message to message it knows whether the next slot it considers
> > contains a message. So it can't keep such a counter. The device
> > driver similarly has a limited view; it might know how many it has
> > put in and how many it has taken out again, but it doesn't know
> > whether the LANCE has perhaps completed some of the messages it
> > hasn't taken out yet. So in the sense of the definition ("The length
> > of the output packet queue (in packets)."), it doesn't know how many
> > are still waiting. In addition, it is common for such queues or rings
> > to be configured pretty small, with excess going into a diffserv-
> > described set of software queues.
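> >
> > As an illustration of why the driver's view is only a bound (this is
> > a toy model of the usual descriptor-ring idiom, not the real 7990
> > register layout), the driver can count what it has put in and taken
> > out, but the chip may already have finished descriptors the driver
> > has not reclaimed yet:
> >
> >     RING_SIZE = 8                        # 2**N slots, N = 3 here
> >
> >     class TxRing:
> >         def __init__(self):
> >             self.own = [False] * RING_SIZE  # True: slot handed to chip
> >             self.put = 0                 # frames the driver has queued
> >             self.took = 0                # frames the driver has reclaimed
> >             self.chip = 0                # next slot the chip transmits
> >
> >         def driver_enqueue(self):
> >             if self.put - self.took == RING_SIZE:
> >                 raise RuntimeError("ring full")
> >             self.own[self.put % RING_SIZE] = True
> >             self.put += 1
> >
> >         def chip_transmit_one(self):
> >             # The chip clears OWN when the frame hits the wire; the
> >             # driver only notices when it next reclaims slots.
> >             if self.own[self.chip % RING_SIZE]:
> >                 self.own[self.chip % RING_SIZE] = False
> >                 self.chip += 1
> >
> >         def driver_reclaim(self):
> >             while self.took < self.put and not self.own[self.took % RING_SIZE]:
> >                 self.took += 1
> >
> >         def depth_seen_by_driver(self):
> >             return self.put - self.took  # may over-count
> >
> >         def frames_actually_waiting(self):
> >             return sum(self.own)         # what ifOutQLen wants to report
> >
> >     r = TxRing()
> >     for _ in range(4):
> >         r.driver_enqueue()
> >     r.chip_transmit_one()
> >     r.chip_transmit_one()                # chip finished 2, driver unaware
> >     print(r.depth_seen_by_driver())      # 4
> >     print(r.frames_actually_waiting())   # 2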
> >
> > There are far more general problems. Cisco has a fast forwarding
> > technology that we use on some of our midrange products that
> > calculates when messages should be sent and schedules them in a
> > common calendar queue. Every mumble time units, the traffic that
> > should be sent during THIS time interval is picked up and dispersed
> > to the various interfaces they need to go out. Hence, there isn't a
> > single "output queue", but rather a commingled output schedule that
> > shifts traffic to other output queues at various times - which in
> > turn do something akin to what I described above.
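> >
> > A minimal sketch of the calendar-queue idea (the generic textbook
> > structure, not Cisco's implementation): note that packets awaiting
> > transmission live both in the calendar buckets and in the
> > per-interface output queues, so there is no single "output queue"
> > to measure.
> >
> >     from collections import defaultdict, deque
> >
> >     class CalendarScheduler:
> >         def __init__(self, tick=0.001, nbuckets=1024):
> >             self.tick = tick
> >             self.nbuckets = nbuckets
> >             self.buckets = defaultdict(list)  # bucket -> [(ifindex, pkt)]
> >             self.out = defaultdict(deque)     # ifindex -> output queue
> >
> >         def schedule(self, pkt, ifindex, send_time):
> >             idx = int(send_time / self.tick) % self.nbuckets
> >             self.buckets[idx].append((ifindex, pkt))
> >
> >         def run_tick(self, now):
> >             # Disperse everything due in THIS interval to its interface.
> >             idx = int(now / self.tick) % self.nbuckets
> >             for ifindex, pkt in self.buckets.pop(idx, []):
> >                 self.out[ifindex].append(pkt)
> >
> >     sched = CalendarScheduler()
> >     sched.schedule("pkt-A", ifindex=1, send_time=0.0042)
> >     sched.run_tick(now=0.004)
> >     print(sched.out[1])                  # deque(['pkt-A'])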
> >
> > Also, in modern equipment one often has forwarders and drivers on NIC
> > cards rather than having some central processor do that. For
> > management purposes, the drivers maintain their counts locally and
> > periodically (perhaps once a second) upload the contents of those
> > counters to a place where management can see them.
> >
> > So when you ask "what is the current queue depth", I have to ask what
> > the hardware has, what of that has already been spent but isn't
> > cleaned up yet, what is in how many software queues, how they are
> > organized, and whether that number has been put somewhere that
> > management can see it.
> >
> > Oh - did I mention encrypt/decrypt units, compressors, and other
> > inline services that might have their own queues associated with them?
> >
> > Yes, there is a definition on the books. I don't know that it answers
> > the question.
> >
> > On Dec 13, 2006, at 10:54 AM, Craig Partridge wrote:
> >
> > >
> > > Queue sizes are standard SNMP variables and thus could be sampled at
> > > these intervals.  But it looks as if you want the queues on a per host
> > > basis?
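> > >
> > > For what it's worth, a sketch of that kind of sampling loop using
> > > the net-snmp command-line tools (host, community string and
> > > interface index are placeholders; ifOutQLen, 1.3.6.1.2.1.2.2.1.21,
> > > is used only as an example OID and is deprecated; see Fred's reply
> > > above):
> > >
> > >     import subprocess, time
> > >
> > >     HOST, COMMUNITY, IFINDEX = "192.0.2.1", "public", 3
> > >     OID = "1.3.6.1.2.1.2.2.1.21.%d" % IFINDEX  # IF-MIB::ifOutQLen.3
> > >
> > >     def sample_once():
> > >         out = subprocess.run(
> > >             ["snmpget", "-v2c", "-c", COMMUNITY, "-Oqv", HOST, OID],
> > >             capture_output=True, text=True, timeout=0.5)
> > >         return int(out.stdout) if out.returncode == 0 else None
> > >
> > >     while True:                      # one sample every 100 ms
> > >         t = time.time()
> > >         print(t, sample_once())
> > >         time.sleep(max(0.0, 0.1 - (time.time() - t)))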
> > >
> > > Craig
> > >
> > > In message <Pine.LNX.4.44.0612130958100.28208-100000 at cmm2.cmmacs.ernet.in>,
> > > V Anil Kumar writes:
> > >
> > >>
> > >> We are searching for any known techniques to continuously sample
> > >> (say at
> > >> every 100 msec interval) the buffer occupancy of router
> > >> interfaces. The
> > >> requirement is to extract or estimate the instantaneous value of the
> > >> number of packets or bytes in the router buffer from another
> > >> machine in
> > >> the network, and not the maximum possible router buffer size.
> > >>
> > >> Any suggestion, advice or pointer to literature on this?
> > >>
> > >> Thanks in advance.
> > >>
> > >> Anil
> >
>

