[rbridge] Arch Issue #1

Gray, Eric Eric.Gray at marconi.com
Wed Dec 14 12:36:33 PST 2005


Joe,

	We're not converging in this discussion and part of the
reason why appears to be that there is still some considerable
confusion about the first principles of what we're trying to 
accomplish, and how those principles affect the architecture.

	Consequently, I believe we should break this discussion up 
and start new discussions on the following NEW architectural 
issues:

o What Are the Important Aspects of Bridge Behavior and How Do
  They Impact RBridges and/or the TRILL Architecture?
o What is the Role of the Hello Protocol?
o What Are the Assumptions About How Hello Protocol Works?
o How do RBridge Campus Models map to BPDU Handling Modes?
o (When) Does Block (BPDU Handling Mode) Work?
o Sub-Optimality Issues of Block Mode(s)
o What Does Participate (BPDU Handling Mode) Mean?
o Sub-Optimality Issues of Participate Mode(s)
o What Does Forward (BPDUs) Mean?
o Sub-Optimality Issues of Forward Mode(s)

To close on relatively minor sub-issues:

-> > Many things do not participate in STP now and some forward 
-> > (like the above) while others block.
->
-> Far as I've been able to find out, the things that block are 
-> known as "broken". 

Nice, ringing, phraseology - however, not true.

Let's take a typical current technology - bridging-capable
routers.

Word around this list is that routers (want to) terminate 
a broadcast domain at each router interface.  Sadly, they 
don't always have the option to do this.

If a broadcast domain on one router interface surreptitiously
becomes connected to a broadcast domain on another interface
of the same router, one of three things happens:

1) The network breaks (not the most robust alternative)
2) The router switches to bridging mode between the now
   connected interfaces (including participating in STP)
3) The router recognizes that the two interfaces share
   a common broadcast domain and adopts a reasonable mode
   of operation consistent with this knowledge (including 
   load sharing, ARP sharing, etc. - and _NOT_ including 
   STP participation or forwarding).

You might ask "why would a router adopt behavior number
3 when behavior number 2 works?"

To answer this, look at these two scenarios:

1)         (N1)        (N2)        (N3)
               \_    _/    \_    _/
                 \  /        \  /
                  R1          R2
                _/  \_      _/  \_
               /      \    /      \
           (N4)        (N5)        (N6)
               \_                _/
                 \_____    _____/
                       \  /
                        B*

2)         (N1)        (N2)        (N3)
               \_    _/    \_    _/
                 \  /        \  /
                  R1          R2
                _/  \_      _/  \_
               /      \    /      \
           (N4)        (N5)        (N6)
               \_    _/
                 \  /
                  B*

In both of these cases, separately routed networks have
become a single broadcast domain through the addition of
a new bridge (B*) added between the previously separate
networks (N4+N6 in example 1 and N4+N5 in example 2).

So, the answer to the question above is that there is no
_good_ reason why scenario 1 should work and scenario 2
should not.  However, a very quick analysis will tell you
that scenario 1 works.  N4+N6 is topologically identical
to either N2 or N5 in that it connects R1 to R2.

Consequently, scenario 2 should work and - by analogy -
the _best_mode_ would be for it to work in the same way
that scenario 1 works.  This mode is both less traumatic
and more efficient...
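
For anyone who wants to check the two scenarios mechanically, here is a
minimal sketch. It assumes each router interface attaches to exactly one
named segment, as drawn above. The helper names (merge, shared_domains,
routers_on) are made up for illustration and come from no implementation.

```python
# Router-to-segment attachments before B* is added, read off the figures.
BASE = {
    "R1": ["N1", "N2", "N4", "N5"],
    "R2": ["N2", "N3", "N5", "N6"],
}

def merge(attachments, a, b):
    """Return the attachments after a bridge joins segments a and b."""
    merged = a + "+" + b
    return {
        router: [merged if seg in (a, b) else seg for seg in segs]
        for router, segs in attachments.items()
    }

def shared_domains(attachments):
    """Per router, the broadcast domains seen on more than one interface."""
    return {
        router: sorted({s for s in segs if segs.count(s) > 1})
        for router, segs in attachments.items()
    }

def routers_on(attachments, domain):
    """Routers with at least one interface in the given broadcast domain."""
    return sorted(r for r, segs in attachments.items() if domain in segs)

scenario1 = merge(BASE, "N4", "N6")   # B* joins N4 and N6
scenario2 = merge(BASE, "N4", "N5")   # B* joins N4 and N5

print(shared_domains(scenario1))          # no router sees a domain twice
print(routers_on(scenario1, "N4+N6"))     # same routers as N2 or N5
print(shared_domains(scenario2))          # R1 sees N4+N5 on two interfaces
```

In this model scenario 1 just adds one more segment between R1 and R2,
while scenario 2 leaves R1 with two interfaces in one broadcast domain,
which is the case behavior number 3 is meant to handle.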

-> If you can point to something specific that blocks that is 
-> widely used - or that is compliant with spec - that'd be 
-> useful.
-> 

So this is a ridiculous request.  The "spec" you refer to
is presumably the 802.1 specification, consequently it is
not possible to "point to something that blocks [and] is
compliant with the spec" and it is not very interesting to
"point to something that blocks [or] is compliant with the 
spec".

Simply consider that not everything in the world is going
to be consistent with the spec and yet things still work.

-> > ---> > 
-> > ---> > Participate maps to what bridges do.
-> > 
-> > Most bridges.  Mostly agree.
-> > 
-> > ---> > 
-> > ---> > As a result, either of those can easily be considered as 
-> > ---> > default cases, though forward is clearly suboptimal.
-> > 
-> > "Participate" can be sub-optimal as well, assuming that we're
-> > talking about simple participation.
-> > 
-> > In simple participation, an RBridge participates with bridges
-> > both internal and external to the RBridge campus. This makes
-> > for a single spanning tree and effectively cancels any hopes
-> > we might have for efficient use of bandwidth - even internal
-> > to the RBridge campus.
-> 
-> We haven't talked about the internal transit requirements before;
-> bridges there ARE going run STP/MSTP and coalesce into trees anyway 
-> (you can't stop that). It doesn't matter if such trees touch two 
-> rbridges - there can still be independent paths between the rbridges 
-> (though that needs to be engineered; if they connect anywhere in the 
-> middle, the as you note, the bridging will end up killing the 
-> possibility of multipath).
-> 

We have talked about this, unless you've been letting someone
else use your E-Mail account.

You argued - fairly convincingly - that it would not normally
be possible for there to be more than one RBridge campus in a
single broadcast domain.  How did you suppose this to be the
case if we were not considering "internal transit requirements"?

-> > In order for this approach to work, internal RBridge 
-> > interfaces MUST be eligible to be put in a non-forwarding 
-> > state by the STP, and - once this is done - they cannot be
-> > used by SPF frame forwarding.
-> 
-> We really need to resolve the discussion of what an 'internal 
-> rbridge interface' is before this sentence can be interpreted 
-> completely.
-> 

I don't think anyone other than you is having trouble with 
the meaning of 'internal rbridge interface' in this context.
It is simply an interface that forms a circuit either with 
one or more additional RBridges, or zero or more additional
RBridges and another interface of itself.  That is, if the 
RBridge is an ingress or egress RBridge, an internal RBridge 
interface is one on which it will transmit frames it is the 
ingress for, or receives frames it is the egress for.
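
As a rough sketch of the first sentence of that definition (and only the
first sentence), assume a circuit is recorded as the set of RBridge
attachments on one link or LAN segment, with hosts and ordinary bridges
left out. The names below, including RB-2, RB-3, i.5 and i.6, are
hypothetical.

```python
def is_internal(rbridge, interface, circuit):
    """True if the circuit holds any RBridge attachment besides this one.

    circuit is a set of (rbridge_name, interface_name) pairs. Another
    RBridge on the circuit, or another interface of the same RBridge,
    both make the interface internal.
    """
    others = circuit - {(rbridge, interface)}
    return len(others) > 0

# Hypothetical circuits around RB-1.
to_rb2 = {("RB-1", "i.1"), ("RB-2", "i.1")}     # shared with another RBridge
self_loop = {("RB-1", "i.5"), ("RB-1", "i.6")}  # two interfaces of RB-1 only
edge_lan = {("RB-1", "i.3")}                    # no other RBridge attachment

print(is_internal("RB-1", "i.1", to_rb2))
print(is_internal("RB-1", "i.5", self_loop))
print(is_internal("RB-1", "i.3", edge_lan))
```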

The definition - if you want to be picky about it - can be
harder to get your head around than the concept it defines.
See the figure (about 2 inches) below...

-> However, putting an interface (any interface) into a nonforwarding 
-> state is just what a root bridge would do anyway, which is why I 
-> consider this a variant of PARTICIPATE.

Yes, that is what bridges do, but I think you'll find a root 
bridge typically does not have any non-forwarding interfaces.

But this is not the point.  If one of the working group goals
is to make efficient use of link bandwidth, then - in the figure
below - we do not want any 'internal RBridge interfaces' to be in
a non-forwarding mode.

      __________
     /     i.1_ \    ___ i.3
    (      i.2_>RB-1<___ i.4
     \          /
      \ campus /            i.1 and i.2 are internal interfaces
       \______/             i.3 and i.4 are external interfaces

That is to say, RB-1 _might_ put i.3 or i.4 in non-forwarding
mode (I do not consider this either necessary or ideal), but 
it should not put i.1 or i.2 in non-forwarding mode.  That is
why blocking STP is the optimal approach.
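
To put a number on the bandwidth point, here is a small sketch. It
assumes a hypothetical campus of three RBridges connected in a triangle,
and it uses a breadth-first tree from a chosen root as a stand-in for
what spanning tree would leave forwarding. It is not an implementation
of 802.1D.

```python
from collections import deque

# Hypothetical internal links of a three-RBridge campus.
LINKS = [("RB-1", "RB-2"), ("RB-1", "RB-3"), ("RB-2", "RB-3")]

def neighbors(links):
    """Build an undirected adjacency map from a list of links."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    return adj

def spanning_tree(links, root):
    """Links kept forwarding by a breadth-first tree rooted at root."""
    adj = neighbors(links)
    seen, kept, queue = {root}, [], deque([root])
    while queue:
        node = queue.popleft()
        for nxt in sorted(adj[node]):
            if nxt not in seen:
                seen.add(nxt)
                kept.append((node, nxt))
                queue.append(nxt)
    return kept

def hops(links, src, dst):
    """Hop count from src to dst using only the given links."""
    adj = neighbors(links)
    dist, queue = {src: 0}, deque([src])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, []):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist[dst]

tree = spanning_tree(LINKS, "RB-1")
blocked = [l for l in LINKS if l not in tree and (l[1], l[0]) not in tree]

print(blocked)                       # the internal link that goes unused
print(hops(tree, "RB-2", "RB-3"))    # path length over the tree
print(hops(LINKS, "RB-2", "RB-3"))   # path length if every link forwards
```

With RB-1 as root, the RB-2 to RB-3 link is the one left out of the
tree, so traffic between those two takes two hops instead of one.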

-> 
-> > More complex participation - which is what I suspect you are
-> > referring to - would be along the lines of treating external
-> > interfaces of all RBridges in a single campus as part of a
-> > single large bridge (you've previously argued that an RBridge 
-> > campus may be viewed as a single "virtual bridge").  This is
-> > more complicated because the STP block and forward decisions
-> > for all such external interfaces now needs to be collectively
-> > made (and maintained) across a number of RBridges (all of them
-> > within a campus). 
-> 
-> Those decisions already need to be coordinated. Until you decide 
-> what's in and out of an rbridge, you don't know what 'internal' 
-> means anyway.

These decisions do not have to be "coordinated" in the same sense
that they would be coordinated between interfaces of a single box.

What we've talked about so far (in terms of coordination) is how
we might use IS-IS to share information among RBridges that will
allow each edge RBridge to build what is effectively a filtering
database.  The model for frame forwarding within a RBridge campus
is supposed to be based on routing rather than bridging.

Consequently, the decision as to which external interfaces are 
used to (receive) ingress and (transmit) egress frames is both 
a local decision (which must be made consistently) and a result
of collective forwarding based on frame routing within a campus.

In other words, frames should be allowed to arrive at an egress
RBridge by whatever interface happens to correspond to the SPF
path to that egress from any ingress.

The bit of coordination that we need to work out is along the 
lines of who is the designated egress for what frames when more
than one RBridge is connected to the same 'external' LAN segment.

-> 
-> > That would be a non-trivial exercise using your favorite proprietary 
-> > approach; it would be impossible to standardize.
-> 
-> Things like the HELLO protocol already start in this direction; 
-> are you suggesting that HELLO will be impossible to standardize? ;-)
-> 

In what way is the HELLO protocol going to be involved in working
out forwarding between RBridges?  As far as I know, in absolutely
_no_way_ at all.

HELLO protocol is involved in determining neighbor relationships and
indirectly allows RBridges to determine campus topology.  That can
certainly be done.

Based on the neighbor relationships that RBridges determine, they
then use link-state routing to determine topology and - finally -
SPF to determine forwarding. 

Saying that the HELLO protocol is involved in determining forwarding
is getting a little ahead of ourselves... 
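
The three stages can be sketched roughly as follows. This is only an
illustration of the layering: the RBridge names, the link costs, and the
shape of the "hellos heard" table are all assumptions, and nothing here
reflects an actual hello or IS-IS message format.

```python
import heapq

# Stage 1 (assumed input): neighbors each RBridge has heard hellos from,
# with a made-up link cost attached to each.
HELLOS_HEARD = {
    "RB-A": {"RB-B": 1, "RB-C": 4},
    "RB-B": {"RB-A": 1, "RB-C": 1},
    "RB-C": {"RB-A": 4, "RB-B": 1},
}

def build_topology(hellos):
    """Stage 2: keep only adjacencies that both ends have reported."""
    return {
        node: {nbr: cost for nbr, cost in heard.items()
               if node in hellos.get(nbr, {})}
        for node, heard in hellos.items()
    }

def spf_next_hops(topo, source):
    """Stage 3: Dijkstra; returns {destination: first hop from source}."""
    dist = {source: 0}
    first_hop = {}
    heap = [(0, source, None)]
    while heap:
        d, node, hop = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for nbr, cost in topo[node].items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                # Leaving the source, the neighbor itself is the first hop.
                first_hop[nbr] = nbr if node == source else hop
                heapq.heappush(heap, (nd, nbr, first_hop[nbr]))
    return first_hop

topology = build_topology(HELLOS_HEARD)
print(spf_next_hops(topology, "RB-A"))
```

The hello stage only supplies the neighbor table. The forwarding
decision comes out of the SPF run over the assembled topology.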

-> > For one thing, assuming that the hello/discovery protocol uses
-> > an appropriate communication layer, then the hello/discovery
-> > messages MUST be delivered over any non-disfunctional path.
-> 
-> Bridges think that's true for BPDUs too; if rbridges block them,
-> then it is the rbridge that becomes the disfunctional component.
-> 

RBridges are not bridges.  Maybe we should call them NBridges
to make that clearer?

-> > [D]evices (other than RBridges) will not be able to 
-> > distinguish RBridge hello/discovery messages from any
-> > other traffic.
-> 
-> Will they be encrypted? If not, we cannot make this assumption.
-> 

Let's not be cryptic.  There's a difference between:

A) being able to distinguish RBridge hello/discovery messages 
   through some deliberate analysis of frames based on - at 
   least - a priori knowledge of the existence of RBridges, 
   TRILL and the hello and discovery protocols, and 

B) being unable to distinguish RBridge hello/discovery messages 
   because they are frames like every other frame it is the 
   fundamental purpose of LAN devices to forward.

LAN devices - because they are in the business of forwarding
frames - will not reject some frames and pass others without
some prior knowledge and justification.  We need to make sure
that we do not choose a poor frame format that we know will
be rejected by existing technology and we need to rely on a
belief that future technologies will not decide to discard 
TRILL hello/discovery frames without due consideration of the
effects of doing so.

Any other set of assumptions is essentially pathological.

Due consideration is what we are in the process of doing now.
Note that "due consideration" and "slavish compliance" are 
not the same thing.  If we want to do something different, we
are free to do so, as long as we consider the implications.

