[rbridge] Comments on: http://www.ietf.org/internet-drafts/draft-ietf-trill-rbridge-arch-01.txt

Gray, Eric Eric.Gray at marconi.com
Tue Oct 31 13:46:27 PST 2006


 

-- [snip] --
--> 
--> Sgai 4> an RBridge must be also able to send two copies of a
--> unicast/multicast/broadcast packet on the same port when it 
--> acts as a designated RBridge (one copy is encapsulated, the 
--> other not).
--> 

This, I think, refers to the immediately preceding text in the 
architecture:

        Forwarding information is derived from the combination of 
        attached MAC address learning, snooping of multicast-related 
        protocols (e.g. - IGMP), and routing advertisements and path 
        computations using the link-state routing protocol. 

While your comment is a reasonable one - although potentially 
implementation specific - it does not appear to have any bearing 
on this text.


-- [snip] -- 
--> 
--> Sgai 5> here there is some confusion between 802 and 802.3
--> 

This refers (I believe) to the immediately preceding text:

        The following terminology is defined in other documents. A brief
        definition is included in this section for convenience and - in 
        some cases - to remove any ambiguity in how the term may be used
        in this document, as well as derivative documents intended to 
        specify components, protocol, behavior and encapsulation 
        relative to the architecture specified in this document. 

        o  802: IEEE Specification for the Ethernet architecture, i.e., 
           including hubs and bridges. 

In this text, I explicitly state that these terms are defined elsewhere.
I am also trying to use the most generic definition of Ethernet possible.

The reason for this is that the problem statement does not restrict the
solution to 802.3.

There is some confusion in referring to 802.3 in any case: we explicitly
want to include 802.1Q - as well as all of its variations and extensions.

Consequently, we define the term Ethernet in the broadest possible sense.


-- [snip] --
-->
-->        o  Bridge Spanning Tree (BST): an Ethernet (L2, 802.1D) 
-->           forwarding protocol based on the topology of a spanning tree.
--> 
--> Sgai 6> I have never seen the acronym BST, everyone uses STP.
-->

I do not know if this term is used in any of the other documents,
but it is not used in this one.  Unless someone objects, I am only
too happy to remove it from this document.

From a historical perspective, this was defined this way originally
because we were re-using the term spanning tree instead of IRT.  It
was causing amazing communication difficulties, so we came up with
the term IRT.  Now we don't need to differentiate BST.
 
-- [snip --
--> 
--> Sgai 7> for Ethernet is better to reference 802.3
--> 

See my response to your point (Sgai 5) above.
 
-- [snip --
--> 
-->         o  Hub: an Ethernet (L2, 802) device with multiple ports which
--> 
--> sgai 8> for Hub is better to reference 802.3
--> 

See my response to your point (Sgai 5) above.  By the definition we
provide in this document, Ethernet is defined broadly to include 
802 technologies.  This is reasonable, because the term was stolen
by IEEE from a technology stolen from a satellite communication
protocol.  Ironic that 802.3 does not include wireless, all things
considered.  Certainly the term "Ethernet" should...
 
-- [snip -- 
--> 
-->         o  Node: a device with an L2 (MAC) address that sources and/or 
-->            sinks L2 frames. 
--> 
--> Sgai 9> The IEEE term is "station".
--> 

However, we in the IETF are more familiar with the term "node." 

This is hardly a significant issue.  If people agree that we should
use the term "station" as opposed to "node", then I will change it.

Note that we should be consistent in this usage, if we decide to do
yet another terminology evolution.
 
-- [snip --
--> 
-->         o  Segment: an Ethernet link, either a single physical link or 
-->            emulation thereof (e.g., via hubs) or a logical link or 
-->            emulation thereof (e.g., via bridges).
--> 
--> Sgai 10> IEEE uses the term "LAN segment"  
--> 

I agree, however this is the definition we agreed on here.
 
-- [snip --
--> 
-->         o  Subnet, Ethernet: a single segment, or a set of segments 
-->            interconnected by a CRED (see section 2.2); in the latter
--> 
--> sgai 11> There is no concept of subnet in IEEE. Subnet is typically 
--> an IP subnet, and, even if it is common to have one subnet per LAN, 
--> this is not the only possibility. Pragmatically IP subnets and 
--> Ethernet LAN are unrelated concepts.
--> 

Again, we are defining this term for use in this document.  Does its
use (not its definition) cause confusion or ambiguity?
 
-- [snip --
--> 
-->            case, the subnet may or may not be equivalent to a single 
-->            segment. Also a subnet may be referred to as a broadcast 
-->            domain or LAN. By definition, all nodes within an Ethernet 
-->            Subnet (broadcast domain or LAN) must have L2 connectivity 
-->            with all other nodes in the same Ethernet subnet. 
--> 
-->         o  TRILL: Transparent Interconnect over Lots of Links - the 
-->            working group and working name for the problem domain to be 
-->            addressed in this document. 
--> 
-->         o  Unicast Forwarding: forwarding methods that apply to frames 
-->            with unicast destination MAC addresses. 
--> 
-->         o  Unknown Destination - a destination for which a receiving 
-->            device has no filtering database entry.  Destination (layer
--> 
--> sgai 12> the stations receive the unknown unicast and have filtering
--> information, only the bridges don't. I propose to replace device with
--> bridge. 
-->

Again, it is the intention to provide as broad a definition as possible 
without introducing confusion or ambiguity.  An end node (or station) 
has - in a sense - a filtering database (it's mine, or it's for the bit 
bucket).

We need to explicitly include 802.1D forwarding devices and RBridges.

However, the definition should specify "forwarding device" - as opposed
to just "receiving device."

I will change that.
 
-- [snip --
-->  
-->            2) addresses are typically "learned" by (layer 2) forwarding 
-->            devices via a process commonly referred to as "bridge
-->            learning". 
--> 
--> Sgai 13> in IEEE, the term used is "learning" instead of "bridge
--> learning"
--> 

However, the term defined in this document is "bridge learning."
 
-- [snip --
-->
-->         o  VLAN: Virtual Local Area Network. VLANs in general fall into 
-->            two categories: link (or port) specific VLANs and tagged 
-->            VLANs. In the former case, all frames forwarded and all 
-->            directly connected nodes are assumed to be part of a single 
-->            VLAN.  In the latter case, VLAN tagged frames are used to 
-->            distinguish which VLAN each frame is intended for. 
--> 
--> Sgai 14> This definition is not completely correct, I prefer:
--> 
--> VLAN technology introduces the following three basic types of frame:
--> a) Untagged frames;
--> b) Priority-tagged frames; and
--> c) VLAN-tagged frames.
--> 
--> An untagged frame or a priority-tagged frame does not carry any
--> identification of the VLAN to which it belongs. Such frames are
--> classified as belonging to a particular VLAN based on parameters
--> associated with the receiving Port, or, through proprietary extensions 
--> to IEEE 802.1Q standard, based on the data content of the frame (e.g., 
--> MAC Address, layer 3 protocol ID, etc.).
--> 
--> A VLAN-tagged frame carries an explicit identification of the VLAN to
--> which it belongs; i.e., it carries a tag header that carries a non-null
--> VID. Such a frame is classified as belonging to a particular VLAN based
--> on the value of the VID that is included in the tag header. The presence
--> of the tag header carrying a non-null VID means that some other device,
--> either the originator of the frame or a VLAN-aware Bridge, has mapped
--> this frame into a VLAN and has inserted the appropriate VID.
--> 

So, you're getting into the details of the technology and - in particular -
the encapsulation.  I will expand the definition to include a possibility
that other criteria may be used to define a "VLAN port" - although this is
isomorphic to a logical model in which a device implementation uses some
criteria to decide that a subset of the traffic received on a specific 
physical port is to be forwarded as if received on a specific logical port.

The key point in this definition is that a VLAN is a "Virtual LAN" - meaning
it consists of a subset of the physical and L2 connectivity corresponding to
a "logical LAN."  The intent is to "rise above" the technological approaches
and encapsulations to establish a generic definition that is loosely tied to
ways that this generic definition is actually implemented.

Again, bearing in mind the way that the term is defined, does the usage of
the term cause confusion or ambiguity?
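
As a rough illustration of the logical model described above (a frame is
assigned to a VLAN either by an explicit tag or by criteria tied to the
receiving port), here is a minimal Python sketch.  The names Frame,
classify_vlan and port_default_vid are hypothetical and come neither from
the draft nor from 802.1Q; only tag-based and port-based classification
are shown.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Frame:
    dst_mac: str
    src_mac: str
    # None = untagged, 0 = priority-tagged (null VID), otherwise the VLAN ID.
    vid: Optional[int] = None


def classify_vlan(frame: Frame, rx_port: int,
                  port_default_vid: Dict[int, int]) -> int:
    """Return the VLAN that a received frame is treated as belonging to."""
    if frame.vid:
        # VLAN-tagged frame carrying a non-null VID: the tag decides.
        return frame.vid
    # Untagged or priority-tagged: fall back to the receiving port's VLAN.
    return port_default_vid[rx_port]
```

Other classification criteria (MAC address, layer 3 protocol ID) would
simply be further branches ahead of the per-port fallback.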
 
-- [snip --
--> 
-->         o  CRED Forwarding Table (CFT): the per-hop forwarding table 
-->            populated by the RBridge Routing Protocol; forwarding within 
-->            the CRED is based on a lookup of the CRED Transit Header 
-->            (CTH) encapsulated within the outermost received L2 header. 
-->            The outermost L2 encapsulation in this case includes the 
-->            source MAC address of the immediate upstream RBridge 
-->            transmitting the frame and destination MAC address of the 
-->            receiving RBridge for use in the unicast forwarding case. 
--> 
--> Sgai 15> In section 7 of
--> http://www.ietf.org/internet-drafts/draft-gai-perlman-trill-encap-00.txt
--> we proposed that when two rbridges are connected by a point to point
--> link the outer MAC addresses may be set to a predefined value in
--> transmission and ignored in reception.  
--> 

I'm not sure how your proposal intends to determine that a link is in
fact point-to-point and does not just look that way.

I prefer to use the same encapsulation in all cases where it will work.

The optimization associated with using some form of null-encapsulation
is of dubious value, given that it may not be possible to be certain a
point-to-point link is - and will remain - in fact a point-to-point
link.
 
-- [snip --
--> 
-->         o  CFT-IRT: a forwarding table used for propagation of 
-->            broadcast, multicast or flooded frames along the Ingress 
-->            RBridge Tree (IRT). 
--> 
--> Sgai 16> is it a forwarding table or is it a filtering database. Since
--> the "miss" behavior is to flood, I see it as a filtering database.
--> 

What state was "miss" behavior from - Hawaii?  :-)

It is analogous to a filtering database entry, but it is not that.

Clearly there are more things in this world than can be classified
in this taxonomy.  However, given these choices, it is a forwarding
table.

This is not a misbehavior, in that the same "base" tree is used for
broadcast and multicast traffic as well as flooded traffic.  It may
be arguable that flooding is a misbehavior, but I seem to recall
that people still see it as a necessary evil.

It is also not "miss" behavior (as in "cache miss") in the multicast
and broadcast case, either.

I believe the definition is quite correct and easy to understand,
provided you're not reading it with preconceived misconceptions
about its meaning.
 
-- [snip --
-->
-->         o  CRED Transit Header (CTH): a 'shim' header that encapsulates 
-->            the ingress L2 frame and persists throughout the transit of a
-->            CRED, which is further encapsulated within a hop-by-hop L2 
-->            header (and trailer). The hop-by-hop L2 encapsulation in this
-->            case includes the source MAC address of the immediate 
-->            upstream RBridge transmitting the frame and destination MAC 
-->            address of the receiving RBridge - at least in the unicast 
-->            forwarding case. 
--> 
--> Sgai 17> is this true also for unknown unicast?
--> 

That is something that will be (may be already) decided in the protocol
specification.
 
-- [snip --
-->
-->         o  CRED Transit Table (CTT): a table that maps ingress frame L2 
-->            destinations to egress RBridge addresses, used to determine 
-->            encapsulation of ingress frames for transit of the CRED. 
--> 
-->         o  Cooperating RBridges - those RBridges within a single 
-->            Ethernet Subnet (broadcast domain or LAN) not having been 
-->            configured to ignore each other. By default, all RBridges 
-->            within a single Ethernet subnet will cooperate with each 
-->            other. It is possible for implementations to allow for 
-->            configuration that will restrict "cooperation" between an 
-->            RBridge and an apparent neighboring RBridge.  One reason why 
-->            this might occur is if the trust model that applies in a 
-->            particular deployment imposes a need for configuration of 
-->            security information.  By default no such configuration is 
-->            required however - should it be used in any specific scenario
--> 
-->            - it is possible (either deliberately or inadvertently) to 
-->            configure neighboring RBridges so that they do not cooperate.
--> 
-->            In the remainder of this document, all RBridges are assumed 
-->            to be in a cooperating (default) configuration. 
--> 
--> Sgai 18> can RBridges cooperate in groups, e.g. four Rbridges connected
--> to a LAN cooperating two and two?
--> 

Yes.  There may be reasons why this might be done deliberately, however
- even in the absence of any "proof of concept" justification - it is a
really good idea to design the protocol to be robust in cases where a
set of RBridges may be (mis)configured in such a way as to be unable to
discover each other.
 
-- [snip --
--> 
-->         o  Ingress RBridge Tree: a tree computed for each edge RBridge
-->            and potentially for each VLAN in which that RBridge 
--> 
--> sgai 19> why for each VLAN? I got the impression that there 
--> was a single tree that was pruned differently for different VLANs.
--> 

Pruning may also take place at the ingress, if - for example - it has a
subset of interfaces that are not connected to any egress for particular
VLANs.  It is a small point but, in such cases, there is in effect more
than one IRT even at the ingress.
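
To make the "one tree, pruned differently per VLAN" view concrete, here
is a small Python sketch.  It assumes the unpruned IRT is given as a
mapping from each RBridge to its children, and that the set of egress
RBridges with members in a given VLAN is known; the function name prune
and both data shapes are illustrative assumptions, not part of the draft.

```python
def prune(tree, root, egresses):
    """Return the part of `tree` (rooted at `root`) needed to reach `egresses`.

    `tree` maps each RBridge to the list of its children in the unpruned
    IRT.  A node is kept if it is itself an egress for the VLAN or if at
    least one of its subtrees contains such an egress.
    """
    pruned = {}

    def visit(node):
        keep = node in egresses
        kept_children = []
        for child in tree.get(node, []):
            if visit(child):
                kept_children.append(child)
                keep = True
        if keep:
            pruned[node] = kept_children
        return keep

    visit(root)
    return pruned
```

Calling prune once per VLAN yields a different effective tree for each
VLAN, including at the ingress itself when some of its interfaces lead to
no egress for that VLAN.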
 
-- [snip --
-->
-->            participates - for delivery of broadcast, multicast and 
-->            flooded frames from that RBridge to all relevant egress 
-->            RBridges. This is the point-to-multipoint delivery tree used 
-->            by an ingress RBridge to deliver multicast, broadcast or 
-->            flooded traffic.  
--> 
--> Sgai 20> the current version of the proposal speaks about a multicast
--> address, not point-to-point.
--> 

I did not say "point-to-point" (look again).
 
-- [snip --
-->
-->         o  LPT: Learned Port Table. See Filtering Database. 
--> 
--> Sgai 21> not proper terminology, I would use "filtering database"
--> everywhere.
-->  

I am happy to remove this if there is no objection to my doing so.
 
-- [snip --     
--> 
--> sgai 22> A wired port is Ethernet, i.e. IEEE 802.3, a 
--> wireless port is not Ethernet, it is IEEE 802.11.
--> 

See my response to your point (sgai 8) above.
 
-- [snip --
--> 
--> sgai 23> they learn because STP guarantees symmetrical forwarding
--> 

This (apparently) refers to the immediately preceding text:

        Conventional bridges contain a learned port table (LPT), or 
        filtering database, and a spanning tree table (STT). The LPT 
        allows a bridge to avoid flooding all received frames, as is 
        typical for a hub or repeater. The bridge learns which nodes are
        accessible from a particular port by assuming bi-directional 
        consistency: 

I'm not sure how picking at the peculiarities of STP behavior is 
relevant to this description.  STP results in a single spanning 
tree where each link is bi-directional.  This ensures that the
MAC frames are forwarded in a bi-directionally consistent fashion.

To replace this text with yours is to provide less information.
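
For readers less familiar with the learning behavior being described,
the following Python sketch shows the bi-directional consistency
assumption at work: a frame seen arriving on a port from a source
address is taken as evidence that the same address is reachable through
that port.  The class and method names are hypothetical, and aging of
learned entries is deliberately left out.

```python
class LearningBridge:
    def __init__(self, ports):
        self.ports = set(ports)
        self.fdb = {}  # filtering database: MAC address -> port

    def receive(self, src_mac, dst_mac, in_port):
        """Learn the source address, then return the ports to forward on."""
        # Bi-directional consistency: src_mac is reachable via in_port.
        self.fdb[src_mac] = in_port
        out_port = self.fdb.get(dst_mac)
        if out_port is None:
            # Unknown destination: flood on every port except the input.
            return self.ports - {in_port}
        if out_port == in_port:
            # Destination lives on the segment the frame came from: filter.
            return set()
        return {out_port}
```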
 
-- [snip --
-->  
--> Sgai 24> active ports -> forwarding ports
--> 

"Active ports" was the exact wording suggested to me.  In context for
this working group "active ports" is more meaningful than "forwarding
ports."  For people not used to STP-speak, "forwarding port" might be
used to discriminate from a "code port."
 
-- [snip --
--> 
--> Sgai 25> there is no STT, there is a state associated with each port
--> that can be: disabled, blocking, listening, learning, and forwarding
--> 

Other than the issue with terminology, is this text wrong?  I am fairly
sure that different implementations handle the "port state" information
in different ways, and I am also reasonably sure that a "table" such as
the one described here is one of the ways it might be done.

However, I agree with your assertion that this is the way that it is 
usually talked about in an IEEE context, so - unless there are specific
objections - I can change the wording to be consistent with what you
suggest.
 
-- [snip --
--> 
--> sgai 26> disabled -> blocking
--> 

I can make this change as well.  However, from the perspective of what
we are trying to do, "disabled" is potentially a more correct term.  It 
is certainly more intuitively correct, outside of a strictly IEEE/STP
context.

Symmetry is maintained in STP by blocking ports/links bi-directionally.  
In such cases, a port is effectively disabled for bridges at either end
of the link for which a port is blocked, is it not?
 
-- [snip --
--> 
--> sgai 27> I repeat a comment that I have made to other documents: "
--> The discussion about VLAN needs to be much more extensive. It is clear
--> from the mailing list discussion that VLANs can be used inside the
--> packet or in the Ethernet encapsulation of TRILL. These are two
--> different kinds of VLANs and their requirement need to be stated
--> separately. Q in Q needs also to be discussed. I propose to define inner
--> and outer VLANs (with reference to the position of the tag in the
--> frame)."
--> Here I think we are talking about outer VLANs
--> 

I responded to this in separate mail.  I wait to hear other opinions on
this subject.
 
-- [snip --
--> 
--> Sgai 28> IMO all RBridges must be ingress RBridges, at least to support
--> inband management, e.g. SNMP.
--> 

No.

In order to ensure symmetry with RBridges not participating in STP, there 
MUST be a designated RBridge that does all of the sending and receiving
of native encapsulated frames on a LAN segment.

Any RBridge that loses the "Designated RBridge" election cannot be either
an ingress or an egress for that LAN segment.
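
The rule can be sketched in a few lines of Python.  The election
criterion used here (lowest identifier wins) is purely an assumption for
illustration; the actual election is a matter for the protocol
specification.  Both function names are hypothetical.

```python
def elect_designated(rbridge_ids):
    """Pick the Designated RBridge among those attached to one LAN segment."""
    # Assumption: the lowest identifier wins.  The real tie-breaker is
    # not defined in this discussion.
    return min(rbridge_ids)


def may_handle_native_frames(local_id, rbridge_ids_on_segment):
    """True only for the RBridge that won the election on this segment."""
    return local_id == elect_designated(rbridge_ids_on_segment)
```

An RBridge for which may_handle_native_frames is false on a segment
neither sends nor receives native (unencapsulated) frames there, so it is
neither an ingress nor an egress for that segment.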
 
-- [snip -- 
--> 
--> Sgai 29> same as above
--> 

Same as above.
 
-- [snip --
-->  
--> Sgai 30> I think the previous definition is not needed.
--> 

This appears to refer to:

        o  Local RBridge - the RBridge that forms and maintains the CFT-
           IRT entry (or entries) under discussion. The local RBridge 
           may be an Ingress RBridge, or an egress RBridge with respect 
           to any set of entries in the CFT-IRT. 

This definition is needed.  It is subsequently used in at least 4 places.
When discussing the behavior of a specific RBridge relative to a peer,
it is convenient to define the term "local RBridge."

-- [snip --
--> 
--> sgai 31> why is it zero or more, if an RBridge exists, it must have
--> an IRT, I haven't seen any discussion to support multiple IRTs.
--> 

This was answered previously on the mailing list.  Briefly, zero or
more is correct, given that an RBridge may not have forwarding entries
for frames it has received.  In these cases, a frame is not forwarded.
 
-- [snip --
--> 
--> Sgai 32> I don't understand this. Since the current proposal uses a
--> multicast MAC address, what is needed is a bit map for each IRT that
--> says which ports are blocking and which are forwarding. You are
--> basically building a ST using ISIS.
--> 

This might be the case for a specific implementation.  Whether it is
implemented as a "bit-mask" with "non-blocking" (forwarding) ports, or
is implemented as a full-blown table is largely dependent on what other
information the local device requires in order to properly forward the
frame.  If - for example - a device has multiple different port types,
it may need to have more information for each port (or that information
may be added later on).

All of these are implementation choices that are logically represented
by the table described here.
 
-- [snip --
-->  
--> Sgai 33> I don't get the pair.
--> 

Since this immediately follows:

        Each entry would contain an indication of which single interface
        a broadcast, multicast or flooded frame would be forwarded for 
        each (ingress RBridge, egress RBridge) pair.

I don't get what you don't get.

The pair refers to the tuple "(ingress RBridge, egress RBridge)."

This might be the point at which your earlier point (Sgai 4) would make
sense to insert.  I had logically modeled this in my own mind as two
distinct "interfaces" (the reason I use this term, rather than port),
but I should clarify this.

In any case, the entries are keyed by both ingress and egress RBridge.
While there will be entries for only those egress RBridges where this
local RBridge is on the shortest path (between the given ingress and
egress pair), there will be at most one entry per any ingress and
egress pair.
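
A minimal Python sketch of such a table, keyed by the (ingress RBridge,
egress RBridge) pair, might look as follows.  The class name CftIrt, its
methods and the interface labels are hypothetical; this is a logical
model only, not a suggested implementation.

```python
class CftIrt:
    def __init__(self):
        # (ingress RBridge, egress RBridge) -> single outgoing interface
        self._entries = {}

    def install(self, ingress, egress, interface):
        # Keying by the pair guarantees at most one entry per pair:
        # installing again for the same pair replaces the old entry.
        self._entries[(ingress, egress)] = interface

    def interfaces_for(self, ingress):
        """Interfaces on which a frame from `ingress` must be sent."""
        return {iface for (ing, _egr), iface in self._entries.items()
                if ing == ingress}

    def __len__(self):
        return len(self._entries)
```

Several pairs may share one interface, in which case a single copy of
the frame on that interface serves all of the corresponding egress
RBridges.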
 
-- [snip --
-->  
--> Sgai 34> as a matter of fact each interface is basically a set of two
--> interfaces, a regular one and a tunnel one, and the forwarding/blocking
--> state may be different for the two.
--> 

No.

As per my response to your point (Sgai 28) above, not every RBridge has a
"regular one" as you describe here.

The forwarding/blocking state is not applicable as RBridges don't do STP.

For the native interface, forwarding/blocking state is analogous to the
"Designated RBridge" election process.

Since this point evidently applies to -

                                                     Entries would also 
        contain any required encapsulation information, etc. required 
        for forwarding on a given interface, and toward a corresponding 
        specific egress RBridge.

- your point, and my response, are related to the point (and response) 
above (Sgai 33), and I will try to clarify this part as well.
 
-- [snip --
--> 
--> Sgai 35> this protocol must be designed to avoid transient loops, since
--> transient loops of multicast/broadcast cause broadcast storm that are
--> highly undesirable.
-->  

No.

The protocol needs to include a provision to prevent _use_ of links that 
may connect to transient loops.  Using a link-state routing protocol has
clearly been demonstrated to produce transient loops, but the problem you
want to address has to do with using those links.

A state-machine driven mechanism similar to STP might be the approach to
use.

Because we're incorporating TTL in the SHIM, however, this would need to 
apply only to IRT traffic.
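
The reason TTL is enough for the unicast case can be seen in a short
Python sketch.  The exact TTL semantics (decrement on each hop, discard
rather than forward once the value would reach zero) are an assumption
here, and the function names are hypothetical.

```python
def next_hop_ttl(ttl):
    """Return the TTL to forward with, or None if the frame is dropped."""
    if ttl <= 1:
        return None
    return ttl - 1


def hops_before_drop(initial_ttl):
    """Count how many times a frame caught in a loop is forwarded."""
    hops = 0
    ttl = next_hop_ttl(initial_ttl)
    while ttl is not None:
        hops += 1
        ttl = next_hop_ttl(ttl)
    return hops
```

A single unicast frame in a transient loop is therefore forwarded a
bounded number of times.  Frames on the IRT are replicated at each
branch, so TTL alone does not bound the total number of copies, which is
why the additional mechanism is needed only for IRT traffic.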
 
-- [snip --
--> 
--> 
--> Sgai 36> see my previous comment about VLANs
--> 

See my previous responses.
 
-- [snip --
--> 
--> sgai 37> disabled -> blocking.
-->  

See my response to your point (sgai 26) above.
 
-- [snip --
--> 
--> Sgai 38> for multicast/broadcast we also need to avoid transient loops.
--> 

Yes.
 
-- [snip --
--> 
--> Sgai 39> but RBridge discovery and STP are ongoing processes, why do we
--> want to couple their timers?
--> 

I am not suggesting "coupling their timers."  I am saying that the value
chosen for a timer (if one is used) to determine when it is reasonable to
start RBridge peer discovery should take into account the time it takes
for the local spanning tree resolution.

I feel the reason for this is self-evident but, just to clarify, think
about the process we're planning to use to determine RBridge nick-names.
If we start this too early, we will be re-starting it many times as we
"hear from" new RBridge peers when the connecting links go active after
local bridges go to the forwarding state.  This would apply only at the
system start up as - after that - you are quite correct in asserting it
would be an ongoing process.

Perhaps I should add some words to indicate that a delay would not be
necessary if the protocol mechanisms do not introduce instability as a
new peer is discovered.  So far, I have not seen any indication that a
"race-free" solution to accomplish this has been designed - or talked
about.
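
As a back-of-the-envelope illustration, assuming classic 802.1D STP in
which a port spends one forward-delay interval in the listening state and
another in the learning state (15 seconds each by default), a start-up
delay could be derived as in the Python sketch below.  The function name
and the safety margin are hypothetical.

```python
def discovery_start_delay(forward_delay_s=15.0, margin_s=5.0):
    """Seconds to wait at start-up before beginning RBridge peer discovery."""
    # Listening + learning each last one forward-delay interval, so local
    # bridge ports reach the forwarding state after about 2 * forward delay.
    return 2 * forward_delay_s + margin_s
```

The value is chosen with STP timing in mind, but nothing couples the two
timers at run time.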
 
-- [snip --
--> 
--> Sgai 40> there is also a requirement to time-out learnt information to
--> maintain the filtering databases.
-->  

There would be, if we were doing that.  Instead, we're adopting a 
link-state routing protocol and they tend to have that covered.
 
-- [snip --
--> 
--> Sgai 41> periodically or on demand
--> 

See the response to your point (Sgai 40) above.
 
-- [snip --
--> 
--> Sgai 42> potentially there is an unencapsulated interface for each
--> physical interface of the RBridge. It is true that you can model all
--> of them as a single separate logical interface, but then we need to 
--> replicate the frame according to a bitmask that tells on which physical
--> interface the RBridge is designated.
--> 

Again, your use of a "bitmask" is an implementation choice as opposed
to a logical model.

As you observe, I do "model all of them as a single separate logical 
interface" and if you want to "replicate the frame according to a 
bitmask that tells on which physical interface the RBridge is 
designated" - you're absolutely free to do so.

Personally, I think this is far too much implementation stuff for a
protocol specification, let alone an architecture document.
 
--> 
--> Sgai 43> can we clarify that this means "drop BPDUs".
--> 

Yes.
 
-- [snip --

