From d3e3e3 at gmail.com Wed Feb 4 20:01:15 2009
From: d3e3e3 at gmail.com (Donald Eastlake)
Date: Wed, 4 Feb 2009 23:01:15 -0500
Subject: [rbridge] draft protocol-10 WGLC Multicast Addresses
In-Reply-To: <49356EF7.8070105@cisco.com>
References: <1028365c0811261408o459c3d19jc23b3235bf2f9ef4@mail.gmail.com>
	<4C94DE2070B172459E4F1EE14BD2364E023D9B71@HQ-EXCH-5.corp.brocade.com>
	<1028365c0812011957i590f446cp2a3703929bd845ac@mail.gmail.com>
	<4C94DE2070B172459E4F1EE14BD2364E023D9BDA@HQ-EXCH-5.corp.brocade.com>
	<49356EF7.8070105@cisco.com>
Message-ID: <1028365c0902042001s56b829f7v7e94185e936c001a@mail.gmail.com>

See below...

On Tue, Dec 2, 2008 at 12:23 PM, Dinesh G Dutt wrote:
>
> To follow the analogy of SVLAN BPDUs and the behavior of CVLAN BPDUs in
> an SVLAN cloud, since the CVLAN BPDUs are treated as data by the SVLAN
> bridges, the SVLAN RBridge would encapsulate the CVLAN IS-IS (which has
> a different MAC address than SVLAN IS-IS) in a TRILL header while
> transporting it through the provider RBridge cloud. This will work fine
> without having to reserve a bunch of MAC addresses.
>
> Dinesh
>
> Anoop Ghanwani wrote:

Dinesh, every time I read your message above, it seems to support my
point, not oppose it. CVLAN BPDUs and SVLAN BPDUs have different MAC
addresses, presumably so that customer bridges will ignore and drop
SVLAN BPDUs and provider bridges know to treat CVLAN BPDUs as data.

So, the closest analogy to a bridge BPDU for an RBridge is a TRILL
IS-IS Hello. Before the -11 protocol draft, when TRILL IS-IS frames
were TRILL encapsulated, customer RBridge Hellos and provider RBridge
Hellos could have been distinguished by using a reserved nickname in
the provider RBridge Hellos. (Of course, this is all hypothetical
since there are no provider RBridges currently specified.) Since TRILL
IS-IS Hellos are no longer encapsulated, it seems to make sense that
customer RBridge IS-IS frames (including Hellos) would be
distinguished from provider RBridge IS-IS frames (including Hellos) by
different destination MAC multicast addresses, which you confirm when
you say "the CVLAN IS-IS (which has a different MAC address than SVLAN
IS-IS)".

While you could get this SVLAN IS-IS multicast MAC address in a new
allocation, it seems better to get a small block initially and
allocate any MAC addresses needed later out of that block, and it
seems essential to do this if you want to have CVLAN RBridges drop
these frames. To be specific: assume you want the behavior that some
future type of frame will be discarded by the currently defined
'customer' RBridges. Two obvious ways to do this are based on MAC
address or, for a TRILL encapsulated frame, nickname. When all TRILL
frames were encapsulated (which I preferred), there was no particular
reason not to use nickname. Now that not all TRILL frames are
encapsulated, it seems desirable to be able to get this behavior with
MAC address, which requires that some MAC addresses to be dropped by
all current RBridges be allocated now and this behavior included in
the protocol spec.

Thanks,
Donald

>> Donald,
>>
>> With respect to the example you mention below...
>>
>> It is true that the BPDUs from C-VLAN Bridges are
>> treated as data by S-VLAN Bridges, but they have an
>> S-tag in them when transported over provider
>> bridges. In our case, if we wanted to
>> design such a hierarchy, we would have to depend
>> on the TRILL header to separate traffic
>> belonging to different "customers",
>> including the control traffic. In other words,
>> we would have no choice but to encapsulate the
>> customer IS-IS frames with a TRILL header.
>>
>> In any case, as I stated earlier, I don't see
>> it hurting anything to reserve the block of MAC
>> addresses.
>>
>> Anoop
>>
>>> -----Original Message-----
>>> From: Donald Eastlake [mailto:d3e3e3 at gmail.com]
>>> Sent: Monday, December 01, 2008 7:58 PM
>>> To: Anoop Ghanwani
>>> Cc: Developing a hybrid router/bridge.
>>> Subject: Re: [rbridge] draft protocol-10 WGLC Multicast Addresses
>>>
>>> Hi Anoop,
>>>
>>> On Mon, Dec 1, 2008 at 9:27 PM, Anoop Ghanwani wrote:
>>>>
>>>> It probably wouldn't hurt, but I'm not sure that this is at all
>>>> necessary.
>>>>
>>>> Other than the core IS-IS instance, all other frames have a TRILL
>>>> header and we can control forwarding behavior using those contents
>>>> (and we have already reserved NickName values for that purpose).
>>>> In future, if we define anything that we don't want to be forwarded
>>>> by RBridges, we can always force it to have the TRILL header.
>>>> So we are not dependent on MAC addresses...
>>>
>>> You are probably right that you could figure out some way to do
>>> whatever you wanted with reserved nicknames or other tweaking of the
>>> TRILL header but it might not be very simple or efficient.
>>>
>>> Just as an example, if you wanted to specify Provider RBridges that
>>> related to the current RBridge specification the same way Provider
>>> Bridges relate to customer Bridges, one obvious way to do this would
>>> include a new multicast address (All-Provider-IS-IS-RBridges?) for
>>> provider core IS-IS messages, the same way Provider Bridges use a
>>> different multicast address for their BPDUs and, as I understand it,
>>> simply forward customer Bridge BPDUs...
>>>
>>> This could alternatively be done as you suggest, but that would
>>> require encapsulating the Provider RBridge IS-IS messages with a funny
>>> TRILL Header and, I think, some people on this list really like
>>> dispatching on the multicast address and don't like encapsulating
>>> IS-IS...
>>>
>>> Donald
>>>
>>>> Anoop
>>>>
>>>> ________________________________
>>>> From: rbridge-bounces at postel.org
>>>> [mailto:rbridge-bounces at postel.org] On Behalf Of Donald Eastlake
>>>> Sent: Wednesday, November 26, 2008 2:09 PM
>>>> To: Developing a hybrid router/bridge.
>>>> Subject: [rbridge] draft protocol-10 WGLC Multicast Addresses
>>>>
>>>> Hi,
>>>>
>>>> When TRILL started, it had only one multicast address: All-RBridges.
>>>> Then it was decided that encapsulated IS-IS frames would have an
>>>> Inner.MacDA of All-IS-IS-RBridges and there were two. Now there are
>>>> three multicast addresses: (1) IS-IS frames are no longer
>>>> encapsulated and All-IS-IS-RBridges is their Outer.MacDA,
>>>> (2) All-RBridges is the Outer.MacDA for ESADI and multi-destination
>>>> data frames, and (3) All-ESADI-RBridges is the Inner.MacDA for ESADI
>>>> frames.
>>>>
>>>> I don't think we are going to need any more than these three
>>>> multicast addresses for the Base Protocol Specification but
>>>> multicast addresses are cheap. 802.1 initially allocated itself a
>>>> block of 16 addresses for bridging and link protocols (see, for
>>>> example, 802.1D-2004 Figure 7-10 or the more recent 802.1Q-2005
>>>> Table 8-1) with the defined behavior being that a bridge was
>>>> required to drop any frame sent to one of these addresses if the
>>>> bridge did not understand the protocol(s) indicated by that address.
>>>> This sort of behavior has to be specified at the beginning. Once you
>>>> start shipping devices that are transparent to some addresses, you
>>>> can't practically later say they have to drop them if they don't
>>>> know the protocol(s) associated with those addresses. (This behavior
>>>> for bridges has been somewhat modified for more recent complicated
>>>> cases like provider bridging.)
>>>>
>>>> So, I propose that, when we apply, we get a block of 16 addresses
>>>> with the ones listed in the first paragraph above being the first
>>>> three addresses in this block and the remaining 13 being reserved
>>>> for future use. And that the protocol specification require RBridges
>>>> to drop frames with Outer.MacDA being any of these 13 addresses
>>>> (unless the RBridge understands some future use of that address).
>>>>
>>>> Thanks,
>>>> Donald
>>>> =============================
>>>> Donald E. Eastlake 3rd +1-508-634-2066 (home)
>>>> 155 Beaver Street
>>>> Milford, MA 01757 USA
>>>> d3e3e3 at gmail.com
>>
>> _______________________________________________
>> rbridge mailing list
>> rbridge at postel.org
>> http://mailman.postel.org/mailman/listinfo/rbridge
>
> --
> We make our world significant by the courage of our questions and by
> the depth of our answers. - Carl Sagan

--
=============================
Donald E.
Eastlake 3rd +1-508-634-2066 (home)
155 Beaver Street
Milford, MA 01757 USA
d3e3e3 at gmail.com

From ddutt at cisco.com Thu Feb 5 05:30:03 2009
From: ddutt at cisco.com (Dinesh G Dutt)
Date: Thu, 05 Feb 2009 05:30:03 -0800
Subject: [rbridge] draft protocol-10 WGLC Multicast Addresses
In-Reply-To: <1028365c0902042001s56b829f7v7e94185e936c001a@mail.gmail.com>
References: <1028365c0811261408o459c3d19jc23b3235bf2f9ef4@mail.gmail.com>
	<4C94DE2070B172459E4F1EE14BD2364E023D9B71@HQ-EXCH-5.corp.brocade.com>
	<1028365c0812011957i590f446cp2a3703929bd845ac@mail.gmail.com>
	<4C94DE2070B172459E4F1EE14BD2364E023D9BDA@HQ-EXCH-5.corp.brocade.com>
	<49356EF7.8070105@cisco.com>
	<1028365c0902042001s56b829f7v7e94185e936c001a@mail.gmail.com>
Message-ID: <498AE9DB.4040000@cisco.com>

Donald,

> Dinesh, every time I read your message above, it seems to support my
> point, not oppose it.
> CVLAN BPDUs and SVLAN BPDUs have different MAC addresses, presumably
> so that customer bridges will ignore and drop SVLAN BPDUs and provider
> bridges know to treat CVLAN BPDUs as data.

SVLAN RBridges don't need to know the CVLAN IS-IS MAC address to forward
them as data. They treat one set of MAC addresses as BPDUs, not to be
forwarded, and the rest as data. If you want regular RBridges to drop
provider RBridge BPDUs, then yes, you do need to reserve that address now.

In general, I'm not averse to reserving some addresses and I don't believe I
was disagreeing with your point except as stated above,

Dinesh

--
We make our world significant by the courage of our questions and by the
depth of our answers.
- Carl Sagan

From d3e3e3 at gmail.com Thu Feb 5 06:49:05 2009
From: d3e3e3 at gmail.com (Donald Eastlake)
Date: Thu, 5 Feb 2009 09:49:05 -0500
Subject: [rbridge] draft protocol-10 WGLC Multicast Addresses
In-Reply-To: <498AE9DB.4040000@cisco.com>
References: <1028365c0811261408o459c3d19jc23b3235bf2f9ef4@mail.gmail.com>
	<4C94DE2070B172459E4F1EE14BD2364E023D9B71@HQ-EXCH-5.corp.brocade.com>
	<1028365c0812011957i590f446cp2a3703929bd845ac@mail.gmail.com>
	<4C94DE2070B172459E4F1EE14BD2364E023D9BDA@HQ-EXCH-5.corp.brocade.com>
	<49356EF7.8070105@cisco.com>
	<1028365c0902042001s56b829f7v7e94185e936c001a@mail.gmail.com>
	<498AE9DB.4040000@cisco.com>
Message-ID: <1028365c0902050649u593cb59h826fa9306fdebd56@mail.gmail.com>

OK, thanks,
Donald

--
=============================
Donald E. Eastlake 3rd +1-508-634-2066 (home)
155 Beaver Street
Milford, MA 01757 USA
d3e3e3 at gmail.com

From james.d.carlson at sun.com Thu Feb 12 11:19:43 2009
From: james.d.carlson at sun.com (James Carlson)
Date: Thu, 12 Feb 2009 14:19:43 -0500
Subject: [rbridge] potential L2 forwarding loop issue in RBridges
Message-ID: <18836.30287.649263.814351@gargle.gargle.HOWL>

I've discovered something in my testing of our TRILL/RBridge
implementation that I think might be significant. If I'm right, it
means that something is missing from our current protocol draft
(likely section 4.2.3.3).

An RBridge by itself (no other RBridges in sight on the wire) will
become DRB on every link it has and, since it sees no other RBridge,
will also become AF for all configured VLANs on each link. It will
begin learning and forwarding on all of those links. Logically, since
there are no other RBridges to talk to, it will (at least as viewed
from outside; the internal design doesn't matter) simply forward L2
packets from one link to another.
Now suppose we have two RBridges connected like so:

         hub1
    |+----------+|
     |          |
  +--+--+    +--+--+
  | RB1 |    | RB2 |
  +--+--+    +--+--+
     |          |
    |+----------+|
         hub2

What should happen is that RB1 and RB2 will elect a DRB on each one
of those networks (hub1 and hub2), resulting in either L2 forwarding
through just a single RBridge (if AF is the same RBridge on each
network), or in a trip through TRILL encapsulation (if RB1 is AF on one
network and RB2 is AF on the other). So far, so good.

However, what happens if RB1 and RB2 cannot see each other because of
MTU restrictions in hub1 and hub2?

This is where things seem to get ugly. If we're using the normal
MTU-padded IS-IS packets for TRILL Hello messages, then those will get
dropped by the hubs. Both RBridges will declare themselves to be DRB
and AF for everything, and both will begin forwarding L2 frames.
Neither sees the other at all, except that for all frames sized to the
restriction or less, both will attempt to forward.

The loop avoidance in 4.2.3.3 doesn't apply, because the Hellos aren't
getting through; they're choked on MTU restrictions. The "root bridge
collision check" doesn't apply, because we're not actually
decapsulating a frame from some other RBridge. Chaos then ensues.

Conceptually, if we were (in this stand-alone L2 forwarding between
ports case) pretending to encapsulate in TRILL and then immediately
decapsulating, then the collision check would apply, but it wouldn't
actually work, because there *isn't* any 802.1D bridged connection
between hub1 and hub2, and thus they do have completely different root
bridges.

This is a more significant problem to me because we will naturally be
dealing with unusual packet sizes for TRILL. The path between TRILL
RBridges needs to be at least 24 octets bigger than it is for regular
bridges, and if we're to avoid accidentally trying to use unsuitable
paths, such as ordinary (non-jumbo-enabled) switches and hubs, then we
need to check for them, which is what IS-IS's MTU padding does.

In routing, the IS-IS Hello failure means that forwarding for
non-local destinations _stops_ through that link, which means that
loops can't form. Unfortunately, TRILL defines things such that if
IS-IS Hello fails, forwarding _starts_ through that link, and since
(unlike routing) bridging doesn't distinguish between local and
non-local, loops become possible. We have it backwards.

I can think of at least two ways to avoid this problem, but I think
I'd like to hear from the group before going ahead with a formal
description of one.

  A. Require TRILL IS-IS implementations to send very small Hello
     messages, not MTU-padded, and with as little information as we
     can manage. (If we need "large" Hellos, then send small ones
     occasionally as well.) This allows the loop avoidance part to
     do its work.

     (Or perhaps the MTU that IS-IS uses should not be the actual MTU
     needed along the path with TRILL overhead, but the MTU for data
     packets, since IS-IS is now going unencapsulated. But then this
     means that MTU-restricted paths are unprotected and cannot be
     detected by normal IS-IS operation. We end up with the black
     hole problem.)

  B. Require RBridge implementations to include STP as well. However,
     the STP portion in this solution has (by design) no effect on
     TRILL itself. Instead, we use STP's link "forwarding" state to
     gate the AF behavior in the following ways:

       i. When STP reports that the link is not in "forwarding"
          state, we refuse to become AF for any VLAN. We become AF
          only if STP reports "forwarding" for the link *and* the
          DRB appoints us to do it.

      ii. When encapsulating from or decapsulating to native format,
          examine both the AF flag and the STP state. Discard the
          packet if not AF or if STP state is not "forwarding."

Thus, I have at least two questions here:

  1. Have I correctly identified a real problem? (If not, then a
     bonus question would be "why isn't it a problem even though I've
     run into this?")

  2. Assuming it is a problem, which solution do we prefer?

(And perhaps a third question: does it bug anyone else that "MTU"
never appears in this document ... ?)

--
James Carlson, Solaris Networking
Sun Microsystems / 35 Network Drive        71.232W   Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757   42.496N   Fax +1 781 442 1677

From eric.gray at ericsson.com Thu Feb 12 13:04:25 2009
From: eric.gray at ericsson.com (Eric Gray)
Date: Thu, 12 Feb 2009 15:04:25 -0600
Subject: [rbridge] potential L2 forwarding loop issue in RBridges
In-Reply-To: <18836.30287.649263.814351@gargle.gargle.HOWL>
References: <18836.30287.649263.814351@gargle.gargle.HOWL>
Message-ID: <941D5DCD8C42014FAF70FB7424686DCF04A044B4@eusrcmw721.eamcs.ericsson.se>

James,

One of the basic things we need to be able to assume in
layer 2 forwarding technologies is that broadcast messages -
especially those associated with topology determination - will
be forwarded by the underlying network.

What would be the effect we could expect if BPDUs were not
forwarded?

Hence, the maximum size of our hello messages has to be
determined from the "real" MTU limitations of the underlying
network. Configure it if you have to.

--
Eric

-----Original Message-----
From: rbridge-bounces at postel.org [mailto:rbridge-bounces at postel.org] On Behalf Of James Carlson
Sent: Thursday, February 12, 2009 2:20 PM
To: TRILL/RBridge Working Group
Subject: [rbridge] potential L2 forwarding loop issue in RBridges

I've discovered something in my testing of our TRILL/RBridge
implementation that I think might be significant. If I'm right, it
means that something is missing from our current protocol draft
(likely section 4.2.3.3).
An RBridge by itself (no other RBridges in sight on the wire) will become DRB on every link it has and, since it sees no other RBridge, will also become AF for all configured VLANs on each link.

It will begin learning and forwarding on all of those links. Logically, since there are no other RBridges to talk to, it will (at least as viewed from outside; the internal design doesn't matter) simply forward L2 packets from one link to another.

Now suppose we have two RBridges connected like so:

             hub1
        |+----------+|
        |            |
     +--+--+      +--+--+
     | RB1 |      | RB2 |
     +--+--+      +--+--+
        |            |
        |+----------+|
             hub2

What should happen is that RB1 and RB2 will elect a DRB on each one of those networks (hub1 and hub2), resulting in either L2 forwarding through just a single RBridge (if AF is the same RBridge on each network), or in a trip through TRILL encapsulation (if RB1 is AF on one network and RB2 is AF on the other).

So far, so good. However, what happens if RB1 and RB2 cannot see each other because of MTU restrictions in hub1 and hub2?

This is where things seem to get ugly. If we're using the normal MTU-padded IS-IS packets for TRILL Hello messages, then those will get dropped by the hubs. Both RBridges will declare themselves to be DRB and AF for everything, and both will begin forwarding L2 frames. Neither sees the other at all, except that for all frames sized to the restriction or less, both will attempt to forward.

The loop avoidance in 4.2.3.3 doesn't apply, because the Hellos aren't getting through; they're choked on MTU restrictions. The "root bridge collision check" doesn't apply, because we're not actually decapsulating a frame from some other RBridge.

Chaos then ensues.
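The failure described above can be reproduced in a toy model. This is only a sketch under stated assumptions: the names `Hub`, `native_forwarders`, and `loop_possible` are made up for illustration, `min()` stands in for the real DRB election, and the frame sizes (a 1518-octet hub limit, a 1542-octet padded Hello) are illustrative values rather than anything taken from the draft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hub:
    name: str
    max_frame: int  # largest frame (octets) the hub passes; larger ones vanish silently


def native_forwarders(hub: Hub, rbridges: list, hello_size: int) -> set:
    """Return the RBridges that end up acting as AF (native forwarder) on this hub."""
    if hello_size <= hub.max_frame:
        # Hellos are heard: one DRB is elected and, in this toy model,
        # appoints itself AF. min() is only a stand-in for the election.
        return {min(rbridges)}
    # Hellos are dropped: each RBridge believes it is alone on the link,
    # so each one declares itself DRB and AF.
    return set(rbridges)


def loop_possible(hubs: list, rbridges: list, hello_size: int) -> bool:
    """True if two or more RBridges forward native frames on every hub."""
    on_all_hubs = set(rbridges)
    for hub in hubs:
        on_all_hubs &= native_forwarders(hub, rbridges, hello_size)
    return len(on_all_hubs) >= 2


hubs = [Hub("hub1", 1518), Hub("hub2", 1518)]
rbridges = ["RB1", "RB2"]
small_hello_loops = loop_possible(hubs, rbridges, hello_size=100)
padded_hello_loops = loop_possible(hubs, rbridges, hello_size=1542)
```

With small Hellos the two RBridges hear each other and only one forwards; with Hellos padded beyond what the hubs pass, both forward on both links, which is the loop.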
Conceptually, if we were (in this stand-alone L2 forwarding between ports case) pretending to encapsulate in TRILL and then immediately decapsulating, then the collision check would apply, but it wouldn't actually work, because there *isn't* any 802.1D bridged connection between hub1 and hub2, and thus they do have completely different root bridges. This is a more significant problem to me because we will naturally be dealing with unusual packet sizes for TRILL. The path between TRILL RBridges needs to be at least 24 octets bigger than it is for regular bridges, and if we're to avoid accidentally trying to use unsuitable paths, such as ordinary (non-jumbo-enabled) switches and hubs, then we need to check for them, which is what IS-IS's MTU padding does. In routing, the IS-IS Hello failure means that forwarding for non-local destinations _stops_ through that link, which means that loops can't form. Unfortunately, TRILL defines things such that if IS-IS Hello fails, forwarding _starts_ through that link, and since (unlike routing) bridging doesn't distinguish between local and non-local, loops become possible. We have it backwards. I can think of at least two ways to avoid this problem, but I think I'd like to hear from the group before going ahead with a formal description of one. A. Require TRILL IS-IS implementations to send very small Hello messages, not MTU-padded, and with as little information as we can manage. (If we need "large" Hellos, then send small ones occasionally as well.) This allows the loop avoidance part to do its work. (Or perhaps the MTU that IS-IS uses should not be the actual MTU needed along the path with TRILL overhead, but the MTU for data packets, since IS-IS is now going unencapsulated. But then this means that MTU-restricted paths are unprotected and cannot be detected by normal IS-IS operation. We end up with the black hole problem.) B. Require RBridge implementations to include STP as well. 
However, the STP portion in this solution has (by design) no effect on TRILL itself. Instead, we use STP's link "forwarding" state to gate the AF behavior in the following ways: i. When STP reports that the link is not in "forwarding" state, we refuse to become AF for any VLAN. We become AF only if STP reports "forwarding" for the link *and* the DRB appoints us to do it. ii. When encapsulating from or decapsulating to native format, examine both the AF flag and the STP state. Discard the packet if not AF or if STP state is not "forwarding." Thus, I have at least two questions here: 1. Have I correctly identified a real problem? (If not, then a bonus question would be "why isn't it a problem even though I've run into this?") 2. Assuming it is a problem, which solution do we prefer? (And perhaps a third question: does it bug anyone else that "MTU" never appears in this document ... ?) -- James Carlson, Solaris Networking Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084 MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677 _______________________________________________ rbridge mailing list rbridge at postel.org http://mailman.postel.org/mailman/listinfo/rbridge From james.d.carlson at sun.com Thu Feb 12 13:18:19 2009 From: james.d.carlson at sun.com (James Carlson) Date: Thu, 12 Feb 2009 16:18:19 -0500 Subject: [rbridge] potential L2 forwarding loop issue in RBridges In-Reply-To: <941D5DCD8C42014FAF70FB7424686DCF04A044B4@eusrcmw721.eamcs.ericsson.se> References: <18836.30287.649263.814351@gargle.gargle.HOWL> <941D5DCD8C42014FAF70FB7424686DCF04A044B4@eusrcmw721.eamcs.ericsson.se> Message-ID: <18836.37403.734102.790451@gargle.gargle.HOWL> Eric Gray writes: > James, > > One of the basic things we need to be able to assume in > layer 2 forwarding technologies is that broadcast messages - > especially those associated with topology determination - will > be forwarded by the underlying network. 
> > What would be the effect we could expect if BPDUs were > not forwarded? It's effectively the same result, at least from a high level. However, I'm talking specifically about IS-IS as used for TRILL, not BPDUs. The two are different. > Hence, the maximum size of our hello messages has to be > determined from the "real" MTU limitations of the underlying > network. Sadly, in the configuration I described, the "real" MTU limitation isn't knowable, at least in any automatic sense. More generally, the silent MTU problem which leads to black holes in the network (as was once created by FDDI/Ethernet bridges) is exactly why IS-IS has this feature. > Configure it if you have to. I don't think it's merely a matter of configuration. It may be possible to configure to avoid the problem on a particular network, but that renders the solution fragile. Accidents result in catastrophe, which is a bad thing for a protocol whose sole purpose in life is to avoid accidents. (If everyone could just configure L2 topologies without loops, we wouldn't need any of this fancy loop-detection stuff.) -- James Carlson, Solaris Networking Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084 MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677 From anoop at brocade.com Thu Feb 12 18:42:11 2009 From: anoop at brocade.com (Anoop Ghanwani) Date: Thu, 12 Feb 2009 18:42:11 -0800 Subject: [rbridge] potential L2 forwarding loop issue in RBridges In-Reply-To: <18836.30287.649263.814351@gargle.gargle.HOWL> References: <18836.30287.649263.814351@gargle.gargle.HOWL> Message-ID: <4C94DE2070B172459E4F1EE14BD2364E0286116A@HQ-EXCH-5.corp.brocade.com> I am not going to argue about whether or not this is a real problem since you have actually run into it. It may not be one that would happen to an informed user (per Eric's response), but then there is always an ill-informed implementer/deployer out there and if you run things according to spec, then such problems are a no-no. > A. 
Require TRILL IS-IS implementations to send very small Hello > messages, not MTU-padded, and with as little information as we > can manage. (If we need "large" Hellos, then send small ones > occasionally as well.) This allows the loop avoidance part to > do its work. > > (Or perhaps the MTU that IS-IS uses should not be the actual MTU > needed along the path with TRILL overhead, but the MTU for data > packets, since IS-IS is now going unencapsulated. But then this > means that MTU-restricted paths are unprotected and cannot be > detected by normal IS-IS operation. We end up with the black > hole problem.) This is certainly doable, but I'm not sure it's robust enough. It almost falls in the category of what Eric said. > B. Require RBridge implementations to include STP as well. > However, the STP portion in this solution has (by design) no > effect on TRILL itself. Instead, we use STP's link "forwarding" > state to gate the AF behavior in the following ways: > > i. When STP reports that the link is not in "forwarding" > state, we refuse to become AF for any VLAN. We become AF > only if STP reports "forwarding" for the link *and* the > DRB appoints us to do it. > > ii. When encapsulating from or decapsulating to native format, > examine both the AF flag and the STP state. Discard the > packet if not AF or if STP state is not "forwarding." The problem I have with this is that thus far we have been assuming we terminate STP at RBridges. If we are to require that RBridges participate in STP, then are the different ports considered different instances or the same instance? If they are the same instance, we risk creating a giant spanning tree that goes across RBridges. Or maybe I'm missing something...perhaps you can provide more detail on what you meant. 
Anoop

From d3e3e3 at gmail.com Thu Feb 12 19:58:26 2009
From: d3e3e3 at gmail.com (Donald Eastlake)
Date: Thu, 12 Feb 2009 22:58:26 -0500
Subject: [rbridge] potential L2 forwarding loop issue in RBridges
In-Reply-To: <18836.30287.649263.814351@gargle.gargle.HOWL>
References: <18836.30287.649263.814351@gargle.gargle.HOWL>
Message-ID: <1028365c0902121958m20214108we2310386f915adc0@mail.gmail.com>

Hi Jim,

See below...

On Thu, Feb 12, 2009 at 2:19 PM, James Carlson wrote:
> I've discovered something in my testing of our TRILL/RBridge
> implementation that I think might be significant. If I'm right, it
> means that something is missing from our current protocol draft
> (likely section 4.2.3.3).
>
> An RBridge by itself (no other RBridges in sight on the wire) will
> become DRB on every link it has and, since it sees no other RBridge,
> will also become AF for all configured VLANs on each link.
>
> It will begin learning and forwarding on all of those links.
> Logically, since there are no other RBridges to talk to, it will (at
> least as viewed from outside; the internal design doesn't matter)
> simply forward L2 packets from one link to another.
>
> Now suppose we have two RBridges connected like so:
>
>              hub1
>         |+----------+|
>         |            |
>      +--+--+      +--+--+
>      | RB1 |      | RB2 |
>      +--+--+      +--+--+
>         |            |
>         |+----------+|
>              hub2
>
> What should happen is that RB1 and RB2 will elect a DRB on each one
> of those networks (hub1 and hub2), resulting in either L2 forwarding
> through just a single RBridge (if AF is the same RBridge on each
> network), or in a trip through TRILL encapsulation (if RB1 is AF on
> one network and RB2 is AF on the other).
>
> So far, so good. However, what happens if RB1 and RB2 cannot see each
> other because of MTU restrictions in hub1 and hub2?
>
> This is where things seem to get ugly. If we're using the normal
> MTU-padded IS-IS packets for TRILL Hello messages, then those will get
> dropped by the hubs.
Both RBridges will declare themselves to be DRB > and AF for everything, and both will begin forwarding L2 frames. > Neither sees the other at all, except that for all frames sized to the > restriction or less, both will attempt to forward. > > The loop avoidance in 4.2.3.3 doesn't apply, because the Hellos aren't > getting through; they're choked on MTU restrictions. The "root bridge > collision check" doesn't apply, because we're not actually > decapsulating a frame from some other RBridge. > > Chaos then ensues. Right. The problem is the low MTU of the hubs and that we are using Hellos for two different things. Their function related to forwarding is doing fine. In this example, no forwarding is occurring as evidenced by the fact that there are no TRILL encapsulated frames on the wire. However, the other function, which might be called ingress/egress loop safety, isn't working because the Hellos are not getting through. > Conceptually, if we were (in this stand-alone L2 forwarding between > ports case) pretending to encapsulate in TRILL and then immediately > decapsulating, then the collision check would apply, but it wouldn't > actually work, because there *isn't* any 802.1D bridged connection > between hub1 and hub2, and thus they do have completely different root > bridges. > > This is a more significant problem to me because we will naturally be > dealing with unusual packet sizes for TRILL. The path between TRILL > RBridges needs to be at least 24 octets bigger than it is for regular > bridges, and if we're to avoid accidentally trying to use unsuitable > paths, such as ordinary (non-jumbo-enabled) switches and hubs, then we > need to check for them, which is what IS-IS's MTU padding does. Given all the various tags that 802.1 is specifying, many people would not consider these frame sizes to be "unusual"... It is interesting that you found devices that strictly enforce the old Ethernet frame limit. No one has worried about this because (almost?) 
everyone has been assuming that equipment you would actually run into does not have that limit. I don't know that adding just 24 bytes actually constitutes a "jumbo" when 802.3 has now defined a larger size to allow for tag insertion, etc.

(802.3as-2006 specifies three frame sizes with the following size for the MAC Client Data transported: basic frame = 1500, Q-tagged frame = 1504, and envelope frame = 1982.)

> In routing, the IS-IS Hello failure means that forwarding for
> non-local destinations _stops_ through that link, which means that
> loops can't form. Unfortunately, TRILL defines things such that if
> IS-IS Hello fails, forwarding _starts_ through that link, and since
> (unlike routing) bridging doesn't distinguish between local and
> non-local, loops become possible. We have it backwards.

Well, as I say above, I don't think that ingress or egress are forwarding. They are a function not present with IP or other protocols that have end station participation. Ingress/egress *have* to be enabled by the non-receipt of frames (currently Hellos) just as bridge ports have to be enabled by the non-receipt of frames (BPDUs). After all, if there is actually no RBridge or no bridge out that port then it has to sooner or later become enabled for RBridge ingress/egress or bridge forwarding respectively.

> I can think of at least two ways to avoid this problem, but I think
> I'd like to hear from the group before going ahead with a formal
> description of one.
>
> A. Require TRILL IS-IS implementations to send very small Hello
> messages, not MTU-padded, and with as little information as we
> can manage. (If we need "large" Hellos, then send small ones
> occasionally as well.) This allows the loop avoidance part to
> do its work.
>
> (Or perhaps the MTU that IS-IS uses should not be the actual MTU
> needed along the path with TRILL overhead, but the MTU for data
> packets, since IS-IS is now going unencapsulated.
But then this > means that MTU-restricted paths are unprotected and cannot be > detected by normal IS-IS operation. We end up with the black > hole problem.) Yes, that's one possibility. But it seems like a pain to have two different Hellos sizes... > B. Require RBridge implementations to include STP as well. > However, the STP portion in this solution has (by design) no > effect on TRILL itself. Instead, we use STP's link "forwarding" > state to gate the AF behavior in the following ways: > > i. When STP reports that the link is not in "forwarding" > state, we refuse to become AF for any VLAN. We become AF > only if STP reports "forwarding" for the link *and* the > DRB appoints us to do it. > > ii. When encapsulating from or decapsulating to native format, > examine both the AF flag and the STP state. Discard the > packet if not AF or if STP state is not "forwarding." You want to be careful not to imply that the RBridge as a whole is participating in STP. The bigger the STP the more ports are turned off and, if STP turns off a port, it becomes unavailable for TRILL encapsulated traffic. If you wanted to do this, you would have each RBridge port separately participate in STP as if there was a virtual bridge between the RBridge and the link. And if you are going to do this, you might as well make the virtual bridges be maximum priority for root and just say that you are DRB on a port if and only if your virtual bridge at that port becomes root. Then, I think, you can leave Hellos the size they are now and get rid of all Hellos except on the Designated VLAN. > Thus, I have at least two questions here: > > 1. Have I correctly identified a real problem? (If not, then a > bonus question would be "why isn't it a problem even though I've > run into this?") If you have run into it, it clearly has some reality :-) Perhaps people were wrong when they thought that the old Ethernet MTU would not be enforced by equipment people would actually encounter. 
Then again, some people have said things very much like "there are no such things as hubs any more"...

> 2. Assuming it is a problem, which solution do we prefer?

I think we need some discussion on this point but maybe this will push us to the "DRB only if root bridge" position...

> (And perhaps a third question: does it bug anyone else that "MTU"
> never appears in this document ... ?)

It has been discussed but those who were knowledgeable said they thought it wouldn't be a problem. Perhaps a caveat should be added that, since neither IS-IS Hellos nor 802.1 BPDUs are fragmentable, having components of your network with an MTU small enough to block them but big enough to let some data through means your network is toast.

> --
> James Carlson, Solaris Networking
> Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084
> MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677

Thanks,
Donald
=============================
Donald E. Eastlake 3rd +1-508-634-2066 (home)
155 Beaver Street
Milford, MA 01757 USA
d3e3e3 at gmail.com

From d3e3e3 at gmail.com Thu Feb 12 20:01:58 2009
From: d3e3e3 at gmail.com (Donald Eastlake)
Date: Thu, 12 Feb 2009 23:01:58 -0500
Subject: [rbridge] potential L2 forwarding loop issue in RBridges
In-Reply-To: <4C94DE2070B172459E4F1EE14BD2364E0286116A@HQ-EXCH-5.corp.brocade.com>
References: <18836.30287.649263.814351@gargle.gargle.HOWL> <4C94DE2070B172459E4F1EE14BD2364E0286116A@HQ-EXCH-5.corp.brocade.com>
Message-ID: <1028365c0902122001x4c0af893y753d9333064df633@mail.gmail.com>

Hi Anoop, see below

On Thu, Feb 12, 2009 at 9:42 PM, Anoop Ghanwani wrote:
>
> ...
>
>> B. Require RBridge implementations to include STP as well.
>> However, the STP portion in this solution has (by design) no
>> effect on TRILL itself. Instead, we use STP's link "forwarding"
>> state to gate the AF behavior in the following ways:
>>
>> i. When STP reports that the link is not in "forwarding"
>> state, we refuse to become AF for any VLAN.
We become AF
>> only if STP reports "forwarding" for the link *and* the
>> DRB appoints us to do it.
>>
>> ii. When encapsulating from or decapsulating to native format,
>> examine both the AF flag and the STP state. Discard the
>> packet if not AF or if STP state is not "forwarding."
>
> The problem I have with this is that thus far we have been
> assuming we terminate STP at RBridges. If we are to require
> that RBridges participate in STP, then are the different ports
> considered different instances or the same instance? If they
> are the same instance, we risk creating a giant spanning tree
> that goes across RBridges.

We certainly don't want that. See my response to Jim's initial message, which suggests what might be called a solution C.

> Or maybe I'm missing something...perhaps you can provide
> more detail on what you meant.
>
> Anoop

Thanks,
Donald
=============================
Donald E. Eastlake 3rd +1-508-634-2066 (home)
155 Beaver Street
Milford, MA 01757 USA
d3e3e3 at gmail.com

From ddutt at cisco.com Thu Feb 12 21:52:51 2009
From: ddutt at cisco.com (Dinesh G Dutt)
Date: Thu, 12 Feb 2009 21:52:51 -0800
Subject: [rbridge] potential L2 forwarding loop issue in RBridges
In-Reply-To: <18836.37403.734102.790451@gargle.gargle.HOWL>
References: <18836.30287.649263.814351@gargle.gargle.HOWL> <941D5DCD8C42014FAF70FB7424686DCF04A044B4@eusrcmw721.eamcs.ericsson.se> <18836.37403.734102.790451@gargle.gargle.HOWL>
Message-ID: <49950AB3.50104@cisco.com>

Hi James,

I'm not overly concerned with the problem you raise because it is not a realistic deployment. However, from a protocol correctness perspective, we must specify something to prevent the problem that you mention from occurring. I'm not in favor of running STP because one of the reasons I hear from customers is that they'll look at TRILL because they hate STP and want STP to completely disappear, if they can help it.
Adding another Hello to deal with this problem seems more complicated than required to deal with what is a theoretical problem, at least to me. I favor your third solution and suggest a minimum MTU in a TRILL network. Many L2 protocols that I'm aware of are doing this such as 802.1Q VLAN and FCoE. We can specify that if the MTU on a link is less than this min MTU, TRILL forwarding is disabled (i.e. encap/decap). Dinesh James Carlson wrote: > Eric Gray writes: > >> James, >> >> One of the basic things we need to be able to assume in >> layer 2 forwarding technologies is that broadcast messages - >> especially those associated with topology determination - will >> be forwarded by the underlying network. >> >> What would be the effect we could expect if BPDUs were >> not forwarded? >> > > It's effectively the same result, at least from a high level. > However, I'm talking specifically about IS-IS as used for TRILL, not > BPDUs. The two are different. > > >> Hence, the maximum size of our hello messages has to be >> determined from the "real" MTU limitations of the underlying >> network. >> > > Sadly, in the configuration I described, the "real" MTU limitation > isn't knowable, at least in any automatic sense. > > More generally, the silent MTU problem which leads to black holes in > the network (as was once created by FDDI/Ethernet bridges) is exactly > why IS-IS has this feature. > > >> Configure it if you have to. >> > > I don't think it's merely a matter of configuration. It may be > possible to configure to avoid the problem on a particular network, > but that renders the solution fragile. Accidents result in > catastrophe, which is a bad thing for a protocol whose sole purpose in > life is to avoid accidents. > > (If everyone could just configure L2 topologies without loops, we > wouldn't need any of this fancy loop-detection stuff.) > > -- We make our world significant by the courage of our questions and by the depth of our answers. 
- Carl Sagan

From eric.gray at ericsson.com Fri Feb 13 05:45:47 2009
From: eric.gray at ericsson.com (Eric Gray)
Date: Fri, 13 Feb 2009 07:45:47 -0600
Subject: [rbridge] potential L2 forwarding loop issue in RBridges
In-Reply-To: <18836.37403.734102.790451@gargle.gargle.HOWL>
References: <18836.30287.649263.814351@gargle.gargle.HOWL><941D5DCD8C42014FAF70FB7424686DCF04A044B4@eusrcmw721.eamcs.ericsson.se> <18836.37403.734102.790451@gargle.gargle.HOWL>
Message-ID: <941D5DCD8C42014FAF70FB7424686DCF04A048E2@eusrcmw721.eamcs.ericsson.se>

James,

If you have a management knob to allow you to configure the "real" MTU, you could use it if you have to. You would know that you have to, in the scenario you describe - either beforehand, or shortly thereafter.

Configuring an MTU for individual interfaces, when it is needed, is not the same as configuring a topology, as you should well understand. It is in fact, no more configuration effort than we have already bought into in supporting a variety of different VLAN topology options.

In fact, given the number of things we've already decided to call "configuration errors", this is simply yet another one. And that is what I was getting at.

Yes, the technique you describe almost certainly was meant to detect the situation where an underlying Data-Link Layer will drop MTU-sized packets. But we're not using IS-IS to determine what will happen in L3 forwarding, because of L2 issues - we're using IS-IS to determine what will happen in L2 forwarding.

Hence, two things are true: 1) the padding - if (as you assert) it serves only the one purpose of finding L2 forwarding problems - is unnecessary and 2) if it is done anyway (for whatever reason) it must be based on the effective maximum MTU at the Data-Link Layer.

I propose that configuring the L2 MTU is one option - to serve as a proof of concept that the issue you point out can be dealt with as is.
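One way to picture the configuration option described above is as a per-interface override on the size Hellos are padded to. This is a hypothetical sketch only: `hello_pad_target` and its parameters are invented names, and nothing in the draft defines such a knob.

```python
from typing import Optional


def hello_pad_target(port_mtu: int, configured_l2_mtu: Optional[int] = None) -> int:
    """Size (octets) to pad Hellos to on one interface.

    port_mtu is what the local interface believes it can send;
    configured_l2_mtu is an operator-supplied "real" limit for the
    attached link (an assumed management knob, not part of the draft).
    """
    if configured_l2_mtu is None:
        # No override: pad to the interface MTU, as IS-IS normally would.
        return port_mtu
    if configured_l2_mtu <= 0:
        raise ValueError("configured MTU must be positive")
    # Never pad beyond what either the port or the link can carry.
    return min(port_mtu, configured_l2_mtu)


unconfigured = hello_pad_target(9000)
behind_small_hub = hello_pad_target(9000, configured_l2_mtu=1500)
```

The override helps only if someone sets it, which is the fragility objected to earlier in the thread.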
-- Eric -----Original Message----- From: James Carlson [mailto:james.d.carlson at sun.com] Sent: Thursday, February 12, 2009 4:18 PM To: Eric Gray Cc: TRILL/RBridge Working Group Subject: RE: [rbridge] potential L2 forwarding loop issue in RBridges Importance: High Eric Gray writes: > James, > > One of the basic things we need to be able to assume in > layer 2 forwarding technologies is that broadcast messages - > especially those associated with topology determination - will > be forwarded by the underlying network. > > What would be the effect we could expect if BPDUs were > not forwarded? It's effectively the same result, at least from a high level. However, I'm talking specifically about IS-IS as used for TRILL, not BPDUs. The two are different. > Hence, the maximum size of our hello messages has to be > determined from the "real" MTU limitations of the underlying > network. Sadly, in the configuration I described, the "real" MTU limitation isn't knowable, at least in any automatic sense. More generally, the silent MTU problem which leads to black holes in the network (as was once created by FDDI/Ethernet bridges) is exactly why IS-IS has this feature. > Configure it if you have to. I don't think it's merely a matter of configuration. It may be possible to configure to avoid the problem on a particular network, but that renders the solution fragile. Accidents result in catastrophe, which is a bad thing for a protocol whose sole purpose in life is to avoid accidents. (If everyone could just configure L2 topologies without loops, we wouldn't need any of this fancy loop-detection stuff.) 
-- James Carlson, Solaris Networking Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084 MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677 From james.d.carlson at sun.com Fri Feb 13 09:12:37 2009 From: james.d.carlson at sun.com (James Carlson) Date: Fri, 13 Feb 2009 12:12:37 -0500 Subject: [rbridge] potential L2 forwarding loop issue in RBridges In-Reply-To: <4C94DE2070B172459E4F1EE14BD2364E0286116A@HQ-EXCH-5.corp.brocade.com> References: <18836.30287.649263.814351@gargle.gargle.HOWL> <941D5DCD8C42014FAF70FB7424686DCF04A044B4@eusrcmw721.eamcs.ericsson.se> <18836.37403.734102.790451@gargle.gargle.HOWL> <941D5DCD8C42014FAF70FB7424686DCF04A048E2@eusrcmw721.eamcs.ericsson.se> <49950AB3.50104@cisco.com> <1028365c0902121958m20214108we2310386f915adc0@mail.gmail.com> <4C94DE2070B172459E4F1EE14BD2364E0286116A@HQ-EXCH-5.corp.brocade.com> Message-ID: <18837.43525.148230.571133@gargle.gargle.HOWL> Anoop Ghanwani writes: > > (Or perhaps the MTU that IS-IS uses should not be the actual MTU > > needed along the path with TRILL overhead, but the MTU for data > > packets, since IS-IS is now going unencapsulated. But then this > > means that MTU-restricted paths are unprotected and cannot be > > detected by normal IS-IS operation. We end up with the black > > hole problem.) > > This is certainly doable, but I'm not sure it's > robust enough. It almost falls in the category of > what Eric said. Agreed; it's not so safe. > The problem I have with this is that thus far we have been > assuming we terminate STP at RBridges. If we are to require > that RBridges participate in STP, then are the different ports > considered different instances or the same instance? If they > are the same instance, we risk creating a giant spanning tree > that goes across RBridges. I was thinking of a single instance. It's true that it creates a single giant spanning tree, but "so what?" 
The only case where this spanning tree is clearly needed is when we're doing plain L2-to-L2 forwarding. I think the suggestion I made could be narrowed down: the only thing that really needs to be protected is this special case of forwarding at L2 without going through at least one TRILL hop to a distinct RBridge. That special case currently has no protection against loops *if* IS-IS fails to hear peers. Donald Eastlake writes: > > The loop avoidance in 4.2.3.3 doesn't apply, because the Hellos aren't > > getting through; they're choked on MTU restrictions. The "root bridge > > collision check" doesn't apply, because we're not actually > > decapsulating a frame from some other RBridge. > > > > Chaos then ensues. > > Right. The problem is the low MTU of the hubs and that we are using > Hellos for two different things. Yes. Another way to look at it is that when L3 routing fails, it fails "safe" in that it doesn't forward non-local data, and thus can't loop. When this L2 mechanism fails, it fails "unsafe" because it enables non-local forwarding. Failing "unsafe" was one of the criticisms leveled against STP, so it's unfortunate to see it here. > Their function related to forwarding > is doing fine. In this example, no forwarding is occurring as > evidenced by the fact that there are no TRILL encapsulated frames on > the wire. However, the other function, which might be called > ingress/egress loop safety, isn't working because the Hellos are not > getting through. Yes; that's it. > > This is a more significant problem to me because we will naturally be > > dealing with unusual packet sizes for TRILL. The path between TRILL > > RBridges needs to be at least 24 octets bigger than it is for regular > > bridges, and if we're to avoid accidentally trying to use unsuitable > > paths, such as ordinary (non-jumbo-enabled) switches and hubs, then we > > need to check for them, which is what IS-IS's MTU padding does. 
> > Given all the various tags that 802.1 is specifying, many people would > not consider these frame sizes to be "unusual"... I'll use any word you like here. At 18+6+18+1500 == 1542 octets, they're simply bigger than some (likely many) older devices will support. Base-2 bias among us geeks makes 1536 a not uncommon limit. It's actually worse than I'm suggesting. The above narrow issue occurs only when you have a lowish-MTU-limit device on the scene with old-school Ethernet limits. Things *really* start to fall apart if any of your TRILL nodes are attempting to support jumbo frames, because now you're pushing out to everyone's limit, and failure (even with brand-new devices) is virtually guaranteed. In other words, if TRILL thinks the limit is 9000, and you need 9024 to get your encapsulated packets through, then you've got a problem, because you're all out of head room. > It is interesting that you found devices that strictly enforce the old > Ethernet frame limit. No one has worried about this because (almost?) > everyone has been assuming that equipment you would actually run into > does not have that limit. I don't know that adding just 24 bytes > actually constitutes a "jumbo" when 802.3 has now defined a larger > size to allow for tag insertion, etc. > > (802.3as-2006 specifies three frame sizes with the following size for > the MAC Client Data transported: basic frame = 1500, Q-tagged frame = > 1504, and envelope frame = 1982.) My guess is that your budget allows buying newfangled stuff more than mine. ;-} > > In routing, the IS-IS Hello failure means that forwarding for > > non-local destinations _stops_ through that link, which means that > > loops can't form. Unfortunately, TRILL defines things such that if > > IS-IS Hello fails, forwarding _starts_ through that link, and since > > (unlike routing) bridging doesn't distinguish between local and > > non-local, loops become possible. We have it backwards. 
> > Well, as I say above, I don't think that ingress or egress are > forwarding. they are a function not present with IP or other protocols > that have end station participation. Ingress/egress *have* to be > enabled by the non-receipt of frames (currently Hellos) just as bridge > ports have to be enabled by the non-receipt of frames (BPDUs). After > all, if there is actually no RBridge or no bridge out that port then > it has to sooner or later become enabled for RBridge ingress/egress or > bridge forwarding respectively. Agreed; that's just the trigger for the problem, when combined with IS-IS's attempt to discover MTU restrictions. > > (Or perhaps the MTU that IS-IS uses should not be the actual MTU > > needed along the path with TRILL overhead, but the MTU for data > > packets, since IS-IS is now going unencapsulated. But then this > > means that MTU-restricted paths are unprotected and cannot be > > detected by normal IS-IS operation. We end up with the black > > hole problem.) > > Yes, that's one possibility. But it seems like a pain to have two > different Hellos sizes... Yes. :-< > You want to be careful not to imply that the RBridge as a whole is > participating in STP. The bigger the STP the more ports are turned off > and, if STP turns off a port, it becomes unavailable for TRILL > encapsulated traffic. That's not quite so. We can arbitrarily define how STP and TRILL work together, if they must work together at all. As I said in my original post: However, the STP portion in this solution has (by design) no effect on TRILL itself. Instead, we use STP's link "forwarding" We can simply say that when STP turns off a port, all of TRILL still flows through the port, because we specifically exempt it. Those implementing TRILL already have to make _substantial_ changes to their code to get even basic functions to work; this sort of exemption would have trivial impact. 
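As a rough illustration of that exemption (a sketch only; the names `StpState` and `port_passes` are hypothetical and not taken from any draft): the STP port state would gate only native frames, while TRILL-encapsulated frames would pass regardless.

```python
from enum import Enum


class StpState(Enum):
    FORWARDING = "forwarding"
    BLOCKING = "blocking"


def port_passes(trill_encapsulated: bool, stp_state: StpState) -> bool:
    """Decide whether a frame may cross an RBridge port (illustrative only)."""
    if trill_encapsulated:
        # Assumed exemption: TRILL traffic ignores the STP port state.
        return True
    # Native (unencapsulated) frames obey STP, which acts as the loop guard.
    return stp_state is StpState.FORWARDING
```

Under this assumption, `port_passes(True, StpState.BLOCKING)` is true while `port_passes(False, StpState.BLOCKING)` is false, so a blocked port stops only plain L2-to-L2 traffic.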
> If you wanted to do this, you would have each RBridge port separately > participate in STP as if there was a virtual bridge between the > RBridge and the link. And if you are going to do this, you might as > well make the virtual bridges be maximum priority for root and just > say that you are DRB on a port if and only if your virtual bridge at > that port becomes root. Then, I think, you can leave Hellos the size > they are now and get rid of all Hellos except on the Designated VLAN. That's certainly a possibility. It might have other bits of goodness for other cases, such as the wiring closet case. Interesting ... > > Thus, I have at least two questions here: > > > > 1. Have I correctly identified a real problem? (If not, then a > > bonus question would be "why isn't it a problem even though I've > > run into this?") > > If you have run into it, it clearly has some reality :-) ;-} As above, it seems like my reality has problems with other folks' standards. > Perhaps people were wrong when they thought that the old Ethernet MTU > would not be enforced by equipment people would actually encounter. > Then again, some people have said things very much like "there are no > such things as hubs any more"... :-/ > > 2. Assuming it is a problem, which solution do we prefer? > > I think we need some discussion on this point but maybe this will push > us to the "DRB only if root bridge" position... Yes. > > (And perhaps a third question: does it bug anyone else that "MTU" > > never appears in this document ... ?) > > It has been discussed but those who were knowledgeable said they > thought it wouldn't be a problem. Perhaps a caveat should be added > that, since neither IS-IS Hellos nor 802.1 BPDUs are fragmentable, > having components of your network with an MTU small enough to block > them but big enough to let some data through means your network is > toast. Blocking the relatively tiny BPDUs due to MTU problems doesn't happen in any real network I've seen.
Blocking the intentionally padded-out- to-what-I-think-the-MTU-is IS-IS frames happens all the time. There's another subtle issue here, which is that the change to IS-IS to make it unencapsulated in TRILL has made MTU a much more complex issue to handle in any sort of interoperable way. IS-IS needs to know the link MTU to communicate with its peers. Unfortunately, the MTU it uses is likely *NOT* the same as seen by traffic that's encapsulated by TRILL. Thus, I think the TRILL IS-IS implementations need to compute at least MTU+6 in order to match the MTU seen by TRILL. Dinesh G Dutt writes: > I'm not overly concerned with the problem you raise because it is not a > realistic deployment. However, from a protocol correctness perspective, > we must specify something to prevent the problem that you mention from > occurring. Yes, it's the problem prevention that I'm concerned about. Whether the deployment is realistic or not probably depends on what product you make, exactly who your customers are, and what they're doing with it. It's hard to put limits on that in ways that seem on-topic for this group. > I'm not in favor of running STP because one of the reasons I > hear from customers is that they'll look at TRILL because they hate STP > and want STP to completely disappear, if they can help it. Adding > another Hello to deal with this problem seems more complicated than > required to deal with what is a theoretical problem, at least to me. I understand the philosophical objection to running STP, but I'm more interested in looking at the technical parts first. In the fix I suggested, STP was *not* actually controlling the link usage (TRILL still runs even if STP disables), so STP failure is of much less consequence, but instead it's guarding against the bad things that TRILL can apparently do. > I favor your third solution and suggest a minimum MTU in a TRILL > network. Many L2 protocols that I'm aware of are doing this such as > 802.1Q VLAN and FCoE. 
We can specify that if the MTU on a link is less > than this min MTU, TRILL forwarding is disabled (i.e. encap/decap). Specifying things doesn't make them so. The issue I was posing was with MTU restrictions along the path between TRILL peers. The only way you can discover this with the protocols we're using (as far as I know) is that you just don't receive Hello messages from some or all of your peers. How can you distinguish between that failure case (MTU too small; must disable) and the case where no peers are present (normal leaf network; must enable)? Eric Gray writes: > If you have a management knob to allow you to configure > the "real" MTU, you could use it if you have to. You would know > that you have to, in the scenario you describe - either before > hand, or shortly there-after. Exactly how do you know that this problem exists? In the failure mode, both RBridges start forwarding L2 frames, and the network becomes unusable. For an administrator (who would have access to that knob), it's quite unclear that there is a restriction at all, or that it is specifically the problem, or exactly what action should be taken. (Do I set it to 1500? 1480? Some other number?) We have some very painful history here with the FDDI fiasco. It's nice that the civilized world is back to 1500 again (no, that world doesn't include PPPoE), because it limits the scope of the harm greatly. But now we're adding more overhead, and raising the problem again. > Configuring an MTU for individual interfaces, when it is > needed, is not the same as configuring a topology, as you should > well understand. It is in fact, no more configuration effort > than we have already bought into in supporting a variety of > different VLAN topology options. MTU restrictions along the path are simply not knowable to the protocols involved, except when larger packets are dropped. 
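A toy model of why silence is ambiguous, using the 1542-octet and 1536-octet figures mentioned earlier in this thread. The function name and the single-limit path model are assumptions made purely for illustration.

```python
def hellos_heard(peers, path_limit, hello_size):
    """Return the peers whose padded Hellos fit through the path (toy model)."""
    return [peer for peer in peers if hello_size <= path_limit]


# Case A: a peer exists, but a 1536-octet device drops the 1542-octet Hellos.
restricted_path = hellos_heard(["RB2"], path_limit=1536, hello_size=1542)

# Case B: an ordinary leaf network with no peer RBridge at all.
leaf_network = hellos_heard([], path_limit=1536, hello_size=1542)

# Both cases look the same to the listening RBridge: nothing was heard.
print(restricted_path == leaf_network)
```

In both cases the listener ends up with an empty peer list, so it has no basis for choosing between "must disable" and "must enable".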
I think you're supposing that the administrator knows about all of the magic limits built into the equipment his packets must traverse, even if some of that equipment is locked in closets away from his prying eyes, but, frankly, I don't see how that's a realistic position. People are going to _expect_ that TRILL is just plug-and-play. If it's anything less than that, it almost certainly won't be acceptable to any of the customers I know about. > In fact, given the number of things we've already decided > to call "configuration errors", this is simply yet another one. > And that is what I was getting at. I think the ease with which this can happen is a distinguishing factor. This isn't a case where I've asked for something foolish, and the system has provided me with my foolish answer. This is a case where I set up the system using all of the expected defaults, and it trashed my network. > Hence, two things are true: > > 1) the padding - if (as you assert) it serves only the one purpose > of finding L2 forwarding problems - is unnecessary and I don't understand "unnecessary" here. MTU restrictions are a fact of life. That's why IS-IS has this feature. The only alternative I see is to whistle past the graveyard and "hope" that there's nothing in the middle that might have restrictions. If you said "ineffective," I'd be in agreement. What IS-IS is doing is in fact ineffective when used for L2 forwarding. > 2) if it is done anyway (for whatever reason) it must be based on > the effective maximum MTU at the Data-Link Layer. Agreed; including the TRILL overhead. > I propose that configuring the L2 MTU is one option - to > serve as a proof of concept that the issue you point out can be > dealt with as is. I don't think I understand the answer. Are you saying that all RBridges must have MTU manually configured on all ports? That would certainly put a big dent in the zero configuration story that TRILL is supposed to have.
It'd work, though, assuming administrators have perfect knowledge of all parts of the infrastructure they're using, even the parts that are under someone else's (e.g., a provider's) control. -- James Carlson, Solaris Networking Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084 MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677 From anoop at brocade.com Fri Feb 13 10:34:52 2009 From: anoop at brocade.com (Anoop Ghanwani) Date: Fri, 13 Feb 2009 10:34:52 -0800 Subject: [rbridge] potential L2 forwarding loop issue in RBridges In-Reply-To: <18837.43525.148230.571133@gargle.gargle.HOWL> References: <18836.30287.649263.814351@gargle.gargle.HOWL><941D5DCD8C42014FAF70FB7424686DCF04A044B4@eusrcmw721.eamcs.ericsson.se><18836.37403.734102.790451@gargle.gargle.HOWL><941D5DCD8C42014FAF70FB7424686DCF04A048E2@eusrcmw721.eamcs.ericsson.se><49950AB3.50104@cisco.com><1028365c0902121958m20214108we2310386f915adc0@mail.gmail.com><4C94DE2070B172459E4F1EE14BD2364E0286116A@HQ-EXCH-5.corp.brocade.com> <18837.43525.148230.571133@gargle.gargle.HOWL> Message-ID: <4C94DE2070B172459E4F1EE14BD2364E02861253@HQ-EXCH-5.corp.brocade.com> > I was thinking of a single instance. > > It's true that it creates a single giant spanning tree, but "so what?" > The only case where this spanning tree is clearly needed is when we're > doing plain L2-to-L2 forwarding. That makes me a little uncomfortable. STP makes assumptions about the size of the network for things like TCNs to propagate in a reasonable amount of time. When run on very large topologies this would affect convergence times quite significantly.
Anoop From james.d.carlson at sun.com Fri Feb 13 14:10:02 2009 From: james.d.carlson at sun.com (James Carlson) Date: Fri, 13 Feb 2009 17:10:02 -0500 Subject: [rbridge] potential L2 forwarding loop issue in RBridges In-Reply-To: <4C94DE2070B172459E4F1EE14BD2364E02861253@HQ-EXCH-5.corp.brocade.com> References: <18836.30287.649263.814351@gargle.gargle.HOWL> <941D5DCD8C42014FAF70FB7424686DCF04A044B4@eusrcmw721.eamcs.ericsson.se> <18836.37403.734102.790451@gargle.gargle.HOWL> <941D5DCD8C42014FAF70FB7424686DCF04A048E2@eusrcmw721.eamcs.ericsson.se> <49950AB3.50104@cisco.com> <1028365c0902121958m20214108we2310386f915adc0@mail.gmail.com> <4C94DE2070B172459E4F1EE14BD2364E0286116A@HQ-EXCH-5.corp.brocade.com> <18837.43525.148230.571133@gargle.gargle.HOWL> <4C94DE2070B172459E4F1EE14BD2364E02861253@HQ-EXCH-5.corp.brocade.com> Message-ID: <18837.61370.168125.89235@gargle.gargle.HOWL> Anoop Ghanwani writes: > > > I was thinking of a single instance. > > > > It's true that it creates a single giant spanning tree, but "so what?" > > The only case where this spanning tree is clearly needed is when we're > > doing plain L2-to-L2 forwarding. > > That makes me a little uncomfortable. STP makes > assumptions about the size of the network for things > like TCNs to propagate in a reasonable amount of time. > When run on very large topologies this would affect > convergence times quite significantly. I'd wonder whether those large topologies are workable in general (won't you just drown in NetBIOS noise?), but, yes, that sounds like a reasonable concern with simply running them in parallel.
-- James Carlson, Solaris Networking Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084 MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677 From eric.gray at ericsson.com Sat Feb 14 07:27:30 2009 From: eric.gray at ericsson.com (Eric Gray) Date: Sat, 14 Feb 2009 09:27:30 -0600 Subject: [rbridge] potential L2 forwarding loop issue in RBridges In-Reply-To: <18837.61370.168125.89235@gargle.gargle.HOWL> References: <18836.30287.649263.814351@gargle.gargle.HOWL><941D5DCD8C42014FAF70FB7424686DCF04A044B4@eusrcmw721.eamcs.ericsson.se><18836.37403.734102.790451@gargle.gargle.HOWL><941D5DCD8C42014FAF70FB7424686DCF04A048E2@eusrcmw721.eamcs.ericsson.se><49950AB3.50104@cisco.com><1028365c0902121958m20214108we2310386f915adc0@mail.gmail.com><4C94DE2070B172459E4F1EE14BD2364E0286116A@HQ-EXCH-5.corp.brocade.com><18837.43525.148230.571133@gargle.gargle.HOWL><4C94DE2070B172459E4F1EE14BD2364E02861253@HQ-EXCH-5.corp.brocade.com> <18837.61370.168125.89235@gargle.gargle.HOWL> Message-ID: <941D5DCD8C42014FAF70FB7424686DCF04A2BEEF@eusrcmw721.eamcs.ericsson.se> James, You're conflating the notion of "large topologies" with "large networks" where what's important in the former is the number of nodes and links and the latter the number of noise makers. With the proliferation of so-called work group switches, it is not uncommon to have bridges at the edge of the network numbering ~ half the number of end-stations. Add in the bridges required to connect all of the edges, and you have quite a large bridge topology - with the number of end-stations possibly barely out-numbering the number of bridges. If you had to run STP on all of these bridges, you could be looking at exactly the scenario Anoop mentions - and without adding VLANs to the mix. 
Add VLANs, and the number of noise-makers can get quite large without causing the network to "drown in NetBIOS noise" (or other noise sources) - thus compounding the problems that could exist in large "simple" topologies using STP. -- Eric -----Original Message----- From: James Carlson [mailto:james.d.carlson at sun.com] Sent: Friday, February 13, 2009 5:10 PM To: Anoop Ghanwani Cc: Donald Eastlake; Dinesh G Dutt; Eric Gray; TRILL/RBridge Working Group Subject: RE: [rbridge] potential L2 forwarding loop issue in RBridges Anoop Ghanwani writes: > > > I was thinking of a single instance. > > > > It's true that it creates a single giant spanning tree, but "so what?" > > The only case where this spanning tree is clearly needed is when we're > > doing plain L2-to-L2 forwarding. > > That makes me a little uncomfortable. STP makes > assumptions about the size of the network for things > like TCNs to propagate in a reasonable amount of time. > When run on very large topologies this would affect > convergence times quite significantly. I'd wonder whether those large topologies are workable in general (won't you just drown in NetBIOS noise?), but, yes, that sounds like a reasonable concern with simply running them in parallel.
-- James Carlson, Solaris Networking Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084 MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677 From d3e3e3 at gmail.com Sat Feb 14 21:35:18 2009 From: d3e3e3 at gmail.com (Donald Eastlake) Date: Sun, 15 Feb 2009 00:35:18 -0500 Subject: [rbridge] potential L2 forwarding loop issue in RBridges In-Reply-To: <18837.43525.148230.571133@gargle.gargle.HOWL> References: <18836.30287.649263.814351@gargle.gargle.HOWL> <941D5DCD8C42014FAF70FB7424686DCF04A044B4@eusrcmw721.eamcs.ericsson.se> <18836.37403.734102.790451@gargle.gargle.HOWL> <941D5DCD8C42014FAF70FB7424686DCF04A048E2@eusrcmw721.eamcs.ericsson.se> <49950AB3.50104@cisco.com> <1028365c0902121958m20214108we2310386f915adc0@mail.gmail.com> <4C94DE2070B172459E4F1EE14BD2364E0286116A@HQ-EXCH-5.corp.brocade.com> <18837.43525.148230.571133@gargle.gargle.HOWL> Message-ID: <1028365c0902142135k7fc39de0k63db9f846c4f0878@mail.gmail.com> Hi, See below On Fri, Feb 13, 2009 at 12:12 PM, James Carlson wrote: > Anoop Ghanwani writes: > > ... > >> The problem I have with this is that thus far we have been >> assuming we terminate STP at RBridges. If we are to require >> that RBridges participate in STP, then are the different ports >> considered different instances or the same instance? If they >> are the same instance, we risk creating a giant spanning tree >> that goes across RBridges. > > I was thinking of a single instance. > > It's true that it creates a single giant spanning tree, but "so what?" > The only case where this spanning tree is clearly needed is when we're > doing plain L2-to-L2 forwarding. I'm not sure of that. Assume a long chain of RBridges, RB1, RB2, RB3, ... RBn, each connected to the next. Then assume RB1 is connected to a restricted MTU hub which in turn is connected to RBn. Don't we have the same problem if Hellos can't get through the hub?
Admittedly, if the hub was actually an 802.1 bridge and you implemented the optional root bridge collision check, you would be safe... So I don't see why it's restricted to "plain L2-to-L2 forwarding". There are other disadvantages to a giant spanning tree, as mentioned further below. > ... > > Donald Eastlake writes: >> > The loop avoidance in 4.2.3.3 doesn't apply, because the Hellos aren't >> > getting through; they're choked on MTU restrictions. The "root bridge >> > collision check" doesn't apply, because we're not actually >> > decapsulating a frame from some other RBridge. >> > >> > Chaos then ensues. >> >> Right. The problem is the low MTU of the hubs and that we are using >> Hellos for two different things. > > Yes. Another way to look at it is that when L3 routing fails, it > fails "safe" in that it doesn't forward non-local data, and thus can't > loop. When this L2 mechanism fails, it fails "unsafe" because it > enables non-local forwarding. Failing "unsafe" was one of the > criticisms leveled against STP, so it's unfortunate to see it here. > >> Their function related to forwarding >> is doing fine. In this example, no forwarding is occurring as >> evidenced by the fact that there are no TRILL encapsulated frames on >> the wire. However, the other function, which might be called >> ingress/egress loop safety, isn't working because the Hellos are not >> getting through. > > Yes; that's it. > >> > This is a more significant problem to me because we will naturally be >> > dealing with unusual packet sizes for TRILL. The path between TRILL >> > RBridges needs to be at least 24 octets bigger than it is for regular >> > bridges, and if we're to avoid accidentally trying to use unsuitable >> > paths, such as ordinary (non-jumbo-enabled) switches and hubs, then we >> > need to check for them, which is what IS-IS's MTU padding does. >> >> Given all the various tags that 802.1 is specifying, many people would >> not consider these frame sizes to be "unusual"...
> > I'll use any word you like here. At 18+6+18+1500 == 1542 octets, > they're simply bigger than some (likely many) older devices will > support. Base-2 bias among us geeks makes 1536 a not uncommon limit. > > It's actually worse than I'm suggesting. The above narrow issue > occurs only when you have a lowish-MTU-limit device on the scene with > old-school Ethernet limits. Things *really* start to fall apart if > any of your TRILL nodes are attempting to support jumbo frames, > because now you're pushing out to everyone's limit, and failure (even > with brand-new devices) is virtually guaranteed. > > In other words, if TRILL thinks the limit is 9000, and you need 9024 > to get your encapsulated packets through, then you've got a problem, > because you're all out of head room. Your arguments above convince me more than ever that we need two different frames for the two different functions currently being performed by Hellos. For IS-IS topology adjacency, you want Hellos that are more or less as long as the longest frames you will normally be sending. If all of your campus handles 9K Jumbo frames, then you want 9K+ Hellos so your topology will reflect adjacencies over which the jumbo frames can flow. On the other hand, for native frame loop protection, in theory, you would want a "Hello" as small as the smallest native frame you would encounter. That's probably impractically small as you need some minimal data in these loop-protection-Hellos. But they could still be pretty small which should be safe enough. >> It is interesting that you found devices that strictly enforce the old >> Ethernet frame limit. No one has worried about this because (almost?) >> everyone has been assuming that equipment you would actually run into >> does not have that limit. I don't know that adding just 24 bytes >> actually constitutes a "jumbo" when 802.3 has now defined a larger >> size to allow for tag insertion, etc.
>> >> (802.3as-2006 specifies three frame sizes with the following size for >> the MAC Client Data transported: basic frame = 1500, Q-tagged frame = >> 1504, and envelope frame = 1982.) > > My guess is that your budget allows buying newfangled stuff more than > mine. ;-} ? I was just quoting the standard. Most modern equipment seems to either allow for a reasonably large amount of tag stuff (but not as much as the new "envelope frame" standard) or goes all the way and allows 9K jumbo frames. > > ... > >> > (Or perhaps the MTU that IS-IS uses should not be the actual MTU >> > needed along the path with TRILL overhead, but the MTU for data >> > packets, since IS-IS is now going unencapsulated. But then this >> > means that MTU-restricted paths are unprotected and cannot be >> > detected by normal IS-IS operation. We end up with the black >> > hole problem.) >> >> Yes, that's one possibility. But it seems like a pain to have two >> different Hellos sizes... > > Yes. :-< However, maybe we need something like two Hello sizes. Of course, it could actually be adjacency Hellos and native frame loop safety Hellos; or it could be adjacency Hellos and some new IS-IS PDU type for native loop safety frames; or it could be adjacency Hellos and BPDUs (although people don't seem to want that). >> You want to be careful not to imply that the RBridge as a whole is >> participating in STP. The bigger the STP the more ports are turned off >> and, if STP turns off a port, it becomes unavailable for TRILL >> encapsulated traffic. > > That's not quite so. We can arbitrarily define how STP and TRILL work > together, if they must work together at all. As I said in my original > post: > > However, the STP portion in this solution has (by design) no > effect on TRILL itself. Instead, we use STP's link "forwarding" > > We can simply say that when STP turns off a port, all of TRILL still > flows through the port, because we specifically exempt it. 
Those > implementing TRILL already have to make _substantial_ changes to their > code to get even basic functions to work; this sort of exemption would > have trivial impact. First of all, there is no positive benefit to having the entire RBridge act as an STP node. It requires interaction between all the ports. Easier to just run STP on each RBridge port as if it was a stub. Secondly, there are negative effects: Although we can say that the STP blocked state has no effect on TRILL frames being received or transmitted at an RBridge port, we can not change how *bridge* ports work. Making the whole mixed RBridge / bridge campus one spanning tree will, for most topologies, turn off lots of bridge ports and make them unavailable for TRILL traffic. For example, assume a string of bridges B1, B2, B3, ... Bn, and a string of RBridges, RB1, RB2, RB3, ... RBn. Assume each bridge is connected to the next higher numbered bridge and each RBridge is connected to the next higher numbered RBridge and each bridge and RBridge having the same number are connected to each other. With spanning tree running just in the bridges or just in the bridges and the ports of the RBridges connected to them, all the bridge-RBridge connections are available for TRILL traffic. With spanning tree running in and through all bridges and RBridges, you could get a spanning tree such that all the bridge-RBridge connections but one were blocked at the bridge end and unavailable for TRILL traffic... >> If you wanted to do this, you would have each RBridge port separately >> participate in STP as if there was a virtual bridge between the >> RBridge and the link. And if you are going to do this, you might as >> well make the virtual bridges be maximum priority for root and just >> say that you are DRB on a port if and only if your virtual bridge at >> that port becomes root. Then, I think, you can leave Hellos the size >> they are now and get rid of all Hellos except on the Designated VLAN.
> > That's certainly a possibility. It might have other bits of goodness > for other cases, such as the wiring closet case. Interesting ... > >> > Thus, I have at least two questions here: >> > >> > 1. Have I correctly identified a real problem? (If not, then a >> > bonus question would be "why isn't it a problem even though I've >> > run into this?") >> >> If you have run into it, it clearly has some reality :-) > > ;-} As above, it seems like my reality has problems with other folks' > standards. > >> Perhaps people were wrong when they thought that the old Ethernet MTU >> would not be enforced by equipment people would actually encounter. >> Then again, some people have said things very much like "there are no >> such things as hubs any more"... > > :-/ > >> > 2. Assuming it is a problem, which solution do we prefer? >> >> I think we need some discussion on this point but maybe this will push >> us to the "DRB only if root bridge" position... > > Yes. > >> > (And perhaps a third question: does it bug anyone else that "MTU" >> > never appears in this document ... ?) I now agree with you and think there should be text about MTU and this problem in the specification. >> It has been discussed but those who were knowledgeable said they >> thought it wouldn't be a problem. Perhaps a caveat should be added >> that, since neither IS-IS Hellos nor 802.1 BPDUs are fragmentable, >> having components of your network with an MTU small enough to block >> them but big enough to let some data through means your network is >> toast. > > Blocking the relatively tiny BPDUs due to MTU problems doesn't happen > in any real network I've seen. Blocking the intentionally padded-out- > to-what-I-think-the-MTU-is IS-IS frames happens all the time. > > There's another subtle issue here, which is that the change to IS-IS > to make it unencapsulated in TRILL has made MTU a much more complex > issue to handle in any sort of interoperable way.
IS-IS needs to know > the link MTU to communicate with its peers. Unfortunately, the MTU it > uses is likely *NOT* the same as seen by traffic that's encapsulated > by TRILL. > > Thus, I think the TRILL IS-IS implementations need to compute at least > MTU+6 in order to match the MTU seen by TRILL. Seems reasonable to me. (I wanted the IS-IS PDUs to remain encapsulated, which also solves this, but the WG was of the other opinion.) > Dinesh G Dutt writes: >> I'm not overly concerned with the problem you raise because it is not a >> realistic deployment. However, from a protocol correctness perspective, >> we must specify something to prevent the problem that you mention from >> occurring. > > Yes, it's the problem prevention that I'm concerned about. > > Whether the deployment is realistic or not probably depends on what > product you make, exactly who your customers are, and what they're > doing with it. It's hard to put limits on that in ways that seem > on-topic for this group. > >> I'm not in favor of running STP because one of the reasons I >> hear from customers is that they'll look at TRILL because they hate STP >> and want STP to completely disappear, if they can help it. Adding >> another Hello to deal with this problem seems more complicated than >> required to deal with what is a theoretical problem, at least to me. > > I understand the philosophical objection to running STP, but I'm more > interested in looking at the technical parts first. In the fix I > suggested, STP was *not* actually controlling the link usage (TRILL > still runs even if STP disables), so STP failure is of much less > consequence, but instead it's guarding against the bad things that > TRILL can apparently do. > >> I favor your third solution and suggest a minimum MTU in a TRILL >> network. Many L2 protocols that I'm aware of are doing this such as >> 802.1Q VLAN and FCoE. We can specify that if the MTU on a link is less >> than this min MTU, TRILL forwarding is disabled (i.e.
encap/decap). > > Specifying things doesn't make them so. > > The issue I was posing was with MTU restrictions along the path > between TRILL peers. The only way you can discover this with the > protocols we're using (as far as I know) is that you just don't > receive Hello messages from some or all of your peers. How can you > distinguish between that failure case (MTU too small; must disable) > and the case where no peers are present (normal leaf network; must > enable)? Well, you want to disable topological adjacency if the MTU is too small. But, unless the ports are configured as trunk ports, you always want exactly one appointed forwarder per VLAN on the link... As I say above, there are two different Hello usages. For IS-IS adjacency purposes, padding the Hellos so they are the same size as a maximum TRILL data frame for your network is probably the right thing to do. But you can't do that for the native frame loop avoidance purpose where you want a smaller Hello, if anything, not a bigger one. > Eric Gray writes: >> If you have a management knob to allow you to configure >> the "real" MTU, you could use it if you have to. You would know >> that you have to, in the scenario you describe - either before >> hand, or shortly there-after. > > Exactly how do you know that this problem exists? > > In the failure mode, both RBridges start forwarding L2 frames, and the > network becomes unusable. For an administrator (who would have access > to that knob), it's quite unclear that there is a restriction at all, > or that it is specifically the problem, or exactly what action should > be taken. (Do I set it to 1500? 1480? Some other number?) With reasonably small native frame loop safety "Hellos", your network shouldn't melt down. If big frames are being consistently dropped, then you can suspect that your adjacency Hellos aren't big enough and are establishing adjacencies that these large frames can't make it through.
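To make the two-frame idea concrete, here is a minimal sketch. The 64-octet loop-safety frame size, the reuse of the 1542 and 1536 figures from earlier in the thread, and all of the names are assumptions for illustration only, not anything specified in the draft.

```python
def link_view(peer_present, path_limit, adjacency_hello=1542, safety_hello=64):
    """Model what one RBridge concludes about a link (illustrative only)."""
    # Adjacency Hellos are padded to the largest TRILL data frame.
    adjacency = peer_present and adjacency_hello <= path_limit
    # Loop-safety frames are kept small so they survive restricted paths.
    peer_seen = peer_present and safety_hello <= path_limit
    return {"adjacency": adjacency, "peer_seen": peer_seen}
```

With a peer behind a 1536-octet device, `link_view(True, 1536)` reports no adjacency but does report that a peer was seen, so the link is not used for TRILL data yet the two RBridges still know about each other for native-frame loop safety. A true leaf network, `link_view(False, 1536)`, reports neither.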
> We have some very painful history here with the FDDI fiasco.  It's
> nice that the civilized world is back to 1500 again (no, that world
> doesn't include PPPoE), because it limits the scope of the harm
> greatly.  But now we're adding more overhead, and raising the problem
> again.
>
>> Configuring an MTU for individual interfaces, when it is
>> needed, is not the same as configuring a topology, as you should
>> well understand.  It is in fact, no more configuration effort
>> than we have already bought into in supporting a variety of
>> different VLAN topology options.

I don't see that configuring per-port MTUs is such a good idea... Are
you going to have different topologies for every different MTU value?
Are RBridges supposed to be able to fragment frames?

> MTU restrictions along the path are simply not knowable to the
> protocols involved, except when larger packets are dropped.
>
> I think you're supposing that the administrator knows about all of the
> magic limits built into the equipment his packets must traverse, even
> if some of that equipment is locked in closets away from his prying
> eyes, but, frankly, I don't see how that's a realistic position.
> People are going to _expect_ that TRILL is just plug-and-play.  If
> it's anything less than that, it almost certainly won't be acceptable
> to any of the customers I know about.

We need to consider both the zero configuration case and the heavily
configured case...

>
> ...
>
>> Hence, two things are true:
>>
>> 1) the padding - if (as you assert) it serves only the one purpose
>>    of finding L2 forwarding problems - is unnecessary and
>
> I don't understand "unnecessary" here.  MTU restrictions are a fact of
> life.  That's why IS-IS has this feature.  The only alternative I see
> is to whistle past the graveyard and "hope" that there's nothing in
> the middle that might have restrictions.
>
> If you said "ineffective," I'd be in agreement.
What IS-IS is doing
> is in fact ineffective when used for L2 forwarding.

? Hellos that detect adjacencies over which data can be sent should
be padded to the maximum size of the data frames or their detection of
adjacencies will be erroneous. This is a principle in the IS-IS
specification.

But frames sent to protect against native frame looping have to be
short so they will get through...

>> 2) if it is done anyway (for whatever reason) it must be based on
>>    the effective maximum MTU at the Data-Link Layer.
>
> Agreed; including the TRILL overhead.
>
>> I propose that configuring the L2 MTU is one option - to
>> serve as a proof of concept that the issue you point out can be
>> dealt with as is.
>
> I don't think I understand the answer.  Are you saying that all
> RBridges must have MTU manually configured on all ports?  That would
> certainly put a big dent in the zero configuration story that TRILL is
> supposed to have.

Seems like we need to allow some MTU configuration but specify the
zero configuration default for 802.3 links...

> ...
>
> James Carlson, Solaris Networking
> Sun Microsystems / 35 Network Drive        71.232W   Vox +1 781 442 2084
> MS UBUR02-212 / Burlington MA 01803-2757   42.496N   Fax +1 781 442 1677

Thanks,
Donald
=============================
Donald E. Eastlake 3rd   +1-508-634-2066 (home)
155 Beaver Street
Milford, MA 01757 USA
d3e3e3 at gmail.com

From Radia.Perlman at sun.com  Mon Feb 16 21:17:09 2009
From: Radia.Perlman at sun.com (Radia Perlman)
Date: Mon, 16 Feb 2009 21:17:09 -0800
Subject: [rbridge] potential L2 forwarding loop issue in RBridges
In-Reply-To: <18836.30287.649263.814351@gargle.gargle.HOWL>
References: <18836.30287.649263.814351@gargle.gargle.HOWL>
Message-ID: <499A4855.2010808@sun.com>

Yup. Nasty problem.

So...it seems to me that there are two purposes to a Hello:
a) ensuring that you can hear your neighbors
b) testing to see how big a frame you can really send.

So these are kind of competing purposes.
I'd suggest having RBridges send two types of Hellos:
a) a minimal-sized Hello that includes the RBridge's ID and priority,
   and the designated VLAN -- perhaps some other stuff like what it
   thinks the MTU size actually is
b) a fully padded Hello with all the other information included.

The second type of Hello only needs to be sent on the designated VLAN,
and perhaps could be sent less frequently than the minimal-sized Hello.

If R2 hears little Hellos from R1, and then doesn't hear any of the
big ones, R2 could complain somehow.

Radia

James Carlson wrote:
> I've discovered something in my testing of our TRILL/RBridge
> implementation that I think might be significant.  If I'm right, it
> means that something is missing from our current protocol draft
> (likely section 4.2.3.3).
>
> An RBridge by itself (no other RBridges in sight on the wire) will
> become DRB on every link it has and, since it sees no other RBridge,
> will also become AF for all configured VLANs on each link.
>
> It will begin learning and forwarding on all of those links.
> Logically, since there are no other RBridges to talk to, it will (at
> least as viewed from outside; the internal design doesn't matter)
> simply forward L2 packets from one link to another.
>
> Now suppose we have two RBridges connected like so:
>
>           hub1
>      |+----------+|
>      |            |
>   +--+--+      +--+--+
>   | RB1 |      | RB2 |
>   +--+--+      +--+--+
>      |            |
>      |+----------+|
>           hub2
>
> What should happen is that RB1 and RB2 will elect a DRB on each one
> of those networks (hub1 and hub2), resulting in either L2 forwarding
> through just a single RBridge (if AF is the same RBridge on each
> network), or in a trip through TRILL encapsulation (if RB1 is AF on
> one network and RB2 is AF on the other).
>
> So far, so good.  However, what happens if RB1 and RB2 cannot see each
> other because of MTU restrictions in hub1 and hub2?
>
> This is where things seem to get ugly.
If we're using the normal > MTU-padded IS-IS packets for TRILL Hello messages, then those will get > dropped by the hubs. Both RBridges will declare themselves to be DRB > and AF for everything, and both will begin forwarding L2 frames. > Neither sees the other at all, except that for all frames sized to the > restriction or less, both will attempt to forward. > > The loop avoidance in 4.2.3.3 doesn't apply, because the Hellos aren't > getting through; they're choked on MTU restrictions. The "root bridge > collision check" doesn't apply, because we're not actually > decapsulating a frame from some other RBridge. > > Chaos then ensues. > > Conceptually, if we were (in this stand-alone L2 forwarding between > ports case) pretending to encapsulate in TRILL and then immediately > decapsulating, then the collision check would apply, but it wouldn't > actually work, because there *isn't* any 802.1D bridged connection > between hub1 and hub2, and thus they do have completely different root > bridges. > > This is a more significant problem to me because we will naturally be > dealing with unusual packet sizes for TRILL. The path between TRILL > RBridges needs to be at least 24 octets bigger than it is for regular > bridges, and if we're to avoid accidentally trying to use unsuitable > paths, such as ordinary (non-jumbo-enabled) switches and hubs, then we > need to check for them, which is what IS-IS's MTU padding does. > > In routing, the IS-IS Hello failure means that forwarding for > non-local destinations _stops_ through that link, which means that > loops can't form. Unfortunately, TRILL defines things such that if > IS-IS Hello fails, forwarding _starts_ through that link, and since > (unlike routing) bridging doesn't distinguish between local and > non-local, loops become possible. We have it backwards. > > I can think of at least two ways to avoid this problem, but I think > I'd like to hear from the group before going ahead with a formal > description of one. 
> > A. Require TRILL IS-IS implementations to send very small Hello > messages, not MTU-padded, and with as little information as we > can manage. (If we need "large" Hellos, then send small ones > occasionally as well.) This allows the loop avoidance part to > do its work. > > (Or perhaps the MTU that IS-IS uses should not be the actual MTU > needed along the path with TRILL overhead, but the MTU for data > packets, since IS-IS is now going unencapsulated. But then this > means that MTU-restricted paths are unprotected and cannot be > detected by normal IS-IS operation. We end up with the black > hole problem.) > > B. Require RBridge implementations to include STP as well. > However, the STP portion in this solution has (by design) no > effect on TRILL itself. Instead, we use STP's link "forwarding" > state to gate the AF behavior in the following ways: > > i. When STP reports that the link is not in "forwarding" > state, we refuse to become AF for any VLAN. We become AF > only if STP reports "forwarding" for the link *and* the > DRB appoints us to do it. > > ii. When encapsulating from or decapsulating to native format, > examine both the AF flag and the STP state. Discard the > packet if not AF or if STP state is not "forwarding." > > Thus, I have at least two questions here: > > 1. Have I correctly identified a real problem? (If not, then a > bonus question would be "why isn't it a problem even though I've > run into this?") > > 2. Assuming it is a problem, which solution do we prefer? > > (And perhaps a third question: does it bug anyone else that "MTU" > never appears in this document ... ?) > > From Radia.Perlman at sun.com Mon Feb 16 21:22:24 2009 From: Radia.Perlman at sun.com (Radia Perlman) Date: Mon, 16 Feb 2009 21:22:24 -0800 Subject: [rbridge] WG last call on draft-trill-rbridge-protocol-10.txt In-Reply-To: References: Message-ID: <499A4990.40004@sun.com> Ayan -- I"ll answer two of the things you mentioned. 
Ayan Banerjee wrote: > > Clarification for EASDI: > If my understanding of the draft is accurate, there are possibly 4K > vlan-ESADIs and on each node and we are "only" running the CSNP > functionality (of traditional IS-IS). The "hello" functionality is not run > on them. Consider a router that is interested in all "vlans"; such a router > will have to support *all* VLAN-ESADIs. I believe that this is a significant > load on the router. I believe that we should have an optional single-ESADI > instance as well; this will allow for control-plane learning of unicast MAC > addresses in a more scalable fashion. I am fine with having the optional 4K > instances also co-exist; but my preference is to have a single instance for > unicast MAC distribution. > I think that combining the ESADIs into one would be a complication, and it involves more overhead since RBridges would have to keep information for VLANs they are not interested in. So I think that in most cases, having the separate ESADIs would work best. An RBridge is allowed to ignore all ESADIs and only learn from data traffic, and it could choose to listen to ESADIs on some subset of VLANs, and learn from data traffic on the others. > > > Parallel links between rbridges: > We need information in the draft that states that we break ties using (a) > extended circuit id on P2P links (makes 3-way handshake mandatory) and (b) > in a LAN, use lan circuit id. > > I think that if R1 and R2 have two pt-to-pt links between them, they should not be reporting both of them in their LSP. There's already a tie-breaker for pseudonodes, right? Maybe I don't understand this issue. 
Radia From james.d.carlson at Sun.COM Tue Feb 17 06:03:00 2009 From: james.d.carlson at Sun.COM (James Carlson) Date: Tue, 17 Feb 2009 09:03:00 -0500 Subject: [rbridge] potential L2 forwarding loop issue in RBridges In-Reply-To: <499A4855.2010808@sun.com> References: <18836.30287.649263.814351@gargle.gargle.HOWL> <499A4855.2010808@sun.com> Message-ID: <18842.50068.113516.12288@gargle.gargle.HOWL> Radia Perlman writes: > I'd suggest having RBridges send two types of Hellos: > a) a minimal-sized Hello that includes the RBridge's ID and priority, > and the designated VLAN -- perhaps > some other stuff like what it thinks the MTU size actually is > b) fully padded Hello with all the other information included. Yes, that's essentially what I had in mind for my option (A). -- James Carlson, Solaris Networking Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084 MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677 From james.d.carlson at sun.com Tue Feb 17 06:37:52 2009 From: james.d.carlson at sun.com (James Carlson) Date: Tue, 17 Feb 2009 09:37:52 -0500 Subject: [rbridge] potential L2 forwarding loop issue in RBridges In-Reply-To: <1028365c0902142135k7fc39de0k63db9f846c4f0878@mail.gmail.com> References: <18836.30287.649263.814351@gargle.gargle.HOWL> <941D5DCD8C42014FAF70FB7424686DCF04A044B4@eusrcmw721.eamcs.ericsson.se> <18836.37403.734102.790451@gargle.gargle.HOWL> <941D5DCD8C42014FAF70FB7424686DCF04A048E2@eusrcmw721.eamcs.ericsson.se> <49950AB3.50104@cisco.com> <1028365c0902121958m20214108we2310386f915adc0@mail.gmail.com> <4C94DE2070B172459E4F1EE14BD2364E0286116A@HQ-EXCH-5.corp.brocade.com> <18837.43525.148230.571133@gargle.gargle.HOWL> <1028365c0902142135k7fc39de0k63db9f846c4f0878@mail.gmail.com> Message-ID: <18842.52160.896503.253453@gargle.gargle.HOWL> Donald Eastlake writes: > I'm not sure of that. Assume a long chaing of RBridges, RB1, RB2, RB3, > ... RBn, each connected to the next. 
Then assume RB1 is connected to a
> restricted MTU hub which in turn is connected to RBn. Don't we have
> the same problem if Hellos can't get through the hub? Admittedly, if
> the hub was actually an 802.1 bridge and you implemented the optional
> root bridge collision check, you would be safe...

Yes, that's what I was hoping for in that case.  Since I had been
considering running STP in parallel, that would have guaranteed that
there'd be a proper root bridge ID available (even if that hub itself
wasn't a bridge at all, but a repeater), because *we* could always be
the root.

> So I don't see why it's restricted to "plain L2-to-L2 forwarding".
>
> There are other disadvantages to a giant spanning tree, as mentioned
> further below.

Agreed.

> > In other words, if TRILL thinks the limit is 9000, and you need 9024
> > to get your encapsulated packets through, then you've got a problem,
> > because you're all out of head room.
>
> Your arguments above convince me more than ever that we need two
> different frames for the two different functions currently being
> performed by Hellos. For IS-IS topology adjacency, you want Hellos
> that are more or less as long as the longest frames you will normally
> be sending. If all of your campus handles 9K Jumbo frames, then you
> want 9K+ Hellos so your topology will reflect adjacencies over which
> the jumbo frames can flow.
>
> On the other hand, for native frame loop protection, in theory, you
> would want a "Hello" as small as the smallest native frame you would
> encounter. That's probably impractically small as you need some
> minimal data in these loop-protection-Hellos. But they could still be
> pretty small which should be safe enough.

Yes, that's what Radia and I have both suggested.  I'm not wild about
it, but there's a case to be made.
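
[Illustration, not part of the original message.] The two Hello sizes discussed above can be sketched roughly as follows. This is a minimal Python sketch under stated assumptions: it uses the standard IS-IS Padding TLV (type 8, at most 255 value octets per TLV), while the function names, the 30-octet Hello body, and the 1500-octet target are hypothetical and not taken from the draft.

```python
PADDING_TLV_TYPE = 8   # IS-IS Padding TLV
MAX_TLV_VALUE = 255    # the TLV length field is a single octet


def padding_tlvs(fill: int) -> bytes:
    """Return Padding TLVs occupying at most `fill` octets.

    Each TLV needs a 2-octet header, so a single leftover octet
    cannot be filled and is left unused.
    """
    out = bytearray()
    while fill >= 2:
        value_len = min(fill - 2, MAX_TLV_VALUE)
        out += bytes([PADDING_TLV_TYPE, value_len]) + bytes(value_len)
        fill -= 2 + value_len
    return bytes(out)


def adjacency_hello(body: bytes, target_size: int) -> bytes:
    """Pad a Hello up to the largest frame the adjacency must carry."""
    return body + padding_tlvs(max(0, target_size - len(body)))


def loop_protection_hello(body: bytes) -> bytes:
    """Leave the loop-protection Hello unpadded so it stays small."""
    return body


if __name__ == "__main__":
    body = bytes(30)  # hypothetical Hello contents
    print(len(adjacency_hello(body, 1500)), len(loop_protection_hello(body)))
```

The padded Hello only proves that frames of the target size get through; the unpadded one only proves that a neighbor exists, which is why the thread treats them as two separate functions.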
> >> (802.3as-2006 specifies three frame sizes with the following size for > >> the MAC Client Data transported: basic frame = 1500, Q-tagged frame = > >> 1504, and envelope frame = 1982.) > > > > My guess is that your budget allows buying newfangled stuff more than > > mine. ;-} > > ? I was just quoting the standard. Most modern equipment seems to > either allow for a reasonably large amount of tag stuff (but not as > much as the new "envelope frame" standard) or goes all the way and > allows 9K jumbo frames. One of the unresolved issues is whether those who build such devices consider TRILL overhead to be "tag stuff" or just payload. Being overly strict about what you consider to be "payload" seems to be par for this course. > First of all, there is no positive benefit to having the entire > RBridge act as an STP node. It requires interaction between all the > ports. Easier to just run STP on each RBridge port as if it was a > stub. Agreed. The more I think about this option, the less I like it. Consider me convinced. ;-} > > Thus, I think the TRILL IS-IS implementations need to compute at least > > MTU+6 in order to match the MTU seen by TRILL. > > Seems reasonably to me. (I wanted the IS-IS PDUs to remain > encapsulated, which also solves this, but the WG was of the other > opinion.) Rewacking our implementation has at least turned up some interesting issues. > > The issue I was posing was with MTU restrictions along the path > > between TRILL peers. The only way you can discover this with the > > protocols we're using (as far as I know) is that you just don't > > receive Hello messages from some or all of your peers. How can you > > distinguish between that failure case (MTU too small; must disable) > > and the case where no peers are present (normal leaf network; must > > enable)? > > Well, you want to disable topological adjacency if the MTU is too > small. 
But, unless the ports are configured as trunk ports, you always
> want exactly one appointed forwarder per VLAN on the link...

There are secondary issues here as well.  If you detect an MTU
restriction, it means that the network is misconfigured (at best) or
that it (more likely) contains a "broken" device.

Thus, not only is your adjacency in trouble, but the configuration of
that interface and perhaps of the larger network is in question, and
you need to report that there's a problem.

Even without adjacencies, there's the issue of AF behavior to consider
-- no fair having an AF that only handles "some" of the packets.

(Obviously, there are MTU problems that _can't_ be detected this way
-- if a restriction has no TRILL neighbors behind it, you're sunk.  I
guess LLDP might be an out ... if it were commonly used.)

> > I don't understand "unnecessary" here.  MTU restrictions are a fact of
> > life.  That's why IS-IS has this feature.  The only alternative I see
> > is to whistle past the graveyard and "hope" that there's nothing in
> > the middle that might have restrictions.
> >
> > If you said "ineffective," I'd be in agreement.  What IS-IS is doing
> > is in fact ineffective when used for L2 forwarding.
>
> ? Hellos that detect adjacencies over which data can be sent should
> be padded to the maximum size of the data frames or their detection of
> adjacencies will be erroneous. This is a principle in the IS-IS
> specification.

What makes that mechanism *effective* is that with normal IS-IS (for
L3 forwarding), the forwarding for non-local destinations stops if a
problem is encountered.

With the current TRILL scheme, forwarding continues if a problem is
encountered, and thus I was offering up "ineffective" in contrast to
Eric's suggested "unnecessary."  I think it's necessary.  I just don't
think it works without the changes you've suggested.
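
[Illustration, not part of the original message.] One way to read the behavior argued for above is as a small decision table keyed on which kind of Hello has been heard from a neighbor. The Python sketch below is only an interpretation of the thread, not anything specified in the draft; the `LinkDecision` fields and the `decide` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkDecision:
    form_adjacency: bool        # include the neighbor in the IS-IS topology
    neighbor_in_election: bool  # count the neighbor in DRB/AF election
    report_mtu_problem: bool    # flag a likely MTU restriction to the operator


def decide(small_hello_heard: bool, padded_hello_heard: bool) -> LinkDecision:
    if padded_hello_heard:
        # Full-size frames get through: a normal adjacency.
        return LinkDecision(True, True, False)
    if small_hello_heard:
        # A neighbor exists but large frames are lost: no adjacency,
        # yet still exactly one appointed forwarder per VLAN.
        return LinkDecision(False, True, True)
    # Nothing heard: ordinary leaf network, this RBridge is DRB and AF.
    return LinkDecision(False, False, False)


if __name__ == "__main__":
    for small, padded in [(True, True), (True, False), (False, False)]:
        print(small, padded, decide(small, padded))
```

The middle case is the one the current scheme cannot express: without a small Hello it is indistinguishable from the leaf-network case, so both RBridges start forwarding.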
--
James Carlson, Solaris Networking
Sun Microsystems / 35 Network Drive        71.232W   Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757   42.496N   Fax +1 781 442 1677

From james.d.carlson at sun.com  Wed Feb 18 07:02:45 2009
From: james.d.carlson at sun.com (James Carlson)
Date: Wed, 18 Feb 2009 10:02:45 -0500
Subject: [rbridge] IS-IS for TRILL interoperability issues
Message-ID: <18844.8981.454420.809959@gargle.gargle.HOWL>

In testing our TRILL implementation, I've discovered that there's both
ambiguity in the current -11 I-D (separate designers here were able to
interpret the specification in incompatible ways), as well as serious
issues looming further down the road.

I recommend (1) including explicit diagrams of *all* of the packet
formats (currently, only the TRILL header is diagrammed), so that
there can be no confusion about the meaning of the English text, and
(2) reconsidering the recent "IS-IS unencapsulated" decision.

The first issue (specification ambiguity) revolves around the way
IS-IS frames are transmitted, and the interpretation of the word "and"
in this text:

   o  "TRILL" frames are those (1) with a multicast destination
      address allocated to the TRILL protocol (see Section 7.2) and
      (2) non-control frames with the TRILL Ethertype. There are

Assuming Ethernet, my reading of the current specification results in
TRILL IS-IS frames that look like this:

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|              All-IS-IS-RBridges MAC Destination               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    MAC Destination (cont)     |      Source MAC Address       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Source MAC Address                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Ethernet 802 Length Field   |   DSAP = FE   |   SSAP = FE   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   CTRL = 03   |   IDRP = 83   |   Length ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-...
That is, I read the "and" in the text to mean that any packet with
that special multicast destination *OR* with the appropriate Ethertype
is to be considered a "TRILL frame," and I understood "unencapsulated"
to mean that the *ONLY* thing distinguishing this from a regular IS-IS
frame was the destination MAC address (and woe unto those
implementations that don't filter multicast perfectly).

The other designer here apparently read that "and" literally, because
his frames look like this on the wire:

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|              All-IS-IS-RBridges MAC Destination               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    MAC Destination (cont)     |      Source MAC Address       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Source MAC Address                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|        TRILL Ethertype        |   IDRP = 83   |   Length ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-...

Both seem plausible, and given the lack of information, it's possible
that others may read this specification in other ways, perhaps
including the LLC header with the Ethertype, depending on what those
readers consider to be the IS-IS frame.

Thus, I strongly encourage the authors to include an appropriate
diagram (such as the above) to make the frame format unambiguous.

However, this observation leads directly into the second problem,
which I view as much less tractable.  Because of the "unencapsulated"
change, the specification now critically depends on Outer.MacDA.
Unfortunately, the presence of the Outer. fields depends on the
underlying medium.  When TRILL is run over something *other* than
Ethernet, those fields change.

For example, when run over PPP, there are no Outer.anything fields
available at all other than a protocol type.
A TRILL data packet on PPP would look like this (without header
compression):

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   ADDR = FF   |   CTRL = 03   |       TRILL Protocol ID       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| V | R |M|Op-Length| Hop Count |    Egress RBridge Nickname    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Ingress RBridge Nickname    |  Options ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-...

But what does a TRILL IS-IS packet look like on PPP?  There's no way
to set Outer.MacDA to distinguish this message, so the remaining
options appear to be:

  - Allocate two protocol IDs, one for TRILL data and the other for
    TRILL IS-IS.  (As the chair of PPP extensions, I can tell you that
    allocation of multiple protocol IDs for a single purpose would
    likely be viewed dimly by a significant number of participants.
    It just reeks of poor planning.)

  - Cheat on the TRILL header.  As long as the version field "V" is
    never set to '10', you can always tell an IS-IS frame by testing
    the first byte after the protocol ID: it's 83 for IS-IS, and
    anything else for TRILL.

  - Given both this problem and the subtle MTU issues, reconsider the
    "unencapsulated" decision, and put IS-IS frames after a TRILL
    header that includes a flag indicating whether the payload is
    IS-IS or data.
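
[Illustration, not part of the original message.] The second option (testing the first byte after the PPP protocol ID) can be sketched in a few lines of Python. The field layout (V in the top two bits of the first octet) follows the diagram above; the function names are hypothetical.

```python
ISIS_NLPID = 0x83  # first octet of an IS-IS PDU ("IDRP = 83" in the diagrams)


def trill_version(first_octet: int) -> int:
    """The TRILL version field V is the top two bits of the first octet."""
    return first_octet >> 6


def classify_ppp_payload(payload: bytes) -> str:
    """Classify what follows the PPP protocol ID by its first octet.

    This is unambiguous only while V is never binary 10: a TRILL header
    with V=10, R=00, M=0 and Op-Length starting 011 would also begin
    with 0x83.
    """
    if not payload:
        raise ValueError("empty payload")
    return "is-is" if payload[0] == ISIS_NLPID else "trill"


if __name__ == "__main__":
    print(bin(trill_version(ISIS_NLPID)))
    print(classify_ppp_payload(bytes([0x83, 0x1B])))
    print(classify_ppp_payload(bytes([0x00, 0x3F, 0x12, 0x34])))
```

The check works, but at the cost of permanently reserving one of the four possible version values.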
--
James Carlson, Solaris Networking
Sun Microsystems / 35 Network Drive        71.232W   Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757   42.496N   Fax +1 781 442 1677

From james.d.carlson at sun.com  Wed Feb 18 10:20:41 2009
From: james.d.carlson at sun.com (James Carlson)
Date: Wed, 18 Feb 2009 13:20:41 -0500
Subject: [rbridge] IS-IS for TRILL interoperability issues
In-Reply-To: <18844.8981.454420.809959@gargle.gargle.HOWL>
References: <18844.8981.454420.809959@gargle.gargle.HOWL>
Message-ID: <18844.20857.278803.129874@gargle.gargle.HOWL>

James Carlson writes:
>   - Given both this problem and the subtle MTU issues, reconsider the
>     "unencapsulated" decision, and put IS-IS frames after a TRILL
>     header that includes a flag indicating whether the payload is
>     IS-IS or data.

I know, bad form to follow up my own posting, and I should have
included this in my original message, but I could live with either the
original Inner.MacDA test (needed for both IS-IS and ESADI) to
distinguish these frames or with a discrete flag in the header.

The latter would chew up a precious bit, but would be far easier to
implement in hardware and embedded systems (much easier than looking
for a giant Ethernet address later in the message at a potentially
*variable* offset due to options).

That latter sort of header would look like this on Ethernet:

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|              All-IS-IS-RBridges MAC Destination               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    MAC Destination (cont)     |      Source MAC Address       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Source MAC Address                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|        TRILL Ethertype        | V |R|C|M|Op-Length| Hop Count |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Egress RBridge Nickname    |   Ingress RBridge Nickname    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  TRILL Options ...
+-+-+-+-+-+-+-+-+-...
|   IDRP = 83   |   Length ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-...

and this on PPP:

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   ADDR = FF   |   CTRL = 03   |       TRILL Protocol ID       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| V |R|C|M|Op-Length| Hop Count |    Egress RBridge Nickname    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Ingress RBridge Nickname    |  TRILL Options ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-...
|   IDRP = 83   |   Length ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-...

Where the big change is the addition of the "C" (command/data) flag:

    If C == 0, then the header specifies TRILL data, and contains an
    Ethernet MAC header with VLAN tag after the TRILL Options (if any).

    If C == 1, then:
        M must be 0
        Op-Length is zero because no options are yet defined
        Hop Count must be 0
        Egress and Ingress RBridge Nicknames are 0 on transmit
        The IS-IS packet (starting with IDRP) begins after the
        (non-existent) options field.

A form like this guarantees that the key information required to
distinguish control from data is always at fixed offsets in the
packet, which is necessary for hardware design, simple embedded
systems, and some kinds of packet filtering mechanisms.
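
[Illustration, not part of the original message.] The fixed-offset property claimed for the "C" flag can be made concrete with a short Python sketch. The bit positions are an assumption read off the diagram above (V: 2 bits, R: 1, C: 1, M: 1, Op-Length: 5, Hop Count: 6, then two 16-bit nicknames); the type and function names are hypothetical.

```python
import struct
from typing import NamedTuple


class TrillHeader(NamedTuple):
    version: int
    reserved: int
    control: int     # the proposed "C" flag: 1 = IS-IS, 0 = data
    multi_dest: int  # the "M" bit
    op_length: int
    hop_count: int
    egress: int
    ingress: int


def pack_header(h: TrillHeader) -> bytes:
    first = (h.version << 14) | (h.reserved << 13) | (h.control << 12) \
        | (h.multi_dest << 11) | (h.op_length << 6) | h.hop_count
    return struct.pack("!HHH", first, h.egress, h.ingress)


def unpack_header(data: bytes) -> TrillHeader:
    first, egress, ingress = struct.unpack("!HHH", data[:6])
    return TrillHeader(first >> 14, (first >> 13) & 1, (first >> 12) & 1,
                       (first >> 11) & 1, (first >> 6) & 0x1F, first & 0x3F,
                       egress, ingress)


def is_valid_control(h: TrillHeader) -> bool:
    """Receive-side check of the C == 1 rules (nicknames are only
    required to be zero on transmit, so they are not checked here)."""
    return (h.control == 1 and h.multi_dest == 0
            and h.op_length == 0 and h.hop_count == 0)


if __name__ == "__main__":
    ctrl = pack_header(TrillHeader(0, 0, 1, 0, 0, 0, 0, 0))
    print(ctrl.hex(), is_valid_control(unpack_header(ctrl)))
```

With this layout a filter only has to test one bit (mask 0x10) of the first header octet, regardless of options.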
--
James Carlson, Solaris Networking
Sun Microsystems / 35 Network Drive        71.232W   Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757   42.496N   Fax +1 781 442 1677

From Radia.Perlman at sun.com  Wed Feb 18 16:10:43 2009
From: Radia.Perlman at sun.com (Radia Perlman)
Date: Wed, 18 Feb 2009 16:10:43 -0800
Subject: [rbridge] Ambiguity with IS-IS messages on PPP
In-Reply-To: <18844.20857.278803.129874@gargle.gargle.HOWL>
References: <18844.8981.454420.809959@gargle.gargle.HOWL>
	<18844.20857.278803.129874@gargle.gargle.HOWL>
Message-ID: <499CA383.2010408@sun.com>

(Note: I like descriptive subject lines, so I'm changing the subject
line for the thread to discuss the latest problem James Carlson
noticed in the spec).

He mentioned that currently the way to tell the difference between an
IS-IS message and an encapsulated TRILL message is that the outer
destination MAC address is "All-IS-IS-RBridges" for IS-IS messages,
and "All-RBridges" for encapsulated data, and ESADI.

But on PPP, there is no "outer destination MAC address". So, how can
we differentiate them?

Here are the possibilities that were floated around:

a) get two protocol types for TRILL for PPP (it's frowned on to
   "waste" these)

b) use the top bit after the PPP header, which would be the version
   field if there were a TRILL header there, which fortuitously today
   happens to be 0 for TRILL-encapsulated, and 1 for IS-IS. (kind of
   kludgy, and uses a bit of the already small version number field)

c) insert a "sub-protocol type" field after the PPP header, either a
   byte, or more if people want the header byte- or word-aligned.
   Currently it would only have two values (IS-IS, or encapsulated)

d) always encapsulate with a TRILL header, and differentiate based on
   the "inner" MAC destination address

************

I think I prefer suggestion c).
Here's James Carlson's original post: James Carlson writes: > > - Given both this problem and the subtle MTU issues, reconsider the > > "unencapsulated" decision, and put IS-IS frames after a TRILL > > header that includes a flag indicating whether the payload is > > IS-IS or data. > I know, bad form to follow up my own posting, and I should have included this in my original message, but I could live with either the original Inner.MacDA test (needed for both IS-IS and ESADI) to distinguish these frames or with a discrete flag in the header. The latter would chew up a precious bit, but would be far easier to implement in hardware and embedded systems (much easier than looking for a giant Ethernet address later in the message at a potentially *variable* offset due to options). That latter sort of header would look like this on Ethernet: +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | All-IS-IS-RBridges MAC Destination | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | MAC Destination (cont) | Source MAC Address | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Source MAC Address | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | TRILL Ethertype | V |R|C|M|Op-Length| Hop Count | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Egress RBridge Nickname | Ingress RBridge Nickname | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | TRILL Options ... +-+-+-+-+-+-+-+-+-... | IDRP = 83 | Length ... +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-... and this on PPP: +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ADDR = FF | CTRL = 03 | TRILL Protocol ID | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | V |R|C|M|Op-Length| Hop Count | Egress RBridge Nickname | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Ingress RBridge Nickname | TRILL Options ... +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-... 
| IDRP = 83 | Length ... +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-... Where the big change is the addition of the "C" (command/data) flag: If C == 0, then the header specifies TRILL data, and contains an Ethernet MAC header with VLAN tag after the TRILL Options (if any). If C == 1, then: M must be 0 Op-Length is zero because no options are yet defined Hop Count must be 0 Egress and Ingress RBridge Nicknames are 0 on transmit The IS-IS packet (starting with IDRP) begins after the (non-existent) options field. A form like this guarantees that the key information required to distinguish control from data is always at fixed offsets in the packet, which is necessary for hardware design, simple embedded systems, and some kinds of packet filtering mechanisms. -- James Carlson, Solaris Networking Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084 MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677 _______________________________________________ rbridge mailing list rbridge at postel.org http://mailman.postel.org/mailman/listinfo/rbridge From james.d.carlson at Sun.COM Thu Feb 19 05:18:38 2009 From: james.d.carlson at Sun.COM (James Carlson) Date: Thu, 19 Feb 2009 08:18:38 -0500 Subject: [rbridge] Ambiguity with IS-IS messages on PPP In-Reply-To: <499CA383.2010408@sun.com> References: <18844.8981.454420.809959@gargle.gargle.HOWL> <18844.20857.278803.129874@gargle.gargle.HOWL> <499CA383.2010408@sun.com> Message-ID: <18845.23598.604855.274846@gargle.gargle.HOWL> Radia Perlman writes: > c) insert a "sub-protocol type" field after the PPP header, either a > byte, or more if > people want the header byte-or word aligned. Currently it would only > have two > values (IS-IS, or encapsulated) [...] > I think I prefer suggestion c). I dislike (c) for at least two reasons: - This means that TRILL headers have two "formats." 
There's one format (without this sub-protocol [kludge] field) on media that have L2 addresses that we can allocate, and another format (with the field) on media that don't. This is unlike all other L3 protocols that run on PPP, which all have well-known formats that also work on other media. - The difference in headers means that the already-subtle MTU computation (which is marred by the "unencapsulated" IS-IS running in parallel with the highly-encapsulated data) gains another wart. I'm already a bit wary that we can get interoperability at all (the spec has no real guidance on MTU issues), and this will add to the problem. - What happens if we simplify the implementation? Why not have the same TRILL header format on both Ethernet and PPP? To me, the sub-protocol idea is just a half-step into using a flag to distinguish data from control, so we might as well go all the way. OK; at least two reasons. Probably more like three. ;-} One more possible solution (peculiar to PPP) would be: (e) use the rarely-used 4xxx range of PPP Protocol IDs to carry TRILL IS-IS and ESADI. That's still not so great, as we'd need to distinguish between TRILL IS-IS and ESADI, and it still doesn't address MTU. 
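
As a rough illustration of the flag-based header discussed in this thread, the Python sketch below decodes the six-octet TRILL header that would follow either the TRILL Ethertype or the TRILL PPP Protocol ID, and uses the proposed C bit to tell control from data. The bit positions are read off the diagrams earlier in the thread; the names `TrillHeader`, `parse_trill_header` and `classify` are made up for this example, and treating Op-Length as a count of 4-octet units is an assumption rather than something stated here.

```python
import struct
from typing import NamedTuple, Tuple


class TrillHeader(NamedTuple):
    version: int      # V, 2 bits
    c_flag: int       # proposed C (control/data) bit
    multi_dest: int   # M bit
    op_length: int    # Op-Length, 5 bits
    hop_count: int    # Hop Count, 6 bits
    egress: int       # Egress RBridge Nickname, 16 bits
    ingress: int      # Ingress RBridge Nickname, 16 bits


def parse_trill_header(buf: bytes) -> TrillHeader:
    """Decode V(2) R(1) C(1) M(1) Op-Length(5) Hop Count(6), then two nicknames."""
    if len(buf) < 6:
        raise ValueError("TRILL header needs at least 6 octets")
    first, egress, ingress = struct.unpack("!HHH", buf[:6])
    return TrillHeader(
        version=(first >> 14) & 0x3,
        c_flag=(first >> 12) & 0x1,
        multi_dest=(first >> 11) & 0x1,
        op_length=(first >> 6) & 0x1F,
        hop_count=first & 0x3F,
        egress=egress,
        ingress=ingress,
    )


def classify(buf: bytes) -> Tuple[str, int]:
    """Return ("control" or "data", offset of the payload within buf)."""
    hdr = parse_trill_header(buf)
    if hdr.c_flag == 1:
        # Proposed rules for C == 1: M, Op-Length and Hop Count are all zero.
        if hdr.multi_dest or hdr.op_length or hdr.hop_count:
            raise ValueError("malformed control header")
        return "control", 6  # the IS-IS PDU (IDRP octet 0x83) starts here
    # Assumption: Op-Length counts options in 4-octet units.
    return "data", 6 + 4 * hdr.op_length
```

Because the header is identical on both media, the same `classify` call would serve after stripping either the Ethernet header or the PPP header, and the C bit always sits at a fixed offset.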
-- James Carlson, Solaris Networking Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084 MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677 From d3e3e3 at gmail.com Thu Feb 19 06:30:08 2009 From: d3e3e3 at gmail.com (Donald Eastlake) Date: Thu, 19 Feb 2009 09:30:08 -0500 Subject: [rbridge] IS-IS for TRILL interoperability issues In-Reply-To: <1028365c0902180912t6329b64co34cdc3b042cb1173@mail.gmail.com> References: <18844.8981.454420.809959@gargle.gargle.HOWL> <1028365c0902180912t6329b64co34cdc3b042cb1173@mail.gmail.com> Message-ID: <1028365c0902190630o3196bb2bk660d096ecf40bdda@mail.gmail.com> Hi Jim, On Wed, Feb 18, 2009 at 10:02 AM, James Carlson wrote: > In testing our TRILL implementation, I've discovered that there's both > ambiguity in the current -11 I-D (separate designers here were able to > interpret the specification in incompatible ways), as well as serious > issues looming further down the road. > > I recommend (1) including explicit diagrams of *all* of the packet > formats (currently, only the TRILL header is diagrammed), so that > there can be no confusion about the meaning of the English text, and > (2) reconsidering the recent "IS-IS unencapsulated" decision. > > The first issue (specification ambiguity) revolves around the way > IS-IS frames are transmitted, and the interpretation of the word "and" > in this text: > > o "TRILL" frames are those (1) with a multicast destination > address allocated to the TRILL protocol (see Section 7.2) and > (2) non-control frames with the TRILL Ethertype. There are Probably should be re-worded something like: "TRILL" frames are those that either (1) have a multicast destination address allocated to the TRILL protocol (see Section 7.2) or (2) are non-control frames with the TRILL Ethertype. ... 
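
The difference between the two readings of that "and" can be made concrete. The following Python sketch is illustrative only: `TRILL_MULTICAST`, `TRILL_ETHERTYPE` and the `is_control` argument are placeholders, since this thread gives neither the address values nor the exact test for a control frame.

```python
# Placeholder values for illustration; the thread does not state the real ones.
TRILL_MULTICAST = frozenset({bytes.fromhex("0180c2000041")})
TRILL_ETHERTYPE = 0x22F3


def is_trill_frame_either(dst_mac: bytes, type_or_len: int, is_control: bool) -> bool:
    """Reworded definition: condition (1) OR condition (2)."""
    return dst_mac in TRILL_MULTICAST or (
        not is_control and type_or_len == TRILL_ETHERTYPE
    )


def is_trill_frame_both(dst_mac: bytes, type_or_len: int) -> bool:
    """Literal reading of "and": the frame must carry both markers."""
    return dst_mac in TRILL_MULTICAST and type_or_len == TRILL_ETHERTYPE
```

An unencapsulated IS-IS frame (802 length field, no TRILL Ethertype) sent to the TRILL multicast address passes the first test and fails the second, which is the interoperability gap being reported.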
> Assuming Ethernet, my reading of the current specification results in
> TRILL IS-IS frames that look like this:
>
> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> | All-IS-IS-RBridges MAC Destination                            |
> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> | MAC Destination (cont)        | Source MAC Address            |
> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> | Source MAC Address                                            |
> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> | Ethernet 802 Length Field     | DSAP = FE     | SSAP = FE     |
> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> | CTRL = 03     | IDRP = 83     | Length ...
> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-...
>
> That is, I read the "and" in the text to mean that any packet with
> that special multicast destination *OR* with the appropriate Ethertype
> is to be considered a "TRILL frame," and I understood "unencapsulated"
> to mean that the *ONLY* thing distinguishing this from a regular IS-IS
> frame was the destination MAC address (and woe unto those
> implementations that don't filter multicast perfectly).

The above is indeed what was intended.

> The other designer here apparently read that "and" literally, because
> his frames look like this on the wire:
>
> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> | All-IS-IS-RBridges MAC Destination                            |
> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> | MAC Destination (cont)        | Source MAC Address            |
> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> | Source MAC Address                                            |
> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> | TRILL Ethertype               | IDRP = 83     | Length ...
> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-...
> > Both seem plausible, and given the lack of information, it's possible > that others may read this specification in other ways, perhaps > including the LLC header with the Ethertype, depending on what those > readers consider to be the IS-IS frame. Thus, I strongly encourage > the authors to include an appropriate diagram (such as the above) to > make the frame format unambiguous. Yes, the wording in the definition of a TRILL frame should be improved and an explicit frame diagram for a TRILL IS-IS frame should also be included. > ... [see other messages re PPP problem] Thanks, Donald -- ============================= Donald E. Eastlake 3rd +1-508-634-2066 (home) 155 Beaver Street Milford, MA 01757 USA d3e3e3 at gmail.com From james.d.carlson at sun.com Thu Feb 19 07:15:31 2009 From: james.d.carlson at sun.com (James Carlson) Date: Thu, 19 Feb 2009 10:15:31 -0500 Subject: [rbridge] IS-IS for TRILL interoperability issues In-Reply-To: <1028365c0902190630o3196bb2bk660d096ecf40bdda@mail.gmail.com> References: <18844.8981.454420.809959@gargle.gargle.HOWL> <1028365c0902180912t6329b64co34cdc3b042cb1173@mail.gmail.com> <1028365c0902190630o3196bb2bk660d096ecf40bdda@mail.gmail.com> Message-ID: <18845.30611.813491.436287@gargle.gargle.HOWL> Donald Eastlake writes: > On Wed, Feb 18, 2009 at 10:02 AM, James Carlson wrote: > > o "TRILL" frames are those (1) with a multicast destination > > address allocated to the TRILL protocol (see Section 7.2) and > > (2) non-control frames with the TRILL Ethertype. There are > > Probably should be re-worded something like: > "TRILL" frames are those that either (1) have a multicast > destination address allocated to the TRILL protocol (see Section 7.2) > or (2) are non-control frames with the TRILL Ethertype. ... Yes, that's clearer, though I still like the idea of pictures better. There's no arguing with an array of bytes. 
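
In that spirit, here is a hedged Python sketch that builds the two on-the-wire layouts from the earlier message as literal byte strings. The function names are invented for the example, and the caller supplies any addresses and Ethertype value; only the field order comes from the diagrams in this thread.

```python
import struct

LLC_ISIS = bytes([0xFE, 0xFE, 0x03])  # DSAP = FE, SSAP = FE, CTRL = 03


def isis_frame_llc(dst: bytes, src: bytes, pdu: bytes) -> bytes:
    """Reading 1: 802 length field, then LLC, then the IS-IS PDU."""
    body = LLC_ISIS + pdu
    # The length field counts the octets that follow it (LLC plus PDU).
    return dst + src + struct.pack("!H", len(body)) + body


def isis_frame_ethertype(dst: bytes, src: bytes, ethertype: int, pdu: bytes) -> bytes:
    """Reading 2: TRILL Ethertype directly followed by the IS-IS PDU."""
    return dst + src + struct.pack("!H", ethertype) + pdu
```

With six-octet addresses, the IDRP octet (0x83) lands at offset 17 in the first layout and at offset 14 in the second, so a receiver written for one layout would misparse the other.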
> Yes, the wording in the definition of a TRILL frame should be improved > and an explicit frame diagram for a TRILL IS-IS frame should also be > included. That'll help; thanks. > > ... [see other messages re PPP problem] For what it's worth, I had them in one message because the two issues are tangled together. The text is unclear in part because of this change ... and the change itself has also made the text dependent on features of Ethernet that aren't present on other media. -- James Carlson, Solaris Networking Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084 MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677 From d3e3e3 at gmail.com Thu Feb 19 07:58:51 2009 From: d3e3e3 at gmail.com (Donald Eastlake) Date: Thu, 19 Feb 2009 10:58:51 -0500 Subject: [rbridge] WG last call on draft-trill-rbridge-protocol-10.txt In-Reply-To: References: <49307783.9010504@sun.com> Message-ID: <1028365c0902190758l2d564959lc80c60f06bf48f0b@mail.gmail.com> Hi Ayan, See below... On Tue, Jan 13, 2009 at 5:56 PM, Ayan Banerjee wrote: > Radia and Donald, > > ... > > Clarification for EASDI: > If my understanding of the draft is accurate, there are possibly 4K > vlan-ESADIs and on each node and we are "only" running the CSNP > functionality (of traditional IS-IS). The "hello" functionality is not run > on them. Consider a router that is interested in all "vlans"; such a router > will have to support *all* VLAN-ESADIs. I assume by router you mean RBridge. No RBridge "has to support" any VLAN-ESADIs. It always works, in the sense that behavior is correct, for the RBridge to just learn from the data plane. No one implementing an RBridge is required to implement any ESADI support. 
Support/use of ESADI would likely be important in cases where there is a Layer 2 registration protocol for end stations and you want to transmit the end station MAC address information securely, or where the rate of mobility or arrival/departure of end stations is so high that, in the absence of ESADI, you would have excessive black-holing due to out-of-date address cache information that has not yet timed out or excessive broadcast traffic due to unknown unicast addresses.

I can think of scenarios where an RBridge is appointed forwarder for 4K VLANs. But I have a much harder time thinking of plausible scenarios where there are 4K VLANs all of which have the special requirements to make ESADI very important. Can you give me an example?

> I believe that this is a significant
> load on the router. I believe that we should have an optional single-ESADI
> instance as well; this will allow for control-plane learning of unicast MAC
> addresses in a more scalable fashion. I am fine with having the optional 4K
> instances also co-exist; but my preference is to have a single instance for
> unicast MAC distribution.

As I say, I don't see why there would ever be thousands of VLANs for which the ESADI protocol was being used. But let's assume there are. So what exactly do you mean by "instance"? It seems to me it's mostly a matter of implementation whether it is more "load" on an RBridge that for some reason is running ESADI for thousands of VLANs to get separate LSPs/CSNPs for each VLAN or some kind of merged single set of LSPs/CSNPs that are received, processed, and possibly re-emitted hop-by-hop throughout the entire campus.

In order for this "single-ESADI instance" you propose to work, wouldn't all transit RBridges have to implement it? Doesn't that impose a big burden, since no RBridge has to implement ESADI right now? Doesn't it add a big load to almost all RBridges that are actually interested in running ESADI for zero or maybe one or two VLANs?
Wouldn't they have to actively process all ESADI protocol mediated updates for all VLANs? ESADI is currently carefully designed to impose zero control plane burden on transit RBridges that either don't implement it or don't have it enabled. Wouldn't you lose that?

Also, wouldn't the information in the "single instance" LSPs have to be labeled as to what VLAN it applies to? Doesn't that imply the specification of a bunch of additional TLVs or optional fields in TLVs with VLAN fields? And doesn't that break VLAN translation within RBridges or at least make it much more complex?

> P2P IIHs and LAN IIHs:
> When TRILL-IS-IS sends out hellos it does so based on the link capability.
> On P2P links (configured or real ones) it sends P2P IIHs and on multi-access
> links it sends LAN IIHs. I presume that in TRILL we want to default sending
> out LAN IIHs (is this accurate?). We should have a section on P2P IIHs and
> just talk about if any functionality/sub-tlvs are not required for that case
> (for example, do we need to find a common vlan in P2P like in a LAN -
> probably not etc). Note that a P2P IIH and LAN IIH will not bring up an
> adjacency.

Yes, I think the default should be LAN Hellos. I'm not sure there are many differences between P2P and LAN Hello contents. Couldn't you have a "point-to-point" link between two RBridges that was, in fact, over carrier Ethernet facilities or something that only provide connectivity on, say, VLAN 42? But if there are any differences at all, a brief section on P2P Hellos seems reasonable.

> Parallel links between rbridges:
> We need information in the draft that states that we break ties using (a)
> extended circuit id on P2P links (makes 3-way handshake mandatory) and (b)
> in a LAN, use lan circuit id.

I'm confused by what you say. Assume we have RBridges 1 and 2 such that there is, say, a point to point link between port A on RB1 and port A on RB2 and also between port B on RB1 and port B on RB2.
There are two points of view depending on whether you are one of these two RBridges or some other RBridge in the campus: Assume you are some other RBridge, RB3. Do you even see both the A and B adjacencies in the link state? I would think not and that this should be reported as only a single adjacency in the RB1 and RB2 LSPs. If, for some reason, you do see it as two adjacencies, why would you care? As long as you know there is connectivity between RB1 and RB2 you can use that in SPF calculations. I suppose you need a way of determining the cost from the two costs you would see but you could just use the minimum of the two or something. And if for some really bizarre reason, even though you are remote from RB1 and RB2 you not only see the two parallel paths in the link state but you actually care which path is taken, there is no space in the LSP TLVs to encode any tie breaking information such as you suggest. So I don't see any need for a tiebreaker here. Assume the other case, that you are either RB1 or RB2. I don't see any difficulty here either. You should accept TRILL traffic on both connections and we should say that as a clarification. (You wouldn't want the Reverse Path Forwarding Check or something causing TRILL frames on one of the parallel connections to be discarded.) And you can send traffic on either connection. Or can do Equal Cost MultiPath on both. But if you send over only one of them then, assuming they are equal cost, it seems like a purely local decision which one and I don't see why we need to specify a tie breaker. > Thanks, > Ayan > > P.S. I have not fully cross-checked with version 11 to see all that has gone > in, but I will take a look. Also, I will take a closer look on the > hello-AF/AC issue with mis-configurations and get back to you. > > ... > Thanks, Donald ============================= Donald E. 
Eastlake 3rd
+1-508-634-2066 (home)
155 Beaver Street
Milford, MA 01757 USA
d3e3e3 at gmail.com

From d3e3e3 at gmail.com Thu Feb 19 11:43:29 2009
From: d3e3e3 at gmail.com (Donald Eastlake)
Date: Thu, 19 Feb 2009 14:43:29 -0500
Subject: [rbridge] potential L2 forwarding loop issue in RBridges
In-Reply-To: <499A4855.2010808@sun.com>
References: <18836.30287.649263.814351@gargle.gargle.HOWL>
 <499A4855.2010808@sun.com>
Message-ID: <1028365c0902191143kac30515nb8aa362c71d2eb76@mail.gmail.com>

Below is a slightly more specific suggestion for "two Hellos"...

On Tue, Feb 17, 2009 at 12:17 AM, Radia Perlman wrote:
> Yup. Nasty problem.
>
> So...it seems to be that there are two purposes to a Hello:
> a) ensuring that you can hear your neighbors
> b) testing to see how big a frame you can really send.
>
> So these are kind of competing purposes.
>
> I'd suggest having RBridges send two types of Hellos:
> a) a minimal-sized Hello that includes the RBridge's ID and priority,
> and the designated VLAN -- perhaps
> some other stuff like what it thinks the MTU size actually is
> b) fully padded Hello with all the other information included.
>
> The second type of Hello only needs to be sent on the designated VLAN,
> and perhaps could
> be sent less frequently than the minimal sized Hello.

More detailed proposal on two Hellos (this would be changes in Section 4.2.3.1 and possibly other sections of the protocol specification):

The current Hello contents discussion in 4.2.3.1.2 needs to be augmented to discuss and distinguish adjacency Hellos and native frame loop safety Hellos and discuss MTU considerations.

Adjacency Hellos: Only sent on the Designated VLAN. Have all the additional data elements listed in the current section 4.2.3.1.2 including the IS Neighbor TLV. Have the usual IS-IS padding to the expected MTU tweaked as needed.

Native frame loop safety Hellos: Sent as in the current Draft on the Designated VLAN and possibly many others.
Have the same IS PDU type and fixed header fields as adjacency Hello but only have the additional data needed for loop safety. In particular, only items 1, 2, and 4 from the current section 4.2.3.1.2 and do not have an IS Neighbor TLV and do not have padding. Does the above seem reasonable as a starting point? Thanks, Donald > ... > > Radia > > > James Carlson wrote: >> ... -- ============================= Donald E. Eastlake 3rd +1-508-634-2066 (home) 155 Beaver Street Milford, MA 01757 USA d3e3e3 at gmail.com From d3e3e3 at gmail.com Thu Feb 19 12:07:55 2009 From: d3e3e3 at gmail.com (Donald Eastlake) Date: Thu, 19 Feb 2009 15:07:55 -0500 Subject: [rbridge] potential L2 forwarding loop issue in RBridges In-Reply-To: <18845.47400.573275.584245@gargle.gargle.HOWL> References: <18836.30287.649263.814351@gargle.gargle.HOWL> <499A4855.2010808@sun.com> <1028365c0902191143kac30515nb8aa362c71d2eb76@mail.gmail.com> <18845.47400.573275.584245@gargle.gargle.HOWL> Message-ID: <1028365c0902191207o42992e26qf53e597f1c480ff4@mail.gmail.com> OK. Item 5 is just a one bit flag so it doesn't seem like a big deal either way. Thanks, Donald On Thu, Feb 19, 2009 at 2:55 PM, James Carlson wrote: > Donald Eastlake writes: >> Native frame loop safety Hellos: Sent as in the current Draft on the >> Designated VLAN and possibly many others. Have the same IS PDU type >> and fixed header fields as adjacency Hello but only have the >> additional data needed for loop safety. In particular, only items 1, >> 2, and 4 from the current section 4.2.3.1.2 and do not have an IS >> Neighbor TLV and do not have padding. >> >> Does the above seem reasonable as a starting point? > > You won't be able to avoid including (5) from the current 4.2.3.1.2, > because it's in the same sub-TLV with the rest, but, yes, that sounds > reasonable to me. 
> > -- > James Carlson, Solaris Networking > Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084 > MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677 > -- ============================= Donald E. Eastlake 3rd +1-508-634-2066 (home) 155 Beaver Street Milford, MA 01757 USA d3e3e3 at gmail.com From james.d.carlson at sun.com Thu Feb 19 12:01:30 2009 From: james.d.carlson at sun.com (James Carlson) Date: Thu, 19 Feb 2009 15:01:30 -0500 Subject: [rbridge] suggested minor tweak to align TRILL documents Message-ID: <18845.47770.873752.290539@gargle.gargle.HOWL> I-D protocol-11 says this on page 35: 6. If the sender is DRB, the Rbridges (including itself) that it appoints as forwarders for that link and the VLANs for which it appoints them. Note the "including itself" phrase, which indicates that the list must be explicit. I-D isis-02 says this: An RBridge's nickname may occur as appointed forwarder for multiple VLAN ranges within the same or different Port Capability TLVs within a DRB's Hello. In the absence of appointed forwarder subTLVs referring to a VLAN, the DRB acts as the appointed forwarder for that VLAN if end station service is enabled. Note the "in the absence of" clause. This latter spec would allow a sender to omit any cases where the AF is the DRB, potentially omitting the option entirely if all AFs are the DRB. This seems a lot more reasonable to me than the -11 language, which requires a "dummy" option for the most obvious case. This is especially so, as there's really nothing useful that anyone else listening to this message could garner from hearing the DRB appoint itself as AF. Those other (non-designated) RBs need to hear if they're being tapped as AF, but they don't need to do anything if the DRB is AF. Any chance we can update the protocol-11 language to conform to the isis-02 text? Suggestions would be either: 6. If the sender is DRB, the Rbridges (excluding itself) that it or: 6. 
If the sender is DRB, the Rbridges that it appoints as forwarders for that link and the VLANs for which it appoints them. Where the appointed forwarder is the DRB, the VLANs need not be listed explicitly. (I like the former better, as it makes the expected result more obvious, but the latter works, too.) -- James Carlson, Solaris Networking Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084 MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677 From james.d.carlson at Sun.COM Thu Feb 19 11:55:20 2009 From: james.d.carlson at Sun.COM (James Carlson) Date: Thu, 19 Feb 2009 14:55:20 -0500 Subject: [rbridge] potential L2 forwarding loop issue in RBridges In-Reply-To: <1028365c0902191143kac30515nb8aa362c71d2eb76@mail.gmail.com> References: <18836.30287.649263.814351@gargle.gargle.HOWL> <499A4855.2010808@sun.com> <1028365c0902191143kac30515nb8aa362c71d2eb76@mail.gmail.com> Message-ID: <18845.47400.573275.584245@gargle.gargle.HOWL> Donald Eastlake writes: > Native frame loop safety Hellos: Sent as in the current Draft on the > Designated VLAN and possibly many others. Have the same IS PDU type > and fixed header fields as adjacency Hello but only have the > additional data needed for loop safety. In particular, only items 1, > 2, and 4 from the current section 4.2.3.1.2 and do not have an IS > Neighbor TLV and do not have padding. > > Does the above seem reasonable as a starting point? You won't be able to avoid including (5) from the current 4.2.3.1.2, because it's in the same sub-TLV with the rest, but, yes, that sounds reasonable to me. -- James Carlson, Solaris Networking Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084 MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677 From ashreddy at cisco.com Thu Feb 19 12:14:11 2009 From: ashreddy at cisco.com (Ashwini Reddy (ashreddy)) Date: Thu, 19 Feb 2009 12:14:11 -0800 Subject: [rbridge] rbridge Digest, Vol 57, Issue 9 In-Reply-To: References: Message-ID: Thanks. 
-----Original Message-----
From: rbridge-bounces at postel.org [mailto:rbridge-bounces at postel.org] On Behalf Of rbridge-request at postel.org
Sent: Thursday, February 19, 2009 11:44 AM
To: rbridge at postel.org
Subject: rbridge Digest, Vol 57, Issue 9

Today's Topics:

   1. Ambiguity with IS-IS messages on PPP (Radia Perlman)
   2. Re: Ambiguity with IS-IS messages on PPP (James Carlson)
   3. Re: IS-IS for TRILL interoperability issues (Donald Eastlake)
   4. Re: IS-IS for TRILL interoperability issues (James Carlson)
   5. Re: WG last call on draft-trill-rbridge-protocol-10.txt (Donald Eastlake)
   6. Re: potential L2 forwarding loop issue in RBridges (Donald Eastlake)

----------------------------------------------------------------------

Message: 1
Date: Wed, 18 Feb 2009 16:10:43 -0800
From: Radia Perlman
Subject: [rbridge] Ambiguity with IS-IS messages on PPP
To: TRILL/RBridge Working Group
Message-ID: <499CA383.2010408 at sun.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

(Note: I like descriptive subject lines, so I'm changing the subject line for the thread to discuss James Carlson's latest noticing of a problem in the spec).

He mentioned that currently the way to tell the difference between an IS-IS message and an encapsulated TRILL message is that the outer destination MAC address is "All-IS-IS-RBridges" for IS-IS messages, and "All-RBridges" for encapsulated data, and ESADI. But on PPP, there is no "outer destination MAC address". So, how can we differentiate them?
Here are the possibilities that were floated around: a) get two protocol types for TRILL for PPP (it's frowned on to "waste" these) b) use the top bit after the PPP header, which would be the version field if there were a TRILL header there, which fortuitously today happens to be 0 for TRILL-encapsulated, and 1 for IS-IS. (kind of kludgy, and uses a bit of the already small version number field) c) insert a "sub-protocol type" field after the PPP header, either a byte, or more if people want the header byte-or word aligned. Currently it would only have two values (IS-IS, or encapsulated) d) always encapsulate with a TRILL header, and differentiate based on the "inner" MAC destination address ************ I think I prefer suggestion c). Here's James Carlson's original post: James Carlson writes: > > - Given both this problem and the subtle MTU issues, reconsider the > > "unencapsulated" decision, and put IS-IS frames after a TRILL > > header that includes a flag indicating whether the payload is > > IS-IS or data. > I know, bad form to follow up my own posting, and I should have included this in my original message, but I could live with either the original Inner.MacDA test (needed for both IS-IS and ESADI) to distinguish these frames or with a discrete flag in the header. The latter would chew up a precious bit, but would be far easier to implement in hardware and embedded systems (much easier than looking for a giant Ethernet address later in the message at a potentially *variable* offset due to options). 
That latter sort of header would look like this on Ethernet: +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | All-IS-IS-RBridges MAC Destination | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | MAC Destination (cont) | Source MAC Address | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Source MAC Address | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | TRILL Ethertype | V |R|C|M|Op-Length| Hop Count | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Egress RBridge Nickname | Ingress RBridge Nickname | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | TRILL Options ... +-+-+-+-+-+-+-+-+-... | IDRP = 83 | Length ... +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-... and this on PPP: +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ADDR = FF | CTRL = 03 | TRILL Protocol ID | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | V |R|C|M|Op-Length| Hop Count | Egress RBridge Nickname | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Ingress RBridge Nickname | TRILL Options ... +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-... | IDRP = 83 | Length ... +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-... Where the big change is the addition of the "C" (command/data) flag: If C == 0, then the header specifies TRILL data, and contains an Ethernet MAC header with VLAN tag after the TRILL Options (if any). If C == 1, then: M must be 0 Op-Length is zero because no options are yet defined Hop Count must be 0 Egress and Ingress RBridge Nicknames are 0 on transmit The IS-IS packet (starting with IDRP) begins after the (non-existent) options field. A form like this guarantees that the key information required to distinguish control from data is always at fixed offsets in the packet, which is necessary for hardware design, simple embedded systems, and some kinds of packet filtering mechanisms. 
-- James Carlson, Solaris Networking Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084 MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677 _______________________________________________ rbridge mailing list rbridge at postel.org http://mailman.postel.org/mailman/listinfo/rbridge ------------------------------ Message: 2 Date: Thu, 19 Feb 2009 08:18:38 -0500 From: James Carlson Subject: Re: [rbridge] Ambiguity with IS-IS messages on PPP To: Radia Perlman Cc: TRILL/RBridge Working Group Message-ID: <18845.23598.604855.274846 at gargle.gargle.HOWL> Content-Type: text/plain; charset=us-ascii Radia Perlman writes: > c) insert a "sub-protocol type" field after the PPP header, either a > byte, or more if > people want the header byte-or word aligned. Currently it would only > have two > values (IS-IS, or encapsulated) [...] > I think I prefer suggestion c). I dislike (c) for at least two reasons: - This means that TRILL headers have two "formats." There's one format (without this sub-protocol [kludge] field) on media that have L2 addresses that we can allocate, and another format (with the field) on media that don't. This is unlike all other L3 protocols that run on PPP, which all have well-known formats that also work on other media. - The difference in headers means that the already-subtle MTU computation (which is marred by the "unencapsulated" IS-IS running in parallel with the highly-encapsulated data) gains another wart. I'm already a bit wary that we can get interoperability at all (the spec has no real guidance on MTU issues), and this will add to the problem. - What happens if we simplify the implementation? Why not have the same TRILL header format on both Ethernet and PPP? To me, the sub-protocol idea is just a half-step into using a flag to distinguish data from control, so we might as well go all the way. OK; at least two reasons. Probably more like three. 
;-} One more possible solution (peculiar to PPP) would be: (e) use the rarely-used 4xxx range of PPP Protocol IDs to carry TRILL IS-IS and ESADI. That's still not so great, as we'd need to distinguish between TRILL IS-IS and ESADI, and it still doesn't address MTU. -- James Carlson, Solaris Networking Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084 MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677 ------------------------------ Message: 3 Date: Thu, 19 Feb 2009 09:30:08 -0500 From: Donald Eastlake Subject: Re: [rbridge] IS-IS for TRILL interoperability issues To: "TRILL/RBridge Working Group" Message-ID: <1028365c0902190630o3196bb2bk660d096ecf40bdda at mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1 Hi Jim, On Wed, Feb 18, 2009 at 10:02 AM, James Carlson wrote: > In testing our TRILL implementation, I've discovered that there's both > ambiguity in the current -11 I-D (separate designers here were able to > interpret the specification in incompatible ways), as well as serious > issues looming further down the road. > > I recommend (1) including explicit diagrams of *all* of the packet > formats (currently, only the TRILL header is diagrammed), so that > there can be no confusion about the meaning of the English text, and > (2) reconsidering the recent "IS-IS unencapsulated" decision. > > The first issue (specification ambiguity) revolves around the way > IS-IS frames are transmitted, and the interpretation of the word "and" > in this text: > > o "TRILL" frames are those (1) with a multicast destination > address allocated to the TRILL protocol (see Section 7.2) and > (2) non-control frames with the TRILL Ethertype. There are Probably should be re-worded something like: "TRILL" frames are those that either (1) have a multicast destination address allocated to the TRILL protocol (see Section 7.2) or (2) are non-control frames with the TRILL Ethertype. ... 
> Assuming Ethernet, my reading of the current specification results in > TRILL IS-IS frames that look like this: > > +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > | All-IS-IS-RBridges MAC Destination | > +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > | MAC Destination (cont) | Source MAC Address | > +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > | Source MAC Address | > +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > | Ethernet 802 Length Field | DSAP = FE | SSAP = FE | > +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > | CTRL = 03 | IDRP = 83 | Length ... > +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-... > > That is, I read the "and" in the text to mean that any packet with > that special multicast destination *OR* with the appropriate Ethertype > is to be considered a "TRILL frame," and I understood "unencapsulated" > to mean that the *ONLY* thing distinguishing this from a regular IS-IS > frame was the destination MAC address (and woe unto those > implementations that don't filter multicast perfectly). The above is indeed what was intended. > The other designer here apparently read that "and" literally, because > his frames look like this on the wire: > > +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > | All-IS-IS-RBridges MAC Destination | > +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > | MAC Destination (cont) | Source MAC Address | > +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > | Source MAC Address | > +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > | TRILL Ethertype | IDRP = 83 | Length ... > +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-... 
> > Both seem plausible, and given the lack of information, it's possible > that others may read this specification in other ways, perhaps > including the LLC header with the Ethertype, depending on what those > readers consider to be the IS-IS frame. Thus, I strongly encourage > the authors to include an appropriate diagram (such as the above) to > make the frame format unambiguous. Yes, the wording in the definition of a TRILL frame should be improved and an explicit frame diagram for a TRILL IS-IS frame should also be included. > ... [see other messages re PPP problem] Thanks, Donald -- ============================= Donald E. Eastlake 3rd +1-508-634-2066 (home) 155 Beaver Street Milford, MA 01757 USA d3e3e3 at gmail.com ------------------------------ Message: 4 Date: Thu, 19 Feb 2009 10:15:31 -0500 From: James Carlson Subject: Re: [rbridge] IS-IS for TRILL interoperability issues To: Donald Eastlake Cc: TRILL/RBridge Working Group Message-ID: <18845.30611.813491.436287 at gargle.gargle.HOWL> Content-Type: text/plain; charset=us-ascii Donald Eastlake writes: > On Wed, Feb 18, 2009 at 10:02 AM, James Carlson wrote: > > o "TRILL" frames are those (1) with a multicast destination > > address allocated to the TRILL protocol (see Section 7.2) and > > (2) non-control frames with the TRILL Ethertype. There are > > Probably should be re-worded something like: > "TRILL" frames are those that either (1) have a multicast > destination address allocated to the TRILL protocol (see Section 7.2) > or (2) are non-control frames with the TRILL Ethertype. ... Yes, that's clearer, though I still like the idea of pictures better. There's no arguing with an array of bytes. > Yes, the wording in the definition of a TRILL frame should be improved > and an explicit frame diagram for a TRILL IS-IS frame should also be > included. That'll help; thanks. > > ... [see other messages re PPP problem] For what it's worth, I had them in one message because the two issues are tangled together. 
The text is unclear in part because of this change ... and the change
itself has also made the text dependent on features of Ethernet that
aren't present on other media.

--
James Carlson, Solaris Networking
Sun Microsystems / 35 Network Drive        71.232W   Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757   42.496N   Fax +1 781 442 1677

------------------------------

Message: 5
Date: Thu, 19 Feb 2009 10:58:51 -0500
From: Donald Eastlake
Subject: Re: [rbridge] WG last call on draft-trill-rbridge-protocol-10.txt
To: Ayan Banerjee
Cc: rbridge at postel.org, Radia Perlman
Message-ID: <1028365c0902190758l2d564959lc80c60f06bf48f0b at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

Hi Ayan,

See below...

On Tue, Jan 13, 2009 at 5:56 PM, Ayan Banerjee wrote:
> Radia and Donald,
>
> ...
>
> Clarification for ESADI:
> If my understanding of the draft is accurate, there are possibly 4K
> vlan-ESADIs on each node, and we are "only" running the CSNP
> functionality (of traditional IS-IS). The "hello" functionality is not run
> on them. Consider a router that is interested in all "vlans"; such a router
> will have to support *all* VLAN-ESADIs.

I assume by router you mean RBridge. No RBridge "has to support" any
VLAN-ESADIs. It always works, in the sense that behavior is correct,
for the RBridge to just learn from the data plane. No one implementing
an RBridge is required to implement any ESADI support.

Support/use of ESADI would likely be important in cases where there is
a Layer 2 registration protocol for end stations and you want to
transmit the end station MAC address information securely, or where the
rate of mobility or arrival/departure of end stations is so high that,
in the absence of ESADI, you would have excessive black-holing due to
out-of-date address cache information that has not yet timed out, or
excessive broadcast traffic due to unknown unicast addresses.

I can think of scenarios where an RBridge is appointed forwarder for 4K VLANs.
But I have a much harder time thinking of plausible scenarios where
there are 4K VLANs all of which have the special requirements to make
ESADI very important. Can you give me an example?

> I believe that this is a significant
> load on the router. I believe that we should have an optional single-ESADI
> instance as well; this will allow for control-plane learning of unicast MAC
> addresses in a more scalable fashion. I am fine with having the optional 4K
> instances also co-exist; but my preference is to have a single instance for
> unicast MAC distribution.

As I say, I don't see why there would ever be thousands of VLANs for
which the ESADI protocol was being used. But let's assume there are.

So what exactly do you mean by "instance"? It seems to me it's mostly a
matter of implementation whether it is more "load" on an RBridge that
for some reason is running ESADI for thousands of VLANs to get separate
LSPs/CSNPs for each VLAN or some kind of merged single set of
LSPs/CSNPs that are received, processed, and possibly re-emitted
hop-by-hop throughout the entire campus.

In order for this "single-ESADI instance" you propose to work, wouldn't
all transit RBridges have to implement it? Doesn't that impose a big
burden, since no RBridge has to implement ESADI right now? Doesn't it
add a big load to almost all RBridges that are actually interested in
running ESADI for zero or maybe one or two VLANs? Wouldn't they have to
actively process all ESADI protocol-mediated updates for all VLANs?
ESADI is currently carefully designed to impose zero control plane
burden on transit RBridges that don't implement it or don't have it
enabled. Wouldn't you lose that?

Also, wouldn't the information in the "single instance" LSPs have to be
labeled as to what VLAN it applies to? Doesn't that imply the
specification of a bunch of additional TLVs or optional fields in TLVs
with VLAN fields? And doesn't that break VLAN translation within
RBridges or at least make it much more complex?
> P2P IIHs and LAN IIHs: > When TRILL-IS-IS sends out hellos it does so based on the link capability. > On P2P links (configured or real ones) it sends P2P IIHs and on multi-access > links it sends LAN IIHs. I presume that in TRILL we want to default sending > out LAN IIHs (is this accurate?). We should have a section on P2P IIHs and > just talk about if any functionality/sub-tlvs are not required for that case > (for example, do we need to find a common vlan in P2P like in a LAN - > probably not etc). Note that a P2P IIH and LAN IIH will not be bring up an > adjacency. Yes, I think the default should be LAN Hellos. I'm not sure there are many differences between P2P and LAN Hello contents. Couldn't you have a "point-to-point" link between two RBridges that was, in fact, over carrier Ethernet facilities or something that only provide connectivity on, say, VLAN 42? But if there are any differences at all, a brief section on P2P Hellos seems reasonable. > Parallel links between rbridges: > We need information in the draft that states that we break ties using (a) > extended circuit id on P2P links (makes 3-way handshake mandatory) and (b) > in a LAN, use lan circuit id. I'm confused by what you say. Assume we have RBridges 1 and 2 such that there is, say, a point to point link between port A on RB1 and port A on RB2 and also between port B on RB1 and port B on RB2. There are two points of view depending on whether you are one of these two RBridges or some other RBridge in the campus: Assume you are some other RBridge, RB3. Do you even see both the A and B adjacencies in the link state? I would think not and that this should be reported as only a single adjacency in the RB1 and RB2 LSPs. If, for some reason, you do see it as two adjacencies, why would you care? As long as you know there is connectivity between RB1 and RB2 you can use that in SPF calculations. 
I suppose you need a way of determining the cost from the two costs you would see but you could just use the minimum of the two or something. And if for some really bizarre reason, even though you are remote from RB1 and RB2 you not only see the two parallel paths in the link state but you actually care which path is taken, there is no space in the LSP TLVs to encode any tie breaking information such as you suggest. So I don't see any need for a tiebreaker here. Assume the other case, that you are either RB1 or RB2. I don't see any difficulty here either. You should accept TRILL traffic on both connections and we should say that as a clarification. (You wouldn't want the Reverse Path Forwarding Check or something causing TRILL frames on one of the parallel connections to be discarded.) And you can send traffic on either connection. Or can do Equal Cost MultiPath on both. But if you send over only one of them then, assuming they are equal cost, it seems like a purely local decision which one and I don't see why we need to specify a tie breaker. > Thanks, > Ayan > > P.S. I have not fully cross-checked with version 11 to see all that has gone > in, but I will take a look. Also, I will take a closer look on the > hello-AF/AC issue with mis-configurations and get back to you. > > ... > Thanks, Donald ============================= Donald E. Eastlake 3rd +1-508-634-2066 (home) 155 Beaver Street Milford, MA 01757 USA d3e3e3 at gmail.com ------------------------------ Message: 6 Date: Thu, 19 Feb 2009 14:43:29 -0500 From: Donald Eastlake Subject: Re: [rbridge] potential L2 forwarding loop issue in RBridges To: Radia Perlman Cc: TRILL/RBridge Working Group Message-ID: <1028365c0902191143kac30515nb8aa362c71d2eb76 at mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1 Below is a slightly more specific suggestion for "two Hellos"... On Tue, Feb 17, 2009 at 12:17 AM, Radia Perlman wrote: > Yup. Nasty problem. 
> > So...it seems to be that there are two purposes to a Hello: > a) ensuring that you can hear your neighbors > b) testing to see how big a frame you can really send. > > So these are kind of competing purposes. > > I'd suggest having RBridges send two types of Hellos: > a) a minimal-sized Hello that includes the RBridge's ID and priority, > and the designated VLAN -- perhaps > some other stuff like what it thinks the MTU size actually is > b) fully padded Hello with all the other information included. > > The second type of Hello only needs to be sent on the designated VLAN, > and perhaps could > be sent less frequently than the minimal sized Hello. More detailed proposal on two Hellos (this would be changes in Section 4.2.3.1 and possibly other sections of the protocol specifciation): The current Hello contents discussion in 4.2.3.1.2 needs to be augmented to discuss and distinguish adjacency Hellos and native frame loop safety Hellos and discuss MTU considerations. Adjacency Hellos: Only send on the Designated VLAN. Have all the additional data elements listed in the current section 4.2.3.1.2 including the IS Neighbor TLV. Have the usual IS-IS padding to the expected MTU tweaked as needed. Native frame loop safety Hellos: Sent as in the current Draft on the Designated VLAN and possibly many others. Have the same IS PDU type and fixed header fields as adjacency Hello but only have the additional data needed for loop safety. In particular, only items 1, 2, and 4 from the current section 4.2.3.1.2 and do not have an IS Neighbor TLV and do not have padding. Does the above seem reasonable as a starting point? Thanks, Donald > ... > > Radia > > > James Carlson wrote: >> ... -- ============================= Donald E. 
Eastlake 3rd +1-508-634-2066 (home) 155 Beaver Street Milford, MA 01757 USA d3e3e3 at gmail.com ------------------------------ _______________________________________________ rbridge mailing list rbridge at postel.org http://mailman.postel.org/mailman/listinfo/rbridge End of rbridge Digest, Vol 57, Issue 9 ************************************** From james.d.carlson at sun.com Thu Feb 19 12:11:11 2009 From: james.d.carlson at sun.com (James Carlson) Date: Thu, 19 Feb 2009 15:11:11 -0500 Subject: [rbridge] two minor issues with the isis-02 draft Message-ID: <18845.48351.5175.126918@gargle.gargle.HOWL> The good news is that I'm done with this pass over our source code and the relevant TRILL documents, so you likely won't hear many more issues out of me for a while. There's light at the end of the tunneling protocol. First issue: why does isis-02 specify the following? 1. Information on reachable RBridge neighbors and the cost of the hop via the Extended IS Reachability TLV (Type #22) [RFC5305] (wide metric). TRILL IS-IS does not use the IS Reachability TLV (Type #2) (narrow metric). This restriction doesn't appear to be necessary, and for bridges that don't need the extended metrics, it seems slightly wasteful. If there's a purpose behind this (I have a bug filed against our code, as we're still using #2), it'd be helpful to spell that purpose out here. (I was unable to find anything in protocol-11 that supported the need for this particular restriction, though it's possible I missed something.) Second issue: the following mechanism from protocol-11 seems to be missing: 6.6 Per VLAN appointed forwarder status lost counter (see Section 4.6.2). The protocol draft doesn't seem to indicate how large this counter needs to be (is 8 bits enough or would more be better?), but I think this should be added into the existing "2.d VLANs and Bridge Roots" sub-TLV. 
-- James Carlson, Solaris Networking Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084 MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677 From james.d.carlson at Sun.COM Thu Feb 19 12:13:06 2009 From: james.d.carlson at Sun.COM (James Carlson) Date: Thu, 19 Feb 2009 15:13:06 -0500 Subject: [rbridge] potential L2 forwarding loop issue in RBridges In-Reply-To: <1028365c0902191207o42992e26qf53e597f1c480ff4@mail.gmail.com> References: <18836.30287.649263.814351@gargle.gargle.HOWL> <499A4855.2010808@sun.com> <1028365c0902191143kac30515nb8aa362c71d2eb76@mail.gmail.com> <18845.47400.573275.584245@gargle.gargle.HOWL> <1028365c0902191207o42992e26qf53e597f1c480ff4@mail.gmail.com> Message-ID: <18845.48466.654901.878161@gargle.gargle.HOWL> Donald Eastlake writes: > OK. Item 5 is just a one bit flag so it doesn't seem like a big deal either way. Yep. Just showing off that I've actually read this stuff. ;-} -- James Carlson, Solaris Networking Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084 MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677 From ddutt at cisco.com Thu Feb 19 13:59:47 2009 From: ddutt at cisco.com (Dinesh G Dutt) Date: Thu, 19 Feb 2009 13:59:47 -0800 Subject: [rbridge] Ambiguity with IS-IS messages on PPP In-Reply-To: <18845.23598.604855.274846@gargle.gargle.HOWL> References: <18844.8981.454420.809959@gargle.gargle.HOWL> <18844.20857.278803.129874@gargle.gargle.HOWL> <499CA383.2010408@sun.com> <18845.23598.604855.274846@gargle.gargle.HOWL> Message-ID: <499DD653.7020203@cisco.com> I do not like any solution that in Ethernet doesn't distinguish based on the DA or an immediate ethertype. All control protocols so far have relied on this property (at least on Ethernet) and is easiest to implement in hardware. There is also precedence which makes hardware use existing logic. So, I dislike (d) for sure. I think (a) maybe simplest. With 64K possible protocol types, using two seems in the noise. 
IEEE has also typically easily granted two protocol types for distinguishing control from data frames. If we go with inserting a "C" bit in the TRILL header, I vote for it to be only relevant on PPP and that even if we must set the bit on Ethernet, that the DA is the only way to identify IS-IS frames. I don't want there to be the possibility of saying that we can use the IS-IS DA to carry some form of data that is now distinguished based on the presence of the "C" bit. Dinesh James Carlson wrote: > Radia Perlman writes: > >> c) insert a "sub-protocol type" field after the PPP header, either a >> byte, or more if >> people want the header byte-or word aligned. Currently it would only >> have two >> values (IS-IS, or encapsulated) >> > [...] > >> I think I prefer suggestion c). >> > > I dislike (c) for at least two reasons: > > - This means that TRILL headers have two "formats." There's one > format (without this sub-protocol [kludge] field) on media that > have L2 addresses that we can allocate, and another format (with > the field) on media that don't. This is unlike all other L3 > protocols that run on PPP, which all have well-known formats that > also work on other media. > > - The difference in headers means that the already-subtle MTU > computation (which is marred by the "unencapsulated" IS-IS running > in parallel with the highly-encapsulated data) gains another wart. > I'm already a bit wary that we can get interoperability at all > (the spec has no real guidance on MTU issues), and this will add > to the problem. > > - What happens if we simplify the implementation? Why not have the > same TRILL header format on both Ethernet and PPP? To me, the > sub-protocol idea is just a half-step into using a flag to > distinguish data from control, so we might as well go all the way. > > OK; at least two reasons. Probably more like three. 
;-} > > One more possible solution (peculiar to PPP) would be: > > (e) use the rarely-used 4xxx range of PPP Protocol IDs to carry > TRILL IS-IS and ESADI. > > That's still not so great, as we'd need to distinguish between TRILL > IS-IS and ESADI, and it still doesn't address MTU. > > -- We make our world significant by the courage of our questions and by the depth of our answers. - Carl Sagan From james.d.carlson at Sun.COM Thu Feb 19 14:26:53 2009 From: james.d.carlson at Sun.COM (James Carlson) Date: Thu, 19 Feb 2009 17:26:53 -0500 Subject: [rbridge] Ambiguity with IS-IS messages on PPP In-Reply-To: <499DD653.7020203@cisco.com> References: <18844.8981.454420.809959@gargle.gargle.HOWL> <18844.20857.278803.129874@gargle.gargle.HOWL> <499CA383.2010408@sun.com> <18845.23598.604855.274846@gargle.gargle.HOWL> <499DD653.7020203@cisco.com> Message-ID: <18845.56493.191714.793274@gargle.gargle.HOWL> Dinesh G Dutt writes: > I do not like any solution that in Ethernet doesn't distinguish based on > the DA or an immediate ethertype. All control protocols so far have > relied on this property (at least on Ethernet) and is easiest to > implement in hardware. There is also precedence which makes hardware use > existing logic. > > So, I dislike (d) for sure. We're talking about an Ethertype, which behaves like a network layer protocol -- specifically, a tunneling protocol that has a control and a data portion. As a former (reformed?) hardware implementor, it seems to me that if you are parsing the TRILL header at all (which you must if you're forwarding or doing encaps/decaps in hardware), then checking for a single bit at a fixed location to determine whether what you're looking at is control (kick to software) or data (forward normally) is quite trivial. Compared with the other issues that running unencapsulated causes -- the ambiguity with other L2 media and the MTU difficulties -- it seems hard to say that this is a burden. > I think (a) maybe simplest. 
With 64K possible protocol types, using two > seems in the noise. IEEE has also typically easily granted two protocol > types for distinguishing control from data frames. Doing (e) (where we allocate from the 4xxx range) is much more likely to fly through the working group, though it does still smell pretty funny to me, and it leaves the MTU issue unresolved. > If we go with inserting a "C" bit in the TRILL header, I vote for it to > be only relevant on PPP and that even if we must set the bit on > Ethernet, that the DA is the only way to identify IS-IS frames. I don't > want there to be the possibility of saying that we can use the IS-IS DA > to carry some form of data that is now distinguished based on the > presence of the "C" bit. In this case (which is (d) from Radia's list), the DA is no longer significant. We can allocate a single multicast MAC address on Ethernet, because the "C" bit unambiguously tells us whether we're looking at control or data for that fixed TRILL Ethertype, and we needn't bother with address checks. -- James Carlson, Solaris Networking Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084 MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677 From cait at asomi.com Thu Feb 19 16:04:49 2009 From: cait at asomi.com (Caitlin Bestler) Date: Thu, 19 Feb 2009 16:04:49 PST Subject: [rbridge] Ambiguity with IS-IS messages on PPP Message-ID: <5297.1235088289@asomi.com> >On Thu Feb 19 13:59 , Dinesh G Dutt wrote:I do not like any solution that in Ethernet doesn't distinguish based on >the DA or an immediate ethertype. All control protocols so far have >relied on this property (at least on Ethernet) and is easiest to >implement in hardware. There is also precedence which makes hardware use >existing logic. > I agree. We should not create anything in the TRILL header that could conflict with existing L2 methods for differentiation. 
We could either define specific solutions for other transports or defining something that can be ignored as redundant over Ethernet (or other L2s that already have mechanisms to avoid any ambiguity). Ethernet implementations should not be cluttered with solutions only relevant to PPP. From james.d.carlson at Sun.COM Fri Feb 20 05:42:38 2009 From: james.d.carlson at Sun.COM (James Carlson) Date: Fri, 20 Feb 2009 08:42:38 -0500 Subject: [rbridge] Ambiguity with IS-IS messages on PPP In-Reply-To: <5297.1235088289@asomi.com> References: <5297.1235088289@asomi.com> Message-ID: <18846.45902.247704.442601@gargle.gargle.HOWL> Caitlin Bestler writes: > I agree. We should not create anything in the TRILL header that could conflict with existing L2 methods for differentiation. There's no "conflict" here other than a self-imposed one. We can define both control and data frames to run over TRILL (with the TRILL Ethertype in the Ethernet L2 header), or we can define them to run apart with the provision that we then need special L2 mechanisms (separately defined for each medium over which TRILL runs) to distinguish the two types of frames. If you do the former, then the special L2 methods for differentiation based on Outer.MacDA aren't needed. They're not in "conflict;" they just are not required to distinguish frames. I suggest the former because it's more conservative: the fewer hard dependencies we have on L2 features, the easier it will be to get TRILL running on other media. The more we have, the worse it is. Consider, for example, the use of TRILL over other transports, such as FR, ATM, and MPLS. In all of those cases, we lack Outer.MacDA to do this special differentiation, and we'll need further hacks there. And, unfortunately, the 4xxx hack that's available for PPP doesn't exist with those other media. Moreover, I believe that baking a dependency on Outer.* into the base specification is itself a mistake. 
This forces the future "TRILL Over Foo" documents to rewrite parts of
the base specification: the PPP document will have to say something
like, "ignore all parts of the TRILL base specification that mention
Outer.MacDA; they're wrong, and this document is right."

I think that telling readers that the base needs to be modified to
accommodate new transports means that it wasn't factored correctly.
Those future documents shouldn't come with patches for the base
document.

And then there's the MTU issue. Running IS-IS unencapsulated means that
the IS-IS implementation needs to somehow "know" about the TRILL header
overhead in order to do the padding correctly, even though it's not
using the TRILL header at all.

> We could either define specific solutions for other transports or define something that can be ignored as redundant
> over Ethernet (or other L2s that already have mechanisms to avoid any ambiguity). Ethernet implementations should
> not be cluttered with solutions only relevant to PPP.

I think "only relevant to PPP" may be a short-sighted way to look at
this. I don't want the base document cluttered with things that can
only work if the underlying medium is exactly Ethernet, and are broken
otherwise.

We might have a path towards agreement if we talk about "ignored as
redundant" here. I'd be fine with requiring the use of a special
Outer.MacDA to distinguish IS-IS (control) frames when TRILL is run
over Ethernet, _provided that_ the Ethertype is always TRILL for those
frames, the TRILL header is always present, and the TRILL header
distinguishes the control frames.

sgai writes:
> I completely agree with Caitlin and I prefer to define "specific solutions
> for other transports", if needed. For the moment I don't see the need.

I don't believe that we can or should write a base document that makes
unwarranted assumptions about the underlying medium. Protocols have a
long history ahead of them.
I don't think we can predict that they'll run over Ethernet alone, in all cases, and forever. -- James Carlson, Solaris Networking Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084 MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677 From Radia.Perlman at sun.com Sat Feb 21 12:36:11 2009 From: Radia.Perlman at sun.com (Radia Perlman) Date: Sat, 21 Feb 2009 12:36:11 -0800 Subject: [rbridge] Ambiguity with IS-IS messages on PPP In-Reply-To: <18846.45902.247704.442601@gargle.gargle.HOWL> References: <5297.1235088289@asomi.com> <18846.45902.247704.442601@gargle.gargle.HOWL> Message-ID: <49A065BB.7090102@sun.com> I think this would reach consensus more easily if people would list all the proposals, rank them according to their own preference, and perhaps editorialize about each one as well. But mostly, make it clear which proposals you like, which proposals you can live with, which proposals you think won't work at all...all in one email. Thanks, Radia From james.d.carlson at Sun.COM Mon Feb 23 07:45:31 2009 From: james.d.carlson at Sun.COM (James Carlson) Date: Mon, 23 Feb 2009 10:45:31 -0500 Subject: [rbridge] Ambiguity with IS-IS messages on PPP In-Reply-To: <49A065BB.7090102@sun.com> References: <5297.1235088289@asomi.com> <18846.45902.247704.442601@gargle.gargle.HOWL> <49A065BB.7090102@sun.com> Message-ID: <18850.50331.302405.436636@gargle.gargle.HOWL> Radia Perlman writes: > I think this would reach consensus more easily if people would list all > the proposals, rank them according to their > own preference, and perhaps editorialize about each one as well. But > mostly, make it clear which proposals you like, > which proposals you can live with, which proposals you think won't work > at all...all in one email. I'll try to summarize the options discussed so far, based on the list that you gave to start this thread, and then follow up with my own ranking. 
  a) "Per-L2 Mechanism"

     Use Outer.MacDA on Ethernet to distinguish TRILL IS-IS messages,
     with no other encapsulation (use Outer.Length instead of
     Outer.Ethertype, and use LLC). On PPP, use a separate protocol
     number (likely 4xxx) to distinguish the IS-IS messages from data
     messages (0xxx). On other media, find some local mechanism to
     distinguish these messages.

  b) "The Version Field Hack"

     Set Outer.Ethertype to be TRILL for TRILL IS-IS messages, and use
     the fact that the "V" field in the TRILL header is currently
     binary '00', while the first octet in a valid IS-IS message is hex
     83 and thus would be binary '10', to distinguish the two messages.

  c) "Per-L2 Mechanism, Part Deux"

     Use Outer.MacDA on Ethernet to distinguish TRILL IS-IS messages
     (as in [a] above), and use a special "sub-protocol" field
     everywhere else.

  d) "The Single Protocol"

     Use Outer.Ethertype to distinguish all TRILL messages. Change one
     of the "R" bits in the TRILL header to be a "command/data" bit: if
     set, then the message is IS-IS, and the normal IS-IS message
     follows the TRILL header directly. If clear, then the message is
     data, and can be forwarded.

For options (b) and (d), on Ethernet (and other media where it
matters), reserve a multicast address for TRILL IS-IS to use in
Outer.MacDA, so that monitoring applications can listen selectively for
these messages. (The Outer.MacDA does *not* indicate message type for
either of these options. Outer.Ethertype indicates protocol, and some
fixed offset in the TRILL header indicates type of message.)
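
The receive-side check implied by options (b) and (d) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes the two-bit "V" field sits in the top two bits of the first TRILL header octet (which is what the binary '00' versus hex 83 comparison above implies), and the mask used for the option (d) control bit is hypothetical, since nothing in this thread assigns a particular "R" bit. The function and constant names are made up for the example.

```python
ISIS_FIRST_OCTET = 0x83  # first octet of a valid IS-IS message, per option (b)

# Hypothetical: option (d) only says "one of the R bits"; 0x20 is chosen
# here purely for illustration and is not taken from any draft.
HYPOTHETICAL_CONTROL_BIT = 0x20


def classify_option_b(after_ethertype: bytes) -> str:
    """Option (b): inspect the top two bits of the octet after the TRILL Ethertype."""
    if not after_ethertype:
        raise ValueError("nothing follows the Ethertype")
    first = after_ethertype[0]
    if first >> 6 == 0b00:
        return "data"        # TRILL header with V == binary '00'
    if first == ISIS_FIRST_OCTET:
        return "control"     # bare IS-IS message, top bits are '10'
    return "unknown"


def classify_option_d(trill_header: bytes) -> str:
    """Option (d): the TRILL header is always present; one flag bit decides."""
    if not trill_header:
        raise ValueError("missing TRILL header")
    if trill_header[0] & HYPOTHETICAL_CONTROL_BIT:
        return "control"     # IS-IS message follows the TRILL header
    return "data"            # ordinary forwardable frame
```

Note that the option (b) test leans on the version field staying at '00'; the option (d) test does not depend on the version value at all, which is one way to read the "brutal hack" versus "single protocol" distinction.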
-- James Carlson, Solaris Networking Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084 MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677 From d3e3e3 at gmail.com Mon Feb 23 18:22:36 2009 From: d3e3e3 at gmail.com (Donald Eastlake) Date: Mon, 23 Feb 2009 21:22:36 -0500 Subject: [rbridge] suggested minor tweak to align TRILL documents In-Reply-To: <18845.47770.873752.290539@gargle.gargle.HOWL> References: <18845.47770.873752.290539@gargle.gargle.HOWL> Message-ID: <1028365c0902231822x2085872ewd1b31589f7c9731f@mail.gmail.com> The suggested change in protocol specification text seems reasonable. I also prefer the first of the two options given for changed text. Does anyone object? Thanks, Donald On Thu, Feb 19, 2009 at 3:01 PM, James Carlson wrote: > I-D protocol-11 says this on page 35: > > 6. If the sender is DRB, the Rbridges (including itself) that it > appoints as forwarders for that link and the VLANs for which it > appoints them. > > Note the "including itself" phrase, which indicates that the list must > be explicit. I-D isis-02 says this: > > An RBridge's nickname may occur as appointed forwarder for > multiple VLAN ranges within the same or different Port Capability > TLVs within a DRB's Hello. In the absence of appointed forwarder > subTLVs referring to a VLAN, the DRB acts as the appointed > forwarder for that VLAN if end station service is enabled. > > Note the "in the absence of" clause. This latter spec would allow a > sender to omit any cases where the AF is the DRB, potentially omitting > the option entirely if all AFs are the DRB. This seems a lot more > reasonable to me than the -11 language, which requires a "dummy" > option for the most obvious case. > > This is especially so, as there's really nothing useful that anyone > else listening to this message could garner from hearing the DRB > appoint itself as AF. 
> Those other (non-designated) RBs need to hear
> if they're being tapped as AF, but they don't need to do anything if
> the DRB is AF.
>
> Any chance we can update the protocol-11 language to conform to the
> isis-02 text? Suggestions would be either:
>
>    6. If the sender is DRB, the Rbridges (excluding itself) that it
>
> or:
>
>    6. If the sender is DRB, the Rbridges that it appoints as
>       forwarders for that link and the VLANs for which it appoints
>       them. Where the appointed forwarder is the DRB, the VLANs need
>       not be listed explicitly.
>
> (I like the former better, as it makes the expected result more
> obvious, but the latter works, too.)
>
> --
> James Carlson, Solaris Networking
> Sun Microsystems / 35 Network Drive        71.232W   Vox +1 781 442 2084
> MS UBUR02-212 / Burlington MA 01803-2757   42.496N   Fax +1 781 442 1677

--
=============================
 Donald E. Eastlake 3rd   +1-508-634-2066 (home)
 155 Beaver Street
 Milford, MA 01757 USA
 d3e3e3 at gmail.com

From james.d.carlson at Sun.COM Tue Feb 24 06:07:56 2009
From: james.d.carlson at Sun.COM (James Carlson)
Date: Tue, 24 Feb 2009 09:07:56 -0500
Subject: [rbridge] Jim's ranking [was Re: Ambiguity with IS-IS messages on PPP]
In-Reply-To: <18850.50331.302405.436636@gargle.gargle.HOWL>
References: <5297.1235088289@asomi.com> <18846.45902.247704.442601@gargle.gargle.HOWL> <49A065BB.7090102@sun.com> <18850.50331.302405.436636@gargle.gargle.HOWL>
Message-ID: <18851.65340.114248.413247@gargle.gargle.HOWL>

With (apparently?) little dissent on the option summaries, my ranking
and commentary would be:

  d) This is by far my most preferred solution.
It matches what we do in other protocols, and even has non-IETF precedent with hardware-friendly protocols like HDLC. By far the most important benefit of this solution is that it makes the base protocol specification (as well as the base protocol itself) lower-layer agnostic by removing the need for explicit dependencies on Ethernet-specific data. It becomes trivial to map TRILL onto other transports, including ones that we haven't thought about yet. (Tunnel through L2TP, anyone?) b) It's a brutal hack, but I could live with it. It at least preserves the (important to me) L2-agnostic mapping. a) This isn't very good. It means that the base document needs to be modified by (or will be in conflict with) future documents, so it's obviously not very general (or "base") at all, and it's substantially unclear whether future mappings over other media will in fact work without more serious kludges. Each one will at least need to be different, which is a source of needless complexity. I suppose I could barely live with it if the alternative is that we simply cannot reach any rough consensus, and thus the protocol itself is doomed. But I'd want to try many other things first. c) I think this one is just pointless, and I'd have a hard time living with it. The new sub-protocol field is tantamount to using a flag in the header (as in option [d]), so it really has no obvious advantages over that solution. It has the serious detraction that "the" TRILL header becomes effectively different on different underlying media -- complicating packet decode needlessly. And it doesn't solve the problem that the base specification is still dependent on Ethernet for the outer header, and thus the document will still be in conflict with future mappings over other media. It's the worst of all worlds. 
-- James Carlson, Solaris Networking Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084 MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677 From d3e3e3 at gmail.com Tue Feb 24 21:09:45 2009 From: d3e3e3 at gmail.com (Donald Eastlake) Date: Wed, 25 Feb 2009 00:09:45 -0500 Subject: [rbridge] two minor issues with the isis-02 draft In-Reply-To: <18845.48351.5175.126918@gargle.gargle.HOWL> References: <18845.48351.5175.126918@gargle.gargle.HOWL> Message-ID: <1028365c0902242109s1174d34bpc133b39fcad47ca4@mail.gmail.com> Jim, Thanks for reviewing this. See below... On Thu, Feb 19, 2009 at 3:11 PM, James Carlson wrote: > > The good news is that I'm done with this pass over our source code and > the relevant TRILL documents, so you likely won't hear many more > issues out of me for a while. There's light at the end of the > tunneling protocol. > > > First issue: why does isis-02 specify the following? > > 1. Information on reachable RBridge neighbors and the cost of the hop > via the Extended IS Reachability TLV (Type #22) [RFC5305] (wide > metric). TRILL IS-IS does not use the IS Reachability TLV (Type > #2) (narrow metric). > > This restriction doesn't appear to be necessary, and for bridges that > don't need the extended metrics, it seems slightly wasteful. If > there's a purpose behind this (I have a bug filed against our code, as > we're still using #2), it'd be helpful to spell that purpose out > here. > > (I was unable to find anything in protocol-11 that supported the need > for this particular restriction, though it's possible I missed > something.) I don't think the Extended IS Reachability TLV is particularly wasteful. In the common case, the data for an adjacency is 11 bytes long in both TLV #2 and TLV #22, so 23 adjacencies will fit into a TLV of either type.
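The "23 per TLV" figure can be checked with a few lines of arithmetic. This is only a sketch under the usual IS-IS encoding assumptions: a TLV value is at most 255 octets; TLV #2 spends one octet on a virtual flag and then 11 octets per neighbor (four metric octets plus a 7-octet neighbor ID); TLV #22 spends 11 octets per neighbor when no sub-TLVs are attached (7-octet neighbor ID, 3-octet metric, 1-octet sub-TLV length). The constant and function names are made up for illustration.

```python
# Sketch: how many adjacencies fit in one IS-IS TLV of each type.
# All names here are illustrative, not taken from any draft.

MAX_TLV_VALUE_LEN = 255  # the TLV length field is a single octet


def neighbors_per_tlv(fixed_overhead, per_neighbor):
    """Whole neighbor entries that fit after any fixed per-TLV overhead."""
    return (MAX_TLV_VALUE_LEN - fixed_overhead) // per_neighbor


# TLV #2 (narrow): 4 metric octets + 7-octet neighbor ID per entry,
# plus one virtual-flag octet per TLV.
NARROW_PER_NEIGHBOR = 4 + 7
narrow_capacity = neighbors_per_tlv(1, NARROW_PER_NEIGHBOR)

# TLV #22 (wide), assuming no sub-TLVs: 7-octet neighbor ID,
# 3-octet metric, 1-octet sub-TLV length per entry.
WIDE_PER_NEIGHBOR = 7 + 3 + 1
wide_capacity = neighbors_per_tlv(0, WIDE_PER_NEIGHBOR)

print(narrow_capacity, wide_capacity)
```

Any sub-TLVs attached to a TLV #22 entry would reduce its count, so the parity only holds in the common no-sub-TLV case mentioned here.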
I think the primary motivation for the 24 bit metric per adjacency in the Extended IS Reachability TLV is that you have only 6 bits of default metric per adjacency in the original IS Reachability TLV (it's an octet but one bit is used up to indicate internal/external and one bit is now the up/down bit [RFC 2966]). A 6 bit integer metric (1 to 63) is very coarse and is insufficient to express the relative costs of a 10 megabit versus a 1 gigabit link, let alone the existing 10 gigabit links and the 40 and 100 gigabit links under development. [RFC 3784] Given this problem, mandating the Extended IS Reachability TLV seems reasonable to me so you can have a globally consistent link speed derived metric. > Second issue: the following mechanism from protocol-11 seems to be > missing: > > 6.6 Per VLAN appointed forwarder status lost counter (see Section > 4.6.2). > > The protocol draft doesn't seem to indicate how large this counter > needs to be (is 8 bits enough or would more be better?), but I think > this should be added into the existing "2.d VLANs and Bridge Roots" > sub-TLV. Yes, draft-eastlake-trill-rbridge-isis is no longer being updated. The plan is for all the TRILL IS-IS TLVs and sub-TLVs to be added to the draft-ward-l2isis draft. I'll check that this is being included in that transfer. > -- > James Carlson, Solaris Networking > Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084 > MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677 Thanks, Donald -- ============================= Donald E. Eastlake 3rd
+1-508-634-2066 (home) 155 Beaver Street Milford, MA 01757 USA d3e3e3 at gmail.com From d3e3e3 at gmail.com Wed Feb 25 08:33:22 2009 From: d3e3e3 at gmail.com (Donald Eastlake) Date: Wed, 25 Feb 2009 11:33:22 -0500 Subject: [rbridge] two minor issues with the isis-02 draft In-Reply-To: <18853.28352.318661.573947@gargle.gargle.HOWL> References: <18845.48351.5175.126918@gargle.gargle.HOWL> <1028365c0902242109s1174d34bpc133b39fcad47ca4@mail.gmail.com> <18853.28352.318661.573947@gargle.gargle.HOWL> Message-ID: <1028365c0902250833j43b3c357necdec0e1fc0910f9@mail.gmail.com> Hi Jim, On Wed, Feb 25, 2009 at 11:16 AM, James Carlson wrote: > Donald Eastlake writes: >>... > >> I think the primary motivation for the 24 bit metric per adjacency in >> the Extended IS Reachability TLV is that you have only 6 bits of >> default metric per adjacency in the original IS Reachability TLV (it's >> an octet but one bit is used up to indicate internal/external and one >> bit is now the up/down bit [RFC 2966]). A 6 bit integer metric (1 to >> 63) is very coarse and is insufficient to express the relative costs >> of a 10 megabit versus a 1 gigabit link, let alone the existing 10 >> gigabit links and the 40 and 100 gigabit links under development. [RFC >> 3784] Given this problem, mandating the Extended IS Reachability TLV >> seems reasonable to me do you can have a globally consistent link >> speed derived metric. > > That's a great argument for why one *should* use the newer metric > scheme. What's the argument for an effective "must"? OK, unless other people speak up, it should be changed to SHOULD. >... > > -- > James Carlson, Solaris Networking > Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084 > MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677 Thanks, Donald -- ============================= Donald E.
Eastlake 3rd +1-508-634-2066 (home) 155 Beaver Street Milford, MA 01757 USA d3e3e3 at gmail.com From james.d.carlson at sun.com Wed Feb 25 08:16:00 2009 From: james.d.carlson at sun.com (James Carlson) Date: Wed, 25 Feb 2009 11:16:00 -0500 Subject: [rbridge] two minor issues with the isis-02 draft In-Reply-To: <1028365c0902242109s1174d34bpc133b39fcad47ca4@mail.gmail.com> References: <18845.48351.5175.126918@gargle.gargle.HOWL> <1028365c0902242109s1174d34bpc133b39fcad47ca4@mail.gmail.com> Message-ID: <18853.28352.318661.573947@gargle.gargle.HOWL> Donald Eastlake writes: > > (I was unable to find anything in protocol-11 that supported the need > > for this particular restriction, though it's possible I missed > > something.) > > I don't think the Extended IS Reachability TLV is particularly > wasteful. In the common case, both the TLV #2 and #22 data for an > adjacency is 11 bytes long so 23 will fit into a TLV for either. OK; maybe "wasteful" is too strong. "Needless" is better. > I think the primary motivation for the 24 bit metric per adjacency in > the Extended IS Reachability TLV is that you have only 6 bits of > default metric per adjacency in the original IS Reachability TLV (it's > an octet but one bit is used up to indicate internal/external and one > bit is now the up/down bit [RFC 2966]). A 6 bit integer metric (1 to > 63) is very coarse and is insufficient to express the relative costs > of a 10 megabit versus a 1 gigabit link, let alone the existing 10 > gigabit links and the 40 and 100 gigabit links under development. [RFC > 3784] Given this problem, mandating the Extended IS Reachability TLV > seems reasonable to me do you can have a globally consistent link > speed derived metric. That's a great argument for why one *should* use the newer metric scheme. What's the argument for an effective "must"? 
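For reference, the coarseness point in the quoted paragraph can be made concrete with a link-speed-derived metric. The sketch below assumes the common "reference bandwidth divided by link speed" convention, which is an assumption here and not something either draft specifies; the reference values, link names, and function name are likewise illustrative.

```python
# Sketch: a link-speed-derived cost, clamped to the narrow (6-bit) and
# wide (24-bit) metric ranges. The cost formula is an assumed convention.

NARROW_MAX = 2**6 - 1   # 63, largest narrow default metric
WIDE_MAX = 2**24 - 1    # largest value the 24-bit wide metric can hold


def speed_metric(link_bps, reference_bps, max_metric):
    """Cost inversely proportional to link speed, clamped to [1, max_metric]."""
    raw = reference_bps // link_bps
    return max(1, min(raw, max_metric))


LINKS = {"10M": 10**7, "1G": 10**9, "10G": 10**10, "100G": 10**11}

# Narrow: even with the reference chosen so the fastest link costs 1,
# everything slower than about 1/63 of it collapses onto 63.
narrow = {name: speed_metric(bps, 10**11, NARROW_MAX) for name, bps in LINKS.items()}

# Wide: a larger reference keeps all four speeds distinct with room to spare.
wide = {name: speed_metric(bps, 10**13, WIDE_MAX) for name, bps in LINKS.items()}

print(narrow)
print(wide)
```

With these inputs the 10 megabit and 1 gigabit links end up with the same narrow cost, while the wide metric still separates all four speeds.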
> > The protocol draft doesn't seem to indicate how large this counter > > needs to be (is 8 bits enough or would more be better?), but I think > > this should be added into the existing "2.d VLANs and Bridge Roots" > > sub-TLV. > > Yes, draft-eastlake-trill-rbridge-isis is no longer being updated. The > plan is for all the TRILL IS-IS TLV and sub-TLVs to be added to the > draft-ward-l2isis draft. I'll check that this is being included in > that transfer. OK; thanks. That sounds good. -- James Carlson, Solaris Networking Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084 MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677 From mshand at cisco.com Wed Feb 25 09:09:07 2009 From: mshand at cisco.com (mike shand) Date: Wed, 25 Feb 2009 17:09:07 +0000 Subject: [rbridge] two minor issues with the isis-02 draft In-Reply-To: <1028365c0902250833j43b3c357necdec0e1fc0910f9@mail.gmail.com> References: <18845.48351.5175.126918@gargle.gargle.HOWL> <1028365c0902242109s1174d34bpc133b39fcad47ca4@mail.gmail.com> <18853.28352.318661.573947@gargle.gargle.HOWL> <1028365c0902250833j43b3c357necdec0e1fc0910f9@mail.gmail.com> Message-ID: <49A57B33.9030503@cisco.com> An HTML attachment was scrubbed... URL: http://mailman.postel.org/pipermail/rbridge/attachments/20090225/1fed8bfa/attachment.html From james.d.carlson at sun.com Wed Feb 25 09:25:40 2009 From: james.d.carlson at sun.com (James Carlson) Date: Wed, 25 Feb 2009 12:25:40 -0500 Subject: [rbridge] two minor issues with the isis-02 draft In-Reply-To: <49A57B33.9030503@cisco.com> References: <18845.48351.5175.126918@gargle.gargle.HOWL> <1028365c0902242109s1174d34bpc133b39fcad47ca4@mail.gmail.com> <18853.28352.318661.573947@gargle.gargle.HOWL> <1028365c0902250833j43b3c357necdec0e1fc0910f9@mail.gmail.com> <49A57B33.9030503@cisco.com> Message-ID: <18853.32532.724145.816601@gargle.gargle.HOWL> mike shand writes: >
> I don't see the benefit of using the narrow metrics, especially when > you take into account the not inconsiderable pain of moving to wide > metrics as and when it becomes necessary. Since this is effectively > starting with a clean sheet, I would have thought it much better to > mandate wide metrics and stick to them. I don't know of anybody who has > wanted to migrate back from wide metrics to narrow metrics.
>
>     Mike
I don't see it as an issue of migrating -- that's something that deployers need to determine, and is described in some detail in RFC 3787 -- but rather whether it's _necessary_ for TRILL interoperability. If there were some reason that TRILL would fail to function correctly with all nodes using the old 6-bit default metric, then that'd be a good reason to outlaw its use. However, merely observing that there's a new metric that everyone "should" be using doesn't actually mean that it _must_ be used in order to achieve interoperability. -- James Carlson, Solaris Networking Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084 MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677 From ayabaner at cisco.com Wed Feb 25 10:31:03 2009 From: ayabaner at cisco.com (Ayan Banerjee) Date: Wed, 25 Feb 2009 10:31:03 -0800 Subject: [rbridge] WG last call on draft-trill-rbridge-protocol-10.txt In-Reply-To: <499A4990.40004@sun.com> Message-ID: Radia, Some comments inline. Thanks, Ayan On 2/16/09 9:22 PM, "Radia Perlman" wrote: > Ayan -- I"ll answer two of the things you mentioned. > > Ayan Banerjee wrote: >> >> Clarification for EASDI: >> If my understanding of the draft is accurate, there are possibly 4K >> vlan-ESADIs and on each node and we are "only" running the CSNP >> functionality (of traditional IS-IS). The "hello" functionality is not run >> on them. Consider a router that is interested in all "vlans"; such a router >> will have to support *all* VLAN-ESADIs. I believe that this is a significant >> load on the router. I believe that we should have an optional single-ESADI >> instance as well; this will allow for control-plane learning of unicast MAC >> addresses in a more scalable fashion. I am fine with having the optional 4K >> instances also co-exist; but my preference is to have a single instance for >> unicast MAC distribution. 
>> > I think that combining the ESADIs into one would be a complication, and > it involves more overhead > since RBridges would have to keep information for VLANs they are not > interested in. So I think > that in most cases, having the separate ESADIs would work best. An > RBridge is allowed to ignore > all ESADIs and only learn from data traffic, and it could choose to > listen to ESADIs on some > subset of VLANs, and learn from data traffic on the others. I do not disagree that a single ESADI instance would mean that RBridges not interested in some VLANs hold additional information. However, RBridges that have multiple VLANs configured on them and need to run multiple instances will find this onerous. I believe you are suggesting that in that case data-path learning is the way to go (since control-plane learning is optional)? I was proposing that, in addition, we also allow a single-instance ESADI (since it is optional anyway) to account for that scenario. >> >> >> Parallel links between rbridges: >> We need information in the draft that states that we break ties using (a) >> extended circuit id on P2P links (makes 3-way handshake mandatory) and (b) >> in a LAN, use lan circuit id. >> >> > I think that if R1 and R2 have two pt-to-pt links between them, they > should not be reporting both of > them in their LSP. There's already a tie-breaker for pseudonodes, > right? Maybe I don't understand this issue. The issue is that when there are parallel links between a pair of nodes, you want the tree to be bidirectional on the same link. As a result, we would need the extended circuit ID to break the tie. This mandates the 3-way handshake on P2P links.
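One way to picture the kind of tie-break being asked for: both ends have to arrive at the same link using only information each of them holds. The sketch below assumes the 3-way handshake has given each end the neighbor's extended circuit ID for every parallel link. The rule itself (use the circuit IDs assigned by the end with the higher system ID and take the smallest) is purely hypothetical and not taken from any draft, as are all names.

```python
# Sketch: a deterministic tie-break between parallel P2P links that both
# ends can compute independently. The rule is hypothetical.


def pick_tree_link(local_sys_id, remote_sys_id, links):
    """Pick one of several parallel links.

    links: list of (local_circuit_id, remote_circuit_id) pairs, one per
    parallel link, as seen from the local end.
    """
    if local_sys_id > remote_sys_id:
        # We are the higher system ID: our own circuit IDs decide.
        return min(links, key=lambda link: link[0])
    # Otherwise the neighbor's circuit IDs decide.
    return min(links, key=lambda link: link[1])


RB1 = bytes.fromhex("000000000001")
RB2 = bytes.fromhex("000000000002")

# Link A: circuit 7 on RB1, circuit 3 on RB2.
# Link B: circuit 5 on RB1, circuit 9 on RB2.
rb1_choice = pick_tree_link(RB1, RB2, [(7, 3), (5, 9)])
rb2_choice = pick_tree_link(RB2, RB1, [(3, 7), (9, 5)])

print(rb1_choice, rb2_choice)
```

Without the neighbor's circuit IDs, neither end could evaluate the same key, which is the dependency on the 3-way handshake noted here.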
> > Radia > _______________________________________________ > rbridge mailing list > rbridge at postel.org > http://mailman.postel.org/mailman/listinfo/rbridge From Radia.Perlman at sun.com Wed Feb 25 10:45:28 2009 From: Radia.Perlman at sun.com (Radia Perlman) Date: Wed, 25 Feb 2009 10:45:28 -0800 Subject: [rbridge] two minor issues with the isis-02 draft In-Reply-To: <18853.32532.724145.816601@gargle.gargle.HOWL> References: <18845.48351.5175.126918@gargle.gargle.HOWL> <1028365c0902242109s1174d34bpc133b39fcad47ca4@mail.gmail.com> <18853.28352.318661.573947@gargle.gargle.HOWL> <1028365c0902250833j43b3c357necdec0e1fc0910f9@mail.gmail.com> <49A57B33.9030503@cisco.com> <18853.32532.724145.816601@gargle.gargle.HOWL> Message-ID: <49A591C8.2010000@sun.com> Actually, I agree with Mike. Just seems complicated and an opportunity for noninteroperability to have two formats for metrics. I have long regretted doing the narrow metrics in the first place. TRILL for IS-IS is an opportunity to get rid of things that, in the wisdom of 25 years, we would have decided differently. Radia James Carlson wrote: > mike shand writes: > >>
>> I don't see the benefit of using the narrow metrics, especially when >> you take into account the not inconsiderable pain of moving to wide >> metrics as and when it becomes necessary. Since this is effectively >> starting with a clean sheet, I would have thought it much better to >> mandate wide metrics and stick to them. I don't know of anybody who has >> wanted to migrate back from wide metrics to narrow metrics.
>>
>>     Mike
>> > > I don't see it as an issue of migrating -- that's something that > deployers need to determine, and is described in some detail in RFC > 3787 -- but rather whether it's _necessary_ for TRILL > interoperability. > > If there were some reason that TRILL would fail to function correctly > with all nodes using the old 6-bit default metric, then that'd be a > good reason to outlaw its use. However, merely observing that there's > a new metric that everyone "should" be using doesn't actually mean > that it _must_ be used in order to achieve interoperability. > > From ayabaner at cisco.com Wed Feb 25 10:46:56 2009 From: ayabaner at cisco.com (Ayan Banerjee) Date: Wed, 25 Feb 2009 10:46:56 -0800 Subject: [rbridge] WG last call on draft-trill-rbridge-protocol-10.txt In-Reply-To: <1028365c0902190758l2d564959lc80c60f06bf48f0b@mail.gmail.com> Message-ID: Donald, Please see below. Thanks, Ayan On 2/19/09 7:58 AM, "Donald Eastlake" wrote: > Hi Ayan, > > See below... > > On Tue, Jan 13, 2009 at 5:56 PM, Ayan Banerjee wrote: >> Radia and Donald, >> >> ... >> >> Clarification for EASDI: >> If my understanding of the draft is accurate, there are possibly 4K >> vlan-ESADIs and on each node and we are "only" running the CSNP >> functionality (of traditional IS-IS). The "hello" functionality is not run >> on them. Consider a router that is interested in all "vlans"; such a router >> will have to support *all* VLAN-ESADIs. > > I assume by router you mean RBridge. No RBridge "has to support" any > VLAN-ESADIs. It always works, in the sense that behavior is correct, > for the RBridge to just learn from the data plane. No one implementing > an RBridge is required to implement any ESADI support. 
> > Support/use of ESADI would likely be important in cases where their is > a Layer 2 registration protocol for end stations and you want to > transmit the end station MAC address information securely or the rate > of mobility or arrival/departure of end stations is so high that, in > the absence of ESADI, you would have excessive black-holing due to > out-of-date address cache information that has not yet timed out or > excessive broadcast traffic due to unknown unicast addresses. > > I can think of scenarios where an RBridge is appointed forwarder for > 4K VLANs. But I have a much harder time thinking of plausible > scenarios where there are 4K VLANs all of which have the special > requirements to make ESADI very important. Can you give me an example? > I was envisioning the case when there is only "protocol" based learning of MAC addresses (and no data path learning). However, if this is not the intention of ESADI then this is fine. >> I believe that >> this is a significant >> load on the router. I believe that we should have an optional single-ESADI >> instance as well; this will allow for control-plane learning of unicast MAC >> addresses in a more scalable fashion. I am fine with having the optional 4K >> instances also co-exist; but my preference is to have a single instance for >> unicast MAC distribution. > > As I say, I don't see why there would ever be thousands of VLANs for > which the ESADI protocol was being used. But let's assume there are. > So what exactly do you mean by "instance"? It seems to me its mostly a > matter of implementation whether it is more "load" on an RBridge that > for some reason is running ESADI for thousands of VLANs to get > separate LSPs/CSNPs for each VLAN or some kind of merged single set of > LSPs/CSNPs that are received, processed, and possibly re-emitted > hop-by-hop throughout the entire campus. > > In order for this "single-ESADI instance" you propose to work, > wouldn't all transit RBridges have to implement it? 
Doesn't that > impose a big burden, since no RBridge has to implement ESADI right > now. Doesn't it add a big load to almost all RBridges that are > actually interested in running ESADI for zero or maybe one or two > VLANs? Wouldn't they have to actively process all ESADI protocol > mediated updates for all VLANs? ESADI is carefully designed current to > impose zero control plane burden on transit RBridges that don't > implement it and have it enabled. Wouldn't you lose that? Also, > wouldn't the information in the "single instance" LSPs have to be > labeled as to what VLAN it applies to? Doesn't that imply the > specification of a bunch of additional TLVs or optional fields in TLVs > with VLAN fields? And doesn't that break VLAN translation within > RBridges or at least make it much more complex? > >> P2P IIHs and LAN IIHs: >> When TRILL-IS-IS sends out hellos it does so based on the link capability. >> On P2P links (configured or real ones) it sends P2P IIHs and on multi-access >> links it sends LAN IIHs. I presume that in TRILL we want to default sending >> out LAN IIHs (is this accurate?). We should have a section on P2P IIHs and >> just talk about if any functionality/sub-tlvs are not required for that case >> (for example, do we need to find a common vlan in P2P like in a LAN - >> probably not etc). Note that a P2P IIH and LAN IIH will not be bring up an >> adjacency. > > Yes, I think the default should be LAN Hellos. I'm not sure there are > many differences between P2P and LAN Hello contents. Couldn't you have > a "point-to-point" link between two RBridges that was, in fact, over > carrier Ethernet facilities or something that only provide > connectivity on, say, VLAN 42? But if there are any differences at > all, a brief section on P2P Hellos seems reasonable. > Thanks this would make things clearer. 
>> Parallel links between rbridges: >> We need information in the draft that states that we break ties using (a) >> extended circuit id on P2P links (makes 3-way handshake mandatory) and (b) >> in a LAN, use lan circuit id. > > I'm confused by what you say. Assume we have RBridges 1 and 2 such > that there is, say, a point to point link between port A on RB1 and > port A on RB2 and also between port B on RB1 and port B on RB2. There > are two points of view depending on whether you are one of these two > RBridges or some other RBridge in the campus: > > Assume you are some other RBridge, RB3. Do you even see both the A and > B adjacencies in the link state? I would think not and that this > should be reported as only a single adjacency in the RB1 and RB2 LSPs. > If, for some reason, you do see it as two adjacencies, why would you > care? As long as you know there is connectivity between RB1 and RB2 > you can use that in SPF calculations. I suppose you need a way of > determining the cost from the two costs you would see but you could > just use the minimum of the two or something. And if for some really > bizarre reason, even though you are remote from RB1 and RB2 you not > only see the two parallel paths in the link state but you actually > care which path is taken, there is no space in the LSP TLVs to encode > any tie breaking information such as you suggest. So I don't see any > need for a tiebreaker here. > > Assume the other case, that you are either RB1 or RB2. I don't see any > difficulty here either. You should accept TRILL traffic on both > connections and we should say that as a clarification. (You wouldn't > want the Reverse Path Forwarding Check or something causing TRILL > frames on one of the parallel connections to be discarded.) And you > can send traffic on either connection. Or can do Equal Cost MultiPath > on both. 
But if you send over only one of them then, assuming they are > equal cost, it seems like a purely local decision which one and I > don't see why we need to specify a tie breaker. When you have two parallel links and only one of them becomes a member of the tree, then RPF as you have pointed out will fail on the other. If the parallel links are part of a bundle, we do not need this. However, if the user has specifically made them distinct, we need a method to ensure that tree is computed along only one of them. > >> Thanks, >> Ayan >> >> P.S. I have not fully cross-checked with version 11 to see all that has gone >> in, but I will take a look. Also, I will take a closer look on the >> hello-AF/AC issue with mis-configurations and get back to you. >> >> ... >> > > Thanks, > Donald > > ============================= > Donald E. Eastlake 3rd +1-508-634-2066 (home) > 155 Beaver Street > Milford, MA 01757 USA > d3e3e3 at gmail.com From james.d.carlson at Sun.COM Wed Feb 25 10:59:31 2009 From: james.d.carlson at Sun.COM (James Carlson) Date: Wed, 25 Feb 2009 13:59:31 -0500 Subject: [rbridge] two minor issues with the isis-02 draft In-Reply-To: <49A591C8.2010000@sun.com> References: <18845.48351.5175.126918@gargle.gargle.HOWL> <1028365c0902242109s1174d34bpc133b39fcad47ca4@mail.gmail.com> <18853.28352.318661.573947@gargle.gargle.HOWL> <1028365c0902250833j43b3c357necdec0e1fc0910f9@mail.gmail.com> <49A57B33.9030503@cisco.com> <18853.32532.724145.816601@gargle.gargle.HOWL> <49A591C8.2010000@sun.com> Message-ID: <18853.38163.276970.992693@gargle.gargle.HOWL> Radia Perlman writes: > Actually, I agree with Mike. Just seems complicated and an opportunity > for noninteroperability to have > two formats for metrics. I have long regretted doing the narrow metrics > in the first place. > TRILL for IS-IS is an opportunity to get rid of things that, in the > wisdom of 25 years, we would > have decided differently. 
It still seems like an unnecessary restriction for the job at hand, but if that's the general consensus, then my complaint really is a fairly minor one, and I'd be ok with leaving it as-is. (For what it's worth, I doubt that anyone's writing IS-IS from scratch to do this, so the "opportunity" really is just at the spec level. Implementations will almost certainly be carrying around ancient luggage for some time to come.) -- James Carlson, Solaris Networking Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084 MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677 From d3e3e3 at gmail.com Wed Feb 25 11:48:53 2009 From: d3e3e3 at gmail.com (Donald Eastlake) Date: Wed, 25 Feb 2009 14:48:53 -0500 Subject: [rbridge] two minor issues with the isis-02 draft In-Reply-To: <18853.32532.724145.816601@gargle.gargle.HOWL> References: <18845.48351.5175.126918@gargle.gargle.HOWL> <1028365c0902242109s1174d34bpc133b39fcad47ca4@mail.gmail.com> <18853.28352.318661.573947@gargle.gargle.HOWL> <1028365c0902250833j43b3c357necdec0e1fc0910f9@mail.gmail.com> <49A57B33.9030503@cisco.com> <18853.32532.724145.816601@gargle.gargle.HOWL> Message-ID: <1028365c0902251148o1c6ccc69w1eddb5bb10173a1@mail.gmail.com> There is actually one other reason to mandate use of the Extended IS Reachability TLV: In the protocol specification provisions related to ports configured as "access ports", it says that if an RBridge detects adjacency to another RBridge on an access port, it should "not report" that adjacency to avoid TRILL data frames being routed that way. The ability to "not report" an adjacency you have detected seems like a much bigger change than simply saying that, for access ports, adjacencies are reported with metric 2**24 - 1.
The Extended IS Reachability TLV specification provides that links with that exact maximum cost value MUST NOT be used in shortest path first calculations, thus achieving the desired effect of being sure that TRILL data frames will not be routed over them. This facility is not available with the original narrow metric TLV. Thanks, Donald On Wed, Feb 25, 2009 at 12:25 PM, James Carlson wrote: > mike shand writes: >>
>> I don't see the benefit of using the narrow metrics, especially when >> you take into account the not inconsiderable pain of moving to wide >> metrics as and when it becomes necessary. Since this is effectively >> starting with a clean sheet, I would have thought it much better to >> mandate wide metrics and stick to them. I don't know of anybody who has >> wanted to migrate back from wide metrics to narrow metrics.
>>
>>     Mike
> > I don't see it as an issue of migrating -- that's something that > deployers need to determine, and is described in some detail in RFC > 3787 -- but rather whether it's _necessary_ for TRILL > interoperability. > > If there were some reason that TRILL would fail to function correctly > with all nodes using the old 6-bit default metric, then that'd be a > good reason to outlaw its use. However, merely observing that there's > a new metric that everyone "should" be using doesn't actually mean > that it _must_ be used in order to achieve interoperability. > > -- > James Carlson, Solaris Networking > Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084 > MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677 -- ============================= Donald E. Eastlake 3rd +1-508-634-2066 (home) 155 Beaver Street Milford, MA 01757 USA d3e3e3 at gmail.com From james.d.carlson at sun.com Wed Feb 25 12:10:55 2009 From: james.d.carlson at sun.com (James Carlson) Date: Wed, 25 Feb 2009 15:10:55 -0500 Subject: [rbridge] two minor issues with the isis-02 draft In-Reply-To: <1028365c0902251148o1c6ccc69w1eddb5bb10173a1@mail.gmail.com> References: <18845.48351.5175.126918@gargle.gargle.HOWL> <1028365c0902242109s1174d34bpc133b39fcad47ca4@mail.gmail.com> <18853.28352.318661.573947@gargle.gargle.HOWL> <1028365c0902250833j43b3c357necdec0e1fc0910f9@mail.gmail.com> <49A57B33.9030503@cisco.com> <18853.32532.724145.816601@gargle.gargle.HOWL> <1028365c0902251148o1c6ccc69w1eddb5bb10173a1@mail.gmail.com> Message-ID: <18853.42447.628931.517@gargle.gargle.HOWL> Donald Eastlake writes: > There is actually one other reason to mandate use of the Extended IS > Reachability TLV: In the protocol specification provisions related to > ports configured as "access ports", it says that if an RBridge detects > adjacency to another RBridge on an access port, it should "not report" > that adjacency to avoid TRILL data frames being routed that way.
The > ability to "not report" an adjacency you have detected seems like a > much bigger change than to just say that, for access ports, report > adjacencies as having metric 2**24 - 1. The Extended IS Reachability That's what I was looking for. Yes, that's a clear reason to mandate the use of the newer TLV. (It wouldn't be bad to cite that case as a rationale in whatever future document ends up holding this text.)
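The access-port idea from this thread, advertising the adjacency with metric 2**24 - 1 so that it is left out of shortest-path calculations, can be sketched as a filter inside an ordinary Dijkstra run. This is a minimal illustration only: the graph layout, node names, and function name are made up, and a real IS-IS SPF has considerably more to it.

```python
# Sketch: shortest-path costs that ignore adjacencies advertised with the
# maximum wide metric. Topology and names are illustrative.
import heapq

MAX_LINK_METRIC = 2**24 - 1  # adjacencies at this value are skipped


def shortest_costs(adjacencies, source):
    """adjacencies: {node: [(neighbor, metric), ...]} -> {node: cost}."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, metric in adjacencies.get(node, []):
            if metric == MAX_LINK_METRIC:
                continue  # reported, but never used for forwarding
            new_cost = cost + metric
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    return dist


# RB1 and RB2 see each other on an access port; both also reach RB3.
campus = {
    "RB1": [("RB2", MAX_LINK_METRIC), ("RB3", 10)],
    "RB2": [("RB1", MAX_LINK_METRIC), ("RB3", 10)],
    "RB3": [("RB1", 10), ("RB2", 10)],
}
costs = shortest_costs(campus, "RB1")
print(costs)
```

Here the direct RB1-RB2 adjacency stays visible in the link state, but traffic from RB1 to RB2 is costed via RB3.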
-- James Carlson, Solaris Networking Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084 MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677 From d3e3e3 at gmail.com Wed Feb 25 18:18:19 2009 From: d3e3e3 at gmail.com (Donald Eastlake) Date: Wed, 25 Feb 2009 21:18:19 -0500 Subject: [rbridge] WG last call on draft-trill-rbridge-protocol-10.txt In-Reply-To: References: <1028365c0902190758l2d564959lc80c60f06bf48f0b@mail.gmail.com> Message-ID: <1028365c0902251818p571b0ba4v7ea3db412ff8fbfd@mail.gmail.com> Hi Ayan, It looks like we are essentially in agreement on your first two points. See below on your third point. On Wed, Feb 25, 2009 at 1:46 PM, Ayan Banerjee wrote: > Donald, > > ... > > On 2/19/09 7:58 AM, "Donald Eastlake" wrote: >> Hi Ayan, >> >> See below... >> >> On Tue, Jan 13, 2009 at 5:56 PM, Ayan Banerjee wrote: >>> Radia and Donald, >>> >>> ... >>> >>> Parallel links between rbridges: >>> We need information in the draft that states that we break ties using (a) >>> extended circuit id on P2P links (makes 3-way handshake mandatory) and (b) >>> in a LAN, use lan circuit id. >> >> I'm confused by what you say. Assume we have RBridges 1 and 2 such >> that there is, say, a point to point link between port A on RB1 and >> port A on RB2 and also between port B on RB1 and port B on RB2. There >> are two points of view depending on whether you are one of these two >> RBridges or some other RBridge in the campus: >> >> Assume you are some other RBridge, RB3. Do you even see both the A and >> B adjacencies in the link state? I would think not and that this >> should be reported as only a single adjacency in the RB1 and RB2 LSPs. >> If, for some reason, you do see it as two adjacencies, why would you >> care? As long as you know there is connectivity between RB1 and RB2 >> you can use that in SPF calculations. 
I suppose you need a way of >> determining the cost from the two costs you would see but you could >> just use the minimum of the two or something. And if for some really >> bizarre reason, even though you are remote from RB1 and RB2 you not >> only see the two parallel paths in the link state but you actually >> care which path is taken, there is no space in the LSP TLVs to encode >> any tie breaking information such as you suggest. So I don't see any >> need for a tiebreaker here. >> >> Assume the other case, that you are either RB1 or RB2. I don't see any >> difficulty here either. You should accept TRILL traffic on both >> connections and we should say that as a clarification. (You wouldn't >> want the Reverse Path Forwarding Check or something causing TRILL >> frames on one of the parallel connections to be discarded.) And you >> can send traffic on either connection. Or can do Equal Cost MultiPath >> on both. But if you send over only one of them then, assuming they are >> equal cost, it seems like a purely local decision which one and I >> don't see why we need to specify a tie breaker. > > When you have two parallel links and only one of them becomes a member of > the tree, then RPF as you have pointed out will fail on the other. If the > parallel links are part of a bundle, we do not need this. However, if the > user has specifically made them distinct, we need a method to ensure that > tree is computed along only one of them. What does it mean when you say "only one of them becomes a member of the tree"? How does that happen? As I say above, remotely from the RBridges with these two (or more) between them, I don't think you should be able to tell that there is more than one connection. And if you could, you shouldn't care whether one or the other is used or traffic is split across them. The topology is still the same regardless of which of these happens. And if you did care, there is no place in the LSP to add tie breaking info. 
And locally (that is, you are one of the two multiply connected RBridges), it should just be a local choice how to send, and you have to be able to receive on any of the parallel links. Apparently the problem you see is just with the Reverse Path Forwarding Check at the receiving end of two or more parallel links between the same pair of RBridges. I responded above on that point by saying "You should accept TRILL traffic on both connections and we should say that [in the protocol specification] as a clarification. (You wouldn't want the Reverse Path Forwarding Check or something causing TRILL frames on one of the parallel connections to be discarded.)" Isn't that a satisfactory answer? (Maybe I shouldn't have said "both" because there could be more than two...) Why should the local RBridges be forced to tie break and leave all but one of their parallel point-to-point links idle? Thanks, Donald >>> Thanks, >>> Ayan >>> >>> ... >>> >> >> Thanks, >> Donald ============================= Donald E.
Eastlake 3rd +1-508-634-2066 (home) 155 Beaver Street Milford, MA 01757 USA d3e3e3 at gmail.com From Radia.Perlman at sun.com Fri Feb 27 13:31:23 2009 From: Radia.Perlman at sun.com (Radia Perlman) Date: Fri, 27 Feb 2009 13:31:23 -0800 Subject: [rbridge] Proposed resolution: Re: encoding of TRILL IS-IS frames In-Reply-To: <1028365c0810291645la92b71dk47210c7b81fc3d90@mail.gmail.com> References: <48EFDA99.9070304@sun.com> <1028365c0810101932i6e17140dw90abc00d8643ac74@mail.gmail.com> <48F08FEA.1000302@cisco.com> <1028365c0810291645la92b71dk47210c7b81fc3d90@mail.gmail.com> Message-ID: <49A85BAB.9060703@sun.com> This is what I think we should do: a) unless really broken, we should leave the effect of the current spec, so that if someone has implemented it, we don't change their implementation b) so, that would imply that on Ethernet, encoding would be as the spec says (data and ESADI are TRILL-encapsulated, TRILL-IS-IS is not TRILL-encapsulated, but has a distinct (outer) destination MAC address). c) if it is considered safer somehow, we could use a bit in the TRILL header that is currently set to 0 and define it as the "control" bit, saying that implementations must set it to 0 (as they would anyway according to the current spec) and SHOULD check to make sure it's 0 on receipt. But I don't think we really need that bit. d) we need to define how to encode TRILL on PPP. As Jim Carlson pointed out, the current spec is at best incomplete, and at worst, incorrect about PPP. One way we could fix this is to make sure we clarify, when things are Ethernet specific, that we are talking only about Ethernet links. An example is when we say you can tell the difference between TRILL IS-IS and TRILL data based on the destination MAC address. It should say "on Ethernet links". And perhaps another sentence like "on other types of links, such as PPP, some other mechanism must exist to differentiate".
e) proposal for PPP is two protocol types, one for TRILL encapsulated data and ESADI frames (both of which would have a TRILL header), and one for "other", where we'd say, for instance, that the first 2 bytes after the PPP header would be a "TRILL protocol type", so that we could then encode more types of "other" than just IS-IS, if we ever need to. f) we should have a separate document "how to encode TRILL over PPP". g) I was thinking about how to tunnel TRILL over IP. For that, undoubtedly we'd put in a UDP header. Perhaps it would be useful to define TRILL over IP at some point, again it would be in a separate document. So....are people OK with this? Radia Proposed From james.d.carlson at Sun.COM Fri Feb 27 14:50:04 2009 From: james.d.carlson at Sun.COM (James Carlson) Date: Fri, 27 Feb 2009 17:50:04 -0500 Subject: [rbridge] Proposed resolution: Re: encoding of TRILL IS-IS frames In-Reply-To: <49A85BAB.9060703@sun.com> References: <48EFDA99.9070304@sun.com> <1028365c0810101932i6e17140dw90abc00d8643ac74@mail.gmail.com> <48F08FEA.1000302@cisco.com> <1028365c0810291645la92b71dk47210c7b81fc3d90@mail.gmail.com> <49A85BAB.9060703@sun.com> Message-ID: <18856.28188.700649.936645@gargle.gargle.HOWL> Radia Perlman writes: > about Ethernet links. An example > is when we say you can tell the difference between TRILL IS-IS and TRILL > data based on the destination > MAC address. It should say "on Ethernet links". And perhaps another > sentence like "on other types > of links, such as PPP, some other mechanism must exist to differentiate". I don't like to have "Foo over X" documents that specify how you must deviate from the base spec for "Foo" in order to implement, so I think that sort of a clarification in the base spec is required. The mapping described in the base specification (for good or ill) is Ethernet-only, but it doesn't actually say that. I also _strongly_ recommend that: a.)
the ASCII graphic diagrams of the various packet formats be included in the base specification to make the encoding clear; it's not clear now. b.) we say something about the MTU weirdness that this solution imposes, and we make the expectations for MTU handling completely clear, as it's an interoperability issue. > f) we should have a separate document "how to encode TRILL over PPP. I can volunteer to write that one. > So....are people OK with this? I think it's a poor solution for a number of reasons (what does "filter for TRILL" look like?), and this was broken only "recently" (the old mapping was in the -09 draft, and was changed in -10, well after we had started implementing and testing), so it should have been something fixable. But since it seems we cannot get consensus without keeping this late and incompatible change to the spec, and I certainly value consensus, I'll concede. I hope it doesn't come back to bite. -- James Carlson, Solaris Networking Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084 MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677 From Jim.Burrows at stellarswitches.com Sat Feb 28 14:31:40 2009 From: Jim.Burrows at stellarswitches.com (Jim Burrows) Date: Sat, 28 Feb 2009 17:31:40 -0500 Subject: [rbridge] Proposed resolution: Re: encoding of TRILL IS-IS frames In-Reply-To: <18856.28188.700649.936645@gargle.gargle.HOWL> References: <48EFDA99.9070304@sun.com> <1028365c0810101932i6e17140dw90abc00d8643ac74@mail.gmail.com> <48F08FEA.1000302@cisco.com> <1028365c0810291645la92b71dk47210c7b81fc3d90@mail.gmail.com> <49A85BAB.9060703@sun.com> <18856.28188.700649.936645@gargle.gargle.HOWL> Message-ID: <4B77B4B9-257D-401C-98A2-3E565648B947@stellarswitches.com> I'll have to agree with Jim Carlson with regard to the notion of "Foo over X" documents and specs whose scope intermingles one specific protocol from one level into a specification at another level. 
I'm new here, and I don't know how far folks may have gotten in implementing the draft 10 spec, vs the 09 or the 11, but it also feels wrong to stick with a problem that crept into a latish but not final draft of the spec for backward compatibility reasons. I've had to live with the sort of problems that this introduces too often. Is it really too late to fix this? JimB. On Feb 27, 2009, at 5:50 PM, James Carlson wrote: > Radia Perlman writes: >> about Ethernet links. An example >> is when we say you can tell the difference between TRILL IS-IS and >> TRILL >> data based on the destination >> MAC address. It should say "on Ethernet links". And perhaps another >> sentence like "on other types >> of links, such as PPP, some other mechanism must exist to >> differentiate". > > I don't like to have "Foo over X" documents that specify how you must > deviate from the base spec for "Foo" in order to implement, so I think > that sort of a clarification in the base spec is required. The > mapping described in the base specification (for good or ill) is > Ethernet-only, but it doesn't actually say that. > > I also _strongly_ recommend that: > > a.) the ASCII graphic diagrams of the various packet formats be > included in the base specification to make the encoding clear; > it's not clear now. > > b.) we say something about the MTU weirdness that this solution > imposes, and we make the expectations for MTU handling > completely clear, as it's an interoperability issue. > >> f) we should have a separate document "how to encode TRILL over PPP. > > I can volunteer to write that one. > >> So....are people OK with this? > > I think it's a poor solution for a number of reasons (what does > "filter for TRILL" look like?), and this was broken only "recently" > (the old mapping was in the -09 draft, and was changed in -10, well > after we had started implementing and testing), so it should have been > something fixable.
> > But since it seems we cannot get consensus without keeping this late > and incompatible change to the spec, and I certainly value consensus, > I'll concede. I hope it doesn't come back to bite. > > -- > James Carlson, Solaris Networking > > Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 > 2084 > MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 > 1677 > _______________________________________________ > rbridge mailing list > rbridge at postel.org > http://mailman.postel.org/mailman/listinfo/rbridge
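For readers following the encoding proposal quoted above, here is a rough sketch of the demultiplexing it implies: on Ethernet the outer destination MAC separates TRILL IS-IS from TRILL-encapsulated frames, while on PPP one protocol type carries TRILL-encapsulated frames and a second carries "other" frames whose first 2 bytes are a TRILL protocol type. Every constant below is a hypothetical placeholder rather than an assigned code point, the function names are invented for illustration, and the big-endian byte order of the 2-byte type is an assumption.

```python
from enum import Enum


class FrameKind(Enum):
    TRILL_ENCAPSULATED = "data or ESADI, carries a TRILL header"
    TRILL_IS_IS = "TRILL IS-IS, no TRILL header"
    OTHER = "some other non-encapsulated TRILL protocol"


# Hypothetical placeholders only; none of these are assigned values.
TRILL_IS_IS_DEST_MAC = bytes.fromhex("030000000002")
PPP_PROTO_TRILL_ENCAP = 0xAAA1
PPP_PROTO_TRILL_OTHER = 0xAAA3
TRILL_PROTO_TYPE_IS_IS = 0x0001


def classify_ethernet(dest_mac: bytes) -> FrameKind:
    """Ethernet case: the outer destination MAC tells the two apart.

    Assumes the caller has already decided the frame is a TRILL frame.
    """
    if dest_mac == TRILL_IS_IS_DEST_MAC:
        return FrameKind.TRILL_IS_IS
    return FrameKind.TRILL_ENCAPSULATED


def classify_ppp(protocol: int, info: bytes):
    """PPP case: return (kind, payload) for the two proposed protocol types."""
    if protocol == PPP_PROTO_TRILL_ENCAP:
        return FrameKind.TRILL_ENCAPSULATED, info
    if protocol == PPP_PROTO_TRILL_OTHER:
        if len(info) < 2:
            raise ValueError("missing 2-byte TRILL protocol type")
        # Assumed big-endian encoding of the 2-byte TRILL protocol type.
        trill_type = int.from_bytes(info[:2], "big")
        if trill_type == TRILL_PROTO_TYPE_IS_IS:
            return FrameKind.TRILL_IS_IS, info[2:]
        return FrameKind.OTHER, info[2:]
    raise ValueError("not one of the two TRILL PPP protocol types")
```

The sketch shows why the Ethernet rule does not carry over as written: a PPP link has no destination MAC to test, so the distinction has to move into the PPP protocol field.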