[e2e] Reacting to corruption based loss

Clark Gaylord gaylord at dirtcheapemail.com
Wed Jun 29 06:42:47 PDT 2005


Detlef Bosau wrote:

>Sam Manthorpe wrote:
>  
>
>>>As an example of the latter, a major telecom company, whose services many of
>>>us are using this instant, called a few years back, asking for help
>>>      
>>>
>>How many (years?)
>>    
>>
>
>Alex reminded me on a strange situation, I met myself a couple of years
>ago.
>  
>
you did? that is strange.  what did you say?  :-)

>One day, I met strong TCP/IP problems on a WAN line exhibiting a BER
>10^-9, which was more than specified. However, I have thought about the
>  
>
Ok, while we're discussing "corruption-based loss" and weirdness, here's 
mine:

We often talk about bit errors being random.  I put it to you that this 
may not be true.  Perhaps it is the traffic data that are the random 
element and the bit errors are more predictable than we believe.

A user called us years ago, when our backbone was an FDDI ring, about a 
several-megabyte file he could not send to a neighboring building.  He 
had successfully sent it throughout his LAN, and there were other 
buildings to which he could send it, but not to this one.  He was using 
ftp.  As it turns out, the intended destination was counter-clockwise 
from him on the ring; all buildings he had successfully sent it to via 
the backbone were clockwise from him.  We did further testing with the 
user and found that, in fact, there were no buildings to which he could 
send this file that were counter-clockwise on the ring.  Weird.  So, we 
split the file in half and found that one piece would successfully 
traverse the ring, the other would not.  We kept going this way, 
bisecting the failing piece each time, until we were left with a piece 
of the file a few hundred bytes long that was the problem.  Out of the 
entire several-megabyte file, these few hundred bytes absolutely could not be 
convinced to traverse the ring counter-clockwise from this building, yet 
could travel anywhere else just fine.  If we tried to send a packet with 
these data, the FDDI interface would always accumulate an error.
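
For anyone who wants to repeat the exercise, the splitting procedure above can be sketched roughly as follows.  This is only an illustration, not what we actually ran: the function name, the send_ok callback (which stands in for "try to ftp these bytes across the suspect path and report whether they arrived") and the min_size cutoff are all made up for the example.

```python
from typing import Callable


def isolate_failing_chunk(
    data: bytes,
    send_ok: Callable[[bytes], bool],  # hypothetical: True if the bytes transfer cleanly
    min_size: int = 256,               # assumed cutoff; stop once the piece is this small
) -> bytes:
    """Repeatedly halve `data`, keeping a half that still fails to transfer."""
    if send_ok(data):
        raise ValueError("the whole payload transfers fine; nothing to isolate")
    chunk = data
    while len(chunk) > min_size:
        mid = len(chunk) // 2
        left, right = chunk[:mid], chunk[mid:]
        if not send_ok(left):
            chunk = left
        elif not send_ok(right):
            chunk = right
        else:
            # Both halves pass alone, so the troublesome pattern straddles
            # the split point; stop and return the current piece.
            break
    return chunk
```

Each round costs at most two transfers, so a several-megabyte file gets down to a few hundred bytes in roughly a dozen rounds.  The sketch assumes the failure is deterministic, as it was for us; with an intermittent error you would have to repeat each transfer several times before trusting a "pass".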

We sent out a field tech with an alcohol swab, and that fixed the problem.

The conjecture is that there was a particular bit pattern that would 
reliably get corrupted by the reflections on this fiber.  Cleaning the 
fiber fixed the problem.

--ckg