[e2e] Port numbers, SRV records or...?

John Day day at std.com
Sun Aug 6 21:51:28 PDT 2006

>The thread seems to have veered off, into an interesting discussion 
>of history.
>  There were a few responses to the main line of your query, and I am hoping we
>can consider it more extensively.
>In what follows, I decided to make straight assertions, mostly to try to
>engender focused responses.  I also decided to wander around the topic, a bit,
>to see what might resonate...
>At every layer of a protocol architecture, there is a means of distinguishing
>services at the next layer up.  Without having done an exhaustive study, I
>will nonetheless assert that there is always a field at the current layer, for
>identifying among the entities at the next layer up.

Be careful here.  This is not really true.  The protocol-id field in
IP, for example, identifies the *syntax* of the protocol in the layer
above.  (It can't identify the protocol itself, because there could be
more than one instance of the same protocol in the same system.  That
doesn't happen often, but it can happen.)  Port-ids or sockets identify
an instance of communication, not a particular service.  Again, the
well-known socket approach only works as long as there is only one
instance of the protocol in the layer above and certainly only one
instance of a "service."  (We were lucky in the early work that this
was true.)

>That field always has at least one standard, fixed value.  Whether it has more
>than one is the interesting question, and depends on how the standard value(s)
>get used.  (If there is only one, then it will be used as a dynamic 

Actually, if you do it right, no one standard value is necessary at 
all.  You do have to know the name of the application you want to 
communicate with, but you needed to know that anyway.
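
As a rough illustration of the point above, here is a minimal Python
sketch in which no port value is standardized in advance: the caller
knows only the application name, and the port-id is bound when the
application registers.  The Directory class, its register/resolve
methods, the application names and the starting port are all
hypothetical, invented for this sketch.  It also sidesteps how the
directory itself is reached, which a real design would have to answer.

```python
import itertools


class Directory:
    """Hypothetical name-to-port directory; not any real system's API."""

    def __init__(self, first_port=49152):
        # Ports are handed out on demand; none is reserved for a service.
        self._next_port = itertools.count(first_port)
        self._bindings = {}

    def register(self, app_name):
        # The binding is made at registration time, not fixed by a standard.
        port = next(self._next_port)
        self._bindings[app_name] = port
        return port

    def resolve(self, app_name):
        # The caller supplies only the application name.
        return self._bindings[app_name]


directory = Directory()
mail_port = directory.register("mail.example")
file_port = directory.register("files.example")

# A client that knows the name can find the current port-id.
looked_up = directory.resolve("mail.example")
```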

>The questions of how and where the information is encoded both strike me as
>completely irrelevant, for any basic discussion about the topic you raise.
>The questions are obviously essential for matters of packet parsing and
>possibly for some aspects of scaling, but irrelevant to any sort of
>information theoretic perspective.  Whether the bits are interpreted as a
>number or an ascii string does not matter.  Whether the field is distinct or
>part of some other "addressing" field also does not matter. They well might be
>extremely important for human administration and/or encoding efficiency, but
>not for basic

Agreed. Bits are bits.

>(Caveat:  XNS had the equivalent of the port number be part of the network
>address and this had an impact on what information its routers used.  It took
>some years before developers decided to have IP routers start paying
>attention to port number...)
>What *does* matter is how to know what values to use. This, in turn, creates a
>bootstrapping/startup task.  I believe the deciding factor in solving that
>task is when the binding is done to particular values.  Later binding gives
>more flexibility -- and possibly better scaling and easier administration --
>but at the cost of additional mechanism and -- probably always -- complexity,
>extra round-trips and/or reliability.

Not really.  But again, you have to do it the right way.  There are
plenty of ways to do it that do require all sorts of extra mechanism.

>Which is better, polling or interrupts?  The answer, of course, is that it
>depends.  It depends on the number of participants -- both total possible and
>currently active -- and their activity pattern.  Similarly, the choice between
>relatively static, pre-registration versus dynamic assignment depends upon how
>many services are involved and how quickly things change.  (Hmmm.  Dynamic
>assignment requires pre-registration too...)

Well, it depends on other things as well.  For example, if the 
identifiers being assigned are supposed to be location-dependent, then 
the assigner has to be able to interpret the location of the entity 
having the identifier assigned to it, so that it can assign the 
correct location-dependent identifier.

>In discussing the differences between email and instant messaging, I came to
>believe that we need to make a distinction between "protocol" and "service".
>The same protocol can be used for very different services, according to how it

Excellent.  This is very important.  It is a fundamental principle
of computer science: the black box abstracts the machinations of the
box itself, so how the box accomplishes its function can be changed
without the "user" of the box knowing it.

>is implemented and operated.  Some folks will remember that in the 70's, email
>had an instant messaging function.  While it involved a different FTP command
>than what we call email, the protocol was otherwise identical. Today, the

C'mon Dave, you know better.  MAIL and MLFL were there for the same 
purpose (sending mail).  MAIL wasn't for IM, it was because the TIPs 
(and others) didn't have a file system to act as a mailbox. The first 
IM on the 'Net was a hack by Jim Calvin in '72 to the Tenex command
that let you link two terminals together so both users saw what the
other typed.  Okay, you could construe it as being for IM, but that
was after the fact, not the reason it was created.  (MAIL sent mail
on the Telnet connection of FTP, while MLFL opened a data connection
and sent it like any other file.)

>service distinctions are immediacy and reliability.  That is, email is
>reliable push, except that delivery is into a mailbox rather than the screen,
>thereby making the last hop be "pull". This creates the view that email is not
>immediate. But it *is* reliable, in that a message survives most crashes by
>the host holding the message.  IM is push all the way, but a message does not
>survive a crash.  My point, here, is that these are implementation and
>operation distinctions, rather than inherent differences in the exchange
>protocols.

Dave, email is not reliable. Email is connectionless.  (Okay, it is 
much more reliable than UDP, but it is still not reliable.)  General 
rule:  To be reliable, if there is relaying at layer N, then there 
must be end-to-end error control at layer N+1.  Mail relays but there 
is no end-to-end error control over it.  (Receipt confirmation is the 
equivalent of the D-bit in X.25 and it wasn't e2e either.) There are 
crashes that mail won't survive.
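
The general rule above can be sketched in a few lines of Python.  This
is only an illustration under stated assumptions: LossyRelayChain and
send_reliably are hypothetical names, the loss pattern is fixed by hand
rather than random, and the acknowledgement is assumed never to be lost,
which a real design could not assume.  The relays (layer N) may lose a
message without telling anyone; reliability comes only from the sender
retransmitting until the final receiver itself acknowledges (layer N+1).

```python
class LossyRelayChain:
    """Hypothetical chain of relays; a message may be lost in transit."""

    def __init__(self, drop_attempts):
        # 1-based attempt numbers on which the message is silently lost.
        self.drop_attempts = set(drop_attempts)
        self.attempts = 0
        self.delivered = []

    def send(self, seq, payload):
        self.attempts += 1
        if self.attempts in self.drop_attempts:
            return None  # lost at some relay; the sender is not told
        if seq not in [s for s, _ in self.delivered]:
            self.delivered.append((seq, payload))  # duplicates are discarded
        return seq  # acknowledgement generated by the final receiver


def send_reliably(chain, seq, payload, max_tries=5):
    """End-to-end error control: retransmit until the receiver acks."""
    for _ in range(max_tries):
        if chain.send(seq, payload) == seq:
            return True
    return False


# With end-to-end control, two losses in the relays are recovered.
chain = LossyRelayChain(drop_attempts={1, 2})
delivered_ok = send_reliably(chain, 1, "hello")

# Fire-and-forget over the same kind of relays: one loss and it is gone.
unreliable = LossyRelayChain(drop_attempts={1})
unreliable.send(1, "hello")
```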

>If a "protocol" does not automatically define a "service" then what does?  In
>the world of ports, it is the port number.  In the world of SRVs, it is the
>SRV.  Either way, they permit repurposing a protocol for different uses.

Yes, a protocol defines a service regardless of whether the designers 
thought it did.

>Observation:  Our specifications usually fail to make the distinction between
>protocol and service, either by conflating them or by ignoring the latter.  I
>suspect we regularly get into trouble because of this.  At the least, we tend
>to carry implicit assumptions about the service that fail to consider likely
>evolution.

Most definitely.  This is something the IETF is very bad at.  Much of 
the IPv6 work in changing all of the "other" specifications happened 
because there were no clean service definitions that other specs 
could refer to, rather than to the protocol itself.  If we write code 
the way we write specs, it could explain a lot.  ;-)

One should write the service definition before writing the protocol. 
This is something that OSI understood. There was a service definition 
for every protocol spec, including the unit-data protocols. (Frankly, 
I don't know how to write a protocol spec without writing the API 
first. But then I was raised on finite-state machines!)
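
One way to picture "service definition first" is as an abstract
interface written before any wire format exists.  The Python sketch
below is an assumption-laden illustration, not OSI's notation or any
real API: UnitDataService, LengthPrefixedProtocol, DelimitedProtocol and
roundtrip are hypothetical names, and the in-memory dictionary stands in
for a network.  Two different "protocols" provide the same service, and
the user of the service cannot tell which one is underneath.

```python
from abc import ABC, abstractmethod


class UnitDataService(ABC):
    """Hypothetical service definition: what the user of the layer sees."""

    @abstractmethod
    def send(self, destination, data): ...

    @abstractmethod
    def receive(self, destination): ...


class LengthPrefixedProtocol(UnitDataService):
    """One possible protocol: a 2-byte length followed by the data."""

    def __init__(self):
        self._wire = {}  # stand-in for the network

    def send(self, destination, data):
        frame = len(data).to_bytes(2, "big") + data
        self._wire.setdefault(destination, []).append(frame)

    def receive(self, destination):
        frame = self._wire[destination].pop(0)
        length = int.from_bytes(frame[:2], "big")
        return frame[2:2 + length]


class DelimitedProtocol(UnitDataService):
    """Another protocol: data ended by a newline (data must not contain one)."""

    def __init__(self):
        self._wire = {}

    def send(self, destination, data):
        self._wire.setdefault(destination, []).append(data + b"\n")

    def receive(self, destination):
        return self._wire[destination].pop(0)[:-1]


def roundtrip(service):
    # Written only against the service definition, not against a protocol.
    service.send("app-a", b"hello")
    return service.receive("app-a")


via_length = roundtrip(LengthPrefixedProtocol())
via_delimiter = roundtrip(DelimitedProtocol())
```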

>To say "What if there were no well known numbers" we cannot mean "What if
>there were no initialization rendezvous mechanism?"

No, because they aren't necessary for that.

>I'll suggest that the question is really "What are the requirements for such a
>mechanism that might be better than our current model?"
>Are we seriously interested in trying to "trick" firewalls, by eliminating
>predictable service identifiers?  Do we think that will really work?  Do we
>really need to solve this "problem"?

Firewalls and NATs are a red herring in all of this.  To paraphrase 
Bucky Fuller, NATs only break broken architecture.

>Are there scaling, reliability or performance issues that suggest problems
>with the current problem?
>Eliot's lear-iana-no-more-well-known-ports seems to focus on the problem that
>we have a large number of unused and/or defunct well-known ports and that
>using SRV would be better.  While it is certainly true that the SRV name space
>is much larger, it is also true that the performance, complexity and
>reliability differences in using SRVs are massive.
>In other words, what problem do we have or do we anticipate?
>When we have some sense of that, we can consider how to solve it.

SRVs are just one more band-aid.  It is time to stop with the 
band-aids and figure out what the "answer" is. We need to know the 
goal we should at least approximate. Incremental change without 
knowing where you are going is being lost.  The general rule when you 
are lost is to stay put and let the rescuers find you. What? There 
are no rescuers!  Hmmmm.

Take care,
