[e2e] Port numbers, SRV records or...?

Wed Aug 9 17:26:24 PDT 2006

At 17:50 -0400 2006/08/09, Keith Moore wrote:
>>  >I think of port numbers identifying a distinguished service or
>>  >default instance of a service.
>>
>>  No, you are carrying the semantics of well-known ports to ports in
>>  general.  You don't want to make this association.  Look at the use
>>  of ports in other protocols and in operating systems.
>
>Agreed, I was being sloppy.  I should have said "I think of well-known
>port numbers as identifying a distinguished service or default instance
>of a service" (as opposed to identifying, say, a protocol)
>
>>  >e.g. A host can have web servers (i.e. something that serves web
>>  >pages for a web browser), using the HTTP protocol, on any number of
>>  >ports. The web server running HTTP on port 80 is the default
>>  >instance, the one that people get if they don't specify anything
>>  >more than the name of the service (which is a DNS name but not
>>  >necessarily the name of a single host).
>>  >
>>  >A host can also use HTTP to provide things other than web servers,
>>  >and a host can have web servers running other protocols such as FTP.
>>  >So we have service names, host names, services, protocols, and ports
>>  >- each subtly different than those next to it.
>>
>>  First of all, be careful.  Again, the infatuation with names of hosts
>>  is an artifact of the current tradition.  They are not necessary for
>>  naming and addressing for communications.  TCP can be used for any
>>  application which may or may not use one of the standard protocols.
>
>I'm not sure what point you are trying to make here.  Applications care
>about names of hosts, services, things in DNS, etc. I don't think
>that's infatuation, but recognition of a need, as apps have to
>interface with humans.  TCP doesn't, and IMHO shouldn't, care about
>such things.

No.  Applications care about names of other applications.  TCPs care 
about names of other TCPs.  Host names were something that came in 
very early and were a bit of sloppiness that we got very attached to. 
I used "infatuation" because I have seen the look of abject terror on 
some people's faces when I suggested it was irrelevant to the naming 
and addressing problem.

The only place I have found a host-name useful is for network 
management when you want to know about things you are managing that 
are in the same system.  But for communication it is pretty much 
irrelevant.  Yes, there are special cases where you want to know that 
an application is on a particular system but they are just that: 
special cases.

>>  >>>Again, the well-known socket approach only
>>  >>>works as long as there is only one instance of the protocol in the layer
>>  >>>above and certainly only one instance of a "service." (We were lucky in
>>  >>>the early work that this was true.)
>>  >
>>  >[realizing that here I'm replying to the previous message...]
>>  >Well-known ports work well for selecting the default instance of a
>>  >service.  Often that is what we want to do, and we find it
>>  >convenient to have a distinguished instance of a service that serves
>>  >as a default. Well-known ports don't prevent us from offering
>>  >alternate instances of the same service on other ports.
>>
>>  They are a kludge that you have grown used to because you have never
>>  known anything else.  That is the nice thing about software: we can
>>  make almost anything work.  But it does require that we reserve well-known
>>  ports on all systems whether they need them or not.  We have been
>>  lucky that for 20 years we only had 3 applications.
>
>There are lots of different ways to solve a problem.  TCP could have
>been designed to specify the protocol instead of a port.  Then we would
>need some sort of kludge to allow multiple instances of a protocol on a
>host.  Or it could have been designed to specify both a protocol and an
>instance, and applications designed to run on top of TCP would have
>needed to specify protocol instance when connecting (much as they
>specify port #s now).

Actually it shouldn't have been at all.  This is really none of TCP's 
business.  TCP implements mechanisms to create a reliable channel and 
the pair of port-ids is there to be a connection-identifier, i.e., to 
identify an instance.  Binding that channel to a pair of applications 
is a separate problem.  It was done in NCP as a short cut, partly 
because we didn't know any better and partly because we had bigger 
problems to solve.  TCP just did what NCP did.
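
A minimal sketch of that separation, under stated assumptions: the class names (ConnectionId, Transport, Directory), the application name "new-telnet", and the addresses and port values are all hypothetical and chosen only for illustration. The transport demultiplexes purely on the connection identifier, while a separate directory binds an application name to wherever the application happens to be listening.

```python
from typing import Dict, List, NamedTuple, Tuple


class ConnectionId(NamedTuple):
    """The pair of (address, port-id) endpoints that identifies one connection."""
    local_addr: str
    local_port: int
    remote_addr: str
    remote_port: int


class Transport:
    """Hypothetical transport: demultiplexes only by connection identifier."""

    def __init__(self) -> None:
        self._queues: Dict[ConnectionId, List[bytes]] = {}

    def open(self, conn: ConnectionId) -> None:
        if conn in self._queues:
            raise ValueError("connection identifier already in use")
        self._queues[conn] = []

    def deliver(self, conn: ConnectionId, payload: bytes) -> None:
        # Port-ids carry no meaning here beyond distinguishing instances.
        self._queues[conn].append(payload)

    def received(self, conn: ConnectionId) -> bytes:
        return b"".join(self._queues[conn])


class Directory:
    """Hypothetical, separate mechanism binding application names to endpoints."""

    def __init__(self) -> None:
        self._bindings: Dict[str, Tuple[str, int]] = {}

    def register(self, app_name: str, addr: str, port: int) -> None:
        self._bindings[app_name] = (addr, port)

    def lookup(self, app_name: str) -> Tuple[str, int]:
        return self._bindings[app_name]


directory = Directory()
directory.register("new-telnet", "192.0.2.10", 49152)  # no reserved port needed

transport = Transport()
addr, port = directory.lookup("new-telnet")
conn = ConnectionId("198.51.100.7", 51000, addr, port)
transport.open(conn)
transport.deliver(conn, b"hello")
```

In this sketch nothing in Transport knows what application sits behind a port-id; only the directory does.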

>I don't know of many things that break if a system puts some other
>engine or service on a port besides that indicated by the well-known
>port assignment for that port.  Traffic sniffers and interception
>proxies, maybe, but I'm not sure that it's a good thing architecturally
>for networks to be able to aggressively monitor and alter traffic.

I remember someone putting a system dump on what others thought was 
the socket for new Telnet. It came as a bit of a surprise.  ;-)

>>  >(Though I'll admit that it has become easier to do so since the URL
>>  >name:port convention became popular.  Before that, the port number
>>  >was often wired into applications, and it still is often wired into
>>  >some applications, such as SMTP.  But a lot of the need to be able
>>  >to use alternate port numbers has resulted from the introduction of
>>  >NAPTs. Before NAPTs we could get away with assigning multiple IP
>>  >addresses (and multiple DNS names) to a host if we needed to run
>>  >multiple instances of a service on that host.  And we still find it
>>  >convenient to do that for hosts not behind NAPTs, and for hosts
>>  >behind firewalls that restrict which ports can be used.)
>>
>>  URLs try to do what needs to be done.  But we really only have them for
>>  HTTP.  It is not easy to use in general.
>
>URLs are becoming more and more popular.  They're just more visible in
>HTTP than in other apps.  And even apps that don't use URLs are often
>now able to specify ports.  Every MUA I know of lets you specify ports
>for mail submission, POP, and IMAP.  (I think it would be even better
>if they let people type in smtp:, pop:, and imap: URLs.)
>

As I said before, the great thing about software is that you can heap 
band-aid upon band-aid and call it a system.
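
For reference, the convention under discussion (a default port per scheme, overridden by an explicit port in the URL) can be sketched as below. This is only an illustration, assuming the standard registered ports for these schemes; the helper name `endpoint`, the `DEFAULT_PORTS` table, and the example host names are hypothetical.

```python
from typing import Tuple
from urllib.parse import urlsplit

# Well-known ports used as defaults when the URL names none.
DEFAULT_PORTS = {"http": 80, "smtp": 25, "pop": 110, "imap": 143}


def endpoint(url: str) -> Tuple[str, int]:
    """Return (host, port) for a URL, falling back to the scheme's default port."""
    parts = urlsplit(url)
    if parts.hostname is None:
        raise ValueError(f"no host in URL: {url!r}")
    if parts.port is not None:
        # An explicit port selects an alternate instance of the service.
        return parts.hostname, parts.port
    if parts.scheme not in DEFAULT_PORTS:
        raise ValueError(f"no default port known for scheme {parts.scheme!r}")
    return parts.hostname, DEFAULT_PORTS[parts.scheme]


default_instance = endpoint("imap://mail.example.com")
alternate_instance = endpoint("smtp://mail.example.com:2525")
```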

>>  >>>Actually, if you do it right, no one standard value is necessary at
>>  >>>all.  You do have to know the name of the application you want to
>>  >>>communicate with, but you needed to know that anyway.
>>  >
>>  >To me this looks like overloading because you might want more than
>>  >one instance of the same application on the host.  I'd say that you
>>  >need to know the name of the instance of the service you want to
>>  >talk to.  Now in an alternate universe that name might be something
>>  >like "foo.example.com:web:http:1" - encompassing server name
>>  >(foo.example.com), the name of the service (web) protocol name
>>  >(http), and an identifier (1) to distinguish one instance of the
>>  >same service/protocol from another. But we might not actually need
>>  >that much complexity, and it would expose it to traffic analysis
>>  >which is good or bad depending on your point-of-view.
>>
>>  Indeed. The name would have to allow for both type and instance.  As
>>  well as applications with multiple application protocols and multiple
>>  instances of them.  But this was worked out years ago.
>
>I won't claim that it can't be done, but is it really needed?  or worth
>it?

Only for those that need it.  Remember there are lots of people out 
there developing applications for the net that will never see the 
IETF or the "common" protocols.  These people are struggling to solve 
their problems because they are being forced to use the network 
equivalent of DOS, because the "new bell-heads" see no need to have a 
"Unix".  Our job is to provide the complete tool set in such a way 
that if they don't need it, it doesn't get in their way, and if they do, 
they have it.  We aren't even debating wrenches vs sockets, we are 
debating whether nails can't be used for everything.
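
To make the type-and-instance name quoted above concrete, here is a small sketch that parses the "foo.example.com:web:http:1" form. That four-field syntax is only the illustration from the quoted message, not an existing standard, and the names `ServiceName` and `parse_service_name` are hypothetical.

```python
from typing import NamedTuple


class ServiceName(NamedTuple):
    server: str    # e.g. foo.example.com
    service: str   # e.g. web
    protocol: str  # e.g. http
    instance: int  # distinguishes instances of the same service/protocol


def parse_service_name(name: str) -> ServiceName:
    """Split 'server:service:protocol:instance' into its four fields."""
    fields = name.split(":")
    if len(fields) != 4:
        raise ValueError(f"expected 4 colon-separated fields, got {len(fields)}")
    server, service, protocol, instance = fields
    return ServiceName(server, service, protocol, int(instance))


first = parse_service_name("foo.example.com:web:http:1")
second = parse_service_name("foo.example.com:web:http:2")

# Same type (service and protocol), different instances.
same_type = first[:3] == second[:3]
```

An application that needs only the type can ignore the instance field; one that needs a particular instance has a way to say so.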

>>  >Right now we need to know the name of the peer we want to
>>  >communicate with.  That name is composed of an IP address and a port
>>  >number.  This is unambiguous.  Having a default port for a
>>  >particular service doesn't prevent alternate instances of a service
>>  >from being established at the same IP address, as most modern
>>  >applications allow other port numbers to be specified.  Each
>>  >application can perform that binding as late as it wants to.  The
>>  >only real problem (other than legacy apps that don't support late
>>  >bindings) is caused by NAPTs that want to make port numbers
>>  >realm-specific, and we already know that NATs were a bad idea.
>>
>>  NATs only cause trouble because we have half an architecture. 
>
>Any architecture is incomplete, and the Internet architecture is no
>exception.  I'm not sure what benefit there is to berating it now,

That is a common misconception.  No, this architecture is incomplete 
on an absolute measure.  It is an unfinished demo and the demo was 
held in '72. It is very possible to have a complete architecture. 
Oddly enough, it would yield simpler solutions than what we have.

>though there are certainly things to be learned by examining its
>limitations.  Nor do I really see a better way (even after all this
>time) of doing either routing or referral than to have a large address
>space with distributed assignment of addresses and each address being
>unique within the network.   So I'm inclined to think that even if we
>had a whole architecture, NATs as we know them would still be a
>hindrance.   Of course, given a somewhat different architecture, we
>would have had different kinds of NATs - or maybe, not had the
>need for NATs.

NATs would not be a problem.  They would either integrate cleanly or 
not exist depending on your point of view.

>>  People
>>  have gotten used to the pain and think that it is perfectly normal to
>>  run a net this way. 
>
>People have become accustomed to heart disease too.  That doesn't
>mean that trans fat is good for you.  And yet people are slowly
>learning to not put this stuff in food.
>
>>  As for making it work, as I said, the nice thing about software is
>>  you can usually find a way to make almost anything work.
>
>Until you drown in complexity.  When that happens, you blame anything
>but your own efforts to make the system more complex.  (Or you let some
>other disruptive factor - such as y2k or ipv6 - be the excuse for
>scrapping things and starting over with a cleaner slate)
>
>>  >>The key question is "what is late bound". IMO, we could really use
>>  >>something that decouples protocol identifier from instance (e.g.,
>>  >>process demultiplexing) identifier.
>>  >
>>  >We could also use something that decouples service from protocol.
>>  >(do we really want to be stuck with HTTP forever as the only way to
>>  >get web pages?  SMTP as the only way to transmit mail?)  How many
>>  >layers do we want?
>>
>>  Don't think it is a question of want, but a question of need.  25 years
>>  ago we figured out how much naming and addressing we need but we
>>  chose to ignore the answer.
>
>Care to supply a pointer?

RFC 1498

>
>>  >In summary: Port numbers are sufficient for the endpoints, and well
>>  >known ports are a useful convention for a default or distinguished
>>  >instance of a service as long as we don't expect them to be rigidly
>>  >adhered to.  The question is: how much information about what
>>  >services/protocols are being used should be exposed to the network?
>>  >And if we had such a convention for exposing services/protocols to
>>  >the network are we in a position to demand that hosts rigidly
>>  >enforce that convention?
>>
>>  Why would you want to? 
>
>I'm not sure we do.  But I'm also not sure why network-visible service
>or protocol identifiers are useful if the network can't take them as a
>reliable indicator of content.  So absent some means to police them, I
>doubt that they are useful at all.

Ahhh, then we are in agreement?

Take care,
John