[e2e] Port numbers, SRV records or...?

Wed Aug 9 13:21:19 PDT 2006

At 11:32 -0400 2006/08/08, Keith Moore wrote:
>>>Port-ids or sockets identify an instance of communication,
>>>not a  particular service.
>>
>>They currently do both for the registered numbers, at least as a
>>convention, although individual host-pairs override that protocol
>>meaning by a-priori (usually out-of-band) agreement.
>
>I think of port numbers identifying a distinguished service or 
>default instance of a service.

No, you are carrying the semantics of well-known ports to ports in 
general.  You don't want to make this association.  Look at the use 
of ports in other protocols and in operating systems.

>e.g. A host can have web servers (i.e. something that serves web 
>pages for a web browser), using the HTTP protocol, on any number of 
>ports. The web server running HTTP on port 80 is the default 
>instance, the one that people get if they don't specify anything 
>more than the name of the service (which is a DNS name but not 
>necessarily the name of a single host).
>
>A host can also use HTTP to provide things other than web servers, 
>and a host can have web servers running other protocols such as FTP. 
>So we have service names, host names, services, protocols, and ports 
>- each subtly different than those next to it.

First of all, be careful.  Again, the infatuation with names of hosts 
is an artifact of the current tradition.  Host names are not necessary for 
naming and addressing for communications.  TCP can be used for any 
application which may or may not use one of the standard protocols.

>>>Again, the well-known socket approach only
>>>works as long as there is only one instance of the protocol in the layer
>>>above and certainly only one instance of a "service." (We were lucky in
>>>the early work that this was true.)
>
>[realizing that here I'm replying to the previous message...]
>Well-known ports work well for selecting the default instance of a 
>service.  Often that is what we want to do, and we find it 
>convenient to have a distinguished instance of a service that serves 
>as a default. Well-known ports don't prevent us from offering 
>alternate instances of the same service on other ports.

They are a kludge that you have grown used to because you have never 
known anything else.  That is the nice thing about software: we can 
make almost anything work.  But it does require that we reserve well-known 
ports on all systems whether they need them or not.  We have been 
lucky that for 20 years we only had 3 applications.

>(Though I'll admit that it has become easier to do so since the URL 
>name:port convention became popular.  Before that, the port number 
>was often wired into applications, and it still is often wired into 
>some applications, such as SMTP.  But a lot of the need to be able 
>to use alternate port numbers has resulted from the introduction of 
>NAPTs. Before NAPTs we could get away with assigning multiple IP 
>addresses (and multiple DNS names) to a host if we needed to run 
>multiple instances of a service on that host.  And we still find it 
>convenient to do that for hosts not behind NAPTs, and for hosts 
>behind firewalls that restrict which ports can be used.)

URLs try to do what needs to be done.  But we really only have them for 
HTTP.  They are not easy to use in general.
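
To make the name:port convention concrete, here is a minimal sketch in 
Python.  It assumes the standard urllib.parse module; the DEFAULT_PORTS 
table and the endpoint() helper are illustrative names, not part of any 
library.  An explicit port in the URL overrides the well-known default.

```python
from urllib.parse import urlsplit

# Conventional well-known ports per scheme (illustrative subset).
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}

def endpoint(url):
    """Return (host, port), falling back to the scheme's well-known port."""
    parts = urlsplit(url)
    # parts.port is None when the URL does not name a port explicitly.
    port = parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme]
    return parts.hostname, port
```

With this sketch, a URL without a port selects the default instance, and 
a URL with ":8080" selects an alternate instance on the same host.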

>>>Actually, if you do it right, no one standard value is necessary at
>>>all.  You do have to know the name of the application you want to
>>>communicate with, but you needed to know that anyway.
>
>To me this looks like overloading because you might want more than 
>one instance of the same application on the host.  I'd say that you 
>need to know the name of the instance of the service you want to 
>talk to.  Now in an alternate universe that name might be something 
>like "foo.example.com:web:http:1" - encompassing server name 
>(foo.example.com), the name of the service (web), protocol name 
>(http), and an identifier (1) to distinguish one instance of the 
>same service/protocol from another. But we might not actually need 
>that much complexity, and it would expose it to traffic analysis 
>which is good or bad depending on your point-of-view.

Indeed. The name would have to allow for both type and instance, as 
well as applications with multiple application protocols and multiple 
instances of them.  But this was worked out years ago.
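
As a rough sketch of the "alternate universe" name quoted above, the 
following Python splits "foo.example.com:web:http:1" into its four parts.  
The InstanceName type and parse_instance_name() are hypothetical, and the 
colon-separated layout is only the example from the quoted text, not an 
existing naming scheme (it would not cope with IPv6 literals, for one).

```python
from typing import NamedTuple

class InstanceName(NamedTuple):
    host: str       # server name, e.g. foo.example.com
    service: str    # type of service, e.g. web
    protocol: str   # application protocol, e.g. http
    instance: int   # distinguishes instances of the same service/protocol

def parse_instance_name(name):
    """Split a host:service:protocol:instance name into its fields."""
    host, service, protocol, instance = name.split(":")
    return InstanceName(host, service, protocol, int(instance))
```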

>Right now we need to know the name of the peer we want to 
>communicate with.  That name is composed of an IP address and a port 
>number.  This is unambiguous.  Having a default port for a 
>particular service doesn't prevent alternate instances of a service 
>from being established at the same IP address, as most modern 
>applications allow other port numbers to be specified.  Each 
>application can perform that binding as late as it wants to.  The 
>only real problem (other than legacy apps that don't support late 
>bindings) is caused by NAPTs that want to make port numbers 
>realm-specific, and we already know that NATs were a bad idea.
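
The claim that an IP address plus a port number names a peer 
unambiguously can be illustrated with a small Python sketch using the 
standard socket module.  The two_instances() helper is hypothetical; it 
binds two listeners on the same loopback address and lets the operating 
system choose the ports (port 0), so the binding happens as late as 
possible.

```python
import socket

def listen_on(host, port=0):
    """Open a TCP listener; port 0 asks the OS to pick a free port."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind((host, port))
    s.listen()
    return s

def two_instances(host="127.0.0.1"):
    """Start two listeners on one address and return their (ip, port) names."""
    a = listen_on(host)
    b = listen_on(host)
    try:
        return a.getsockname(), b.getsockname()
    finally:
        a.close()
        b.close()
```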

NATs only cause trouble because we have half an architecture.  People 
have gotten used to the pain and think that it is perfectly normal to 
run a net this way.  It is a lot like old DOS users, wondering why 
UNIX guys thought it was so much better.

As for making it work, as I said, the nice thing about software is 
you can usually find a way to make almost anything work.

>
>>The key question is "what is late bound". IMO, we could really use
>>something that decouples protocol identifier from instance (e.g.,
>>process demultiplexing) identifier.
>
>We could also use something that decouples service from protocol. 
>(do we really want to be stuck with HTTP forever as the only way to 
>get web pages?  SMTP as the only way to transmit mail?)  How many 
>layers do we want?

I don't think it is a question of want, but a question of need.  25 years 
ago we figured out how much naming and addressing we need, but we 
chose to ignore the answer.

>Right now we have a convention, not a rule, that certain port 
>numbers imply certain protocols.  I actually think it's about the 
>right level of slipperiness.  Hosts can do what they want with their 
>ports.  Networks can look at the port convention in an attempt to 
>measure traffic levels, and they can block ports based on knowledge 
>or belief about what applications are going to use them, but if they 
>try to insist that particular apps (and only those apps) use 
>particular ports (and no other ports) they're going to break things 
>and there will be backlash.  In my mind this encourages networks to 
>be more transparent (thus creating an environment more favorable to 
>new apps) than they would be if we had a protocol identifier exposed 
>in the packet.
>
>In summary: Port numbers are sufficient for the endpoints, and well 
>known ports are a useful convention for a default or distinguished 
>instance of a service as long as we don't expect them to be rigidly 
>adhered to.  The question is: how much information about what 
>services/protocols are being used should be exposed to the network? 
>And if we had such a convention for exposing services/protocols to 
>the network are we in a position to demand that hosts rigidly 
>enforce that convention?

Why would you want to?  The better question is what applications are 
not being built because they are either impossible or far too complex 
to do in the current environment. BTW, these won't be brought up at 
IETF meetings; they will be squelched long before they get there.  You 
can put a more complete system in place and let the old one co-exist.

>>This argues for three fields: demux ID (still needed), protocol, and
>>service name.
>>
>>At that point, we could allow people to use HTTP for DNS exchanges if
>>they _really_ wanted, rather than the DNS protocol. I'm not sure that's
>>the point of the exercise, but modularity is a good idea.
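
To illustrate the three-field idea above (demux ID, protocol, service 
name), here is a minimal Python sketch.  The header layout (a 16-bit 
demux ID followed by two length-prefixed ASCII strings) and the 
pack_header()/unpack_header() names are assumptions made purely for 
illustration; no such header exists in TCP or any deployed protocol.

```python
import struct

_FIXED = "!HBB"  # demux ID (16 bits), protocol length, service length

def pack_header(demux_id, protocol, service):
    """Encode the three fields into a hypothetical wire format."""
    p = protocol.encode("ascii")
    s = service.encode("ascii")
    return struct.pack(_FIXED, demux_id, len(p), len(s)) + p + s

def unpack_header(data):
    """Decode (demux_id, protocol, service) from the hypothetical format."""
    demux_id, plen, slen = struct.unpack_from(_FIXED, data)
    off = struct.calcsize(_FIXED)
    protocol = data[off:off + plen].decode("ascii")
    service = data[off + plen:off + plen + slen].decode("ascii")
    return demux_id, protocol, service
```

Because protocol and service are separate fields here, a "dns" service 
carried over "http" is just another combination of values.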
>
>Is giving people more ways to do the same thing inherently a good 
>thing?  Seems to me that at some point it degrades interoperability 
>without adding much in the way of new functionality.

I think it is a good thing to give people ways to do the things they 
can't do.  Or ways that are much simpler so they can concentrate on doing 
their thing rather than our thing.

Take care,
John