[e2e] Application Layer Video Adaptation
shrin.krishnan at gmail.com
Thu Mar 2 09:03:17 PST 2006
Recently there has been discussion on the list about using the application
layer more effectively to boost network performance. The work Ketan
Mayer-Patel and I are doing at UNC is based on using application
conditions to drive the layers below for video streaming, i.e. which
packets to send, how long to wait, etc.
The interest behind this research stems from a teleimmersion system
for Remote Medical Consultation [http://www.cs.unc.edu/Research/nlm/]
being built at UNC. At any given time there is a set of 8 cameras
looking at a given subject and transmitting images back to the
server. The server then has to send these images/video on demand to a
client. The cameras are also controlled remotely by the user and his
region of interest. So if the user is interested in the lower quadrant,
the camera closest to that quadrant should be the one sending the images.
However, if we use a traditional fully reliable network protocol,
say TCP, it will send the video frames to the user quite
unintelligently. TCP sends each frame regardless of any temporal
and spatial dependencies that might be present in the video stream.
To overcome this problem we have developed an adaptation layer that
sits between the application and network layers and drives which
frame/packet to send next. The original algorithm was developed by
David Gotz here at UNC for providing multicast adaptation at the
client side [http://www.cs.unc.edu/~kmp/publications/mm2004_gotz/mm2004_gotz.pdf].
We have taken this algorithm and changed it to perform server-side
video adaptation, i.e. the server decides, based on user input, which
packets are best suited to the user's current needs.
The algorithm is based on a graph utility space: all the video
encoders at the application layer add their video frames as nodes to
this graph. In the teleimmersion space we have 8 encoders producing
video streams at 3 different resolutions. This results in a
3-dimensional (Time, Camera, Resolution) utility space over which we need
to make adaptation decisions. The edges between nodes
represent inter-frame dependencies (I-P-P-P-I, or lower to higher
resolution). Each node also has an ID that gives its
(Time, Camera, Resolution) coordinates. Finally, each node can be in one of 3 states:
Idle (node just added), Available (in the process of being sent; previous
dependencies have been resolved), and Resolved (information sent over).
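The node structure above can be sketched roughly as follows (a minimal
illustration, not our actual implementation; the names FrameNode, State, and
update_state are made up for this example):

```python
from dataclasses import dataclass, field
from enum import Enum

class State(Enum):
    IDLE = "idle"            # node just added by an encoder
    AVAILABLE = "available"  # dependencies resolved, eligible to send
    RESOLVED = "resolved"    # information sent over

@dataclass
class FrameNode:
    # coordinates in the 3-D (Time, Camera, Resolution) utility space
    time: int
    camera: int        # 0..7 for the 8 encoders
    resolution: int    # 0..2 for the 3 resolutions
    deps: list = field(default_factory=list)  # edges: frames this one depends on
    state: State = State.IDLE

    @property
    def node_id(self):
        # the ID is just the node's position in the utility space
        return (self.time, self.camera, self.resolution)

    def update_state(self):
        # a node becomes available once every frame it depends on
        # (e.g. the preceding I frame) has been resolved
        if self.state is State.IDLE and all(
                d.state is State.RESOLVED for d in self.deps):
            self.state = State.AVAILABLE
```

A P frame, for instance, stays Idle until the I frame it references has been
resolved, at which point it becomes Available for sending.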
The adaptation decision is then based on a utility-cost ratio, where
utility is based on the Euclidean distance from a given reference frame
closest to the point of interest to the frame being sent, and cost is just
the number of packets needed to transmit the frame.
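As a rough sketch of the selection step (the exact scoring function is not
spelled out here; the inverse-distance utility below is one plausible choice,
and pick_next/utility_cost_ratio are illustrative names):

```python
import math

def euclidean_distance(coord, point_of_interest):
    # distance in the (time, camera, resolution) utility space
    return math.sqrt(sum((a - b) ** 2
                         for a, b in zip(coord, point_of_interest)))

def utility_cost_ratio(coord, point_of_interest, cost_packets):
    # assumed scoring: utility falls off with distance from the region
    # of interest; cost is the number of packets still to send
    utility = 1.0 / (1.0 + euclidean_distance(coord, point_of_interest))
    return utility / cost_packets

def pick_next(available, point_of_interest):
    # send the available frame with the best utility/cost ratio
    return max(available,
               key=lambda n: utility_cost_ratio(n["coord"],
                                                point_of_interest,
                                                n["cost"]))
```

So a frame near the user's region of interest that needs few packets wins over
a distant or expensive one.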
Regarding DCCP, the UDP-TFRC protocol I defined in an earlier post
works quite similarly to DCCP but has some added features. A notion of
smart reliability has been added: in the graph model I described, we
add an extra node for each existing node and associate with it the cost
of retransmission. Each frame has a set number of packets associated
with it, and if we encounter a loss we add this loss as a cost to the
new node. Nodes which have been successfully sent have zero cost and
can be used as a reference frame to send any new frame.
For example, say we
experience a packet loss; by the time we get the loss feedback an I
frame has been produced, so our adaptation decision will send the I frame
and never resend the old lost packet. Or say our region of interest
changes: we will now use a new reference frame, send those packets,
and forget about the loss. We also have the concept of a time bound, i.e.
we will only look at nodes, say, 30 frames back; the rest are not used
in the computation. This takes into account the 500 ms latency.
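The smart-reliability bookkeeping described above might look something like
this (a simplified sketch; the function name and dict fields are invented for
illustration):

```python
def smart_reliability_candidates(nodes, current_time, window=30):
    """Return send candidates under the time bound, adding a shadow
    retransmission node for every frame that lost packets."""
    candidates = []
    for node in nodes:
        # time bound: ignore frames more than `window` frame times old;
        # this absorbs the ~500 ms loss-feedback latency
        if current_time - node["time"] > window:
            continue
        if node["lost_packets"] == 0:
            # fully delivered (or untouched) frame: keep its normal cost
            candidates.append(node)
        else:
            # shadow node: same frame, but its cost is only the
            # retransmission of the lost packets
            shadow = dict(node, cost=node["lost_packets"])
            candidates.append(shadow)
    return candidates
```

The adaptation then simply ranks these candidates by utility/cost, so a fresh
I frame can naturally outrank retransmitting an old lost packet once the
region of interest has moved on.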
The best way to describe the adaptation process is that it's like
late/dynamic binding: only the appropriate packets are sent, and the
decision is made on the fly in real time.
Dept. Of Computer Science
Office: (919) 962-1920
Email: krishnan at cs.unc.edu, shrin.krishnan at gmail.com