Computer communications generally require reliable
data transfers among the communicating parties. This is usually achieved by
implementing reliability at different levels in the protocol stack, either at
the data-link layer or using end-to-end protocols at the transport level (such
as TCP), or directly in the application. Usually ARQ protocols (such as
Go-Back-N, or TCP) are used to achieve this reliability, which might be highly
inefficient when retransmissions are needed, or there is network congestion. In
addition, if we could avoid the feedback required by these protocols we could
achieve better throughput. Of course not all ARQ protocols can be avoided. If a
low level network interface uses ARQ to reliably transmit the data there is
nothing we can do about it.
However the main problem of these protocols is that
they don’t have natural extension to many-to-one transmission, which we use in
our application. In fact what we are trying to achieve is exactly this - a
reliable many-to-one transmission, using FEC codes.
When we use FEC codes, we say that for a file
consisting of ‘k’ packets, it is enough to receive any ‘k’
packets from any combination of sources, in any order to being able to
reconstruct the original file. TCP connection gives us more - it promises that
all the packets received from the different sources are the first packets
at this source. For example if we have a 1MB file, and we get 512K from the
first server, 256K from the second server and another 256K from the third
server, we know that those 512K are the first 512K at the first server, and each
256K are the first 256K at both second and the third servers. We don’t need
this, since we can do well even if these were the last packets at each server,
and we pay a price for something we don’t need. A natural step would be to
extend our system to use an unreliable UDP protocol, which does not impose all
the overhead of TCP, and therefore is more efficient. Of course a special care
should be taken when too many packets are lost and retransmission is needed.
In order to do that step over into UDP there is a
need to build a server that will support our algorithms over UDP connections.
Since in UDP packets can be lost or be duplicated
we will need to augment our packets with a special header,
that will includes the index of the packet in the encoded file.
We suggest the following protocol for the UDP
implementations:
A TCP control connection is opened with the server
at the beginning of the download process and is closed when all the file has
been received. The control connection is used to request files from the UDP
server and to inform it that we don’t want to receive more packets from the
server.
The data itself is send over UDP.
The TCP control connection can also be used as a
fallback if the UDP packets do not arrive due to network problems or network
restrictions (for example some firewalls don’t allow UDP packets to pass trhu).
Dynamic Mirrors
In our implementation we defined our mirrors sites
in a static way, i.e, we produced an HTML file containing those links which
later on will be used by the applet in order to open multiple connection to the
mirror sites.
We limited the size of this predefined group up to
eight servers due to the limitations on our FEC implementation.
Though this limitation doesn’t bother us because
in any perspective, eight multiple connection is more than enough (even too
much).
Choosing the predefined group of servers
dynamically should do a more important improvement.
A reasonable mirroring service (such as SunSite)
has much more than eight servers and the load on those servers is changing
rapidly during the day.
A better solution will be a clever choice of the
predefined group of servers.
A reasonable solution will be a server that will
know the topology of the network at any given time and will dispatch client
calls for multiple downloadable connection in a way that those servers will be
the best servers at that time.
The definition of “best servers” should
consider bandwidth, server resources available, CPU usage, and bottleneck
disjoints paths to the client.
By saying “bottleneck disjoints paths to the
client” we mean that a server that will share a common path with another
server in the predefined group may induce congestion because the new connection
will interfere with the performance of existing connections. Therefore we should
care that our selection won’t induce packet loss or slow packet transmission
in another mirror site in the group.
Bulk Data Transfers
We can use a variant of our implementation for bulk
data transfers (e.g. Email, FTP, and News….) when the user who pays for the
time being online but does not use the bandwidth he pays for due to networks
congestion.
When downloading that kind of data to one’s
computer the whole process of decoding and reconstruction can be done offline
(user is not online) due to the fact that the user will want to use this
information later on (e.g. a MP3 zipped file).
By doing the decoding offline we get pure multiple
connections downloading without any overhead in terms of user’s time online.
This can be effective mostly for users who pay for
time online.
ARQ vs. FEC
ARQ is based on the retransmission of missing data
packets, which is triggered upon implicit ACK/NACK’s from the receiver’s.
ARQ is generally used in unicast protocols because
it is very effective, simple to implement, and it does not require any
processing of the data stream.
Its biggest drawback is the need for a feedback
channel and the time required recovering missing packets (a full RTT).
This could be important limitation when the channel
travels through a satellite or when there is real-time application involved
(e.g. full duplex video/audio conferencing systems…).
ARQ scales poorly for multicast protocols and large
groups, because of two main reasons. First the sender might have to deal with
exceedingly growing number of ACK’s or NACK’s. Second the chance for
uncorrelated packet losses grows as the size of the group grows.
Another problem with ARQ in multicast protocols is
that it requires a precise feedback from the receivers in order to decide which
packets to retransmit.
FEC can tolerate some amount of losses. Therefore,
by using Forward Error Correcting codes, the packet loss at the receiver can be
made arbitrarily small, at a price of sending redundant data. The special cases
when there are too many losses can be handled than by usual ARQ techniques.
FEC-based multicast protocols scale much better for
large groups than ARQ-based protocols, since uncorrelated packet losses will be
handled at the receiver and will not occupy the multicast line.