Benefits and Drawbacks of our Project

Why use our algorithm when there are other ways to download data from multiple
mirror sites? The simplest is to start downloading the file from one of the
servers and, if the download is slow, stop it and try another (hopefully
faster) mirror site. Some programs, such as GetRight, try to predict which
server is most likely to be fast and download from that server. However, if the
prediction is wrong we are stuck with a slow server. Another approach is to
split the file into chunks and download each chunk from a different server. The
problem with this approach is that if one of the servers is slow or down, we
cannot receive the complete file. Our approach eliminates the difficulties of
the methods described above.

The main advantages of our approach are:

· Better throughput at the client side. Downloading from several mirrors is
  faster than downloading from a single site. This holds only when the
  bottleneck is at the server side and the client has available bandwidth that
  goes unused because of that bottleneck. Because our algorithm opens
  connections to several mirror sites, it uses more of the available bandwidth
  and therefore improves throughput when possible.

· Fault tolerance. Our algorithm lets the user download a file even if all the
  servers but one are down. As explained before, every server holds a complete
  image of the file, so connecting to that one server is enough to complete the
  download. (This is also true when a server crashes during the download.) With
  the methods described above, a server failure can result in a lost download
  if the servers do not support resuming.

· Less unnecessary data sent on the network. With the methods described above,
  if we decide to stop using a server, all the data already received from that
  server is lost. In our case, even a single kilobyte received from a server
  can be used. This means that the servers send less data overall, so the load
  on the servers decreases, which in turn reduces the chance of a server
  crashing.

· Highly parallel download. We receive data from the servers all the time. The
  methods above are likely to end up waiting for a single server to send its
  part of the data: some connections may be very fast, but the slowest
  connection holds up the whole download. Our algorithm does not have this
  problem, since we download from all the servers all the time until we have
  enough data to complete the download. Fast servers naturally contribute more
  to the downloaded file, and no time is wasted waiting for “slow” servers.

The main disadvantages of our approach are:

- Using our algorithm involves downloading an applet, which takes time and
  cannot be performed in parallel. A possible solution is to install the applet
  permanently on the user’s computer.

- Using this method without checking the network topology may put more load on
  the network and/or its routers. Opening multiple connections that eventually
  route through the same router will overload that router and lead to poor
  results for all network users. A solution might be to run “traceroute” in the
  background and determine the topology of the part of the network that
  interests us.

- It does not consider the network load or the load on the servers. We could
  improve performance by avoiding busy servers when more servers are available
  than we are going to connect to. The scalability problem is addressed later
  in this paper.

- The current implementation requires the user to trust that our applet will
  not harm the user’s computer. The user may need to install a root CA that he
  has no reason to trust. Explorer users must manually allow our applet to step
  out of the sandbox, a task that only experienced users can perform. This can
  be solved by purchasing an object signing certificate (for $200) for
  Netscape, and an Authenticode certificate for IE for the same price.

- Our method is not applicable when mirror sites are not available (that is,
  when we have only one server).

- Redundant packets are sent over the network, for two possible reasons. The
  first is specific to our implementation: when we download the last packets of
  every strip, some packets are redundant and do not contribute to the
  download. This can be solved by working with a single strip of 1MB. The
  second is the TCP window size: if we need only 1KB of data to complete the
  download and TCP delivers 64KB, the remaining 63KB is useless and is
  discarded. This can be solved by running our own server for the downloads,
  and can easily be included in the implementation of the UDP server described
  below.

- CPU load while decoding the chunks. Our download method is not recommended on
  slow machines or on machines that lack a JIT compiler for Java.
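
The throughput and waiting-time arguments made in the advantages above can be
illustrated with a very rough model. The sketch below is not part of our
implementation: it assumes constant server rates, ignores encoding and protocol
overhead, and assumes (as claimed above) that every kilobyte received from any
server is useful. The function names and the numbers in the usage note are
hypothetical.

```python
import math


def single_server_time(file_size_kb, rate_kb_s):
    """Seconds needed to fetch the whole file from one mirror."""
    return math.inf if rate_kb_s <= 0 else file_size_kb / rate_kb_s


def chunk_split_time(file_size_kb, rates_kb_s):
    """One equal chunk per mirror: we finish only when the slowest chunk arrives.

    Assumes the client link is never the bottleneck.
    """
    chunk = file_size_kb / len(rates_kb_s)
    return max(single_server_time(chunk, rate) for rate in rates_kb_s)


def parallel_time(file_size_kb, rates_kb_s, client_bandwidth_kb_s):
    """All mirrors send useful data all the time, capped by the client link."""
    aggregate = min(sum(rates_kb_s), client_bandwidth_kb_s)
    return math.inf if aggregate <= 0 else file_size_kb / aggregate
```

For a 1200KB file and three mirrors sending at 100, 50 and 10 KB/s, this model
gives 12 seconds from the fastest single mirror, 40 seconds for the chunk-split
method (held up by the 10 KB/s mirror) and 7.5 seconds for the parallel
download. If the client can receive only 120 KB/s, the parallel download takes
10 seconds, which shows why the gain exists only while the bottleneck is at the
server side. If one mirror is down (rate 0), the chunk-split time becomes
infinite while the parallel download still completes.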
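
The traceroute idea mentioned among the disadvantages above could be used
roughly as follows. This is only a sketch under stated assumptions: the hop
lists are taken as already collected (no traceroute call is made here), the
addresses are placeholders from documentation ranges, and the greedy selection
policy in `pick_disjoint_mirrors` is a hypothetical choice, not something our
implementation does.

```python
from collections import defaultdict


def shared_hops(paths):
    """Return the routers that appear on more than one path.

    paths: dict of mirror name -> list of hop addresses on the route to it.
    Result: dict of hop address -> sorted list of mirrors routed through it.
    """
    mirrors_by_hop = defaultdict(set)
    for mirror, hops in paths.items():
        for hop in hops:
            mirrors_by_hop[hop].add(mirror)
    return {hop: sorted(m) for hop, m in mirrors_by_hop.items() if len(m) > 1}


def pick_disjoint_mirrors(paths, ignore=()):
    """Greedily keep mirrors whose routes share no router with those already kept.

    Hops listed in `ignore` (for example the client's own gateway, which every
    route must cross) are not counted as shared.
    """
    used, chosen = set(), []
    for mirror, hops in paths.items():
        relevant = set(hops) - set(ignore)
        if relevant.isdisjoint(used):
            chosen.append(mirror)
            used |= relevant
    return chosen


# Hypothetical hop lists for three mirrors.
example_paths = {
    "mirror-a": ["192.0.2.1", "198.51.100.1", "203.0.113.10"],
    "mirror-b": ["192.0.2.1", "198.51.100.1", "203.0.113.20"],
    "mirror-c": ["192.0.2.1", "198.51.100.7", "203.0.113.30"],
}
```

With these hop lists, `shared_hops(example_paths)` reports the gateway
192.0.2.1 (shared by all three routes) and 198.51.100.1 (shared by mirror-a and
mirror-b), and `pick_disjoint_mirrors(example_paths, ignore={"192.0.2.1"})`
keeps mirror-a and mirror-c.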
Please contact Genady or Nir regarding copyright issues