Posted by: lrrp | September 12, 2006

Copying Files from the Internet by Dr. Heinz M. Kabutz (for JDK version: JDK 1.5)

Sometimes you need to download files using HTTP from a machine that you cannot run a browser on. In this simple Java program we show you how this is done. We include progress information for those who are impatient, and look at how the volatile keyword can be used.

Copying Files from the Internet

Part of the job of installing our own dedicated server involves downloading software from the internet onto our machine. I did not want to punch a hole in my router to allow me to open up an X session onto the server. Considering my slow internet connection, I also did not want to first download the files onto my machine, then upload onto the server.

A technique that I have used many times for downloading files from the internet is to open up a URL, grab the bytes, and write them to a local file. Here is a small program that does this for you. You can specify any URL, and it will fetch the file from the internet for you and show you the progress.

You can either specify both the URL and the destination filename, or give only the URL and let the Sucker work out the filename for itself.
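The filename is worked out by the one-argument constructor in the listing below, which strips everything up to the last slash with replaceAll(). Here is a minimal sketch of that step on its own. The class name FileNameFromUrl, the method fileNameOf and the example.com URLs are made up for illustration.

```java
// Sketch: derive a local filename from a URL string, in the same way as
// the one-argument Sucker constructor. The URLs are invented examples.
class FileNameFromUrl {
    static String fileNameOf(String path) {
        // ".*" is greedy, so ".*\\/" consumes everything up to and
        // including the LAST slash, leaving only the final path segment.
        return path.replaceAll(".*\\/", "");
    }

    public static void main(String[] args) {
        // prints "photo.jpg"
        System.out.println(fileNameOf("http://www.example.com/pics/photo.jpg"));
        // prints "[]": a URL ending in a slash leaves an empty name
        System.out.println("[" + fileNameOf("http://www.example.com/pics/") + "]");
    }
}
```

Note the second case: when the URL ends in a slash there is nothing left to use as a filename, so it is safer to pass the target file explicitly.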

Some URLs can tell you how many bytes the content is, others do not reveal that information. I use the Strategy Pattern to differentiate between the two. We have a top level Strategy class called Stats and two implementations, BasicStats and ProgressStats.

The stats are displayed in a background thread. This means that the Stats class has to ensure that changes to the fields are visible to the background thread.

In my System.out.println(), I output a new Date() to show the progress of the download. This is usually a bad practice. It would be better to use the DateFormat to reduce the amount of processing that needs to be done to display the date.
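One way to read that advice is to create a single DateFormat up front and reuse it for every message. The following is only a sketch under that assumption. The class name Timestamps, the method stamp and the "HH:mm:ss" pattern are my own choices, not part of the Sucker program.

```java
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Sketch: one shared formatter instead of Date.toString() on every call.
class Timestamps {
    // SimpleDateFormat is not thread-safe, so all access goes through
    // the synchronized stamp() method below.
    private static final DateFormat FORMAT =
        new SimpleDateFormat("HH:mm:ss", Locale.US);

    static synchronized String stamp(Date when, String message) {
        return FORMAT.format(when) + " " + message;
    }

    public static void main(String[] args) {
        System.out.println(stamp(new Date(), "Constructing Sucker"));
    }
}
```

The synchronization matters here because the progress is printed from a background thread while the main thread also logs messages.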

The last comment about this class is the size of the buffer. At the moment it is set to 1MB. This is larger than necessary: a single read() seldom fills the whole buffer, so the actual length returned will often be much smaller.

    import java.io.*;
    import java.net.URL;
    import java.util.*;

    public class Sucker {
        private final String outputFile;
        private final Stats stats;
        private final URL url;

        public Sucker(String path, String outputFile) throws IOException {
            this.outputFile = outputFile;
            System.out.println(new Date() + " Constructing Sucker");
            url = new URL(path);
            System.out.println(new Date() + " Connected to URL");
            stats = Stats.make(url);
        }

        // derive the target filename from the last segment of the URL
        public Sucker(String path) throws IOException {
            this(path, path.replaceAll(".*\\/", ""));
        }

        private void downloadFile() throws IOException {
            // print the progress once a second from a background thread
            Timer timer = new Timer();
            timer.schedule(new TimerTask() {
                public void run() {
                    stats.print();
                }
            }, 1000, 1000);

            try {
                System.out.println(new Date() + " Opening Streams");
                InputStream in = url.openStream();
                OutputStream out = new FileOutputStream(outputFile);
                System.out.println(new Date() + " Streams opened");

                byte[] buf = new byte[1024 * 1024];
                int length;
                // read() returns the number of bytes read, or -1 at the end
                while ((length = in.read(buf)) != -1) {
                    out.write(buf, 0, length);
                    stats.bytes(length);
                }
                in.close();
                out.close();
            } finally {
                // stop the timer thread and print the final figures
                timer.cancel();
                stats.print();
            }
        }

        private static void usage() {
            System.out.println("Usage: java Sucker URL [targetfile]");
            System.out.println("\tThis will download the file at the URL " +
                "to the targetfile location");
        }

        public static void main(String[] args) throws IOException {
            Sucker sucker;
            switch (args.length) {
                case 1: sucker = new Sucker(args[0]); break;
                case 2: sucker = new Sucker(args[0], args[1]); break;
                default: usage(); return;
            }
            sucker.downloadFile();
        }
    }
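The read loop at the heart of downloadFile() can be tried in isolation, without touching the network. This is a minimal sketch that assumes the usual InputStream.read(byte[]) idiom. The class CopyLoop, the method copy and the 8 KB buffer are illustrative choices of mine, used to show that a much smaller buffer than 1MB copies the data just as correctly.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Sketch: the buffer-based copy loop, exercised with in-memory streams.
class CopyLoop {
    // Copies everything from in to out; returns the number of bytes copied.
    static long copy(InputStream in, OutputStream out, int bufferSize)
            throws IOException {
        byte[] buf = new byte[bufferSize];
        long total = 0;
        int length;
        // read() reports how many bytes it placed in buf, or -1 at the end
        while ((length = in.read(buf)) != -1) {
            out.write(buf, 0, length); // write only the bytes actually read
            total += length;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[20000];
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), out, 8 * 1024);
        System.out.println(copied + " bytes copied"); // prints "20000 bytes copied"
    }
}
```

The important detail is the third argument to write(): passing length rather than buf.length keeps stale bytes from a previous, larger read out of the output file.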

The Stats class needs a little bit of explaining. The field totalBytes is written to by one thread, and read from by another. Since we are writing with only one thread, we can get away with just making the field volatile. We have to make it at least volatile to ensure that the timer thread can see our changes.
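To see the single-writer argument on its own, here is a small sketch. The class VolatileCounter and the method runDemo are hypothetical names; only the volatile int field and the bytes() method mirror the Stats class.

```java
// Sketch: one thread writes a volatile field, another thread reads it.
class VolatileCounter {
    private volatile int totalBytes;

    void bytes(int length) {
        // "+=" is a read followed by a write, so it is not atomic. That is
        // acceptable here only because exactly one thread ever calls bytes().
        totalBytes += length;
    }

    int total() {
        return totalBytes;
    }

    static int runDemo() {
        final VolatileCounter counter = new VolatileCounter();
        Thread writer = new Thread(new Runnable() {
            public void run() {
                for (int i = 0; i < 1000; i++) {
                    counter.bytes(1024);
                }
            }
        });
        writer.start();
        // The reading thread polls until it sees the final value. Because
        // the field is volatile, the writer's updates become visible to it.
        while (counter.total() < 1000 * 1024) {
            Thread.yield();
        }
        return counter.total();
    }

    public static void main(String[] args) {
        System.out.println(runDemo()); // prints 1024000
    }
}
```

If a second thread also called bytes(), updates could be lost and volatile alone would no longer be enough. You would then need something like AtomicInteger or synchronization.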

The printf() format string "%10d KB%5s%% (%d KB/s)%n" looks beautiful, does it not? The %10d means a decimal number, right justified in a field 10 characters wide. The “KB” stands for kilobytes. The %5s means a String, right justified in a field 5 characters wide. Then we have a %%, which represents the % sign. The last %d is a plain decimal number for the transfer rate. The newline is done with %n. Cryptic I know, but for experienced C programmers this should read like poetry 🙂
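If the format string is hard to picture, this small sketch applies it to two sets of sample values. The class FormatDemo and the method line are made up for illustration, and the %n is left off so the result can be shown between brackets.

```java
// Sketch: the progress format applied to sample values.
class FormatDemo {
    static String line(int kb, String percent, int kbPerSecond) {
        return String.format("%10d KB%5s%% (%d KB/s)", kb, percent, kbPerSecond);
    }

    public static void main(String[] args) {
        // prints "[       322 KB  100% (46 KB/s)]"
        System.out.println("[" + line(322, "100", 46) + "]");
        // prints "[         6 KB  ???% (6 KB/s)]"
        System.out.println("[" + line(6, "???", 6) + "]");
    }
}
```

The second call shows what BasicStats produces when the content length is unknown: the "???" fills the same 5-character column as the percentage.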

The Stats class contains a factory method that returns a different strategy, depending on whether the content length is known. Having the factory method inside Stats allows us to introduce new types of Stats without modifying the context class, in this case Sucker.

    import java.io.IOException;
    import java.net.URL;
    import java.net.URLConnection;
    import java.util.Date;

    public abstract class Stats {
        // written by the download thread, read by the timer thread
        private volatile int totalBytes;
        private final long start = System.currentTimeMillis();

        public int seconds() {
            int result = (int) ((System.currentTimeMillis() - start) / 1000);
            return result == 0 ? 1 : result; // avoid div by zero
        }

        public void bytes(int length) {
            totalBytes += length;
        }

        public void print() {
            int kbpersecond = (int) (totalBytes / seconds() / 1024);
            System.out.printf("%10d KB%5s%% (%d KB/s)%n", totalBytes/1024,
                calculatePercentageComplete(totalBytes), kbpersecond);
        }

        public abstract String calculatePercentageComplete(int bytes);

        // factory method: picks the strategy based on the content length
        public static Stats make(URL url) throws IOException {
            System.out.println(new Date() + " Opening connection to URL");
            URLConnection con = url.openConnection();
            System.out.println(new Date() + " Getting content length");
            int size = con.getContentLength();
            return size == -1 ? new BasicStats() : new ProgressStats(size);
        }
    }

The ProgressStats class is used when we know the content length of the URL, otherwise BasicStats is used.

    public class ProgressStats extends Stats {
        private final long contentLength;

        public ProgressStats(long contentLength) {
            this.contentLength = contentLength;
        }

        public String calculatePercentageComplete(int totalBytes) {
            // 100L forces long arithmetic so the multiplication cannot overflow
            return Long.toString((totalBytes * 100L / contentLength));
        }
    }

    public class BasicStats extends Stats {
        public String calculatePercentageComplete(int totalBytes) {
            return "???";
        }
    }
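The 100L in ProgressStats is easy to overlook, so here is a sketch of why it is there. The class PercentDemo and its two methods are hypothetical, and the byte counts are invented. With a plain int 100, the multiplication overflows once totalBytes exceeds roughly 21 million.

```java
// Sketch: long versus int arithmetic in the percentage calculation.
class PercentDemo {
    // Same expression as ProgressStats: 100L promotes the product to long.
    static long percentLong(int totalBytes, long contentLength) {
        return totalBytes * 100L / contentLength;
    }

    // Faulty variant: the int product overflows before the division.
    static long percentInt(int totalBytes, long contentLength) {
        return (totalBytes * 100) / contentLength;
    }

    public static void main(String[] args) {
        int done = 30000000;      // 30 million bytes downloaded
        long total = 60000000L;   // 60 million bytes expected
        System.out.println(percentLong(done, total)); // prints 50
        System.out.println(percentInt(done, total));  // prints a negative number
    }
}
```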

Let’s run the Sucker class. To download a picture of me at the Tsinghua University in China, you would do the following:

java Sucker

which produces the following output on my slow connection to the internet:

    Wed Mar 08 12:24:27 GMT+02:00 2006 Constructing Sucker
    Wed Mar 08 12:24:27 GMT+02:00 2006 Connected to URL
    Wed Mar 08 12:24:27 GMT+02:00 2006 Opening connection to URL
    Wed Mar 08 12:24:27 GMT+02:00 2006 Getting content length
    Wed Mar 08 12:24:27 GMT+02:00 2006 Opening Streams
    Wed Mar 08 12:24:28 GMT+02:00 2006 Streams opened
             6 KB    2% (6 KB/s)
            56 KB   17% (28 KB/s)
           104 KB   32% (34 KB/s)
           158 KB   49% (39 KB/s)
           203 KB   63% (40 KB/s)
           257 KB   79% (42 KB/s)
           295 KB   91% (42 KB/s)
           322 KB  100% (46 KB/s)

When I tried downloading the latest Tomcat version from my server, the speed was far more acceptable:

    Wed Mar 08 11:25:52 CET 2006 Constructing Sucker
    Wed Mar 08 11:25:52 CET 2006 Connected to URL
    Wed Mar 08 11:25:52 CET 2006 Opening connection to URL
    Wed Mar 08 11:25:52 CET 2006 Getting content length
    Wed Mar 08 11:25:57 CET 2006 Opening Streams
    Wed Mar 08 11:25:58 CET 2006 Streams opened
          1056 KB   18% (1056 KB/s)
          2272 KB   38% (1136 KB/s)
          3200 KB   54% (1066 KB/s)
          4121 KB   70% (1030 KB/s)
          5200 KB   89% (1040 KB/s)
          5829 KB  100% (1165 KB/s)

There are ways of running this through a proxy as well, which you apparently do like this (according to my friends Pat Cousins and Leon Swanepoel):

    System.getProperties().put("proxySet", "true");
    System.getProperties().put("proxyHost", "");
    System.getProperties().put("proxyPort", "8080");
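For comparison, the property names documented for the JDK's HTTP handler are http.proxyHost and http.proxyPort, and since JDK 1.5 there is also the java.net.Proxy class for configuring a single connection. The following is an untested-against-a-real-proxy sketch of both options. The class ProxySetup, its method names and the host proxy.example.com are placeholders of mine.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;

// Sketch: two ways of pointing HTTP connections at a proxy.
class ProxySetup {
    // Option 1: JVM-wide, using the documented networking properties.
    static void useSystemProperties(String host, int port) {
        System.setProperty("http.proxyHost", host);
        System.setProperty("http.proxyPort", Integer.toString(port));
    }

    // Option 2: a Proxy object that can be passed to url.openConnection(proxy)
    // for one connection only. createUnresolved() avoids a DNS lookup here.
    static Proxy httpProxy(String host, int port) {
        return new Proxy(Proxy.Type.HTTP,
            InetSocketAddress.createUnresolved(host, port));
    }

    public static void main(String[] args) {
        useSystemProperties("proxy.example.com", 8080); // placeholder host
        System.out.println(System.getProperty("http.proxyHost"));
        System.out.println(httpProxy("proxy.example.com", 8080).type()); // prints HTTP
    }
}
```

To use the second option in Sucker, the calls to url.openStream() and url.openConnection() would have to become url.openConnection(proxy), followed by getInputStream() for the download.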

If you need to supply a password, you can do that by changing the authenticator:

    // needs java.net.Authenticator and java.net.PasswordAuthentication
    Authenticator.setDefault(new Authenticator() {
        protected PasswordAuthentication getPasswordAuthentication() {
            return new PasswordAuthentication(
                "username", "password".toCharArray());
        }
    });

I have not tried this out myself, so use at your own risk 🙂

That is all for this week. Thank you for your continued support by reading this newsletter, and forwarding it to your friends 🙂

Kind regards



