Lab 25: Additional Java

A. Intro

Download the code for Lab 25 and create a new Eclipse project out of it.

Learning Goals

In today's lab, we'd like to cover some loose ends that all Java programmers should know about. You can't really claim to know Java until you at least understand the basics of this stuff. First, we'll go over a class known as StringBuilder, then investigate wildcard types, and then learn a little bit about packages, importing, and running Java from the terminal (rather than Eclipse). We'll finish by learning about JAR files.

B. String Immutability

In Java, Strings are immutable, which means that once a String is created, it can never be changed. "What?" you protest, "I've added things to a String all the time." But you haven't. Perhaps you're thinking of code like this:

String s = "love";
s = s + "hate";
System.out.println(s); // prints lovehate, so s has been changed, hasn't it?

Actually, you haven't modified the original String "love". Instead, this code sets s to be an entirely new String, which is "lovehate". How do we know it's a new String? Because we can do the following:

String s1 = "love";
String s2 = s1;
s1 = s1 + "hate";
System.out.println(s1); // prints lovehate
System.out.println(s2); // prints love

So we can see that the original String remains unchanged.

Why does this matter? Because every time you append to a String, you have to create an entirely new String, which takes time proportional to the length of the whole new String. For example, consider the following code:

String s = ... // some String that is length 1000000000
s = s + "a";

We create a really long String, and then append a single character to the end. But, because in the append step we have to create an entirely new String, this takes time proportional to 1000000001, not 1.

Self-test: Time Needed to Append in a Loop

Consider the following method. What is an asymptotic estimate of its runtime? Express your answer in terms of N and L, where L is the length of s.

public static String multiply(String s, int N) {
    String results = s;
    for (int i = 0; i < N; i++) {
        results += s;
    }
    return results;
}
O(N)
Incorrect. This would be true if adding a String could be done in constant time, but instead it takes time proportional to the length of the Strings being added.
O(N * L)
Incorrect. This would be true if adding two Strings could be done in time proportional to the length of the second String. But actually, it takes time proportional to the length of the entire resulting String.
O(N^2 * L)
Correct! Adding the first two Strings takes L + L time, or 2L time. Adding the next String in takes 2L + L time, or 3L. The next takes 3L + L or 4L. Overall, the time taken is 2L + 3L + 4L + ... + (N + 1)L, which is proportional to N^2 * L.
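The arithmetic behind the quadratic answer can be checked by counting characters instead of timing anything. The sketch below assumes a simple cost model in which each += writes out every character of the new String; the class name CopyCount and the method copiedByMultiply are made up for this illustration and are not part of the lab code.

```java
public class CopyCount {
    /**
     * Counts the characters written by multiply(s, N) under an assumed cost
     * model: each += builds a brand-new String and copies every character
     * of the result. L is the length of s.
     */
    static long copiedByMultiply(long L, long N) {
        long length = L;      // results starts out as s
        long copied = 0;
        for (long i = 0; i < N; i++) {
            length += L;      // length of the new String after this append
            copied += length; // the whole new String is written out
        }
        return copied;
    }

    public static void main(String[] args) {
        long L = 10;
        for (long N = 1000; N <= 8000; N *= 2) {
            System.out.println("N = " + N + ", characters copied = "
                    + copiedByMultiply(L, N));
        }
    }
}
```

Under this model the total works out to L * N * (N + 3) / 2, so doubling N roughly quadruples the count, which is the signature of N^2 growth.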

Introducing StringBuilder

Because adding Strings together is slow, Java has another class, called StringBuilder, which is kind of like a String, but is mutable, and thus supports fast appending. Here's how you could write the previous method with StringBuilder.

public static String fastMultiply(String s, int N) {
    StringBuilder results = new StringBuilder(s); // create a StringBuilder from s
    for (int i = 0; i < N; i++) {
        results.append(s); // use StringBuilder's fast append
    }
    return results.toString(); // turn the StringBuilder back into a String
}

Notice we aren't setting results to a new StringBuilder each time, and are instead just modifying the same one. Because of this, appending to a StringBuilder doesn't require copying over the whole object on every append. The StringBuilder does occasionally have to grow its internal storage, but averaged over many appends, each append only takes time proportional to the length of the String you're appending.

Self-test: Appending to StringBuilder

How long does the fastMultiply method take?

O(N)
Incorrect. This would be true if adding a String could be done in constant time, but instead it takes time proportional to the length of the String being added.
O(N * L)
Correct! This is true because append only takes time proportional to the length of the String being appended, rather than the size of the whole resulting String.
O(N^2 * L)
Incorrect. Using StringBuilder will be faster than using just Strings, because we don't have to copy over the whole old StringBuilder every time we append to it.

String Experiments

Experiment with the multiply and fastMultiply methods. Try to find the smallest N such that multiply takes longer than 10 seconds. How long does fastMultiply take for this N? Can you find an N such that fastMultiply takes 10 seconds?

Describe your results in a file, string_experiments.txt, which you'll submit at the end of the lab.
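One possible way to run these experiments is a small timing harness like the sketch below. It repeats the two methods from above so that it is self-contained; the class name StringTiming, the test String "ab", and the starting values of N are arbitrary choices for illustration. Timings from System.nanoTime are rough and will vary between runs and machines.

```java
public class StringTiming {
    // Same as the lab's multiply: repeated String concatenation.
    public static String multiply(String s, int N) {
        String results = s;
        for (int i = 0; i < N; i++) {
            results += s;
        }
        return results;
    }

    // Same as the lab's fastMultiply: appends to one StringBuilder.
    public static String fastMultiply(String s, int N) {
        StringBuilder results = new StringBuilder(s);
        for (int i = 0; i < N; i++) {
            results.append(s);
        }
        return results.toString();
    }

    public static void main(String[] args) {
        String s = "ab";
        // Double N each round; raise the upper bound to find your own 10-second N.
        for (int N = 1000; N <= 16000; N *= 2) {
            long start = System.nanoTime();
            String slow = multiply(s, N);
            long slowMs = (System.nanoTime() - start) / 1_000_000;

            start = System.nanoTime();
            String fast = fastMultiply(s, N);
            long fastMs = (System.nanoTime() - start) / 1_000_000;

            // Both methods should build exactly the same String.
            System.out.println("N = " + N + ": multiply " + slowMs
                    + " ms, fastMultiply " + fastMs + " ms, same result: "
                    + slow.equals(fast));
        }
    }
}
```

Doubling N on each round makes the growth pattern easy to see: if the time for multiply roughly quadruples per round while fastMultiply roughly doubles, that matches the estimates from the self-tests.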

The Moral of the Story

The difference between using Strings and StringBuilder is so extreme that Java programmers say you should never, ever append to Strings in a loop. It's okay to do a few isolated appends, but if you ever append repeatedly in a loop, you should always convert to StringBuilder, then do the appends, then convert back to String.

C. Wildcard Types

A Problem with Generics and Inheritance

Back when we worked with generics, we noticed a very curious fact. Suppose we have a class Point, which has a subclass TracedPoint. It was okay to do:

Point p = new TracedPoint(3, 4);

However, it was not okay to do:

ArrayList<Point> points = new ArrayList<TracedPoint>();

Although Point is the superclass of TracedPoint, ArrayList<Point> is not a superclass of ArrayList<TracedPoint>! This seems unfortunate. After all, say we have a method that prints out the x coordinates of a lot of points:

public static void printPoints(ArrayList<Point> points) {
    for (Point p: points) {
        System.out.println(p.getX());
    }
}

We would not be allowed to pass in an ArrayList<TracedPoint> as an argument! How disappointing.

Of course, the real question is why this happens.

Say we allowed the assignment ArrayList<Point> points = new ArrayList<TracedPoint>(); If we did, then we'd get a problem: the compiler, which sees the static type, would think that you're allowed to add any subclass of Point to points. For example, the compiler would expect that you'd be allowed to do:

// assume WeirdPoint is a subclass of Point but not TracedPoint
points.add(new WeirdPoint(4, 7)); 

All the compiler knows is that points is an ArrayList of Point objects, so it thinks points should be able to contain any subclass of Point. However, this doesn't make sense, because we know the dynamic type of points is ArrayList<TracedPoint>, so we know it should not be allowed to contain WeirdPoints. Therefore, storing an ArrayList<TracedPoint> in an ArrayList<Point> reference is unsafe.

Introducing Wildcards

What can be done instead? Java introduces a special type known as a wildcard type to help resolve this problem. Here's the syntax:

ArrayList<? extends Point> points = new ArrayList<TracedPoint>();

Here's how you read it: "points is an ArrayList of some subclass of Point, but I'm not exactly sure which subclass."

Now if we had the following method:

public static void printPoints(ArrayList<? extends Point> points) {
    for (Point p: points) {
        System.out.println(p.getX());
    }
}

We would be able to pass in as an argument an ArrayList<TracedPoint> and it would work. This method can handle an ArrayList of any subclass of Point.
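Here is a minimal runnable sketch of the same idea. The Point and TracedPoint classes below are simplified stand-ins written only for this example (the lab's real classes may look different), and the method sums the x coordinates instead of printing them so that the result is easy to check.

```java
import java.util.ArrayList;

public class WildcardDemo {
    // Hypothetical stand-ins for the Point and TracedPoint classes.
    static class Point {
        private final int x;
        private final int y;
        Point(int x, int y) { this.x = x; this.y = y; }
        int getX() { return x; }
        int getY() { return y; }
    }

    static class TracedPoint extends Point {
        TracedPoint(int x, int y) { super(x, y); }
    }

    // Accepts an ArrayList of Point or of any subclass of Point.
    static int sumX(ArrayList<? extends Point> points) {
        int total = 0;
        for (Point p : points) {
            total += p.getX();
        }
        return total;
    }

    public static void main(String[] args) {
        ArrayList<TracedPoint> traced = new ArrayList<>();
        traced.add(new TracedPoint(3, 4));
        traced.add(new TracedPoint(5, 6));

        ArrayList<Point> plain = new ArrayList<>();
        plain.add(new Point(1, 2));

        System.out.println(sumX(traced)); // prints 8
        System.out.println(sumX(plain));  // prints 1
    }
}
```

If the parameter type were changed to ArrayList<Point>, the call sumX(traced) would no longer compile, which is exactly the problem described earlier.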

A Strange Asymmetry

Wildcard types have a strange asymmetry associated with them. Consider our points variable in the method above. We ARE allowed to do this:

Point p = points.get(0);

However, we are NOT allowed to do this:

Point p = new Point(0, 2);
points.add(p);

In other words, we can get things from points, but we cannot put things into points. This is a unique consequence of using wildcard types. Verify this fact for yourself in Eclipse. Is there any way to add an object to points?

In a text file, wildcards.txt, describe the results of your experiment. Then explain why this makes sense. This asymmetry is not simply an arbitrary rule, but instead follows logically from our understanding of what the compiler knows and allows. In the text file, try to work through the logic of the compiler, and show how you could reason through this result yourself. You'll be submitting this at the end of the lab.

D. Packages

We will now shift gears to discussing packages.

Earlier in this course, you may have come across the line

package [package name];

at the top of a Java file. You may have also noticed that Eclipse likes to use packages to organize classes in projects. For example, when creating a class, Eclipse recommends that you add the class to a package, though so far in the class, we've told you to ignore this recommendation.

We've never had you use packages because the coding you do for this class is lightweight enough that more organization wouldn't be helpful. However, even though you haven't created your own packages, you have been using them all along, potentially without noticing.

So What Is a Package?

A package is a grouping of related classes (and interfaces) that provides namespace management and access protection. Sometimes packages are simply used for their organizational benefits.

At first glance, packages simply correspond to the folders a class is in. For example, if your code is in a package named proj1, then it must also be in a folder named proj1/. If your code is in a package named cs61bl.proj1, then it must be in a nested folder cs61bl/proj1/.

The reverse is not necessarily true. If a Java file is in a folder, then it's not necessarily in a package.

Namespace Management and Importing

One of the main benefits of packages outside of organization is namespace management. Namespace management refers to the concept of preventing naming collisions. After all, there are a lot of programmers out there writing a lot of Java files, some of which are bound to have the same name. We have to have some way to deal with this.

Say you're writing a program and you want to use two different classes named ArrayList, which were written by two different people. The packages they're in can be used to distinguish them. All you have to do is use the complete package name.

For example, the ArrayList you know and love is from the package java.util. But, what if I wrote my own completely unrelated class that happens to be named ArrayList, and I put it in a package called cs61bl? How could you use this ArrayList instead? You could instantiate either type of ArrayList in your code like this:

java.util.ArrayList arr1 = new java.util.ArrayList();
cs61bl.ArrayList arr2 = new cs61bl.ArrayList();

It turns out, in general, whenever you refer to a class, you must specify the complete package.

"But wait!" you protest, "In this class I've been using ArrayList all the time, but I never called it java.util.ArrayList. What's the deal?"

Good observation. The reason you never had to call it java.util.ArrayList before is that you imported it, with a statement like:

import java.util.ArrayList;

This import statement does nothing more than tell Java that when we write ArrayList, we really mean java.util.ArrayList, and not some other ArrayList class. In other words, the import statement does nothing more than allow you to abbreviate class names. There is never a time when you need to use import. All it does is save you some typing.
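To see that import really is just an abbreviation, here is a small sketch of a class that contains no import statements at all. The class name NoImportDemo is made up for this example. Instead of the hypothetical cs61bl.ArrayList, it uses a real name collision from the standard library: java.util.Date and java.sql.Date are two different classes that share the simple name Date (although, unlike the ArrayList scenario above, these two happen to be related by inheritance). String needs no prefix because the package java.lang is always imported automatically.

```java
public class NoImportDemo {
    static String describe() {
        // No import statements anywhere: every class is written with its full name.
        java.util.ArrayList<String> words = new java.util.ArrayList<>();
        words.add("hello");

        // Two different classes that share the simple name Date.
        java.util.Date utilDate = new java.util.Date(0L);
        java.sql.Date sqlDate = new java.sql.Date(0L);

        // getName() reports the full class name, package included.
        return words.getClass().getName() + " "
                + utilDate.getClass().getName() + " "
                + sqlDate.getClass().getName();
    }

    public static void main(String[] args) {
        // prints java.util.ArrayList java.util.Date java.sql.Date
        System.out.println(describe());
    }
}
```

Adding import java.util.Date; at the top would let the first Date be written without its package, but the second one would still need its full name java.sql.Date to tell the two apart.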

E. Java and the Terminal

So far, we've always recommended you run your Java programs from within Eclipse. All Java programmers should know the basics of running programs from the terminal too, though.

This isn't so hard, right? If you are in a folder with a file HelloWorld.class, you can run the file by typing java HelloWorld.

The Classpath

But, did you know you can actually run a Java program from any location on your computer? The class file doesn't even have to be in the same folder!

When you type java HelloWorld, Java will be able to run the class as long as Java knows where to look for it. By default, Java only looks in the current folder, but you can tell it to look in other places too. The way you do this is by adding these places to something called the classpath, which you can set with the -cp option.

For example, suppose you want to run a class file that is inside a folder called deep, relative to the current folder. All you have to do is type:

java -cp deep HelloWorld

The option -cp deep tells Java to look in the deep folder for the HelloWorld class. If you omit the -cp option altogether, then it is like you had typed

java -cp . HelloWorld

So Java just searches the current directory. If there is no HelloWorld.class in the current directory, then this command fails with an error.

What if you want to search multiple directories? You can add multiple locations to the classpath, and separate them with colons.

java -cp .:deep:other HelloWorld

would search for the HelloWorld class in the current directory, in the deep directory, and in the other directory.

(Windows users should separate with semicolons instead of colons, by the way)
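If you are ever unsure which classpath Java ended up with, you can ask the running program. The sketch below reads the standard system property java.class.path and splits it using File.pathSeparator, which is a colon or a semicolon depending on the platform. The class name ShowClasspath and the helper entries are made up for this example.

```java
import java.io.File;
import java.util.regex.Pattern;

public class ShowClasspath {
    /** Splits a classpath string into its entries using this platform's separator. */
    static String[] entries(String classpath) {
        // Pattern.quote keeps the separator from being treated as regex syntax.
        return classpath.split(Pattern.quote(File.pathSeparator));
    }

    public static void main(String[] args) {
        // The JVM exposes the classpath it was started with as a system property.
        String classpath = System.getProperty("java.class.path");
        for (String entry : entries(classpath)) {
            System.out.println(entry);
        }
    }
}
```

Compiled into the current folder and run with the three-location -cp option shown above, this should list the current directory, deep, and other on separate lines.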

Quick gotcha

Notice! We do not use the line

java deep/HelloWorld

This will not work! deep/HelloWorld is not the name of the class. The name of the class is simply HelloWorld, even though it happens to be in a folder deep.

Packages Though

We said earlier that a Java class in a folder is not necessarily in a package, but a Java class in a package is necessarily in a folder. The previous section on running programs in folders only applied to programs not in packages. It's actually simpler to run programs in packages. Recall that the package name really is a part of the class name. Say you want to run the class HelloWorld in the package greetings. Then you should type the full name

java greetings.HelloWorld

In this case, Java is smart enough to know it should look in the greetings folder, so you don't have to include it in the classpath. This works as long as the greetings package folder is in the current folder.

Self-test: Running In Folders and Packages

Consider the following directory structure

home/
    emotions/
        deep/
            greetings/
                friendly/
                    HelloWorld.class

You're in the home/ folder, and you want to run HelloWorld. The class is in the package greetings.friendly; emotions/ and deep/ are just ordinary folders, not packages. What's the correct command?

java greetings.friendly.HelloWorld
Incorrect. Although Java can use the package names to search folders, it doesn't know to look into emotions/deep in the first place, so it doesn't see the start of the package greetings.
java -cp emotions greetings.friendly.HelloWorld
Incorrect. This will tell Java to look in emotions/ for the package greetings, but it won't find it there.
java -cp emotions/deep greetings.friendly.HelloWorld
Correct! This will tell Java to look in emotions/deep/. In there, it will see the package greetings, and then follow that until it finds HelloWorld.
java -cp emotions.deep greetings.friendly.HelloWorld
Incorrect. Dots are only used to separate packages, not normal folders, which emotions/ and deep/ are.
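The reasoning in this self-test can be written down as a tiny model. This is only a simplified sketch of the lookup, not how Java is actually implemented (it ignores JAR files and the standard library, for instance): each classpath entry is a starting folder, and the dots in the full class name turn into folder separators below it. The class name ClassLookupSketch and the method candidateFiles are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class ClassLookupSketch {
    /**
     * For each classpath entry, builds the file path where a class with the
     * given full name would have to live. Dots in the class name become
     * folder separators; the classpath entry is the starting folder.
     */
    static List<String> candidateFiles(String[] classpathEntries, String fullClassName) {
        String relative = fullClassName.replace('.', '/') + ".class";
        List<String> candidates = new ArrayList<>();
        for (String entry : classpathEntries) {
            candidates.add(entry + "/" + relative);
        }
        return candidates;
    }

    public static void main(String[] args) {
        String[] classpath = {"emotions/deep"};
        // prints [emotions/deep/greetings/friendly/HelloWorld.class]
        System.out.println(candidateFiles(classpath, "greetings.friendly.HelloWorld"));
    }
}
```

Trying the other answers in this model shows why they fail: with the classpath entry emotions, the candidate is emotions/greetings/friendly/HelloWorld.class, which does not exist in the directory structure above.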

Compiling in Folders and Packages

Earlier we mentioned that you could add multiple different locations to the classpath. But why would you want to do this, if you can only run one file at a time? We'll see a couple of reasons why this is necessary. One is that it can be useful for compiling one class that depends on others in different locations.

For example, say you're trying to compile the file A.java which references the classes B and C. In order to compile A.java, the classes B and C have to be on the classpath.

This extends to using packages as well. When compiling a class coolpackage.A which references the classes oddpackage.B and weirdpackage.C, the locations of the packages oddpackage and weirdpackage must be on the classpath.

Potentially confusing point: When we run Java programs, we always use the class name (with packages). If the class is in another folder, we have to include that folder on the classpath. When we compile, however, we use the file name, not the class name. Say you want to compile a file HelloWorld.java inside a folder deep/. The correct command is actually

javac deep/HelloWorld.java

Notice that this differs from how we would run the class. When running a Java program, -cp is used to indicate where the class can be found. When compiling a Java file, -cp is used to indicate the location of classes and packages that the file might depend on.

Compiling Experiment

In the zip file we gave you, there is a directory structure:

emotions/
    deep/
        greetings/
            HelloWorld.java
        feelings/
            gratitude/
                ThankYou.java

What is the sequence of commands you need to do in order to compile and run HelloWorld.java from the folder above emotions/? Write this down in a file compiling.txt, which you'll be submitting at the end of the lab. Hint: It may be worth your time to inspect the source code for HelloWorld and ThankYou.

F. Using Other People's Code

Other Code and You

In this class you created a lot of your own Java classes, but you also used a lot of classes that other people wrote, such as ArrayList, HashMap, and StringBuilder. These classes were all included by default in Java. But often it's useful to use classes written by other people that aren't included by default in Java. You can download these off the internet, often in the form of JAR files.

JAR Files

JAR stands for Java Archive. It is a package file format used to aggregate many Java .class files along with associated metadata and resources (like text, images, etc.). Essentially, a JAR file is nothing more than a .zip file for Java projects.
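The claim that a JAR is essentially a .zip file can be checked directly. The sketch below writes a tiny JAR with the standard JarOutputStream class and then reads it back with the plain ZipFile class, which knows nothing about JARs. The class name JarIsZip, the entry name, and the placeholder bytes are all made up for this example; the entry is not a real class file.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class JarIsZip {
    /** Writes a tiny JAR containing one placeholder entry and returns the file. */
    static File writeSampleJar() throws IOException {
        File jar = File.createTempFile("sample", ".jar");
        jar.deleteOnExit();
        try (JarOutputStream out = new JarOutputStream(new FileOutputStream(jar))) {
            out.putNextEntry(new JarEntry("greetings/HelloWorld.class"));
            out.write(new byte[] {1, 2, 3}); // placeholder bytes, not a real class file
            out.closeEntry();
        }
        return jar;
    }

    /** Lists the entry names of an archive using the plain ZIP reader. */
    static List<String> entryNames(File archive) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zip = new ZipFile(archive)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        return names;
    }

    /** Convenience wrapper: builds the sample JAR and lists it as a ZIP. */
    static List<String> sampleEntryNames() {
        try {
            return entryNames(writeSampleJar());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // prints [greetings/HelloWorld.class]
        System.out.println(sampleEntryNames());
    }
}
```

Notice that the entry name contains a folder path. Inside a JAR, classes are stored in folders that match their packages, just as they are on disk.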

JAR files are generally used to distribute application software or libraries. JAR files are used because it is often unreasonable to give many separate, unorganized .class files to developers. Packaging all these .class files together into a runnable archive makes it easy to share complicated Java programs with other people.

One example of JAR files that we have used quite frequently in this course is the JUnit JAR files. JUnit is a complicated package with many different class files. These are all packaged into two JUnit JARs, which are all you need to use JUnit testing methods.

Your Own JAR File

If you are interested in creating a JAR archive for a library that you have created, you can read more about it in Oracle's tutorials on creating a JAR file. The rest of the lab will focus on using someone else's JAR that you have downloaded.

Exercise: Compiling and Running Using a JAR

Let's get some practice using JARs. Do you remember the StdDraw class back from project 1? We used it to help us draw pictures. For project 1, we just gave this class to you as a .java file. But the normal way to download it would have been to download it in a JAR. We've given you this class (along with a bunch of others you won't need) in a file stdlib.jar.

Write your own class, Circle, that uses StdDraw to draw a circle when run (it doesn't matter exactly what the circle looks like; just make it a circle). Then, try to compile and run Circle from the terminal. Since Circle depends on a class inside a JAR, you'll need to include that JAR in your classpath when compiling and running Circle. Note that, when using JAR files, you must include the JAR itself on the classpath, and not just the folder that the JAR is in.

Hint: To complete this task, you might want to have a look inside the JAR to see the source code of StdDraw, so you can see the methods it has. Try figuring out how to do this on your own, by searching what to do on the internet.

Using JARs in Eclipse

Using JARs in Eclipse is a little simpler. You don't need to include a -cp option to set the classpath. Instead, Eclipse lets you set the build path as a setting for your project. To add a JAR file to a project, right click on the project, then navigate to Build Path >> Add External Archives. Then select the JAR file on your computer. Now, you can use the classes in that JAR anywhere in your project.

G. Conclusion

As lab25, please turn in string_experiments.txt, wildcards.txt, compiling.txt, and Circle.java.