Converting a Code Sample from Java 7 to Java 8


Java 8 dropped into the world earlier this year and developers are still trying to get their heads around many of the concepts. It is by far the single biggest upgrade to the language since its inception and brings a ton of new concepts and features. Naturally, it's going to take time for people to become familiar with it, and for sites like Stack Exchange to fill up with answers to the many questions that inevitably arise.

One of the things I've found to be lacking is concrete examples larger than a couple of lines of code. In this article I will take you through a basic Java 7 application and upgrade it to make use of some of the new Java 8 features, in particular lambdas, whilst sharing my thoughts on some of the new language features.

The Application

In this article we will write a program to allow us to amalgamate a set of mailing lists. We will then provide basic search functionality, as well as a count of the email providers that appear in the list. The application will take a directory of files as input. Each of these files is a mailing list; a set of email addresses separated by new lines (we assume all the emails are valid).

I will openly caveat that this is a simple example. There are bits I’ve clearly missed, such as data validation and optimisation of how the files are read in. But it should serve as a decent exercise in how to go about upgrading your code!

This is the interface we will implement:

package com.corejavainterviewquestions;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
public interface EmailScanner {
    int numberOfEmails();
    Map<String, Long> totalByDomains();
    List<String> emailsContain(String search);
}
The first method returns the total emails across all files; the second a map of each domain to the number of times it occurs, and the last returns all email addresses which contain the search string.

First off we start with a test. In this test, we create three files with varying names and populate them with test data. We then use an EmailScanner to check that the results match the data we put in.

package com.corejavainterviewquestions;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.PrintWriter;
import java.io.UnsupportedEncodingException;
import java.util.Map;
import static org.hamcrest.CoreMatchers.is;
import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.Matchers.containsInAnyOrder;
public class EmailScannerTest {
    private static final String GMAIL = "gmail.com";
    private static final String TEST_DIR = "testdata/";
    private final String filename1 = "emails1.txt";
    private final String filename2 = "emails2.txt";
    private final String filename3 = "weirederName.txt";

    @Before
    public void createFiles() {
        // PrintWriter cannot create the parent directory, so make sure it exists.
        new File(TEST_DIR).mkdirs();
        createFile(filename1, "hello@corejavainterviewquestions.com", "hello@" + GMAIL, "something@hotmail.com");
        createFile(filename2, "sam@corejavainterviewquestions.com", "test@" + GMAIL, "awesome@hotmail.com");
        createFile(filename3, "1@hotmail.com", "1@" + GMAIL, "2@hotmail.com", "3@hotmail.com");
    }

    @After
    public void clean() {
        // Delete from the directory the files were actually written to.
        new File(TEST_DIR + filename1).delete();
        new File(TEST_DIR + filename2).delete();
        new File(TEST_DIR + filename3).delete();
    }

    @Test
    public void java8Scanner() throws Exception {
        EmailScanner emailScanner = new EmailScannerJava8(new File(TEST_DIR).toURI());
        testScanner(emailScanner);
    }

    @Test
    public void java7Scanner() throws Exception {
        EmailScanner emailScanner = new EmailScannerJava7(new File(TEST_DIR).toURI());
        testScanner(emailScanner);
    }

    private void testScanner(EmailScanner emailScanner) {
        // 3 + 3 + 4 addresses across the three files
        assertThat(emailScanner.numberOfEmails(), is(10));
        assertThat(emailScanner.emailsContain("corejavainterviewquestions"),
                containsInAnyOrder("hello@corejavainterviewquestions.com",
                        "sam@corejavainterviewquestions.com"));
        Map<String, Long> domainsToEmailCount = emailScanner.totalByDomains();
        assertThat(domainsToEmailCount.keySet().size(), is(3));
        assertThat(domainsToEmailCount.get(GMAIL), is(3L));
    }

    private void createFile(String filename, String... lines) {
        // try-with-resources closes the writer even if writing fails
        try (PrintWriter writer = new PrintWriter(TEST_DIR + filename, "UTF-8")) {
            for (String line : lines) {
                writer.println(line);
            }
        } catch (FileNotFoundException | UnsupportedEncodingException e) {
            throw new RuntimeException(e);
        }
    }
}

Pretty clear so far! At this point the Java 7 and Java 8 implementations are both blank, and so the tests fail.

First, let’s look at getting the Java 7 version to pass. The easiest way to start is to implement the numberOfEmails() functionality. All we need to do is read each of the files in line by line and add the lines to a result list. We can make use of the try-with-resources functionality brought in with Java 7, along with the new Paths and Files APIs, which make this much easier than it was in Java 6.

public int numberOfEmails() {
    List<String> result = readFiles();
    return result.size();
}

private List<String> readFiles() {
    List<String> result = new ArrayList<>();
    Path dir = Paths.get(directory);
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
        for (Path entry : stream) {
            // Java 7 needs an explicit charset here (java.nio.charset.StandardCharsets);
            // the single-argument readAllLines(Path) only arrived in Java 8.
            result.addAll(Files.readAllLines(entry, StandardCharsets.UTF_8));
        }
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
    return result;
}

I’m opting to read the file in on each call as we’re only dealing with a tiny amount of data. I’ve pulled out the file reading as a separate method as it’s going to be used multiple times.

My first impression is that this code isn’t pretty, and you definitely need to read over it a couple of times to understand what it’s doing. The burden of checked exceptions adds an extra layer of complexity to the code too. Overall though, it’s simple and works well.

Now let’s turn to Java 8. Alongside lambdas and the new Streams API, Java 8 adds the method Files.walk(Path), which gives us a Stream of every path under a directory.

public int numberOfEmails() {
    return (int) allFileLines().count();
}

private Stream<String> allFileLines() {
    try {
        return Files.walk(Paths.get(directory))
                .filter(p -> Files.isRegularFile(p))
                .flatMap(path -> {
                    try {
                        return Files.lines(path);
                    } catch (Exception e) {
                        throw new RuntimeException(e);
                    }
                });
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
}


So what on earth is going on here? The first line is as we discussed: Files.walk gives us a Stream of every path under the directory. We then need to filter for “regular” files, because Files.walk returns every entry it visits, including the root directory itself. This is not ideal, but it does give us a nice example of how filter takes a boolean expression to decide whether each item stays in the stream.
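To see the root directory turning up for yourself, here is a minimal sketch. The class name WalkDemo, its helper methods and the temporary file names are all made up for illustration; only Files.walk, Files.isRegularFile and the other java.nio.file calls come from the JDK.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

class WalkDemo {
    // Number of entries Files.walk yields for dir, before any filtering.
    static long allEntries(Path dir) {
        try (Stream<Path> paths = Files.walk(dir)) {
            return paths.count();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Number of entries that are regular files.
    static long regularFiles(Path dir) {
        try (Stream<Path> paths = Files.walk(dir)) {
            return paths.filter(p -> Files.isRegularFile(p)).count();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Creates a throwaway directory holding two empty files (illustrative names).
    static Path sampleDirectory() {
        try {
            Path dir = Files.createTempDirectory("walk-demo");
            Files.createFile(dir.resolve("emails1.txt"));
            Files.createFile(dir.resolve("emails2.txt"));
            return dir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path dir = sampleDirectory();
        System.out.println(allEntries(dir));   // prints 3: the directory itself plus two files
        System.out.println(regularFiles(dir)); // prints 2
    }
}
```

The try-with-resources around Files.walk is my own addition here: the returned stream holds open directory handles, so closing it is the tidy thing to do.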

We then move to flatMap. On each iteration we are converting one file to multiple lines, which are then aggregated together: a perfect use of flatMap. We can use the new Files.lines to return a Stream of Strings. We return a Stream so that we can continue to use lambdas in the implementation of the other methods.
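If flatMap is new to you, it may help to see it without any file handling in the way. In this sketch each inner list stands in for the lines of one file; the class name FlatMapDemo and the sample addresses are invented for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class FlatMapDemo {
    // Each inner list plays the role of one mailing-list file.
    static List<String> flatten(List<List<String>> files) {
        return files.stream()
                .flatMap(List::stream) // one "file" in, many lines out, merged into one stream
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<String>> files = Arrays.asList(
                Arrays.asList("a@gmail.com", "b@hotmail.com"),
                Arrays.asList("c@gmail.com"));
        System.out.println(flatten(files)); // prints [a@gmail.com, b@hotmail.com, c@gmail.com]
    }
}
```

Had we used map instead of flatMap we would have ended up with a Stream of Streams, which is rarely what you want.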

You can’t have failed to notice the ugly syntax caused by the two try/catch blocks required due to the scoping of the code. Some developers have already started writing their own wrapper code to wrap the checked exception as an unchecked one so this isn’t needed, which I would recommend you do if you’re using Java 8 in anger.
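As a rough idea of what such a wrapper might look like, here is a minimal sketch. Nothing in it is a standard JDK helper: the Unchecked class, the IOFunction interface and the unchecked method are names I have made up, and I have chosen to rethrow as UncheckedIOException (which is new in Java 8) rather than a plain RuntimeException.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Function;
import java.util.stream.Stream;

class Unchecked {
    // Hypothetical counterpart to java.util.function.Function whose body may throw IOException.
    @FunctionalInterface
    interface IOFunction<T, R> {
        R apply(T t) throws IOException;
    }

    // Adapts an IOFunction to a plain Function, rethrowing IOException unchecked.
    static <T, R> Function<T, R> unchecked(IOFunction<T, R> f) {
        return t -> {
            try {
                return f.apply(t);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        };
    }

    // The same pipeline as allFileLines(), with the inner try/catch moved into the wrapper.
    static Stream<String> allFileLines(Path directory) {
        Function<Path, Stream<String>> lines = unchecked(Files::lines);
        try {
            return Files.walk(directory)
                    .filter(p -> Files.isRegularFile(p))
                    .flatMap(lines);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The outer try/catch is still there because Files.walk itself throws IOException before any lambda runs; the wrapper only helps with code inside the lambdas.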

Despite all these quirks I do really like the syntax, and it’s nice and easy when thinking about how to implement it. I want to loop through the files, then loop through the content of those things, and I can nicely chain that together in lambdas without having any nesting.

The other thing to note is the cast from long to int. Stream.count() returns a long, as does the counting() collector, so if your current code returns ints then be prepared to do some casting.
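A plain (int) cast silently truncates if the count is ever too big. If that worries you, one option (not what the code in this article does) is Math.toIntExact, also new in Java 8, which throws an ArithmeticException instead. The class and method names below are just for illustration.

```java
import java.util.stream.Stream;

class CountDemo {
    // Stream.count() returns long; toIntExact fails loudly on overflow instead of truncating.
    static int countAsInt(Stream<?> stream) {
        return Math.toIntExact(stream.count());
    }

    public static void main(String[] args) {
        System.out.println(countAsInt(Stream.of("a@gmail.com", "b@hotmail.com"))); // prints 2
    }
}
```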

Basic search

Now we get to the exciting parts. In Java 7 you will have implemented a basic list search hundreds of times. Iterate over the collection, add the matching items to a result list and then return it.

public List<String> emailsContain(String search) {
    List<String> result = new ArrayList<>();
    List<String> emails = readFiles();
    for (String email : emails) {
        if (email.contains(search)) result.add(email);
    }
    return result;
}
You can probably do this with your eyes shut. Syntactically it’s OK, although when I code this I get frustrated at the boilerplate code needed to create a result list.

In our Java 8 version we start to see how powerful streams can be as a way to make things more concise and readable. As we touched on earlier, we can now directly apply a filter to the stream (functionality provided by Google’s Guava library in Java 7 and earlier) and return the results.

public List<String> emailsContain(String search) {
    return allFileLines()
            .filter(s -> s.contains(search))
            .collect(Collectors.toList());
}

The collector at the bottom may seem a little strange when you first start coding like this. Streams also allow you to directly return an Object array or an iterator. I’m not sure why a simple toList terminator wasn’t built into Java 8 but using a collector isn’t the end of the world.

Aggregation

For the next piece of code we iterate through the emails and aggregate based on the domain and the number of times each one appears. In Java 7 this is relatively clunky. The solution is to loop through the emails, splitting on the ‘@’ symbol to get the domain. We check if the domain is present, and if so we increment the count. Otherwise we add the domain and set the count to one.

public Map<String, Long> totalByDomains() {
    Map<String, Long> result = new HashMap<>();
    List<String> strings = readFiles();
    for (String string : strings) {
        String domain = string.split("@")[1];
        if (result.containsKey(domain))
            result.put(domain, result.get(domain) + 1);
        else
            result.put(domain, 1L);
    }
    return result;
}

This is another piece of boilerplate code that all developers have had to write. In particular, the code to check if the key is present or not before adding or updating the value is fairly painful to read, and is a potential source of coding errors.

This is the sort of problem which Java 8 is built for.

public Map<String, Long> totalByDomains() {
    return allFileLines()
            .collect(Collectors.groupingBy(s -> s.split("@")[1], counting()));
}

One line of code! We simply collect the results up using Collectors.groupingBy. The first argument specifies the thing to group by, which for us is the domain. The second argument details how we would like to aggregate the data. In this example we’re simply counting the number of matching instances but there’s an array of options available in the API. An example from the javadoc:

Collectors.summingInt(Employee::getSalary)

Syntactically I think it’s really clear what’s going on here and the syntax is incredibly concise. If there is no second argument the result would be Map<String, List<String>>, where each List holds all of the emails for that domain.

The finished solution

Java 7

package com.corejavainterviewquestions;
import java.io.IOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
public class EmailScannerJava7 implements EmailScanner {
    private URI directory;

    public EmailScannerJava7(URI directory) throws IOException {
        this.directory = directory;
    }

    public int numberOfEmails() {
        List<String> result = readFiles();
        return result.size();
    }

    private List<String> readFiles() {
        List<String> result = new ArrayList<>();
        Path dir = Paths.get(directory);
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path entry : stream) {
                // Java 7 needs an explicit charset; readAllLines(Path) alone is Java 8 only.
                result.addAll(Files.readAllLines(entry, StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        return result;
    }

    public Map<String, Long> totalByDomains() {
        Map<String, Long> result = new HashMap<>();
        List<String> strings = readFiles();
        for (String string : strings) {
            String domain = string.split("@")[1];
            if (result.containsKey(domain))
                result.put(domain, result.get(domain) + 1);
            else
                result.put(domain, 1L);
        }
        return result;
    }

    public List<String> emailsContain(String search) {
        List<String> result = new ArrayList<>();
        List<String> emails = readFiles();
        for (String email : emails) {
            if (email.contains(search)) result.add(email);
        }
        return result;
    }
}

Java 8

package com.corejavainterviewquestions;
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import static java.util.stream.Collectors.counting;
public class EmailScannerJava8 implements EmailScanner {
    private URI directory;

    public EmailScannerJava8(URI directory) throws IOException {
        this.directory = directory;
    }

    public int numberOfEmails() {
        return (int) allFileLines().count();
    }

    public Map<String, Long> totalByDomains() {
        return allFileLines()
                .collect(Collectors.groupingBy(s -> s.split("@")[1], counting()));
    }

    public List<String> emailsContain(String search) {
        return allFileLines()
                .filter(s -> s.contains(search))
                .collect(Collectors.toList());
    }

    private Stream<String> allFileLines() {
        try {
            return Files.walk(Paths.get(directory))
                    .filter(p -> Files.isRegularFile(p))
                    .flatMap(path -> {
                        try {
                            return Files.lines(path);
                        } catch (Exception e) {
                            throw new RuntimeException(e);
                        }
                    });
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
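If you want to play with the search and aggregation pipelines without setting up any files, something like the following sketch should do. The GroupingDemo class, its method names and the three sample addresses are all invented for the example; it also shows the toArray terminator and the one-argument form of groupingBy mentioned earlier.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupingDemo {
    static final List<String> EMAILS =
            Arrays.asList("a@gmail.com", "b@hotmail.com", "c@gmail.com");

    // Search, collected into a List via a collector.
    static List<String> containing(List<String> emails, String search) {
        return emails.stream()
                .filter(s -> s.contains(search))
                .collect(Collectors.toList());
    }

    // The same search, returned as an array; no collector needed.
    static String[] containingAsArray(List<String> emails, String search) {
        return emails.stream()
                .filter(s -> s.contains(search))
                .toArray(String[]::new);
    }

    // groupingBy with a downstream collector: domain -> number of addresses.
    static Map<String, Long> countByDomain(List<String> emails) {
        return emails.stream()
                .collect(Collectors.groupingBy(s -> s.split("@")[1], Collectors.counting()));
    }

    // groupingBy with no second argument: domain -> the addresses themselves.
    static Map<String, List<String>> emailsByDomain(List<String> emails) {
        return emails.stream()
                .collect(Collectors.groupingBy(s -> s.split("@")[1]));
    }

    public static void main(String[] args) {
        System.out.println(containing(EMAILS, "gmail"));               // prints [a@gmail.com, c@gmail.com]
        System.out.println(containingAsArray(EMAILS, "gmail").length); // prints 2
        System.out.println(countByDomain(EMAILS).get("gmail.com"));    // prints 2
        System.out.println(emailsByDomain(EMAILS).get("hotmail.com")); // prints [b@hotmail.com]
    }
}
```

I look values up with get rather than printing the whole map, because groupingBy makes no promise about the order of the keys.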

 
Lambdas are cool but...

I don’t want to suggest that lambdas are all sunshine and roses. The syntax is generally clean and readable, and I think it’s going to improve life as a Java developer on the whole. However, the Java 7 code in this example took me all of 10 minutes to put together, whereas the Java 8 code took me over an hour. Whilst some of that is inevitably due to my inexperience with some of the new syntax and the lack of examples online, a lot of it comes down to how difficult it is to debug. Anyone who has programmed in Scala before can attest to the pain of its cryptic error messages, and lambdas suffer from a lot of the same pain. It can be really hard to figure out what’s going on.
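One thing that can take the edge off is Stream.peek, which lets you look at elements as they flow past each stage without changing them. This is only a sketch of the idea, not something I used for the code above; the PeekDemo class and the trace list are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class PeekDemo {
    // Records what each stage of the pipeline sees into the supplied trace list.
    static List<String> gmailOnly(List<String> emails, List<String> trace) {
        return emails.stream()
                .peek(e -> trace.add("read " + e))
                .filter(e -> e.endsWith("@gmail.com"))
                .peek(e -> trace.add("kept " + e))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> trace = new ArrayList<>();
        gmailOnly(Arrays.asList("a@gmail.com", "b@hotmail.com"), trace);
        System.out.println(trace); // prints [read a@gmail.com, kept a@gmail.com, read b@hotmail.com]
    }
}
```

Notice that each element travels through the whole pipeline before the next one is read, which is worth knowing when you are trying to follow a stack trace.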

It’s going to get easier as we all spend more time with them, and some beautiful code is going to be written. But there’s going to be a learning curve to get there.

Hopefully this has given a useful insight into how you can use Java 8 to improve the appearance of your code base. This is obviously very high level and doesn’t even touch on features such as parallel streams, but it’s important to get the basics right first. 

Posted 18 December, 2014

samberic

Java Geek, Entrepreneur and All Round Technologist

Founder of www.corejavainterviewquestions.com and all round geek.
