Saturday, March 24, 2012

JDK7: Part 1 - The power of Java 7 NIO.2 (JSR 203) (important concepts)

Get educated with the new file I/O mechanism introduced in the JDK 7 release.

This two-part series of articles focuses on the JSR 203 NIO.2 API in Java 7.
  1. Part 1 introduces the main concepts involved in the new JSR.
  2. Part 2 covers the important aspects of developing NIO.2-based applications, with examples and case studies for the concepts introduced in part 1, so you can spice up your Java 7 applications with the new I/O capabilities. You will learn to develop NIO.2 applications, beginning with simple but essential material and gradually moving on to more complex features such as sockets and asynchronous channels.
In this article (part 1) I introduce the new and powerful I/O API delivered in Java SE 7 as NIO.2, and briefly cover the important aspects involved in developing NIO.2-based applications.

Table of contents:
1. Working with the new Path Class (Part 1).
2. Metadata File Attributes (Part 1, 2 (examples)).
3. What are Symbolic and Hard Links? (Part 1, 2 (examples)).
4. New API for Files and Directories (Part 1, 2 (examples)).
5. The FileVisitor Interface (Part 1, 2 (examples)).
6. Monitoring via Watch Service API (Part 1, 2 (examples)).
7. New powerful Random Access Files (Part 1, 2 (examples)).
8. Networking with the Sockets APIs (Part 1, 2 (examples)).
9. The Asynchronous Channel API (Part 1, 2 (examples)).
10. Important Things to Remember (& migration tips from java IO to java NIO.2) (Part 2).
11. References (Part 1, 2).


---------------------------------------------

1. Working with the new Path Class.
The Path class supports two types of operations:
Syntactic operations (almost any operation that manipulates a path without accessing the file system; these are logical manipulations done in memory) and operations on the files referenced by paths. This section covers the first type of operation and introduces you to the Path API. Section 4 (New API for Files and Directories) explores the second type. The concepts presented in this section will be useful throughout the rest of this article and in part 2.
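To make the idea of syntactic operations concrete, here is a minimal sketch. The class name PathSyntaxDemo and the path strings are my own illustrative assumptions; none of these files needs to exist, because nothing here touches the file system.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Syntactic Path operations: pure in-memory manipulation of path strings.
// All names below are hypothetical; the files do not have to exist.
class PathSyntaxDemo {

    // Removes redundant "." and ".." name elements.
    static Path normalized() {
        return Paths.get("docs/../reports/./summary.txt").normalize();
    }

    // Joins a relative path onto a base path.
    static Path resolved() {
        return Paths.get("reports").resolve("2012/summary.txt");
    }

    // Computes the relative path that leads from one path to another.
    static Path relativized() {
        return Paths.get("reports").relativize(Paths.get("reports/2012/summary.txt"));
    }

    public static void main(String[] args) {
        Path path = resolved();
        System.out.println("normalized:  " + normalized());
        System.out.println("resolved:    " + path);
        System.out.println("relativized: " + relativized());
        System.out.println("file name:   " + path.getFileName());
        System.out.println("parent:      " + path.getParent());
        System.out.println("name count:  " + path.getNameCount());
    }
}
```

The printed separators depend on the platform, which is why the sketch compares and builds Path objects rather than raw strings.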

A path resides in a file system, which stores and organizes files on some form of media, generally one or more hard drives, in such a way that they can be easily retrieved.

The file system can be accessed through the final class java.nio.file.FileSystems, which is used to get an instance of the java.nio.file.FileSystem we want to work with.

FileSystems contains the following two important methods, as well as a set of newFileSystem() methods, for constructing new file systems:

  • getDefault(): This is a static method that returns the default FileSystem of the JVM, commonly the operating system's default file system.
  • getFileSystem(URI uri): This is a static method that returns a reference to an existing file system, located through the installed file system provider whose URI scheme matches the given URI.

Path manipulates a file in any file system (FileSystem), which can use any storage place (java.nio.file.FileStore; this class represents the underlying storage).

By default (and most commonly), a Path refers to files in the default file system (the file system of the computer), but NIO.2 is fully modular: an implementation of FileSystem for data in memory, on the network, or on a virtual file system fits NIO.2 equally well.
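As a small sketch of the two FileSystems methods above (the class name FileSystemDemo is an assumption for illustration), the following obtains the default file system both directly and through the "file" URI scheme, then lists its roots and file stores:

```java
import java.net.URI;
import java.nio.file.FileStore;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Path;

// Obtaining a FileSystem and looking at what it exposes.
class FileSystemDemo {

    static FileSystem byDefault() {
        return FileSystems.getDefault();
    }

    // The "file" scheme is served by the default provider, so this is
    // expected to hand back the default file system as well.
    static FileSystem byUri() {
        return FileSystems.getFileSystem(URI.create("file:///"));
    }

    public static void main(String[] args) {
        FileSystem fs = byDefault();
        System.out.println("same file system: " + fs.equals(byUri()));
        System.out.println("separator: " + fs.getSeparator());
        for (Path root : fs.getRootDirectories()) {
            System.out.println("root: " + root);
        }
        for (FileStore store : fs.getFileStores()) {
            System.out.println("store: " + store.name() + " (" + store.type() + ")");
        }
    }
}
```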

NIO.2 provides all the file system functionality we may need for working with a file, a directory, or a link. Path (strictly speaking an interface in java.nio.file) is an upgraded counterpart of the well-known java.io.File class, but File has kept a few specific operations, so it is not deprecated and cannot be considered obsolete. Moreover, starting with Java 7, both are available, which means programmers can combine their strengths to get the best of the I/O APIs. Java 7 provides a simple API for converting between them: File.toPath() and Path.toFile(). Remember the days when you had to do the following:

Well, those days are gone, because with Java 7 you can do this:
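As a hedged sketch of that contrast (not necessarily the exact listing intended here), the following shows the old File-based style next to the Java 7 style, together with the conversion in both directions. The class name FilePathConversionDemo and the file name notes.txt are hypothetical.

```java
import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;

// Converting between the legacy File class and the NIO.2 Path.
// "notes.txt" is a hypothetical file name; it does not need to exist.
class FilePathConversionDemo {

    // Old code that hands you a File can step into NIO.2 ...
    static Path fromLegacy(File file) {
        return file.toPath();
    }

    // ... and a Path can be handed back to APIs that still expect a File.
    static File toLegacy(Path path) {
        return path.toFile();
    }

    public static void main(String[] args) {
        File oldStyle = new File("notes.txt");    // before Java 7
        Path newStyle = Paths.get("notes.txt");   // Java 7 and later

        System.out.println(fromLegacy(oldStyle).equals(newStyle));
        System.out.println(toLegacy(newStyle).equals(oldStyle));
    }
}
```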

Looked at more closely, a Path is a programmatic representation of a path in the file system. The path string contains the file name, the directory list, and the OS-dependent file delimiter (e.g., backslash “\” on Microsoft Windows and forward slash “/” on Solaris and Linux), which means that a Path is not system independent, since it is based on a system-dependent path string. Because a Path is essentially a string, the resource it references might not exist.


Section 10 (Part 2) will provide some good tips for migrating from the old I/O API to NIO.2.

---------------------------------------------

2. Metadata File Attributes.
If you have questions about a file or a directory, such as whether it is hidden, whether it is a directory, what its size is, and who owns it, you can get answers to those questions (and many others) from the metadata, which is data about other data.

NIO.2 associates the notion of metadata with attributes and provides access to them through the java.nio.file.attribute package. Since different file systems have different notions about which attributes should be tracked, NIO.2 groups the attributes into views, each of which maps to a particular file system implementation.

Generally, views provide the attributes in bulk through a common method, readAttributes(). In addition, you can extract and set a single attribute with the getAttribute() and setAttribute() methods, respectively, which are available in the java.nio.file.Files class. Depending on the view, other methods are available for additional tasks.

You can also view the access control list (ACL) of a file and set UNIX permissions on a file. Moreover, you can explore file store attributes and define your own attributes.
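A minimal sketch of these three entry points, using only the basic view, which every file system supports. The class name FileAttributesDemo and the temporary file are assumptions for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.nio.file.attribute.FileTime;

// Reading attributes in bulk and one at a time, and setting one.
class FileAttributesDemo {

    // Bulk read: one call fetches the whole basic view.
    static BasicFileAttributes readBasic(Path file) throws IOException {
        return Files.readAttributes(file, BasicFileAttributes.class);
    }

    // Single attribute, addressed as "view:name".
    static long readSize(Path file) throws IOException {
        return (Long) Files.getAttribute(file, "basic:size");
    }

    // Sets a single attribute and reads it back.
    static FileTime stampLastModified(Path file, long epochMillis) throws IOException {
        Files.setAttribute(file, "basic:lastModifiedTime", FileTime.fromMillis(epochMillis));
        return Files.getLastModifiedTime(file);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("attributes-demo", ".txt");
        try {
            Files.write(file, "hello".getBytes(StandardCharsets.UTF_8));
            BasicFileAttributes attrs = readBasic(file);
            System.out.println("regular file: " + attrs.isRegularFile());
            System.out.println("directory:    " + attrs.isDirectory());
            System.out.println("size:         " + readSize(file));
            System.out.println("hidden:       " + Files.isHidden(file));
            System.out.println("modified:     " + stampLastModified(file, 1000000000000L));
        } finally {
            Files.delete(file);
        }
    }
}
```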

In part 2 you will learn how to use the views provided by NIO.2. You will see how to determine whether a file is read-only or hidden, when it was last accessed or modified, who owns it, and how to take ownership of it.

---------------------------------------------

3. What are Symbolic and Hard Links?
Linux and UNIX users (especially administrators) should be familiar with the concept of links.
There are two types of links:

  • Symbolic links.
  • Hard links. 

A link lets a file be reached through several names, instead of by navigating through a series of directories and subdirectories from the root; think of a link as an entity that maps to a file or directory path and is identified through a set of names. If you are a dedicated Windows user, you might not be familiar with links, although Windows itself is perfectly aware of them, especially symbolic links, which most resemble Windows shortcuts.

NIO.2 provides support for both hard links and symbolic links. Each method that can encounter a link knows how to detect it and behaves in its default manner if no other behavior is configured (for example, through LinkOption.NOFOLLOW_LINKS).

In part 2, you will learn how to manipulate links through the java.nio.file API, including how to create a link and how to find the target of a link. Most operations are implemented through the java.nio.file.Files class, which provides methods such as createLink(), createSymbolicLink(), isSymbolicLink(), and readSymbolicLink().
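Ahead of part 2, here is a minimal sketch of those four methods. It assumes a platform and user account that are allowed to create links (on Windows, symbolic links may need extra privileges); the class name LinksDemo and the file names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Creating and inspecting symbolic and hard links.
class LinksDemo {

    // A symbolic link is a small file system entry that stores the target's path.
    static Path symbolicLink(Path link, Path target) throws IOException {
        return Files.createSymbolicLink(link, target);
    }

    // A hard link is a second directory entry for the same file.
    static Path hardLink(Path link, Path existing) throws IOException {
        return Files.createLink(link, existing);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("links-demo");
        Path target = Files.createFile(dir.resolve("target.txt"));
        Path soft = symbolicLink(dir.resolve("soft.txt"), target);
        Path hard = hardLink(dir.resolve("hard.txt"), target);

        System.out.println("soft is a symbolic link: " + Files.isSymbolicLink(soft));
        System.out.println("soft points to:          " + Files.readSymbolicLink(soft));
        System.out.println("hard is the same file:   " + Files.isSameFile(hard, target));

        // Deleting a link removes the link itself, not the target.
        Files.delete(soft);
        Files.delete(hard);
        Files.delete(target);
        Files.delete(dir);
    }
}
```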

---------------------------------------------

4. New API for Files and Directories.
Now that you know from the first section how to point to a file or directory using Path, you are ready to learn how to accomplish the most common tasks for managing files and directories, such as creating, reading, writing, moving, and deleting them.

NIO.2 comes with a set of brand new methods to accomplish these tasks, most of which are found in the java.nio.file.Files class.

The java.nio.file.Files class provides methods dedicated to checking whether a Path is readable, writable, executable, regular, or hidden. These checks let you determine what kind of file or directory you are dealing with before you apply operations such as write or read.

The java.nio.file.Files class also provides file operations such as reading, writing, creating, and opening files. There is a wide array of file I/O methods to choose from, including methods for buffered and unbuffered streams. Coverage of the methods for channels is left for sections 7, 8, and 9, in which you will see the real power of NIO.2. You can also work with the well-known delete, copy, and move operations.
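The following sketch strings several of these operations together: the checks, a buffered write, copy, move, read, and delete. The class name FileOperationsDemo and the file names are assumptions for illustration, and everything happens inside a temporary directory.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

// Common file tasks with java.nio.file.Files.
class FileOperationsDemo {

    // Writes a file, copies it, moves the copy, reads it back, then cleans up.
    static List<String> roundTrip(Path dir) throws IOException {
        Path original = dir.resolve("original.txt");
        try (BufferedWriter writer = Files.newBufferedWriter(original, StandardCharsets.UTF_8)) {
            writer.write("first line");
            writer.newLine();
            writer.write("second line");
            writer.newLine();
        }

        // Checks you might run before reading or writing.
        System.out.println("regular:  " + Files.isRegularFile(original));
        System.out.println("readable: " + Files.isReadable(original));
        System.out.println("writable: " + Files.isWritable(original));

        Path copy = Files.copy(original, dir.resolve("copy.txt"),
                StandardCopyOption.REPLACE_EXISTING);
        Path moved = Files.move(copy, dir.resolve("moved.txt"),
                StandardCopyOption.REPLACE_EXISTING);

        List<String> lines = Files.readAllLines(moved, StandardCharsets.UTF_8);

        Files.delete(original);
        Files.delete(moved);
        return lines;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("file-operations-demo");
        try {
            System.out.println(roundTrip(dir));
        } finally {
            Files.delete(dir);
        }
    }
}
```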

Part 2 focuses on directory operations, showing you how to list, create, and read directories. You will see how to list the file system roots, create directories with methods such as createDirectory() and createTempDirectory(), write directory filters, and list a directory’s content using the newDirectoryStream() method.
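As a small preview, here is a sketch that lists the file system roots, creates directories, and filters a directory's content with a glob pattern. The class name DirectoryListingDemo and the file names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Listing roots, creating directories, and filtering directory content.
class DirectoryListingDemo {

    // Returns the sorted names of the entries in dir that match "*.txt".
    static List<String> listTextFiles(Path dir) throws IOException {
        List<String> names = new ArrayList<>();
        // The stream must be closed, hence try-with-resources.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "*.txt")) {
            for (Path entry : stream) {
                names.add(entry.getFileName().toString());
            }
        }
        Collections.sort(names); // iteration order is not guaranteed
        return names;
    }

    public static void main(String[] args) throws IOException {
        for (Path root : FileSystems.getDefault().getRootDirectories()) {
            System.out.println("root: " + root);
        }

        Path dir = Files.createTempDirectory("directory-demo");
        Path sub = Files.createDirectory(dir.resolve("sub"));
        Path a = Files.createFile(dir.resolve("a.txt"));
        Path b = Files.createFile(dir.resolve("b.log"));
        Path c = Files.createFile(dir.resolve("c.txt"));

        System.out.println("text files: " + listTextFiles(dir));

        Files.delete(a);
        Files.delete(b);
        Files.delete(c);
        Files.delete(sub);
        Files.delete(dir);
    }
}
```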

Each of these tasks is presented in detail and, as you will see, many aspects have been redesigned since Java 6, but you will also recognize many of the presented methods from the java.io.File class.

---------------------------------------------

5. The FileVisitor Interface.
NIO.2 takes advantage of the recursive programming technique.

Many programming tasks that involve working with files require visiting all files in a file tree, which is a good opportunity for using the recursive programming mechanism because every file should be “touched” individually. This is a very common approach when performing tasks such as deleting, copying, or moving a file tree. Based on this mechanism, NIO.2 encapsulates the traversal process of a file tree in an interface, named FileVisitor, in the java.nio.file package.

This section starts by presenting the FileVisitor’s scope and methods. Once you are familiar with FileVisitor, part 2 will help you to develop a set of applications you can use to perform tasks that involve traversing a file tree, such as finding, copying, deleting, and moving files.

As previously mentioned, the FileVisitor interface provides the support for recursively traversing a file tree. The methods of this interface represent key points in the traversal process, enabling you to take control when a file is visited, before a directory is accessed, after a directory is accessed, and when a failure occurs; in other words, this interface has hooks for before, during, and after a file is visited, as well as for when failure occurs.
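A minimal sketch of these hooks in action: deleting a file tree with Files.walkFileTree() and SimpleFileVisitor, a convenience class that implements FileVisitor with default behavior. The class name FileTreeWalkDemo and the tree layout are assumptions for illustration.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

// Recursively deleting a file tree with a FileVisitor.
class FileTreeWalkDemo {

    // Deletes root and everything below it; returns the number of entries removed.
    static int deleteTree(Path root) throws IOException {
        final int[] deleted = {0};
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                Files.delete(file); // hook: a file is visited
                deleted[0]++;
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException failure)
                    throws IOException {
                if (failure != null) {
                    throw failure; // listing the directory failed
                }
                Files.delete(dir); // hook: the directory is empty by now
                deleted[0]++;
                return FileVisitResult.CONTINUE;
            }
        });
        return deleted[0];
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("walk-demo");
        Path sub = Files.createDirectory(root.resolve("sub"));
        Files.createFile(root.resolve("a.txt"));
        Files.createFile(sub.resolve("b.txt"));

        System.out.println("deleted entries: " + deleteTree(root));
        System.out.println("root still exists: " + Files.exists(root));
    }
}
```

Deleting in postVisitDirectory() rather than preVisitDirectory() matters here, because a directory can only be removed once its entries are gone.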

Once you have control (at any of these key points), you can choose how to process the visited file and decide what should happen to it next by indicating a visit result through the FileVisitResult enum, which contains four enum constants:

  • FileVisitResult.CONTINUE: This visit result indicates that the traversal process should continue. It translates into different actions depending on which FileVisitor method returns it. For example, the traversal process may continue by visiting the next file, visiting a directory’s entries, or skipping a failure.
  • FileVisitResult.SKIP_SIBLINGS: This visit result indicates that the traversal process should continue without visiting the siblings of this file or directory.
  • FileVisitResult.SKIP_SUBTREE: This visit result indicates that the traversal process should continue without visiting the entries in this directory. It is only meaningful when returned from preVisitDirectory(); returned from any other method, it behaves like CONTINUE.
  • FileVisitResult.TERMINATE: This visit result indicates that the traversal process should terminate.

The constants of this enum type can be iterated as follows:
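A minimal sketch of that iteration, using the values() method that every enum provides (the class name FileVisitResultDemo is an assumption):

```java
import java.nio.file.FileVisitResult;

// Iterating over the constants of the FileVisitResult enum.
class FileVisitResultDemo {

    static FileVisitResult[] constants() {
        return FileVisitResult.values();
    }

    public static void main(String[] args) {
        for (FileVisitResult constant : constants()) {
            System.out.println(constant);
        }
    }
}
```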

---------------------------------------------

6. Monitoring via Watch Service API.
The Watch Service API was introduced in Java 7 (NIO.2) as a thread-safe service that is capable of watching objects for changes and events. The most common use is to monitor a directory for changes to its content through actions such as create, delete, and modify. You’ve probably seen the effect of such a service many times.

For example, when you open a text file in an editor (such as Notepad++, jEdit, etc.) and the file content is modified outside the editor, you will see a message that asks whether you want to reload the file because it was modified. This means the editor has detected a file change through a watch service and is reporting it accordingly. This is known as the file change notification mechanism, and starting with NIO.2, it is available through the Watch Service API.

The Watch Service API is a low-level API that can be used as is or can be customized. You can even write a high-level API on top of it. By default, this API uses the underlying file system functionalities to watch the file system for changes. It allows you to register a directory (or directories) to be monitored for different kinds of notification events that you specify during registration.

When the watch service detects one or more of the registered notification events, it queues them for the code that registered for them, which typically handles them in a separate thread or pool of threads.

Note: Starting with NIO.2, you no longer need to poll the file system yourself or use other in-house solutions to monitor file system changes. In previous Java versions, you had to implement an agent running in a separate thread that kept track of all the contents of the watched directories, constantly polling the file system to see if anything important had happened. Now, whether you are running Mac OS X, Linux, Unix, Windows, or some other OS, Java uses the native file change notification facility where the operating system and file system provide one, and falls back to a more primitive mechanism, such as polling, where they do not.
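The registration and event-handling cycle can be sketched as follows. This is a simplified, single-threaded illustration that waits for one batch of events and then stops; the class name WatchServiceDemo, the file name, and the timeout are assumptions of mine.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Watching a directory for newly created entries.
class WatchServiceDemo {

    // Registers dir, creates fileName in it, and returns the names reported
    // by ENTRY_CREATE events within the timeout.
    static List<String> watchOneCreate(Path dir, String fileName, long timeoutSeconds)
            throws IOException, InterruptedException {
        List<String> created = new ArrayList<>();
        try (WatchService watcher = dir.getFileSystem().newWatchService()) {
            dir.register(watcher, StandardWatchEventKinds.ENTRY_CREATE);

            Files.createFile(dir.resolve(fileName)); // the change to detect

            // Wait for a key to be signalled; null means the timeout elapsed.
            WatchKey key = watcher.poll(timeoutSeconds, TimeUnit.SECONDS);
            if (key != null) {
                for (WatchEvent<?> event : key.pollEvents()) {
                    if (event.kind() == StandardWatchEventKinds.ENTRY_CREATE) {
                        // The context is the path relative to the watched directory.
                        created.add(event.context().toString());
                    }
                }
                key.reset(); // required before the key can be signalled again
            }
        }
        return created;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        Path dir = Files.createTempDirectory("watch-demo");
        try {
            System.out.println("created: " + watchOneCreate(dir, "new.txt", 30));
        } finally {
            Files.deleteIfExists(dir.resolve("new.txt"));
            Files.delete(dir);
        }
    }
}
```

A real monitor would usually call take() in a loop on a dedicated thread instead of a single poll().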

---------------------------------------------

7. New powerful Random Access Files.
In the previous sections we explored files sequentially. Files that can be explored sequentially are known as sequential files. In this section you will see the advantages of using non-sequential (random) access to a file’s contents. Files that permit random access to their contents are known as random access files (RAFs). Sequential files are used more often because they are easy to create, but RAFs are more flexible and their data can be located faster.

With a RAF, you can open the file, seek a particular location, and read from or write to that file. After you open a RAF, you can read from it or write to it in a random manner just by using a record number, or you can add to the beginning or end of the file since you know how many records are in the file. A RAF allows you to read a single character, read a chunk of bytes or a line, replace a portion of the file, append lines, delete lines, and so forth, and allows you to perform all of these actions in a random manner.

Java 7 (NIO.2) introduces a brand-new interface for working with RAFs. Its name is SeekableByteChannel and it is available in the java.nio.channels package. It extends the older ByteChannel interface and represents a byte channel that maintains a current position and allows that position to be modified. Moreover, Java 7 improves the well-known FileChannel class by having it implement this interface, providing RAF and FileChannel power in a single shot. When the channel comes from the default file system, a simple cast transforms a SeekableByteChannel into a FileChannel.

FileChannel provides facilities such as mapping a region of the file directly into memory for faster access, locking a region of the file, and reading and writing bytes from an absolute location without affecting the channel’s current position.
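To illustrate seeking, here is a minimal sketch that reads and overwrites bytes at a chosen position through a SeekableByteChannel. The class name SeekableChannelDemo, the file content, and the offsets are assumptions for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.SeekableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Random access to a file through SeekableByteChannel.
class SeekableChannelDemo {

    // Seeks to offset and reads up to length bytes as ASCII text.
    static String readAt(Path file, long offset, int length) throws IOException {
        try (SeekableByteChannel channel = Files.newByteChannel(file, StandardOpenOption.READ)) {
            channel.position(offset);
            ByteBuffer buffer = ByteBuffer.allocate(length);
            while (buffer.hasRemaining() && channel.read(buffer) != -1) {
                // keep reading until the buffer is full or the file ends
            }
            buffer.flip();
            return StandardCharsets.US_ASCII.decode(buffer).toString();
        }
    }

    // Seeks to offset and overwrites the bytes found there.
    static void overwriteAt(Path file, long offset, String text) throws IOException {
        try (SeekableByteChannel channel = Files.newByteChannel(file, StandardOpenOption.WRITE)) {
            channel.position(offset);
            ByteBuffer buffer = ByteBuffer.wrap(text.getBytes(StandardCharsets.US_ASCII));
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("seekable-demo", ".txt");
        try {
            Files.write(file, "0123456789".getBytes(StandardCharsets.US_ASCII));
            System.out.println("bytes 4..6: " + readAt(file, 4, 3));
            overwriteAt(file, 2, "XY");
            System.out.println("after overwrite: " + readAt(file, 0, 10));

            try (SeekableByteChannel channel = Files.newByteChannel(file)) {
                // With the default provider the channel is expected to be a FileChannel.
                System.out.println("is FileChannel: " + (channel instanceof FileChannel));
            }
        } finally {
            Files.delete(file);
        }
    }
}
```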

---------------------------------------------

8. Networking with the Sockets APIs.
Java introduced support for sockets in JDK 1.0, but things have of course changed over time from version to version. Jumping to Java 7, NIO.2 has improved this support by updating existing classes with new methods and adding new interfaces/classes for writing TCP/UDP-based applications.

First of all, NIO.2 introduces an interface named NetworkChannel that provides methods common to all network channel classes; any channel that implements this interface is a channel to a network socket.

The main classes dedicated to synchronous socket channels, ServerSocketChannel, SocketChannel, and DatagramChannel, implement this interface, which comes with methods for binding to and returning local addresses, and methods for setting and getting socket options through the new SocketOption interface and StandardSocketOptions class. This interface’s methods and the ones added directly into classes (for checking connection state, getting remote addresses, and shutdown) will prevent you from having to call the socket() method.
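A short sketch of the NetworkChannel methods on a ServerSocketChannel: setting an option, binding, and reading back the local address, all without calling socket(). The class name NetworkChannelDemo is an assumption, and port 0 simply asks the operating system for any free port on the loopback interface.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.StandardSocketOptions;
import java.nio.channels.ServerSocketChannel;

// Using the NetworkChannel methods directly on a channel.
class NetworkChannelDemo {

    // Binds a server channel to a free loopback port and returns the address.
    static InetSocketAddress bindToFreePort() throws IOException {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.setOption(StandardSocketOptions.SO_REUSEADDR, true);
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            return (InetSocketAddress) server.getLocalAddress();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("bound to: " + bindToFreePort());
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            System.out.println("supported options: " + server.supportedOptions());
            System.out.println("SO_RCVBUF: "
                    + server.getOption(StandardSocketOptions.SO_RCVBUF));
        }
    }
}
```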

NIO.2 also introduces the MulticastChannel interface as a subinterface of NetworkChannel. As its name suggests, the MulticastChannel interface maps a network channel that supports IP multicasting. Keep in mind that MulticastChannel is implemented only by the datagram channel (the DatagramChannel class). When joining a multicast group you get a membership key, which is a token that represents the membership of a multicast group. Through the membership key, you can block/unblock datagrams from different addresses, drop membership, get the channel and/or multicast group for which this membership key was created, and more.

---------------------------------------------

9. The Asynchronous Channel API.
We’ve finally reached the most powerful feature introduced in NIO.2, the asynchronous channel API. As you’ll see in this section, the asynchronous I/O (AIO) Java 7 journey starts in the java.nio.channels.AsynchronousChannel interface, which extends a channel with asynchronous I/O operations support.

This interface is implemented by three classes: AsynchronousFileChannel, AsynchronousSocketChannel, and AsynchronousServerSocketChannel.

These classes are similar in style to the existing NIO channel APIs. In addition, there is an asynchronous channel named AsynchronousByteChannel that can read and write bytes and is a subinterface of AsynchronousChannel (this subinterface is implemented by the AsynchronousSocketChannel class).

Moreover, the new API introduces a class named AsynchronousChannelGroup, which presents the concept of an asynchronous channel group, in which each asynchronous channel belongs to a channel group (the default one or a specified one) that shares a pool of Java threads. These threads receive instructions to perform I/O events and they dispatch the results to the completion handlers. All the effort is for the purpose of handling the completion of initiated asynchronous I/O operations.

In part 2, you will see the asynchronous mechanism from the Java perspective. You will see the big picture of how Java implements asynchronous I/O, after which you will develop related applications for files and sockets. We will start with asynchronous I/O for files by exploring the AsynchronousFileChannel class and continue with asynchronous I/O for TCP sockets and UDP sockets.

A short overview of the difference between synchronous I/O and asynchronous I/O is in order.

Synchronous I/O vs. Asynchronous I/O
The difference between synchronous and asynchronous execution may seem a bit confusing at first, so let’s clear it up. Basically, there are two types of input/output (I/O) synchronization: synchronous I/O and asynchronous I/O (also referred to as overlapped I/O). In a synchronous I/O operation, a thread enters into action and waits until the I/O request is completed (the program is “stuck” waiting for the process to end, with no way out).

When the same action occurs in an asynchronous environment, a thread performs the I/O operation with more kernel help. Actually, it immediately passes the request to the kernel and continues on to process another job. The kernel signals to the thread when the operation has completed, and the thread “respects” the signal by interrupting its current job and processing the data from the I/O operation as necessary.

In the Java spirit of platform independence, asynchronous I/O can be tied to multiple threads—basically, allowing something to be processed on a separate thread.

Asynchronous I/O and synchronous I/O serve different purposes. You can use synchronous I/O if you simply want to make a request and receive a response. Synchronous I/O limits performance and scalability since it is one thread per I/O connection, and running thousands of threads significantly increases overhead on the operating system. 

Asynchronous I/O is a different programming model, because you don’t necessarily wait for a response, but rather submit your work for execution and then come back for a response either almost immediately or sometime later. Therefore, asynchronous I/O seems to be better than synchronous I/O, since performance and scalability are keywords of the I/O system. Various important operating systems, such as Windows and Linux, support fast, scalable I/O based on the use of asynchronous notifications of I/O operations taking place in the OS layers.

In summary, I/O processing that is expected to take a large amount of time can be optimized by using asynchronous I/O. For relatively fast I/O operations, synchronous I/O would be better because the overhead of processing kernel I/O requests and kernel signals may make asynchronous I/O less beneficial.

Asynchronous I/O Big Picture
When talking about asynchronous I/O in Java, we are talking about the asynchronous channels. An asynchronous channel is a connection that supports multiple I/O operations in parallel through separate threads (connecting, reading, and writing, for example) and provides mechanisms for controlling the operations after they’ve been initiated.

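Note that all asynchronous channels initiate I/O operations without blocking the application, which stays free to perform other tasks, and provide notifications when the I/O completes. This rule is the foundation of asynchronous channels, and the entire asynchronous channel API derives from it.

All asynchronous I/O operations have one of two forms:

  • Pending result: the operation returns a java.util.concurrent.Future that represents the result still in progress.
  • Complete result: the operation is handed a CompletionHandler, which is invoked once the result is available.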
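The two forms can be sketched with AsynchronousFileChannel, whose read() method comes in both flavors: one returns a java.util.concurrent.Future (pending result) and the other takes a CompletionHandler (complete result). The class name AsyncReadDemo, the 64-byte buffer, and the 10-second timeouts are assumptions for illustration, and the sketch expects a small file.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.channels.CompletionHandler;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// The two forms of an asynchronous read.
class AsyncReadDemo {

    // Pending result: read() returns a Future at once; get() waits for it.
    static String readWithFuture(Path file) throws Exception {
        try (AsynchronousFileChannel channel =
                 AsynchronousFileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(64);
            Future<Integer> pending = channel.read(buffer, 0);
            // The calling thread is free to do other work here.
            pending.get(10, TimeUnit.SECONDS);
            buffer.flip();
            return StandardCharsets.UTF_8.decode(buffer).toString();
        }
    }

    // Complete result: the handler is called back when the read finishes.
    static String readWithHandler(Path file) throws Exception {
        final CountDownLatch done = new CountDownLatch(1);
        final String[] text = new String[1];
        final Throwable[] error = new Throwable[1];
        try (AsynchronousFileChannel channel =
                 AsynchronousFileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(64);
            // The buffer travels along as the attachment.
            channel.read(buffer, 0, buffer, new CompletionHandler<Integer, ByteBuffer>() {
                @Override
                public void completed(Integer bytesRead, ByteBuffer attachment) {
                    attachment.flip();
                    text[0] = StandardCharsets.UTF_8.decode(attachment).toString();
                    done.countDown();
                }

                @Override
                public void failed(Throwable cause, ByteBuffer attachment) {
                    error[0] = cause;
                    done.countDown();
                }
            });
            if (!done.await(10, TimeUnit.SECONDS)) {
                throw new TimeoutException("read did not complete in time");
            }
        }
        if (error[0] != null) {
            throw new IOException(error[0]);
        }
        return text[0];
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("async-demo", ".txt");
        try {
            Files.write(file, "asynchronous hello".getBytes(StandardCharsets.UTF_8));
            System.out.println("future:  " + readWithFuture(file));
            System.out.println("handler: " + readWithHandler(file));
        } finally {
            Files.delete(file);
        }
    }
}
```

The latch in readWithHandler() exists only so the sketch can return a value; in a real application the handler itself would carry on with the work.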

--------------------------------------------------------------------------------------------------------------------------

Finally, I have reached the end of the article. I would like to say thanks to Java 7 for such a powerful NIO.2 API.

We will meet in part 2 for complete applications covering the above concepts and providing solid support for mastering NIO.2.

11. References:
1. JSR 203: More New I/O APIs for the Java™ Platform ("NIO.2").
2. Java SE tutorial: File I/O (Featuring NIO.2).
3. Pro Java 7 NIO.2, by Anghel Leonard, 2012.
4. Java 7 Recipes, by Josh Juneau, Carl Dea, Freddy Guime, and John O’Conner.
5. java.nio.file.

13 comments :

  1. Nice tutorial.One of the useful addition I found is ability to create hidden files in Java which was not available before JDK 7 and you to apply trick to get around it e.g. using "." in front of file name.

  2. The string "Part 2" at the beginning of the article has a link pointing to an XXX domain..

    1. I am still writing it, and when i finished i will change the link.

  3. Nice article Mohamed. Keep posting such article, enjoyed reading it.

  4. How do I find which user has accessed the folder while monitoring. Just getting the NTSystem user name is fine. ?

    1. By Using watch service then, in VisitedFile() method, get the attributes of the file debinding on your OS then get last access date and user.

  5. Is there any way to delete content from file?.

    Thanks in Advance.

  6. Thanks for this great post! When will you finish part 2?

  7. Part 2 still missing :-(
