HADOOP-12009 Clarify FileSystem.listStatus() sorting order & fix FileSystemContractBaseTest:testListStatus. (J.Andreina via stevel)

Steve Loughran 2015-06-28 19:13:48 +01:00
parent b543d1a390
commit 3dfa8161f9
4 changed files with 32 additions and 4 deletions


@@ -661,6 +661,10 @@ Release 2.8.0 - UNRELEASED
     HADOOP-11958. MetricsSystemImpl fails to show backtrace when an error
     occurs (Jason Lowe via jeagles)
+    HADOOP-12009 Clarify FileSystem.listStatus() sorting order & fix
+    FileSystemContractBaseTest:testListStatus. (J.Andreina via stevel)
   OPTIMIZATIONS
     HADOOP-11785. Reduce the number of listStatus operation in distcp


@@ -1498,7 +1498,9 @@ public boolean accept(Path file) {
   /**
    * List the statuses of the files/directories in the given path if the path is
    * a directory.
-   *
+   * <p>
+   * Does not guarantee to return the List of files/directories status in a
+   * sorted order.
    * @param f given path
    * @return the statuses of the files/directories in the given path
    * @throws FileNotFoundException when the path does not exist;
@@ -1540,6 +1542,9 @@ public RemoteIterator<Path> listCorruptFileBlocks(Path path)
   /**
    * Filter files/directories in the given path using the user-supplied path
    * filter.
+   * <p>
+   * Does not guarantee to return the List of files/directories status in a
+   * sorted order.
    *
    * @param f
    *          a path name
@@ -1560,6 +1565,9 @@ public FileStatus[] listStatus(Path f, PathFilter filter)
   /**
    * Filter files/directories in the given list of paths using default
    * path filter.
+   * <p>
+   * Does not guarantee to return the List of files/directories status in a
+   * sorted order.
    *
    * @param files
    *          a list of paths
@@ -1576,6 +1584,9 @@ public FileStatus[] listStatus(Path[] files)
   /**
    * Filter files/directories in the given list of paths using user-supplied
    * path filter.
+   * <p>
+   * Does not guarantee to return the List of files/directories status in a
+   * sorted order.
    *
    * @param files
    *          a list of paths
@@ -1736,6 +1747,8 @@ public LocatedFileStatus next() throws IOException {
    * while consuming the entries. Each file system implementation should
    * override this method and provide a more efficient implementation, if
    * possible.
+   * Does not guarantee to return the iterator that traverses statuses
+   * of the files in a sorted order.
    *
    * @param p target path
    * @return remote iterator
@@ -1763,6 +1776,8 @@ public FileStatus next() throws IOException {
   /**
    * List the statuses and block locations of the files in the given path.
+   * Does not guarantee to return the iterator that traverses statuses
+   * of the files in a sorted order.
    *
    * If the path is a directory,
    *  if recursive is false, returns files in the directory;


@@ -183,6 +183,10 @@ to the same path:
     forall fs in listStatus(Path) :
       fs == getFileStatus(fs.path)
+**Ordering of results**: there is no guarantee of ordering of the listed entries.
+While HDFS currently returns an alphanumerically sorted list, neither the POSIX `readdir()`
+nor Java's `File.listFiles()` API calls define any ordering of returned values. Applications
+which require a uniform sort order on the results must perform the sorting themselves.
 ### Atomicity and Consistency
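
Note on the added paragraph: a minimal sketch of "perform the sorting themselves", assuming a caller that only needs entry names. It uses `java.io.File.listFiles()` on a temporary directory as a stand-in for `FileSystem.listStatus()`, so it runs without Hadoop on the classpath. The class name `SortedListing`, the helper `sortedNames`, and the file names are hypothetical and not part of this commit.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class SortedListing {

  /** Returns the entry names in natural String order, whatever order they arrived in. */
  static List<String> sortedNames(File[] entries) {
    List<String> names = new ArrayList<String>();
    for (File entry : entries) {
      names.add(entry.getName());
    }
    // The listing API defines no order, so the caller imposes one.
    Collections.sort(names);
    return names;
  }

  public static void main(String[] args) throws IOException {
    File dir = Files.createTempDirectory("listing").toFile();
    dir.deleteOnExit();
    for (String name : new String[] {"c", "a", "b"}) {
      File f = new File(dir, name);
      f.createNewFile();
      f.deleteOnExit();
    }
    File[] entries = dir.listFiles(); // order of this array is unspecified
    if (entries == null) {
      throw new IOException("Not a readable directory: " + dir);
    }
    System.out.println(sortedNames(entries)); // prints [a, b, c]
  }
}
```

The same pattern should carry over to an array of `FileStatus` objects, sorting by whatever key the application cares about (path, modification time, length).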


@@ -20,6 +20,7 @@
 import java.io.FileNotFoundException;
 import java.io.IOException;
+import java.util.ArrayList;
 import junit.framework.TestCase;
@@ -224,9 +225,13 @@ public void testListStatus() throws Exception {
     paths = fs.listStatus(path("/test/hadoop"));
     assertEquals(3, paths.length);
-    assertEquals(path("/test/hadoop/a"), paths[0].getPath());
-    assertEquals(path("/test/hadoop/b"), paths[1].getPath());
-    assertEquals(path("/test/hadoop/c"), paths[2].getPath());
+    ArrayList<Path> list = new ArrayList<Path>();
+    for (FileStatus fileState : paths) {
+      list.add(fileState.getPath());
+    }
+    assertTrue(list.contains(path("/test/hadoop/a")));
+    assertTrue(list.contains(path("/test/hadoop/b")));
+    assertTrue(list.contains(path("/test/hadoop/c")));
     paths = fs.listStatus(path("/test/hadoop/a"));
     assertEquals(0, paths.length);
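
Note on this hunk: the new assertions check membership rather than position, so the test no longer depends on the order a file system happens to return. A minimal, self-contained sketch of the same idea follows. It uses plain strings in place of Hadoop `Path` objects (an assumption, to avoid needing Hadoop on the classpath), and the class `UnorderedListingCheck` and helper `sameEntries` are hypothetical names, not part of the test class.

```java
import java.util.Arrays;

final class UnorderedListingCheck {

  /** True if both arrays hold the same entries, ignoring order. */
  static boolean sameEntries(String[] listing, String... expected) {
    // Sort copies so the caller's arrays are left untouched.
    String[] actual = listing.clone();
    String[] wanted = expected.clone();
    Arrays.sort(actual);
    Arrays.sort(wanted);
    return Arrays.equals(actual, wanted);
  }

  public static void main(String[] args) {
    String[] listing = {"/test/hadoop/c", "/test/hadoop/a", "/test/hadoop/b"};
    // Same three entries in a different order: accepted.
    System.out.println(sameEntries(listing,
        "/test/hadoop/a", "/test/hadoop/b", "/test/hadoop/c")); // prints true
    // A missing entry is still detected.
    System.out.println(sameEntries(listing,
        "/test/hadoop/a", "/test/hadoop/b")); // prints false
  }
}
```

When comparing collected values like this, the collection's element type has to match the type being looked up; a list of strings will never report that it contains a path object.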