diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/PositionedReadable.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/PositionedReadable.java
index de76090512..7380402eb6 100644
--- a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/PositionedReadable.java
+++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/PositionedReadable.java
@@ -114,6 +114,16 @@ default int maxReadSizeForVectorReads() {
    * As a result of the call, each range will have FileRange.setData(CompletableFuture)
    * called with a future that when complete will have a ByteBuffer with the
    * data from the file's range.
+   * <p>
+   * The position returned by getPos() after readVectored() is undefined.
+   * </p>
+   * <p>
+   * If a file is changed while the readVectored() operation is in progress, the output is
+   * undefined. Some ranges may have old data, some may have new and some may have both.
+   * </p>
+   * <p>
+   * While a readVectored() operation is in progress, normal read api calls may block.
+   * </p>
    * @param ranges the byte ranges to read
    * @param allocate the function to allocate ByteBuffer
    * @throws IOException any IOE.
diff --git a/hadoop-common-project/hadoop-common/src/site/markdown/filesystem/fsdatainputstream.md b/hadoop-common-project/hadoop-common/src/site/markdown/filesystem/fsdatainputstream.md
index 197b999c81..f64a2bd03b 100644
--- a/hadoop-common-project/hadoop-common/src/site/markdown/filesystem/fsdatainputstream.md
+++ b/hadoop-common-project/hadoop-common/src/site/markdown/filesystem/fsdatainputstream.md
@@ -454,6 +454,13 @@
 Also, clients are encouraged to use `WeakReferencedElasticByteBufferPool` for
 allocating buffers such that even direct buffers are garbage collected when
 they are no longer referenced.
+The position returned by `getPos()` after `readVectored()` is undefined.
+
+If a file is changed while the `readVectored()` operation is in progress, the output is
+undefined. Some ranges may have old data, some may have new, and some may have both.
+
+While a `readVectored()` operation is in progress, normal read api calls may block.
+
 Note: Don't use direct buffers for reading from ChecksumFileSystem as that
 may lead to memory fragmentation explained in HADOOP-18296.
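The semantics documented above (asynchronous completion via each range's future, `getPos()` undefined afterwards) can be sketched with a hypothetical caller. This is an illustrative sketch, not part of the patch: it assumes a Hadoop 3.3.5+ client where `FileRange.createFileRange` and `FSDataInputStream.readVectored` are available, and the file path and range offsets are made up.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileRange;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class VectoredReadSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // Hypothetical path and offsets, purely for illustration.
    List<FileRange> ranges = Arrays.asList(
        FileRange.createFileRange(0, 100),
        FileRange.createFileRange(4096, 200));
    try (FSDataInputStream in = fs.open(new Path("/tmp/data.bin"))) {
      // Kick off the asynchronous vectored read; buffers come from allocate.
      in.readVectored(ranges, ByteBuffer::allocate);
      for (FileRange range : ranges) {
        // Each range's future completes with a ByteBuffer holding its data.
        ByteBuffer data = range.getData().get();
        System.out.println("read " + data.remaining()
            + " bytes at offset " + range.getOffset());
      }
      // Per the contract above, in.getPos() is undefined at this point,
      // so do not mix positional state with an in-flight or completed
      // vectored read.
    }
  }
}
```

Note the caller never touches the stream's position: results are consumed only through each range's `CompletableFuture`, which is what makes the undefined-`getPos()` semantics tolerable in practice.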