hadoop

HADOOP-18410. S3AInputStream.unbuffer() does not release http connections (#4766)

HADOOP-16202 "Enhance openFile()" added asynchronous draining of the
remaining bytes of an S3 HTTP input stream for those operations
(unbuffer, seek) where it could avoid blocking the active
thread.

This patch fixes asynchronous stream draining so that the connection
is returned to the HTTP pool. Without it, whenever unbuffer() or
seek() was called on a stream and an asynchronous drain was
triggered, the connection was not returned; eventually the pool
would be empty and subsequent S3 requests would fail with the
message "Timeout waiting for connection from pool".

The root cause was that, although the values passed in to drain()
became parameter references inside that method, in the lambda
expression passed in to submit() they were direct references to
the fields:

operation = client.submit(
 () -> drain(uri, streamStatistics,
       false, reason, remaining,
       object, wrappedStream));  /* here */

Those fields were only read during the async execution, at which
point they would have been set to null (or even to the values of a
subsequent read).

A new SDKStreamDrainer class performs the draining; this is a Callable
and can be submitted directly to the executor pool.
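
The capture problem described above can be reproduced in isolation.
The following is a minimal sketch, not the actual S3A code: every
class and method name in it (DrainCaptureSketch, FakeStream,
FakeInputStream, Drainer, describe, runBoth) is hypothetical and
exists only for illustration. It assumes the caller clears the field
straight after submitting the task, and uses a latch to force the
task to run afterwards.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustration only: all names here are hypothetical, not Hadoop classes.
class DrainCaptureSketch {

  /** Stand-in for the wrapped HTTP stream. */
  static final class FakeStream {
    final String name;
    FakeStream(String name) { this.name = name; }
  }

  /** Stand-in for an input stream that drains asynchronously. */
  static final class FakeInputStream {
    private FakeStream wrappedStream = new FakeStream("http-1");

    /** Buggy form: the lambda captures `this` and reads the field when it runs. */
    Future<String> drainViaFieldLambda(ExecutorService pool, CountDownLatch gate) {
      Future<String> result = pool.submit(() -> {
        gate.await();                    // task runs after the caller has moved on
        return describe(wrappedStream);  // field is read here, not at submit time
      });
      wrappedStream = null;              // caller clears the field after submitting
      return result;
    }

    /** Fixed form: the value is copied into a Callable before the field is cleared. */
    Future<String> drainViaCallable(ExecutorService pool, CountDownLatch gate) {
      Future<String> result = pool.submit(new Drainer(wrappedStream, gate));
      wrappedStream = null;
      return result;
    }
  }

  /** Callable that owns its own final reference to the stream. */
  static final class Drainer implements Callable<String> {
    private final FakeStream stream;
    private final CountDownLatch gate;

    Drainer(FakeStream stream, CountDownLatch gate) {
      this.stream = stream;
      this.gate = gate;
    }

    @Override
    public String call() throws Exception {
      gate.await();
      return describe(stream);
    }
  }

  static String describe(FakeStream s) {
    return s == null ? "nothing to drain" : "drained " + s.name;
  }

  /** Runs both forms; element 0 is the buggy result, element 1 the fixed one. */
  static String[] runBoth() throws Exception {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      CountDownLatch gate1 = new CountDownLatch(1);
      Future<String> buggy = new FakeInputStream().drainViaFieldLambda(pool, gate1);
      gate1.countDown();
      CountDownLatch gate2 = new CountDownLatch(1);
      Future<String> fixed = new FakeInputStream().drainViaCallable(pool, gate2);
      gate2.countDown();
      return new String[] {buggy.get(), fixed.get()};
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    String[] results = runBoth();
    System.out.println("field lambda: " + results[0]);  // nothing to drain
    System.out.println("callable:     " + results[1]);  // drained http-1
  }
}
```

In the lambda form the stream is never seen by the task, so in the
real code nothing would drain it or return its connection; in the
Callable form the reference is fixed at construction time.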

The class is used in both the classic and prefetching s3a input streams.

Also, calling unbuffer() switches the S3AInputStream from adaptive
to random IO mode; that is, it is considered a cue that future
IO will not be sequential, whole-file reads.

Contributed by Steve Loughran.

2022-08-31 16:52:12 +01:00

.github

HADOOP-15184. Add GitHub pull request template. (#1419)

2019-09-11 11:10:11 +09:00

dev-support

HADOOP-11867. Add a high-performance vectored read API. (#3904)

2022-06-23 17:09:16 -05:00

hadoop-assemblies

HADOOP-18305. Preparing for 3.3.4 release: branch-3.3 version => 3.3.9 (#4482)

2022-06-22 13:09:50 +01:00

hadoop-build-tools

HADOOP-18305. Preparing for 3.3.4 release: branch-3.3 version => 3.3.9 (#4482)

2022-06-22 13:09:50 +01:00

hadoop-client-modules

HADOOP-18332. Remove rs-api dependency by downgrading jackson to 2.12.7. (#4552)

2022-07-16 18:18:52 +01:00

hadoop-cloud-storage-project

HADOOP-18305. Preparing for 3.3.4 release: branch-3.3 version => 3.3.9 (#4482)

2022-06-22 13:09:50 +01:00

hadoop-common-project

HADOOP-18375. Fix failure of shelltest for hadoop_add_ldlibpath. (#4652)

2022-08-30 10:44:11 +00:00

hadoop-dist

HADOOP-18305. Preparing for 3.3.4 release: branch-3.3 version => 3.3.9 (#4482)

2022-06-22 13:09:50 +01:00

hadoop-hdfs-project

HDFS-16684. Exclude the current JournalNode (#4786)

2022-08-28 11:15:04 -07:00

hadoop-mapreduce-project

MAPREDUCE-7403. manifest-committer dynamic partitioning support. (#4728)

2022-08-24 11:19:05 +01:00

hadoop-maven-plugins

HADOOP-18305. Preparing for 3.3.4 release: branch-3.3 version => 3.3.9 (#4482)

2022-06-22 13:09:50 +01:00

hadoop-minicluster

HADOOP-18305. Preparing for 3.3.4 release: branch-3.3 version => 3.3.9 (#4482)

2022-06-22 13:09:50 +01:00

hadoop-project

Revert "HADOOP-18417. Upgrade to M7 of surefire plugin (#4795)"

2022-08-25 03:53:34 +05:30

hadoop-project-dist

HADOOP-18305. Release Hadoop 3.3.4: upstream changelog and jdiff files

2022-08-05 14:02:28 +01:00

hadoop-tools

HADOOP-18410. S3AInputStream.unbuffer() does not release http connections (#4766)

2022-08-31 16:52:12 +01:00

hadoop-yarn-project

YARN-11248. Add unit test for FINISHED_CONTAINERS_PULLED_BY_AM event on DECOMMISSIONING (#4721)

2022-08-16 19:07:42 +09:00

licenses

HADOOP-17666. Update LICENSE for 3.3.1 (#3011)

2021-05-21 18:15:48 -07:00

licenses-binary

HADOOP-17666. Update LICENSE for 3.3.1 (#3011)

2021-05-21 18:15:48 -07:00

.gitattributes

…

.gitignore

YARN-10407. Add phantomjsdriver.log to gitignore. (#2244)

2021-02-17 10:28:17 +09:00

BUILDING.txt

HADOOP-18214. Update BUILDING.txt (#3811)

2022-04-21 18:39:51 +01:00

LICENSE-binary

HADOOP-18333. Upgrade jetty version to 9.4.48.v20220622 (#4600)

2022-08-24 08:16:49 +08:00

LICENSE-binary-yarn-applications-catalog-webapp

HADOOP-17666. Update LICENSE for 3.3.1 (#3011)

2021-05-21 18:15:48 -07:00

LICENSE-binary-yarn-ui

HADOOP-17666. Update LICENSE for 3.3.1 (#3011)

2021-05-21 18:15:48 -07:00

LICENSE.txt

HADOOP-18044. Hadoop - Upgrade to jQuery 3.6.0 (#3791)

2022-02-11 23:18:25 +08:00

NOTICE-binary

HADOOP-18068. upgrade AWS SDK to 1.12.132 (#3864)

2022-01-18 12:20:12 +00:00

NOTICE.txt

HADOOP-15958. Revisiting LICENSE and NOTICE files.

2019-08-27 13:47:12 +09:00

pom.xml

HADOOP-18305. Preparing for 3.3.4 release: branch-3.3 version => 3.3.9 (#4482)

2022-06-22 13:09:50 +01:00

README.txt

HADOOP-15958. Revisiting LICENSE and NOTICE files.

2019-08-27 13:47:12 +09:00

start-build-env.sh

HADOOP-18052. Support Apple Silicon in start-build-env.sh (#3817)

2021-12-23 18:14:16 +09:00

README.txt

For the latest information about Hadoop, please visit our website at:

   http://hadoop.apache.org/

and our wiki, at:

   https://cwiki.apache.org/confluence/display/HADOOP/

Languages

Java 92.9%

C++ 2.8%

C 1.8%

JavaScript 1.1%

Shell 0.5%

Other 0.6%