Steve Loughran e4331a73c9
HADOOP-16932. distcp copy calls getFileStatus() needlessly and can fail against S3 (#1936)
Contributed by Steve Loughran.

This strips out all the -p preservation options which have already been
processed when uploading a file before deciding whether or not to query
the far end for the status of the (existing/uploaded) file to see if any
other attributes need changing.

This will avoid 404 caching-related issues in S3, wherein a newly created
file can have a 404 entry in the S3 load balancer's cache from the
probes for the file's existence prior to the upload.

It partially addresses a regression caused by HADOOP-8143,
"Change distcp to have -pb on by default" that causes a resurfacing
of HADOOP-13145, "In DistCp, prevent unnecessary getFileStatus call when
not preserving metadata"

Change-Id: Ibc25d19e92548e6165eb8397157ebf89446333f7
2020-04-09 18:23:47 +01:00

For the latest information about Hadoop, please visit our website at:

   http://hadoop.apache.org/

and our wiki, at:

   https://cwiki.apache.org/confluence/display/HADOOP/
Description
No description provided
Readme 264 MiB
Languages
Java 92.9%
C++ 2.8%
C 1.8%
JavaScript 1.1%
Shell 0.5%
Other 0.6%