Commit Graph

26462 Commits

Author SHA1 Message Date
PJ Fanning
6a07b5dc10
HADOOP-18575. Make XML transformer factory more lenient (#5224)
Due diligence followup to
HADOOP-18469. Add secure XML parser factories to XMLUtils (#4940)

Contributed by P J Fanning
2022-12-18 12:25:10 +00:00
Steve Loughran
33785fc5ad
HADOOP-18577. Followup: javadoc fix (#5232)
Fixes a javadoc error which came with
HADOOP-18577. ABFS: Add probes of readahead fix (#5205)

Part of the HADOOP-18521 ABFS readahead fix; MUST be included.

Contributed by Steve Loughran
2022-12-18 12:19:33 +00:00
Chengbing Liu
ca3526da92
HADOOP-18567. LogThrottlingHelper: properly trigger dependent recorders in cases of infrequent logging (#5215)
Signed-off-by: Erik Krogen <xkrogen@apache.org>
Co-authored-by: Chengbing Liu <liuchengbing@qiyi.com>
2022-12-16 09:15:11 -08:00
Xing Lin
f7bdf6c667
HDFS-16852. Skip KeyProviderCache shutdown hook registration if already shutting down (#5160)
Signed-off-by: Erik Krogen <xkrogen@apache.org>
2022-12-16 08:46:14 -08:00
Happy-shi
c5b42d59d2
HDFS-16866. Fix a typo in Dispatcher (#5202)
Signed-off-by: Tao Li <tomscut@apache.org>
2022-12-16 11:07:41 +08:00
Steve Loughran
cf1244492d
HADOOP-18577. ABFS: Add probes of readahead fix (#5205)
Followup patch to HADOOP-18456 as part of HADOOP-18521,
ABFS ReadBufferManager buffer sharing across concurrent HTTP requests

Add probes of the readahead fix to aid in checking the safety of the
hadoop ABFS client across different releases.

* ReadBufferManager constructor logs the fact it is safe at TRACE
* AbfsInputStream declares it is fixed in toString()
  by including "fs.azure.capability.readahead.safe" in the
  result.

The ABFS FileSystem hasPathCapability("fs.azure.capability.readahead.safe")
probe returns true to indicate the client's readahead manager has been fixed
to be safe when prefetching.

All Hadoop releases for which this probe returns false
and for which the probe "fs.capability.etags.available"
returns true are at risk of returning invalid data when reading
ADLS Gen2/Azure storage data.

Contributed by Steve Loughran.
2022-12-15 17:08:25 +00:00
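The risk check described in HADOOP-18577 above can be sketched as follows. This is a minimal illustration, not the ABFS client code: `CapabilityProbe` is a hypothetical stand-in for the filesystem's `hasPathCapability(path, capability)` call, and the class and method names are invented for the example. Only the two capability strings come from the commit message.

```java
import java.util.Set;

/** Hypothetical stand-in for a filesystem's hasPathCapability(path, capability) probe. */
interface CapabilityProbe {
    boolean hasPathCapability(String path, String capability);
}

final class ReadaheadSafetyCheck {
    static final String READAHEAD_SAFE = "fs.azure.capability.readahead.safe";
    static final String ETAGS_AVAILABLE = "fs.capability.etags.available";

    private ReadaheadSafetyCheck() {
    }

    /**
     * A client is treated as at risk when the readahead-safe probe is false
     * while the etags probe is true, as stated in the commit message.
     */
    static boolean isAtRisk(CapabilityProbe fs, String path) {
        boolean readaheadSafe = fs.hasPathCapability(path, READAHEAD_SAFE);
        boolean etagsAvailable = fs.hasPathCapability(path, ETAGS_AVAILABLE);
        return !readaheadSafe && etagsAvailable;
    }

    /** Builds a fake probe that answers true only for the listed capabilities. */
    static CapabilityProbe probeWith(Set<String> supported) {
        return (path, capability) -> supported.contains(capability);
    }
}
```

Against a real client, the probe would be the filesystem instance itself; the fake probe here only exists so the decision logic can be exercised in isolation.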
Steve Loughran
5f08e51b72
HADOOP-18561. Update commons-net to 3.9.0 (#5214)
Addresses CVE-2021-37533, which *only* relates to FTP.

Applications not using the ftp:// filesystem are not affected.
As anyone who has used it will know, that filesystem is very
minimal and so rarely used; it is not a critical part of the project.

Furthermore, the FTP-related issue is at worst information leakage
if someone connects to a malicious server.

This is a due diligence PR rather than an emergency fix.

Contributed by Steve Loughran
2022-12-15 16:45:05 +00:00
Steve Loughran
f7b1bb4dcc
HADOOP-18573. Improve error reporting on non-standard kerberos names (#5221)
The kerberos RFC does not declare any restriction on
characters used in kerberos names, though
implementations MAY be more restrictive.

If the kerberos controller supports use of non-conventional
principal names *and the kerberos admin chooses to use them*
this can confuse some of the parsing.

The obvious solution is for the enterprise admins to "not do that"
as a lot of things break, bits of hadoop included.

Harden the hadoop code slightly so at least we fail more gracefully,
so people can then get in touch with their sysadmin and tell them
to stop it.
2022-12-15 11:42:36 +00:00
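The "fail more gracefully" idea in HADOOP-18573 above can be illustrated with a small parser that rejects names it cannot handle with a descriptive error. This is a sketch under assumptions: the `PrincipalParser` class, its simplified `primary[/instance][@REALM]` pattern, and the error text are all hypothetical and are not taken from the Hadoop code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PrincipalParser {
    // Assumed, simplified shape: primary[/instance][@REALM]
    private static final Pattern NAME =
        Pattern.compile("([^/@]+)(?:/([^/@]+))?(?:@([^/@]+))?");

    private PrincipalParser() {
    }

    /**
     * Splits a principal into {primary, instance, realm}; missing parts are null.
     * Names that do not fit the assumed shape fail with a message that quotes
     * the offending name, instead of failing later with an obscure error.
     */
    static String[] parse(String principal) {
        if (principal == null || principal.isEmpty()) {
            throw new IllegalArgumentException("Kerberos principal name is empty");
        }
        Matcher m = NAME.matcher(principal);
        if (!m.matches()) {
            throw new IllegalArgumentException(
                "Cannot parse Kerberos principal name '" + principal
                    + "': expected primary[/instance][@REALM]");
        }
        return new String[] {m.group(1), m.group(2), m.group(3)};
    }
}
```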
Mehakmeet Singh
32414cfe46
HADOOP-18574. Changing log level of IOStatistics increment to make the DEBUG logs less noisy (#5223)
Contributed by: Mehakmeet Singh
2022-12-15 10:19:18 +05:30
slfan1989
6172c3192d
YARN-11358. [Federation] Add FederationInterceptor#allow-partial-result config. (#5056) 2022-12-14 14:37:56 -08:00
Steve Loughran
aaf92fe183
HADOOP-18526. Leak of S3AInstrumentation instances via hadoop Metrics references (#5144)
This has triggered an OOM in a process which was churning through s3a fs
instances; the increased memory footprint of IOStatistics amplified what
must have been a long-standing issue with FS instances being created
and not closed()

*  Makes sure instrumentation is closed when the FS is closed.
*  Uses a weak reference from metrics to instrumentation, so even
   if the FS wasn't closed (see HADOOP-18478), this back reference
   would not cause the S3AInstrumentation reference to be retained.
*  If S3AFileSystem is configured to log at TRACE it will log the
   calling stack of initialize(), to help identify where the
   instance is being created. This should help track down
   the cause of instance leakage.

Contributed by Steve Loughran.
2022-12-14 18:21:03 +00:00
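The weak back-reference described in HADOOP-18526 above can be sketched with `java.lang.ref.WeakReference`. The two classes below are hypothetical stand-ins (they are not `S3AInstrumentation` or the Hadoop metrics classes); the sketch only shows why a metrics-side reference held this way does not keep the instrumentation object alive.

```java
import java.lang.ref.WeakReference;

/** Hypothetical stand-in for a per-filesystem instrumentation object. */
final class DemoInstrumentation {
    private long bytesRead;

    void addBytes(long n) {
        bytesRead += n;
    }

    long bytesRead() {
        return bytesRead;
    }
}

/** Hypothetical metrics source that must not pin the instrumentation in memory. */
final class WeakMetricsSource {
    private final WeakReference<DemoInstrumentation> ref;

    WeakMetricsSource(DemoInstrumentation instrumentation) {
        // Weak reference: if nothing else holds the instrumentation,
        // the garbage collector is free to reclaim it.
        this.ref = new WeakReference<>(instrumentation);
    }

    /** Returns the current counter, or -1 if the instrumentation has been collected. */
    long snapshot() {
        DemoInstrumentation instrumentation = ref.get();
        return instrumentation == null ? -1 : instrumentation.bytesRead();
    }
}
```

The caller has to tolerate the reference being cleared, which is why `snapshot()` checks for null rather than assuming the target is still present.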
slfan1989
63b9a6a2b6
YARN-11350. [Federation] Router Support DelegationToken With ZK. (#5131) 2022-12-14 09:09:38 -08:00
Doroszlai, Attila
4de8791deb
HADOOP-18569. NFS Gateway may release buffer too early (#5212)
(cherry picked from commit df4812df65)
2022-12-14 15:55:44 +01:00
Steve Loughran
1cecf8ab70
HADOOP-18183. s3a audit logs to publish range start/end of GET requests. (#5110)
The start and end of the range is set in a new audit param "rg",
e.g. "?rg=100-200"

Contributed by Ankit Saurabh
2022-12-14 14:01:28 +00:00
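The "rg" audit parameter from HADOOP-18183 above could be formatted along these lines. This is only a sketch: the `AuditRange` class and its validation rules are hypothetical, and the commit message does not say how the S3A auditor builds the string; only the "?rg=100-200" shape comes from the source.

```java
final class AuditRange {
    private AuditRange() {
    }

    /** Formats the start and end of a GET range in the "?rg=start-end" shape. */
    static String rangeParam(long start, long end) {
        if (start < 0 || end < start) {
            throw new IllegalArgumentException(
                "Invalid range: start=" + start + ", end=" + end);
        }
        return "?rg=" + start + "-" + end;
    }
}
```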
Ashutosh Gupta
85ec7969a7
MAPREDUCE-7428. Fix failures related to Junit 4 to Junit 5 upgrade in org.apache.hadoop.mapreduce.v2.app.webapp (#5209)
Contributed by: Ashutosh Gupta
2022-12-14 12:54:08 +00:00
curie71
fdcbc8b072
HDFS-16868. Fix audit log duplicate issue when an ACE occurs in FSNamesystem. (#5206). Contributed by Beibei Zhao.
Signed-off-by: Chris Nauroth <cnauroth@apache.org>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2022-12-13 12:24:51 +08:00
slfan1989
a71aaef9a9
YARN-11385. Fix hadoop-yarn-server-common module Java Doc Errors. (#5182) 2022-12-10 15:03:49 -08:00
Jack Richard Buggins
a46b20d25f
HADOOP-18329. Support for IBM Semeru JVM > 11.0.15.0 Vendor Name Changes (#4537)
The static boolean PlatformName.IBM_JAVA now identifies
Java 11+ IBM Semeru runtimes as IBM JVM releases.

Contributed by Jack Buggins.
2022-12-10 14:27:05 +00:00
Steve Loughran
0a7dfcc332
HADOOP-18546. Followup: ITestReadBufferManager fix (#5198)
This is a followup to the original HADOOP-18546
patch; cherry-picks of that should include this
or follow up with it.

Removes risk of race conditions in assertions
of ITestReadBufferManager on the state of the in-progress
and completed queues by removing assertions brittle
to race conditions in scheduling/network IO

* Waits for the executor pool shutdown to complete before
  making any assertions
* Assertions that there are no in progress reads MUST be
  cut as there may be some and they won't be cancelled.
* Assertions that the completed list is without buffers
  of a closed stream are brittle because if there was
  an in progress stream which completed after stream.close()
  then it will end up in the list.

Contributed by Steve Loughran
2022-12-09 13:47:11 +00:00
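The first point of HADOOP-18546's followup above, waiting for pool shutdown before asserting, is a general pattern with `java.util.concurrent`. The sketch below is not the ITestReadBufferManager code; the class name, pool size, and task body are assumptions made for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class AwaitBeforeAssert {
    private AwaitBeforeAssert() {
    }

    /**
     * Runs the given number of tasks, shuts the pool down and waits for it to
     * terminate. Only then is the completed count read, so a caller's
     * assertion on the result cannot race with tasks that are still running.
     */
    static int runAndAwait(int tasks, long timeoutSeconds) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            pool.execute(completed::incrementAndGet);
        }
        pool.shutdown(); // no new tasks; queued ones still run
        try {
            if (!pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS)) {
                pool.shutdownNow();
                throw new IllegalStateException("pool did not terminate in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            pool.shutdownNow();
            throw new IllegalStateException("interrupted while waiting for pool", e);
        }
        return completed.get();
    }
}
```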
Anurag P
e76616f690
HDFS-16860 Upgrade moment.min.js to 2.29.4 (#5194) 2022-12-09 11:18:44 +05:30
dingshun3016
2fa540dca1
HDFS-16858. Dynamically adjust max slow disks to exclude. (#5180)
Reviewed-by: Chris Nauroth <cnauroth@apache.org>
Reviewed-by: slfan1989 <55643692+slfan1989@users.noreply.github.com>
Signed-off-by: Tao Li <tomscut@apache.org>
2022-12-09 08:10:04 +08:00
K0K0V0K
ee7d1787cd
YARN-11390. TestResourceTrackerService.testNodeRemovalNormally: Shutdown nodes should be 0 now expected: <1> but was: <0> (#5190)
Reviewed-by: Peter Szucs
Signed-off-by: Chris Nauroth <cnauroth@apache.org>
2022-12-08 09:52:19 -08:00
Oleksandr Shevchenko
0a4528cd7f
HADOOP-18563. Misleading AWS SDK S3 timeout configuration comment (#5197)
Contributed by Oleksandr Shevchenko
2022-12-08 15:07:59 +00:00
Pranav Saxena
c67c2b7569
HADOOP-18546. ABFS. disable purging list of in progress reads in abfs stream close() (#5176)
This addresses HADOOP-18521, "ABFS ReadBufferManager buffer sharing
across concurrent HTTP requests" by not trying to cancel
in progress reads.

It supersedes HADOOP-18528, which disables the prefetching.
If that patch is applied *after* this one, prefetching
will be disabled.

As well as changing the default value in the code,
core-default.xml is updated to set
fs.azure.enable.readahead = true

As a result, if Configuration.get("fs.azure.enable.readahead")
returns a non-null value, then it can be inferred that
it was set either in core-default.xml (the fix is present)
or in core-site.xml (someone asked for it).

Contributed by Pranav Saxena.
2022-12-07 20:15:45 +00:00
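The inference described in HADOOP-18546 above reduces to a null check on the option's value. In this sketch a plain `Map` stands in for the loaded configuration, so the `ReadaheadConfigProbe` class and its method name are hypothetical; only the option name comes from the commit message.

```java
import java.util.Map;

final class ReadaheadConfigProbe {
    static final String ENABLE_READAHEAD = "fs.azure.enable.readahead";

    private ReadaheadConfigProbe() {
    }

    /**
     * True when the option has any value at all. Per the commit message, that
     * means it was set in core-default.xml (the fix is present) or in
     * core-site.xml (someone asked for it); the value itself is not inspected.
     */
    static boolean readaheadOptionDeclared(Map<String, String> conf) {
        return conf.get(ENABLE_READAHEAD) != null;
    }
}
```

Note that the check cannot tell the two cases apart; it only distinguishes "declared somewhere" from "not declared anywhere".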
Murali Krishna
2e88096266
HADOOP-18538. Upgrade kafka to 2.8.2 (#5164)
Signed-off-by: Brahma Reddy Battula <brahma@apache.org>
2022-12-06 22:27:46 +05:30
slfan1989
f71fd885be
YARN-11373. [Federation] Support refreshQueues refreshNodes API's for Federation. (#5146) 2022-12-06 08:17:05 -08:00
Akshat Bordia
86ac1ad9e5
YARN-10978. Fix ApplicationClassLoader to Correctly Expand Glob for Windows Path (#3558) 2022-12-06 16:39:49 +05:30
Gautham B A
dadd3d9138
YARN-11386. Fix issue with classpath resolution (#5183)
* This PR ensures that all the special notations such as
  <CPS> are resolved before getting added to classpath.
2022-12-06 16:32:26 +05:30
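Resolving a notation such as `<CPS>` before building a classpath, as YARN-11386 above describes, might look like the following. This is a sketch only: it assumes `<CPS>` denotes a cross-platform classpath separator token, and the `ClasspathExpander` class is hypothetical rather than the code changed by the PR.

```java
import java.io.File;

final class ClasspathExpander {
    /** Assumed meaning: a placeholder for the platform's classpath separator. */
    static final String CPS = "<CPS>";

    private ClasspathExpander() {
    }

    /** Replaces every placeholder with the given separator. */
    static String resolve(String classpath, String separator) {
        return classpath.replace(CPS, separator);
    }

    /** Resolves using the separator of the JVM this code runs on. */
    static String resolve(String classpath) {
        return resolve(classpath, File.pathSeparator);
    }
}
```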
Steve Loughran
b666075a41
HADOOP-18560. AvroFSInput opens a stream twice and discards the second one without closing (#5186)
This is needed for branches with the hadoop-common changes of
HADOOP-16202. Enhanced openFile()
2022-12-06 09:58:51 +00:00
Steve Loughran
84b33b897c
HADOOP-18470. index.md update for 3.3.5 release 2022-12-05 16:13:24 +00:00
ZanderXu
8a9bdb1edc
HDFS-16837. [RBF SBN] ClientGSIContext should merge RouterFederatedStates to get the max state id for each namespaces (#5123) 2022-12-05 16:15:47 +08:00
dingshun3016
02afb9ebe1
HDFS-16809. EC striped block is not sufficient when doing in maintenance. (#5050) 2022-12-05 16:34:51 +09:00
slfan1989
60e0fe8709
YARN-11381. Fix hadoop-yarn-common module Java Doc Errors. (#5179) 2022-12-02 10:56:17 -08:00
slfan1989
4af4997e11
YARN-11158. Support (Create/Renew/Cancel) DelegationToken API's for Federation. (#5104) 2022-12-01 13:20:21 -08:00
Szilard Nemeth
5440c75c4a
YARN-10946. AbstractCSQueue: Create separate class for constructing Queue API objects. Contributed by Peter Szucs 2022-12-01 15:11:58 +01:00
litao
2067fcb646
HDFS-16550. Allow JN edit cache size to be set as a fraction of heap memory (#4209) 2022-11-30 07:44:21 -08:00
Anmol Asrani
7786600744
HADOOP-18457. ABFS: Support account level throttling (#5034)
This allows abfs request throttling to be shared across all
abfs connections talking to containers belonging to the same abfs storage
account, as that is the level at which IO throttling is applied.

The feature is enabled/disabled through the configuration option
"fs.azure.account.throttling.enabled";
the default is "true".

Contributed by Anmol Asrani
2022-11-30 13:05:31 +00:00
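Sharing throttling state per storage account, as HADOOP-18457 above describes, can be pictured as a registry keyed by account name. Everything in this sketch is hypothetical (the `AccountThrottleRegistry` class, its `ThrottleState` holder, and the way the account is taken from a `container@account` authority string); it is not the ABFS implementation.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

final class AccountThrottleRegistry {
    /** Hypothetical shared state; a real throttler would track far more. */
    static final class ThrottleState {
        final AtomicLong throttledRequests = new AtomicLong();
    }

    private final ConcurrentMap<String, ThrottleState> byAccount = new ConcurrentHashMap<>();

    /** Extracts the account part of an authority shaped like container@account-host. */
    static String accountOf(String authority) {
        int at = authority.indexOf('@');
        return at < 0 ? authority : authority.substring(at + 1);
    }

    /** All containers of one account receive the same state object. */
    ThrottleState stateFor(String authority) {
        return byAccount.computeIfAbsent(accountOf(authority), key -> new ThrottleState());
    }
}
```

Keying by account rather than by container is the point of the sketch: a throttling signal seen on one container then slows requests to its siblings too.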
Kidd5368
72749a4ff8
HDFS-16839 It should consider EC reconstruction work when we determine if a node is busy (#5128)
Co-authored-by: Takanobu Asanuma <tasanuma@apache.org>
Reviewed-by: Tao Li <tomscut@apache.org>
2022-11-30 09:43:15 +08:00
Owen O'Malley
03471a736c
HDFS-16851: RBF: Add a utility to dump the StateStore. (#5155) 2022-11-29 22:12:35 +00:00
HarshitGupta11
0ef572abed
HADOOP-18530. ChecksumFileSystem::readVectored might return byte buffers not positioned at 0 (#5168)
Contributed by Harshit Gupta
2022-11-29 14:51:22 +00:00
caozhiqiang
35c65005d0
HDFS-16846. EC: Only EC blocks should be affected by max-streams-hard-limit configuration (#5143)
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2022-11-29 10:51:21 +09:00
Simbarashe Dzinamarira
909aeca86c
HDFS-16845: Adds configuration flag to allow clients to use router observer reads without using the ObserverReadProxyProvider. (#5142) 2022-11-29 00:49:10 +00:00
Simbarashe Dzinamarira
ec2856d79c
HDFS-16847: RBF: Prevents StateStoreFileSystemImpl from committing tmp file after encountering an IOException. (#5145) 2022-11-29 00:47:01 +00:00
slfan1989
f93167e678
YARN-11380. Fix hadoop-yarn-api module Java Doc Errors. (#5152). Contributed by Shilun Fan.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-11-28 18:54:32 +05:30
sreeb-msft
1a7acc403b
HADOOP-18498. ABFS: Remove unwanted ? prefix from SAS Tokens (#5136)
This commit parses SAS Tokens and removes the unwanted prefix of '?' from them, if present.

At present, SAS Tokens are provided to the driver through customer implementations of the SASTokenProvider interface. The SAS token providers should not assume that the token will be the first query parameter in the URIs that communicate with the backend. However, it was observed that certain public interfaces provided by Storage to generate SAS can include the '?' as the first character of the SAS Token, which would ideally be the case when it is the first query parameter. Thus, tokens that contain this prefix will lead to an error in the driver due to a clash of query parameters.

To avoid failures for use of such SAS tokens, after receiving the SAS Token from the provider, the code checks for whether any ? prefix is present or not. If yes, it is removed before further usage of the token. This way, users would not have to manually remove the prefix before passing it on as a configuration.

Contributed by Sree Bhattacharya
2022-11-28 11:38:13 +00:00
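The prefix handling described in HADOOP-18498 above amounts to dropping a single leading '?' when one is present. A minimal sketch, with a hypothetical `SasTokens` helper rather than the driver's own code:

```java
final class SasTokens {
    private SasTokens() {
    }

    /**
     * Returns the token without a leading '?', if it has one.
     * Tokens without the prefix (and null) are returned unchanged.
     */
    static String stripLeadingQuestionMark(String token) {
        if (token != null && token.startsWith("?")) {
            return token.substring(1);
        }
        return token;
    }
}
```

Doing this once, right after the token is obtained from the provider, means the rest of the request-building code can append it as an ordinary query parameter.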
PJ Fanning
e09e81abe4
HADOOP-18496: remove unused okhttp.version (#5140). Contributed by PJ Fanning.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-11-27 18:59:40 +05:30
slfan1989
1ddc9091f6
YARN-11381. Fix hadoop-yarn-common module Java Doc Errors. (#5153). Contributed by Shilun Fan.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-11-26 21:01:07 +05:30
ZanderXu
87429f443a
HDFS-16779. Add ErasureCodingPolicy information to the response description for GETFILESTATUS in WebHDFS.md (#4922) 2022-11-25 09:26:28 +08:00
ZanderXu
e0974298ce
HDFS-16826. [RBF SBN] ConnectionManager should advance the client stateId for each request (#5086) 2022-11-25 09:23:33 +08:00
huhaiyang
ef84d21867
HDFS-16841. Enhance the function of DebugAdmin#VerifyECCommand (#5137) 2022-11-24 09:17:27 +08:00