3854 Commits

Author SHA1 Message Date
Szilard Nemeth
e8fa192f07 YARN-9217. Nodemanager will fail to start if GPU is misconfigured on the node or GPU drivers missing. Contributed by Peter Bacsko 2019-08-21 16:44:22 +02:00
bibinchundatt
e684b17e6f YARN-5857. TestLogAggregationService.testFixedSizeThreadPool fails intermittently on trunk. Contributed by Bilwa S T. 2019-08-21 17:14:42 +05:30
Sunil G
0e0ddfaf24 YARN-2599. Standby RM should expose jmx endpoint. Contributed by Rohith Sharma K S. 2019-08-17 15:43:19 +05:30
Szilard Nemeth
9b8359bb08 YARN-9461. TestRMWebServicesDelegationTokenAuthentication.testCancelledDelegationToken fails with HTTP 400. Contributed by Peter Bacsko 2019-08-16 12:31:58 +02:00
Szilard Nemeth
4456ea67b9 YARN-8586. Extract log aggregation related fields and methods from RMAppImpl. Contributed by Peter Bacsko 2019-08-16 11:36:14 +02:00
Szilard Nemeth
2216ec54e5 YARN-9100. Add tests for GpuResourceAllocator and do minor code cleanup. Contributed by Peter Bacsko 2019-08-16 09:13:20 +02:00
Szilard Nemeth
2a05e0ff3b YARN-9749. TestAppLogAggregatorImpl#testDFSQuotaExceeded fails on trunk. Contributed by Adam Antal 2019-08-16 08:52:09 +02:00
Adam Antal
22c4f38c4b YARN-9679. Regular code cleanup in TestResourcePluginManager (#1122) 2019-08-15 17:32:05 +02:00
Szilard Nemeth
1845a83cec YARN-9488. Skip YARNFeatureNotEnabledException from ClientRMService. Contributed by Prabhu Joseph 2019-08-15 17:15:38 +02:00
HUAN-PING SU
167acd87da YARN-9683. Remove reapDockerContainerNoPid left behind by YARN-9074 (#1212) Contributed by Kevin Su.
Reviewed-by: Eric Yang <eyang@apache.org>
Reviewed-by: Adam Antal <adam.antal@cloudera.com>
2019-08-14 10:42:29 -07:00
Adam Antal
c89bdfacc8 YARN-9676. Add DEBUG and TRACE level messages to AppLogAggregatorImpl… (#1261)
* YARN-9676. Add DEBUG and TRACE level messages to AppLogAggregatorImpl and connected classes

* Using {} placeholder, and increasing loglevel if log aggregation failed.
2019-08-14 17:35:16 +02:00
Szilard Nemeth
3e0410449f YARN-9133. Make tests more easy to comprehend in TestGpuResourceHandler. Contributed by Peter Bacsko 2019-08-14 17:13:54 +02:00
Szilard Nemeth
e5e609384f YARN-9140. Code cleanup in ResourcePluginManager.initialize and in TestResourcePluginManager. Contributed by Peter Bacsko 2019-08-14 16:58:22 +02:00
bibinchundatt
89a53c7eb4 YARN-9747. Reduce additional namenode call by EntityGroupFSTimelineStore#cleanLogs. Contributed by Prabhu Joseph. 2019-08-14 13:46:23 +05:30
Eric Badger
2ac029b949 YARN-9442. container working directory has group read permissions. Contributed by Jim Brennan. 2019-08-13 16:21:18 +00:00
Abhishek Modi
b4097b96a3 YARN-9744. RollingLevelDBTimelineStore.getEntityByTime fails with NPE. Contributed by Prabhu Joseph. 2019-08-13 19:04:00 +05:30
Szilard Nemeth
e4b538bbda YARN-9723. ApplicationPlacementContext is not required for terminated jobs during recovery. Contributed by Prabhu Joseph 2019-08-12 15:15:43 +02:00
Abhishek Modi
13a5803ccf YARN-9464. Support pending resource metrics in RM's RESTful API. Contributed by Prabhu Joseph. 2019-08-12 14:31:24 +05:30
Abhishek Modi
8fbf8b2eb0 YARN-9722. PlacementRule logs object ID in place of queue name. Contributed by Prabhu Joseph. 2019-08-12 10:44:46 +05:30
Eric Yang
6ff0453ede YARN-9527. Prevent rogue Localizer Runner from downloading same file repeatly.
Contributed by Jim Brennan
2019-08-09 14:12:17 -04:00
Abhishek Modi
a79564fed0 YARN-9732. yarn.system-metrics-publisher.enabled=false is not honored by RM. Contributed by KWON BYUNGCHANG. 2019-08-09 22:25:30 +05:30
Szilard Nemeth
e0c21c6da9 YARN-9092. Create an object for cgroups mount enable and cgroups mount path as they belong together. Contributed by Gergely Pollak 2019-08-09 10:18:34 +02:00
Szilard Nemeth
742e30b473 YARN-9096: Some GpuResourcePlugin and ResourcePluginManager methods are synchronized unnecessarily. Contributed by Gergely Pollak 2019-08-09 09:59:19 +02:00
Szilard Nemeth
72d7e570a7 YARN-9094: Remove unused interface method: NodeResourceUpdaterPlugin#handleUpdatedResourceFromRM. Contributed by Gergely Pollak 2019-08-09 09:49:18 +02:00
Eric E Payne
3b38f2019e YARN-9685: NPE when rendering the info table of leaf queue in non-accessible partitions. Contributed by Tao Yang. 2019-08-08 12:37:50 +00:00
hunshenshi
22d7d1f8bf YARN-9601.Potential NPE in ZookeeperFederationStateStore#getPoliciesConfigurations (#908) Contributed by hunshenshi. 2019-08-07 21:26:14 -07:00
Haibo Chen
f51702d539 YARN-9559. Create AbstractContainersLauncher for pluggable ContainersLauncher logic. (Contributed by Jonathan Hung) 2019-08-06 13:52:30 -07:00
HUAN-PING SU
7c2042a44d YARN-9678. Addendum: TestGpuResourceHandler / TestFpgaResourceHandler should be renamed. Contributed by kevin su.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
2019-08-06 10:21:55 -07:00
HUAN-PING SU
b8bf09ba3d YARN-9678. TestGpuResourceHandler / TestFpgaResourceHandler should be renamed. Contributed by kevin su.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
2019-08-06 09:05:53 -07:00
Eric Yang
d6697da5e8 YARN-9667. Use setbuf with line buffer to reduce fflush complexity in container-executor.
Contributed by Peter Bacsko
2019-08-05 13:59:12 -04:00
Szilard Nemeth
54ac80176e Logging fileSize of log files under NM Local Dir. Contributed by Prabhu Joseph 2019-08-02 13:38:06 +02:00
Vidura Mudalige
1930a7bf60 YARN-9093. Remove commented code block from the beginning of Tes… (#444) 2019-08-02 13:16:19 +02:00
Adam Antal
95fc38f2e9 YARN-9375. Use Configured in GpuDiscoverer and FpgaDiscoverer (#1131)
Contributed by Adam Antal
2019-08-02 11:24:09 +02:00
Eric E Payne
42683aef1a YARN-9596: QueueMetrics has incorrect metrics when labelled partitions are involved. Contributed by Muhammad Samir Khan. 2019-07-30 18:58:36 +00:00
Eric Yang
c34ceb5fde YARN-9568. Fixed NPE in MiniYarnCluster during FileSystemNodeAttributeStore.recover.
Contributed by Steve Loughran
2019-07-18 12:30:53 -04:00
Haibo Chen
5915c902aa YARN-9646. DistributedShell tests failed to bind to a local host name. (Contributed by Ray Yang) 2019-07-16 17:36:49 -07:00
bibinchundatt
7a93be0f60 YARN-9645. Fix Invalid event FINISHED_CONTAINERS_PULLED_BY_AM at NEW on NM restart. Contributed by Bilwa S T. 2019-07-16 14:03:22 +05:30
Szilard Nemeth
18ee1092b4 YARN-9127. Create more tests to verify GpuDeviceInformationParser. Contributed by Peter Bacsko 2019-07-15 11:59:11 +02:00
Szilard Nemeth
91ce09e706 YARN-9360. Do not expose innards of QueueMetrics object into FSLeafQueue#computeMaxAMResource. Contributed by Peter Bacsko 2019-07-15 10:47:20 +02:00
Szilard Nemeth
61b0c2bb7c YARN-9337. GPU auto-discovery script runs even when the resource is given by hand. Contributed by Adam Antal 2019-07-12 17:28:14 +02:00
Szilard Nemeth
8b3c6791b1 YARN-9135. NM State store ResourceMappings serialization are tested with Strings instead of real Device objects. Contributed by Peter Bacsko 2019-07-12 17:20:42 +02:00
Szilard Nemeth
c416284bb7 YARN-9235. If linux container executor is not set for a GPU cluster GpuResourceHandlerImpl is not initialized and NPE is thrown. Contributed by Antal Balint Steinbach, Adam Antal 2019-07-12 16:51:58 +02:00
Haibo Chen
9b54dd7186 YARN-9668. UGI conf doesn't read user overridden configurations on RM and NM startup. (Contributed by Jonathan Hung) 2019-07-11 13:57:08 -07:00
Akira Ajisaka
ccaa99c923 HADOOP-16381. The JSON License is included in binary tarball via azure-documentdb:1.16.2. Contributed by Sushil Ks. 2019-07-11 13:49:42 +09:00
Szilard Nemeth
a2a8be18cb YARN-9629. Support configurable MIN_LOG_ROLLING_INTERVAL. Contributed by Adam Antal. 2019-07-03 13:45:00 +02:00
Weiwei Yang
15d82fcb75 YARN-9658. Fix UT failures in TestLeafQueue. Contributed by Tao Yang. 2019-07-03 12:08:45 +08:00
Sunil G
e966edd025 YARN-9644. First RMContext object is always leaked during switch over. Contributed by Bibin A Chundatt. 2019-07-02 12:18:16 +05:30
Weiwei Yang
570eee30e5 YARN-9655. AllocateResponse in FederationInterceptor lost applicationPriority. Contributed by hunshenshi. 2019-07-02 09:55:25 +08:00
hunshenshi
b1dafc3506 YARN-9661:Fix typo in LocalityMulticastAMRMProxyPolicy.java and AbstractConfigurableFederationPolicy.java (#1042) 2019-07-01 10:46:33 -07:00
Eric Yang
29465bf169 YARN-9560. Restructure DockerLinuxContainerRuntime to extend OCIContainerRuntime.
Contributed by Eric Badger, Jim Brennan, Craig Condit
2019-06-28 17:18:53 -04:00