From 978c487672edd9f18d8e2c9a1da063ae789bd774 Mon Sep 17 00:00:00 2001
From: Karthick Narendran
Date: Thu, 23 Jan 2020 20:35:57 -0800
Subject: [PATCH] HADOOP-16826. ABFS: update abfs.md to include config keys for identity transformation

Contributed by Karthick Narendran
---
 .../hadoop-azure/src/site/markdown/abfs.md    | 31 +++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/hadoop-tools/hadoop-azure/src/site/markdown/abfs.md b/hadoop-tools/hadoop-azure/src/site/markdown/abfs.md
index 1d01e021f7..79ec2ad786 100644
--- a/hadoop-tools/hadoop-azure/src/site/markdown/abfs.md
+++ b/hadoop-tools/hadoop-azure/src/site/markdown/abfs.md
@@ -857,6 +857,37 @@ signon page for humans, even though it is a machine calling.
 1. The URL is wrong —it is pointing at a web page unrelated to OAuth2.0
 1. There's a proxy server in the way trying to return helpful instructions.
 
+### `java.io.IOException: The ownership on the staging directory /tmp/hadoop-yarn/staging/user1/.staging is not as expected. It is owned by . The directory must be owned by the submitter user1 or user1`
+
+When using [Azure Managed Identities](https://docs.microsoft.com/en-us/azure/active-directory/managed-identities-azure-resources/overview), the files/directories in ADLS Gen2 are owned by default by the service principal object ID (the principal ID), so submitting jobs as the local OS user 'user1' results in the above exception.
+
+The fix is to map the ownership to the local OS user by adding the properties below to `core-site.xml`.
+
+```xml
+<property>
+  <name>fs.azure.identity.transformer.service.principal.id</name>
+  <value>service principal object id</value>
+  <description>
+  An Azure Active Directory object ID (oid) used as the replacement for names contained
+  in the list specified by “fs.azure.identity.transformer.service.principal.substitution.list”.
+  Note that instead of setting the oid, you can also set $superuser here.
+  </description>
+</property>
+<property>
+  <name>fs.azure.identity.transformer.service.principal.substitution.list</name>
+  <value>user1</value>
+  <description>
+  A comma-separated list of names to be replaced with the service principal ID specified by
+  “fs.azure.identity.transformer.service.principal.id”. This substitution occurs
+  when setOwner, setAcl, modifyAclEntries, or removeAclEntries are invoked with identities
+  contained in the substitution list. Note that in a non-secure cluster, the asterisk symbol *
+  can be used to match all users/groups.
+  </description>
+</property>
+```
+
+Once the above properties are configured, `hdfs dfs -ls abfs://container1@abfswales1.dfs.core.windows.net/` shows that the ADLS Gen2 files/directories are now owned by 'user1'.
+
 ## Testing ABFS
 
 See the relevant section in [Testing Azure](testing_azure.html).
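
The substitution rule described by the two properties in this patch can be sketched in a few lines of Python. This is only an illustration of the documented behaviour, not the ABFS implementation: the function name `to_remote_identity` and the placeholder object ID are hypothetical, and the sketch does not check whether the cluster is secure, which the description says matters for the `*` wildcard.

```python
def to_remote_identity(name, service_principal_id, substitution_list):
    """Return the identity that would be sent to the store for a local name.

    Hypothetical helper: models the rule that names in the comma-separated
    substitution list are replaced with the service principal ID.
    """
    names = [n.strip() for n in substitution_list.split(",") if n.strip()]
    # "*" stands for "match every user/group" (documented for non-secure clusters).
    if "*" in names or name in names:
        return service_principal_id
    return name


# Placeholder object ID, for illustration only.
SP_ID = "00000000-0000-0000-0000-000000000000"

print(to_remote_identity("user1", SP_ID, "user1"))  # substituted with SP_ID
print(to_remote_identity("user2", SP_ID, "user1"))  # left as "user2"
print(to_remote_identity("user2", SP_ID, "*"))      # substituted with SP_ID
```

With `user1` in the substitution list, calls such as setOwner made on behalf of `user1` would carry the service principal ID, which is why the staging directory ownership check in the exception above can then succeed.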