Hive user impersonation

Vibushanan Somasundaram
Sep 1, 2021


This article addresses a real-life authorization requirement that arises when Hive data is exposed through an application. A common pattern when building such an application is to authenticate the user through a single sign-on (SSO) service such as PingID. The application context then holds only the user ID, and authorization has to be enforced from that ID alone.

User Impersonation:

A powerful technique that many enterprise applications use in a Kerberized cluster is impersonation (proxying): the application connects as a superuser, but Hive applies the access privileges of the logged-in user. To set this up, create a superuser, say “hivesuperuser”. The application authenticates with this superuser’s keytab and then opens a JDBC connection that proxies the logged-in user.

Hive Configuration:

In core-site.xml, configure the following two Hadoop properties.

a. Set the property hadoop.proxyuser.<name>.hosts to specify the list of hostnames from which proxy requests are permitted.

<property>
  <name>hadoop.proxyuser.hivesuperuser.hosts</name>
  <value>*</value>
</property>

The above definition allows the user hivesuperuser to send proxy requests from all hosts.

b. Set the property hadoop.proxyuser.<name>.groups to specify the list of groups whose members can be impersonated.

<property>
  <name>hadoop.proxyuser.hivesuperuser.groups</name>
  <value>*</value>
</property>
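To make the effect of the two properties concrete, here is a minimal sketch of the decision they describe: a proxy request is accepted only if it comes from a permitted host and the impersonated user belongs to a permitted group, with * meaning "any". This is a simplified model for illustration and not Hadoop's actual implementation; the class ProxyUserPolicy and its method are hypothetical names, and it only does exact hostname matching.

```java
import java.util.Set;

// Hypothetical, simplified model of the hosts/groups check (not Hadoop code).
final class ProxyUserPolicy {
    private final Set<String> allowedHosts;   // value of hadoop.proxyuser.<name>.hosts
    private final Set<String> allowedGroups;  // value of hadoop.proxyuser.<name>.groups

    ProxyUserPolicy(Set<String> allowedHosts, Set<String> allowedGroups) {
        this.allowedHosts = allowedHosts;
        this.allowedGroups = allowedGroups;
    }

    // Returns true when the superuser may impersonate a user with the given
    // group memberships from the given host. Both conditions must hold.
    boolean mayImpersonate(String requestHost, Set<String> targetUserGroups) {
        boolean hostOk = allowedHosts.contains("*")
                || allowedHosts.contains(requestHost);
        boolean groupOk = allowedGroups.contains("*")
                || targetUserGroups.stream().anyMatch(allowedGroups::contains);
        return hostOk && groupOk;
    }
}
```

With both values set to * as in the configuration above, every request passes this check. Listing specific application hosts and groups instead would narrow what the superuser is allowed to do.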

Proxy Connection:

The following code snippet acquires the proxy connection:

import java.sql.Connection;
import java.sql.DriverManager;
import org.apache.hadoop.security.UserGroupInformation;

// Log in with the superuser's keytab and principal
UserGroupInformation.loginUserFromKeytab("user@domain", "keytabpath");

// Open a JDBC connection that proxies the logged-in user;
// `user` is the user ID taken from the application context
Class.forName("org.apache.hive.jdbc.HiveDriver");
Connection connection = DriverManager.getConnection(
    "jdbc:hive2://host1:port,host2:port,host3:2181/default"
    + ";hive.server2.proxy.user=" + user
    + ";serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;"
    + "transportMode=http;httpPath=cliservice;principal=hive/_HOST@domain");
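Because the user ID is concatenated directly into the JDBC URL, a value containing ";" could add or override URL parameters. One possible safeguard is to validate the ID before building the URL. The sketch below assumes user IDs are plain account names (letters, digits, dot, underscore, hyphen); the class HiveProxyUrl and its parameter names are hypothetical and not part of the Hive driver.

```java
import java.util.regex.Pattern;

// Hypothetical helper: builds the proxy JDBC URL after validating the user ID.
final class HiveProxyUrl {
    // Assumption: SSO user IDs contain only these characters.
    private static final Pattern SAFE_USER = Pattern.compile("[A-Za-z0-9._-]+");

    private HiveProxyUrl() {}

    static String build(String zkQuorum, String database,
                        String proxyUser, String hivePrincipal) {
        // Reject anything that could break out of the proxy.user parameter.
        if (proxyUser == null || !SAFE_USER.matcher(proxyUser).matches()) {
            throw new IllegalArgumentException("Unexpected characters in user ID");
        }
        return "jdbc:hive2://" + zkQuorum + "/" + database
                + ";hive.server2.proxy.user=" + proxyUser
                + ";serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2"
                + ";transportMode=http;httpPath=cliservice"
                + ";principal=" + hivePrincipal;
    }
}
```

The returned string can be passed to DriverManager.getConnection in place of the inline concatenation. If your SSO provider issues IDs in another format, such as email addresses, the pattern would need to be adjusted.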

Source code can be found here
