Use different tokens instead of forcing WD and all HMS to use the same delegatetoken in the kerberos environment
flaming-archer committed Apr 12, 2024
1 parent 1b9a36e commit ce8a67b
Showing 19 changed files with 359 additions and 372 deletions.
98 changes: 44 additions & 54 deletions HowToKerberize.md
@@ -24,75 +24,65 @@ In addition, because Kerberos authentication requires a delegation-token to prox
* Zookeeper to store delegation-token (Recommended)

### Configuration

Waggle Dance does not read Hadoop's `core-site.xml` so a general property providing Kerberos auth should be added to
the Hive configuration file `hive-site.xml`:
Waggle Dance `waggle-dance-server.yml` example:

```
<property>
<name>hadoop.security.authentication</name>
<value>KERBEROS</value>
</property>
port: 9083
verbose: true
#database-resolution: MANUAL
database-resolution: PREFIXED
yaml-storage:
  overwrite-config-on-shutdown: false
logging:
  config: file:/path/to/log4j2.xml
configuration-properties:
  hadoop.security.authentication: KERBEROS
  hive.metastore.sasl.enabled: true
  hive.metastore.kerberos.principal: hive/[email protected]
  hive.metastore.kerberos.keytab.file: /path/to/hive.keytab
  hive.cluster.delegation.token.store.class: org.apache.hadoop.hive.thrift.ZooKeeperTokenStore
  hive.cluster.delegation.token.store.zookeeper.connectString: zz1:2181,zz2:2181,zz3:2181
  hive.cluster.delegation.token.store.zookeeper.znode: /hive/cluster/wd_delegation
  hive.server2.authentication: KERBEROS
  hive.server2.authentication.kerberos.principal: hive/[email protected]
  hive.server2.authentication.kerberos.keytab: /path/to/hive.keytab
  hive.server2.authentication.client.kerberos.principal: hive/[email protected]
  hadoop.kerberos.keytab.login.autorenewal.enabled: true
  hadoop.proxyuser.hive.users: '*'
  hadoop.proxyuser.hive.hosts: '*'
```


Waggle Dance also needs a keytab file to communicate with the Metastore so the following properties should be present:
Waggle Dance `waggle-dance-federation.yml` example:
```
<property>
<name>hive.metastore.sasl.enabled</name>
<value>true</value>
</property>
<property>
<name>hive.metastore.kerberos.principal</name>
<value>hive/_HOST@YOUR_REALM.COM</value>
</property>
<property>
<name>hive.metastore.kerberos.keytab.file</name>
<value>/etc/hive.keytab</value>
</property>
primary-meta-store:
  database-prefix: ''
  name: local
  remote-meta-store-uris: thrift://ms1:9083
  access-control-type: READ_AND_WRITE_AND_CREATE
  impersonation-enabled: true
federated-meta-stores:
- remote-meta-store-uris: thrift://ms2:9083
  database-prefix: dw_
  name: remote
  impersonation-enabled: true
  access-control-type: READ_AND_WRITE_ON_DATABASE_WHITELIST
  writable-database-white-list:
  - .*
```
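
The `impersonation-enabled` flag in the federation example is set per metastore, and the client factory change in this commit passes it on as a string property next to the metastore URIs. The following is only a minimal sketch of that idea, not the project's code: the class name and the key `example.impersonation.enabled` are hypothetical, since the actual value of `CommonBeans.IMPERSONATION_ENABLED_KEY` is not shown in this diff.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ImpersonationFlagSketch {

  // Hypothetical key name, used for illustration only. The real constant is
  // CommonBeans.IMPERSONATION_ENABLED_KEY, whose value is not shown here.
  static final String IMPERSONATION_KEY = "example.impersonation.enabled";

  // Builds per-metastore client properties, storing the boolean flag as a string.
  static Map<String, String> clientProperties(String uris, boolean impersonationEnabled) {
    Map<String, String> properties = new LinkedHashMap<>();
    properties.put("hive.metastore.uris", uris);
    properties.put(IMPERSONATION_KEY, String.valueOf(impersonationEnabled));
    return properties;
  }

  // Reads the flag back; a missing key is treated as false.
  static boolean isImpersonationEnabled(Map<String, String> properties) {
    return Boolean.parseBoolean(properties.get(IMPERSONATION_KEY));
  }

  public static void main(String[] args) {
    Map<String, String> primary = clientProperties("thrift://ms1:9083", true);
    System.out.println(isImpersonationEnabled(primary));
  }
}
```

Carrying the flag as a string keeps it compatible with a plain string-to-string properties map; the trade-off is that a missing or misspelled key silently reads as `false`.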

In addition, all metastores need to use the Zookeeper shared token:
In the start script, adding the following JVM property may be useful:
```
<property>
<name>hive.cluster.delegation.token.store.class</name>
<value>org.apache.hadoop.hive.thrift.ZooKeeperTokenStore</value>
</property>
<property>
<name>hive.cluster.delegation.token.store.zookeeper.connectString</name>
<value>zk1:2181,zk2:2181,zk3:2181</value>
</property>
<property>
<name>hive.cluster.delegation.token.store.zookeeper.znode</name>
<value>/hive/token</value>
</property>
-Djavax.security.auth.useSubjectCredsOnly=false
```
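
For context, `javax.security.auth.useSubjectCredsOnly` is a standard JDK system property; when it is `false`, the GSS-API layer is allowed to obtain Kerberos credentials from sources other than the current `Subject`. The sketch below is an illustration rather than anything Waggle Dance ships (the class name is hypothetical). It shows how a process could check the effective value at startup, and how the property could be set programmatically, assuming that happens before any Kerberos code is initialised.

```java
final class SubjectCredsOnlySketch {

  static final String PROPERTY = "javax.security.auth.useSubjectCredsOnly";

  // Returns true only when the property is explicitly set to "false"
  // (case-insensitive). An unset property is treated as the JDK default, "true".
  static boolean allowsNonSubjectCredentials() {
    return "false".equalsIgnoreCase(System.getProperty(PROPERTY, "true"));
  }

  public static void main(String[] args) {
    // Intended to mirror -Djavax.security.auth.useSubjectCredsOnly=false,
    // assuming it runs before any GSS-API/Kerberos classes are used.
    System.setProperty(PROPERTY, "false");
    System.out.println(allowsNonSubjectCredentials());
  }
}
```

Passing the flag on the command line is usually the safer choice, because it cannot be affected by class initialisation order.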

If you are intending to use a Beeline client, the following properties may be valuable:
To connect to Waggle Dance via Beeline, change `hive.metastore.uris` in the Hive configuration file `hive-site.xml`:
```
<property>
<name>hive.server2.transport.mode</name>
<value>http</value>
</property>
<property>
<name>hive.server2.authentication</name>
<value>KERBEROS</value>
</property>
<property>
<name>hive.server2.authentication.kerberos.principal</name>
<value>hive/_HOST@YOUR_REALM.COM</value>
</property>
<property>
<name>hive.server2.authentication.kerberos.keytab</name>
<value>/etc/hive.keytab</value>
</property>
<property>
<name>hive.server2.enable.doAs</name>
<value>false</value>
<name>hive.metastore.uris</name>
<value>thrift://wd:9083</value>
</property>
```


### Running

Waggle Dance should be started by a privileged user with a fresh keytab.
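
Before starting, it can help to confirm that the keytab file is actually present and readable by the launching user. The following is a minimal pre-flight sketch under stated assumptions: the class and method names are hypothetical, the default path is the placeholder from the configuration example, and the check only looks at the file itself. It does not validate the keytab's contents or whether its keys are still current.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class KeytabPreflightSketch {

  // True when the path is a regular, readable, non-empty file.
  // This says nothing about whether the keys inside are valid.
  static boolean isUsableKeytab(Path keytab) throws IOException {
    return Files.isRegularFile(keytab)
        && Files.isReadable(keytab)
        && Files.size(keytab) > 0;
  }

  public static void main(String[] args) throws IOException {
    // Placeholder path taken from the configuration example; pass a real one as args[0].
    Path keytab = Paths.get(args.length > 0 ? args[0] : "/path/to/hive.keytab");
    System.out.println(keytab + " usable: " + isUsableKeytab(keytab));
  }
}
```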
2 changes: 2 additions & 0 deletions README.md
@@ -158,6 +158,7 @@ The table below describes all the available configuration values for Waggle Danc
| `primary-meta-store.name` | Yes | Database name that uniquely identifies this metastore. Used internally. Cannot be empty. |
| `primary-meta-store.database-prefix` | No | Prefix used to access the primary metastore and differentiate databases in it from databases in another metastore. The default prefix (i.e. if this value isn't explicitly set) is empty string.|
| `primary-meta-store.access-control-type` | No | Sets how the client access controls should be handled. Default is `READ_ONLY`. Other options are `READ_AND_WRITE_AND_CREATE`, `READ_AND_WRITE_ON_DATABASE_WHITELIST` and `READ_AND_WRITE_AND_CREATE_ON_DATABASE_WHITELIST`; see the Access Control section below. |
| `primary-meta-store.impersonation-enabled` | No | Enables metastore end-user impersonation. Default is `false`.|
| `primary-meta-store.writable-database-white-list` | No | White-list of databases used to verify write access used in conjunction with `primary-meta-store.access-control-type`. The list of databases should be listed without any `primary-meta-store.database-prefix`. This property supports both full database names and (case-insensitive) [Java RegEx patterns](https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html).|
| `primary-meta-store.metastore-tunnel` | No | See metastore tunnel configuration values below. |
| `primary-meta-store.latency` | No | Indicates the acceptable slowness of the metastore in **milliseconds** for increasing the default connection timeout. Default latency is `0` and should be changed if the metastore is particularly slow. If you get an error saying that results were omitted because the metastore was slow, consider changing the latency to a higher number.|
@@ -168,6 +169,7 @@ The table below describes all the available configuration values for Waggle Danc
| `federated-meta-stores` | No | Possibly empty list of read-only federated metastores. |
| `federated-meta-stores[n].remote-meta-store-uris` | Yes | Thrift URIs of the federated read-only metastore. |
| `federated-meta-stores[n].name` | Yes | Name that uniquely identifies this metastore. Used internally. Cannot be empty. |
| `federated-meta-stores[n].impersonation-enabled` | No | Enables metastore end-user impersonation. Default is `false`.|
| `federated-meta-stores[n].database-prefix` | No | Prefix used to access this particular metastore and differentiate databases in it from databases in another metastore. Typically used if databases have the same name across metastores but federated access to them is still needed. The default prefix (i.e. if this value isn't explicitly set) is {federated-meta-stores[n].name} lowercased and postfixed with an underscore. For example if the metastore name was configured as "waggle" and no database prefix was provided but `PREFIXED` database resolution was used then the value of `database-prefix` would be "waggle_". |
| `federated-meta-stores[n].metastore-tunnel` | No | See metastore tunnel configuration values below. |
| `federated-meta-stores[n].latency` | No | Indicates the acceptable slowness of the metastore in **milliseconds** for increasing the default connection timeout. Default latency is `0` and should be changed if the metastore is particularly slow. If you get an error saying that results were omitted because the metastore was slow, consider changing the latency to a higher number.|
Binary file modified kerberos-process.png
@@ -1,5 +1,5 @@
/**
* Copyright (C) 2016-2023 Expedia, Inc.
* Copyright (C) 2016-2024 Expedia, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
@@ -59,7 +59,7 @@ public abstract class AbstractMetaStore {
private transient @JsonProperty @NotNull MetaStoreStatus status = MetaStoreStatus.UNKNOWN;
private long latency = 0;
private transient @JsonIgnore HashBiMap<String, String> databaseNameBiMapping = HashBiMap.create();

private boolean impersonationEnabled;
public AbstractMetaStore(String name, String remoteMetaStoreUris, AccessControlType accessControlType) {
this.name = name;
this.remoteMetaStoreUris = remoteMetaStoreUris;
@@ -211,6 +211,14 @@ public void setStatus(MetaStoreStatus status) {
this.status = status;
}

public boolean isImpersonationEnabled() {
return impersonationEnabled;
}

public void setImpersonationEnabled(boolean impersonationEnabled) {
this.impersonationEnabled = impersonationEnabled;
}

@Override
public int hashCode() {
return Objects.hashCode(name);
@@ -242,5 +250,4 @@ public String toString() {
.add("status", status)
.toString();
}

}
@@ -1,5 +1,5 @@
/**
* Copyright (C) 2016-2021 Expedia, Inc.
* Copyright (C) 2016-2024 Expedia, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
@@ -72,7 +72,7 @@ public void nullDatabasePrefix() {

@Test
public void toJson() throws Exception {
String expected = "{\"accessControlType\":\"READ_ONLY\",\"connectionType\":\"DIRECT\",\"databaseNameMapping\":{},\"databasePrefix\":\"name_\",\"federationType\":\"FEDERATED\",\"hiveMetastoreFilterHook\":null,\"latency\":0,\"mappedDatabases\":null,\"mappedTables\":null,\"metastoreTunnel\":null,\"name\":\"name\",\"remoteMetaStoreUris\":\"uri\",\"status\":\"UNKNOWN\",\"writableDatabaseWhiteList\":[]}";
String expected = "{\"accessControlType\":\"READ_ONLY\",\"connectionType\":\"DIRECT\",\"databaseNameMapping\":{},\"databasePrefix\":\"name_\",\"federationType\":\"FEDERATED\",\"hiveMetastoreFilterHook\":null,\"impersonationEnabled\":false,\"latency\":0,\"mappedDatabases\":null,\"mappedTables\":null,\"metastoreTunnel\":null,\"name\":\"name\",\"remoteMetaStoreUris\":\"uri\",\"status\":\"UNKNOWN\",\"writableDatabaseWhiteList\":[]}";
ObjectMapper mapper = new ObjectMapper();
// Sorting to get deterministic test behaviour
mapper.enable(MapperFeature.SORT_PROPERTIES_ALPHABETICALLY);
@@ -1,5 +1,5 @@
/**
* Copyright (C) 2016-2021 Expedia, Inc.
* Copyright (C) 2016-2024 Expedia, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
@@ -89,7 +89,7 @@ public void nonEmptyDatabasePrefix() {

@Test
public void toJson() throws Exception {
String expected = "{\"accessControlType\":\"READ_ONLY\",\"connectionType\":\"DIRECT\",\"databaseNameMapping\":{},\"databasePrefix\":\"\",\"federationType\":\"PRIMARY\",\"hiveMetastoreFilterHook\":null,\"latency\":0,\"mappedDatabases\":null,\"mappedTables\":null,\"metastoreTunnel\":null,\"name\":\"name\",\"remoteMetaStoreUris\":\"uri\",\"status\":\"UNKNOWN\",\"writableDatabaseWhiteList\":[]}";
String expected = "{\"accessControlType\":\"READ_ONLY\",\"connectionType\":\"DIRECT\",\"databaseNameMapping\":{},\"databasePrefix\":\"\",\"federationType\":\"PRIMARY\",\"hiveMetastoreFilterHook\":null,\"impersonationEnabled\":false,\"latency\":0,\"mappedDatabases\":null,\"mappedTables\":null,\"metastoreTunnel\":null,\"name\":\"name\",\"remoteMetaStoreUris\":\"uri\",\"status\":\"UNKNOWN\",\"writableDatabaseWhiteList\":[]}";
ObjectMapper mapper = new ObjectMapper();
// Sorting to get deterministic test behaviour
mapper.enable(MapperFeature.SORT_PROPERTIES_ALPHABETICALLY);
@@ -1,5 +1,5 @@
/**
* Copyright (C) 2016-2023 Expedia, Inc.
* Copyright (C) 2016-2024 Expedia, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
@@ -30,6 +30,7 @@
import com.hotels.bdp.waggledance.api.model.AbstractMetaStore;
import com.hotels.bdp.waggledance.client.tunnelling.TunnelingMetaStoreClientFactory;
import com.hotels.bdp.waggledance.conf.WaggleDanceConfiguration;
import com.hotels.bdp.waggledance.context.CommonBeans;
import com.hotels.hcommon.hive.metastore.conf.HiveConfFactory;
import com.hotels.hcommon.hive.metastore.util.MetaStoreUriNormaliser;

@@ -66,6 +67,8 @@ private CloseableThriftHiveMetastoreIface newHiveInstance(
connectionTimeout, waggleDanceConfiguration.getConfigurationProperties());
}
properties.put(ConfVars.METASTOREURIS.varname, uris);
properties.put(CommonBeans.IMPERSONATION_ENABLED_KEY,
String.valueOf(metaStore.isImpersonationEnabled()));
HiveConfFactory confFactory = new HiveConfFactory(Collections.emptyList(), properties);
return defaultMetaStoreClientFactory
.newInstance(confFactory.newInstance(), "waggledance-" + name, DEFAULT_CLIENT_FACTORY_RECONNECTION_RETRY,
@@ -1,5 +1,5 @@
/**
* Copyright (C) 2016-2023 Expedia, Inc.
* Copyright (C) 2016-2024 Expedia, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
@@ -15,17 +15,13 @@
*/
package com.hotels.bdp.waggledance.client;

import java.io.IOException;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.lang.reflect.UndeclaredThrowableException;
import java.util.List;

import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.utils.SecurityUtils;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.thrift.transport.TTransportException;

import lombok.extern.log4j.Log4j2;
@@ -34,7 +30,6 @@
import com.google.common.collect.Lists;

import com.hotels.bdp.waggledance.client.compatibility.HiveCompatibleThriftHiveMetastoreIfaceFactory;
import com.hotels.bdp.waggledance.server.TokenWrappingHMSHandler;
import com.hotels.hcommon.hive.metastore.exception.MetastoreUnavailableException;


@@ -140,76 +135,6 @@ private void reconnectIfDisconnected() {

}

@Log4j2
private static class SaslMetastoreClientHander implements InvocationHandler {

private final CloseableThriftHiveMetastoreIface baseHandler;
private final ThriftMetastoreClientManager clientManager;
private final String tokenSignature = "WAGGLEDANCETOKEN";

private String delegationToken;

public static CloseableThriftHiveMetastoreIface newProxyInstance(
CloseableThriftHiveMetastoreIface baseHandler,
ThriftMetastoreClientManager clientManager) {
return (CloseableThriftHiveMetastoreIface) Proxy.newProxyInstance(SaslMetastoreClientHander.class.getClassLoader(),
INTERFACES, new SaslMetastoreClientHander(baseHandler, clientManager));
}

private SaslMetastoreClientHander(
CloseableThriftHiveMetastoreIface handler,
ThriftMetastoreClientManager clientManager) {
this.baseHandler = handler;
this.clientManager = clientManager;
}

@SuppressWarnings("unchecked")
@Override
public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
try {
switch (method.getName()) {
case "get_delegation_token":
try {
clientManager.open();
Object token = method.invoke(baseHandler, args);
this.delegationToken = (String) token;
clientManager.close();
setTokenStr2Ugi(UserGroupInformation.getCurrentUser(), (String) token);
clientManager.open();
return token;
} catch (IOException e) {
throw new MetastoreUnavailableException("Couldn't setup delegation token in the ugi: ", e);
}
default:
genToken();
return method.invoke(baseHandler, args);
}
} catch (InvocationTargetException e) {
throw e.getTargetException();
} catch (UndeclaredThrowableException e) {
throw e.getCause();
}
}

private void genToken() throws Throwable {
UserGroupInformation currUser = null;
if (delegationToken == null && (currUser = UserGroupInformation.getCurrentUser())
!= UserGroupInformation.getLoginUser()) {

log.info("set {} delegation token", currUser.getShortUserName());
String token = TokenWrappingHMSHandler.getToken();
setTokenStr2Ugi(currUser, token);
delegationToken = token;
clientManager.close();
}
}

private void setTokenStr2Ugi(UserGroupInformation currUser, String token) throws IOException {
String newTokenSignature = clientManager.generateNewTokenSignature(tokenSignature);
SecurityUtils.setTokenStr(currUser, token, newTokenSignature);
}
}

/*
* (non-Javadoc)
* @see com.hotels.bdp.waggledance.client.MetaStoreClientFactoryI#newInstance(org.apache.hadoop.hive.conf.HiveConf,
@@ -231,17 +156,9 @@ CloseableThriftHiveMetastoreIface newInstance(
int reconnectionRetries,
ThriftMetastoreClientManager base) {
ReconnectingMetastoreClientInvocationHandler reconnectingHandler = new ReconnectingMetastoreClientInvocationHandler(
name, reconnectionRetries, base);
if (base.isSaslEnabled()) {
CloseableThriftHiveMetastoreIface ifaceReconnectingHandler = (CloseableThriftHiveMetastoreIface) Proxy
.newProxyInstance(getClass().getClassLoader(), INTERFACES, reconnectingHandler);
// wrapping the SaslMetastoreClientHander to handle delegation token if using sasl
return SaslMetastoreClientHander.newProxyInstance(ifaceReconnectingHandler, base);
} else {
return (CloseableThriftHiveMetastoreIface) Proxy
.newProxyInstance(getClass().getClassLoader(), INTERFACES, reconnectingHandler);
}

name, reconnectionRetries, base);
return (CloseableThriftHiveMetastoreIface) Proxy.newProxyInstance(getClass().getClassLoader(),
INTERFACES, reconnectingHandler);
}

}