Use different tokens instead of forcing WD and all HMS to use the same delegation token in the Kerberos environment #313

Merged
9 commits merged on May 29, 2024
109 changes: 45 additions & 64 deletions HowToKerberize.md
@@ -24,83 +24,64 @@ In addition, because Kerberos authentication requires a delegation-token to prox
* Zookeeper to store delegation-token (Recommended)

### Configuration
Waggle Dance `waggle-dance-server.yml` example:

```
port: 9083
verbose: true
#database-resolution: MANUAL
database-resolution: PREFIXED
yaml-storage:
  overwrite-config-on-shutdown: false
logging:
  config: file:/path/to/log4j2.xml
configuration-properties:
  hadoop.security.authentication: KERBEROS
  hive.metastore.sasl.enabled: true
  hive.metastore.kerberos.principal: hive/_HOST@YOUR_REALM.COM
  hive.metastore.kerberos.keytab.file: /path/to/hive.keytab
  hive.cluster.delegation.token.store.class: org.apache.hadoop.hive.thrift.ZooKeeperTokenStore
  hive.cluster.delegation.token.store.zookeeper.connectString: zz1:2181,zz2:2181,zz3:2181
  hive.cluster.delegation.token.store.zookeeper.znode: /hive/cluster/wd_delegation
  hive.server2.authentication: KERBEROS
  hive.server2.authentication.kerberos.principal: hive/_HOST@YOUR_REALM.COM
  hive.server2.authentication.kerberos.keytab: /path/to/hive.keytab
  hive.server2.authentication.client.kerberos.principal: hive/_HOST@YOUR_REALM.COM
  hadoop.kerberos.keytab.login.autorenewal.enabled: true
  hadoop.proxyuser.hive.users: '*'
  hadoop.proxyuser.hive.hosts: '*'
```
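For background, the delegation tokens referenced by the `hive.cluster.delegation.token.store.*` properties above are the standard Hive Metastore delegation tokens. The following is a minimal, illustrative sketch of how a Kerberos-authenticated client obtains one through the regular Hive API; the metastore host and principal are placeholders, and this is not something Waggle Dance requires you to run:

```
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;

public class DelegationTokenExample {
  public static void main(String[] args) throws Exception {
    HiveConf conf = new HiveConf();
    // Point the client at a metastore (placeholder host) and enable SASL/Kerberos.
    conf.set("hive.metastore.uris", "thrift://ms1:9083");
    conf.set("hive.metastore.sasl.enabled", "true");
    conf.set("hive.metastore.kerberos.principal", "hive/_HOST@YOUR_REALM.COM");

    HiveMetaStoreClient client = new HiveMetaStoreClient(conf);
    try {
      // Ask the metastore for a delegation token owned by "hive" and renewable by "hive".
      String token = client.getDelegationToken("hive", "hive");
      System.out.println("Obtained delegation token: " + token);
    } finally {
      client.close();
    }
  }
}
```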

Waggle Dance `waggle-dance-federation.yml` example:

```
primary-meta-store:
  database-prefix: ''
  name: local
  remote-meta-store-uris: thrift://ms1:9083
  access-control-type: READ_AND_WRITE_AND_CREATE
  impersonation-enabled: true
federated-meta-stores:
  - remote-meta-store-uris: thrift://ms2:9083
    database-prefix: dw_
    name: remote
    impersonation-enabled: true
    access-control-type: READ_AND_WRITE_ON_DATABASE_WHITELIST
    writable-database-white-list:
      - .*
```
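With `impersonation-enabled: true`, requests to that metastore are issued on behalf of the calling user rather than the Waggle Dance service principal. The snippet below is a minimal sketch of the standard Hadoop proxy-user mechanism that this kind of impersonation builds on; the user name is a placeholder and this is not Waggle Dance's exact internal code path:

```
import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.security.UserGroupInformation;

public class ImpersonationExample {
  public static void main(String[] args) throws Exception {
    // The service's own Kerberos login (e.g. the hive principal from the keytab).
    UserGroupInformation serviceUser = UserGroupInformation.getLoginUser();

    // Create a proxy UGI for the end user; the hadoop.proxyuser.hive.* settings above must allow this.
    UserGroupInformation proxyUser = UserGroupInformation.createProxyUser("alice", serviceUser);

    // Anything executed inside doAs() runs as "alice".
    proxyUser.doAs((PrivilegedExceptionAction<Void>) () -> {
      // e.g. open a metastore client here and issue requests as the proxied user
      return null;
    });
  }
}
```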

To connect to Waggle Dance via Beeline, point `hive.metastore.uris` in the Hive configuration file `hive-site.xml` at Waggle Dance:

```
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://wd:9083</value>
</property>
```
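Beeline then connects to HiveServer2 as usual, and HiveServer2 resolves its metastore calls through Waggle Dance. The JDBC equivalent, with a Kerberos principal in the URL, would look roughly like the sketch below; the HiveServer2 host, port and database are assumptions, not values taken from this change:

```
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveJdbcExample {
  public static void main(String[] args) throws Exception {
    // Kerberos-authenticated HiveServer2 connection; host, port and database are placeholders.
    String url = "jdbc:hive2://hs2-host:10000/default;principal=hive/_HOST@YOUR_REALM.COM";

    Class.forName("org.apache.hive.jdbc.HiveDriver");
    try (Connection conn = DriverManager.getConnection(url);
         Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery("SHOW DATABASES")) {
      while (rs.next()) {
        System.out.println(rs.getString(1));
      }
    }
  }
}
```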


### Running

Waggle Dance should be started by a privileged user with a fresh keytab.

If Waggle Dance throws a GSS exception, there is a problem with the keytab file.
Try to perform `kdestroy` and `kinit` operations and check the keytab file ownership flags.

If the Metastore throws an exception with code -127, Waggle Dance is probably using the wrong authentication policy.
Check the values in `hive-site.xml` and make sure that HIVE_HOME and HIVE_CONF_DIR are defined.

Don't forget to restart the Hive services!
Just start the service directly; no `kinit` operation is required, because the Kerberos ticket is held inside the JVM rather than in a local ticket cache.
This allows the ticket to be renewed automatically, with no additional steps needed to renew local tickets.
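For reference, the "ticket lives in the JVM" behaviour described above corresponds to the standard Hadoop UGI keytab login. A minimal, illustrative sketch (re-using the principal and keytab path from the example configuration, not code taken from Waggle Dance itself) is:

```
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.SecurityUtil;
import org.apache.hadoop.security.UserGroupInformation;

public class KeytabLoginExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("hadoop.security.authentication", "kerberos");
    UserGroupInformation.setConfiguration(conf);

    // Resolve the _HOST placeholder to the local host name.
    String principal = SecurityUtil.getServerPrincipal("hive/_HOST@YOUR_REALM.COM", "0.0.0.0");

    // Log in from the keytab: the resulting ticket is held in the JVM, not in the local
    // ticket cache, so no kinit is needed and the login can be renewed in-process.
    UserGroupInformation.loginUserFromKeytab(principal, "/path/to/hive.keytab");
    System.out.println("Logged in as " + UserGroupInformation.getLoginUser().getUserName());
  }
}
```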
57 changes: 30 additions & 27 deletions README.md


Binary file modified kerberos-process.png
@@ -60,6 +60,7 @@ public abstract class AbstractMetaStore {
private transient @JsonProperty @NotNull MetaStoreStatus status = MetaStoreStatus.UNKNOWN;
private long latency = 0;
private transient @JsonIgnore HashBiMap<String, String> databaseNameBiMapping = HashBiMap.create();
private boolean impersonationEnabled;
private Map<String, String> configurationProperties = new HashMap<>();

public AbstractMetaStore(String name, String remoteMetaStoreUris, AccessControlType accessControlType) {
@@ -222,6 +223,14 @@ public void setStatus(MetaStoreStatus status) {
this.status = status;
}

public boolean isImpersonationEnabled() {
return impersonationEnabled;
}

public void setImpersonationEnabled(boolean impersonationEnabled) {
this.impersonationEnabled = impersonationEnabled;
}

@Override
public int hashCode() {
return Objects.hashCode(name);
@@ -72,7 +72,7 @@ public void nullDatabasePrefix() {

@Test
public void toJson() throws Exception {
String expected = "{\"accessControlType\":\"READ_ONLY\",\"configurationProperties\":{},\"connectionType\":\"DIRECT\",\"databaseNameMapping\":{},\"databasePrefix\":\"name_\",\"federationType\":\"FEDERATED\",\"hiveMetastoreFilterHook\":null,\"latency\":0,\"mappedDatabases\":null,\"mappedTables\":null,\"metastoreTunnel\":null,\"name\":\"name\",\"remoteMetaStoreUris\":\"uri\",\"status\":\"UNKNOWN\",\"writableDatabaseWhiteList\":[]}";
String expected = "{\"accessControlType\":\"READ_ONLY\",\"configurationProperties\":{},\"connectionType\":\"DIRECT\",\"databaseNameMapping\":{},\"databasePrefix\":\"name_\",\"federationType\":\"FEDERATED\",\"hiveMetastoreFilterHook\":null,\"impersonationEnabled\":false,\"latency\":0,\"mappedDatabases\":null,\"mappedTables\":null,\"metastoreTunnel\":null,\"name\":\"name\",\"remoteMetaStoreUris\":\"uri\",\"status\":\"UNKNOWN\",\"writableDatabaseWhiteList\":[]}";
ObjectMapper mapper = new ObjectMapper();
// Sorting to get deterministic test behaviour
mapper.enable(MapperFeature.SORT_PROPERTIES_ALPHABETICALLY);
@@ -89,7 +89,7 @@ public void nonEmptyDatabasePrefix() {

@Test
public void toJson() throws Exception {
String expected = "{\"accessControlType\":\"READ_ONLY\",\"configurationProperties\":{},\"connectionType\":\"DIRECT\",\"databaseNameMapping\":{},\"databasePrefix\":\"\",\"federationType\":\"PRIMARY\",\"hiveMetastoreFilterHook\":null,\"latency\":0,\"mappedDatabases\":null,\"mappedTables\":null,\"metastoreTunnel\":null,\"name\":\"name\",\"remoteMetaStoreUris\":\"uri\",\"status\":\"UNKNOWN\",\"writableDatabaseWhiteList\":[]}";
String expected = "{\"accessControlType\":\"READ_ONLY\",\"configurationProperties\":{},\"connectionType\":\"DIRECT\",\"databaseNameMapping\":{},\"databasePrefix\":\"\",\"federationType\":\"PRIMARY\",\"hiveMetastoreFilterHook\":null,\"impersonationEnabled\":false,\"latency\":0,\"mappedDatabases\":null,\"mappedTables\":null,\"metastoreTunnel\":null,\"name\":\"name\",\"remoteMetaStoreUris\":\"uri\",\"status\":\"UNKNOWN\",\"writableDatabaseWhiteList\":[]}";
ObjectMapper mapper = new ObjectMapper();
// Sorting to get deterministic test behaviour
mapper.enable(MapperFeature.SORT_PROPERTIES_ALPHABETICALLY);
@@ -0,0 +1,161 @@
/**
* Copyright (C) 2016-2024 Expedia, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package com.hotels.bdp.waggledance.client;

import java.io.Closeable;
import java.net.URI;
import java.util.Random;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.conf.HiveConf.ConfVars;
import org.apache.hadoop.hive.conf.HiveConfUtil;
import org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore;
import org.apache.thrift.TException;
import org.apache.thrift.transport.TTransport;

import lombok.extern.log4j.Log4j2;

import com.hotels.bdp.waggledance.client.compatibility.HiveCompatibleThriftHiveMetastoreIfaceFactory;

@Log4j2
public abstract class AbstractThriftMetastoreClientManager implements Closeable {

protected static final AtomicInteger CONN_COUNT = new AtomicInteger(0);
protected final HiveConf conf;
protected final HiveCompatibleThriftHiveMetastoreIfaceFactory hiveCompatibleThriftHiveMetastoreIfaceFactory;
protected final URI[] metastoreUris;
protected ThriftHiveMetastore.Iface client = null;
protected TTransport transport = null;
protected boolean isConnected = false;
// for thrift connects
protected int retries = 5;
protected long retryDelaySeconds = 0;

protected final int connectionTimeout;
protected final String msUri;

AbstractThriftMetastoreClientManager(
HiveConf conf,
HiveCompatibleThriftHiveMetastoreIfaceFactory hiveCompatibleThriftHiveMetastoreIfaceFactory,
int connectionTimeout) {
this.conf = conf;
this.hiveCompatibleThriftHiveMetastoreIfaceFactory = hiveCompatibleThriftHiveMetastoreIfaceFactory;
this.connectionTimeout = connectionTimeout;
msUri = conf.getVar(ConfVars.METASTOREURIS);

if (HiveConfUtil.isEmbeddedMetaStore(msUri)) {
throw new RuntimeException("You can't waggle an embedded metastore");
}

// get the number of retries
retries = HiveConf.getIntVar(conf, ConfVars.METASTORETHRIFTCONNECTIONRETRIES);
retryDelaySeconds = conf.getTimeVar(ConfVars.METASTORE_CLIENT_CONNECT_RETRY_DELAY, TimeUnit.SECONDS);

// user wants file store based configuration
if (msUri != null) {
String[] metastoreUrisString = msUri.split(",");
metastoreUris = new URI[metastoreUrisString.length];
try {
int i = 0;
for (String s : metastoreUrisString) {
URI tmpUri = new URI(s);
if (tmpUri.getScheme() == null) {
throw new IllegalArgumentException("URI: " + s + " does not have a scheme");
}
metastoreUris[i++] = tmpUri;
}
} catch (IllegalArgumentException e) {
throw (e);
} catch (Exception e) {
String exInfo = "Got exception: " + e.getClass().getName() + " " + e.getMessage();
log.error(exInfo, e);
throw new RuntimeException(exInfo, e);
}
} else {
log.error("NOT getting uris from conf");
throw new RuntimeException("MetaStoreURIs not found in conf file");
}
}

void open() {
open(null);
}

abstract void open(HiveUgiArgs ugiArgs);

void reconnect(HiveUgiArgs ugiArgs) {
close();
// Swap the first element of the metastoreUris[] with a random element from the rest
// of the array. Rationale being that this method will generally be called when the default
// connection has died and the default connection is likely to be the first array element.
promoteRandomMetaStoreURI();
open(ugiArgs);
}

public String getHiveConfValue(String key, String defaultValue) {
return conf.get(key, defaultValue);
}

public void setHiveConfValue(String key, String value) {
conf.set(key, value);
}

@Override
public void close() {
if (!isConnected) {
return;
}
isConnected = false;
try {
if (client != null) {
client.shutdown();
}
} catch (TException e) {
log.debug("Unable to shutdown metastore client. Will try closing transport directly.", e);
}
// Transport would have got closed via client.shutdown(), so we don't need this, but
// just in case, we make this call.
if ((transport != null) && transport.isOpen()) {
transport.close();
transport = null;
}
log.info("Closed a connection to metastore, current connections: {}", CONN_COUNT.decrementAndGet());
}

boolean isOpen() {
return (transport != null) && transport.isOpen();
}

protected ThriftHiveMetastore.Iface getClient() {
return client;
}

/**
* Swaps the first element of the metastoreUris array with a random element from the remainder of the array.
*/
private void promoteRandomMetaStoreURI() {
if (metastoreUris.length <= 1) {
return;
}
Random rng = new Random();
int index = rng.nextInt(metastoreUris.length - 1) + 1;
URI tmp = metastoreUris[0];
metastoreUris[0] = metastoreUris[index];
metastoreUris[index] = tmp;
}
}
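To make the contract of this abstract class concrete, here is a deliberately simplified sketch of what a subclass implementing `open(HiveUgiArgs)` could look like for a plain (non-Kerberos) binary Thrift connection. The real Waggle Dance implementations layer SASL/Kerberos and delegation-token handling on top of this, and add retry/failover logic, so treat it purely as an illustration:

```
package com.hotels.bdp.waggledance.client;

import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransportException;

import com.hotels.bdp.waggledance.client.compatibility.HiveCompatibleThriftHiveMetastoreIfaceFactory;

class PlainThriftMetastoreClientManager extends AbstractThriftMetastoreClientManager {

  PlainThriftMetastoreClientManager(
      HiveConf conf,
      HiveCompatibleThriftHiveMetastoreIfaceFactory factory,
      int connectionTimeout) {
    super(conf, factory, connectionTimeout);
  }

  @Override
  void open(HiveUgiArgs ugiArgs) {
    // Use the first URI only; retries and URI promotion are omitted for brevity.
    String host = metastoreUris[0].getHost();
    int port = metastoreUris[0].getPort();
    try {
      transport = new TSocket(host, port, connectionTimeout);
      transport.open();
      client = new ThriftHiveMetastore.Client(new TBinaryProtocol(transport));
      isConnected = true;
    } catch (TTransportException e) {
      throw new RuntimeException("Could not connect to metastore at " + host + ":" + port, e);
    }
  }
}
```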
@@ -30,6 +30,7 @@
import com.hotels.bdp.waggledance.api.model.AbstractMetaStore;
import com.hotels.bdp.waggledance.client.tunnelling.TunnelingMetaStoreClientFactory;
import com.hotels.bdp.waggledance.conf.WaggleDanceConfiguration;
import com.hotels.bdp.waggledance.context.CommonBeans;
import com.hotels.hcommon.hive.metastore.conf.HiveConfFactory;
import com.hotels.hcommon.hive.metastore.util.MetaStoreUriNormaliser;

@@ -70,6 +71,8 @@ private CloseableThriftHiveMetastoreIface newHiveInstance(
connectionTimeout, waggleDanceConfiguration.getConfigurationProperties());
}
properties.put(ConfVars.METASTOREURIS.varname, uris);
properties.put(CommonBeans.IMPERSONATION_ENABLED_KEY,
String.valueOf(metaStore.isImpersonationEnabled()));
HiveConfFactory confFactory = new HiveConfFactory(Collections.emptyList(), properties);
return defaultMetaStoreClientFactory
.newInstance(confFactory.newInstance(), "waggledance-" + name, DEFAULT_CLIENT_FACTORY_RECONNECTION_RETRY,