Dbeaver Hbase



Connect to data from Java/J2EE Apps. Access live data from BI, Reporting, ETL Tools, Custom Apps, and more. HBase permissions are enforced given the end-user, not the Phoenix Query Server’s identity. In some cases, it may be desirable to execute the query as some other user – this is referred to as “impersonation”. This can enable workflows where a trusted user has the.

The Phoenix Query Server provides an alternative means for interaction with Phoenix and HBase. Soon this will enable access from environments other than the JVM.

Overview

Phoenix 4.4 introduces a stand-alone server that exposes Phoenix to “thin” clients. It is based on the Avatica component of Apache Calcite. The query server is comprised of a Java server that manages Phoenix Connections on the clients’ behalf. The client implementation is currently a JDBC driver with minimal dependencies. Two transport mechanisms are current supported: JSON and Protocol Buffers (available as of Phoenix 4.7 as the default). There’s also a sqlline script that uses the thin client.

Avatica is still relatively early in its life-cycle. With the introduction of the Protobuf transport, Avatica is moving towards backwards compatibility with the provided thin JDBC driver. There are no such backwards compatibility guarantees for the JSON API.

To repeat, there is no guarantee of backwards compatibility with the JSON transport; however, compatibility with the Protobuf transport is stabilizing (although, not tested thoroughly enough to be stated as “guaranteed”).

Installation

In the 4.x and 5.0 releases the query server and its JDBC client are part of the standard Phoenix distribution. They require no additional dependencies.

After the 5.0 release, the query server has been unbundled into the phoenix-queryserver repository, and its version number has been reset to 1.0. At the time of writing there is no released version of the standalone query server.

Usage

Server

The server component is managed through bin/queryserver.py. Its usage is as follows

Hbase

When invoked with no arguments, the query server is launched in the foreground, with logging directed to the console.

The first argument is an optional start or stop command to the daemon. When either of these are provided, it will take appropriate action on a daemon process, if it exists.

Any subsequent arguments are passed to the main class for interpretation.

The server is packaged in a standalone jar, phoenix-server-<version>-runnable.jar. This jar and HBASE_CONF_DIR on the classpath are all that is required to launch the server.

Client

Phoenix provides two mechanisms for interacting with the query server. A JDBC driver is provided in the standalone phoenix-<version>-thin-client.jar. The script bin/sqlline-thin.py is available for the command line.

The JDBC connection string is composed as follows:

<scheme> specifies the transport protocol (http or https) used when communicating with the server.

Hbase

<server-hostname> is the name of the host offering the service.

<port> is the port number on which the host is listening. Default is 8765, though this is configurable (see below).

The full list of options that can be provided via the JDBC URL string is available in the Avatica documentation

The script bin/sqlline-thin.py is intended to behave identically to its sibling script bin/sqlline.py. It supports the following usage options.

Dbeaver Connect Hbase

The first optional argument is a connection URL, as described previously. When not provided, scheme defaults to http, host to localhost, and port to 8765.

The second optional parameter is a sql file from which to read commands.

Wire API documentation

The API itself is documented in the Apache Calcite project as it is the Avatica API – there is no wire API defined in Phoenix itself.

For more information in building clients in other languages that work with Avatica, please feel free to reach out to the Apache Calcite dev mailing list.

Impersonation

By default, the Phoenix Query Server executes queries on behalf of the end-user. HBase permissions are enforced given the end-user, not the Phoenix Query Server’s identity. In some cases, it may be desirable to execute the query as some other user – this is referred to as “impersonation”. This can enable workflows where a trusted user has the privilege to run queries for other users.

This can be enabled by setting the configuration property phoenix.queryserver.withRemoteUserExtractor to true. The URL of the Query Server can be modified to include the required request parameter. For example, to let “bob” to run a query as “alice”, the following JDBC URL could be used:

The standard Hadoop “proxyuser” configuration keys are checked to validate if the “real” remote user is allowed to impersonate the “doAs” user. See the Hadoop documentation for more information on how to configure these rules.

As a word of warning: there is no end-to-end test coverage for the HBase 0.98 and 1.1 Phoenix releases because of missing test-related code in those HBase releases. While we expect no issues on these Phoenix release lines, we recommend additional testing by the user to verify that there are no issues.

Metrics

By default, the Phoenix Query Server exposes various Phoenix global client metrics via JMX (for HBase versions 1.3 and up). The list of metrics are available here.

PQS Metrics use Hadoop Metrics 2 internally for metrics publishing. Hence it publishes various JVM related metrics. Metrics can be filtered based on certain tags, which can be configured by the property specified in hbase-site.xml on the classpath. Further details are provided in Configuration section.

Configuration

Server components are spread across a number of java packages, so effective logging configuration requires updating multiple packages. The default server logging configuration sets the following log levels:

As of the time of this writing, the underlying Avatica component respects the following configuration options. They are exposed via hbase-site.xml configuration.

Configurations relating to the server instantiation.
PropertyDescriptionDefault
phoenix.queryserver.http.portSpecifies a port the server will listen on. Default is 8765.8765
phoenix.queryserver.metafactory.classThe Avatica Meta.Factory class to instantiate.org.apache.phoenix.queryserver.server.PhoenixMetaFactoryImpl
phoenix.queryserver.serializationThe transport/serialization format, either PROTOBUF or JSON.PROTOBUF
Configurations relating to HTTPS.
HTTPS support is only available in the unbundled phoenix-queryserver 1.0.0-SNAPSHOT version at the time of writing.
PropertyDescriptionDefault
phoenix.queryserver.tls.enabledBoolean which controls if QueryServer uses HTTPS transport. When using HTTPS, the key- and trustore files, and their passwords must also be provided.false
phoenix.queryserver.tls.keystoreThe keystore file that contains the private key of the HTTPS serviceunset
phoenix.queryserver.tls.keystore.passwordThe password for the keystore file that contains the HTTPS private keyempty string
phoenix.queryserver.tls.truststoreThe keystore file that contains the HTTPS certificateunset
phoenix.queryserver.tls.truststore.passwordThe password for the keystore file that contains the HTTPS certificateempty string
Configurations relating to server connecting to a secure cluster.
PropertyDescriptionDefault
hbase.security.authenticationWhen set to 'kerberos', the server will attempt to log in before initiating Phoenix connections.Specified hbase-default.xml
phoenix.queryserver.keytab.fileThe key to look for keytab file.unset
phoenix.queryserver.kerberos.principalThe kerberos principal to use when authenticating. If phoenix.queryserver.kerberos.http.principal is not configured, this principal specified will be also used to both authenticate SPNEGO connections and to connect to HBase.unset
phoenix.queryserver.http.keytab.fileThe keytab file to use for authenticating SPNEGO connections. This configuration MUST be specified if phoenix.queryserver.kerberos.http.principal is configured. phoenix.queryserver.keytab.file will be used if this property is undefined.unset
phoenix.queryserver.http.kerberos.principalThe kerberos principal to use when authenticating SPNEGO connections. phoenix.queryserver.kerberos.principal will be used if this property is undefined.unset
phoenix.queryserver.kerberos.http.principalDeprecated, use phoenix.queryserver.http.kerberos.principal instead.unset
phoenix.queryserver.kerberos.allowed.realmsA comma-separated list of Kerberos realms, other than that of the PQS's HTTP principal's realm, that should be allowed to authenticate with PQS via SPNEGO.unset
phoenix.queryserver.dns.nameserverThe DNS hostnamedefault
phoenix.queryserver.dns.interfaceThe name of the network interface to query for DNS.default
Configurations relating to the server connection cache.
PropertyDescriptionDefault
avatica.connectioncache.concurrencyConnection cache concurrency level. Default is 10.10
avatica.connectioncache.initialcapacityConnection cache initial capacity. Default is 100.100
avatica.connectioncache.maxcapacity Connection cache maximum capacity. Approaching this point, the cache will start to evict least recently used connection objects. Default is 1000. 1000
avatica.connectioncache.expiryduration Connection cache expiration duration. Any connections older than this value will be discarded. Default is 10 minutes. 10
avatica.connectioncache.expiryunit Connection cache expiration unit. Unit modifier applied to the value provided in avatica.connectioncache.expiryunit. Default is minutes. MINUTES
Configurations relating to the server statement cache.
PropertyDescriptionDefault
avatica.statementcache.concurrencyStatement cache concurrency level. Default is 100.100
avatica.statementcache.initialcapacityStatement cache initial capacity. Default is 1000.1000
avatica.statementcache.maxcapacity Statement cache maximum capacity. Approaching this point, the cache will start to evict least recently used statement objects. Default is 10000. 10000
avatica.statementcache.expiryduration Statement cache expiration duration. Any statements older than this value will be discarded. Default is 5 minutes. 5
avatica.statementcache.expiryunit Statement cache expiration unit. Unit modifier applied to the value provided in avatica.statementcache.expiryunit. Default is minutes. MINUTES
Configurations relating to impersonation.
PropertyDescriptionDefault
phoenix.queryserver.withRemoteUserExtractor Boolean which controls if a remote user to impersonate should be extracted from the HTTP request parameter made by that user instead of the HTTP-authenticated user name (which is the default). false
phoenix.queryserver.remoteUserExtractor.param The name of the HTTP request parameter to use to extract the user name to execute the query as. doAs
Configurations relating to metrics.
PropertyDescriptionDefault
phoenix.client.metrics.tag Tag for filtering categories of Phoenix global client metrics emitted by PQS in hadoop-metrics2.properties FAT_CLIENT

Query Server Additions

Dbeaver

The Phoenix Query Server is meant to be horizontally scalable which means that it is a natural fit add-on features like service discovery and load balancing.

Load balancing

The Query Server can use off-the-shelf HTTP load balancers such as the Apache HTTP Server, nginx, or HAProxy. The primary requirement of using these load balancers is that the implementation must implement “sticky session” (when a client communicates with a backend server, that client continues to talk to that backend server). The Query Server also provides some bundled functionality for load balancing using ZooKeeper.

The ZooKeeper-based load balancer functions by automatically registering PQS instances in ZooKeeper and then allows clients to query the list of available servers. This implementation, unlike the others mentioned above, requires that client use the advertised information to make a routing decision. In this regard, this ZooKeeper-based approach is more akin to a service-discovery layer than a traditional load balancer. This load balancer implementation does not support SASL-based (Kerberos) ACLs in ZooKeeper (see PHOENIX-4085).

Dbeaver Hbase

The following are configuration properties used to configure this load balancer:

Dbeaver Kerberos Hbase

Configurations relating to the ZooKeeper-based load balancer.
PropertyDescriptionDefault
phoenix.queryserver.loadbalancer.enabledShould PQS register itself in ZooKeeper for the load balancer.false
phoenix.queryserver.base.pathRoot znode the PQS instance should register itself to./phoenix
phoenix.queryserver.service.nameA unique name to identify this PQS instance from others.queryserver
phoenix.queryserver.zookeeper.acl.usernameName to set for a DIGEST ZooKeeper ACL, optional.phoenix
phoenix.queryserver.zookeeper.acl.passwordPassword to set for a DIGEST ZooKeeper ACL, optional.phoenix