Undo reverts (#52)
* Refactor python example (#41)

* bootstrap refactored example

* refactor original code

* performance optimizations

* add logging

* remove user/pwd defaults and add exceptions

* fix tests

* remove pycache files

* fix exceptions and tests

* fix imports

* add copyright headers

* fix readme and remove requirements.txt

* run formatters, bandit and restructure

* update readme

* make changes to git ignore

* address doc review comments

* doc updates

* quickstart changes

* add endpoint class to mask other two

* add functional programming

* add title to readme

* rename modules

* Fix arg requirements and client middleware auth header (#42)

* bootstrap refactored example

* refactor original code

* performance optimizations

* add logging

* remove user/pwd defaults and add exceptions

* fix tests

* remove pycache files

* fix exceptions and tests

* fix imports

* add copyright headers

* fix readme and remove requirements.txt

* run formatters, bandit and restructure

* update readme

* make changes to git ignore

* address doc review comments

* doc updates

* quickstart changes

* add endpoint class to mask other two

* add functional programming

* add title to readme

* rename modules

* fix argument requirements and client middleware

* change wording

* fix typo

* fix typo

* fix spacing

* encode bearer token

* Change package version (#43)

* bootstrap refactored example

* refactor original code

* performance optimizations

* add logging

* remove user/pwd defaults and add exceptions

* fix tests

* remove pycache files

* fix exceptions and tests

* fix imports

* add copyright headers

* fix readme and remove requirements.txt

* run formatters, bandit and restructure

* update readme

* make changes to git ignore

* address doc review comments

* doc updates

* quickstart changes

* add endpoint class to mask other two

* add functional programming

* add title to readme

* rename modules

* fix argument requirements and client middleware

* change wording

* fix typo

* fix typo

* fix spacing

* encode bearer token

* change version to 1.0.0

* Fix typos in readme (#44)

* bootstrap refactored example

* refactor original code

* performance optimizations

* add logging

* remove user/pwd defaults and add exceptions

* fix tests

* remove pycache files

* fix exceptions and tests

* fix imports

* add copyright headers

* fix readme and remove requirements.txt

* run formatters, bandit and restructure

* update readme

* make changes to git ignore

* address doc review comments

* doc updates

* quickstart changes

* add endpoint class to mask other two

* add functional programming

* add title to readme

* rename modules

* fix argument requirements and client middleware

* change wording

* fix typo

* fix typo

* fix spacing

* encode bearer token

* change version to 1.0.0

* doc typos

* Refactor python example (#41)

* bootstrap refactored example

* refactor original code

* performance optimizations

* add logging

* remove user/pwd defaults and add exceptions

* fix tests

* remove pycache files

* fix exceptions and tests

* fix imports

* add copyright headers

* fix readme and remove requirements.txt

* run formatters, bandit and restructure

* update readme

* make changes to git ignore

* address doc review comments

* doc updates

* quickstart changes

* add endpoint class to mask other two

* add functional programming

* add title to readme

* rename modules

"merge conflicts"

* Fix arg requirements and client middleware auth header (#42)

* bootstrap refactored example

* refactor original code

* performance optimizations

* add logging

* remove user/pwd defaults and add exceptions

* fix tests

* remove pycache files

* fix exceptions and tests

* fix imports

* add copyright headers

* fix readme and remove requirements.txt

* run formatters, bandit and restructure

* update readme

* make changes to git ignore

* address doc review comments

* doc updates

* quickstart changes

* add endpoint class to mask other two

* add functional programming

* add title to readme

* rename modules

* fix argument requirements and client middleware

* change wording

* fix typo

* fix typo

* fix spacing

* encode bearer token

"allow empty"

* include all modules that start with dremio

* change folder structure

* add documentation for using pat against software
ravjotbrar authored Mar 24, 2023
1 parent 34c4a56 commit 9221f93
Showing 18 changed files with 753 additions and 445 deletions.
7 changes: 7 additions & 0 deletions .gitignore
@@ -2,3 +2,10 @@ java/.idea
java/target
java/*.iml
.idea
*.env
**/__pycache__
**/*.egg-info
**/build
**/dist
**/*.pytest_cache
**/*.log
39 changes: 8 additions & 31 deletions QUICKSTART.md
@@ -12,41 +12,18 @@
## 1 Query data using Flight clients
This process is the same if you launched the Dremio locally or via docker.

### 1.1 Query your datasets with arrow flight client in python

This lightweight Python client application connects to the Dremio Arrow Flight server endpoint. It requires the username and password for authentication. Developers can use admin or regular user credentials for authentication. Any datasets in Dremio that are accessible by the provided Dremio user can be queried. By default, the hostname is `localhost` and the port is `32010`. Developers can change these default settings by providing the hostname and port as arguments when running the client. Moreover, the tls option can be provided to establish an encrypted connection.

> Note: Trusted certificates must be provided when the tls option is enabled.
### 1.1 Query your datasets with Python
>
#### 1.1.1 Prerequisites

- Python 3

#### 1.1.2 Instructions on using this Python sample application
#### 1.1.1 Getting Started

- This application also requires `pyarrow` and `pandas`. Consider one of the dependency installation methods below. We recommend using `conda` for its ease of use.
- Install dependencies using `conda`
- `conda install -c conda-forge --file requirements.txt`
- Alternatively, install dependencies using `pip`
- `pip3 install -r requirements.txt`
- Run the Python sample application:
- `python3 example.py -host '<DREMIO_HOST>' -user '<DREMIO_USERNAME>' -pass '<DREMIO_PASSWORD>'`
1. Install [Python 3](https://www.python.org/downloads/)
2. Download and install the [dremio-flight-endpoint whl file](https://github.com/dremio-hub/arrow-flight-client-examples/releases)
- `python -m pip install <PATH TO WHEEL>`
3. Copy the contents of arrow-flight-client-examples/python/example.py into your own python file.
4. Run your python file with a local instance of Dremio:
- `python3 example.py --username <USER> --password <password> -query 'SELECT 1'`

#### 1.1.3 Usage
```
example.py [-h] [-host HOSTNAME] [-port FLIGHTPORT] -user USERNAME -pass PASSWORD [-query SQLQUERY] [-tls] [-certs TRUSTEDCERTIFICATES]

optional arguments:
-h, --help show this help message and exit
-host HOSTNAME, --hostname HOSTNAME Dremio co-ordinator hostname
-port FLIGHTPORT, --flightPort FLIGHTPORT Dremio flight server port
-user USERNAME, --username USERNAME Dremio username
-pass PASSWORD, --password PASSWORD Dremio password
-query SQLQUERY, --sqlquery SQLQUERY SQL query to test
-tls, --tls Enable encrypted connection
-certs TRUSTEDCERTIFICATES, --trustedCertificates TRUSTEDCERTIFICATES Path to trusted certificates for encrypted connection
```
---

### 1.2 Query your dataset with arrow flight client in java
This lightweight Java client application connects to the Dremio Arrow Flight server endpoint. It requires the username and password for authentication. Developers can use admin or regular user credentials for authentication. Any datasets in Dremio that are accessible by the provided Dremio user can be queried. By default, the hostname is `localhost` and the port is `32010`. Developers can change these default settings by providing the hostname and port as arguments when running the client. Moreover, the tls option can be provided to establish an encrypted connection.
117 changes: 54 additions & 63 deletions python/README.md
@@ -1,82 +1,73 @@
# Python Arrow Flight Client Application Example
![Build Status](https://github.com/dremio-hub/arrow-flight-client-examples/workflows/python-build/badge.svg)

This lightweight Python client application connects to the Dremio Arrow Flight server endpoint. Developers can use token based or regular user credentials (username/password) for authentication. Please note username/password is not supported for Dremio Cloud. Any datasets in Dremio that are accessible by the provided Dremio user can be queried. By default, the hostname is `localhost` and the port is `32010`. Developers can change these default settings by providing the hostname and port as arguments when running the client.
Moreover, the tls option can be provided to establish an encrypted connection.
## Getting Started
1. Install [Python 3](https://www.python.org/downloads/)
2. Download and install the [dremio-flight-endpoint whl file](https://github.com/dremio-hub/arrow-flight-client-examples/releases)
- `python -m pip install <PATH TO WHEEL>`
3. Copy the contents of arrow-flight-client-examples/python/example.py into your own python file.
4. Run your python file with a local instance of Dremio:
- `python3 example.py --username <USER> --password <password> -query 'SELECT 1'`


## How to connect to Dremio Cloud

Get started with your first query to Dremio Cloud.

* The following example requires you to create a [Personal Access Token](https://docs.dremio.com/cloud/security/authentication/personal-access-token/) in Dremio. Replace ```<INSERT PAT HERE>``` in the example below with your actual PAT token.
* You may need to wait for a Dremio engine to start up or start it manually if no Dremio engine for your Organization is running.

This example queries the Dremio Sample dataset ```NYC-taxi-trips``` and returns the first 10 values.

```python3 example.py -host data.dremio.cloud -port 443 -pat '<INSERT PAT HERE>' -tls -query 'SELECT * FROM Samples."samples.dremio.com"."NYC-taxi-trips" limit 10'```

You have now run your first Flight query on Dremio Cloud!

### Instructions on using this Python sample application
- Install and setup Python3 as `pyarrow` requires Python3
- This application also requires `pyarrow` and `pandas`. Consider one of the dependency installation methods below. We recommend using `conda` for its ease of use.
- Install dependencies using `conda`
- `conda install -c conda-forge --file requirements.txt`
- Alternatively, install dependencies using `pip`
- `pip3 install -r requirements.txt`
- Run the Python sample application with a local instance of Dremio (with default parameters):
- `python3 example.py -query 'SELECT 1'`
## Configuration Options

```
usage: example.py [-h] [-host HOSTNAME] [-port PORT] [-user USERNAME] [-pass PASSWORD]
[-pat, -authToken PAT_OR_AUTH_TOKEN] [-query QUERY] [-tls] [-dsv DISABLE_SERVER_VERIFICATION]
[-certs TRUSTED_CERTIFICATES] [-sessionProperties [SESSION_PROPERTIES ...]] [-engine ENGINE]
usage: example.py [-h] [-host HOSTNAME] [-port PORT] -user USERNAME -pass PASSWORD
-pat TOKEN -query QUERY [-tls]
[-dcv DISABLE_CERTIFICATE_VERIFICATION]
[-path_to_certs PATH_TO_CERTS] [-sp [SESSION_PROPERTIES ...]]
[-engine ENGINE]
optional arguments:
options:
-h, --help show this help message and exit
-host HOSTNAME, --hostname HOSTNAME
Dremio co-ordinator hostname. Defaults to "localhost".
-port PORT, --flightport PORT
Dremio flight server port. Defaults to 32010.
-user USERNAME, --username USERNAME
Dremio username. Defaults to "dremio".
Dremio username. Not applicable when connecting to Dremio
Cloud
-pass PASSWORD, --password PASSWORD
Dremio password. Defaults to "dremio123".
-pat PAT_OR_AUTH_TOKEN, --personalAccessToken PAT_OR_AUTH_TOKEN, -authToken PAT_OR_AUTH_TOKEN, --authToken PAT_OR_AUTH_TOKEN
Either a Personal Access Token or an OAuth2 Token.
Dremio password. Not applicable when connecting to Dremio
Cloud
-pat TOKEN, --token TOKEN
Either a Personal Access Token or an OAuth2 Token. Only
applicable to Dremio Cloud. Use --password if connecting to
Dremio Software using PAT
-query QUERY, --sqlQuery QUERY
SQL query to test.
SQL query to test. Must be enclosed in single quotes. If
single quotes are already present within the query, change
those to double quotes and enclose entire query in single
quotes.
-tls, --tls Enable encrypted connection. Defaults to False.
-dsv DISABLE_SERVER_VERIFICATION, --disableServerVerification DISABLE_SERVER_VERIFICATION
Disable TLS server verification. Defaults to False.
-certs TRUSTED_CERTIFICATES, --trustedCertificates TRUSTED_CERTIFICATES
Path to trusted certificates for encrypted connection. Defaults to system certificates.
-sessionProperties [SESSION_PROPERTIES ...], --sessionProperties [SESSION_PROPERTIES ...]
Key value pairs of SessionProperty, example: -sessionProperties schema='Samples."samples.dremio.com"'
-dcv DISABLE_CERTIFICATE_VERIFICATION, --disableCertificateVerification DISABLE_CERTIFICATE_VERIFICATION
Disables TLS server verification. Defaults to False.
-path_to_certs PATH_TO_CERTS, --trustedCertificates PATH_TO_CERTS
Path to trusted certificates for encrypted connection.
Defaults to system certificates.
-sp [SESSION_PROPERTIES ...], --sessionProperty [SESSION_PROPERTIES ...]
Key value pairs of SessionProperty, example: -sp
schema='Samples."samples.dremio.com"' -sp key=value
-engine ENGINE, --engine ENGINE
The specific engine to run against.
```

### Getting Started

Get started with your first query to Dremio Cloud.

* The following example requires you to create a [Personal Access Token](https://docs.dremio.com/software/security/personal-access-tokens/) in Dremio. Replace ```<INSERT PAT HERE>``` in the example below with your actual PAT token.
* You may need to wait for a Dremio engine to start up if no Dremio engine for your Organization is running.

This example queries the Dremio Sample dataset ```NYC-taxi-trips``` and returns the first 10 values.
The specific engine to run against. Only applicable to Dremio
Cloud.
```python3 example.py -host data.dremio.cloud -port 443 -pat '<INSERT PAT HERE>' -tls -query 'SELECT * FROM Samples."samples.dremio.com"."NYC-taxi-trips" limit 10'```

Running the command will return the following.

``` [INFO] Enabling TLS connection
[INFO] Trusted certificates provided
[INFO] Authentication skipped until first request
[INFO] Query: SELECT * FROM Samples."samples.dremio.com"."NYC-taxi-trips" limit 10
[INFO] GetSchema was successful
[INFO] Schema: <pyarrow._flight.SchemaResult object at 0x7febe2944610>
[INFO] GetFlightInfo was successful
[INFO] Ticket: <Ticket b'\nDSELECT * FROM Samples."samples.dremio.com"."NYC-taxi-trips" limit 10\x12^\n\\\nDSELECT * FROM Samples."samples.dremio.com"."NYC-taxi-trips" limit 10\x10(\x1a\x12\t\x8a\x883#\x12\xd1\xd9\x1d\x11\x00\xb3\xbbC\xdb\xd9J\t'>
[INFO] Reading query results from Dremio
pickup_datetime passenger_count trip_distance_mi fare_amount tip_amount total_amount
0 2013-05-27 19:15:00 1 1.26 7.5 0.00 8.00
1 2013-05-31 16:40:00 1 0.73 5.0 1.20 7.70
2 2013-05-27 19:03:00 2 9.23 27.5 5.00 38.33
3 2013-05-31 16:24:00 1 2.27 12.0 0.00 13.50
4 2013-05-27 19:17:00 1 0.71 5.0 0.00 5.50
5 2013-05-27 19:11:00 1 2.52 10.5 3.15 14.15
6 2013-05-31 16:41:00 5 1.01 6.0 1.10 8.60
7 2013-05-31 16:37:00 1 1.25 8.5 0.00 10.00
8 2013-05-31 16:39:00 1 2.04 10.0 1.50 13.00
9 2013-05-27 19:02:00 1 11.73 32.5 8.12 41.12
```

You have now run your first Flight query on Dremio Cloud!
## Description
![Build Status](https://github.com/dremio-hub/arrow-flight-client-examples/workflows/python-build/badge.svg)
This lightweight Python client application connects to the Dremio Arrow Flight server endpoint. Developers can use token based or regular user credentials (username/password) for authentication. Please note username/password is not supported for Dremio Cloud. Any datasets in Dremio that are accessible by the provided Dremio user can be queried. By default, the hostname is `localhost` and the port is `32010`. Developers can change these default settings by providing the hostname and port as arguments when running the client.
Moreover, the tls option can be provided to establish an encrypted connection.
Empty file.
Empty file.
128 changes: 128 additions & 0 deletions python/dremio-flight/dremio/arguments/parse.py
@@ -0,0 +1,128 @@
"""
Copyright (C) 2017-2021 Dremio Corporation
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
"""

import argparse
import sys
import certifi


class KVParser(argparse.Action):
    def __call__(self, parser, namespace, values, option_string=None):
        setattr(namespace, self.dest, [])

        dest = list(
            map(
                lambda value: list(
                    map(lambda split_val: split_val.encode("utf-8"), value.split("="))
                ),
                values,
            )
        )
        setattr(namespace, self.dest, dest)


def parse_arguments():
    """
    Parses the command-line arguments supplied to the script.
    """

    parser = argparse.ArgumentParser()
    parser.add_argument(
        "-host",
        "--hostname",
        type=str,
        help='Dremio co-ordinator hostname. Defaults to "localhost".',
        default="localhost",
    )
    parser.add_argument(
        "-port",
        "--flightport",
        dest="port",
        type=int,
        help="Dremio flight server port. Defaults to 32010.",
        default=32010,
    )
    parser.add_argument(
        "-user",
        "--username",
        type=str,
        help="Dremio username. Not applicable when connecting to Dremio Cloud",
        required="-pat" not in sys.argv and "--token" not in sys.argv,
    )
    parser.add_argument(
        "-pass",
        "--password",
        type=str,
        help="Dremio password. Not applicable when connecting to Dremio Cloud",
        required="-pat" not in sys.argv and "--token" not in sys.argv,
    )
    parser.add_argument(
        "-pat",
        "--token",
        dest="token",
        type=str,
        help="Either a Personal Access Token or an OAuth2 Token. Only applicable to Dremio Cloud. Use --password if connecting to Dremio Software using PAT",
        required="-user" not in sys.argv and "--username" not in sys.argv,
    )
    parser.add_argument(
        "-query",
        "--sqlQuery",
        dest="query",
        type=str,
        help="SQL query to test. Must be enclosed in single quotes. If single quotes are already present within the query, change those to double quotes and enclose entire query in single quotes.",
        required=True,
    )
    parser.add_argument(
        "-tls",
        "--tls",
        dest="tls",
        help="Enable encrypted connection. Defaults to False.",
        default=False,
        action="store_true",
    )
    parser.add_argument(
        "-dcv",
        "--disableCertificateVerification",
        dest="disable_certificate_verification",
        type=bool,
        help="Disables TLS server verification. Defaults to False.",
        default=False,
    )
    parser.add_argument(
        "-path_to_certs",
        "--trustedCertificates",
        dest="path_to_certs",
        type=str,
        help="Path to trusted certificates for encrypted connection. Defaults to system certificates.",
        default=certifi.where(),
    )
    parser.add_argument(
        "-sp",
        "--sessionProperty",
        dest="session_properties",
        help="Key value pairs of SessionProperty, example: -sp schema='Samples.\"samples.dremio.com\"' -sp key=value",
        required=False,
        nargs="*",
        action=KVParser,
    )
    parser.add_argument(
        "-engine",
        "--engine",
        type=str,
        help="The specific engine to run against. Only applicable to Dremio Cloud.",
        required=False,
    )
    return parser.parse_args()
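
Two behaviours of the parser in `parse.py` are easy to miss when reading the diff, so the following is a minimal standard-library sketch that reproduces them. It is not part of the commit: `build_parser` is a hypothetical helper that keeps only the `-sp` and `-dcv` options with the same `nargs`, `action`, `type` and `default` settings as above, and the `KVParser` here is a condensed copy of the same split-and-encode logic.

```python
import argparse


class KVParser(argparse.Action):
    """Condensed copy of the logic in parse.py: split each KEY=VALUE on '=' and UTF-8 encode."""

    def __call__(self, parser, namespace, values, option_string=None):
        pairs = [[part.encode("utf-8") for part in value.split("=")] for value in values]
        # As in parse.py, the destination is overwritten on every call, not extended.
        setattr(namespace, self.dest, pairs)


def build_parser():
    # Hypothetical helper for illustration; only two options from parse.py are reproduced.
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "-sp",
        "--sessionProperty",
        dest="session_properties",
        nargs="*",
        action=KVParser,
    )
    parser.add_argument(
        "-dcv",
        "--disableCertificateVerification",
        dest="disable_certificate_verification",
        type=bool,
        default=False,
    )
    return parser


parser = build_parser()

# Several pairs after a single -sp are all kept.
together = parser.parse_args(["-sp", "a=1", "b=2"]).session_properties

# Repeating -sp invokes the action again, which replaces the earlier list.
repeated = parser.parse_args(["-sp", "a=1", "-sp", "b=2"]).session_properties

# type=bool calls bool() on the raw string, and any non-empty string is truthy.
dcv_false_string = parser.parse_args(["-dcv", "False"]).disable_certificate_verification

print(together)
print(repeated)
print(dcv_false_string)
```

In this sketch, passing `-sp` twice keeps only the last pair, and `-dcv False` still evaluates to `True`. If the intent is to accumulate repeated `-sp` options or to treat `-dcv` as a plain switch, an accumulating action and `action="store_true"` (already used for `-tls`) would be possible alternatives; these are suggestions, not what the commit implements.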
Empty file.