firebolt-db/airbyte-readme

Firebolt Destination

This page guides you through the process of setting up the Firebolt destination connector.

Prerequisites

This Firebolt destination connector has two replication strategies:

  1. SQL: Replicates data via SQL INSERT queries. This uses the Firebolt SDK to execute queries directly on Firebolt engines. Not recommended for production workloads because it does not scale well.

  2. S3: Replicates data by first uploading data to an S3 bucket, creating an External Table and writing into a final Fact Table. This is the recommended loading approach. Requires an S3 bucket and credentials in addition to Firebolt credentials.
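To make the SQL strategy more concrete, here is a minimal sketch of how a batch of records could be turned into a single multi-row INSERT statement. It is not the connector's actual code: the function name `build_insert`, the table name, and the `?` placeholder style are assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple


def build_insert(
    table: str, columns: Sequence[str], rows: List[Sequence[object]]
) -> Tuple[str, list]:
    """Build one multi-row INSERT statement with positional placeholders.

    Hypothetical helper for illustration only.
    """
    if not rows:
        raise ValueError("at least one row is required")
    for row in rows:
        if len(row) != len(columns):
            raise ValueError("row length does not match the column list")

    # One "(?, ?, ...)" group per row, joined into a single VALUES clause.
    row_placeholder = "(" + ", ".join("?" for _ in columns) + ")"
    values_clause = ", ".join(row_placeholder for _ in rows)
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES {values_clause}"

    # Flatten the rows so the values line up with the placeholders.
    params = [value for row in rows for value in row]
    return sql, params


sql, params = build_insert(
    "my_stream_raw", ["id", "payload"], [(1, "a"), (2, "b")]
)
print(sql)
```

Every batch still travels through the engine as a query, which is one reason this approach tends to scale worse than staging files in S3.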

For SQL strategy:

  • Host
  • Username
  • Password
  • Database
  • Engine (optional)

Airbyte automatically picks a strategy based on the given configuration: if an S3 configuration is present, Airbyte uses the S3 strategy; otherwise it uses the SQL strategy.

For S3 strategy:

  • Username
  • Password
  • Database
  • S3 Bucket Name
    • See the AWS documentation to create an S3 bucket.
  • S3 Bucket Region
    • Create the S3 bucket in the same region as the Firebolt database.
  • Access Key Id
  • Secret Access Key
    • The secret key that corresponds to the Access Key Id above.
  • Host (optional)
    • Firebolt backend URL. Can be left blank for most use cases.
  • Engine (optional)
    • If connecting to a non-default engine, specify its name or URL here.
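The selection rule between the two strategies can be sketched as follows. This is an illustrative approximation, not the connector's implementation: the configuration key names (`s3_bucket`, `s3_region`, `aws_key_id`, `aws_key_secret`, and so on) are hypothetical, and rejecting a partially filled S3 configuration is a design choice of this sketch.

```python
from typing import Mapping

# Hypothetical key names; the real connector's configuration keys may differ.
COMMON_KEYS = ("username", "password", "database")
S3_KEYS = ("s3_bucket", "s3_region", "aws_key_id", "aws_key_secret")


def pick_strategy(config: Mapping[str, str]) -> str:
    """Return "S3" when S3 settings are present, otherwise "SQL"."""
    missing = [key for key in COMMON_KEYS if not config.get(key)]
    if missing:
        raise ValueError(f"missing required settings: {', '.join(missing)}")

    present = [key for key in S3_KEYS if config.get(key)]
    if not present:
        return "SQL"
    if len(present) < len(S3_KEYS):
        # Sketch-level choice: fail loudly rather than silently fall back.
        absent = [key for key in S3_KEYS if key not in present]
        raise ValueError(f"incomplete S3 configuration, missing: {', '.join(absent)}")
    return "S3"


base = {"username": "user@example.com", "password": "secret", "database": "my_db"}
s3 = {
    "s3_bucket": "my-airbyte-staging",
    "s3_region": "us-east-1",
    "aws_key_id": "AKIAEXAMPLE",
    "aws_key_secret": "example-secret",
}
print(pick_strategy(base))
print(pick_strategy({**base, **s3}))
```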

Setup guide

  1. Create a Firebolt account following the guide.
  2. Follow the getting started tutorial to set up a database.
  3. Create a General Purpose (read-write) engine as described here.
  4. (Optional) Create a staging S3 bucket (for the S3 strategy).
  5. (Optional) Create an IAM user with programmatic access to read, write, and delete objects in the S3 bucket.
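For the optional IAM step, a policy granting read, write, and delete access to objects in the staging bucket might look like the sketch below. The bucket name `my-airbyte-staging` and the helper `staging_bucket_policy` are placeholders, and a real setup may need additional permissions depending on how the bucket is used.

```python
import json


def staging_bucket_policy(bucket: str) -> dict:
    """Build an IAM policy document for object-level access to one bucket.

    Hypothetical helper; adjust the actions to your own requirements.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Read, write, and delete objects, as the setup step describes.
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }


policy = staging_bucket_policy("my-airbyte-staging")
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the IAM console as an inline or managed policy.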

Supported sync modes

The Firebolt destination connector supports the following sync modes:

  • Full Refresh
  • Incremental - Append Sync

Connector-specific features & highlights

Output schema

Each stream will be output into its own raw Fact table in Firebolt. Each table will contain 3 columns:

  • _airbyte_ab_id: a UUID assigned by Airbyte to each event that is processed. The column type in Firebolt is VARCHAR.
  • _airbyte_emitted_at: a timestamp representing when the event was pulled from the data source. The column type in Firebolt is TIMESTAMP.
  • _airbyte_data: a JSON blob representing the event data. The column type in Firebolt is VARCHAR, but it can be parsed with JSON functions.
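The shape of a raw row can be sketched in a few lines of Python. The helper `to_raw_row` is hypothetical and only mirrors the three columns described above. In particular, it uses the current time as a stand-in for the moment the event was pulled from the source.

```python
import json
import uuid
from datetime import datetime, timezone


def to_raw_row(record: dict) -> dict:
    """Wrap one source record in the three raw-table columns."""
    return {
        # Stored as VARCHAR in Firebolt.
        "_airbyte_ab_id": str(uuid.uuid4()),
        # Stand-in for the time the event was pulled from the source.
        "_airbyte_emitted_at": datetime.now(timezone.utc),
        # The whole record is serialized into a single JSON string.
        "_airbyte_data": json.dumps(record),
    }


row = to_raw_row({"id": 1, "name": "Ada"})

# Because _airbyte_data is plain JSON text, it can be decoded again later.
decoded = json.loads(row["_airbyte_data"])
print(decoded["name"])
```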

Firebolt Source

Overview

The Firebolt source allows you to sync your data from Firebolt. Only Full Refresh is supported at the moment.

The connector is built on top of the pure-Python firebolt-sdk and does not require additional dependencies.

Resulting schema

The Firebolt source does not alter the schema present in your database. Depending on the destination connected to this source, however, the resulting schema may be altered. See the destination's documentation for more details.

Features

Feature                      Supported? (Yes/No)   Notes
Full Refresh Sync            Yes
Incremental - Append Sync    No

Getting started

Requirements

  1. An existing AWS account

Setup guide

  1. Create a Firebolt account following the guide

  2. Follow the getting started tutorial to set up a database

  3. Load data

  4. Create an Analytics (read-only) engine as described here

You should now have the following:

  1. An existing Firebolt account
  2. Your connection parameters at hand:
    1. Username
    2. Password
    3. Account, in case of a multi-account setup (Optional)
    4. Host (Optional)
    5. Engine (Optional), preferably Analytics/read-only
  3. A running engine (if an engine is stopped or still booting up, you won't be able to connect to it)
  4. Your data in either Fact or Dimension tables.
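One way to keep the connection parameters listed above together is a small data class that omits the optional values when they are not set. This is only a sketch: the class name `FireboltSourceConfig` and its `to_params` method are hypothetical and are not part of Airbyte or firebolt-sdk.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class FireboltSourceConfig:
    """Connection parameters for the source; optional ones default to None."""

    username: str
    password: str
    account: Optional[str] = None  # only needed in a multi-account setup
    host: Optional[str] = None
    engine: Optional[str] = None  # preferably an Analytics (read-only) engine

    def to_params(self) -> dict:
        # Drop unset optional values so only explicit settings are passed on.
        return {key: value for key, value in asdict(self).items() if value is not None}


config = FireboltSourceConfig(
    username="user@example.com",
    password="secret",
    engine="my_analytics_engine",
)
params = config.to_params()
print(sorted(params))
```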

You can now use the Airbyte Firebolt source.

About

Readme for temporary airbyte packages
