DBT Transformation & Model Management

Data transformation ensures your data warehouse is a single, consistent source of centralized data. We recommend using DBT, which is open-source, runs where the data lives, and is easy to use.

Using DBT with dbt-postgres

Using DBT with Hydra uses DBT's Postgres support.

Installing dbt-postgres

pip is the easiest way to install the adapter:

pip install dbt-postgres

Installing dbt-postgres will also install dbt-core and any other dependencies.

Other ways to install dbt Core on the command line:

Configuring dbt-postgres


"Unlogged" tables can be considerably faster than ordinary tables, as they are not written to the write-ahead log nor replicated to read replicas. They are also considerably less safe than ordinary tables. See Postgres docs for details.

{{ config(materialized='table', unlogged=True) }}
select ...
  +unlogged: true


It's important to create indexes for columns that are commonly used in joins or where clauses.

Table models, incremental models, seeds, and snapshots may have a list of indexes defined. Each Postgres index can have three components:

  • columns (list, required): one or more columns on which the index is defined

  • unique (boolean, optional): whether the index should be declared unique

  • type (string, optional): a supported index type (B-tree, Hash, GIN, etc)

{{ config(
    materialized = 'table',
      {'columns': ['column_a'], 'type': 'hash'},
      {'columns': ['column_a', 'column_b'], 'unique': True},
select ...

If one or more indexes are configured on a resource, dbt will run create index DDL statement(s) as part of that resource's materialization, within the same transaction as its main create statement. For the index's name, dbt uses a hash of its properties and the current timestamp, in order to guarantee uniqueness and avoid namespace conflict with other indexes.

create index if not exists
on "my_target_database"."my_target_schema"."indexed_model"
 using hash

create unique index if not exists
on "my_target_database"."my_target_schema"."indexed_model"
 (column_a, column_b);

You can also configure indexes for a number of resources at once:

        - columns: ['column_a']
          type: hash

Profile Configuration

Postgres targets should be set up using the following configuration in your profiles.yml file.

  target: dev
      type: postgres
      host: [hostname]
      user: [username]
      password: [password]
      port: [port]
      dbname: [database name]
      schema: [dbt schema]
      threads: [1 or more]
      [keepalives_idle](<https://docs.getdbt.com/reference/warehouse-setups/postgres-setup#keepalives_idle>): 0 # default 0, indicating the system default. See below
      connect_timeout: 10 *# default 10 seconds*
      [retries](<https://docs.getdbt.com/reference/warehouse-setups/postgres-setup#retries>): 1  # default 1 retry on error/timeout when opening connections
      [search_path](<https://docs.getdbt.com/reference/warehouse-setups/postgres-setup#search_path>): [optional, override the default postgres search_path]
      [role](<https://docs.getdbt.com/reference/warehouse-setups/postgres-setup#role>): [optional, set the role dbt assumes when executing queries]
      [sslmode](<https://docs.getdbt.com/reference/warehouse-setups/postgres-setup#sslmode>): [optional, set the sslmode used to connect to the database]



The search_path config controls the Postgres "search path" that dbt configures when opening new connections to the database. By default, the Postgres search path is "$user, public", meaning that unqualified table names will be searched for in the public schema, or a schema with the same name as the logged-in user. Note: Setting the search_path to a custom value is not necessary or recommended for typical usage of dbt.


The role config controls the Postgres role that dbt assumes when opening new connections to the database.


The sslmode config controls how dbt connectes to Postgres databases using SSL. See the Postgres docs on sslmode for usage information. When unset, dbt will connect to databases using the Postgres default, prefer, as the sslmode.


If the database closes its connection while dbt is waiting for data, you may see the error SSL SYSCALL error: EOF detected. Lowering the [keepalives_idle value](https://www.postgresql.org/docs/9.3/libpq-connect.html) may prevent this, because the server will send a ping to keep the connection active more frequently.

dbt's default setting is 0 (the server's default value), but can be configured lower (perhaps 120 or 60 seconds), at the cost of a chattier network connection.

Last updated