From S3
Amazon S3 is a popular cloud-based storage service that is often used to store large amounts of data. You can load CSV data or Parquet data from S3 to Hydra.
Loading CSV data from S3
You can load data or run queries against CSV files stored on Amazon S3 using an S3 CSV External Table. S3 CSV External Tables are implemented using s3csv_fdw. To create an S3 CSV External Table, first create a data.csv file with the following content:
Upload the file to S3 and create a multicorn S3 CSV foreign table, replacing ... with your AWS credentials and S3 bucket name:
You can now load this data into Hydra using an INSERT ... SELECT query:
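As a minimal sketch of this step, the statements below copy the rows from the foreign table into a regular Hydra table. The names s3_csv_table (the foreign table) and local_csv_data (the destination table) are hypothetical placeholders; substitute the names you used when creating the foreign table.

```sql
-- Assumption: s3_csv_table is the S3 CSV foreign table created in the previous step.
-- Assumption: local_csv_data is a hypothetical name for the destination table.

-- Create a local table with the same column definitions as the foreign table.
CREATE TABLE local_csv_data (LIKE s3_csv_table);

-- Read every row from S3 once and store it locally in Hydra.
INSERT INTO local_csv_data
SELECT * FROM s3_csv_table;
```

After the insert completes, queries against local_csv_data no longer read from S3.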
Loading Parquet data from S3
You can load data or run queries against Apache Parquet files stored on Amazon S3 using an S3 Parquet External Table. S3 Parquet External Tables are implemented using parquet_s3_fdw. As an example, we use the same data from here.
The column details are as follows:
Upload the Parquet files to an S3 bucket folder called sample-data, and create an S3 Parquet foreign table, replacing ... with your AWS credentials, region, and S3 bucket name:
You can now read data from the Parquet file using SELECT ... FROM userdata. Note that every query reads the data from S3 again, incurring charges on your AWS account. For better performance and to avoid ongoing charges, we recommend caching the data locally in Hydra by creating a materialized view:
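A minimal sketch of such a materialized view, using standard PostgreSQL syntax. It assumes the foreign table is named userdata, as in the query above; the view name userdata_cached is a hypothetical placeholder.

```sql
-- Assumption: userdata is the S3 Parquet foreign table created above.
-- Assumption: userdata_cached is a hypothetical name for the local cache.

-- Read the Parquet data from S3 once and store the result locally.
CREATE MATERIALIZED VIEW userdata_cached AS
SELECT * FROM userdata;

-- Re-read from S3 only when the source files change.
REFRESH MATERIALIZED VIEW userdata_cached;
```

Queries against userdata_cached are served from local storage; only the initial creation and each explicit refresh read from S3.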
Alternatively, insert the data into a table:
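A minimal sketch of the table-based approach, again assuming the foreign table is named userdata; the table name userdata_local is a hypothetical placeholder.

```sql
-- Assumption: userdata is the S3 Parquet foreign table created above.
-- Assumption: userdata_local is a hypothetical name for the destination table.

-- Create a local table with the same column definitions as the foreign table.
CREATE TABLE userdata_local (LIKE userdata);

-- Copy the rows from S3 into the local table.
INSERT INTO userdata_local
SELECT * FROM userdata;
```

Unlike a materialized view, a regular table can be modified after loading, and further INSERT ... SELECT statements can append new data to it.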