Utilizing Bitquery's Blockchain Data with Amazon Athena: A Comprehensive Guide

Bitquery offers a rich repository of blockchain data, enabling developers to access and analyze information from various blockchains.

This tutorial will guide you through leveraging Bitquery’s blockchain data with Amazon Athena for insightful analysis.

Amazon Athena allows you to query Amazon S3 Inventory files using standard SQL queries. This feature is available in all Regions where Athena is supported.

AWS Athena Query Architecture Diagram(Source: JasCloudTech)

Amazon Athena can effectively query Amazon S3 Inventory files stored in either Apache ORC, Apache Parquet, or comma-separated values (CSV) formats. For optimal performance and cost-efficiency, we recommend using ORC or Parquet formats for your inventory files.

These columnar formats enable faster data retrieval and processing by allowing Athena to read and decompress only the columns needed for the specific query.

Prerequisites

Before we get started, here are some things that you need to have in hand:

  • [image]An AWS account with access to Amazon Athena (obviously).
  • [image]A Bitquery account for accessing their S3 blockchain data.
  • [image]Familiarity with the structure of the blockchain data provided by Bitquery S3 bucket.

How to Use Bitquery’s Blockchain Data with Amazon Athena

Access AWS Management Console:

Log in to your AWS account and navigate to the AWS Management Console. Make sure your region is set to N.Virginia/us-1-east.

AWS Management Console with Athena Service Search(AWS)

Open Amazon Athena: Find and open the Amazon Athena service from the list of AWS services.

Set Up a Database in Athena (if required):

In this case, before your first query you need to set up a query result location in Amazon s3 as shown below:

If you haven’t created a database in Athena before, click on the “Query Editor” on the left-hand side, and then click “Set up a query result location in Amazon S3.”

Define a location for query results.

Obtain Bitquery’s S3 Bucket Information:

Log in to your Bitquery account and gather details about the S3 bucket containing Bitquery’s blockchain data.

Retrieve information such as the S3 bucket name and the structure of the stored data. In this case the name is streaming-eth

Create an External Table for Bitquery’s Blockchain Data:

In the Athena Query Editor, create an external table that references Bitquery’s blockchain data stored in the S3 bucket. For example:


CREATE EXTERNAL TABLE IF NOT EXISTS bitquery_data (

your_columns_here STRING

)

ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'

WITH SERDEPROPERTIES (

'serialization.format' = 'org.apache.hadoop.mapred.TextInputFormat',

'field.delim' = ','

)

LOCATION 's3://streaming-eth/'

Replace your_columns_here with the columns you want included for your query.

Adjust the LOCATION and column names based on data structure on the S3 bucket.

Create table screenshot

Run Queries on Bitquery’s Blockchain Data:

Here’s how to access your S3 Inventory data using Athena:

  1. Create an Athena table:Refer to the Creating Tables in Amazon Athena guide for details.

  2. Choose a sample query template:

ORC-formatted inventory:

  • Replace your_table_name with your actual Athena table name.

  • Remove any unused optional fields.

  • Replace the bucket name and inventory location with your configuration’s details.

  • Update the 2022-01-01-00-00 date with the first day of your data partition.

Remember:

  • Use standard SQL queries for your chosen format (ORC, Parquet, CSV).

  • Athena can access your inventory data in any region where it’s available.

Also note that:

Demo S3 buckets are open for a limited access from public domain after the blockchain name with demo-streaming- prefix:

demo-streaming-eth for Ethereum Mainnet

demo-streaming-bsc for Binance Smart Chain Mainnet

Buckets are open to list content ( using S3 API ) and get objects ( using S3 and HTTP access). The easiest way to try them is just download the content using HTTP URLs.

Find more Bitquery Documentation.

Once the table is created, execute SQL queries on Bitquery’s blockchain data you downloaded:

For instance, to query for a bucket containing blockchain data for Ethereum Mainnet, you would need to replace [insert bucket name] with the specific bucket name where your downloaded Ethereum Mainnet blockchain data is stored.


CREATE EXTERNAL TABLE your_table_name(

bucket string,

key string,

version_id string,

is_latest boolean,

is_delete_marker boolean,

size bigint,

last_modified_date timestamp,

e_tag string,

storage_class string,

is_multipart_uploaded boolean,

replication_status string,

encryption_status string,

object_lock_retain_until_date bigint,

object_lock_mode string,

object_lock_legal_hold_status string,

intelligent_tiering_access_tier string,

bucket_key_status string,

checksum_algorithm string,

object_access_control_list string,

object_owner string

) PARTITIONED BY (

dt string

)

ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'

STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.SymlinkTextInputFormat'

OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'

LOCATION 's3://source-bucket/config-ID/hive/[insert Ethereum Mainnet bucket name]'

TBLPROPERTIES (

"projection.enabled" = "true",

"projection.dt.type" = "date",

"projection.dt.format" = "yyyy-MM-dd-HH-mm",

"projection.dt.range" = "2023-12-01-00-00,NOW",

"projection.dt.interval" = "1",

"projection.dt.interval.unit" = "HOURS"

);

Replace [insert Ethereum Mainnet bucket name] with the actual name of the bucket that contains the Ethereum Mainnet blockchain data (downloaded).

The above SQL query will create an external table pointing to that specific S3 bucket location, assuming the data is stored there in the format specified by the table schema.

You can adjust the schema and settings as needed based on the actual structure of your Ethereum Mainnet blockchain data.

Below is how the data would appear()not actual bucket query result)

Sample appearance of Athena SQL bucket query.

Analyze and Visualize Bitquery’s Blockchain Data:

Leverage Athena’s querying capabilities to analyze and derive insights from Bitquery’s S3 blockchain data.

You can perform operations like aggregations, filtering, and joins to extract meaningful information.

Monitor AWS Cost:

Keep an eye on the AWS Cost Explorer to track costs associated with running queries in Athena.

Costs are based on the data scanned during queries, so optimize queries to minimize unnecessary scanning.

Review Query Results and Export Data:

Athena displays query results in the Query Editor. Review, export, or save results to S3 or other destinations for further analysis or visualization.

This tutorial provides a foundational guide for utilizing Amazon Athena to query and analyze blockchain data stored in Bitquery’s S3 bucket. Modify queries and table structures as needed based on Bitquery’s specific data format and structure.

Remember: Always optimize queries to minimize data scanned and associated costs. Adjust query complexity based on the volume of data processed from Bitquery’s S3 bucket.

About Bitquery

Bitquery is your comprehensive toolkit designed with developers in mind, simplifying blockchain data access. Our products offer practical advantages and flexibility.

APIs - Explore API: Easily retrieve precise real-time and historical data for over 40 blockchains using GraphQL. Seamlessly integrate blockchain data into your applications, making data-driven decisions effortless.

Coinpath® - Try Coinpath: Streamline compliance and crypto investigations by tracing money movements across 40+ blockchains. Gain insights for efficient decision-making.

Data in Cloud - Try Demo Bucket: Access indexed blockchain data cost-effectively and at scale for your data pipeline. We currently support Ethereum, BSC, Solana, with more blockchains on the horizon, simplifying your data access.

Explorer - Try Explorer: Discover an intuitive platform for exploring data from 40+ blockchains. Visualize data, generate queries, and integrate effortlessly into your applications.

Bitquery empowers developers with straightforward blockchain data tools. If you have questions or need assistance, connect with us on our Telegram channel or via email at hello@bitquery.io. Stay updated on the latest in cryptocurrency by subscribing to our newsletter below.

Blog is written by guest author Thiongo Waweru