This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
import os | |
import pandas as pd | |
import google_auth_oauthlib.flow | |
import googleapiclient.discovery | |
import googleapiclient.errors | |
import matplotlib | |
scopes = ["https://www.googleapis.com/auth/youtube.readonly"] |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
# Define the CREATE TABLE statement here | |
{%- macro create_table() %} | |
... | |
{%- endmacro %} | |
# Main ETL logic, insert the results into a STAGING table | |
{%- macro main() %} | |
... | |
{%- endmacro %} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
CREATE TABLE IF NOT EXISTS fct_bookings ( | |
id_listing BIGINT COMMENT 'Unique ID of the listing' | |
, id_host BIGINT COMMENT 'Unique ID of the host who owns the listing' | |
, m_bookings BIGINT COMMENT 'Denoted 1 if a booking transaction occurred' | |
) | |
PARTITION BY ( -- this is how we define partition keys | |
ds STRING | |
); |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
{%- if backfill %} | |
INSERT OVERWRITE TABLE bookings_summary PARTITION (ds) | |
{%- else %} | |
INSERT OVERWRITE TABLE bookings_summary PARTITION (ds = '{{ ds }}') | |
{%- endif %} | |
SELECT | |
dim_market | |
, SUM(m_bookings) AS m_bookings | |
{%- if backfill %} | |
, ds |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
-- Not Recommended Approach: Scan the entire table and rebuild everyday | |
INSERT OVERWRITE TABLE dim_total_bookings PARTITION (ds = '{{ ds }}') | |
SELECT | |
dim_market | |
, SUM(m_bookings) AS m_bookings | |
FROM | |
fct_bookings | |
WHERE | |
ds <= '{{ ds }}' -- this is expensive, and can quickly run into scale issue | |
GROUP BY |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Column | Type | Extra | Comment | |
--------------------+---------+-----------------+------------------------------------------- | |
id_listing | bigint | | Unique id of the listing. | |
id_host | bigint | | Unique id of the host for the listing | |
dim_market | varchar | | The market in which the listing is located | |
ds | varchar | partition key | | |
(4 rows) |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
INSERT OVERWRITE TABLE bookings_summary PARTITION (ds) | |
SELECT | |
dim_market | |
, SUM(m_bookings) AS m_bookings | |
, ds # For Hive to know we are using dynamic partitions | |
FROM | |
fct_bookings | |
WHERE | |
ds BETWEEN '{{ earliest_ds }}' AND '{{ latest_ds }}' | |
GROUP BY |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
INSERT OVERWRITE TABLE fct_bookings PARTITION (ds='{{ latest_ds }}') | |
SELECT | |
id_listing | |
, m_bookings | |
FROM | |
some_up_stream_booking_tables | |
WHERE | |
ds = '{{ latest_ds }}' | |
; |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
INSERT OVERWRITE TABLE bookings_summary PARTITION (ds= '{{ earliest_ds }}') | |
SELECT | |
dim_market | |
, SUM(m_bookings) AS m_bookings | |
FROM | |
fct_bookings | |
WHERE | |
ds = '{{ earliest_ds }}' | |
GROUP BY | |
dim_market |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
SELECT | |
b.dim_market | |
, SUM(a.m_bookings) AS m_bookings | |
FROM ( | |
SELECT | |
id_listing | |
, 1 AS m_bookings | |
, m_a # not used (for illustration only) | |
, m_b # not used (for illustration only) | |
, m_c # not used (for illustration only) |
NewerOlder