SQLAlchemy has two parts:
* ORM: this maps table and relationships between tables to Python objects
* Core: alows you to write and execute SQL using Python exspressions
Select statment:
import pandas as pd
import pandas as pd | |
def _map_to_pandas(rdds): | |
""" Needs to be here due to pickling issues """ | |
return [pd.DataFrame(list(rdds))] | |
def toPandas(df, n_partitions=None): | |
""" | |
Returns the contents of `df` as a local `pandas.DataFrame` in a speedy fashion. The DataFrame is | |
repartitioned if `n_partitions` is passed. |
import pandas as pd | |
from sqlalchemy import create_engine | |
df = pd.read_csv('<INPUT_FILE_PATH>.csv', encoding='utf') # Needs to be unicode for sqlite | |
disk_engine = create_engine('sqlite:///<DB_PATH>.sqlite') | |
df.to_sql('<TABLE_NAME>', disk_engine, if_exists='append') |
import pandas as pd | |
def nunique_rolling_time_series(data_series, step_freqency, window_size, output_name=''): | |
""" | |
Calculate a rolling statistic of nunique of a time series. The input series has a DateTime index. | |
""" | |
data_series = data_series.sort_index() | |
min_date = data_series.index.min() |
Acorn_Category | Acorn_Category_Name | Acorn_Group | Acorn_Group_Name | Acorn_Type | Acorn_Type_Code | Acorn_Type_Name | |
---|---|---|---|---|---|---|---|
1 | 1 Affluent Achievers | A | 1.A Lavish Lifestyles | 1 | 1.A.1 | 1.A.1 Exclusive enclaves | |
1 | 1 Affluent Achievers | A | 1.A Lavish Lifestyles | 2 | 1.A.2 | 1.A.2 Metropolitan money | |
1 | 1 Affluent Achievers | A | 1.A Lavish Lifestyles | 3 | 1.A.3 | 1.A.3 Large house luxury | |
1 | 1 Affluent Achievers | B | 1.B Executive Wealth | 4 | 1.B.4 | 1.B.4 Asset rich families | |
1 | 1 Affluent Achievers | B | 1.B Executive Wealth | 5 | 1.B.5 | 1.B.5 Wealthy countryside commuters | |
1 | 1 Affluent Achievers | B | 1.B Executive Wealth | 6 | 1.B.6 | 1.B.6 Financially comfortable families | |
1 | 1 Affluent Achievers | B | 1.B Executive Wealth | 7 | 1.B.7 | 1.B.7 Affluent professionals | |
1 | 1 Affluent Achievers | B | 1.B Executive Wealth | 8 | 1.B.8 | 1.B.8 Prosperous suburban families | |
1 | 1 Affluent Achievers | B | 1.B Executive Wealth | 9 | 1.B.9 | 1.B.9 Well-off edge of towners |
To have both Anaconda Python installed with both Python 2 and 3 running simultaneously: | |
1. Download Python Anconda 3 | |
2. Create a Python 2.7 environment `conda create -n py3k python=3.4 anaconda` | |
3. Navigate to the install directory. Something like `Anaconda3\envs\py2k` | |
4. Rename python to python2. And in scripts pip to pip2 | |
5 Add those two directories to the path |
import pandas as pd | |
def clean_str_cols(df, encoding='ascii'): | |
""" | |
As string columns are stored as 'objects' it can cause many problems, especially when reading and writig CSVs. This function | |
forces the columns to be strings and be encoded as a specified encoding. | |
Solves `UnicodeEncodeError` errors when using `to_csv`. | |
""" | |
df = df.copy() |
Force cast conversion to integer or output null:
CREATE OR REPLACE FUNCTION convert_to_integer(v_input text)
RETURNS BIGINT AS $$
DECLARE v_int_value BIGINT DEFAULT NULL;
BEGIN
BEGIN
v_int_value := v_input::BIGINT;
EXCEPTION WHEN OTHERS THEN
RAISE NOTICE 'Invalid integer value: "%". Returning NULL.', v_input;