Last active
August 1, 2022 17:36
Revisions
shaypal5 revised this gist
Aug 1, 2022. 1 changed file with 1 addition and 1 deletion.
@@ -13,7 +13,7 @@
>>> mp.pipeline
A pdpipe pipeline:
[ 0] Drop columns Columns with at least 0.2 missing value rate
[ 1] Drop rows by label values
[ 2] Encode label values
[ 3] Drop columns 'Name'
[ 4] Apply dataframe method set_index with kwargs {'keys': 'id'}
shaypal5 revised this gist
Aug 1, 2022. 1 changed file with 16 additions and 15 deletions.
@@ -16,20 +16,21 @@
[ 1] Drop labels by values
[ 2] Encode label values
[ 3] Drop columns 'Name'
[ 4] Apply dataframe method set_index with kwargs {'keys': 'id'}
[ 5] Drop rows by qualifier <RowQualifier: Qualify rows with X[Savings] > 101>
[ 6] Assign column Viking with df[Country].isin(['Denmark', 'Finland']) & ~df[Bearded]
[ 7] Assign column YearlyGrands with df[Savings] * 1000 / df[Age]
[ 8] Bin Savings by [1].
[ 9] One-hot encode 'Country'
[10] Tokenize Quote
[11] Stemming tokens in Quote...
[12] Remove stopwords from Quote
[13] Count-vectorizing column Quote.
[14] Decompose columns Columns that start with Quote with PCA
[15] Encode 'Savings_bin', 'Gender'
[16] Scale columns Columns of dtypes <class 'numpy.number'>
[17] Drop columns 'Bearded'
[18] Transform input dataframes to the following schema: <Learnable Schema>
[19] Validates conditions
shaypal5 created this gist
Aug 1, 2022.
@@ -0,0 +1,35 @@
>>> mp = MyPipelineAndModel(
    savings_max_val=101,
    drop_gender=False,
    standardize=True,
    ohencode_country=True,
    savings_bin_val=1,
    pca_threshold=25,
    fit_intercept=True)
>>> mp
<PdPipeline -> LogisticRegression>
>>> mp.estimator
LogisticRegression()
>>> mp.pipeline
A pdpipe pipeline:
[ 0] Drop columns Columns with at least 0.2 missing value rate
[ 1] Drop labels by values
[ 2] Encode label values
[ 3] Drop columns 'Name'
[ 4] Drop rows by qualifier <RowQualifier: Qualify rows with X[Savings] > 101>
[ 5] Assign column Viking with df[Country].isin(['Denmark', 'Finland']) & ~df[Bearded]
[ 6] Assign column YearlyGrands with df[Savings] * 1000 / df[Age]
[ 7] Bin Savings by [1].
[ 8] One-hot encode 'Country'
[ 9] Tokenize Quote
[10] Stemming tokens in Quote...
[11] Remove stopwords from Quote
[12] Count-vectorizing column Quote.
[13] Decompose columns Columns that start with Quote with PCA
[14] Encode 'Savings_bin', 'Gender'
[15] Scale columns Columns of dtypes <class 'numpy.number'>
[16] Drop columns 'Bearded'
[17] Transform input dataframes to the following schema: <Learnable Schema>
[18] Validates conditions
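The stage descriptions in the listing can be read as ordinary pandas operations. The sketch below is a rough illustration in plain pandas, not the gist author's code and not the pdpipe API. It covers only the "Drop rows by qualifier", "Assign column Viking" and "Assign column YearlyGrands" stages. The sample values are hypothetical, and it assumes the row qualifier drops the rows where `Savings > 101` rather than keeping them.

```python
import pandas as pd

# Hypothetical sample data: the column names come from the stage listing,
# the values are made up for illustration.
df = pd.DataFrame(
    {
        "Country": ["Denmark", "Finland", "Israel", "Denmark"],
        "Bearded": [False, True, False, False],
        "Savings": [40, 90, 100, 250],
        "Age": [20, 30, 50, 30],
    }
)

# "Drop rows by qualifier ... X[Savings] > 101"
# Assumption: rows matching the qualifier are the ones removed.
df = df[~(df["Savings"] > 101)].copy()

# "Assign column Viking with
#  df[Country].isin(['Denmark', 'Finland']) & ~df[Bearded]"
df["Viking"] = df["Country"].isin(["Denmark", "Finland"]) & ~df["Bearded"]

# "Assign column YearlyGrands with df[Savings] * 1000 / df[Age]"
df["YearlyGrands"] = df["Savings"] * 1000 / df["Age"]

print(df)
```

In the actual pipeline these steps run as pdpipe stages, so they are applied in the listed order alongside the other stages when the pipeline is applied to a dataframe.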