shaypal5/pdp_post_adv2.py

## 2 changes: 1 addition & 1 deletion pdp_post_adv2.py
@@ -13,7 +13,7 @@

     >>> mp.pipeline
     A pdpipe pipeline:
     [ 0]  Drop columns Columns with at least 0.2 missing value rate
    -[ 1]  Drop labels by values
    +[ 1]  Drop rows by label values
     [ 2]  Encode label values
     [ 3]  Drop columns 'Name'
     [ 4]  Apply dataframe method set_index with kwargs {'keys': 'id'}


## 31 changes: 16 additions & 15 deletions pdp_post_adv2.py
@@ -16,20 +16,21 @@

     [ 1]  Drop labels by values
     [ 2]  Encode label values
     [ 3]  Drop columns 'Name'
    -[ 4]  Drop rows by qualifier <RowQualifier: Qualify rows with X[Savings] >
    +[ 4]  Apply dataframe method set_index with kwargs {'keys': 'id'}
    +[ 5]  Drop rows by qualifier <RowQualifier: Qualify rows with X[Savings] >
           101>
    -[ 5]  Assign column Viking with df[Country].isin(['Denmark', 'Finland']) &
    +[ 6]  Assign column Viking with df[Country].isin(['Denmark', 'Finland']) &
           ~df[Bearded]
    -[ 6]  Assign column YearlyGrands with df[Savings] * 1000 / df[Age]
    -[ 7]  Bin Savings by [1].
    -[ 8]  One-hot encode 'Country'
    -[ 9]  Tokenize Quote
    -[10]  Stemming tokens in Quote...
    -[11]  Remove stopwords from Quote
    -[12]  Count-vectorizing column Quote.
    -[13]  Decompose columns Columns that start with Quote with PCA
    -[14]  Encode 'Savings_bin', 'Gender'
    -[15]  Scale columns Columns of dtypes <class 'numpy.number'>
    -[16]  Drop columns 'Bearded'
    -[17]  Transform input dataframes to the following schema: <Learnable Schema>
    -[18]  Validates conditions
    +[ 7]  Assign column YearlyGrands with df[Savings] * 1000 / df[Age]
    +[ 8]  Bin Savings by [1].
    +[ 9]  One-hot encode 'Country'
    +[10]  Tokenize Quote
    +[11]  Stemming tokens in Quote...
    +[12]  Remove stopwords from Quote
    +[13]  Count-vectorizing column Quote.
    +[14]  Decompose columns Columns that start with Quote with PCA
    +[15]  Encode 'Savings_bin', 'Gender'
    +[16]  Scale columns Columns of dtypes <class 'numpy.number'>
    +[17]  Drop columns 'Bearded'
    +[18]  Transform input dataframes to the following schema: <Learnable Schema>
    +[19]  Validates conditions

## 35 changes: 35 additions & 0 deletions pdp_post_adv2.py
@@ -0,0 +1,35 @@

    +>>> mp = MyPipelineAndModel(
    +      savings_max_val=101,
    +      drop_gender=False,
    +      standardize=True,
    +      ohencode_country=True,
    +      savings_bin_val=1,
    +      pca_threshold=25,
    +      fit_intercept=True)
    +>>> mp
    +<PdPipeline -> LogisticRegression>
    +>>> mp.estimator
    +LogisticRegression()
    +>>> mp.pipeline
    +A pdpipe pipeline:
    +[ 0]  Drop columns Columns with at least 0.2 missing value rate
    +[ 1]  Drop labels by values
    +[ 2]  Encode label values
    +[ 3]  Drop columns 'Name'
    +[ 4]  Drop rows by qualifier <RowQualifier: Qualify rows with X[Savings] >
    +      101>
    +[ 5]  Assign column Viking with df[Country].isin(['Denmark', 'Finland']) &
    +      ~df[Bearded]
    +[ 6]  Assign column YearlyGrands with df[Savings] * 1000 / df[Age]
    +[ 7]  Bin Savings by [1].
    +[ 8]  One-hot encode 'Country'
    +[ 9]  Tokenize Quote
    +[10]  Stemming tokens in Quote...
    +[11]  Remove stopwords from Quote
    +[12]  Count-vectorizing column Quote.
    +[13]  Decompose columns Columns that start with Quote with PCA
    +[14]  Encode 'Savings_bin', 'Gender'
    +[15]  Scale columns Columns of dtypes <class 'numpy.number'>
    +[16]  Drop columns 'Bearded'
    +[17]  Transform input dataframes to the following schema: <Learnable Schema>
    +[18]  Validates conditions
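
The revision with 16 additions and 15 deletions adds only one stage (`set_index`), yet it touches almost every line after it because each later stage is renumbered. The sketch below illustrates that effect with the standard library only. It is not pdpipe code: `render_stages` is a hypothetical helper that merely imitates the `[ i]  description` listing format, and the stage descriptions are shortened from the listing above.

```python
import difflib


def render_stages(stages):
    """Render descriptions as '[ i]  description' lines (hypothetical helper)."""
    return [f"[{i:>2}]  {desc}" for i, desc in enumerate(stages)]


# Shortened stage descriptions, for illustration only.
before = [
    "Drop columns 'Name'",
    "Drop rows by qualifier",
    "Assign column Viking",
    "Assign column YearlyGrands",
]

# Insert one new stage at position 1; every later stage shifts down by one.
after = list(before)
after.insert(1, "Apply dataframe method set_index")

old_lines = render_stages(before)
new_lines = render_stages(after)

# Diff the two rendered listings with no context lines.
diff = list(difflib.unified_diff(old_lines, new_lines, lineterm="", n=0))

# Count changed lines, skipping the '---' / '+++' file header lines.
added = sum(1 for line in diff if line.startswith("+") and not line.startswith("+++"))
removed = sum(1 for line in diff if line.startswith("-") and not line.startswith("---"))

print("\n".join(diff))
print(f"{added} additions, {removed} deletions")
```

Inserting a single stage rewrites every rendered line that follows it, so the diff reports one more addition than deletion, the same pattern as in the revision above.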