- The paper demonstrates how simple CNNs, built on top of word embeddings, can be used for sentence classification tasks.
- [Link to the paper](https://arxiv.org/abs/1408.5882)
- Implementation (sketched in code below this list):
  - Pad input sentences so that they are all of the same length.
  - Map each word in the padded sentence to its word embedding (either randomly initialized or initialized from pre-trained word2vec vectors) to obtain a matrix representing the sentence.
  - Apply convolution layers with multiple filter widths, each producing several feature maps.
  - Apply a max-over-time pooling operation over each feature map.
  - Concatenate the pooled features from the different filters and feed them to a fully-connected layer with softmax activation.
  - The softmax outputs a probability distribution over the labels.
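
Taken together, these steps can be sketched as a small PyTorch module. The class, argument names, and default sizes below are illustrative, not taken from the paper's code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SentenceCNN(nn.Module):
    """Sketch of the architecture: embed -> convolve -> max-over-time -> softmax."""
    def __init__(self, vocab_size, embed_dim=300, num_classes=2,
                 filter_widths=(3, 4, 5), num_maps=100, dropout=0.5):
        super().__init__()
        # Index 0 is reserved for the padding token added in the first step.
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # One convolution per filter width, each with `num_maps` feature maps.
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, num_maps, kernel_size=w) for w in filter_widths
        )
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(num_maps * len(filter_widths), num_classes)

    def forward(self, token_ids):              # (batch, padded_len)
        x = self.embedding(token_ids)          # (batch, padded_len, embed_dim)
        x = x.transpose(1, 2)                  # channels-first layout for Conv1d
        # ReLU convolution, then max-over-time pooling for each filter width.
        pooled = [F.relu(conv(x)).max(dim=2).values for conv in self.convs]
        features = torch.cat(pooled, dim=1)    # concatenated pooled features
        return self.fc(self.dropout(features)) # logits; softmax is applied in the loss
```
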
- Hyperparameters (a training sketch follows this list):
  - ReLU activation for the convolution layers.
  - Filter windows of 3, 4, and 5, with 100 feature maps each.
  - Dropout (rate 0.5) on the penultimate layer for regularisation.
  - l2 norm constraint of s = 3 on the weight vectors.
  - Batch size of 50.
  - Adadelta update rule.
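
A hedged sketch of one training step wiring these hyperparameters together, reusing the `SentenceCNN` class from the sketch above. The rescaling loop is one reading of the paper's l2 norm constraint, not its original code:

```python
import torch
import torch.nn as nn

model = SentenceCNN(vocab_size=20_000, num_classes=2)  # assumed sizes
optimizer = torch.optim.Adadelta(model.parameters())   # Adadelta update rule
criterion = nn.CrossEntropyLoss()                      # applies log-softmax internally

def train_step(token_ids, labels, s=3.0):
    """One update on a batch of 50 padded sentences with their labels."""
    optimizer.zero_grad()
    loss = criterion(model(token_ids), labels)
    loss.backward()
    optimizer.step()
    # l2 norm constraint: after the gradient step, rescale any weight vector
    # of the softmax layer whose norm exceeds s back down to norm s.
    with torch.no_grad():
        w = model.fc.weight                            # (num_classes, features)
        norms = w.norm(dim=1, keepdim=True).clamp(min=1e-12)
        w.mul_(norms.clamp(max=s) / norms)
    return loss.item()
```
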
- Model variants (illustrated in code below this list):
  - CNN-rand: word vectors are randomly initialized and learned during training.
  - CNN-static: uses pre-trained word2vec vectors and keeps them fixed during training.
  - CNN-non-static: same as CNN-static, but the word vectors are fine-tuned during training.
  - CNN-multichannel: uses two sets of word vectors (channels); one set is updated during training while the other is kept static.
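
In code, the four variants differ only in how the embedding layer is set up. A rough PyTorch illustration, where `w2v` stands in for a real pre-trained word2vec matrix:

```python
import torch
import torch.nn as nn

w2v = torch.randn(20_000, 300)  # placeholder for the pre-trained word2vec matrix

# CNN-rand: word vectors are randomly initialized and learned from scratch.
rand_emb = nn.Embedding(20_000, 300)

# CNN-static: pre-trained vectors, kept frozen throughout training.
static_emb = nn.Embedding.from_pretrained(w2v, freeze=True)

# CNN-non-static: pre-trained vectors, fine-tuned during training.
non_static_emb = nn.Embedding.from_pretrained(w2v, freeze=False)

# CNN-multichannel: two copies of the pre-trained vectors; every filter is
# applied to both channels, but gradients only update the non-static copy.
multichannel = (nn.Embedding.from_pretrained(w2v, freeze=True),
                nn.Embedding.from_pretrained(w2v, freeze=False))
```
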
- Datasets:
  - Sentiment analysis datasets, including Movie Reviews (MR) and Customer Reviews (CR).
  - A question classification dataset (TREC).
  - The maximum number of classes for any dataset is 6.
- Observations:
  - Achieves good results on the benchmarks despite being a simple architecture.
  - Word vectors fine-tuned via the non-static channel learn more meaningful, task-specific representations.
- Limitations:
  - The benchmark datasets are small, with few labels.
  - The reported results are not very detailed or exhaustive.