@ksimonyan
Last active July 6, 2024 11:55
ILSVRC-2014 model (VGG team) with 19 weight layers

## Information

name: 19-layer model from the arXiv paper: "Very Deep Convolutional Networks for Large-Scale Image Recognition"

caffemodel: VGG_ILSVRC_19_layers

caffemodel_url: http://www.robots.ox.ac.uk/~vgg/software/very_deep/caffe/VGG_ILSVRC_19_layers.caffemodel

license: see http://www.robots.ox.ac.uk/~vgg/research/very_deep/

caffe_version: trained using a custom Caffe-based framework

gist_id: 3785162f95cd2d5fee77

## Description

The model is an improved version of the 19-layer model used by the VGG team in the ILSVRC-2014 competition. The details can be found in the following arXiv paper:

Very Deep Convolutional Networks for Large-Scale Image Recognition
K. Simonyan, A. Zisserman
arXiv:1409.1556

Please cite the paper if you use the model.

In the paper, the model is denoted as the configuration E trained with scale jittering. The input images should be zero-centered by mean pixel (rather than mean image) subtraction. Namely, the following BGR values should be subtracted: [103.939, 116.779, 123.68].
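The mean-pixel subtraction described above can be sketched as follows. This is a minimal illustration, assuming the image arrives as rows of (R, G, B) tuples; the function name `preprocess_bgr` and the input layout are hypothetical and not part of the released code. A real pipeline would normally do the same thing with an array library.

```python
# Per-channel means in B, G, R order, as given in the model description.
VGG_MEAN_BGR = (103.939, 116.779, 123.68)


def preprocess_bgr(rgb_image):
    """Turn an H x W image of (R, G, B) pixels into three zero-centered
    planes in B, G, R order (channels first, i.e. 3 x H x W)."""
    planes = []
    for bgr_index, mean in enumerate(VGG_MEAN_BGR):
        rgb_index = 2 - bgr_index  # B is RGB index 2, G is 1, R is 0
        plane = [[float(pixel[rgb_index]) - mean for pixel in row]
                 for row in rgb_image]
        planes.append(plane)
    return planes


# Hypothetical 1 x 2 image used only to exercise the function.
image = [[(255, 128, 0), (0, 0, 0)]]
blob = preprocess_bgr(image)
```

The same three constants are subtracted at every pixel position, which is what distinguishes mean-pixel subtraction from subtracting a full mean image.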

## Caffe compatibility

The models are currently supported by the dev branch of Caffe, but are not yet compatible with master. An example of how to use the models in Matlab can be found in matlab/caffe/matcaffe_demo_vgg.m

## ILSVRC-2012 performance

Using dense single-scale evaluation (the smallest image side rescaled to 384), the top-5 classification error on the validation set of ILSVRC-2012 is 8.0% (see Table 3 in the arXiv paper).

Using dense multi-scale evaluation (the smallest image side rescaled to 256, 384, and 512), the top-5 classification error is 7.5% on the validation set and 7.3% on the test set of ILSVRC-2012 (see Tables 4 and 6 in the arXiv paper).
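To illustrate what "the smallest image side rescaled to 256, 384, and 512" means in terms of image dimensions, here is a small sketch. The helper name `rescale_smallest_side`, the rounding rule, and the 500 x 375 example image are assumptions made for illustration; the evaluation code used by the authors is not shown in this gist.

```python
def rescale_smallest_side(width, height, target):
    """Return (new_width, new_height) such that the smaller side equals
    `target` and the aspect ratio is preserved (rounded to whole pixels)."""
    if width <= height:
        return target, round(height * target / width)
    return round(width * target / height), target


# The three test scales mentioned in the multi-scale evaluation.
EVAL_SCALES = (256, 384, 512)

# Hypothetical 500 x 375 input image.
sizes = [rescale_smallest_side(500, 375, scale) for scale in EVAL_SCALES]
print(sizes)
```

Because the rescaled images are generally larger than 224 x 224, dense evaluation applies the network over the whole image rather than to a single fixed-size crop.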

name: "VGG_ILSVRC_19_layers"
input: "data"
input_dim: 10
input_dim: 3
input_dim: 224
input_dim: 224
layers {
bottom: "data"
top: "conv1_1"
name: "conv1_1"
type: CONVOLUTION
convolution_param {
num_output: 64
pad: 1
kernel_size: 3
}
}
layers {
bottom: "conv1_1"
top: "conv1_1"
name: "relu1_1"
type: RELU
}
layers {
bottom: "conv1_1"
top: "conv1_2"
name: "conv1_2"
type: CONVOLUTION
convolution_param {
num_output: 64
pad: 1
kernel_size: 3
}
}
layers {
bottom: "conv1_2"
top: "conv1_2"
name: "relu1_2"
type: RELU
}
layers {
bottom: "conv1_2"
top: "pool1"
name: "pool1"
type: POOLING
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layers {
bottom: "pool1"
top: "conv2_1"
name: "conv2_1"
type: CONVOLUTION
convolution_param {
num_output: 128
pad: 1
kernel_size: 3
}
}
layers {
bottom: "conv2_1"
top: "conv2_1"
name: "relu2_1"
type: RELU
}
layers {
bottom: "conv2_1"
top: "conv2_2"
name: "conv2_2"
type: CONVOLUTION
convolution_param {
num_output: 128
pad: 1
kernel_size: 3
}
}
layers {
bottom: "conv2_2"
top: "conv2_2"
name: "relu2_2"
type: RELU
}
layers {
bottom: "conv2_2"
top: "pool2"
name: "pool2"
type: POOLING
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layers {
bottom: "pool2"
top: "conv3_1"
name: "conv3_1"
type: CONVOLUTION
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
}
}
layers {
bottom: "conv3_1"
top: "conv3_1"
name: "relu3_1"
type: RELU
}
layers {
bottom: "conv3_1"
top: "conv3_2"
name: "conv3_2"
type: CONVOLUTION
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
}
}
layers {
bottom: "conv3_2"
top: "conv3_2"
name: "relu3_2"
type: RELU
}
layers {
bottom: "conv3_2"
top: "conv3_3"
name: "conv3_3"
type: CONVOLUTION
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
}
}
layers {
bottom: "conv3_3"
top: "conv3_3"
name: "relu3_3"
type: RELU
}
layers {
bottom: "conv3_3"
top: "conv3_4"
name: "conv3_4"
type: CONVOLUTION
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
}
}
layers {
bottom: "conv3_4"
top: "conv3_4"
name: "relu3_4"
type: RELU
}
layers {
bottom: "conv3_4"
top: "pool3"
name: "pool3"
type: POOLING
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layers {
bottom: "pool3"
top: "conv4_1"
name: "conv4_1"
type: CONVOLUTION
convolution_param {
num_output: 512
pad: 1
kernel_size: 3
}
}
layers {
bottom: "conv4_1"
top: "conv4_1"
name: "relu4_1"
type: RELU
}
layers {
bottom: "conv4_1"
top: "conv4_2"
name: "conv4_2"
type: CONVOLUTION
convolution_param {
num_output: 512
pad: 1
kernel_size: 3
}
}
layers {
bottom: "conv4_2"
top: "conv4_2"
name: "relu4_2"
type: RELU
}
layers {
bottom: "conv4_2"
top: "conv4_3"
name: "conv4_3"
type: CONVOLUTION
convolution_param {
num_output: 512
pad: 1
kernel_size: 3
}
}
layers {
bottom: "conv4_3"
top: "conv4_3"
name: "relu4_3"
type: RELU
}
layers {
bottom: "conv4_3"
top: "conv4_4"
name: "conv4_4"
type: CONVOLUTION
convolution_param {
num_output: 512
pad: 1
kernel_size: 3
}
}
layers {
bottom: "conv4_4"
top: "conv4_4"
name: "relu4_4"
type: RELU
}
layers {
bottom: "conv4_4"
top: "pool4"
name: "pool4"
type: POOLING
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layers {
bottom: "pool4"
top: "conv5_1"
name: "conv5_1"
type: CONVOLUTION
convolution_param {
num_output: 512
pad: 1
kernel_size: 3
}
}
layers {
bottom: "conv5_1"
top: "conv5_1"
name: "relu5_1"
type: RELU
}
layers {
bottom: "conv5_1"
top: "conv5_2"
name: "conv5_2"
type: CONVOLUTION
convolution_param {
num_output: 512
pad: 1
kernel_size: 3
}
}
layers {
bottom: "conv5_2"
top: "conv5_2"
name: "relu5_2"
type: RELU
}
layers {
bottom: "conv5_2"
top: "conv5_3"
name: "conv5_3"
type: CONVOLUTION
convolution_param {
num_output: 512
pad: 1
kernel_size: 3
}
}
layers {
bottom: "conv5_3"
top: "conv5_3"
name: "relu5_3"
type: RELU
}
layers {
bottom: "conv5_3"
top: "conv5_4"
name: "conv5_4"
type: CONVOLUTION
convolution_param {
num_output: 512
pad: 1
kernel_size: 3
}
}
layers {
bottom: "conv5_4"
top: "conv5_4"
name: "relu5_4"
type: RELU
}
layers {
bottom: "conv5_4"
top: "pool5"
name: "pool5"
type: POOLING
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layers {
bottom: "pool5"
top: "fc6"
name: "fc6"
type: INNER_PRODUCT
inner_product_param {
num_output: 4096
}
}
layers {
bottom: "fc6"
top: "fc6"
name: "relu6"
type: RELU
}
layers {
bottom: "fc6"
top: "fc6"
name: "drop6"
type: DROPOUT
dropout_param {
dropout_ratio: 0.5
}
}
layers {
bottom: "fc6"
top: "fc7"
name: "fc7"
type: INNER_PRODUCT
inner_product_param {
num_output: 4096
}
}
layers {
bottom: "fc7"
top: "fc7"
name: "relu7"
type: RELU
}
layers {
bottom: "fc7"
top: "fc7"
name: "drop7"
type: DROPOUT
dropout_param {
dropout_ratio: 0.5
}
}
layers {
bottom: "fc7"
top: "fc8"
name: "fc8"
type: INNER_PRODUCT
inner_product_param {
num_output: 1000
}
}
layers {
bottom: "fc8"
top: "prob"
name: "prob"
type: SOFTMAX
}
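As a sanity check on the definition above, the following sketch traces the blob shapes through the five convolutional blocks and counts the learnable parameters. It assumes the 224 x 224 input declared in the prototxt and that every convolution and fully connected layer has a bias term (the Caffe default); the function name `trace_vgg19` is illustrative.

```python
# (num_output, number of 3x3 convolutions) for each block, followed by the
# outputs of the three fully connected layers, as listed in the prototxt.
CONV_BLOCKS = [(64, 2), (128, 2), (256, 4), (512, 4), (512, 4)]
FC_OUTPUTS = [4096, 4096, 1000]


def trace_vgg19(channels=3, height=224, width=224):
    """Return (pool5 shape, number of weight layers, parameter count)."""
    params = 0
    weight_layers = 0
    for num_output, repeats in CONV_BLOCKS:
        for _ in range(repeats):
            # A 3x3 kernel with pad 1 and stride 1 keeps height and width.
            params += channels * num_output * 3 * 3 + num_output
            channels = num_output
            weight_layers += 1
        # 2x2 max pooling with stride 2 halves each spatial side
        # (the sides are even at every stage for a 224 x 224 input).
        height, width = height // 2, width // 2
    pool5_shape = (channels, height, width)

    features = channels * height * width
    for num_output in FC_OUTPUTS:
        params += features * num_output + num_output
        features = num_output
        weight_layers += 1
    return pool5_shape, weight_layers, params


pool5_shape, weight_layers, params = trace_vgg19()
print(pool5_shape, weight_layers, params)  # (512, 7, 7) 19 143667240
```

The 16 convolutions plus 3 fully connected layers give the 19 weight layers in the model name, and most of the roughly 144 million parameters sit in fc6.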
@darkseed

6

@cuihenggang

7

@stardust2602

8

@jongsony

9

@adithyamurali

10!

@xuguozhi

11!

@graphific

by popular request, a fitting train_val.prototxt

@inovatek

Hello

How can the model be used on the master branch?

There is currently no dev branch.

Thank you very much!

@LiberiFatali

It works fine with master.

ghost commented Mar 14, 2016

Hello,

I'm really new to this topic (deep learning and CNNs); I have just started my bachelor project. I am wondering if there is a way to see the names of the classes that the CNN can classify. Sorry in advance if the question sounds really stupid.
In case it helps with the answer: I am using Torch and loadcaffe to load the pre-trained net.

Kind regards!

@leonardt

@mlagunas check out this tutorial, specifically cell 9, where they load the set of labels to see what the net predicted.
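The idea of loading the labels and matching them to the network output can be sketched as below. This is only an illustration: the four-entry `labels` list and the probability values are made up, and `top_k` is a hypothetical helper. For the real model, the list would hold the 1000 ILSVRC-2012 class names in the same order as the `prob` output, for example read line by line from a labels file such as the `synset_words.txt` that comes with Caffe's ILSVRC-2012 auxiliary data.

```python
def top_k(probabilities, labels, k=5):
    """Return the k most probable (label, probability) pairs, best first."""
    ranked = sorted(range(len(probabilities)),
                    key=lambda index: probabilities[index],
                    reverse=True)
    return [(labels[index], probabilities[index]) for index in ranked[:k]]


# Made-up labels and softmax output, standing in for the 1000-way "prob" blob.
labels = ["cat", "dog", "car", "tree"]
probabilities = [0.10, 0.60, 0.05, 0.25]

best = top_k(probabilities, labels, k=2)
print(best)  # [('dog', 0.6), ('tree', 0.25)]
```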

@alexkarargyris

Can we use this model with Nvidia's DIGITS? I am trying to load it, but I get an error when I start running the classifier:

ERROR: must specify a loss layer

@pherrusa7

@graphific Thanks for posting the train_val.prototxt!
@ksimonyan Thanks for posting the trained model and the VGG_ILSVRC_19_layers_deploy.prototxt

If I understood correctly, both .prototxt files appear to be deprecated in the new version of Caffe. All layers should be defined in lower case instead of upper case. Also, some layer types are no longer supported by Caffe.

I made the changes required for the train_val to work. If anybody needs it, let me know.

Thanks!
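For readers facing the same format problem, here is a rough sketch of the kind of rewrite involved: in the newer Caffe syntax, `layers { ... type: CONVOLUTION }` becomes `layer { ... type: "Convolution" }`, with the type given as a quoted string. The function `upgrade_prototxt` is a hypothetical helper that covers only the six layer types used in this deploy file; it is not the commenter's modified train_val. Caffe also ships an `upgrade_net_proto_text` tool, which is the safer route for anything beyond this simple case.

```python
import re

# V1 enum names that occur in this prototxt, mapped to the quoted string
# names used by the newer "layer" syntax.
V1_TO_V2 = {
    "CONVOLUTION": "Convolution",
    "RELU": "ReLU",
    "POOLING": "Pooling",
    "INNER_PRODUCT": "InnerProduct",
    "DROPOUT": "Dropout",
    "SOFTMAX": "Softmax",
}


def upgrade_prototxt(text):
    """Rewrite 'layers {' blocks and enum layer types to the newer syntax."""
    # "layers {" at the start of a line becomes "layer {".
    text = re.sub(r"^([ \t]*)layers([ \t]*\{)", r"\1layer\2", text,
                  flags=re.MULTILINE)

    def replace_type(match):
        new_name = V1_TO_V2.get(match.group(2))
        if new_name is None:
            return match.group(0)  # unknown type: leave the line untouched
        return '{}type: "{}"'.format(match.group(1), new_name)

    return re.sub(r"^([ \t]*)type:[ \t]*([A-Z_]+)[ \t]*$", replace_type, text,
                  flags=re.MULTILINE)


old = """layers {
bottom: "data"
top: "conv1_1"
name: "conv1_1"
type: CONVOLUTION
convolution_param {
num_output: 64
pad: 1
kernel_size: 3
}
}
"""
new = upgrade_prototxt(old)
print(new)
```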

@yinnonsanders

@pherrusa7
I would appreciate having the modified train_val

@jacor-

jacor- commented May 6, 2016

I have been trying to use this network with a new version of Caffe and I cannot make it work.
@pherrusa7, could you share that modified train_val file? It would be awesome to be able to take a look into it.

@ambitiousank

@pherrusa7 I would like the modified train_val and solver file... Thanks in advance...

@kevinfang1028

@pherrusa7
I would like to have modified train_val and solver files. Thank you very much!

@pherrusa7

pherrusa7 commented May 17, 2016

Hi @yinnonsanders @jacor @ambitiousank @kevinfang1028

Here you can find the train_val and solver: Caffe-Utilities

In the readme file you can also see the command I used to get it to work. I hope it will be helpful :)

@vlnguyen92

@pherrusa7: Have you tried the TEST phase yet (using a pretrained model)? I can't get it to work.

@gongbaochicken

Great job!!!

@gailysun

Great!!

@Sunnydreamrain

Has anyone reproduced the reported test accuracy? I can only get 73.034% top-1 accuracy and 91.654% top-5 accuracy. It is close, but still not exactly the reported result. For top-1 in particular, the reported error rate is 25.5%, which is 74.5% accuracy, so my result is about 1.5% lower.
Any advice?
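The size of the gap can be worked out directly. The sketch below assumes the comparison is against the single-scale figures (25.5% top-1 error as quoted in this comment, 8.0% top-5 error as stated in the README); whether the measurement used the same evaluation protocol is not stated, so the numbers only quantify the difference, not its cause.

```python
# Reported single-scale errors (percent) and the accuracies measured above.
reported_top1_error = 25.5
reported_top5_error = 8.0
measured_top1_accuracy = 73.034
measured_top5_accuracy = 91.654

# Convert reported error to accuracy, then subtract the measured accuracy.
top1_gap = (100.0 - reported_top1_error) - measured_top1_accuracy
top5_gap = (100.0 - reported_top5_error) - measured_top5_accuracy

print(round(top1_gap, 3), round(top5_gap, 3))  # 1.466 0.346
```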

@pherrusa7

@vlnguyen92: Sorry for the delay. Yes, I did. If you have not got it working yet, you can follow this step-by-step approach for testing your model that I wrote months ago (caffeUsers). I hope it will be helpful.

@mrgloom

mrgloom commented Sep 18, 2016

Here is a working example of VGG-19 that I have trained using NVIDIA DIGITS with the Caffe backend.
https://github.com/mrgloom/kaggle-dogs-vs-cats-solution/tree/master/learning_from_scratch/Models/VGG-19

@soulslicer

Hi all, in the paper, there was also a simplified VGG 11:

https://arxiv.org/pdf/1409.1556.pdf

Is this network and its weights available anywhere??

@bigFin

bigFin commented Aug 11, 2018

There is no Dev branch of Caffe

@mrgransky

> Here is working example of VGG-19 that I have trained using NVIDIA DIGITS with Caffe backend.
> https://github.com/mrgloom/kaggle-dogs-vs-cats-solution/tree/master/learning_from_scratch/Models/VGG-19

broken link
