@matsui528
Last active December 25, 2019 07:44
Install script of caffe2 and detectron on AWS EC2 instance with Deep Learning Base AMI
# Install script of Caffe2 and Detectron on AWS EC2
#
# Tested environment:
# - AMI: Deep Learning Base AMI (Ubuntu) Version 6.0 - ami-ce3673b6 (CUDA is already installed)
# - Instance: p3.2xlarge (V100 * 1)
# - Caffe2: https://github.com/pytorch/pytorch/commit/731273b8d61dfa2aa8b2909f27c8810ede103952
# - Detectron: https://github.com/facebookresearch/Detectron/commit/cd447c77c96f5752d6b37761d30bbdacc86989a2
#
# Usage:
# Launch a fresh EC2 instance, place this script in /home/ubuntu/, and run the following commands.
# $ cd ~
# $ source install_caffe2_detectron.sh
#
# Test:
# $ cd ~/work/detectron/detectron
# $ python2 tests/test_spatial_narrow_as_op.py
#
# Run samples:
# Run the following commands and see the results in /tmp/detectron-visualizations
# $ cd ~/work/detectron
# $ python2 tools/infer_simple.py \
# --cfg configs/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml \
# --output-dir /tmp/detectron-visualizations \
# --image-ext jpg \
# --wts https://s3-us-west-2.amazonaws.com/detectron/35861858/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml.02_32_51.SgT4y1cO/output/train/coco_2014_train:coco_2014_valminusminival/generalized_rcnn/model_final.pkl \
# demo
#
# Note that:
# - In the Deep Learning AMI (Version 4.0), CUDA and Caffe2 are already installed, but the bundled Caffe2 is
# an older version that lacks some modules required by Detectron.
# This script therefore assumes the Deep Learning "Base" AMI (Version 6.0), where only CUDA is installed.
# - Patching setup.py manually is not a recommended approach. Any suggestions are welcome.
INSTALL_DIR=~/work
### Install Caffe2
sudo apt-get update
sudo apt-get install -y --no-install-recommends \
    build-essential \
    cmake \
    git \
    libgoogle-glog-dev \
    libgtest-dev \
    libiomp-dev \
    libleveldb-dev \
    liblmdb-dev \
    libopencv-dev \
    libopenmpi-dev \
    libsnappy-dev \
    libprotobuf-dev \
    openmpi-bin \
    openmpi-doc \
    protobuf-compiler \
    python-dev \
    python-pip
sudo pip2 install \
    future \
    numpy \
    protobuf
sudo apt-get install -y --no-install-recommends libgflags-dev
mkdir -p $INSTALL_DIR && cd $INSTALL_DIR
git clone --recursive https://github.com/pytorch/pytorch.git && cd pytorch
git submodule update --init
mkdir -p build && cd build
cmake ..
sudo make install -j6
# Export paths
echo "export PYTHONPATH=/usr/local:\$PYTHONPATH" >> ~/.bashrc
# The repository is cloned into ${INSTALL_DIR}/pytorch above, so the build directory lives there
echo "export PYTHONPATH=\$PYTHONPATH:${INSTALL_DIR}/pytorch/build" >> ~/.bashrc
echo "export LD_LIBRARY_PATH=/usr/local/lib:\$LD_LIBRARY_PATH" >> ~/.bashrc
source ~/.bashrc
### Install Detectron
sudo pip2 install numpy pyyaml matplotlib opencv-python setuptools cython mock scipy
# First, install coco api
COCOAPI=$INSTALL_DIR/cocoapi
git clone https://github.com/cocodataset/cocoapi.git $COCOAPI
cd $COCOAPI/PythonAPI
# https://github.com/facebookresearch/Detectron/issues/105
# https://github.com/cocodataset/cocoapi/issues/94
# Based on the comments above, insert `extra_link_args=['-L/usr/lib/x86_64-linux-gnu/']` in setup.py
sed -i -e "/extra_compile_args/a \ extra_link_args=['-L/usr/lib/x86_64-linux-gnu/']," setup.py
# Install into global site-packages
sudo make install
# Next, install detectron
DETECTRON=$INSTALL_DIR/detectron
git clone https://github.com/facebookresearch/detectron $DETECTRON
cd $DETECTRON
make
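
The script above appends its `export` lines to ~/.bashrc each time it is sourced, so running it twice leaves duplicate entries. A minimal sketch of one way to guard against that is shown here; `append_once` is a hypothetical helper and is not part of the original script.

```shell
# Hypothetical helper (not in the original script): append a line to a file
# only if that exact line is not already present.
append_once() {
  local line="$1" file="$2"
  touch "$file"
  # -x matches whole lines, -F treats the line as a fixed string, not a regex
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Possible use, mirroring one of the export lines in the script above:
# append_once 'export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH' ~/.bashrc
```

Single quotes keep `$LD_LIBRARY_PATH` unexpanded, so it is written to the file literally, as the escaped `\$` does in the original `echo` lines.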
@testingdataviz

testingdataviz commented May 12, 2018

This is not working any longer due to changes in the file structure. facebookresearch/Detectron#414

Can anyone help?

@abrichr

abrichr commented May 22, 2018

I got a "repository not found" error while recursively cloning caffe2. This was because the repository at https://github.com/RLovelett/eigen is no longer available. I forked caffe2 to https://github.com/abrichr/caffe2, updated the relevant URL in .gitmodules to the official mirror at https://github.com/eigenteam/eigen-git-mirror, and ran `git submodule sync`. After changing the caffe2 repo URL in the Gist from https://github.com/caffe2/caffe2.git to https://github.com/abrichr/caffe2.git, and changing "lib" to "detectron" in "cd $DETECTRON/lib", the script works.

Forked Gist here: https://gist.github.com/abrichr/27f3f8c476a0c300bc65643b5c6380ff

PR here: facebookarchive/caffe2#2525
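
The submodule fix described in this comment could be sketched roughly as follows. This is only an illustration: `repoint_submodule_url` is a hypothetical helper, and it uses `sed` as a stand-in for editing .gitmodules by hand. The two URLs are the ones quoted in the comment.

```shell
# Hypothetical helper: replace one URL with another inside a .gitmodules file.
# The old URL is used as a sed pattern, so "." in it matches any character.
repoint_submodule_url() {
  local file="$1" old="$2" new="$3"
  sed -i -e "s|${old}|${new}|" "$file"
}

# Intended use from the root of a checkout:
# repoint_submodule_url .gitmodules \
#     https://github.com/RLovelett/eigen \
#     https://github.com/eigenteam/eigen-git-mirror
# git submodule sync                       # copy the new URL into .git/config
# git submodule update --init --recursive  # retry fetching the submodules
```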

@darwaish

darwaish commented Jun 7, 2018

@abrichr I tried the forked Gist you listed: https://gist.github.com/abrichr/27f3f8c476a0c300bc65643b5c6380ff

But the script ends with the messages below. I'd appreciate any suggestions:

creating build/temp.linux-x86_64-2.7/pycocotools
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fno-strict-aliasing -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -fPIC -I/usr/local/lib/python2.7/dist-packages/numpy/core/include -I../common -I/usr/include/python2.7 -c ../common/maskApi.c -o build/temp.linux-x86_64-2.7/../common/maskApi.o -Wno-cpp -Wno-unused-function -std=c99
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fno-strict-aliasing -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -fPIC -I/usr/local/lib/python2.7/dist-packages/numpy/core/include -I../common -I/usr/include/python2.7 -c pycocotools/_mask.c -o build/temp.linux-x86_64-2.7/pycocotools/_mask.o -Wno-cpp -Wno-unused-function -std=c99
x86_64-linux-gnu-gcc: error: pycocotools/_mask.c: No such file or directory
x86_64-linux-gnu-gcc: fatal error: no input files
compilation terminated.
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
Makefile:7: recipe for target 'install' failed
make: *** [install] Error 1
Cloning into '/home/ubuntu/work/detectron'...
remote: Counting objects: 856, done.
remote: Total 856 (delta 0), reused 0 (delta 0), pack-reused 856
Receiving objects: 100% (856/856), 4.06 MiB | 0 bytes/s, done.
Resolving deltas: 100% (528/528), done.
Checking connectivity... done.
sed: can't read setup.py: No such file or directory
make: *** No targets specified and no makefile found. Stop.
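
In the log above, later steps still run after `make install` has failed, which makes the first error harder to spot. One possible way to stop at the first failure in a sourced script is sketched below; `step` is a hypothetical wrapper and not something the Gist provides.

```shell
# Hypothetical wrapper: echo a command, run it, and report failure clearly.
step() {
  echo ">>> $*"
  "$@" || { echo "FAILED: $*" >&2; return 1; }
}

# Possible use inside a script that is run with `source`; `return` stops the
# sourced script without closing the interactive shell, unlike `exit`:
# step sudo make install || return 1
# step git clone https://github.com/facebookresearch/detectron "$DETECTRON" || return 1
```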

@darwaish

darwaish commented Jun 7, 2018

@matsui528 If I use the script you have above, it shows the prompt below and then fails.
Cloning into 'third_party/eigen'...
Username for 'https://github.com':

@matsui528
Author

Hi there,
I updated the code, and it should now work with the latest Deep Learning Base AMI (Version 6.0 - ami-ce3673b6).

@matsui528
Author

Hi there,
Now we can install the latest caffe2 easily via `conda install pytorch-nightly -c pytorch`. This means we can use the Deep Learning AMI (not the Deep Learning "Base" AMI) directly. Note that python2 with a conda environment is pre-installed in the DL AMI. Here is the install script for the latest DL AMI:

# - Deep Learning AMI (Ubuntu) Version 19.0 - ami-05bc59103c52af154
# - Tested on p3.8xlarge
# - We use pytorch_p27 environment of conda
# - Note that "sudo apt upgrade" seems to fail for a while right after an instance is launched. It is recommended to reboot the instance first, make sure "sudo apt update" and "sudo apt upgrade" work, and then run this script.

# Install the latest pytorch
sudo apt -y update
sudo apt -y upgrade
source activate pytorch_p27
conda install -y pytorch-nightly -c pytorch
conda install -y future cython

# Install COCOAPI
INSTALL_DIR=~/work
mkdir -p $INSTALL_DIR

COCOAPI=$INSTALL_DIR/cocoapi
git clone https://github.com/cocodataset/cocoapi.git $COCOAPI
cd $COCOAPI/PythonAPI
make install

# Install detectron
DETECTRON=$INSTALL_DIR/detectron
git clone https://github.com/facebookresearch/detectron $DETECTRON
pip install -r $DETECTRON/requirements.txt
cd $DETECTRON
make
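
The script above relies on the `pytorch_p27` environment being active before any `conda install` or `pip install` runs. A small guard along these lines could catch the case where activation failed. `require_conda_env` is a hypothetical helper. It relies on `CONDA_DEFAULT_ENV`, which conda sets on activation.

```shell
# Hypothetical guard: succeed only if the named conda environment is active.
require_conda_env() {
  local wanted="$1"
  if [ "${CONDA_DEFAULT_ENV:-}" != "$wanted" ]; then
    echo "expected conda env '$wanted', found '${CONDA_DEFAULT_ENV:-none}'" >&2
    return 1
  fi
}

# Possible use right after `source activate pytorch_p27`:
# require_conda_env pytorch_p27 && conda install -y future cython
```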

@pvaezi

pvaezi commented May 27, 2019

@matsui528 Thanks for putting this on Gist. I tried your pytorch-nightly solution on the exact same AMI you tested on, but I am getting protobuf errors. How did you solve the protobuf incompatibility problem?

(pytorch_p27) ubuntu@ip-xxxxx:~$ python -c 'from caffe2.python import workspace; print(workspace.NumCudaDevices())'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/ubuntu/anaconda3/envs/pytorch_p27/lib/python2.7/site-packages/caffe2/python/__init__.py", line 2, in <module>
    from caffe2.proto import caffe2_pb2
  File "/home/ubuntu/anaconda3/envs/pytorch_p27/lib/python2.7/site-packages/caffe2/proto/__init__.py", line 11, in <module>
    from caffe2.proto import caffe2_pb2, metanet_pb2, torch_pb2
  File "/home/ubuntu/anaconda3/envs/pytorch_p27/lib/python2.7/site-packages/caffe2/proto/caffe2_pb2.py", line 23, in <module>
    BReaderProto\x12\x0c\n\x04name\x18\x01 \x01(\t\x12\x0e\n\x06source\x18\x02 \x01(\t\x12\x0f\n\x07\x64\x62_type\x18\x03 \x01(\t\x12\x0b\n\x03key\x18\x04 \x01(\t*\xfa\x01\n\x0f\x44\x65viceTypeProto\x12\r\n\tPROTO_CPU\x10\x00\x12\x0e\n\nPROTO_CUDA\x10\x01\x12\x10\n\x0cPROTO_MKLDNN\x10\x02\x12\x10\n\x0cPROTO_OPENGL\x10\x03\x12\x10\n\x0cPROTO_OPENCL\x10\x04\x12\x0f\n\x0bPROTO_IDEEP\x10\x05\x12\r\n\tPROTO_HIP\x10\x06\x12\x0e\n\nPROTO_FPGA\x10\x07\x12\x0f\n\x0bPROTO_MSNPU\x10\x08\x12\r\n\tPROTO_XLA\x10\t\x12\'\n#PROTO_COMPILE_TIME_MAX_DEVICE_TYPES\x10\n\x12\x19\n\x13PROTO_ONLY_FOR_TEST\x10\xa5\xa3\x01')
TypeError: __new__() got an unexpected keyword argument 'serialized_options'

@jmorrey

jmorrey commented Jun 21, 2019

Here is how to fix the protobuf problem:

pip uninstall protobuf
pip install protobuf==3.6.1

...however, now I have a new problem:

(pytorch_p27) ubuntu@ip-10-60-1-125:~/work/detectron$ python -c 'from caffe2.python import workspace; print(workspace.NumCudaDevices())'
WARNING:root:This caffe2 python run does not have GPU support. Will run in CPU only mode.
Segmentation fault (core dumped)

That is on Deep Learning AMI (Ubuntu) Version 23.1 (ami-0757fc5a639fe7666) and a P3.2xl
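
For the protobuf pin described in this comment, a rough sketch of checking the installed version before reinstalling might look like the following. `protobuf_pin_hint` is a hypothetical helper, and the default of 3.6.1 is simply the version quoted in the comment. Whether that version suits other AMIs is not something this thread establishes.

```shell
# Hypothetical helper: given the installed protobuf version, print the pin
# commands from the comment above if it differs from the wanted version.
protobuf_pin_hint() {
  local installed="$1" wanted="${2:-3.6.1}"
  if [ "$installed" = "$wanted" ]; then
    echo "protobuf $installed already matches"
  else
    echo "pip uninstall -y protobuf && pip install protobuf==$wanted"
  fi
}

# Possible use inside the activated environment:
# protobuf_pin_hint "$(python -c 'import google.protobuf as p; print(p.__version__)')"
```

The helper only prints the suggested command, so nothing is reinstalled until the user runs it.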

@jmorrey

jmorrey commented Jun 21, 2019

Update: I had to build and install PyTorch/Caffe2 from source and things seem to be working now.
