# Install script for Caffe2 and Detectron on AWS EC2
#
# Tested environment:
# - AMI: Deep Learning Base AMI (Ubuntu) Version 6.0 - ami-ce3673b6 (CUDA is already installed)
# - Instance: p3.2xlarge (V100 * 1)
# - Caffe2: https://github.com/pytorch/pytorch/commit/731273b8d61dfa2aa8b2909f27c8810ede103952
# - Detectron: https://github.com/facebookresearch/Detectron/commit/cd447c77c96f5752d6b37761d30bbdacc86989a2
#
# Usage:
# Launch a fresh EC2 instance, put this script in /home/ubuntu/, and run the following commands.
# $ cd ~
# $ source install_caffe2_detectron.sh
#
# Test:
# $ cd ~/work/detectron/detectron
# $ python2 tests/test_spatial_narrow_as_op.py
#
# Run samples:
# Run the following commands and see the results in /tmp/detectron-visualizations
# $ cd ~/work/detectron
# $ python2 tools/infer_simple.py \
#     --cfg configs/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml \
#     --output-dir /tmp/detectron-visualizations \
#     --image-ext jpg \
#     --wts https://s3-us-west-2.amazonaws.com/detectron/35861858/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml.02_32_51.SgT4y1cO/output/train/coco_2014_train:coco_2014_valminusminival/generalized_rcnn/model_final.pkl \
#     demo
#
# Note that:
# - In the Deep Learning AMI (Version 4.0), CUDA and caffe2 are already installed. However, the caffe2 in that AMI is
#   an older version that lacks some modules required by Detectron.
#   So this script assumes the Deep Learning "Base" AMI (Version 6.0), where only CUDA is installed.
# - Manually editing setup.py is not the recommended approach. Any suggestions are welcome.
INSTALL_DIR=~/work

### Install Caffe2
sudo apt-get update
sudo apt-get install -y --no-install-recommends \
    build-essential \
    cmake \
    git \
    libgoogle-glog-dev \
    libgtest-dev \
    libiomp-dev \
    libleveldb-dev \
    liblmdb-dev \
    libopencv-dev \
    libopenmpi-dev \
    libsnappy-dev \
    libprotobuf-dev \
    openmpi-bin \
    openmpi-doc \
    protobuf-compiler \
    python-dev \
    python-pip
sudo pip2 install \
    future \
    numpy \
    protobuf
sudo apt-get install -y --no-install-recommends libgflags-dev

# Caffe2 now lives in the pytorch repository, so the source tree is $INSTALL_DIR/pytorch
mkdir -p $INSTALL_DIR && cd $INSTALL_DIR
git clone --recursive https://github.com/pytorch/pytorch.git && cd pytorch
git submodule update --init
mkdir -p build && cd build
cmake ..
sudo make install -j6

# Export paths (the build directory is under the pytorch clone, not a caffe2 directory)
echo "export PYTHONPATH=/usr/local:\$PYTHONPATH" >> ~/.bashrc
echo "export PYTHONPATH=\$PYTHONPATH:${INSTALL_DIR}/pytorch/build" >> ~/.bashrc
echo "export LD_LIBRARY_PATH=/usr/local/lib:\$LD_LIBRARY_PATH" >> ~/.bashrc
source ~/.bashrc

### Install Detectron
sudo pip2 install numpy pyyaml matplotlib opencv-python setuptools cython mock scipy

# First, install coco api
COCOAPI=$INSTALL_DIR/cocoapi
git clone https://github.com/cocodataset/cocoapi.git $COCOAPI
cd $COCOAPI/PythonAPI
# https://github.com/facebookresearch/Detectron/issues/105
# https://github.com/cocodataset/cocoapi/issues/94
# Based on the comments above, insert `extra_link_args=['-L/usr/lib/x86_64-linux-gnu/']` in setup.py
sed -i -e "/extra_compile_args/a \ extra_link_args=['-L/usr/lib/x86_64-linux-gnu/']," setup.py
# Install into global site-packages
sudo make install

# Next, install detectron
DETECTRON=$INSTALL_DIR/detectron
git clone https://github.com/facebookresearch/detectron $DETECTRON
cd $DETECTRON
make
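
One thing to watch in the "Export paths" step: every run of the script appends the same lines to ~/.bashrc again. Below is a minimal sketch of an idempotent append, assuming bash. `append_once` is a hypothetical helper name, not part of the original script, and the demo writes to a temporary file rather than to ~/.bashrc.

```shell
# Hypothetical helper (not in the original script): append a line to a file
# only if that exact line is not already present.
append_once() {
    local line="$1" file="$2"
    touch "$file"
    # -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Demo on a temporary file: the second call is a no-op.
demo_rc="$(mktemp)"
append_once 'export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH' "$demo_rc"
append_once 'export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH' "$demo_rc"
grep -c '' "$demo_rc"   # number of lines now in the file
rm -f "$demo_rc"
```

In the install script, the three `echo ... >> ~/.bashrc` lines could then become `append_once '<line>' ~/.bashrc` calls, so re-running the script does not keep growing the file.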
@abrichr I tried the forked gist you linked: https://gist.github.com/abrichr/27f3f8c476a0c300bc65643b5c6380ff
But the script ends with the messages below. I would appreciate any suggestions:
creating build/temp.linux-x86_64-2.7/pycocotools
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fno-strict-aliasing -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -fPIC -I/usr/local/lib/python2.7/dist-packages/numpy/core/include -I../common -I/usr/include/python2.7 -c ../common/maskApi.c -o build/temp.linux-x86_64-2.7/../common/maskApi.o -Wno-cpp -Wno-unused-function -std=c99
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fno-strict-aliasing -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -fPIC -I/usr/local/lib/python2.7/dist-packages/numpy/core/include -I../common -I/usr/include/python2.7 -c pycocotools/_mask.c -o build/temp.linux-x86_64-2.7/pycocotools/_mask.o -Wno-cpp -Wno-unused-function -std=c99
x86_64-linux-gnu-gcc: error: pycocotools/_mask.c: No such file or directory
x86_64-linux-gnu-gcc: fatal error: no input files
compilation terminated.
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
Makefile:7: recipe for target 'install' failed
make: *** [install] Error 1
Cloning into '/home/ubuntu/work/detectron'...
remote: Counting objects: 856, done.
remote: Total 856 (delta 0), reused 0 (delta 0), pack-reused 856
Receiving objects: 100% (856/856), 4.06 MiB | 0 bytes/s, done.
Resolving deltas: 100% (528/528), done.
Checking connectivity... done.
sed: can't read setup.py: No such file or directory
make: *** No targets specified and no makefile found. Stop.
@matsui528 if I use the script you posted above, it shows the prompt below and then fails.
Cloning into 'third_party/eigen'...
Username for 'https://github.com':
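
The first log above shows the script carrying on after `make install` had already failed, which is why the later `sed` and `make` errors appear. Since the script is run with `source`, calling `exit` or `set -e` would also close the login shell, so one option is to wrap the steps in a function and `return` on the first failure. The sketch below assumes bash. `run_step` and `install_demo` are hypothetical names, and `true` / `false` stand in for the real install commands.

```shell
# Hypothetical helper: echo the command, run it, and report a failure.
run_step() {
    echo "==> $*"
    "$@" || { echo "FAILED: $*" >&2; return 1; }
}

# Placeholder pipeline: `true` and `false` stand in for real steps such as
# "sudo make install". The function returns at the first failing step.
install_demo() {
    run_step true                  || return 1
    run_step false                 || return 1
    run_step echo "never reached"  || return 1
}
```

Calling `install_demo` runs the first two steps, reports the failing one on stderr, and returns non-zero without reaching the third.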
Hi there,
I updated the code, and it should now work with the latest Deep Learning Base AMI (Version 6.0 - ami-ce3673b6).
Hi there,
We can now install the latest caffe2 easily via `conda install pytorch-nightly -c pytorch`. This means we can use the Deep Learning AMI (not the Deep Learning "Base" AMI) directly. Note that a conda environment with python2 is pre-installed in the DL AMI. Here is the install script for the latest DL AMI:
# - Deep Learning AMI (Ubuntu) Version 19.0 - ami-05bc59103c52af154
# - Tested on p3.8xlarge
# - We use pytorch_p27 environment of conda
# - Note that "sudo apt upgrade" seems to fail for a while right after an instance is launched. It is recommended to reboot the instance first, make sure "sudo apt update" and "sudo apt upgrade" work, and then run this script.
# Install the latest pytorch
sudo apt -y update
sudo apt -y upgrade
source activate pytorch_p27
conda install -y pytorch-nightly -c pytorch
conda install -y future cython
# Install COCOAPI
INSTALL_DIR=~/work
mkdir -p $INSTALL_DIR
COCOAPI=$INSTALL_DIR/cocoapi
git clone https://github.com/cocodataset/cocoapi.git $COCOAPI
cd $COCOAPI/PythonAPI
make install
# Install detectron
DETECTRON=$INSTALL_DIR/detectron
git clone https://github.com/facebookresearch/detectron $DETECTRON
pip install -r $DETECTRON/requirements.txt
cd $DETECTRON
make
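
The note in the script about `sudo apt upgrade` failing right after launch recommends a reboot. An alternative, shown here only as a sketch and not something the script does, is to retry the command a few times with a pause in between. `retry` is a hypothetical helper, and the attempt count and delay in the usage comments are arbitrary assumptions.

```shell
# Hypothetical helper: retry <attempts> <delay-seconds> <command...>
retry() {
    local attempts="$1" delay="$2"
    shift 2
    local i=1
    while ! "$@"; do
        if [ "$i" -ge "$attempts" ]; then
            echo "giving up after $i attempts: $*" >&2
            return 1
        fi
        echo "attempt $i failed, retrying in ${delay}s: $*" >&2
        sleep "$delay"
        i=$((i + 1))
    done
}

# Intended usage on the instance (commented out so the sketch has no side effects):
# retry 10 30 sudo apt -y update
# retry 10 30 sudo apt -y upgrade
```

If the commands still fail after the last attempt, `retry` returns non-zero, at which point rebooting as the note suggests is probably the safer route.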
@matsui528 thanks for putting this on gist. I tried your pytorch-nightly solution on the exact same AMI you tested on, but I am getting protobuf errors. How did you solve the protobuf incompatibility problem?
(pytorch_p27) ubuntu@ip-xxxxx:~$ python -c 'from caffe2.python import workspace; print(workspace.NumCudaDevices())'
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/ubuntu/anaconda3/envs/pytorch_p27/lib/python2.7/site-packages/caffe2/python/__init__.py", line 2, in <module>
from caffe2.proto import caffe2_pb2
File "/home/ubuntu/anaconda3/envs/pytorch_p27/lib/python2.7/site-packages/caffe2/proto/__init__.py", line 11, in <module>
from caffe2.proto import caffe2_pb2, metanet_pb2, torch_pb2
File "/home/ubuntu/anaconda3/envs/pytorch_p27/lib/python2.7/site-packages/caffe2/proto/caffe2_pb2.py", line 23, in <module>
BReaderProto\x12\x0c\n\x04name\x18\x01 \x01(\t\x12\x0e\n\x06source\x18\x02 \x01(\t\x12\x0f\n\x07\x64\x62_type\x18\x03 \x01(\t\x12\x0b\n\x03key\x18\x04 \x01(\t*\xfa\x01\n\x0f\x44\x65viceTypeProto\x12\r\n\tPROTO_CPU\x10\x00\x12\x0e\n\nPROTO_CUDA\x10\x01\x12\x10\n\x0cPROTO_MKLDNN\x10\x02\x12\x10\n\x0cPROTO_OPENGL\x10\x03\x12\x10\n\x0cPROTO_OPENCL\x10\x04\x12\x0f\n\x0bPROTO_IDEEP\x10\x05\x12\r\n\tPROTO_HIP\x10\x06\x12\x0e\n\nPROTO_FPGA\x10\x07\x12\x0f\n\x0bPROTO_MSNPU\x10\x08\x12\r\n\tPROTO_XLA\x10\t\x12\'\n#PROTO_COMPILE_TIME_MAX_DEVICE_TYPES\x10\n\x12\x19\n\x13PROTO_ONLY_FOR_TEST\x10\xa5\xa3\x01')
TypeError: __new__() got an unexpected keyword argument 'serialized_options'
Here is how to fix the protobuf problem:
pip uninstall protobuf
pip install protobuf==3.6.1
...however, now I have a new problem:
(pytorch_p27) ubuntu@ip-10-60-1-125:~/work/detectron$ python -c 'from caffe2.python import workspace; print(workspace.NumCudaDevices())'
WARNING:root:This caffe2 python run does not have GPU support. Will run in CPU only mode.
Segmentation fault (core dumped)
That is on Deep Learning AMI (Ubuntu) Version 23.1 (ami-0757fc5a639fe7666) and a P3.2xl
Update: I had to build and install PyTorch/Caffe2 from source and things seem to be working now.
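
Both problems in this comment can be caught early with a small sanity check after installation. This is a sketch under stated assumptions: `version_ge` and `check_caffe2` are hypothetical helpers, `version_ge` relies on `sort -V` (available in GNU coreutils), and 3.6.1 is simply the protobuf version that worked above, not a documented minimum.

```shell
# Hypothetical helper: succeed if version $1 >= version $2 (needs `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: run the import used in this thread with the given
# interpreter (default: python) and report the result. A crash such as the
# segmentation fault above ends up in the failure branch.
check_caffe2() {
    local py="${1:-python}" count
    if count="$("$py" -c 'from caffe2.python import workspace; print(workspace.NumCudaDevices())' 2>/dev/null)"; then
        echo "caffe2 import OK, CUDA devices: ${count:-unknown}"
    else
        echo "caffe2 import failed under $py"
        return 1
    fi
}

# Intended usage inside the conda environment (commented out here):
# installed="$(python -c 'import google.protobuf as pb; print(pb.__version__)')"
# version_ge "$installed" 3.6.1 || pip install protobuf==3.6.1
# check_caffe2 python
```

A reported device count of 0 would match the "does not have GPU support" warning above, in which case building PyTorch/Caffe2 from source, as in the update, is the fix that worked here.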
I got a "repository not found" error while recursively cloning caffe2. This was because the repository at https://github.com/RLovelett/eigen is no longer available. I forked caffe2 to https://github.com/abrichr/caffe2, updated the relevant URL in .gitmodules to the official mirror at https://github.com/eigenteam/eigen-git-mirror, and ran `git submodule sync`. After changing the caffe2 repo URL in the Gist from https://github.com/caffe2/caffe2.git to https://github.com/abrichr/caffe2.git, and changing "lib" to "detectron" in "cd $DETECTRON/lib", the script works.
Forked Gist here: https://gist.github.com/abrichr/27f3f8c476a0c300bc65643b5c6380ff
PR here: facebookarchive/caffe2#2525
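
For anyone who prefers to patch a local clone instead of using a fork, the .gitmodules change described above can be sketched with `git config -f`. The submodule name `third_party/eigen` is an assumption taken from the clone path in the log earlier in this thread, and `set_submodule_url` is a hypothetical helper.

```shell
# Hypothetical helper: set the URL of a submodule in a .gitmodules-style file.
# Assumption: the submodule is named after its path (e.g. third_party/eigen).
set_submodule_url() {
    local name="$1" url="$2" file="${3:-.gitmodules}"
    git config -f "$file" "submodule.${name}.url" "$url"
}

# Inside the cloned repository one would then run (commented out here):
# set_submodule_url third_party/eigen https://github.com/eigenteam/eigen-git-mirror
# git submodule sync
# git submodule update --init --recursive
```

`git submodule sync` copies the new URL from .gitmodules into the local git configuration, which is why it has to run before the submodule update.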