CenterNet on AMD Radeon GPUs
Introduction
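A paper describing an object detection model that outperforms M2Det has come out, so I checked that it runs on a Radeon GPU. M2Det came from researchers in China, and CenterNet is likewise a model written by researchers of Chinese descent. If the paper's results hold, the ranking is YoloV3 < M2Det < CenterNet, which makes CenterNet the most accurate and the lightest of the three.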
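CenterNet: Keypoint Triplets for Object Detection https://arxiv.org/abs/1904.08189
The two implementations linked below belong to a different paper that also calls its model CenterNet: Objects as Points https://arxiv.org/abs/1904.07850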
PyTorch implementation https://github.com/xingyizhou/CenterNet/blob/master/readme/INSTALL.md
Keras implementation https://github.com/see--/keras-centernet
Verification
Clone the repository and install the required packages.
git clone https://github.com/see--/keras-centernet
cd keras-centernet
sudo pip3 install -r requirements.txt
The required packages are:
Keras==2.2.4
opencv-python==3.4.3.18
tqdm==4.26.0
youtube-dl==2019.4.30
pytest==4.4.1
Pillow==6.0.0
matplotlib==3.0.3
Cython==0.29.7
pycocotools==2.0.0
The Keras backend is tensorflow-rocm 1.13.3. youtube-dl is not needed for this test.
Use the following command to confirm that ROCm TensorFlow can see the GPU (Radeon VII).
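If you want to confirm that the installed versions really match the pins listed above, something like the following could help. This is only a minimal sketch: the helper names (parse_pins, installed_version, mismatches) are made up for illustration, and it assumes Python 3.8 or later for importlib.metadata, which is newer than the Python 3.5 seen in the logs of this post.

```python
from importlib import metadata  # assumption: Python 3.8+, newer than the 3.5 used in this post


def parse_pins(text):
    """Parse 'name==version' lines into a dict, skipping blanks and comments."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, _, version = line.partition("==")
        pins[name.strip()] = version.strip()
    return pins


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def mismatches(pins, lookup=installed_version):
    """Return {name: (wanted, found)} for every pin that is not satisfied."""
    result = {}
    for name, wanted in pins.items():
        found = lookup(name)
        if found != wanted:
            result[name] = (wanted, found)
    return result
```

Calling mismatches(parse_pins(open("requirements.txt").read())) inside the cloned directory should return an empty dict when every pin is satisfied. Lines that are not plain name==version pins are simply ignored by this sketch.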
python3 -c "from tensorflow.python.client import device_lib; device_lib.list_local_devices()"
johndoe@thiguhag:~$ python3 -c "from tensorflow.python.client import device_lib; device_lib.list_local_devices()"
2019-04-15 23:10:40.484698: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-04-15 23:10:40.485199: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1531] Found device 0 with properties:
name: Vega 20
AMDGPU ISA: gfx906
memoryClockRate (GHz) 1.802
pciBusID 0000:03:00.0
Total memory: 15.98GiB
Free memory: 15.73GiB
2019-04-15 23:10:40.485213: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1642] Adding visible gpu devices: 0
2019-04-15 23:10:40.485374: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-04-15 23:10:40.485391: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1059] 0
2019-04-15 23:10:40.485395: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1072] 0: N
2019-04-15 23:10:40.485421: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1189] Created TensorFlow device (/device:GPU:0 with 15306 MB memory) -> physical GPU (device: 0, name: Vega 20, pci bus id: 0000:03:00.0)
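The one-liner above relies on the TensorFlow startup log to show the GPU. If you would rather get the answer as a value in a script, a small sketch like this could work. The helper names gpu_names and local_gpu_names are hypothetical; the sketch only assumes that each entry returned by device_lib.list_local_devices() has name and device_type fields, and it imports TensorFlow lazily so the filtering part can be tried without it.

```python
def gpu_names(devices):
    """Return the names of all entries whose device_type is 'GPU'."""
    return [d.name for d in devices if d.device_type == "GPU"]


def local_gpu_names():
    """List the GPUs TensorFlow can see on this machine (needs tensorflow-rocm)."""
    # Imported here so that gpu_names() stays usable without TensorFlow installed.
    from tensorflow.python.client import device_lib
    return gpu_names(device_lib.list_local_devices())
```

On a working ROCm setup, print(local_gpu_names()) should give a non-empty list; an empty list would suggest TensorFlow is falling back to the CPU.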
Next, run the sample. You do not have to fetch the weights file yourself: the script below includes a downloader that retrieves it automatically, so running it is very easy.
PYTHONPATH=. python3 keras_centernet/bin/ctdet_image.py --fn assets/demo2.jpg --inres 512,512
johndoe@thiguhag:~/keras-centernet$ PYTHONPATH=. python3 keras_centernet/bin/ctdet_image.py --fn assets/demo2.jpg --inres 512,512
Using TensorFlow backend.
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
2019-06-12 04:33:20.069565: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-06-12 04:33:20.125171: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1531] Found device 0 with properties:
name: Vega 20
AMDGPU ISA: gfx906
memoryClockRate (GHz) 1.802
pciBusID 0000:03:00.0
Total memory: 15.98GiB
Free memory: 15.73GiB
2019-06-12 04:33:20.125200: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1642] Adding visible gpu devices: 0
2019-06-12 04:33:20.125210: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-06-12 04:33:20.125214: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1059] 0
2019-06-12 04:33:20.125218: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1072] 0: N
2019-06-12 04:33:20.125250: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1189] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 15306 MB memory) -> physical GPU (device: 0, name: Vega 20, pci bus id: 0000:03:00.0)
Image saved to: output/ctdet.demo2.jpg
100%|██████████| 1/1 [00:08<00:00,
It ran without any problems.
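The automatic weight download mentioned above is a common download-once-then-reuse pattern. The following is a rough sketch of that idea, not the actual code of keras-centernet: cached_download, the fetch parameter and the cache directory are all assumptions made for illustration. Keras ships keras.utils.get_file for the same purpose, but I have not checked here whether the repository uses it.

```python
import os
from urllib.request import urlretrieve


def cached_download(url, cache_dir, fetch=urlretrieve):
    """Download url into cache_dir on first use and return the local path."""
    os.makedirs(cache_dir, exist_ok=True)
    path = os.path.join(cache_dir, os.path.basename(url))
    if not os.path.exists(path):
        tmp = path + ".part"
        fetch(url, tmp)        # write to a temporary name first
        os.replace(tmp, path)  # rename only after the download finished
    return path
```

The second call with the same URL returns the cached path without touching the network, which matches how the sample only needs to fetch the weights on its first run.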
References
- TensorFlow-ROCm / HipCaffe / PyTorch-ROCm / Caffe2 installation https://rocm-documentation.readthedocs.io/en/latest/Deep_learning/Deep-learning.html
- ROCm https://github.com/ROCmSoftwarePlatform
- MIOpen https://gpuopen.com/compute-product/miopen/
- GPUEater tensorflow-rocm installer https://github.com/aieater/rocm_tensorflow_info
- CenterNet: Keypoint Triplets for Object Detection https://arxiv.org/abs/1904.08189
- PyTorch-CenterNet https://github.com/xingyizhou/CenterNet/blob/master/readme/INSTALL.md
- Keras-CenterNet https://github.com/see--/keras-centernet