CenterNet on AMD Radeon GPU
Introduction
A newer object detection paper reports better results than M2Det, so I checked whether its model runs on a Radeon GPU. Like M2Det, CenterNet was written by Chinese researchers. According to the paper, CenterNet is the most accurate and the lightest of the three models: YoloV3 < M2Det < CenterNet.
CenterNet: Keypoint Triplets for Object Detection https://arxiv.org/abs/1904.08189
PyTorch Implementation https://github.com/xingyizhou/CenterNet/blob/master/readme/INSTALL.md
Keras Implementation https://github.com/see--/keras-centernet
Installation Check
Clone the repository and install the required packages.
git clone https://github.com/see--/keras-centernet
cd keras-centernet
sudo pip3 install -r requirements.txt
The required packages are:
Keras==2.2.4
opencv-python==3.4.3.18
tqdm==4.26.0
youtube-dl==2019.4.30
pytest==4.4.1
Pillow==6.0.0
matplotlib==3.0.3
Cython==0.29.7
pycocotools==2.0.0
The Keras backend is tensorflow-rocm 1.13.3. youtube-dl is not needed this time.
Check that ROCm TensorFlow detects the GPU (Radeon VII) with the following command.
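If pip reports conflicts, it can help to compare the pinned versions with what is actually installed. The following is a minimal sketch, not part of keras-centernet: the helper names (parse_pins, installed_version, compare) are made up for illustration, it assumes Python 3.8 or later for importlib.metadata (the logs in this post come from Python 3.5, where this module is not available), and it leaves out youtube-dl because it is not needed here.

```python
from importlib import metadata  # standard library in Python 3.8+

# Pins copied from the requirements list above (youtube-dl omitted on purpose).
PINS = """\
Keras==2.2.4
opencv-python==3.4.3.18
tqdm==4.26.0
pytest==4.4.1
Pillow==6.0.0
matplotlib==3.0.3
Cython==0.29.7
pycocotools==2.0.0
"""


def parse_pins(text):
    """Turn 'name==version' lines into a {name: version} dict."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, version = line.partition("==")
        pins[name.strip()] = version.strip()
    return pins


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def compare(pins, lookup=installed_version):
    """Return {name: (pinned, installed)} for every pin that does not match."""
    mismatches = {}
    for name, pinned in pins.items():
        found = lookup(name)
        if found != pinned:
            mismatches[name] = (pinned, found)
    return mismatches


if __name__ == "__main__":
    for name, (pinned, found) in compare(parse_pins(PINS)).items():
        print(f"{name}: pinned {pinned}, installed {found or 'not installed'}")
```

An empty output means every pinned package is installed at exactly the listed version. The lookup parameter exists only so the comparison can be exercised without touching the real environment.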
python3 -c "from tensorflow.python.client import device_lib; device_lib.list_local_devices()"
johndoe@thiguhag:~$ python3 -c "from tensorflow.python.client import device_lib; device_lib.list_local_devices()"
2019-04-15 23:10:40.484698: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-04-15 23:10:40.485199: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1531] Found device 0 with properties:
name: Vega 20
AMDGPU ISA: gfx906
memoryClockRate (GHz) 1.802
pciBusID 0000:03:00.0
Total memory: 15.98GiB
Free memory: 15.73GiB
2019-04-15 23:10:40.485213: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1642] Adding visible gpu devices: 0
2019-04-15 23:10:40.485374: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-04-15 23:10:40.485391: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1059] 0
2019-04-15 23:10:40.485395: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1072] 0: N
2019-04-15 23:10:40.485421: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1189] Created TensorFlow device (/device:GPU:0 with 15306 MB memory) -> physical GPU (device: 0, name: Vega 20, pci bus id: 0000:03:00.0)
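The one-liner above relies on TensorFlow's log messages to show the GPU, because the return value of list_local_devices() is never printed. As a rough sketch, the same check can print the returned list directly. The helper names summarize_devices, gpu_names, and main are hypothetical; the attributes name, device_type, and memory_limit are fields of the device entries that list_local_devices() returns.

```python
def summarize_devices(devices):
    """Return one (name, type, memory in GiB) tuple per device."""
    rows = []
    for dev in devices:
        # memory_limit is reported in bytes; convert to GiB for readability.
        rows.append((dev.name, dev.device_type, round(dev.memory_limit / 2**30, 2)))
    return rows


def gpu_names(devices):
    """Return the names of all devices whose type is GPU."""
    return [dev.name for dev in devices if dev.device_type == "GPU"]


def main():
    # Imported here so the helpers above stay usable without TensorFlow.
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        print("TensorFlow is not installed")
        return
    local = device_lib.list_local_devices()
    for name, kind, gib in summarize_devices(local):
        print(name, kind, gib, "GiB")
    if not gpu_names(local):
        print("No GPU is visible to TensorFlow")


if __name__ == "__main__":
    main()
```

On a working ROCm setup the list should contain an entry such as /device:GPU:0, matching the device name in the log above.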
Run the sample. The weights file is not bundled with the repository, but the following script includes a downloader and fetches the weights automatically, so it is very easy to run.
PYTHONPATH=. python3 keras_centernet/bin/ctdet_image.py --fn assets/demo2.jpg --inres 512,512
johndoe@thiguhag:~/keras-centernet$ PYTHONPATH=. python3 keras_centernet/bin/ctdet_image.py --fn assets/demo2.jpg --inres 512,512
Using TensorFlow backend.
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
2019-06-12 04:33:20.069565: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-06-12 04:33:20.125171: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1531] Found device 0 with properties:
name: Vega 20
AMDGPU ISA: gfx906
memoryClockRate (GHz) 1.802
pciBusID 0000:03:00.0
Total memory: 15.98GiB
Free memory: 15.73GiB
2019-06-12 04:33:20.125200: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1642] Adding visible gpu devices: 0
2019-06-12 04:33:20.125210: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-06-12 04:33:20.125214: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1059] 0
2019-06-12 04:33:20.125218: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1072] 0: N
2019-06-12 04:33:20.125250: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1189] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 15306 MB memory) -> physical GPU (device: 0, name: Vega 20, pci bus id: 0000:03:00.0)
Image saved to: output/ctdet.demo2.jpg
100%|██████████| 1/1 [00:08<00:00,
It ran without any problems.
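To confirm that the run really produced an image, a quick check of the output file is enough. The following is a small sketch that uses only the standard library. The function name looks_like_jpeg is made up for illustration, and the path is the one printed in the log above. It only tests that the file exists, is not empty, and starts with the JPEG marker bytes; it does not validate the detections themselves.

```python
import os

JPEG_MAGIC = b"\xff\xd8"  # every JPEG file starts with these two bytes


def looks_like_jpeg(path):
    """Return True if path exists, is non-empty, and starts with the JPEG marker."""
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        return False
    with open(path, "rb") as fh:
        return fh.read(2) == JPEG_MAGIC


if __name__ == "__main__":
    path = "output/ctdet.demo2.jpg"  # path taken from the "Image saved to" log line
    print(path, "OK" if looks_like_jpeg(path) else "missing or not a JPEG")
```

Run it from the keras-centernet directory so that the relative path resolves.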
References
- TensorFlow-ROCm / HipCaffe / PyTorch-ROCm / Caffe2 installation https://rocm-documentation.readthedocs.io/en/latest/Deep_learning/Deep-learning.html
- ROCm https://github.com/ROCmSoftwarePlatform
- MIOpen https://gpuopen.com/compute-product/miopen/
- GPUEater tensorflow-rocm installer https://github.com/aieater/rocm_tensorflow_info
- CenterNet: Keypoint Triplets for Object Detection https://arxiv.org/abs/1904.08189
- PyTorch-CenterNet https://github.com/xingyizhou/CenterNet/blob/master/readme/INSTALL.md
- Keras-CenterNet https://github.com/see--/keras-centernet