Accelerate AI at Edge with ONNX Runtime and Intel Neural Compute Stick 2
In the previous parts of this series, we explored the ONNX model format and its runtime. In this final tutorial, I will walk you through the steps of accelerating an ONNX model on an edge device powered by the Intel Movidius Neural Compute Stick (NCS) 2 and Intel's Distribution of OpenVINO Toolkit. We will run the Tiny YOLO v2 model first on a desktop CPU and then on an edge device with almost no change to the code.
Quick Recap — ONNX Runtime
Apart from bringing interoperability across deep learning frameworks, ONNX promises optimized execution of the neural network graph depending on the available hardware. The ONNX Runtime abstracts various hardware architectures such as AMD64 CPUs, ARM64 CPUs, GPUs, FPGAs, and VPUs.
For example, the same ONNX model can deliver better inference performance when it is run against a GPU backend, without any optimization done to the model. This is possible due to the plugin model of ONNX Runtime, which supports multiple execution providers.
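If you want to confirm which backends your installed onnxruntime package actually exposes, the runtime's standard get_available_providers() call lists them. This quick check is not part of the original tutorial, just a convenience:

```python
import onnxruntime as rt

# Prints the execution providers compiled into this onnxruntime build, e.g.
# ['OpenVINOExecutionProvider', 'CPUExecutionProvider'] for an OpenVINO-enabled
# package, or just ['CPUExecutionProvider'] for the default pip wheel.
print(rt.get_available_providers())
```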
A hint provided to ONNX Runtime just before creating the inference session translates to a considerable performance boost.
The below code snippet is an example of such an optimization hint for the ONNX Runtime to utilize an Intel Integrated Graphics backend.
```python
import onnxruntime as rt

rt.capi._pybind_state.set_openvino_device("GPU_FP32")
sess = rt.InferenceSession('TinyYOLO.onnx')
```
When the same model is used in a smart camera powered by an Intel NCS device, the backend can be changed to target the MYRIAD Vision Processing Unit (VPU).
```python
rt.capi._pybind_state.set_openvino_device("MYRIAD_FP16")
```
In the below sections, we will build a simple object detection system based on the popular Tiny YOLO v2 model. We will first run this on a PC to execute the model against a CPU backend before moving it to the edge device with a VPU.
Prerequisites
To finish this tutorial, you need the following:
- A PC or laptop with Python 3, a webcam, and Docker installed
- The Tiny YOLO v2 model in ONNX format (downloaded below)
- For the edge section, an Ubuntu 18.04 machine with an Intel Neural Compute Stick 2 attached
Setting up the Environment
Start by creating a Python virtual environment for the project.
```bash
python -m venv demoenv
source demoenv/bin/activate
```
Create a requirements.txt file with the required Python modules.
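The original article does not list the contents of requirements.txt; a minimal set that satisfies the imports used in infer.py is sketched below. Note that the OpenVINO container used later ships its own onnxruntime build, so you may want to skip reinstalling onnxruntime inside the container.

```text
opencv-python
numpy
onnxruntime
```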
Since we are going to detect up to 20 objects, create a file called labels.txt with the below labels:
```text
aeroplane,bicycle,bird,boat,bottle,bus,car,cat,chair,cow,diningtable,dog,horse,motorbike,person,pottedplant,sheep,sofa,train,tvmonitor
```
Finally, download the Tiny YOLO v2 model from the ONNX Model Zoo.
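The Model Zoo publishes the model under a versioned file name (for example tinyyolov2-8.onnx, depending on the release you choose), while the script below expects TinyYOLO.onnx, so rename the download accordingly:

```bash
# Rename the downloaded Model Zoo file to the name infer.py expects.
mv tinyyolov2-*.onnx TinyYOLO.onnx
```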
Object Detection with Tiny YOLO V2 on Desktop
We are now ready to code the inference program based on Tiny YOLO v2 and ONNX Runtime. Create a file, infer.py, with the below code:
```python
import cv2
import numpy as np
import onnxruntime as rt


def preprocess(msg):
    # Decode the JPEG byte list back into an image and shape it for Tiny YOLO v2:
    # RGB, 416x416, channels-first, with a batch dimension of 1.
    inp = np.array(msg).reshape((len(msg), 1))
    frame = cv2.imdecode(inp.astype(np.uint8), 1)
    frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    frame = np.array(frame).astype(np.float32)
    frame = cv2.resize(frame, (416, 416))
    frame = frame.transpose(2, 0, 1)
    frame = np.reshape(frame, (1, 3, 416, 416))
    return frame


def infer(frame, sess, conf_threshold):
    input_name = sess.get_inputs()[0].name
    output = {}

    def softmax(x):
        return np.exp(x) / np.sum(np.exp(x), axis=0)

    def sigmoid(x):
        return 1 / (1 + np.exp(-x))

    # The model output is a 125x13x13 tensor: 5 anchor boxes per grid cell,
    # each with 4 box coordinates, 1 objectness score, and 20 class scores.
    pred = sess.run(None, {input_name: frame})
    pred = np.array(pred[0][0])

    labels_file = open("labels.txt")
    labels = labels_file.read().split(",")

    tiny_yolo_cell_width = 13
    tiny_yolo_cell_height = 13
    num_boxes = 5
    tiny_yolo_classes = 20

    for bx in range(0, tiny_yolo_cell_width):
        for by in range(0, tiny_yolo_cell_height):
            for bound in range(0, num_boxes):
                channel = bound * 25
                tx = pred[channel][by][bx]
                ty = pred[channel + 1][by][bx]
                tw = pred[channel + 2][by][bx]
                th = pred[channel + 3][by][bx]
                tc = pred[channel + 4][by][bx]

                confidence = sigmoid(tc)
                class_out = pred[channel + 5:channel + 5 + tiny_yolo_classes][bx][by]
                class_out = softmax(np.array(class_out))
                class_detected = np.argmax(class_out)
                display_confidence = class_out[class_detected] * confidence
                if display_confidence > conf_threshold:
                    output['object'] = labels[class_detected]
                    output['confidence'] = display_confidence
    return output


def main():
    cam = 0                # webcam index; change this if you have multiple cameras
    conf_threshold = 0.10  # minimum score for a detection to be reported
    sess = rt.InferenceSession('TinyYOLO.onnx')
    while True:
        cv2.waitKey(5)
        cap = cv2.VideoCapture(cam)
        ret, frame = cap.read()
        # cv2.imshow('frame', frame)
        ret, enc = cv2.imencode('.jpg', frame)
        enc = enc.flatten()
        fr = preprocess(enc.tolist())
        p = infer(fr, sess, conf_threshold)
        print(p)


if __name__ == "__main__":
    main()
```
If you are familiar with OpenCV and basic Convolutional Neural Networks (CNN), the code is self-explanatory.
It does three things:
- Grabs the frame from the webcam
- Converts and preprocesses the frame as expected by the model
- Finally, it performs inference on the frame to detect objects that exceed the confidence threshold and pairs each detection with one of the labels from labels.txt
If you have multiple cameras attached to the machine, don't forget to update the index appropriately by changing the value of the cam variable.
Executing the code shows the objects it found along with the confidence score. Adjust the confidence threshold based on your requirement.
```text
{'object': 'diningtable', 'confidence': 0.1934369369567218}
{'object': 'diningtable', 'confidence': 0.12359955877868607}
{'object': 'diningtable', 'confidence': 0.11795787527541246}
{'object': 'chair', 'confidence': 0.13212954996625334}
{'object': 'diningtable', 'confidence': 0.1899228051957825}
{'object': 'chair', 'confidence': 0.1374235041020961}
{'object': 'chair', 'confidence': 0.1632368686534813}
```
This scenario represents ONNX Runtime performing inference against a CPU backend. In the next step, we will port this code to run on an edge device powered by Intel NCS 2.
Object Detection with Tiny YOLO V2 at the Edge
Assuming you have an Ubuntu 18.04 machine with an Intel NCS 2 device attached and the latest version of the Intel OpenVINO Toolkit installed, you are ready to execute the code at the edge. Otherwise, follow the steps to configure Intel NCS 2 and the OpenVINO Toolkit as per the documentation.
If you have an Up Squared AI Vision X Kit, you can use it for this tutorial.
Even if you don't install the entire OpenVINO Toolkit, make sure the Myriad udev rules for the NCS device are installed on the host machine, as described in the reference documentation.
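As a rough sketch, installing the Myriad udev rules usually looks like the commands below; the rules file name and its location vary between OpenVINO releases, so check the official NCS 2 setup guide for the exact steps:

```bash
# Allow the current user to access the NCS USB device and load the
# Myriad udev rules shipped with OpenVINO (file name varies by release).
sudo usermod -a -G users "$(whoami)"
sudo cp 97-myriad-usbboot.rules /etc/udev/rules.d/
sudo udevadm control --reload-rules
sudo udevadm trigger
sudo ldconfig
```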
Microsoft provides Docker images and Dockerfiles for mainstream environments. Let's start by downloading the container image for the OpenVINO Toolkit with Myriad support.
```bash
docker pull mcr.microsoft.com/azureml/onnxruntime:latest-openvino-myriad
```
Create a directory, tinyyolo, on the Ubuntu machine and copy the files from your PC (a sample copy command follows the file list). Your directory should contain the below files:
- infer.py
- requirements.txt
- labels.txt
- TinyYOLO.onnx
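One way to get the files onto the edge machine is scp; the user and host below are placeholders for your own device:

```bash
# Copy the project files from the PC into the tinyyolo directory on the edge device.
scp infer.py requirements.txt labels.txt TinyYOLO.onnx user@edge-host:~/tinyyolo/
```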
Before we execute the code, let’s add a line that tells ONNX Runtime about the presence of the Intel NCS device.
Open infer.py and add the below line just before creating the inference session variable.
```python
rt.capi._pybind_state.set_openvino_device("MYRIAD_FP16")
```
We are set to run the inference code within the Docker container based on the Myriad device.
Let's launch the Docker container by mapping the /dev directory and mounting the tinyyolo directory. We also need to add the --privileged and --network host flags to provide appropriate permissions to access the camera and the NCS USB device.
While in the tinyyolo directory, execute the below command:
```bash
docker run --privileged -v /dev:/dev -v $PWD:/tinyyolo --network host -it --rm mcr.microsoft.com/azureml/onnxruntime:latest-openvino-myriad /bin/bash
```
After getting into the shell, let’s move into the directory and install the prerequisites.
```bash
cd /tinyyolo
pip install -r requirements.txt
```
Execute the code to see the inference output in the terminal.
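Running it is the same as on the desktop; inside the container, invoke the script with python3 (assuming python3 is the interpreter available in the image):

```bash
python3 infer.py
```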
It may take a few minutes for the graph to get loaded and warmed up. You should now see the objects detected by the camera in the terminal.
This scenario can be easily extended to publish the inference output to an MQTT channel configured locally or in the cloud. Refer to my previous AIoT tutorial and a video demo of this use case.
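As a rough illustration of that extension (not part of the original code), the sketch below publishes each detection dictionary to an MQTT broker with the paho-mqtt package; the package choice, broker address, and topic name are all assumptions:

```python
import json
import paho.mqtt.client as mqtt

# paho-mqtt 1.x style client pointing at a hypothetical local broker.
client = mqtt.Client()
client.connect("localhost", 1883)

def publish_detection(detection):
    # Publish the {'object': ..., 'confidence': ...} dict returned by infer();
    # default=float converts NumPy float32 scores into JSON-serializable floats.
    if detection:
        client.publish("tinyyolo/detections", json.dumps(detection, default=float))
```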
Janakiram MSV’s Webinar series, “Machine Intelligence and Modern Infrastructure (MI2)” offers informative and insightful sessions covering cutting-edge technologies. Sign up for the upcoming MI2 webinar at http://mi2.live.
Feature Image by Robert Balog from Pixabay.