Playing with Jetson Nano - inference analysis

So I have struggled a bit over the last few days with the Nvidia Jetson Nano.

This little board is very fast at analyzing images and real-time video. My application is rather simple: my Motion video system forwards images to the Jetson Nano when motion is detected. If a person enters our premises while we are away, or while we are at home with the alarm system armed, I get notified on my iPhone. This setup has been working for a long time; I have tested it on an old laptop (with Debian), Odroid XU4 & N2, as well as an RPi3+.

Now it was time for the Jetson Nano adventure!

To start, a lot needs to be installed. I decided to follow the "Hello AI World" guide, which has plenty of good examples (in Python) that helped me write the final solution I needed.

I had to struggle quite a bit; the Jetson seems very sensitive to which power supply you use, whether a monitor is connected, and so on. It is really not as forgiving as a Pi. At the moment it is running, but I do not know yet how it will work in the long term.

Anyway, some timing comparisons could be of interest. If I forward the same image to my various platforms, I get the following readings for how long a successful object detection analysis takes (detecting persons in the image):

Odroid N2: 3.243597 seconds
My laptop: 0.786970 seconds
Jetson Nano: 0.07310032844543457 seconds

The Jetson, using its GPU, is roughly ten times faster than my laptop!! That's good!
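For anyone who wants to reproduce this kind of comparison, here is a minimal sketch in Python. The helper names `time_call` and `speedups` are made up for illustration and are not taken from the original scripts; the numbers are simply the three readings listed above.

```python
import time

def time_call(fn, *args):
    """Run fn(*args) once and return (result, elapsed_seconds)."""
    start = time.time()
    result = fn(*args)
    return result, time.time() - start

# The readings quoted in the post, in seconds per analyzed image.
timings = {
    "Odroid N2": 3.243597,
    "My laptop": 0.786970,
    "Jetson Nano": 0.07310032844543457,
}

def speedups(timings, baseline):
    """Return how many times slower each platform is than the baseline."""
    base = timings[baseline]
    return {name: seconds / base for name, seconds in timings.items()}

for name, factor in speedups(timings, "Jetson Nano").items():
    print(f"{name}: {factor:.1f}x the Jetson Nano time")
```

With these readings the laptop comes out at a bit under 11 times the Jetson Nano time and the Odroid N2 at roughly 44 times. A single measurement per platform is only a rough indication; averaging over several runs would give steadier numbers.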

For my application, such fast processing is not necessary, but for real-time video analytics I believe this is very interesting (I have, however, not played with that part).

Below is the image I used for testing. This is a tricky image and many models fail to detect me walking there. But both the Jetson and the others (YOLO v3) managed this well.



Thanks for posting this. I've found the Nano to be the best <$150 IoT-class computer for using the Coral TPU because it can handle more RTSP streams. This includes the Coral Mendel development board. I've used none of the "JetPack" stuff except for OpenCV; everything else is just Python3, the TPU Python support, and node-red for the UI and control via the dashboard.

I just got another Nano to setup for the "pure" Jetson experience. You've given me a wonderful time saving place to start.

The immediate downside I see is that the selection of available models is really limited compared to OpenVINO and, to a lesser extent, the TPU. For instance, I don't see a "Pose Estimation" model available for it.

I've been running my AI for over a year and collected a fair number of "false positive" images from MobileNet-SSD v1 & v2 with 15 various outdoor cameras with resolutions from D1 to UHD (4K). Using a Pose Estimation AI on the TPU as a second verification step would have rejected all these false positives when fed my collection of false positive images. The downside is that it would increase the false negative rate. It seems that the higher the camera angle and the more the person fills the frame, the less likely it is that pose keypoints of sufficient confidence are found. I'm investigating this. So far the bulk of the false negatives are from cameras in more protected locations (patio, porch, garage) that have not given any false positives since I upgraded to MobileNet-SSD v2. These confined areas necessarily make the viewing angle steeper and the person fill more of the frame.

For grins, I ran your image through it and it would have been a false negative. Not unexpected because of the high camera angle.
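To make the second verification step concrete, here is a minimal sketch of the idea: a detection is only accepted if a pose model finds enough keypoints with sufficient confidence. The function name and both thresholds are assumptions chosen for illustration; they are not the values or the code actually used with the TPU.

```python
from typing import List, Tuple

# One pose keypoint as (x, y, confidence); this layout is an assumption.
Keypoint = Tuple[float, float, float]

def pose_confirms_person(keypoints: List[Keypoint],
                         min_conf: float = 0.5,
                         min_count: int = 4) -> bool:
    """Accept a detection only if enough keypoints are confident.

    min_conf and min_count are hypothetical thresholds; raising them
    rejects more false positives but also produces more false negatives.
    """
    confident = [kp for kp in keypoints if kp[2] >= min_conf]
    return len(confident) >= min_count

# A steep camera angle tends to yield few confident keypoints,
# so a real person can be rejected (a false negative).
steep_view = [(10.0, 20.0, 0.7), (12.0, 25.0, 0.3), (15.0, 40.0, 0.2)]
print(pose_confirms_person(steep_view))
```

The trade-off described above shows up directly in the two thresholds: there is no setting that removes false positives without costing some detections from high-mounted cameras.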

I think this is one of the largest issues we face, as cameras need to be mounted high to avoid vandalism. This makes me question the real-world practicality of Arlo, Nest, etc. battery-powered WiFi cameras: if the (expensive) cameras are mounted high enough to avoid theft, it's gonna be a PITA to be changing batteries every few months.

My biggest surprise so far is that UHD cameras appear to improve the AI detection sensitivity, which was totally unexpected given that the 3840x2160 image is resized to 300x300 for the AI. I ran a UHD and an HD camera mounted adjacent to each other to get as close to the same field of view for each camera as I could. The UHD camera detected people in more frames and further from the camera, regularly getting them well beyond my area of interest, on my neighbor's sidewalk across the street! So now I have to add "region masking" to filter out valid detections that I don't want notifications for.
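One way such region masking could be done is sketched below: a detection is kept only if the bottom-center of its bounding box (roughly where the person's feet are) falls inside a polygon drawn over the area of interest. This is an illustration under that assumption, not the approach actually implemented; the function names and the example polygon are made up.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels

def point_in_polygon(pt: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: count polygon edges crossed by a ray going right."""
    x, y = pt
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through pt can cross.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def in_region(box: Box, polygon: List[Point]) -> bool:
    """Keep a detection if the bottom-center of its box is in the polygon."""
    x_min, _y_min, x_max, y_max = box
    foot = ((x_min + x_max) / 2.0, y_max)
    return point_in_polygon(foot, polygon)

# Hypothetical area of interest in image coordinates.
my_yard = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0), (0.0, 100.0)]
print(in_region((40.0, 10.0, 60.0, 50.0), my_yard))    # feet inside the yard
print(in_region((140.0, 10.0, 160.0, 50.0), my_yard))  # feet outside the yard
```

Using the bottom of the box rather than its center is a design choice: with a high-mounted camera, the upper body of someone in the yard can easily overlap the sidewalk in the image, while the feet stay in the right place.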

Here is a detection and verification of my mailman leaving that is at about the limit of where I want notifications. I have to reject all the ones from the horizontal sidewalk and across the street. Didn't have this problem with D1 and HD camera images:


Very, very nice indeed!!! And great resolution too. You are 100% right about the camera viewing angle. And given the price of those outdoor wireless Nest-type cameras, a thief who knows something would be more interested in stealing those than in breaking into the garage (or car).
We have a lot to talk about!

I was slow to upgrade to 4K UHD cameras because I thought the AI wouldn't work well with them. Using some 3 & 4 Mpixel NetCams with MobileNetSSD-v1, it sure looked like the extra resolution made detection less sensitive.

It turned out that shortly after I started running MobileNetSSD-v2 my Lorex DVR died. It was too hot to consider going up in the attic to pull new cables, so I had to get a compatible "analog" replacement DVR (called MPX these days, it basically supports all the "analog" security camera formats) and figured, what the hey, get one that also supports 4K. Costco had a 4K "analog" camera for ~$90, so I tried one and was blown away by the improvement in the AI detection. Totally unexpected! I now have 5 UHD cameras in operation and 10 1080p cameras.

I think we are all benefiting by sharing ideas and results. I'm willing to share my code with anyone who is interested.


There have been huge improvements made for the Jetson Nano. Check out this:
Examples demonstrating how to optimize caffe/tensorflow/darknet models with TensorRT and run inferencing on NVIDIA Jetson or x86_64 PC platforms

I could not resist, I had to give it a go. I was aiming to use YOLO v4, which I think is the most accurate object detector of all. So basically demos #5 and #6, optimizing with TensorRT using plugins. Well, the results (for all object detectors) are pretty impressive. In addition, I was happy to see that YOLO v4 tiny now detects objects in some of my tricky images where I previously needed the full-blown YOLO v3 to find the same.

I thought about having a kind of analyzer service in Python that would use the new optimized engines and could also use MQTT to integrate nicely with NR. Something like:

image -> mqtt -> python script analyzer service -> send result -> mqtt -> NR -> image viewer etc etc
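The message handling in such a pipeline could look roughly like the sketch below. It is not the attached script; it only illustrates the routing, using the topic names and stop payload that appear in the flow (`image/XX`, `image/stop`, `Stop-YOLO-Analyzer`, `result`). The `analyze` argument is a hypothetical stand-in for the real TensorRT detection step, and the MQTT client wiring is left out on purpose.

```python
from typing import Callable, Optional, Tuple

STOP_TOPIC = "image/stop"
STOP_PAYLOAD = b"Stop-YOLO-Analyzer"
RESULT_TOPIC = "result"

def handle_message(topic: str,
                   payload: bytes,
                   analyze: Callable[[bytes], bytes]
                   ) -> Tuple[bool, Optional[Tuple[str, bytes]]]:
    """Decide what to do with one incoming MQTT message.

    Returns (keep_running, outgoing), where outgoing is either None or a
    (topic, payload) pair that the caller should publish.
    """
    if topic == STOP_TOPIC:
        # Only the agreed stop payload shuts the service down.
        return (payload != STOP_PAYLOAD, None)
    if topic.startswith("image/"):
        # analyze() is assumed to return the annotated image as bytes.
        return (True, (RESULT_TOPIC, analyze(payload)))
    return (True, None)

# Example with a dummy analyzer that just tags the data.
keep_running, outgoing = handle_message("image/51", b"jpeg-bytes",
                                        lambda data: data + b"+boxes")
print(keep_running, outgoing)
```

Keeping the routing in a plain function like this makes it easy to test without a broker; an MQTT client's message callback would then only call `handle_message` and publish whatever comes back.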

So I made a setup flow like the one below. The script runs in a terminal window on the display, just to make it easier to see the progress while detecting. I put the script in the folder /home/wk/trt_projects/tensorrt_demos. You need to change this to your specific paths, also in the exec node command line. The script assumes that the yolov4-tiny-opt-416 engine has been built. The Python script itself requires a number of modules to be installed with pip3 if you do not have them installed already. The version of cv2 that is included in the latest default Jetson image is good enough; I do not do any advanced operations using cv2.

[{"id":"59daad36.de7434","type":"inject","z":"85a5333b.17ff7","name":"Stop","repeat":"","crontab":"","once":false,"onceDelay":0.1,"topic":"","payload":"true","payloadType":"bool","x":110,"y":240,"wires":[["d0c00b44.5fc5c8"]]},{"id":"d0c00b44.5fc5c8","type":"change","z":"85a5333b.17ff7","name":"","rules":[{"t":"set","p":"payload","pt":"msg","to":"Stop-YOLO-Analyzer","tot":"str"}],"action":"","property":"","from":"","to":"","reg":false,"x":310,"y":240,"wires":[["ff70b4ec.d53b88"]]},{"id":"ff70b4ec.d53b88","type":"mqtt out","z":"85a5333b.17ff7","name":"","topic":"image/stop","qos":"","retain":"","broker":"5f58f95.8104a08","x":540,"y":240,"wires":[]},{"id":"cf10f9c1.719218","type":"mqtt in","z":"85a5333b.17ff7","name":"","topic":"result","qos":"2","datatype":"auto","broker":"5f58f95.8104a08","x":110,"y":480,"wires":[["93d86af8.87ed88"]]},{"id":"20c50622.29e10a","type":"image viewer","z":"85a5333b.17ff7","name":"","width":"320","data":"payload","dataType":"msg","x":340,"y":540,"wires":[[]]},{"id":"93d86af8.87ed88","type":"jimp-image","z":"85a5333b.17ff7","name":"","data":"payload","dataType":"msg","ret":"img","parameter1":"","parameter1Type":"msg","parameter2":"","parameter2Type":"msg","parameter3":"","parameter3Type":"msg","parameter4":"","parameter4Type":"msg","parameter5":"","parameter5Type":"msg","parameter6":"","parameter6Type":"msg","parameter7":"","parameter7Type":"msg","parameter8":"","parameter8Type":"msg","parameterCount":0,"jimpFunction":"none","selectedJimpFunction":{"name":"none","fn":"none","description":"Just loads the image.","parameters":[]},"x":340,"y":480,"wires":[["20c50622.29e10a"]]},{"id":"15072d69.25bff3","type":"http request","z":"85a5333b.17ff7","name":"","method":"GET","ret":"bin","paytoqs":false,"url":"","tls":"","persist":true,"proxy":"","authType":"","x":450,"y":370,"wires":[["e2c4def1.99418","3fdd02a5.88d0fe"]]},{"id":"8f7dbdb4.6eced","type":"inject","z":"85a5333b.17ff7","name":"","props":[{"p":"payload"},{"p":"topic","vt":"str"}],"repeat":"","crontab":"","once":false,"onceDelay":0.1,"topic":"","payload":"true","payloadType":"bool","x":110,"y":370,"wires":[["15072d69.25bff3"]]},{"id":"e2c4def1.99418","type":"mqtt out","z":"85a5333b.17ff7","name":"","topic":"image/51","qos":"","retain":"","broker":"5f58f95.8104a08","x":790,"y":370,"wires":[]},{"id":"3fdd02a5.88d0fe","type":"image viewer","z":"85a5333b.17ff7","name":"","width":"320","data":"payload","dataType":"msg","x":780,"y":430,"wires":[[]]},{"id":"54643313.a69f4c","type":"exec","z":"85a5333b.17ff7","command":"export DISPLAY=:0 && xterm -geometry 96x24-150+150 -e \"cd /home/wk/trt_projects/tensorrt_demos && python3 /home/wk/trt_projects/tensorrt_demos/\"","addpay":false,"append":"","useSpawn":"false","timer":"","oldrc":false,"name":"","x":650,"y":170,"wires":[[],[],[]]},{"id":"f3494b31.ed5548","type":"inject","z":"85a5333b.17ff7","name":"Start","repeat":"","crontab":"","once":false,"onceDelay":"10","topic":"","payload":"true","payloadType":"bool","x":110,"y":100,"wires":[["54643313.a69f4c"]]},{"id":"be08ae32.8d02c","type":"comment","z":"85a5333b.17ff7","name":"Starting & stopping services","info":"","x":180,"y":60,"wires":[]},{"id":"8ba1ff8e.34d6","type":"comment","z":"85a5333b.17ff7","name":"Send images","info":"","x":130,"y":330,"wires":[]},{"id":"8c4c9f74.3f5bd","type":"comment","z":"85a5333b.17ff7","name":"View the result","info":"","x":140,"y":440,"wires":[]},{"id":"5f58f95.8104a08","type":"mqtt-broker","z":"","name":"","broker":"","port":"1883","clientid":"","usetls":false,"compatmode":false,"keepalive":"60","cleansession":true,"birthTopic":"","birthQos":"0","birthPayload":"","closeTopic":"","closeQos":"0","closePayload":"","willTopic":"","willQos":"0","willPayload":""}]

The script is below (just change the extension to .py). To send images to the script, just publish the image buffer data to the broker topic. I have used image/XX, where XX is a unique camera number in my case, but you could use anything as XX.
trt_yolov4_to_mqtt.txt (4.1 KB)
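If you would rather send a test image from Python than from the flow, a minimal sketch could look like this. It assumes the paho-mqtt package is installed (pip3 install paho-mqtt) and uses its `publish.single` helper; the broker host "localhost" and the file name in the comment are placeholders, and `camera_topic` and `publish_image` are made-up helper names.

```python
def camera_topic(camera_id: int) -> str:
    """Build the image/XX topic, where XX is the camera number."""
    return f"image/{camera_id}"

def publish_image(path: str, camera_id: int,
                  host: str = "localhost", port: int = 1883) -> None:
    """Read an image file and publish its raw bytes to the camera topic.

    host is an assumption; point it at your own broker.
    """
    # Imported here so the topic helper works without paho-mqtt installed.
    import paho.mqtt.publish as publish

    with open(path, "rb") as image_file:
        data = image_file.read()
    publish.single(camera_topic(camera_id), payload=data,
                   hostname=host, port=port)

print(camera_topic(51))
# Example call (needs a running broker and an existing file):
# publish_image("test.jpg", 51)
```

The payload is the undecoded file content, which matches what the http request node in the flow delivers as a binary buffer.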

If you decide to try demo #6, first check your version of TensorRT. My "old" installation showed I had 5.1.6, and that is too old for building the opt versions using plugins. Now, with the new image downloaded, I have 7.1.3.
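A quick way to check this from Python is sketched below. It relies on the `tensorrt` Python module exposing `__version__`. Note that 7.1.3 is used here only because it is the version reported as working above; the exact minimum version required by the demos is not stated, so treat the comparison as a rough guide.

```python
from typing import Optional, Tuple

def version_tuple(version: str) -> Tuple[int, ...]:
    """Turn a string like '7.1.3' into (7, 1, 3) for comparison."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())

def installed_tensorrt_version() -> Optional[str]:
    """Return the TensorRT version string, or None if it is not importable."""
    try:
        import tensorrt
    except ImportError:
        return None
    return tensorrt.__version__

# Version reported as working in this post; not an official minimum.
KNOWN_GOOD = "7.1.3"

found = installed_tensorrt_version()
if found is None:
    print("TensorRT Python module not found")
elif version_tuple(found) < version_tuple(KNOWN_GOOD):
    print(f"TensorRT {found} is older than {KNOWN_GOOD}; consider a newer image")
else:
    print(f"TensorRT {found} is at least {KNOWN_GOOD}")
```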

When you build the engines (at least this is what I saw for the opt version), decide beforehand whether you want to run the Nano headless or not. I built with a monitor attached and then got a warning when running without it. So I rebuilt without a monitor attached, using VNC. Then it worked fine without the warning. See here:

If you start off from a newly downloaded image, see also here:

The result is not disappointing, reaching around 20 fps!!!

Best regards, Walter