How fast is your TensorFlow COCO SSD?

Hello

I would like to ask about your experiences with the TF COCO SSD node from node-red-contrib-tfjs-coco-ssd.
On my side the model execution takes between 7 and 10 seconds for a simple image fetched via an HTTP request, e.g. https://upload.wikimedia.org/wikipedia/commons/c/cb/Old-style_VAZ_car_in_Kolpino_with_USSR-time_car_number.jpg

Is this normal, or is it something that could be improved? If so, how? In my case I'm only interested in detecting the classes PERSON, CAR and BICYCLE. Could restricting the detection to these classes improve the runtime of the model, and if yes: how can I make this kind of setting?

Maybe the files in this folder (on my Raspberry Pi 3) could be the key to making such settings?
/home/userpi/.node-red/node_modules/@tensorflow-models/coco-ssd

Thank you, BR

Hi @Gerry-it ,
Here on stackoverflow I see that it is ok if you use images of 320×320. It seems that the ssd model has also been trained on 320x320 images...
Bart

When I try with the picture of the VAZ, the coco ssd analysis is fast, finishing within 260-290 milliseconds running on a RPi3B+. Much more time is consumed by the actual image transfer via the http node. I have saved the image locally on an Apache web server in my network, but it still requires some 11 seconds (!!!) to transfer it.

I would, if possible, look for another transfer mechanism: having "something" closer to the source that reads the file, possibly compresses it, and then transfers it via MQTT.

I think my answer from yesterday was misleading. It was wrong. According to my new measurements, it is not the transfer time that consumes most of the processing time, it is the analysis. Just transferring the image takes around 180-190 ms. The actual analysis of the full-sized image then takes around 11-12 seconds. Using a resized image, keeping the aspect ratio (320x192), takes around 3-4 seconds. All on a RPi3B+.

I don't think you can "modify" the existing model. Maybe you could train your own model to make a more tailored variant. But the best advice is to resize images before the analysis and to get a faster computer. I do not have one myself, but I have understood that running on a RPi4 will be much faster.

I'm still wondering why the "blue request text" below the http request node stays there the whole time while the flow is running, but anyway.

I think this is because NR is pretty much blocked from doing other things while the analysis is running...

Hi, @krambriw thanks for your input.

I played with some settings of the COCO node, and changing the passthrough setting from ANNOTATED IMAGE to NONE or ORIGINAL IMAGE reduced the time (of the complete chain: http-request + coco-ssd) from 8 seconds to 2.2 seconds.

How do you reduce the quality of the image in a performant way?

BR

Thanks, yes, changing the output to NONE reduces the elapsed time substantially :wink:

To reduce the size of the image for testing, I just used an image editing tool (Paint Shop Pro) to create a smaller version of the original. The original size is 2560x1536, and I resized it to 320x192.

To do resizing "on the fly" there are several ways. One is to use the image tools node (node-red-contrib-image-tools), but after testing this I think it loads NR too much, so what you win in reduced analysis time you lose in resizing time. A better way, I believe (not tested yet), is to do the resizing in a pre-processing step outside of NR before analyzing the image in the coco ssd node, like in a python script using OpenCV, sending the resized image back to NR for the analysis.
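
As a rough sketch (untested here; the broker address, topic names and target width are just placeholders to adapt to your own setup), such a pre-processing script could look something like this:

import cv2
import numpy as np
import paho.mqtt.client as mqtt

TARGET_WIDTH = 320  # width to resize to before the analysis

def on_message(client, userdata, msg):
    # Decode the JPEG buffer received via MQTT into an OpenCV image
    img = cv2.imdecode(np.frombuffer(msg.payload, np.uint8), cv2.IMREAD_COLOR)
    if img is None:
        return
    h, w = img.shape[:2]
    # Resize to 320 pix width, keeping the aspect ratio
    small = cv2.resize(img, (TARGET_WIDTH, int(h * TARGET_WIDTH / w)))
    ok, jpg = cv2.imencode('.jpg', small)
    if ok:
        # Publish the resized image back to NR for the coco ssd analysis
        client.publish('resized/42', jpg.tobytes())

client = mqtt.Client()
client.on_message = on_message
client.connect('localhost', 1883)
client.subscribe('image/raw')
client.loop_forever()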

EDIT: Just to mention, the coco ssd analysis found the car without problem in the resized image as well

If you are getting the images from a camera, then cameras can often provide multiple streams. Check whether yours has a lower resolution output. Often they provide several streams at once, so you can have the high resolution stream for recording and, at the same time, the low resolution one for the image detection.

I did some testing with a python script I already had, sending your original large image to the script via MQTT. It gave the following result:

2021-03-01 09:55:35 pics processed: 3
Qsize: 1
320 192
Finished DNN Analyze in 0.34888553619384766 seconds, camera:42
car 0.99999857 42

The script gets the image via MQTT, uses CV2 (OpenCV) to resize it to a width of 320 pix, keeping the aspect ratio, and then analyzes it directly using MobileNetSSD, basically the same as the coco ssd node does. A car was detected with 0.9999 probability, and everything finished in 0.35 seconds.
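
For reference, the analysis part of such a script could look roughly like the sketch below (the file and image names are placeholders; I'm assuming the common Caffe version of MobileNetSSD that OpenCV's dnn module can load):

import cv2

# The classes MobileNetSSD was trained on (background + 20 objects)
CLASSES = ["background", "aeroplane", "bicycle", "bird", "boat", "bottle",
           "bus", "car", "cat", "chair", "cow", "diningtable", "dog",
           "horse", "motorbike", "person", "pottedplant", "sheep", "sofa",
           "train", "tvmonitor"]

# Load the network once at startup (file names are placeholders)
net = cv2.dnn.readNetFromCaffe('MobileNetSSD_deploy.prototxt',
                               'MobileNetSSD_deploy.caffemodel')

img = cv2.imread('resized.jpg')
# MobileNetSSD expects a 300x300, mean-subtracted and scaled blob
blob = cv2.dnn.blobFromImage(cv2.resize(img, (300, 300)),
                             0.007843, (300, 300), 127.5)
net.setInput(blob)
detections = net.forward()

# Print every detection above a chosen score threshold
for i in range(detections.shape[2]):
    score = detections[0, 0, i, 2]
    if score > 0.5:
        print(CLASSES[int(detections[0, 0, i, 1])], score)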

Hello,

I've tried to play with the Threshold property in the COCO SSD node. Until now it was always at 1; the new setting is 0.7. I expected that everything below a score of 0.7 would not be recognized, but I get results, e.g. recognition of a CAR, also with 0.5, 0.6 etc., far below 0.7. Is my interpretation of the threshold setting wrong? If so, what is the correct function of this property? To be honest, I didn't find any clear answer on the web...

thanks

There is already an issue raised for this on github. You can modify the code yourself or wait until it gets fixed. I'm not sure if @dceejay has seen it. I simply changed the code in my installed module and now it works. See here

The code to fix is found in the tfjs.js file in the directory /home/pi/.node-red/node_modules/node-red-contrib-tfjs-coco-ssd:

        RED.nodes.createNode(this, n);
        // Settings as configured in the node's edit dialog
        this.scoreThreshold = n.scoreThreshold;
        this.maxDetections = n.maxDetections;
        this.passthru = n.passthru || "false";
        this.modelUrl = n.modelUrl || undefined; // e.g. "http://localhost:1880/coco/model.json"
        this.lineColour = n.lineColour || "magenta";
        var node = this;

Finally patched :slight_smile: version 0.5.6

Hello,

@dceejay and @krambriw, thanks for the good input from your side.

I've updated the node now to 0.5.6.
What is strange: with the standard threshold setting of 0.5 it works like before. If I set the threshold property to e.g. 0.7, the model does not seem to work anymore. With 0.4 it works... very strange... Anyone else with the same behaviour?

thank you

Hi, it works fine for me when trying thresholds of 0.8 and 0.9. But check that the object class really gets detected with a score higher than the threshold.

Below is a sample flow I have running on a RPi3. The flow receives image frames from my security cameras, resized to 320 pixels width, via MQTT. To avoid overloading the RPi I have a rate limiting node in front of the analyzing node, configured to 1 msg per 3 s. The function node does some "intelligent" per-camera filtering:

  • required & valid object types, just "persons" currently allowed
  • required score value
  • required minimum detection area size in pixels
  • maximum allowed detection area ratio W/H
  • detection area labeling

When a valid detection fulfilling the rules happens, the annotated image is saved to file.

The two gates have a special usage. They are controlled by certain conditions and either allow or block the sending of the event information to my phone via Telegram.

The first gate's mode depends on the status of our Verisure alarm system. It is only open when the system is fully armed or armed home (the shell of the house is armed). When unarmed, the gate is closed, meaning events will be blocked.

The second gate's mode is controlled by presence detection. Assume the house is fully armed and I approach it. My phone is detected and the gate closes for a defined time period, pausing the transmission and allowing me to enter the house without getting a lot of images of myself sent to my phone. For presence detection I use Monitor and BLE. This has worked really well, and you can of course create a list of phones that you would like to grant access in this way. In my case the list is short, just my phone right now.

Hi @krambriw ... I like your function "functions". How did you solve the single bullet points? Would it be a problem for you to share the code of the flow with us?

thanks

The code inside the function node is a bit specialized towards my specific setup; I hope it can be understood. At least it could be a point of reference for discussion. Everything inside is pretty straightforward I think (and I can also clarify any unclear details). Here it is:

Please note that I have assigned a unique number to each of my cameras, like '51' or '11'. When an image is captured by a camera, it is published (by another separate process) to an MQTT broker on a camera-specific topic, like resized/22. In this way I can identify which camera the image came from, so I can set the correct labeling, check the required score, etc.

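// Per-camera minimum confidence score required for a valid detection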
let conf_lev = {
    'XX':0.50,
    '51':0.50,
    '52':0.50,
    '41':0.40,
    '42':0.35,
    '31':0.65,
    '32':0.45,
    '21':0.70,
    '22':0.45,
    '11':0.35,
    '12':0.35
};

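// Per-camera minimum detection area (bbox width * height) in pixels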
let min_area = {
    'XX':700,
    '51':700,
    '52':700,
    '41':8600,
    '42':700,
    '31':4900,
    '32':5000,
    '21':3000,
    '22':4000,
    '11':1500,
    '12':3700
};

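// Per-camera maximum allowed detection area width/height ratio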
let ratios = {
    'XX':0.65,
    '51':0.65,
    '52':0.65,
    '41':0.65,
    '42':0.65,
    '31':0.65,
    '32':0.65,
    '21':0.65,
    '22':0.65,
    '11':0.70,
    '12':0.70
};

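// Area of the last reported detection per camera (reset by the timer below)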
let ar = context.get("ar") || {
    'XX':0,
    '51':0,
    '52':0,
    '41':0,
    '42':0,
    '31':0,
    '32':0,
    '21':0,
    '22':0,
    '11':0,
    '12':0
};

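// Per-camera reset-timer handles (at most one pending timer per camera)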
let tmrs = context.get("tmrs") || {
    'XX':null,
    '51':null,
    '52':null,
    '41':null,
    '42':null,
    '31':null,
    '32':null,
    '21':null,
    '22':null,
    '11':null,
    '12':null
}

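// Per-camera labels used in the notification caption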
let labels = {
    'XX':"Just a demo nbr XX",
    '51':"Just a demo nbr 51",
    '52':"Just a demo nbr 52",
    '41':"At the front entrance door",
    '42':"Mobile webcam",
    '31':"In front of the carport",
    '32':"In the carport",
    '22':"Around the washroom entrance",
    '21':"In the garden",
    '11':"Near the front entrance door",
    '12':"Walking towards the carport"
};

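// Per-camera identifiers of the Motion camera positions, used in the intruder message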
let cam_pos = {
    'XX':'MotionXX:detect_now1',
    '51':'Motion5:detect_now1',
    '52':'Motion5:detect_now2',
    '41':'Motion4:detect_now1',
    '42':'Motion4:detect_now2',
    '31':'Motion3:detect_now1',
    '32':'Motion3:detect_now2',
    '21':'Motion2:detect_now1',
    '22':'Motion2:detect_now2',
    '11':'Motion1:detect_now1',
    '12':'Motion1:detect_now2'
};

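// Timer callback: forget the stored detection area for a camera so that new detections can be reported again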
function f(cam) {
    ar[cam] = 0;
    context.set("ar", ar);
    tmrs[cam] = null;
    context.set("tmrs", tmrs);
//    node.warn("Timer triggered: "+cam);
}


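// Pick out the detections and the camera number from the MQTT topic (e.g. resized/22)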
let detections = msg.payload;
let cam = msg.topic.split('/')[1];
let score = 0;
let clss = '';
let tclss = '';
let h = 0;
let w = 0;
// Loop through all detections, keeping the best score for a 'person'
// that also fulfils the per-camera area and aspect ratio rules
for (let det in detections) {
    let sc = detections[det]['score'];
    clss = detections[det]['class'];
    w = detections[det]['bbox'][2];   // bbox is [x, y, width, height]
    h = detections[det]['bbox'][3];
    if (clss === 'person'){
        tclss = clss;
        if (w*h > min_area[cam] && w/h < ratios[cam]){
            if (sc > score){
                score = parseFloat(sc.toPrecision(2));
            }
        }
    }
}

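// Start a 60 s reset timer for this camera if none is pending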
if (tmrs[cam] == null) {
    tmrs[cam] = setTimeout( f, 60000, cam );
    context.set("tmrs", tmrs);
}


// Forward the event only if a person was detected above the per-camera
// score threshold and with a larger area than the last reported detection
// (camera 'XX' is a demo camera that always passes)
if ((tclss === 'person' && score > conf_lev[cam] && w*h > ar[cam]) || cam === 'XX'){
    ar[cam] = w*h;
    context.set("ar", ar);
    msg.payload = msg.image;   // send the annotated image as the payload
    delete msg.image;
    msg.filename = '/home/pi/pics/captured'+cam+'.jpg';
    let label = labels[cam];
    msg.caption = label+' '+tclss+' '+score;
    msg.itr = 'Intruder detected: '+cam_pos[cam];
    return msg;
}