I think this could be a very relevant & interesting starting point for you.
Adrian at Pyimagesearch is extremely knowledgeable about this topic. Even though the focus is on detecting objects in pictures, it might be of great interest.
Some example scenarios I can imagine:
For 2 hours no tap water has been consumed and no motion has been registered by the indoor PIRs, your cars are all parked in the garage, all doors are locked, and the alarm system is fully armed -> the house is empty, you are all away from the house -> put the house in empty mode
When the house is in empty mode and someone comes walking up to the front door, the camera video image analytics finds and identifies your daughter (coming home from school), the alarm system gets disarmed, the front door unlocks, the lights in the entrance hall are turned on if it is dark outside, music starts playing...
Except for the face recognition analysis (Adrian also has an article about this) required to identify your daughter from the video, there is no need for neural networks. Scenarios like this can already be designed and realized today with Node-RED, using just logic and available nodes (and running the deep neural network analysis as separate Python services).
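To make the "just logic" point concrete, here is a minimal sketch of the two scenarios as plain Python rules. It is not a Node-RED flow, and every name in it (`HouseState`, `is_house_empty`, `actions_on_arrival`, the action strings) is made up for illustration. It assumes the sensor values have already been collected elsewhere and that a separate face recognition service has delivered a name, or nothing, for the person at the door.

```python
from dataclasses import dataclass
from typing import List, Optional, Set

# Two hours of inactivity, as in the scenario above.
EMPTY_AFTER_SECONDS = 2 * 60 * 60


@dataclass
class HouseState:
    """Hypothetical snapshot of the sensor readings used by the rules."""
    seconds_since_water_use: float
    seconds_since_pir_motion: float
    cars_in_garage: bool
    doors_locked: bool
    alarm_armed: bool


def is_house_empty(state: HouseState) -> bool:
    """Return True when every condition of the 'empty mode' scenario holds."""
    quiet_long_enough = (
        state.seconds_since_water_use >= EMPTY_AFTER_SECONDS
        and state.seconds_since_pir_motion >= EMPTY_AFTER_SECONDS
    )
    return (
        quiet_long_enough
        and state.cars_in_garage
        and state.doors_locked
        and state.alarm_armed
    )


def actions_on_arrival(
    empty_mode: bool,
    recognized: Optional[str],
    trusted: Set[str],
    dark_outside: bool,
) -> List[str]:
    """List the actions to trigger when someone is seen at the front door.

    `recognized` is the name reported by the (separate) face recognition
    service, or None if nobody was identified.
    """
    if not empty_mode or recognized not in trusted:
        return []
    actions = ["disarm_alarm", "unlock_front_door"]
    if dark_outside:
        actions.append("entrance_lights_on")
    actions.append("start_music")
    return actions
```

In Node-RED the same rules would typically live in a function node or a few switch nodes, with the action names replaced by messages to the alarm, lock, light and music nodes.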
My personal experience with video analytics is that it requires reasonable computing power to run the analysis locally. In my attempts I have, in this order, used AWS Rekognition -> local analysis on RPi3s -> finally settled on an outdated Lenovo laptop running Debian and doing the analysis locally.
If you are looking for speed, forget the RPi3: it takes roughly 7 seconds to analyze an image that AWS or my Lenovo handles in less than a second. The Movidius stick is also much faster than the RPi3, but I have no experience of it myself. One thing is clear: the future will surely bring faster "things", and the algorithms will be further optimized and refined.
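If you want to compare hardware yourself, a rough way is to time the same analysis call on each machine. The sketch below is only an assumption of how that could be done: `time_analysis` is a made-up helper, and `analyze` stands for whatever function runs your detection or recognition on one image.

```python
import time
from statistics import median
from typing import Any, Callable


def time_analysis(analyze: Callable[[Any], Any], image: Any, runs: int = 5) -> float:
    """Run `analyze(image)` several times and return the median time in seconds.

    The median is used so that a single slow run (for example the first one,
    when a model is still being loaded) does not dominate the result.
    """
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        analyze(image)
        durations.append(time.perf_counter() - start)
    return median(durations)
```

Running it with the same image and the same model on, say, an RPi3 and a laptop gives numbers that can be compared directly.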