Monitoring if sensors are regularly sending data

cortlieb · August 5, 2024, 4:01pm

Dear experts,

I'm not sure if this is the correct category, or even if this is the right forum.
But I think people here are practitioners, and my question is a kind of best-practices question.

I’m monitoring some LoRaWAN sensors.
Besides the actual sensor readings, I also want to monitor whether the sensors are regularly sending new values. As far as I know, TTN does not provide such a metric.

Since I know the interval at which the sensors are supposed to send data, my idea was to count the sensor entries in the database for a given time frame.
More concretely: when a sensor is supposed to send a new value every 5 minutes, I expect 12 sensor values per hour.
My stack is TTN → Telegraf → InfluxDB → Grafana.

The approach described above (implemented as a query in Grafana) works more or less, but has some shortcomings.
What is your approach to monitoring your sensors and triggering an alarm (at least a visual one) when a sensor is no longer sending at its expected interval?

Thanks,
Christian

kersing · August 5, 2024, 4:38pm

Do keep in mind that RF technology like LoRaWAN does not guarantee that every uplink will be received and forwarded. In fact, I can guarantee you will have at least 2% and up to 25% loss even for perfectly placed end devices, and more for those located at the limit of the RF range of the available gateways. So your check should allow for packet loss.
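
One way to build that tolerance into a count-based check is to lower the expected count by the loss you are willing to accept. A minimal Python sketch; the function names and the 25% default are illustrative assumptions, not something TTN, Telegraf or Grafana provides:

```python
import math


def min_expected_uplinks(window_s: float, interval_s: float, max_loss: float) -> int:
    """Smallest uplink count in the window that still counts as healthy."""
    expected = window_s / interval_s  # e.g. 3600 / 300 = 12 uplinks per hour
    return math.floor(expected * (1.0 - max_loss))


def is_healthy(received: int, window_s: float, interval_s: float,
               max_loss: float = 0.25) -> bool:
    """True if the device delivered enough uplinks given the tolerated loss."""
    return received >= min_expected_uplinks(window_s, interval_s, max_loss)
```

With a 5 minute interval and a one hour window, this accepts 9 or more uplinks out of the nominal 12 when up to 25% loss is tolerated.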

descartes · August 5, 2024, 7:06pm

The query is simple enough if you have the expected uplink interval, a multiplier to allow for the packet loss described above (say 3), and the date and time of the last uplink received.

And you can have levels - so orange for 3 and red for 5.
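
As a rough Python sketch of that check with two levels: the function name, the "green" state and the parameter names are made up for illustration, and the multipliers 3 and 5 are simply the ones mentioned above.

```python
from datetime import datetime, timedelta, timezone


def alert_level(last_seen: datetime, interval: timedelta, now: datetime,
                warn_missed: int = 3, crit_missed: int = 5) -> str:
    """Classify a device by how long it has been silent.

    A device is "orange" once it has been silent for more than
    warn_missed intervals and "red" after more than crit_missed intervals.
    """
    silent_for = now - last_seen
    if silent_for > interval * crit_missed:
        return "red"
    if silent_for > interval * warn_missed:
        return "orange"
    return "green"
```

Usage would look something like `alert_level(last_seen, timedelta(minutes=5), datetime.now(timezone.utc))`, with `last_seen` taken from the newest row for that device.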

Or look at prior history if some devices are prone to patchy reception due to environmental conditions such as the refuse truck. Hold a running average for a day and then query anything overdue to see what its latest average is.
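
One possible reading of the running-average idea, sketched in Python: compare the current silence of a device against its own average gap between uplinks over the last day, rather than against the nominal interval. The function name, the factor of 3 and the choice to flag devices with too little history are all assumptions.

```python
from statistics import mean


def is_overdue(timestamps, now, factor=3.0, window_s=86_400):
    """Return True if the device has been silent for unusually long.

    timestamps: uplink receive times in Unix epoch seconds.
    The device's own mean gap over the last window_s seconds is the
    baseline, so devices with patchy reception get more leeway.
    """
    recent = sorted(t for t in timestamps if now - window_s <= t <= now)
    if len(recent) < 2:
        # Not enough history to compute a gap: flag it for a closer look.
        return True
    avg_gap = mean(b - a for a, b in zip(recent, recent[1:]))
    return (now - recent[-1]) > factor * avg_gap
```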

You can then send an email, which can in turn trigger a text message or some such. As for a visual alert, you’ll need something that can be in touch with the database to know when to activate and what colour to display!

cortlieb · August 5, 2024, 7:20pm

Thank you for your replies!

OK, that is a wide range (2% … 25%). I guess ultimately one has to look at RF data (RSSI, SNR) to decide whether losses are caused by the RF connection or by a damaged sensor?

I’m not sure I understand the remark about the last uplink received time. What would you do with it? Simply display it to see whether it was a long time ago?

kersing · August 5, 2024, 7:57pm

RF data can vary quickly if someone puts something that shields RF between the device and the gateway(s). Lorries, crates, metal boxes: any of them in the wrong place might cause a device's RF signal to fall below the threshold where it can be received.
Not to mention a failing gateway, a failing internet connection at the gateway, or weather conditions such as lightning, which might damage both gateway and device, or rain, which will cause a drop in signal strength.

That’s why having multiple redundant gateways at different locations within range of the device is important for the best chance of getting data. (Don’t forget to monitor the functioning of the gateways; a forum search should provide information on the subject, as the ‘how to’ question pops up regularly.)

descartes · August 6, 2024, 12:00am

Not quite - the suggestion was to use that as part of the calculation:

IF

( Current time - Last Recd time ) > (Uplink interval * number of uplinks you are prepared to lose before sounding the alert )

THEN sound the horn.

You can do it in a SQL query so it can check all devices in one hit, even to the extent of having it work for uplink interval & lost uplinks on a device by device basis.
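
To make that concrete, here is a hedged sketch using Python's built-in `sqlite3` as a stand-in database. The `devices` and `uplinks` tables, their columns and the function name are hypothetical; with InfluxDB behind Grafana the query language and schema will differ, but the shape of the check is the same.

```python
import sqlite3

# Hypothetical schema: one row per device with its own interval and
# tolerance, and one row per received uplink (Unix epoch seconds).
SCHEMA = """
CREATE TABLE devices (
    device_id  TEXT PRIMARY KEY,
    interval_s INTEGER NOT NULL,
    max_missed INTEGER NOT NULL
);
CREATE TABLE uplinks (
    device_id   TEXT NOT NULL,
    received_at INTEGER NOT NULL
);
"""

# Devices whose silence exceeds interval * tolerated missed uplinks.
# The LEFT JOIN also reports devices that have never sent anything.
OVERDUE_SQL = """
SELECT d.device_id, :now - MAX(u.received_at) AS silent_s
FROM devices AS d
LEFT JOIN uplinks AS u ON u.device_id = d.device_id
GROUP BY d.device_id, d.interval_s, d.max_missed
HAVING MAX(u.received_at) IS NULL
    OR :now - MAX(u.received_at) > d.interval_s * d.max_missed
ORDER BY d.device_id
"""


def overdue_devices(conn: sqlite3.Connection, now: int):
    """Return (device_id, seconds_silent) for every overdue device."""
    return conn.execute(OVERDUE_SQL, {"now": now}).fetchall()
```

Devices that have never sent an uplink come back with `None` as the silent time.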

Looking at RSSI & SNR and their min & max will inform you of devices that are more vulnerable to radio issues - be it periodic transmissions from other devices in the ISM band or a dumper truck parked up next to it. If you have some rampant squirrels or insane seagulls you may end up with a damaged sensor, but generally they only stop working because they’ve run out of batteries or someone has decided that they are ‘suspicious’ and used a hammer to disable them.
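
A small Python sketch of such a per-device RSSI/SNR summary, assuming the uplink metadata has already been pulled out of the database as `(device_id, rssi_dbm, snr_db)` tuples; that tuple layout and the function name are assumptions for illustration.

```python
from collections import defaultdict


def rf_summary(uplinks):
    """Summarise min and max RSSI/SNR per device.

    uplinks: iterable of (device_id, rssi_dbm, snr_db) tuples.
    Returns {device_id: {"rssi_min", "rssi_max", "snr_min", "snr_max"}}.
    """
    rssi = defaultdict(list)
    snr = defaultdict(list)
    for device_id, rssi_dbm, snr_db in uplinks:
        rssi[device_id].append(rssi_dbm)
        snr[device_id].append(snr_db)
    return {
        dev: {
            "rssi_min": min(rssi[dev]),
            "rssi_max": max(rssi[dev]),
            "snr_min": min(snr[dev]),
            "snr_max": max(snr[dev]),
        }
        for dev in rssi
    }
```

A wide spread between minimum and maximum, or a minimum that sits well below that of neighbouring devices, would be a hint that the device is one of the more vulnerable ones.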

Fundamentally you have to try stuff out and refine it to suit your situation - there isn’t a golden folder of “do this” - put in a report that’s very paranoid, look at the devices to see what’s going on and set alarms to give some leeway for reality.

Also remember to take a replacement device with you when going to find out what’s happened - because travel time is far more expensive than the cost of proactively replacing a sensor - if it turns out to be OK you can reuse it for the next road trip.

cortlieb · August 6, 2024, 6:16am

Thank you both for your suggestions.
I will dig a bit deeper to see how and where to apply the calculations.

Not sure which post I should mark as solution, I think I will pick the last one.

Thanks
Christian
