Downlink error in v3 UI - message not arriving at gateway nor device

IIRC any ack should be in the next uplink. I don’t think the specification states an immediate uplink is required but I might be wrong. So if LMIC state is lost due to sleep before the next uplink it will not send the required ack.

Anyway, as we once again saw, confirmed downlinks are a source of issues so better to avoid them. As you stated, use application level logic instead to ‘confirm’ receipt of new settings. That way you can also indicate the settings are valid or signal the device couldn’t apply them due to invalid values.
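A minimal sketch of that application-level confirmation, in C. The payload layout (a settings version byte followed by a status byte), the status codes, the 60 s to 24 h validity range, and the names `settings_status_t`, `validate_interval` and `build_settings_ack` are all hypothetical, made up for illustration; none of them come from LMIC or TTS. The idea is that the device reports the outcome in its next ordinary unconfirmed uplink, so the gateway never has to transmit an ack.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes reported back to the application. */
typedef enum {
    SETTINGS_APPLIED = 0x00,  /* new settings accepted and in use */
    SETTINGS_INVALID = 0x01   /* settings rejected, old ones kept */
} settings_status_t;

/* Hypothetical validation rule: an uplink interval in seconds must be
 * between 60 s and 24 h to be accepted. */
settings_status_t validate_interval(uint32_t interval_s)
{
    if (interval_s < 60u || interval_s > 86400u) {
        return SETTINGS_INVALID;
    }
    return SETTINGS_APPLIED;
}

/* Build a 2-byte ack payload: [settings version, status].
 * Returns the number of bytes written, or 0 if the buffer is too small. */
size_t build_settings_ack(uint8_t *buf, size_t buf_len,
                          uint8_t settings_version,
                          settings_status_t status)
{
    if (buf == NULL || buf_len < 2u) {
        return 0;
    }
    buf[0] = settings_version;
    buf[1] = (uint8_t)status;
    return 2;
}
```

The resulting two bytes would simply ride along in the next scheduled uplink; the application server compares the version byte with the one it sent and knows whether the settings took.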

Very probably - so many moving parts - just admiring the Beelan source code which suggested it had added MAC command processing whereas in fact it’s still only a PR and only partially implemented - whilst I’m on a roll I can have a peek at LMIC to see what it thinks it’s doing & when.

@kersing, transpires that LMIC does trigger an uplink to confirm a downlink in v4. Took a while to trace so I’ll not do that with older versions.

I have created an ATmega4808 + RFM95 device specifically to trace MAC commands from TTS, as they do appear to be numerous, and I can put plenty of debug into LMIC to see what’s going on, but printf only works (easily) on AVR, not SAMD.

Don’t know how the OP got into a loop, but my client found the loop-da-loop by sending a confirmed reboot downlink. With the MKR WAN, which you know I love, you can’t be sure you’ve done a Tx. So if you get a confirmed downlink requesting it, you’d then have to send a confirmed uplink to clear the confirmed downlink, with the uplink being confirmed so you know that the downlink was confirmed (except that the MKR WAN doesn’t tell you an uplink was confirmed in current firmware), and we’ll end up with two Tx’s from the gateway in short order. What a mess!

Hi @kersing, for this particular type of device, there is no sleep time. I will be careful with devices that do go into DeepSleep. Thank you.

@descartes Thanks Nick. I upgraded my LMIC lib to 3.3.0. I am impressed you remembered DELAY_DNW1 :grinning:
enum { DELAY_DNW1 = 1 };

Tested downlinks with the gateway on a home broadband LAN connection: responding as expected.

DELAY_DNW1=1, RX1 Delay=1: tested downlinks with the V3 gateway on a GSM cellular connection, and they respond as expected (though a downlink is missed sometimes).

DELAY_DNW1=5, RX1 Delay=5: tested downlinks with the V3 gateway on a GSM cellular connection, and most of the downlinks were missed.

Nick, just to double confirm:

  1. Should the u1_t confirmed parameter be set to 1 for the uplink transmission that follows a confirmed downlink?

  2. Should I stick with LMIC lib version 3.3.0 and not 4.0.0?

Thank you very much.

I’m not sure I understand your results - are you saying that if you give the gateway longer to receive the downlink, it misses most of the downlinks?

Can you show the gateway & device web console for an uplink/downlink combo along with the gateway logging so we can see the actual timing?

It may be that there is some drift in the ESP32 that is causing the window to open late on 5 seconds but the drift isn’t sufficient to break the 1s delay. Ideally this needs fixing. Forum search will help.
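To put rough numbers on the drift idea: a clock that is off by a fixed fraction accumulates five times the error over a 5 s wait that it does over a 1 s wait. The sketch below is only arithmetic; the function names and the margin parameter are hypothetical, and the figures in the note that follows are invented for illustration, not measured on an ESP32.

```c
#include <stdint.h>

/* Worst-case timing error, in microseconds, accumulated over a delay
 * by a clock that is off by drift_ppm parts per million.
 * 1 ppm over 1 second is 1 microsecond. */
uint32_t drift_error_us(uint32_t delay_s, uint32_t drift_ppm)
{
    return delay_s * drift_ppm;
}

/* Returns 1 if the accumulated error still fits inside the margin by
 * which the receive window may open early or late, 0 otherwise.
 * margin_us is a hypothetical figure, not an LMIC constant. */
int window_still_hits(uint32_t delay_s, uint32_t drift_ppm,
                      uint32_t margin_us)
{
    return drift_error_us(delay_s, drift_ppm) <= margin_us;
}
```

With an assumed 4000 ppm error and an assumed 10 ms margin, `window_still_hits(1, 4000, 10000)` passes (4 ms of error) while `window_still_hits(5, 4000, 10000)` fails (20 ms of error), which is the pattern described here. MCCI LMIC provides `LMIC_setClockError()` to widen the receive window for an inaccurate clock; check the README of the version in use for the recommended values.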

No, that’s saying send a Confirmed Uplink and has nothing to do with what the Downlink status was. Sending a confirmed uplink isn’t really required with the newer firmware as it has Link Check built in so it will do that for you automatically every 64 uplinks (EU settings).

Anything that causes the gateway to transmit is BAD. Puppies may be harmed.

When??? Where did “when” come from? Please just don’t. Just no. Downlinks are the exception, confirmed downlinks are, as you’ve discovered, a whole world of hurt.

Just in case the message isn’t setting, this mummy cat :scream_cat: is screaming because her kitten :crying_cat_face: is crying because you sent a confirmed downlink.

Well, it’s all about making considered choices. I personally like to leave new versions running for a while on my test shelf - particularly if it’s a first number change. Right now I’m using 3.3.0 but I’m reading about the updates to the event system in 4 so I can start to migrate my code base so that I’m not doing it in a rush when the old event messages are deprecated.

Hi Nick,

Sorry for the confusion. This is what I did:

  1. I recompiled the device code, which uses the LMIC 3.3.0 lib, with DELAY_DNW1 set to 5.
  2. Uploaded the code to the device.
  3. In the TTN console, for this device, I set RX1 Delay to 5.

On the TTN Live Data page for this device, I can see the uplink message.
In the device logs, I can see EV_TXSTART, and then it paused for 5 seconds.

However, no downlink reaches the device (nothing shows in the device logs).

Does the Rx1 Delay indicate the time in seconds after which TTN should send the downlink once it has received an uplink message?

OK, understood on no more confirmed downlinks!
OK, understood on no need to set the u1_t confirmed parameter to 1.

Below are the screenshots of the Device and Gateway LiveData page from TTN console:

Device: [screenshot: TTN-V3-Device-LiveData]

Gateway: [screenshot: TTN-V3-Gateway-LiveData]

Seems like the gateway is sending the downlink to the device almost immediately after receiving the uplink?

Sort of, it’s the delay in the whole system: the network server sends the downlink to the gateway once it’s processed ‘stuff’, along with when to transmit it, so that the gateway handles the timing.

Again, sort of, it’s the NS logging that it sent the downlink straight away with a delay request of 1 second baked in.
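A rough way to picture the budget this implies, as a C sketch. It is a simplified model, not how TTS actually schedules downlinks: the function name, the split into four components, and every number in the note below are assumptions for illustration. The only point is that the uplink trip to the network server, the server's processing, the trip back, and whatever lead time the gateway needs must all fit inside the Rx1 delay, because the gateway has to transmit at a fixed moment.

```c
#include <stdint.h>

/* Returns 1 if a scheduled downlink can reach the gateway before it
 * has to be transmitted, 0 if it would arrive too late.
 * All parameters are in milliseconds and all are assumptions:
 *   rx1_delay_ms        time from end of uplink to the RX1 window
 *   uplink_latency_ms   gateway -> network server over the backhaul
 *   ns_processing_ms    time the network server spends before replying
 *   downlink_latency_ms network server -> gateway over the backhaul
 *   gw_margin_ms        lead time the gateway needs to queue the Tx */
int downlink_in_time(uint32_t rx1_delay_ms,
                     uint32_t uplink_latency_ms,
                     uint32_t ns_processing_ms,
                     uint32_t downlink_latency_ms,
                     uint32_t gw_margin_ms)
{
    uint32_t needed = uplink_latency_ms + ns_processing_ms
                    + downlink_latency_ms + gw_margin_ms;
    return needed <= rx1_delay_ms;
}
```

With made-up cellular figures of 400 ms each way, 300 ms of processing and a 100 ms margin, `downlink_in_time(1000, 400, 300, 400, 100)` returns 0 and `downlink_in_time(5000, 400, 300, 400, 100)` returns 1. Real values have to come from gateway logs and ping times.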

You need to check your device settings on the web console to make sure Rx1 is set to 5s. It may be easier to set up a new device so you can ensure it ‘takes’.

This is the device Rx1 Delay setting:

[screenshot: TTN-V3-Device-RX1-Delay]

OK, I haven’t circled back on this one but there are still some things to figure out with ABP and changing settings is one of them - work in progress as much as an issue as we need to apply some science to the process. I have ABP working on a variety of devices but I’ve lost track of it for LMIC.

Best thing is to create a new device. Couple of things to check:

The App/JoinEUI can’t be all zeros as suggested by the console as LMIC doesn’t cope with that at all - just make it all zeros with a one at the end.

If you need a ‘new’ EUI, this works for me: Random EUI or Key generator

Hi Nick,

When creating a new device, I generate a random DevEUI, but I have:
AppEUI = n/a
JoinEUI = none

During my initial tests, with the gateway connected to home broadband over LAN, downlinks work very well. It is when the GSM network is involved that downlinks just do not reach the device.

I performed a few tests with different DELAY_DNW1 and RX1 Delay values with both the home broadband LAN and the GSM cellular network. I also used the simple ttn-abp code (an example from the LMIC lib) for testing over the GSM network.

With the LMIC 3.3.0 lib, the gateway connected to the GSM cellular network, DELAY_DNW1=1 and RX1 Delay=1, the downlinks do not reach the device every time, but it is sufficient for performing the necessary initial setup for a new device.

It would help if, from the TTN Live Data page, we could find out whether a downlink has been received by the device, especially when we are not able to see the device logs. You mentioned LMIC will automatically confirm when it receives a downlink? Is there any way to see this from the TTN console? Thank you very much.

I think that clearly indicates the source of the problem - can you try a different provider &/or MiFi dongle? What is the speed & ping time?

If LMIC gets a confirmed downlink then it queues an empty uplink to ack for immediate transmission. Whilst you were thinking about confirmed transmissions, a teddy bear factory burnt down and orphans will no longer have birthday presents.

As I don’t do confirmed anything, I haven’t looked at the console. But this is irrelevant, your GSM connection appears to be the problem.

Hi Nick,

I believe you are correct. Somehow with the GSM network the downlink is unable to reach the device, maybe due to longer delays?

Is there something one can set on the device or gateway console side to cope with this type of delay?

Thank you very much.

When an uplink is sent, an EV_TXSTART event fires, followed by an EV_TXCOMPLETE event.

When the EV_TXCOMPLETE event is received, a non-zero LMIC.dataLen means there was a downlink.

Maybe the downlink was sent after the EV_TXCOMPLETE event?
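The check described above can be sketched as follows. To keep the example self-contained it uses a cut-down stand-in struct rather than LMIC's real global state: `lmic_like_t` and `copy_downlink` are hypothetical names, and only the field names (`frame`, `dataBeg`, `dataLen`) mirror LMIC. In a real sketch this logic would sit in `onEvent()` under `case EV_TXCOMPLETE:` and read the global `LMIC` directly. The LMIC example sketches label that event "includes waiting for RX windows", which suggests a downlink received in either window should already be in the buffer when the event fires.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Cut-down stand-in for the fields of LMIC's global state that matter
 * here. The field names follow LMIC, but this is NOT the real struct. */
typedef struct {
    uint8_t frame[64];  /* raw frame buffer                    */
    uint8_t dataBeg;    /* offset of the application payload   */
    uint8_t dataLen;    /* payload length, 0 means no downlink */
} lmic_like_t;

/* Call when EV_TXCOMPLETE fires. Copies any downlink payload into out
 * and returns its length; returns 0 when nothing was received or when
 * the payload does not fit in out. */
size_t copy_downlink(const lmic_like_t *s, uint8_t *out, size_t out_len)
{
    if (s->dataLen == 0 || s->dataLen > out_len) {
        return 0;
    }
    if ((size_t)s->dataBeg + s->dataLen > sizeof s->frame) {
        return 0;  /* inconsistent offsets, refuse to copy */
    }
    memcpy(out, &s->frame[s->dataBeg], s->dataLen);
    return s->dataLen;
}
```

Printing the returned length and bytes to the serial log on every EV_TXCOMPLETE gives a device-side record to line up against the gateway and console logs.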

What have you in mind? You could try stretching out the Rx1 delay to 10s or something.

Maybe.

Perhaps get some logging going on the gateway when it’s on GSM in parallel with the gateway & device web console and the device serial output (that’s four things) to see what happens & what matches up.

If (BIG IF) …

IF it is really because the downlink reaches the device too slowly … and …

IF Rx1 Delay specifies the number of seconds after an uplink is received and processed by TTN before a downlink is sent … and Rx1 Delay = 1 is already too slow … then I cannot set it to something less than 1 … :pensive:

In a way I am hoping it is caused by something else …

This IF is like some made up scheme and not LoRaWAN.

Rx1 is the delay from the end of the Tx. It does not relate to processing time - how would the node know how long it takes TTN to process?

https://www.thethingsnetwork.org/docs/lorawan/classes/#class-a
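As a worked example of that timing, here is a small C sketch. The type and function names are hypothetical; the relationship it encodes is the Class A rule that both windows are timed from the end of the uplink transmission, with RX2 opening one second after RX1. Network server processing time does not appear anywhere in it.

```c
#include <stdint.h>

/* In LoRaWAN Class A, RX2 opens one second after RX1. */
#define RX2_OFFSET_MS 1000u

typedef struct {
    uint32_t rx1_open_ms;  /* when the first receive window opens  */
    uint32_t rx2_open_ms;  /* when the second receive window opens */
} rx_windows_t;

/* Both windows are timed from the END of the uplink transmission.
 * tx_end_ms is the device's own timestamp for that moment. */
rx_windows_t rx_windows(uint32_t tx_end_ms, uint32_t rx1_delay_s)
{
    rx_windows_t w;
    w.rx1_open_ms = tx_end_ms + rx1_delay_s * 1000u;
    w.rx2_open_ms = w.rx1_open_ms + RX2_OFFSET_MS;
    return w;
}
```

With an Rx1 delay of 5, `rx_windows(0, 5)` gives windows opening 5000 ms and 6000 ms after the end of the Tx; with an Rx1 delay of 1 they open at 1000 ms and 2000 ms. The device and the network only agree if both sides use the same Rx1 delay.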

Rather than guessing and using up volunteer time, why not get some logs together?

Thank you Nick, will do that. :+1:

I can not ask a new question, so I will reuse this one.

I do not reliably get downlink messages. In fact, I would say it’s more likely to not work than work.

Uplink messages are received regularly at 3 minute intervals. Downlink messages are checked for five seconds after an uplink. I figure that should cover the two 1 second windows for downlink.

However 99% of the time they are missed.

With that said, I don’t see any documentation on the operation of the downlink queue. If you put a message/payload in the queue, does it get transmitted in 1s, 5s, or 15s? Does it get sent once? Twice?

What does the UI Control “confirm downlink checkbox” do? It does not seem to do anything different.