I had some time to attach a UART console to the TTIG.
Yesterday at around 7:30 pm I booted the node and the TTIG. In the first few minutes I sent a few dummy downlink messages to verify that the downlink is working. The node sends an uplink message every 5 minutes.
After 64 uplink messages the node forced an ADR request which was sent through the TTIG and successfully processed. I can see in the logs that the Node switched to SF7.
After the next 64 uplink messages the node again forced an ADR request which was sent by TTN but never received on the node and thus TTN repeated the ADR until I woke up in the morning.
Now it gets pretty strange and I’m happy for any idea that could shed some light on this.
The following log of the TTIG shows a dummy message I sent yesterday to check if downlink works:
Uplink: 407E2301268003000117E557EABB56F570042D
Dowlink: 607E2301268501000341FF0001BBF92DB0390B
2019-12-04 18:31:27.052 [any:INFO] HW pipe: PowerFolder node: [1337]-[F7DQCVTLdrGprAbCfys1]
2019-12-04 18:31:33.444 [SYN:INFO] MCU/SX1301 drift stats: min: 0.0ppm q50: 1.9ppm q80: 2.9ppm max: 3.3ppm - threshold q90: 2.9ppm
2019-12-04 18:31:33.449 [SYN:INFO] Avg MCU drift vs SX1301#0: 1.0ppm
2019-12-04 18:31:42.590 [S2E:VERB] RX 868.3MHz DR2 SF10/BW125 snr=10.5 rssi=-10 xtime=0x7F0000394564DC - updf mhdr=40 DevAddr=2601237E FCtrl=80 FCnt=3 FOpts= 0117E557…BB56 mic=755265781 (19 bytes)
2019-12-04 18:31:43.855 [S2E:DEBU] ::0 diid=6678 [ant#0] - next TX start ahead by 716ms365us
2019-12-04 18:31:44.562 [S2E:VERB] ::0 diid=6678 [ant#0] - starting TX in 19ms903us
2019-12-04 18:31:44.577 [S2E:INFO] TX ::0 diid=6678 [ant#0] - dntxed: 869.525MHz 27.0dBm ant#0(0) DR3 SF9/BW125 frame=607E2301268501000341FF00…2DB0390B
2019-12-04 18:31:44.736 [S2E:DEBU] Tx done diid=6678
The log looks sane and the message was received.
This morning I check what happened with the repeated ADR request downlinks that are not received anymore:
Uplink: 407E23012680A00001956AC0556A50E43CFAAC
Downlink: 607E2301268531000355070001E3CC1762
2019-12-05 07:48:19.726 [SYN:VERB] Time sync rejected: quality=1070 threshold=1059
2019-12-05 07:48:23.932 [SYN:INFO] Time sync qualities: min=1049 q90=1069 max=1445 (previous q90=1059)
2019-12-05 07:48:24.654 [S2E:VERB] RX 868.1MHz DR0 SF12/BW125 snr=10.2 rssi=-25 xtime=0xC2000ADA06D8C4 - updf mhdr=40 DevAddr=2601237E FCtrl=80 FCnt=160 FOpts= 01956AC0…6A50 mic=-1392886556 (19 bytes)
2019-12-05 07:48:46.238 [S2E:VERB] ::0 diid=43454 [ant#0] - starting TX in 19ms822us
2019-12-05 07:48:46.254 [S2E:INFO] TX ::0 diid=43454 [ant#0] - dntxed: 869.525MHz 27.0dBm ant#0(0) DR3 SF9/BW125 frame=607E230126A5230003550700…FA35372F
2019-12-05 07:48:46.413 [S2E:DEBU] Tx done diid=43454
2019-12-05 07:48:46.415 [S2E:DEBU] ::0 diid=43695 [ant#0] - next TX start ahead by 5m4s
2019-12-05 07:48:47.061 [SYN:VERB] Time sync rejected: quality=2068 threshold=1069
Look at the timestamps when the downlink is sent. It is sent 21.6 seconds after the uplink packet is received. If you look at the downlink frame payload you can see that it does not match the downlink TTN sent. The frame counter value of the TTN downlink is 49 whereas the value of the frame that is sent is 35.
It also schedules a downlink for diid 43695 5 minutes after diid 43454. This is also suspicious. How does it know that the node will send again in 5 minutes?
By checking the MIC I can confirm that both payloads are for the current node. The missing byte replaced by ‘…’ is 01. It looks like the TTIG completely scrambled the downlink packets and the downlink times.
Here’s another example:
Uplink: 407E23012680A20001B902A47B94F34C88D329
Downlink: 607E2301268533000355070001D77A767F
2019-12-05 07:58:34.658 [SYN:INFO] Time sync qualities: min=1046 q90=1070 max=1602 (previous q90=1059)
2019-12-05 07:58:35.543 [SYS:DEBU] Free Heap: 18824 (min=15488) wifi=5 mh=7 cups=8 tc=4
2019-12-05 07:58:35.574 [S2E:VERB] RX 868.3MHz DR0 SF12/BW125 snr=8.5 rssi=-24 xtime=0xC2000AFE70C3FC - updf mhdr=40 DevAddr=2601237E FCtrl=80 FCnt=162 FOpts= 01B902A4…94F3 mic=701728844 (19 bytes)
2019-12-05 07:58:56.052 [S2E:VERB] ::0 diid=43965 [ant#0] - starting TX in 19ms689us
2019-12-05 07:58:56.068 [S2E:INFO] TX ::0 diid=43965 [ant#0] - dntxed: 869.525MHz 27.0dBm ant#0(0) DR3 SF9/BW125 frame=607E23012685250003550700…6C009B58
2019-12-05 07:58:56.227 [S2E:DEBU] Tx done diid=43965
2019-12-05 07:58:56.229 [S2E:DEBU] ::0 diid=44211 [ant#0] - next TX start ahead by 5m5s
I also found the following lines in the log:
2019-12-05 07:59:07.241 [SYN:INFO] MCU/SX1301 drift stats: min: 0.5ppm q50: 1.9ppm q80: 2.9ppm max: 5.2ppm - threshold q90: 3.3ppm
2019-12-05 07:59:07.247 [SYN:INFO] Avg MCU drift vs SX1301#0: 1.0ppm
2019-12-05 07:59:09.352 [SYN:ERRO] Repeated excessive clock drifts between MCU/SX1301#0 (3 retries): 3.6ppm (threshold 3.3ppm)
From the uplink times I think the general timing is okay. I’m not sure what exactly is going wrong here. I will keep looking into it.
Thanks!