Yesterday morning my Mikrotik gateway went offline. I connected my phone to its WiFi AP and had access to the internet.
I checked all the settings and they seem fine to me. I rebooted it, but the gateway did not reconnect.
I subsequently reset it to factory defaults and reconfigured it. It is still not connecting to the TTN server, but I can access the internet via the AP and I can log in to the gateway from my router side (LAN).
There are multiple join accepts, i.e. TX in response to RX, in that traffic log, so I think your assessment is 100% correct, although the existence of the proprietary line there reminds me that this list shows CRC-error packets by default, i.e. noise that looks like LoRa. </pet peeve>
So the traffic screenshot shows a pile of downlinks which could only come from a network server that you are connected to.
Maybe your device (what is it, where is it?) just didn’t happen to pick up your gateway. Maybe something physical has changed: obstruction, interference, antenna not quite connected?
With the different options on the Mikrotik, does it show the gateway to server exchanges / traffic?
The screenshot could be less clipped, but those actually look to be receive records, i.e. transmissions sent by another gateway which have been received by this one.
Why not connect both gateways (one at a time, not both together) through a router on which you can run tcpdump or tshark, then look at the UDP traffic and compare? In particular, look to see if the cycle of push/pull and their corresponding acks is working.
Oh, incidentally, two UDP-based gateways behind the same NAT router is apparently a known issue for some NAT implementations… though that tends to affect the downstream side not the upstream, unless the packet forwarder gives up when it doesn’t get any acks.
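To make the push/pull ack check concrete, here is a rough Python sketch. It assumes you have already pulled the raw UDP payloads (typically port 1700 for TTN) out of a tcpdump/tshark capture into a list of bytes objects in capture order; how you export them is up to you. The function name `unacked_requests` is made up for illustration. It relies only on the 4-byte header of the Semtech UDP protocol: version, 2-byte token, identifier.

```python
# Identifier byte (offset 3) of each request, mapped to the identifier of its ack.
ACK_FOR = {0x00: 0x01,   # PUSH_DATA -> PUSH_ACK
           0x02: 0x04}   # PULL_DATA -> PULL_ACK
NAMES = {0x00: "PUSH_DATA", 0x02: "PULL_DATA"}


def unacked_requests(payloads):
    """Return (request type, token hex) for every PUSH_DATA / PULL_DATA
    that was never followed by an ack carrying the same token.

    `payloads` is a list of raw UDP payloads (bytes) in capture order,
    both directions mixed together (hypothetical input format).
    """
    pending = {}  # (expected ack identifier, token) -> request name
    for data in payloads:
        if len(data) < 4:
            continue  # too short to be a Semtech UDP packet
        ident, token = data[3], data[1:3]
        if ident in ACK_FOR:
            pending[(ACK_FOR[ident], token)] = NAMES[ident]
        else:
            # An ack clears the matching request; other packets are ignored.
            pending.pop((ident, token), None)
    return [(name, token.hex()) for (_, token), name in pending.items()]
```

If this returns a long list for the Mikrotik capture and an empty one for the other gateway, the upstream acks are not making it back, which would point at the network path or the server side rather than the radio.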
So, the gateway definitely does receive LoRa packets, and it definitely receives traffic from the TTN server, but somehow it does not connect the two together.
I connected the gateway where my other gateway was connected: same behavior. When I plug the Multitech gateway back in, it works.
Then I took it to a mate’s network and plugged it in. I can connect to the WiFi AP and then to the net, but I can’t connect from the LAN to the gateway, although I can ping it and see it on DHCP.
I retested it four times, then could connect with WinBox via the LAN.
So somewhere in the firmware there is some issue.
In my world [of commercial operations at scale], if a working system fails then a technician would remove the failed unit and replace it with an identically configured spare unit.
If this restores service then it’s a simple unit failure and the failed unit is returned to quarantine for repair, maintenance-contract replacement or scrap.
If this does not restore service then it’s a wider system-environment failure and that is escalated to an engineer.
All the work would be managed by the operations supervisor through the Maintenance Management System (MMS).
… which is a very nice description of sitting around waiting for someone else to fix the problem.
Meanwhile… downtime.
Do you have an actual suggestion for getting this user’s gateway back online? You might note I’ve made concrete suggestions for understanding the issue faced by the existing software, and not just jumped to the idea of replacing it - in fact it was Johann, not I, who blamed the incumbent software, my point is as much that the closedness of components makes understanding the problem challenging, not that I’m ready to conclude the factory firmware is actually responsible for the failure…
Unfortunately I am not in an industrial or telco setting where I can replace the unit and diagnose the problem later, so I need to figure it out and fix it.
I’m not saying it doesn’t, but could you be specific about exactly what piece of evidence or test leads you to conclude it is getting responses back from TTN servers?
Unless I’m mistaken, the leftmost “RX” column on the “Join Accept” messages in the earlier log means these are things it received over the air, not items queued into it by TTN which it transmitted over the air.
If there is UDP traffic going both ways, I’d be really curious to pull it apart. Maybe some condition changed causing one end to format things in a way the other can’t handle, or maybe it is just too spotty with dropped packets.
Does Mikrotik speak the ordinary Semtech UDP protocol, or a custom variation?
The ordinary protocol is documented in the Semtech packet forwarder repo and it’s pretty easy to separate the binary header from the textual JSON remainder. Even strings would pull out the textual part, but I’d want to take a careful look to make sure the binary headers were conformant and properly matched tokens in both directions, etc.
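For anyone wanting to try that split, a minimal Python sketch follows. It follows the header layout of the ordinary Semtech UDP protocol (version byte, 2-byte token, identifier byte, then an 8-byte gateway EUI on PUSH_DATA, PULL_DATA and TX_ACK, then optional JSON). Whether the Mikrotik forwarder sticks to that layout is exactly the open question, so treat a parse failure as a clue rather than a bug. The name `parse_packet` and the returned dict keys are my own invention.

```python
import json

PACKET_TYPES = {0x00: "PUSH_DATA", 0x01: "PUSH_ACK", 0x02: "PULL_DATA",
                0x03: "PULL_RESP", 0x04: "PULL_ACK", 0x05: "TX_ACK"}
# Packet types sent by the gateway that carry its 8-byte EUI after the header.
HAS_EUI = {"PUSH_DATA", "PULL_DATA", "TX_ACK"}


def parse_packet(data):
    """Split one raw UDP payload into header fields and the JSON remainder."""
    if len(data) < 4:
        raise ValueError("shorter than the 4-byte header")
    kind = PACKET_TYPES.get(data[3])
    if kind is None:
        raise ValueError("unknown identifier byte 0x%02x" % data[3])
    offset, eui = 4, None
    if kind in HAS_EUI:
        if len(data) < 12:
            raise ValueError("%s is missing the gateway EUI" % kind)
        eui = data[4:12].hex().upper()
        offset = 12
    body = data[offset:]
    return {
        "version": data[0],
        "token": data[1:3].hex(),
        "type": kind,
        "eui": eui,
        # json.loads raises ValueError if the textual part is not valid JSON.
        "json": json.loads(body.decode("utf-8")) if body else None,
    }
```

Running every captured payload through this and printing type, token and EUI side by side for the two gateways should make any non-conformant header or mismatched token stand out quickly.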
Also worth wondering if something is going on with your gateway EUI… if it is derived from a network interface MAC could it have changed? Could someone have, er, “stolen” it by accident for theirs? You’ll see the EUI in the raw UDP, too.
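As a quick way to check the EUI theory: one common convention (not necessarily what Mikrotik does, that part is an assumption) builds the 8-byte gateway EUI by inserting FF FE into the middle of the 6-byte interface MAC, without flipping any bits. A small Python sketch, with a made-up helper name, to compute what that would give for a MAC so you can compare it with the EUI seen in the raw UDP and the one registered in the console:

```python
def mac_to_eui64(mac):
    """Derive an EUI-64 from a MAC by inserting FFFE in the middle.

    This is only one common convention; the gateway firmware may use a
    different scheme, so a mismatch is not proof of a problem by itself.
    """
    raw = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(raw) != 6:
        raise ValueError("expected a 6-byte MAC address")
    return (raw[:3] + b"\xff\xfe" + raw[3:]).hex().upper()
```

If the value in the UDP header no longer matches the EUI you registered, that alone would explain traffic arriving at the server and never reaching your application.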
These Join-Accept packets were what led me to believe there is traffic from TTN to the gateway.
If the left column that’s cropped off now says “TX” I’d entirely agree.
But in your previous post from an earlier run, it clearly said “RX”.
(Granted, I can’t rule out the possibility of a web GUI bug mislabeling everything as “RX”)
But then it’s possible to note that those downlinks don’t seem to be time-correlated to any uplinks received by this gateway which could have triggered them, so I’m still leaning towards this gateway intercepting the transmissions of another gateway.
Kinda odd that traffic isn’t sorted by timestamp, either. But even looking through that I’m not seeing uplink/downlink pairings, except perhaps with something clipped off.
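The time-correlation check can be done mechanically once the log is typed out. A Python sketch, assuming you have the Join Request and Join Accept timestamps as plain seconds (that input format is hypothetical, the screenshot would need transcribing). It uses the LoRaWAN default join accept delays of 5 s (RX1) and 6 s (RX2); the 0.5 s tolerance is a guess to cover airtime and log timestamp granularity.

```python
# LoRaWAN default delays between a Join Request and its Join Accept.
JOIN_ACCEPT_DELAYS = (5.0, 6.0)  # RX1, RX2, in seconds


def uncorrelated_join_accepts(join_requests, join_accepts, tolerance=0.5):
    """Return the Join Accept timestamps that do not sit 5 s or 6 s
    (within `tolerance`) after any Join Request this gateway logged.

    Both arguments are lists of timestamps in seconds (hypothetical format).
    """
    orphans = []
    for t_acc in join_accepts:
        matched = any(
            abs((t_acc - t_req) - delay) <= tolerance
            for t_req in join_requests
            for delay in JOIN_ACCEPT_DELAYS
        )
        if not matched:
            orphans.append(t_acc)
    return orphans
```

Join Accepts that come back as orphans were most likely triggered by uplinks some other gateway heard, which fits the theory that these are overheard transmissions rather than this gateway's own downlinks.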
The Dev Addr confirms that my node is transmitting and being received by the gateway.
I don’t see the packet in my application.
From where my node is, it can’t see any other gateway, but my gateway does pick up other nodes.
This afternoon I moved my Multitech gateway to where my Mikrotik is, and it does see my node and forwards the packet to the application.
As can be seen, the RSSI is around -70, so it is not a signal level issue.