TTN Backend reliability boost

EFthings · October 15, 2019, 10:10pm

We are about to set up a larger LoRa network based on TTN. Over the past few weeks we have monitored the packet loss on our test bench. These are the hourly packet losses:

[Image: Packetloss]

The issue had been discussed before, but I have a different question.

One option we are considering is to move to a (paid) private network, which could be a solution.

BUT: With a private network, nobody can benefit from our gateways (and vice versa). We generally like the idea of a free and public network, so moving all commercial projects out of TTN might not be the right way.

So we would rather support TTN in a way that makes the network more reliable in the future. As you can see, the loss rate is relatively low most of the time. It seems that the whole network is sometimes down for 30 minutes or more.

So, here are my questions:

a) Does anybody know the reasons for the longer downtimes of the backend?
b) How can we support the network?

Best regards, Eckehard

BoRRoZ · October 15, 2019, 10:13pm

may I suggest TTN’s SLACK #OPS channel … that’s where you will probably find a faster answer

cslorabox · October 16, 2019, 12:00am

This is exactly why TTN needs to support sharing of gateways to which it will not have exclusive access, by implementing a scheme to indicate when a gateway is (as a result of obligations to its owner’s network server) unable to handle a transmit request and TTN should instead assign it to any other gateway that might be in range.

People such as you who invest in putting up gateways because they need their data can’t risk losing it to failures of something not under their control - you’d like to donate your spare capacity to the community, but neither the current architecture nor the grand “someday” plans of high level peering really support doing that, while also meeting the needs that are funding your gateway deployments.

netmonk · October 16, 2019, 4:48am

Is it possible to get a valid invitation URL for Slack? The one I found seems to have expired. I cannot join the TTN Slack.

BoRRoZ · October 16, 2019, 7:58am

thethingsnetwork.slack.com

kersing · October 16, 2019, 4:52pm

To join the Slack workspace, log in to the account server and use the ‘join us on slack’ link on the right under the profile picture.

EFthings01 · October 16, 2019, 5:44pm

Currently the Gateways are not the bottleneck.

cslorabox · October 16, 2019, 5:51pm

No, and that was not the argument. The argument was that TTN’s inability to share access to gateways prevents people from contributing to TTN while also deploying their own server to work around its infrastructure issues in order to be able to meet the needs that fund the installation of those gateways.

BoRRoZ · October 16, 2019, 6:35pm

a) longer downtimes of the FREE backend ?
b) many companies and cities are placing public gateways at the moment

for more info about support and private networks contact https://www.thethingsindustries.com/

netmonk · October 16, 2019, 7:06pm

Thank you very much, today it works. But I tried several times over the last few days, and the link was no longer valid.

EFthings01 · October 17, 2019, 4:01pm

Maybe downtime is not the right word, but sometimes the backend does not receive packets over UDP for about 30 minutes, and we see such “drops” on multiple connections at the same time. This does not happen very often, only once or twice per day. The rest of the time only a few packets are lost.

We have moved to a private account, and packet losses are nearly gone. But it would be better to have a reliable public network.
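To tell a backend-wide drop apart from a single node going quiet, one can merge the arrival times of all devices and look for windows in which nothing arrived at all. The following is only a minimal sketch of that idea: the function name `find_common_gaps`, the input layout (a dict of device ID to arrival timestamps) and the 20-minute threshold are assumptions made for illustration, not anything provided by TTN.

```python
from datetime import datetime, timedelta


def find_common_gaps(arrivals_by_device, min_gap=timedelta(minutes=20)):
    """Return (start, end) windows in which no device delivered a packet.

    arrivals_by_device: hypothetical dict mapping a device ID to a list of
    datetime objects, one per uplink that reached the application.
    min_gap: assumed threshold; silences shorter than this are ignored.
    """
    # Merge every device's arrivals into one sorted timeline. A silence in
    # this merged timeline means all devices were silent at the same time.
    merged = sorted(t for times in arrivals_by_device.values() for t in times)

    gaps = []
    for previous, current in zip(merged, merged[1:]):
        if current - previous >= min_gap:
            gaps.append((previous, current))
    return gaps
```

A gap that shows up here while each device was transmitting on schedule points at something shared (backend, gateway or backhaul) rather than at one node. It cannot say which of those shared parts failed.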

BoRRoZ · October 17, 2019, 4:38pm

of course I fully agree… but it’s permanently ‘under construction’ … V3 is now on the horizon with new and more secure possibilities, and very soon TTN will have 10k gateways around the world!

EFthings01 · October 22, 2019, 8:21am

Where to look for more information about “V3”?

As long as suppliers like Kerlink do not support the new protocols, things will not get better, I suppose? It would be very helpful to get more information on which gateways support the current protocols.

There is a list of commercial gateways that should be updated. Some gateways like the Kerlink iFemtocell only support the old Semtech SPF, which has a lot of quirks and flaws. When we started, it looked like there would not be any problem, but we found that the SPF currently loses up to 25% of the packets when the backend is busy. The new Gateway protocol is much more reliable.

BoRRoZ · October 22, 2019, 9:02am

cslorabox · October 22, 2019, 6:22pm

The basic rule of thumb would be not to buy a gateway for which building the entire stack from source is not known to be a workable option.

Are you sure that isn’t possible with that particular Kerlink?

To some extent, even gateways somehow locked into an old protocol may be able to be modernized by putting them behind a local translator, preferably inside any firewall/NAT (or co-located with it).

For example, in LoRaServer’s (non-TTN) ecosystem, packet forwarders still typically speak the legacy protocol, but only as far as the “gateway bridge” translator, which ideally runs on the gateway itself. Do that, and you remove most of the common causes of data loss. Conversely, if you run that gateway bridge in the cloud (perhaps co-located with the server), you suffer most of the issues with the old protocol, except perhaps the UDP server overloading that TTN reportedly has issues with. Putting the bridge somewhere in between, i.e. between the gateway and the ISP, may also be a workable path: it lets you run the bridge on a more easily serviced box while still keeping the legacy protocol very local.
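To make the "local translator" idea a bit more concrete, here is a rough sketch of the first thing such a bridge has to do: accept an uplink datagram from the legacy packet forwarder and acknowledge it. The byte layout used (version byte, 2-byte token, type byte, 8-byte gateway EUI, then JSON) is how I understand the legacy Semtech UDP protocol, so treat it as an assumption and check it against Semtech's protocol description. The function names are made up for this example and are not taken from LoRaServer's gateway bridge.

```python
import json

# Message type identifiers as I understand the legacy Semtech UDP protocol.
PUSH_DATA = 0x00
PUSH_ACK = 0x01
HEADER_LEN = 12  # 1 version + 2 token + 1 type + 8 gateway EUI


def parse_push_data(datagram: bytes) -> dict:
    """Split a PUSH_DATA datagram into header fields and its JSON payload."""
    if len(datagram) < HEADER_LEN:
        raise ValueError("datagram shorter than the 12-byte header")
    if datagram[3] != PUSH_DATA:
        raise ValueError("not a PUSH_DATA message")
    return {
        "version": datagram[0],
        "token": datagram[1:3],              # echoed back in the ACK
        "gateway_eui": datagram[4:12].hex(),
        "payload": json.loads(datagram[HEADER_LEN:].decode("utf-8")),
    }


def build_push_ack(message: dict) -> bytes:
    """Build the 4-byte PUSH_ACK answering a parsed PUSH_DATA message."""
    return bytes([message["version"]]) + message["token"] + bytes([PUSH_ACK])
```

A bridge running on the gateway (or just behind it) would receive these datagrams on a local UDP socket, send the ACK straight back, and forward the parsed payload upstream over whatever more robust transport it uses. That way the UDP hop never leaves the local box or LAN.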