Risk management for TTN

Thank you for your support of the idea to do at least some risk analysis. Perhaps we should give it a try.

For starters, I’d like to find out how the network is governed. Who decides what should (or should not) be done? Who decides if one is allowed to say “this gateway is part of the TTN network”? How are reviews done, and is there anything like quality control, for example?

Because if the “governance” of TTN, in whatever form it exists, decides that it is a Good Idea™ to do risk analysis, we have achieved the first step of the RA process: acquiring management support. This holds true regardless of whether management is a person or a group. So the first thing to do is to acquire the sponsorship of TTN Governance.

Forgive me my ignorance - but how is governance of TTN done?

Hi tkerby,

Perhaps I’ve not correctly understood what you meant by “spoof traffic at the gateway level”. My understanding was that the NwkSKey prevents spoofing, as any tampering with the datagram between the Node and the Broker would result in the message integrity check (MIC) failing and the Broker dropping the datagram. So gaining physical access to a gateway to tamper with datagrams wouldn’t help, because only the end Node and the Broker/Network Server have access to the NwkSKey. You could attempt a replay attack, but the Frame Counter is there to protect against that.

Running a packet sniffer doesn’t get you much as the payload is encrypted between the Node and the Handler using the AppSKey.

With a NwkSKey providing protection against a man-in-the-middle attack, a frame counter protecting against replay attacks and end-to-end encryption of the payload, the security model seems rather robust. That being said, it’s been quite a few years since I’ve worked in network security and, more importantly, I’m only just starting to get up to speed with LoRaWAN and the TTN’s architecture so perhaps I’ve missed something.
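
To make the two checks described above concrete, here is a minimal sketch of how a network server might reject tampered and replayed frames. It is an illustration under stated assumptions, not TTN's actual code: the class and function names are hypothetical, and HMAC-SHA256 truncated to 4 bytes is used as a stand-in because it ships with Python, whereas LoRaWAN itself computes the MIC with AES-CMAC. Payload encryption with the AppSKey is not modelled.

```python
import hashlib
import hmac

MIC_LEN = 4  # LoRaWAN MICs are 4 bytes long


def compute_mic(nwk_skey: bytes, frame: bytes) -> bytes:
    """Stand-in MIC: truncated HMAC-SHA256 (real LoRaWAN uses AES-CMAC)."""
    return hmac.new(nwk_skey, frame, hashlib.sha256).digest()[:MIC_LEN]


class NetworkServerSession:
    """Hypothetical per-device session state kept by the network server."""

    def __init__(self, nwk_skey: bytes) -> None:
        self.nwk_skey = nwk_skey
        self.last_fcnt = -1  # no uplink accepted yet

    def accept(self, fcnt: int, payload: bytes, mic: bytes) -> bool:
        # The frame counter is covered by the MIC, so it cannot be altered
        # in transit without the check below failing.
        frame = fcnt.to_bytes(4, "little") + payload
        expected = compute_mic(self.nwk_skey, frame)
        if not hmac.compare_digest(expected, mic):
            return False  # tampered frame, or sender lacks the key
        if fcnt <= self.last_fcnt:
            return False  # replay of an old frame
        self.last_fcnt = fcnt
        return True
```

A gateway in this model only forwards `(fcnt, payload, mic)`; without the NwkSKey it cannot produce a valid MIC for a modified frame, and a frame captured earlier fails the counter check.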

Obviously, my comments focus on network security and don’t address the larger question of risk management.

But your posting DOES list a number of controls that are in place, and they are there to prevent or reduce weaknesses. Network security (especially in our case…) is an essential part of security. Who decided that these controls need to be in place, who did the risk analysis, and should we approach them if we feel that a broader view might be necessary?

In other words: who has the final say when it comes to accepting risk in our network?

@daniel already provides a number of threats (e.g. somebody spoofing traffic, gaining physical access to a gateway, replay attacks) and controls in place to prevent them from causing harm (MIC, encryption, frame counter).

This is of great interest to me: apart from finding out about the ‘management side of things’ of TTN, I’m also very interested in compiling a list of threats and possible controls (and control objectives) for the TTN infrastructure (and other similar infrastructures). Together with the controls found in ISO27001 annex A (or ISO27002), such a list may provide a starting point for a more formal RA system.

Many moons ago the DTI started similarly when they compiled a list of controls and control objectives, later to become BS7799, then ISO17799, then ISO27002 and ISO27001, so perhaps we’re on the brink of creating a new standard to be used for risk analysis and risk treatment within volunteer-driven IoT infrastructures (aka volioti) :slight_smile:

An example of ‘objectives’ and ‘controls’ (examples taken from ISO27001; parts in italics are not part of the standard but are used by me to clarify things):

    _Threat: misuse of TTN assets by suppliers_
      Objective: To ensure protection of the organization’s assets that is accessible by suppliers.

      Control #1: Information security requirements for mitigating the risks associated with supplier’s access to the organization’s assets should be agreed with the supplier and documented.

      Control #2: All relevant information security requirements should be established and agreed with each supplier that may access, process, store, communicate, or provide IT infrastructure components for, the organization’s information.

      Control #3: … etc.

(Though just an example, it is relevant: does the ‘formal owner’ of TTN require this from suppliers? Are volunteers that operate the gateways to be seen as ‘suppliers’? Anyway…)

So, by all means, post your threats, control objectives and controls here - and if you know who formally governs the TTN network, let me know!

I believe I now know who our BOG (Body of Governance) is: its members are listed on the TTN front page, https://www.thethingsnetwork.org/.

They are:

Wienke Giezeman Initiator
Johan Stokking Tech Lead
Martijn van der Veen Web Developer
Hylke Visser Backend Developer
Laurens Slats Community Manager
Rishabh Chauhan Community Manager
Ludo Teirlinck Hardware Developer
Thomas Telkamp Network Architect
Wessel Versluis Designer
Dorian Amouroux Web Developer
Romeo Van Snick Front-end Developer
Alexander Overtoom Business Lead
Antoine Rondelet Backend Developer
Roman Volosatovs Backend Developer
Fokke Zandbergen Developer Advocate
Daniel Gómez Jurado Web developer
Thibault Labarre Web developer
Nicolas Dejean Developer

I wonder if there is a way to address all of them at once, or should I simply mail Wienke to discuss my concerns? I would very much like to contribute to TTN and feel that the best thing I can do for TTN is to introduce RA to it.

I’ve moved some (older) posts from another topic, Elderly care LoRaWAN products here; see below. You may see a strange order of the posting dates (which may be older than the posts above), and some references may be weird.

It started with a reply to the following:

You may well be right to doubt whether LoRa/TTN is the best fit for ‘critical’ applications. The big issue here is whether we can really offer service levels equivalent to those of, say, the big telcos that offer similar services. Oh, and should we? It’s a hobby, right?

I don’t think so. Building and deploying “stuff” is, of course, why most of us are here. It gives us a great experience, additional skills and knowledge, it’s a real Feel Good Thing :+1:

But if real people are going to rely on our “hobby horse”, much more is needed. Yesterday I started a thread about the idea to establish a proper risk analysis framework. I firmly believe that’s the way to go: find out who our “body of Governance” is - so, who is held responsible if something goes awry - and if there is no such thing, we should create one. That body of Governance should embrace not just technology, but enthusiastically promote and facilitate frequent risk analysis, preferably within an RA framework, e.g. as defined in ISO27001 and ISO31000.

It’s dull, somebody said. I don’t think so - but even if it were, it needs to be done. I can really do without the excitement of reading in the papers that a user of TTN lost his / her life because our network was hacked, down or otherwise compromised.

Ugh :relaxed:

Or, the other way around… if your life depends on it, don’t use a service without any SLA.

Well, the topic of this thread was about IoT and elderly care, and this implies that people WILL use our network for such rather important things.

When old man Jones dies because he fell down and was not found in time we tend to say “that’s life”. But if the same happens and he wore an IoT / LoRaWAN alert button that did not work, somebody will investigate (I hope). And when it proves the root cause was that TTN was hacked or not robust enough, or the local gateway was in maintenance, we really have an issue, I say. Even if we can legally squirm away from our liability, I would feel bad. Especially if a very simple control could have prevented this.

Such controls are typically not considered until life was lost or other major damage was done. I say: let’s prevent this, let’s set up a risk management system / ISMS.

BTW: this post itself is an example of how simple reasoning about “what might happen” can result in finding controls. For example, if a maintainer of a gateway plans replacement of that gateway, or maintenance, there could be a procedure that ensures that in those situations a second, perhaps temporary gateway is set up, so that ole’ Jones can live :ok_hand:

People tend to use services because they are available, and this especially is true for free services. We, as maintainers of a network, have a moral / ethical obligation to ensure that our network is sufficiently safe and robust.

Could you give an example of such a ‘very simple control’?

I just did, back in my previous post, when I wrote: “For example, if a maintainer of a gateway plans replacement of that gateway, or maintenance, there could be a procedure that ensures that in those situations a second, perhaps temporary gateway is set up”.

I think that some of the readers here are not aware of how risk management works, or even what it is. Allow me to broadly paint a picture.

Risk management is fairly straightforward. The first thing we need is a body of Governance (BOG). What’s that? Well, just a team that is responsible for the governance of our network. In practice: what they say goes and anybody that wants to be part of the TTN network MUST adhere to their decisions.

The BOG then stimulates and motivates the installation of what is called “an information security management system”. There are good standards available on what needs to be done, e.g. ISO27001. I know quite a bit about these standards, as you may have guessed by now :upside_down: so if you have any questions about them, don’t hesitate, shoot.

The ISMS is an implementation of a well known approach called “the Deming cycle”, which also is known as the PDCA (Plan-do-check-act) cycle. Actually, it’s nothing else than common sense: you plan to do something, then do it, check if results were as you hoped, if not you act upon it to correct the situation - and you use what you have learned to start a new iteration of the PDCA cycle (which involves a new context, as the times, they are a-changing…), ad infinitum.

Generally speaking,

  • an ISMS starts with getting consent and sponsoring of the BOG, then
  • doing an inventory of what assets there are (which is more than just technology, people matter for example),
  • think about what risk levels are acceptable (and the BOG needs to agree here as they are responsible), then
  • construct (and/or steal..) a list of threats,
  • see if these threats work on vulnerabilities in your assets, then
  • see what can be done about that (controls) to reduce the risk below acceptable levels, then
  • implement these controls (mostly done in project form), then
  • see if it all worked out as planned, if not: learn and correct.

And then it starts all over again.
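
The steps above can be sketched as a tiny risk register. This is only an illustration under my own assumptions: the `Risk` fields, the 1 to 5 scales and the "likelihood times impact" score are one common convention, not something ISO27001 prescribes, and the example entries and the acceptable level of 8 are made up.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Risk:
    asset: str
    threat: str
    vulnerability: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int      # assumed scale: 1 (negligible) to 5 (loss of life)
    controls: List[str] = field(default_factory=list)

    @property
    def score(self) -> int:
        # Assumed scoring method: likelihood multiplied by impact.
        return self.likelihood * self.impact


def risks_needing_treatment(register: List[Risk], acceptable_score: int) -> List[Risk]:
    """Return risks above the level the BOG accepted, worst first."""
    too_high = [r for r in register if r.score > acceptable_score]
    return sorted(too_high, key=lambda r: r.score, reverse=True)


# Hypothetical entries, for illustration only.
register = [
    Risk("gateway", "planned maintenance", "no second gateway in range",
         likelihood=4, impact=5, controls=["temporary spare gateway"]),
    Risk("gateway", "cable cut by vandals", "exposed outdoor cabling",
         likelihood=2, impact=3),
]

for risk in risks_needing_treatment(register, acceptable_score=8):
    print(risk.score, risk.threat, risk.controls)
```

Re-running the same function after controls have been implemented (and likelihood or impact re-estimated) is the "check" part of the PDCA cycle.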

It’s not something that can be done as a stand-alone activity - if TTN were to set up some kind of RA system (e.g. an ISMS), that would require that we ALL participate in it. We all need to be aware of the rules and adhere to them, we all need to consider risk and work with the committee or group that is in charge of the ISMS.

I think there may have been a misunderstanding about the word ‘simple’. Just putting another gateway in place is not that simple in practice in most situations. Apart from that, I very much doubt whether you or anyone else could persuade a significant part of the TTN members / gateway owners to accept such a policy. Although that may belong more in the risk management topic.

There are enough less critical applications for elderly people to consider, and personally I think that LoRa/TTN applications should be aimed at those. For the simple reason that anything that looks like an SLA isn’t feasible at all in the current setup by any means. So perhaps it would be nice to keep this discussion in the other topic.

Still it is important to be clear about what is and what isn’t a sensible use of the network.
Think about, for instance, low-voltage speaker cable. If someone used it for 220 V and the house burned down, would you hold the cable manufacturer responsible in any way? Or the fool who used that cable for that application?

So let’s use this topic to focus on sending music through the speaker wires.

Actually, you are now fully engaged in the process of risk analysis, and I am very happy to see how well you do. What happens here is that I “invented” a threat: maintenance. The vulnerability is that a gateway in maintenance is not able to relay messages sent by life saving devices. We can accept that risk, or insure ourselves against it, or do something against it.

Now, we already have done the next thing: we came up with a control: temporarily install an alternate gateway. What is good to see is that you try to poke holes in that control: will the community adhere to that rule, will they accept the policy?

My instinctive answer here would be “yes, because otherwise they should not be part of this community”. But that may be a bit harsh, especially since I just started to work on awareness of RA in this community and so I should be a bit careful not to scare you away from the food. So let’s soften that statement a bit and say “yes, because we will make it easy for them to do so”.

So, for example, we could build a number of spare gateways and, if somebody wants to do maintenance, provide one of these to him on a temporary basis. Or we could hire a minivan, put a portable gateway aboard, with a nice long antenna that you set up to temporarily replace the gateway. All that the guy or gal that wants to do maintenance has to do then is send a message to the proper queue :nerd: (or an email, app, phone call, you name it), receive confirmation that it is accepted and by (strongly procedural) magic the aforementioned van appears in front of his house at the set time, he can do his maintenance and Ole Jones lives :ok_hand: - et voila.

I strongly resent the idea that we, volunteers, would not be able to create a network that is AS robust as - and perhaps even more robust than - that of our commercial peers. I say we can, but also that this requires some form of RA / ISMS, some creative thinking, not trying to re-invent wheels (so dang it, use the frigging standards) - and support by our members here.

And to add something that is also of great importance: it would be fully acceptable to decide NOT to install a spare gateway and live with the risk - as long as you can somehow prove that it was a conscious decision, based on proper RA, and that the BOG hence is fully responsible for the death of Ole Jones, having consciously decided to accept that outcome if it ever happened…

I suggest you take a moment to examine the TTN Manifest, it clearly states:

This implies (at least to me) you accept the risk of gateways/network not being available for any reason. So a potential user of the network should accept this risk, not burden the community with it. Few of the kickstarter backers and current gateway owners will accept additional responsibilities that have not been communicated upfront. Community members ‘donate’ LoRaWAN packets by installing a gateway. None of them signed up for the administrative overhead and responsibilities you are suggesting.

With regards to ‘maintenance’, keep in mind most gateways will be connected using residential internet access which is subject to its own maintenance schedule and service levels. You can’t expect gateway owners to know when maintenance is scheduled as not all providers communicate this clearly. And what happens when there is a major outage at a provider or a power outage in a city? Do you expect a gateway owner to find a way to register the gateway is down? Or immediately buy a new gateway when the current one dies?

Sorry, you are asking for something (I deem) beyond the scope of a community network. If you want to have an SLA, contact one of the commercial providers. If you just want to analyse the risks involved in using the current network, feel free.

Hi, Jac, good to see you.

Firstly, thank you for pointing me to the TTN manifest. That, to me, is legalese. Useful stuff that allows the BOG (if there is any here, is there?) to be able to squirm away legally from what I feel are ethical responsibilities. Ole Jones died - pity, but hey, we had a great time building a “network” and look, here is a manifest that says we were not to blame that it was not up.

The manifest is, as far as I am concerned, in contrast with TTN’s main web page. Take a look at how our network presents itself; for example, open up the main web page. It says:

    We are a global community of more than 4000 people over 60 countries building a global Internet of Things data network. We use a long range and low power radio frequency protocol called LoRaWAN and for short range Bluetooth 4.2. The technology allows for things to talk to the internet without 3G or WiFi. So no WiFi codes and no mobile subscriptions.

Nowhere is there even a suggestion that TTN should not be used for applications for - say - saving or supporting the elderly (or other folks). It’s a slick, neat web page, that could just as well be that of a professional telco. It really succeeds in convincing me that I should use TTN.

But is TTN reliable? Can it be trusted? Well, the page goes to some lengths to at least strongly suggest this. E.g. scroll down a bit where you’ll find this:

    Our goal is to make the network architecture as decentralized as possible and avoid any points of failure or control. We already have a community of 10 developers writing network software and equipment firmware.

(my bold).

So, though formally you are right, and I fully appreciate the need to legally cover our behinds, I believe that the SPIRIT of TTN is (and if not: should be) that we WILL do our utmost to provide a reliable service.

Good risk analysis can help us to do just that. It is surprising what even a basic analysis sometimes can do for the various security aspects. Risk analysis is like the proverbial reason a car has brakes: it’s not to stop the car - it’s to allow it to go faster where it can, because there is a control in place that allows it.

Yes, I will analyse the risks involved using the current network, of course I will. But I would really LOVE to be able to report that efforts have been made to improve that network, and let’s face it: nobody may like it much but risk analysis is one of the best ways to achieve that goal. I hope to be able to report that efforts have resulted in a more robust and secure FREE, volunteer driven network.

Even if Ole Jones is poor - and so can not pay for the service - he still might find it of great value to have access to a free, robust, reliable network.

BTW: if we really are not willing or able to make our network safe and robust, we should at least announce this loudly on the TTN front page, e.g. a banner that says “This network should NEVER be used to safeguard animals or people”. That’s also a control, and it’s a cheap and effective one. I really hope we don’t need to use it, as I believe we can do much better.

Thanks! I was writing along these lines when your reply popped up. I think TTN is above all a project to learn if it is desirable and feasible to have a citizens network. In this wicked world your considerations are very valuable imho.
Pieter

Thank you, Pieter. You may be right, this may well be “just” a project to learn if it is desirable and feasible to have a citizens network. But that would only underpin my argument that we should invest in setting up an ISMS or similar framework - as a citizens network, IMO, should be robust, reliable, safe, not (just?) a hacking place for hobbyists.

All within realistic bounds - so, yes, in the early stages we will allow more risk (and pretty please, let’s be LOUD about these risks if they might lead to loss of human or animal life) - but by having some kind of ISMS we will have installed a control that forces us to reconsider risk often, and forces us to consciously decide on what risk is acceptable - to us. And by providing insight in our processes and their outcome, our users can decide if they feel that we do a good job or not.

I am not stating I would not like to see a robust and secure network, because I would. And a lot of effort is going into making it the best it can be while adhering to the manifest. My point is, when I joined this community and connected a gateway to the network a year ago, no one suggested (and luckily no one involved in running the network suggests so now) that I would have to adhere to SLAs, maintain an administration, plan outages etc etc.

I joined a free network where the back-end will be as robust as possible, data as safe as feasible, and use is free as well. And for me, free is not financial: I’m willing to provide financial support for the back-end if required, but I do want freedom as in not wanting all kinds of administration and other hassle.

Soon over a thousand gateways will be making their way to kickstarter backers in over 40 countries. Do you really think those backers want to perform risk analysis, have to consider what the impact of their gateway being off-line is for the network and work on continuous improvement? I think the majority backed this project to get their hands on this new and exciting technology in an affordable way.

Like I said before, feel free to perform an analysis on the network as is. It should provide us all with an insight into risks involved and allow the users to make informed decisions on when its use is appropriate.

  1. Where on the front page does it say this network can be used for this? BTW, I don’t think it is required to add that warning, anyone designing for those use cases should be well aware of the limitations of the technology used in their design and make their own decisions accordingly. Everyone agrees LoRaWAN will suffer packet loss which is perfectly acceptable for a lot of uses, it is not acceptable for ‘panic’ buttons or alarm systems. Those solutions demand reliable end-to-end communications. It is no coincidence (modern) alarm systems have redundant communication links to the back-end.
  2. Where do we say we do not make the network safe and robust? The core team designed the back-end to be as safe and robust as possible. However, they are unable to resolve issues inherent in the technology. One of them is LoRaWAN’s use of the ISM band, which many other users share, so collisions and packet loss are inevitable. Accept the limitations and choose appropriately.
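
As a rough illustration of the packet-loss and redundancy points above, the following sketch estimates the chance that at least one copy of a message arrives when it is sent several times or over several links. It assumes each attempt is lost independently with the same probability, which is optimistic for radio interference (losses tend to come in bursts), and the 10% loss rate is a made-up number, not a TTN measurement.

```python
def delivery_probability(loss_rate: float, attempts: int) -> float:
    """Chance that at least one of `attempts` independent sends arrives."""
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be between 0 and 1")
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    # All attempts must be lost for the message to be lost.
    return 1.0 - loss_rate ** attempts


# Hypothetical 10% per-attempt loss rate, for illustration only.
for attempts in (1, 2, 3):
    print(attempts, round(delivery_probability(0.10, attempts), 4))
```

Under these assumptions a single uplink gets through 90% of the time and three independent attempts about 99.9%, which is fine for a temperature sensor but still leaves roughly one missed alarm in a thousand for a panic button.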

Firstly, please note that I did NOT say we should introduce SLAs or paper work. Given the tone of resentment in here when such things are discussed I severely doubt that to be the proper controls at this moment.

Proper risk analysis is always done within a scope and it has a number of inputs, amongst them the culture and ethics of the organisation. Given what I’ve read here so far and given the short time that TTN has existed, I think it is safe to say we’re a startup. Introducing a control like SLAs and/or lots of paperwork simply does not work in that phase!

Actually, introducing such controls NOW would probably either reduce the number of volunteers, who would rather find themselves a new hobby - hence introducing the risk of unavailability of the network - or they would simply ignore the control, introducing a false sense of security, which in itself is a risk! So you would be increasing risk by implementing such unpopular controls and I would strongly advise against it.

In short: if a control is bound to fail, you don’t reduce risk implementing it, so don’t implement that control.

That being said… before one can even consider the first control, one has to have a clear picture of the dangers that might lurk (the threats) and of how severe it would be if a threat became reality (in my trade one typically looks at aspects like confidentiality, integrity and availability of information). From that one weighs the danger using some agreed-on method and then, only then, can one discuss the controls needed to reduce risk.

I am sure that some form of informal risk analysis is already done by many volunteers. Take for example the case of installing a gateway at home. If it is put outdoors, some may feel that the cabling should be put in metal pipes to prevent rascals from cutting cables. Others may install a UPS or battery so that when the power is cut off the gateway can be brought down in a controlled way (some run on Pis and cold reboots can harm them). Some may even have set up a 4G backup router in case their commonly used cable / xDSL ISP fails them. All very good controls to reduce weaknesses.

But given that we don’t have any means of monitoring what goes on in all these heads - thank Goodness for that! - let alone decide if what goes on in there is sufficient to reduce the risks to acceptable levels (and what are these acceptable levels) - we need a way to learn from others and improve our network doing so.

I say we at least need a body that discusses risk, and tries to reduce it. The body could gather data from volunteers (e.g. compile a list of controls the volunteers applied and against which weaknesses these controls work), could gather lists of threats and controls from other sources (standards are available), choose a method and then do a risk analysis for a limited scope, e.g. for the gateways. If our gateways are more robust, so will our network be. Little steps, but steps nevertheless.

If that is a success, we can broaden the scope. What say you?