Signaling Storms and IoT: How to Safeguard IoT Devices from Network Disruption

Eseye author

Eseye

IoT Hardware and Connectivity Specialists

A signaling storm occurs in a mobile (cellular) network when the volume of control signals emitted by devices exceeds the network’s processing capacity.

Once started, the problem tends to snowball: devices that fail to establish a connection keep retrying, which adds to the signaling load and ultimately leads to service disruptions.
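
One common way to keep retries from feeding the storm is for each device to wait longer after every failed attempt and to randomize that wait, so a fleet does not retry in lockstep. The sketch below illustrates the idea with capped exponential backoff and random jitter. The function names, the `try_attach` callback, and the default timings are assumptions made for illustration; they do not come from any particular modem or SIM API.

```python
import random
import time


def backoff_ceiling(attempt, base_s=2.0, cap_s=600.0):
    """Upper bound (seconds) for the wait after a failed attempt (0-based)."""
    return min(cap_s, base_s * (2 ** attempt))


def attach_with_backoff(try_attach, max_attempts=8, base_s=2.0, cap_s=600.0,
                        sleep=time.sleep, rng=random):
    """Call try_attach() until it returns True or attempts run out.

    try_attach is a hypothetical callback that performs one network
    registration attempt and returns True on success.
    """
    for attempt in range(max_attempts):
        if try_attach():
            return True
        if attempt < max_attempts - 1:
            # Draw the wait uniformly from [0, ceiling] so devices that
            # failed at the same moment do not all retry at the same moment.
            sleep(rng.uniform(0.0, backoff_ceiling(attempt, base_s, cap_s)))
    return False
```

With these assumed defaults, the longest possible wait doubles after each failure (2 s, 4 s, 8 s and so on) until it reaches the 10-minute cap.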

Signaling storms have been a persistent threat to mobile networks since the introduction of 3G technology and remain a major concern with the advent of LTE 4G and 5G.

In other words, too many devices attempting to connect or send signals at once overload network resources, causing major connectivity issues and service degradation.

Although not often publicized, there have been a number of high-profile network service outages worldwide in the last decade that were attributed to signaling storms, resulting in financial losses running into millions of dollars, and millions of subscriber hours lost.

For IoT networks, where the number of devices that need to communicate can run into the tens of thousands or millions, signaling storms can be disastrous, both for the network operator and the enterprise, which could see its IoT estate taken completely offline for hours or potentially even days.

Such a service interruption might not be so problematic for a water or energy monitoring application run by a utility. But it could be catastrophic, even life-threatening, for a remote patient monitoring medical device network or a fire alarm system in a smart building.

One of the reasons signaling storms can be so dangerous is that even if a VPN is used, it’s the underlying transport network that is impacted, resulting in total loss of connectivity.

Understanding signaling storms and their impact on IoT

Although malicious intent is always a possible root cause of a signaling storm, it is often not the culprit. History suggests that technical malfunction and misconfiguration are the more likely triggers.

With each consecutive generation of cellular technology from 3G to 4G to 5G, more and more network elements have been virtualized, giving mobile network operators (MNOs) more granular control over network functions, but also increasing the number of signals required to complete system procedures.

Couple this with an ever-increasing diversity of application scenarios and continuing growth in device numbers and service subscriptions, and the complexity of network resource management and orchestration increases significantly. So when the network control plane fails, it does so spectacularly.

Publicly documented cases of signaling storms include:

  • An outage in New Zealand caused by thousands of devices suddenly re-registering on the network after a separate routing fault affected a number of cell sites. It took three days for service to be fully restored.
  • In Norway, a software update on the network triggered an unusually high signaling load, leaving millions without connectivity for 18 hours.
  • In Japan, the rapid proliferation of new consumer applications outpaced network infrastructure oversight, causing signal congestion in the core network elements including authentication, user management, mail information server, and packet switching, eventually overloading the network’s ability to process signals and resulting in network malfunction lasting several hours.
  • In the USA, a misconfigured router became overwhelmed by call signals, congesting 2G and 3G circuit-switched traffic as well as VoLTE traffic. By design, devices that failed to initiate a call would try to re-register with the IP Multimedia Subsystem (IMS) over WiFi instead. However, a secondary software issue prevented those devices from registering, and the signaling traffic was rerouted instead, which also failed, so the devices kept attempting to re-register. This traffic in turn congested the IMS, forcing devices to try registering on the 2G or 3G network again, which created a perpetual loop and extended the signaling storm to another part of the network control plane.

    Eventually, the signaling storm spread nationwide and lasted for around 12 hours. In this instance it also affected 911 emergency calls and triggered an investigation and a multi-million-dollar fine from the regulator, the FCC.

  • One IoT-specific outage occurred in Japan in 2021, after the operator had performed an upgrade on their core network location management server. Inbound roaming IoT devices had a software incompatibility that caused connection failure with the management server. A software rollback to the previous server software then triggered a massive number of registration requests from the IoT devices causing congestion in the location management server and eventually spreading to the whole core network.

    This incident alone highlights the potential risk of large fleets of IoT devices intended for roaming suddenly activating and attempting to register on a mobile network all at once. The challenge is that these IoT devices are manufactured from various sources in the ecosystem and are out of the operator’s control.
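
The effect of a whole fleet registering at once can be illustrated with a rough simulation. The sketch below counts how many registration requests land in the busiest one-second interval when every device fires immediately, compared with each device first waiting a random offset within a spread window. The function name, fleet size, and window length are illustrative assumptions, and the model ignores retries and network-side behavior.

```python
import random
from collections import Counter


def peak_requests_per_second(num_devices, spread_window_s, seed=0):
    """Return the number of registrations in the busiest one-second bucket."""
    rng = random.Random(seed)
    buckets = Counter()
    for _ in range(num_devices):
        # With no spread window every device registers at t = 0;
        # otherwise each device picks a random start time in the window.
        offset = rng.uniform(0.0, spread_window_s) if spread_window_s > 0 else 0.0
        buckets[int(offset)] += 1
    return max(buckets.values(), default=0)


if __name__ == "__main__":
    print("no spread:   ", peak_requests_per_second(10_000, 0))
    print("10 min spread:", peak_requests_per_second(10_000, 600))
```

With no spread, the peak equals the fleet size, because every request falls into the same second. Spreading the same fleet over ten minutes brings the peak down to a small fraction of that.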

In many cases, MNOs have had to budget for additional network capacity after experiencing signaling storms, and some carriers have worked directly with device manufacturers and operating system developers to reduce the amount of signaling traffic their devices emit.

But although technical errors are the most frequent source of trouble, deliberate attacks are not unheard of.

The potential for signaling storms to cause significant disruption has caught the attention of malicious actors, who may mount what is known as a Signaling Storm Attack (SSA).

In this case the adversary utilizes standard mechanisms of the network control plane to cause a Denial of Service (DoS), flooding the network with invalid or repeated registration requests.
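
A flood of repeated registration requests can be spotted by counting how often each device registers within a sliding time window. The sketch below is one possible illustration of that idea. The class name, thresholds, and device identifiers are hypothetical, and it does not describe how any operator's core network actually does this.

```python
from collections import defaultdict, deque


class RegistrationMonitor:
    """Flag devices that send too many registration requests in a time window."""

    def __init__(self, max_requests=5, window_s=60.0):
        self.max_requests = max_requests
        self.window_s = window_s
        self._seen = defaultdict(deque)  # device_id -> recent request timestamps

    def record(self, device_id, timestamp_s):
        """Record one request; return True if the device exceeds the limit."""
        times = self._seen[device_id]
        times.append(timestamp_s)
        # Discard requests that have fallen out of the sliding window.
        while times and timestamp_s - times[0] > self.window_s:
            times.popleft()
        return len(times) > self.max_requests
```

A device flagged this way could then be throttled or investigated. The right response depends on whether the cause turns out to be a fault or an attack.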

Globally roaming cellular IoT devices exist in massive numbers and are notorious for their limited security features, which makes them an army of devices that could be compromised to execute a low-cost Distributed Denial-of-Service (DDoS) attack.

An IoT-driven DDoS attack on this scale has happened at least once, although it targeted internet infrastructure rather than a cellular control plane. In 2016, a compromise of around 600,000 IoT devices created a botnet that was used to attack DNS provider Dyn, disrupting internet services across Europe and North America.

The malware responsible, Mirai, exploited default username and password combinations to compromise the IoT devices and did not require a high level of technical knowledge.
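
Because this kind of compromise relies on unchanged factory credentials, a simple audit can flag devices in an inventory that are still using them. The sketch below assumes a hypothetical inventory format (dictionaries with `id`, `username`, and `password` keys) and a deliberately tiny, illustrative set of default pairs. A real audit would use a far larger list and would avoid handling passwords in plain text.

```python
# Illustrative sample of factory-default pairs only; not a complete list.
KNOWN_DEFAULTS = {
    ("admin", "admin"),
    ("root", "root"),
    ("admin", "password"),
    ("root", "12345"),
}


def uses_default_credentials(username, password, known_defaults=KNOWN_DEFAULTS):
    """Return True if the pair matches a known factory default."""
    return (username.lower(), password) in known_defaults


def audit_fleet(devices):
    """Return the ids of devices still using default credentials.

    devices is an iterable of dicts with 'id', 'username' and 'password'
    keys (a hypothetical inventory format).
    """
    return [d["id"] for d in devices
            if uses_default_credentials(d["username"], d["password"])]
```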

Strategies to prevent and mitigate signaling storms in IoT networks

When it comes to IoT connectivity, you cannot afford to rely on a single network: if it fails or loses availability, your device is offline. A signaling storm is one factor that can cause such a disruption.

For IoT devices to work at optimum levels, they must have access to a consistent, secure, and reliable connection at all times, regardless of location. To achieve this, additional networks should be available for devices to connect to, providing redundancy.

For cellular IoT solutions, a multi-IMSI SIM, eSIM, or eUICC (Embedded Universal Integrated Circuit Card) can store multiple IMSIs (International Mobile Subscriber Identities), enabling devices to switch to a different network without physically changing the SIM. This is a useful safeguard if one mobile network suddenly becomes unavailable.
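
The switching logic can be pictured as a small state machine that moves to the next stored profile after a number of consecutive failed attach attempts. The sketch below is illustrative only. The class, the profile names, and the failure threshold are assumptions, and in a real device this decision is made by the SIM applet, modem firmware, or connectivity platform rather than by application code like this.

```python
from dataclasses import dataclass


@dataclass
class ProfileSelector:
    """Rotate to the next network profile after repeated attach failures."""

    profiles: list           # ordered profile names (hypothetical)
    max_failures: int = 3    # consecutive failures tolerated before switching
    active_index: int = 0
    failures: int = 0

    @property
    def active(self):
        return self.profiles[self.active_index]

    def report(self, attach_succeeded):
        """Record one attach result and return the profile to use next."""
        if attach_succeeded:
            self.failures = 0
            return self.active
        self.failures += 1
        if self.failures >= self.max_failures:
            # Give up on this network for now and move to the next profile.
            self.active_index = (self.active_index + 1) % len(self.profiles)
            self.failures = 0
        return self.active
```

For example, with the profiles `["operator-a", "operator-b"]` and the default threshold, the third consecutive failure on `operator-a` makes `operator-b` the active profile. How quickly attempts are retried still matters, because rapid switching between networks generates signaling of its own.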

Another key focus should be securing IoT devices to prevent their exploitation in signaling storms by malicious actors. Gartner estimates 25 billion IoT devices in deployment by 2025, and most research suggests that the majority of these devices are not, and will not be, adequately secured.

As the success of the Mirai malware revealed, tens of millions of common, mass-manufactured IoT devices, such as security cameras, use unencrypted and unsecured protocols to communicate, making them vulnerable by default.

Although it does not help once a signaling storm is already underway, a final consideration is to use private APNs (Access Point Names), which enhance IoT security by providing dedicated network access, reducing exposure to public networks, and enabling more control over data traffic. This setup is crucial for maintaining secure and reliable IoT communications.

Eseye’s approach to reliable IoT connectivity in a congested network landscape

Eseye’s IoT solutions are designed to withstand network challenges, offering robust connectivity management that adapts to high-demand scenarios. Eseye’s Infinity IoT Platform provides intelligent connectivity that dynamically switches networks to avoid congestion, ensuring that devices remain connected, even during potential signaling storms. Eseye also offers advanced diagnostics and analytics to monitor network performance, predict congestion, and proactively optimize device behavior to prevent signaling overloads.

Protect your IoT devices from network disruptions. Speak to one of our experts at Eseye and discover resilient IoT connectivity solutions.

Eseye brings decades of end-to-end expertise to integrate and optimise IoT connectivity delivering near 100% uptime. From idea to implementation and beyond, we deliver lasting value from IoT. Nobody does IoT better.