Eliminating opportunities for traffic hijacking
A little historical overview
- BGP hijacks — when an ISP originates an advertisement of address space that does not belong to it;
- BGP route leaks — when an ISP advertises prefixes received from one provider or peer to another provider or peer.
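The two definitions above can be sketched as a pair of checks. This is only an illustration: the names Rel, is_hijack and is_route_leak, the holders mapping, and the documentation-range prefixes and AS numbers are assumptions made for the example, not part of any real tool.

```python
from enum import Enum


class Rel(Enum):
    """How a neighbor relates to the AS that forwards the announcement."""
    CUSTOMER = "customer"
    PEER = "peer"
    PROVIDER = "provider"


def is_hijack(origin_asn: int, prefix: str, holders: dict[str, int]) -> bool:
    """Hijack: the origin AS announces address space that is not its own.

    `holders` is a hypothetical mapping of prefix -> legitimate holder ASN.
    A prefix missing from the mapping is also reported as a hijack here.
    """
    return holders.get(prefix) != origin_asn


def is_route_leak(learned_from: Rel, sent_to: Rel) -> bool:
    """Route leak: a route learned from a provider or peer is advertised
    to another provider or peer instead of to customers only."""
    non_customers = {Rel.PROVIDER, Rel.PEER}
    return learned_from in non_customers and sent_to in non_customers


# Hypothetical data: AS64496 holds 192.0.2.0/24.
holders = {"192.0.2.0/24": 64496}
print(is_hijack(64497, "192.0.2.0/24", holders))   # another AS originates it
print(is_route_leak(Rel.PROVIDER, Rel.PEER))       # provider route sent to a peer
print(is_route_leak(Rel.CUSTOMER, Rel.PROVIDER))   # normal customer route
```

Real incidents are harder to classify, because neither the true holder of a prefix nor the business relationship between two ASes is visible in the announcement itself.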
This week marks 11 years since the memorable YouTube BGP incident, caused by the global propagation of a more specific prefix announcement originated by Pakistan Telecom. It disrupted traffic for almost two hours by redirecting it from the legitimate path to a bogus one. We could speculate whether that event was intentional, but even a correct answer would not help us completely prevent such incidents from happening today. While you read this, a route leak or a hijack is spreading over the networks. Why? Because BGP is not easy, and configuring a correct and secure setup is even harder (for now).
In these eleven years, BGP hijacking has become quite a damaging attack vector because of the place BGP holds in the architecture of the modern internet. Thanks to BGP, routers not only acquire peer information, and therefore all the Internet routes; they are also able to calculate the best path for traffic to its destination through many intermediate (transit) networks, each representing an individual AS. A single AS is just a group of IPv4 and/or IPv6 networks operating under a single external routing policy.
And thanks to BGP in its current state, attackers are capable of conducting massive heists of traffic, efficiently hijacking a target network’s prefixes and placing themselves in the middle. And that’s just the beginning: in the era of state-sponsored cyber actors, it is evident that the keystone of the Border Gateway Protocol, which is trust, is no longer sufficient to prevent routing incidents, deliberate or not, from occurring. Since BGP plays such an essential role in the existence of the internet as we know it (it is the only exterior gateway protocol controlling traffic flow between different Internet Service Providers all over the world), for a decade we’ve seen attempts to patch things up.
The overall complexity of managing an AS that speaks BGP does not go down with time and effort. Operators in developing markets, where the ISP and carrier sectors grow dramatically each year, often underestimate the possible consequences of an improperly tuned AS. No security measure can defend against such an internal mistake. The lack of automation within BGP makes it all but impossible for new, less experienced players to properly handle an AS in a high-stress situation, such as right after creating a route leak or while under a hijack.
Unfortunately, given the place the Internet has come to occupy in human life, it is tough for many AS operators to keep up with the growing numbers. A simple example: there were 3000 autonomous systems in 1997, 17000 in 2005, and 63678 at the beginning of 2019. Routing tables (which are stored within BGP routers) grew from 50k entries in 1997 and 180k in 2005 to an astonishing 850k at the beginning of 2019.
Both route leaks and hijacks have similar effects on ISP operations — they redirect traffic, resulting in increased latency, packet loss, or possible MITM attacks. However, the level of risk depends significantly on the propagation of these BGP anomalies. For example, a hijack that is propagated only to customers may concentrate traffic of a particular ISP’s customer cone. If the anomaly is spread through peers, upstreams, or reaches Tier-1 networks, thus distributing globally, traffic may be redirected at the level of entire countries and/or global providers. The ability to constrain propagation of BGP anomalies to upstreams and peers, without requiring support from the source of the anomaly (which is critical if a source has malicious intent), should significantly improve the security of inter-domain routing and solve the majority of problems.
A quick look at the events Wikipedia lists under “BGP Hijacking” shows that their frequency has risen over the last two years. Why? Because there are damn many ways to organize an attack on BGP, and too little chance of getting disconnected for it. Still, companies that become well known as bad actors do lose their upstream connections in the end, as it hurts reputation. And yet such incidents keep occurring, since there is (so far) no way to prevent a route leak or a prefix hijack with 100% accuracy.
Hundreds of routing incidents occur daily, most of them the result of simple misconfiguration. The truly dangerous, malicious hijacks are usually deceptive and precisely targeted, which makes them harder to find and to fix. Only a small percentage of them get media coverage, because by the nature of BGP an incident has to propagate to a significant (nationwide) level to get noticed, and companies that monitor their networks can stop such propagation before it gets that far.
Since the concept of BGP hijacking revolves around locating an ISP that is not filtering BGP announcements (and there are plenty of such providers), there is always a suspect. For each and every BGP incident of the past two years, one or several specific ASes were responsible.
You would think that, with such enormous potential for collateral damage to such a vital part of modern life, businesses would be scared to death and take every measure they could to prevent such events in the first place. Well, that sounds too good to be true IRL.
At the end of 2018, 7% of transit ISPs in the IPv4 routing world and 1% of transit ISPs in the IPv6 routing world accepted leaks outside of their customer cone. You may say that the numbers aren’t that high, but let’s take a closer look at the results sorted by ISP “size”:
Unsurprisingly, all Tier-1s are affected, along with more than 50% of the top 400 IPv4 ISPs. IPv6 looks a bit healthier, but let’s not forget that it has fewer prefixes and fewer ISPs supporting it. So the problem exists and is not being fixed efficiently enough, and we will try to explain why.
Attempts to apply a patch
Actually, there has been a significant effort to improve BGP security. Currently, the most common technique is to use ingress (inbound) filters built upon Internet Routing Registry (IRR) data. The idea is simple: use “approved” route objects and AS-SETs to create filters at customer links. An underlying problem is that both AS-SETs and route objects vary from one IRR to another, and sometimes different objects exist with the same identifier in separate IRRs. And, of course, IRR policies are not obligatory. Implementing them is voluntary, which leaves us in a situation where a lot of IPv4 and IPv6 address space either has no objects registered or has incorrect ones. As a result, lots of IRR objects are poorly maintained, and even some huge Tier-2 level networks fail to configure their filters properly.
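The filtering idea described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the AS-SET is taken as already expanded into member ASNs, the route objects are hard-coded hypothetical data using documentation prefixes, and build_filter and accept are names invented for the example rather than any router’s or IRR tool’s API.

```python
import ipaddress

# Hypothetical data, as it might look after expanding a customer's AS-SET
# and collecting route objects from an IRR.
AS_SET_MEMBERS = {64496, 64497}
ROUTE_OBJECTS = [
    ("192.0.2.0/24", 64496),
    ("198.51.100.0/24", 64497),
    ("203.0.113.0/24", 64499),  # origin is not in the AS-SET, so it is ignored
]


def build_filter(route_objects, as_set_members):
    """Keep only (prefix, origin) pairs whose origin belongs to the AS-SET."""
    return {
        (ipaddress.ip_network(prefix), origin)
        for prefix, origin in route_objects
        if origin in as_set_members
    }


def accept(prefix, origin_asn, allowed):
    """Accept an announcement on a customer link only on an exact match."""
    return (ipaddress.ip_network(prefix), origin_asn) in allowed


allowed = build_filter(ROUTE_OBJECTS, AS_SET_MEMBERS)
print(accept("192.0.2.0/24", 64496, allowed))   # registered prefix and origin
print(accept("192.0.2.0/25", 64496, allowed))   # more specific, not registered
print(accept("192.0.2.0/24", 64497, allowed))   # registered prefix, wrong origin
```

The sketch also shows why data quality matters: the filter is only as good as the route objects behind it, so a stale or wrong object silently becomes a stale or wrong filter entry.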
There is another exciting option for filtering out improper announcements: ROA-based Origin Validation, which can be used to detect and filter accidental mis-originations. While these BGP upgrades are quite useful, they still rely on a transitive BGP attribute, the AS_PATH, which an attacker can and will manipulate.
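As a rough illustration of origin validation, the sketch below follows the matching rules of RFC 6811: a route is “valid” if some covering ROA names its origin AS and allows its prefix length, “invalid” if ROAs cover it but none match, and “not-found” otherwise. The Roa tuple, the validate_origin function and the sample data are assumptions made for the example, not a real validator’s interface.

```python
import ipaddress
from typing import NamedTuple


class Roa(NamedTuple):
    prefix: str      # authorized address block
    max_length: int  # longest prefix length the holder allows
    asn: int         # AS authorized to originate the block


def validate_origin(prefix: str, origin_asn: int, roas: list[Roa]) -> str:
    """Return "valid", "invalid" or "not-found" for one announcement."""
    route = ipaddress.ip_network(prefix)
    covered = False
    for roa in roas:
        block = ipaddress.ip_network(roa.prefix)
        # Skip ROAs of the other address family or ones that do not cover the route.
        if route.version != block.version or not route.subnet_of(block):
            continue
        covered = True
        if roa.asn == origin_asn and route.prefixlen <= roa.max_length:
            return "valid"
    return "invalid" if covered else "not-found"


# Hypothetical ROA: AS64496 may originate 192.0.2.0/24 and nothing more specific.
roas = [Roa("192.0.2.0/24", 24, 64496)]
print(validate_origin("192.0.2.0/24", 64496, roas))     # valid
print(validate_origin("192.0.2.0/25", 64496, roas))     # invalid (too specific)
print(validate_origin("192.0.2.0/24", 64497, roas))     # invalid (wrong origin)
print(validate_origin("198.51.100.0/24", 64496, roas))  # not-found
```

Note that the check only looks at the origin AS. An attacker who puts the legitimate origin at the end of a forged AS_PATH still gets “valid”, which is exactly the limitation described above.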
RPKI (the foundation of ROA) is based on a hierarchy of resource certificates aligned with the Internet number resource allocation structure. A resource certificate is linked to, for example, a RIPE NCC registration. This is because the holder of an individual Internet number resource can be authoritatively stated only for as long as that holder is a RIPE NCC member and has a contractual relationship with the RIPE NCC. The certificate is valid for 18 months, but it is automatically renewed every 12 months. RPKI is structured so that each current resource certificate matches a current resource allocation or assignment, which is a crucial point for ASPA too.
It is worth noting that 2018 was an important year for BGP security, with many remarkable events. BGPSec is finally a standard, although nobody expects it to be fully implemented because of its complex support requirements. At a 100% adoption rate, BGPSec solves malicious hijacks with very high precision, but the computational cost of strict cryptographic AS_PATH validation is unbearable for most players, except perhaps the wealthiest AS operators in the world. And, again, to work properly BGPSec needs to be implemented in every network that manages its own route announcements.
All the recent events make it evident that the BGP manipulation genie is out of the bottle, making someone’s wishes come true. We can’t wait another decade only to reach the conclusion we already had 10 years ago. This should be stopped now, by providing effective technical measures for prefix validation, both ingress and egress.
Previously, our company aimed its protocol improvements at reducing mistakes and errors, and in 2018 we concentrated on fighting malicious activity, specifically BGP hijacks. This vector is potent, as we have already seen on several occasions, and attackers do not hesitate to employ it in attempts to disrupt service or steal user data. The main problem is that, apart from monitoring, there is currently nothing in the protocol itself to prevent path hijacks. As we have already stated, BGPSec will not change anything until it is implemented by the most significant network providers (or, put differently, by a majority of networks), which could have happened already if the protocol were suitable.
Our answer is the hacker approach: ASPA, which uses existing tools to fight the most severe problems in the world of BGP routing. Implementation is easy, yet the resulting solution answers almost all of the related threats. The main argument for our approach is that there is no need to wait for full ASPA adoption. In 2018 we saw significant route leaks, serious hijacks and lots of other events involving BGP, which is why we need to find something that works within months, without waiting 10 years for BGPSec to be implemented.
The Autonomous System Provider Authorization (ASPA) improvement to BGP that our engineers are working on, in collaboration with others, could efficiently and, more importantly, quickly solve the global hijack problem. ASPA focuses on automating the detection and prevention of malicious hijacks (paired with ROA) and route leaks.
To achieve this goal, a new AS_PATH verification procedure is defined that helps automatically detect malformed AS_PATHs in announcements received from both customers and peers. The procedure uses a shared, signed database of customer-to-provider (C2P) relationships, built with a new RPKI object: Autonomous System Provider Authorization (ASPA). It is lightweight and fast to deploy, and it detects invalid AS_PATHs right after implementation.
ASPAs are digitally signed objects that attest that a Customer AS holder (CAS) has authorized a particular Provider AS (PAS) to propagate the Customer’s IPv4 or IPv6 BGP route announcements onwards — to the Provider’s upstreams or peers. To dive deeper into details, you should take a look at the ASPA record profile.
So a valid route received from a customer or peer MUST have only C2P pairs in its AS_PATH. Consequently, if we have a validated database of customer-to-provider pairs, we are able to verify routes received from customers and peers! Of course, this is not a silver bullet. It is only a useful tool for stopping an anomaly from spreading, closer to its origin.
This simple scheme shows how such ASPA-based verification works. Green ASes represent signed pairs, and the red one represents the attacker. In this example, AS1 has an RPKI ROA object in place.
Corner 1 shows that if the closest AS in the AS_PATH is not the receiver’s neighbor ASN, the procedure halts with the outcome “invalid.” Likewise, if one of the AS_SEQ segments contains an “invalid” pair, the procedure halts with the outcome “invalid”;
AS_PATH: AS4 AS1
Corner 2 shows that the same happens if the attacker tries to create a new unvalidated pair: the result is also “invalid”;
AS_PATH: AS4 AS2 AS1
Corner 3 illustrates that any attempt to announce an unvalidated pair results in an “invalid,” and therefore dropped, route.
AS_PATH: AS2 AS1
In the last one, Corner 4, we’re back to the initial condition: if the closest AS in the AS_PATH is not the receiver’s neighbor ASN, the outcome remains “invalid.”
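The checks walked through in these corners can be sketched roughly as follows. This is a simplified illustration, not the procedure from the draft: it covers only routes received from customers and peers, ignores AS_SETs, and assumes an “unknown” outcome for customers that have published no ASPA. The function names, the aspa mapping and the topology are hypothetical and are not meant to reproduce the scheme described above.

```python
def verify_pair(customer: int, provider: int, aspa: dict[int, set[int]]) -> str:
    """Check one customer-to-provider pair against the signed database."""
    providers = aspa.get(customer)
    if providers is None:
        return "unknown"  # assumption: the customer AS has published no ASPA
    return "valid" if provider in providers else "invalid"


def verify_path(as_path: list[int], neighbor: int, aspa: dict[int, set[int]]) -> str:
    """Verify an AS_PATH received from a customer or peer.

    `as_path` lists the closest AS first and the origin AS last.
    """
    # The closest AS in the path must be the neighbor that sent the route.
    if not as_path or as_path[0] != neighbor:
        return "invalid"
    outcome = "valid"
    # Walk from the origin toward the neighbor; each hop must be a C2P pair.
    for i in range(len(as_path) - 1, 0, -1):
        customer, provider = as_path[i], as_path[i - 1]
        if customer == provider:  # skip prepended duplicates of one AS
            continue
        result = verify_pair(customer, provider, aspa)
        if result == "invalid":
            return "invalid"
        if result == "unknown":
            outcome = "unknown"
    return outcome


# Hypothetical database: AS1 authorizes AS2 as provider, AS2 authorizes AS3.
aspa = {1: {2}, 2: {3}}
print(verify_path([3, 2, 1], neighbor=3, aspa=aspa))  # valid
print(verify_path([4, 1], neighbor=4, aspa=aspa))     # invalid: AS4 is not AS1's provider
print(verify_path([4, 2, 1], neighbor=4, aspa=aspa))  # invalid: AS4 is not AS2's provider
print(verify_path([2, 1], neighbor=4, aspa=aspa))     # invalid: path does not start at the neighbor
```

Because each pair is checked independently, a single AS that publishes its ASPA already lets others reject forged pairs that involve it, which is why the approach is useful at partial adoption.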
In-depth details on the AS_PATH verification procedure can be found in the corresponding IETF draft. It may seem a bit complicated, but it is lightweight and cost-effective, works on top of the existing RPKI infrastructure, and takes effect even at partial adoption. In other words, it could work now, helping us to bring BGP route leaks and hijacks to the brink of extinction!
Why you should support us in improving BGP security
How do we know, first of all, that the RPKI ROA is suitable for this role? Let’s take a closer look at the current statistics on its adoption.
In the IPv4 world, 95272 of 938526 prefixes, or ~10%, have valid RPKI ROA entries. In the IPv6 world, 10593 of 80480 prefixes, or ~13%, have valid RPKI ROA entries.
Those numbers are not as high as we’d wish, but they grow each day! Since the number of validated ROAs is not very large, it is evident that only the most responsible operators maintain them daily. And though we don’t entirely trust the IRR data, signing BGP announcements with ROAs is already useful and in place. There is progress among Internet Exchange operators, as well as the largest Internet Service Providers all over the world. As for “ordinary internet users”: your own ISP could be the next one, if it is motivated enough to keep your data safe.
If you are as interested in eliminating opportunities for traffic hijacking as we are, sign ROAs and support ASPA adoption on the IETF mailing list, as we hope for its transition to the RFC stage in the foreseeable future!