IETF 113

Last week I attended the IETF 113 meeting in Vienna. I primarily went there to reconnect in person with some old IPv6 fellows, but also to see what’s going on in the IPv6 standardization space which I hadn’t been following closely in recent times.
In this post I’ll briefly summarize some contributions presented in the main v6-related working groups (wg), namely v6ops and 6man.


v6ops

Video recording of the full session here.
Individual comments here.

IPv6 Deployment Status
Current draft here.
Slides from wg session here.
This is the abstract of the draft:

I for one am not sure if this draft/effort is really needed in 2022. There are many reasons why the global IPv6 deployment is not happening at the speed/scale that IPv6 proponents have been hoping for, and those reasons might be very diverse in nature on the one hand, and might not need another discussion/documentation on the other hand.

NAT64/DNS64 detection via SRV Records
Current draft here.
Slides from wg session here.

NAT64 is currently gaining ground & is actively discussed in many environments, but a number of operational aspects, like the placement of the NAT64 function within the network or which prefixes to use, have to be considered. This is why I think this is an important draft. Martin presented existing methods (for one specific aspect ;-), why those might be insufficient, the goals of their suggested approach, and the layout of the planned SRV records. Also proof-of-concept code is now available.
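
For context, one existing detection method is the one from RFC 7050: a client asks for the AAAA record of ipv4only.arpa and, if a DNS64 synthesizes an answer, derives the NAT64 prefix from it. I'm assuming this is among the methods meant above. The following is only a minimal sketch of the prefix-extraction step, not the draft's SRV approach. It assumes a /96 prefix (RFC 7050 also allows other lengths, which are not handled here), and the function name is made up for illustration.

```python
import ipaddress

# Well-known IPv4 addresses of ipv4only.arpa (RFC 7050).
WELL_KNOWN_V4 = {
    ipaddress.IPv4Address("192.0.0.170"),
    ipaddress.IPv4Address("192.0.0.171"),
}


def nat64_prefix_from_aaaa(aaaa: str):
    """Return the /96 NAT64 prefix embedded in a synthesized AAAA answer.

    Returns None if the low 32 bits are not one of the well-known
    addresses. Assumption: the prefix length is /96.
    """
    addr = ipaddress.IPv6Address(aaaa)
    embedded = ipaddress.IPv4Address(int(addr) & 0xFFFFFFFF)
    if embedded not in WELL_KNOWN_V4:
        return None
    # Clear the low 32 bits to obtain the network part.
    network_int = (int(addr) >> 32) << 32
    return ipaddress.IPv6Network((network_int, 96))
```

In a real client the input would come from a resolver lookup for ipv4only.arpa. Here it is passed in as a string so the logic can be checked in isolation.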

Scalability of IPv6 Transition Technologies for IPv4aaS
Current draft here.
Slides from wg session here.

Again I think this is a relevant effort as, evidently, scalability considerations play a huge role once a transition technology gets deployed, but there’s not much existing work available (not in the methodology space, and not when it comes to real-life metrics & measurements). Their work/this draft might hence provide

  • some indication re: what’s realistic.
  • types of measurements to request from vendors.

Neighbor Discovery Protocol Deployment Guidelines
Current draft here.
Slides from wg session here.

Here I’m not certain if the IPv6 world needs this type of guidance/documentation, as the respective issues have already been extensively discussed for the last ten years, and several architectural or implementation-level approaches for dealing with ND (security) shortcomings have been developed (e.g. see ‘client isolation’ section in this post).

Requirements to Multi-domain IPv6-only Network
Current draft here.
Slides from wg session here.

This draft discusses some scenarios in multi-operator settings using v6-only, which I hadn’t thought about earlier. Interesting work to be followed.

Just Another Measurement of Extension header Survivability (JAMES)
Current draft here.
Slides from the wg session here.

Éric Vyncke supervises these measurements performed by Raphaël Léas and Justin Iurman from the University of Liège. This was also presented at the IEPG meeting covered by Geoff Huston in this blogpost. Important effort in general, and I always welcome IPv6 research work performed together with academia. Some may say the results are not too surprising 😉 – here’s a tweet with some data, and Geoff commented as follows:


6man

Video recording of the full session here.
Individual comments here.

IPv6 Hop-by-Hop Options Processing Procedures
Current draft here.
Slides from wg session here.

Taking the results from the last presentation in v6ops into account (see above), there might be a bit of irony here, but I found especially the discussion after the presentation quite enlightening.

Source Address Selection for foreign ULAs
Slides from wg session here.

In this one Ted Lemon spoke about an interesting scenario in a home network with multiple routers and multiple ULA prefixes, where certain destination hosts are not reachable from specific (source) hosts, due to a combination of factors (routers themselves ignoring RAs and hence not learning prefixes originated from other routers’ RAs & the way source address selection works as of RFC 6724). This talk triggered a long & interesting discussion. Some people stated that a misconfiguration must be present in the scenario (I don’t think there is, and I know a bit about the background of the talk/scenario), others stated that the C[P]E router ‘violated RFCs’ (namely RFC 7084 Basic Requirements for IPv6 Customer Edge Routers) which I think is a ridiculous stance. Still, overall a very good discussion which was helpful for identifying approaches for dealing with such situations.

I hope to be able to meet some of you, dear readers, at the upcoming RIPE meeting in Berlin. I even consider reviving the tradition of an ‘IPv6 Practitioners Dinner’ – let me know if you want to join.

RFC 9099 / Intro & Overview

Recently RFC 9099 Operational Security Considerations for IPv6 Networks was published. It was authored by Éric Vyncke, Kiran Kumar ‘KK’ Chittimaneni, Merike Kaeo and myself, and we plan to write a little series on its objectives & main recommendations on the APNIC Blog. To prepare for that let me provide a short overview of it in this post.

RFC 9099 was a long time in the making (nearly nine years! between the first Internet-Draft in the OPSEC working group and the final publication). As you’ll see in a second it covers many IPv6 areas which by themselves are at the centre of nearly religious debates (like filtering of extension headers, or ULAs + other addressing topics). Hence quite a few lengthy e-mail threads were created on the WG’s mailing list, which did not make reaching consensus any easier. Also at some point IETF procedures – this sounds better than ‘politics’, doesn’t it? 😉 – kicked in which led to additional delays (for those interested in this dimension of work within the IETF see Geoff Huston’s lucid Opinion: The making of an RFC in today’s IETF).

The document is focused on what we call ‘managed environments’ like service provider/operator networks or enterprise environments, and it is organized in several sections:

  • Addressing: evidently the addressing architecture chosen for a specific IPv6 deployment can have significant impact on a network’s security posture (when it comes to routing, traffic filtering or logging), so the various types of IPv6 addresses and their security implications are presented in detail in this section.
  • Extension headers: as those constitute one of the main technical differences between IPv4 and IPv6, and at the same time they have interesting (one could even write: ‘challenging’) security properties, they’re discussed in a dedicated section.
  • Link-layer security: examining the local communication mechanisms of IPv6 both from an offensive and from a defensive point of view makes up the main content of this section. Here all the stuff like NDP attacks, rogue router advertisements, and their related protection mechanisms is described. Again, this is an area where major differences between IPv4 and IPv6 exist.
  • Control plane security: very important topic from an infrastructure security perspective which is why it has its own section.
  • Routing security: same as for the previous section – overall very similar security best practices as in IPv4 networks have to be applied for IPv6 in this space as well, e.g. the excellent guidance provided in RFC 7454 BGP Operations and Security.
  • Logging/monitoring: some elements of the overall IPv6 architecture (like the ephemeral nature of IPv6 addresses, the fact that usually several of them co-exist on a given interface, or their general format) have significant impact on the way how logging and security monitoring are done in many organizations. These are looked at in detail in this segment.
  • Transition/Coexistence Technologies: from my experience various organizations underestimate the effort needed for properly securing dual-stack deployments (which btw is another argument for going v6-only where you can). Furthermore the use of tunnel technologies traditionally creates headaches for security practitioners, so they merit respective considerations (at least we thought so. This section was heavily contested during the development of the RFC as people thought that the related security challenges do not stem from IPv6 itself but mostly from operational deficiencies in IPv4 networks, namely those not aware of the concurrent presence of IPv6 in their world).
  • General device hardening: a security guidance document wouldn’t be complete without this, right? 😉
  • Enterprise-specific security considerations: deploying IPv6 in enterprise environments needs some additional reflections (see also RFC 7381 Enterprise IPv6 Deployment Guidelines) which is why we cover the security side of things in a dedicated chapter, which in turn is split into two subsections on external and on internal security.
  • Service provider security considerations: obviously operator networks need proper IPv6 security. While many of the needed security controls are already covered in earlier parts of the RFC some operator-specific aspects like lawful intercept are discussed here.

This post was meant to make you aware of RFC 9099 in case you didn’t know it before, and to provide a quick overview of its content. Additional posts with technical details on its individual areas will be published on the APNIC blog.

IPv6 in Enterprise Wi-Fi Networks

First of all, I wish all readers a very happy new year and all the best for 2022! May the force be with you for your IPv6 efforts ;-).

In this post I’m going to discuss some characteristics of IPv6 in common organization-level (as opposed to home networks) Wi-Fi deployments. These characteristics have to be kept in mind both during design & implementation and in the course of troubleshooting. Many IPv6 practitioners learn(ed) about IPv6 fundamentals in Ethernet networks (quick hint on terminology: in this post the term ‘Ethernet’ always means ‘wired Ethernet’ as of IEEE 802.3 standards, and ‘Wi-Fi’ refers to technologies in the context of IEEE 802.11), and it’s probably a safe assumption that the designers of IPv6 (in the 90s) mostly had such networks in mind when core parts of IPv6 and its communication behavior on the local link were specified. While IPv6 neighbor discovery (NDP) as of RFC 4861 strictly speaking supports many different link types (section 3.2), the protocol overview in section 3.3 heavily relies on multicast transport (which doesn’t make sense on certain link types). This is aligned with a mental model of IPv6 behavior that quite a few of us (practitioners) have, and which is based, among others, on the following assumptions:

  • (1) on the local link there are usually (at least) some neighbors, and if so, then interaction with them is possible by certain mechanisms like NS/NA messages.
  • (2) multicast is a somewhat reliable mechanism (otherwise NDP would be unreliable), and it has at least similar performance properties as broadcast (otherwise NDP would be slower than ARP in IPv4 which certainly wouldn’t have been an acceptable objective ;-).
  • (3) sniffing ICMPv6 messages (which encompasses all NDP packets incl. router advertisements) will provide an initial understanding of the local environment.

As we will see in the following, very often Enterprise-level Wi-Fi networks are implemented in a way that renders quite a few of these assumptions debatable. Again, it should be noted that the resulting differences do not apply to IPv6 in home networks which hence can be expected to work in a way that aligns with the above assumptions.
The mentioned differences mainly stem from two sources which is why it can be helpful to understand those first.

Assumptions & Security Properties

Wi-Fi networks are often treated slightly differently from a security perspective, based on certain assumptions incl. (but not limited to 😉 ) the following:

  • They are considered more hostile environments than ‘the trusted corporate LAN’ (based on thinking along the lines of “heard of those guys getting into our Wi-Fi network from the parking lot, via that compromised PSK?”). So more scrutiny is put onto basic network security measures (like just dropping certain packets, see 3rd point).
  • their traffic is expected to be primarily ‘eyeball traffic’ flowing from clients to servers either in the Internet or in the organization’s data centers, hence no need to communicate with other systems within the same Wi-Fi network/VLAN (as opposed to the Ethernet VLANs where a user/system might still need to reach that lab system under the desk, that printer over there, or the web interface of that building management system which is placed in the same VLAN, for ‘historical reasons’). In enterprise-grade Wi-Fi networks subsequently very often mechanisms to isolate clients from each other can be found (discussed in more detail below).
  • infrastructure systems like routers or DHCPv6 servers are expected to never reside in the Wi-Fi which is why packets supposed to originate from such systems (IPv6 router advertisements or DHCPv6 Advertise, Reply or Reconfigure messages) can be & actually get dropped by default. Please note that the presence of devices implementing Thread networking (like the HomePod mini) puts this approach into question, but that’s another discussion, and the respective filters might not even be (easily) configurable.

Handling of Multicast Traffic

For reasons laid out in section 3 of RFC 9119 Multicast Considerations over IEEE 802 Wireless Media the use of multicast transport in Wi-Fi networks brings various challenges which can heavily degrade the network performance and/or the battery life of connected devices (for the latter see also RFC 7772 Reducing Energy Consumption of Router Advertisements, and for the former it can be a good idea to read the excellent ‘Wireless Means Radio‘ article). That’s the reason for a number of related optimizations commonly found in enterprise Wi-Fi networks (or, for that matter, in conference networks, see Chris Werny’s talk on IPv6 in the Troopers Wi-Fi).

Now let’s look at some technologies in more detail, together with their impact on troubleshooting.

Client Isolation

This is a feature that blocks ‘direct’ connections between clients associated to the same WLC or the same AP. The actual technologies are vendor-specific (‘Peer-to-Peer Blocking’ in Cisco land or ‘Deny Inter-User Traffic’ in Aruba land) but the impact can essentially be broken down to: wireless clients can’t ‘see’/reach each other by means of unicast traffic nor by certain multicast traffic (which usually includes IPv6 NDP but *not* mDNS/LLMNR, so the latter commonly pass the boundary). It should further be noted that this feature is implemented on the WLC/AP level, so attackers might still be able to send packets directly to individual stations.
Impact on behavior, in particular in the context of troubleshooting:

  • the actual implementations of different vendors might vary, so one should be extra careful with conclusions. This applies to both handling of specific multicast traffic and to traffic to/from the Ethernet side of things (commonly at least some of this is passed — think: physical router sends RAs to ff02::1 — but other stuff might be dropped, e.g. neighbor solicitations to SNMA of individual Wi-Fi clients. Some devices allow configuring some properties, e.g. look for ‘Forward-Upstream’).
  • keep this feature in mind when troubleshooting connection issues with colleagues (‘can you ping my MacBook?’ might not work as expected ;-).
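
Since the solicited-node multicast address (SNMA) came up above: when you look at a capture and wonder which multicast group a neighbor solicitation for a given address should go to, it is derived from the low 24 bits of the target address, appended to ff02::1:ff00:0/104 (RFC 4291). A small sketch, with a function name chosen just for illustration:

```python
import ipaddress

# Base of the solicited-node multicast range ff02::1:ff00:0/104 (RFC 4291).
SNMA_BASE = int(ipaddress.IPv6Address("ff02::1:ff00:0"))


def solicited_node_multicast(addr: str) -> ipaddress.IPv6Address:
    """Return the solicited-node multicast address for a unicast address."""
    low24 = int(ipaddress.IPv6Address(addr)) & 0xFFFFFF
    return ipaddress.IPv6Address(SNMA_BASE | low24)


print(solicited_node_multicast("2001:db8::aabb:ccdd"))
```

If an NS to that group never shows up on the Wi-Fi side, that alone does not prove a fault. It may simply be one of the packets the WLC/AP drops or answers itself.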

Performance- or Security-oriented Optimizations of NDP Traffic

A number of mechanisms/configuration tweaks exist in the context of NDP (router advertisements and NS/NA packets). The most known ones are the following (the terminology is a bit Cisco-oriented, based on stuff we used to do at Troopers, but these features can be found, under one name or another, in most Enterprise-level Wi-Fi solutions):

  • RA Throttling: WLC/AP rate limits forwarding of RAs to Wi-Fi, based on certain thresholds & related timers. From an operator perspective one has to make sure that the Router Lifetime in the RAs exceeds the timers used here (see also section 4 of RFC 7772. Andrew Yourtchenko, one of its authors, used to use 9000s in one of his networks, see this post). Some years ago the default Router Lifetime on Junos was 180s which could lead to issues in networks using RA Throttling (wireless clients losing their default route as they did not receive a new RA before the default route generated from the last received RA timed out).
  • Unicast RAs: router advertisements sent in response to a RS are only sent to unicast address of requesting node (instead of sending them to the all-nodes multicast address/ff02::1. RFC 4861 states [in section 6.2.6] that a router ‘MAY’ do this, so it’s a valid, and commonly used, approach).
  • ‘NDP proxy’: when using this feature the WLC responds to NS packets from the Ethernet side by sending NAs ‘on behalf’ of Wi-Fi stations. At this point it can also convert (for unknown MAC addresses) the multicast NS into a unicast packet sent to the MAC address of the wireless client, and some implementations have a dedicated mode for DAD. See also RFC 8929 for a technical description of an ‘ND proxy’.
  • RA Guard (I tested this some years ago with surprisingly solid results).
  • IP Source Guard: this is a security feature that checks MAC address-to-IP(v6) address bindings. From an operations perspective one may keep in mind that there’s a threshold of IPv6 addresses which can be associated with one MAC address (iirc, on Cisco devices it’s eight [8]), and subsequently apparent violations might occur once clients regularly generate privacy addresses after coming back from sleep mode or similar. While I’ve never seen this irl I’m not sure which risk is supposed to be mitigated by the feature anyway (connectionless spoofing of a station’s IP address by another? for which attack vector? who would ever do this?).
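
The Router Lifetime vs. RA Throttling relationship can be written down as a simple sanity check. This is only a sketch under assumptions: the parameter names are made up, `max_ra_gap_s` stands for the longest period during which the WLC/AP might not forward any RA to a client (which depends on the vendor's thresholds and timers), and the tolerance for lost RAs is an arbitrary safety margin of mine.

```python
def router_lifetime_ok(router_lifetime_s: int,
                       max_ra_gap_s: int,
                       tolerated_lost_ras: int = 2) -> bool:
    """Check whether clients keep their default route under RA throttling.

    Assumption: in the worst case a client sees one RA per max_ra_gap_s,
    and we want the default route to survive `tolerated_lost_ras`
    consecutive RAs getting lost on the air.
    """
    worst_case_gap = max_ra_gap_s * (tolerated_lost_ras + 1)
    return router_lifetime_s > worst_case_gap


# A 180s lifetime with a 600s worst-case gap is asking for trouble,
# a 9000s lifetime is not.
print(router_lifetime_ok(180, 600))
print(router_lifetime_ok(9000, 600))
```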

Impact on behavior, in particular in the context of troubleshooting:

  • these features are vendor-specific. Their default settings, configuration approaches, and working modes might vary, even between devices from the same vendor (e.g. see this thread).
  • expected behavior re: link-local traffic might differ from observed behavior (certain NDP messages not seen on Wi-Fi due to controller interaction, RAs seemingly missing due to RA throttling etc.)

‘Mobility’ / Layer 2 Will Never Be the Same

In order to allow stations to physically move/to roam between areas covered by different APs, all modern controller-based Wi-Fi solutions implement techniques that span kind-of virtual Layer-2 domains across multiple APs or even across multiple controllers. Furthermore traffic can be tunneled between controllers over Ethernet over IP (EoIP) — this is often, but not only used for Wi-Fi guest networks — which then includes so-called anchor controllers providing a break-out point of the traffic towards certain parts of the corporate network or to the Internet. The main thing to keep in mind here is that a neighbor (in IPv6 terms) can actually be a system separated from a vantage point by many Layer-2 and Layer-3 devices/hops (this is the same in VXLAN environments, but from my experience in Wi-Fi space diagnosing errors might be more difficult due to lack of proper tooling or even proper CLI access/commands).

Impact on behavior, in particular in the context of troubleshooting:

  • from an operator perspective one should note that any type of tunneling can have an impact on the MTU, an area where IPv6 traditionally does not have a great reputation, so to say (e.g. see this Cisco bug).
  • while troubleshooting carefully consider the impact of (all of) the above technologies. For example imagine you don’t see an NA after a station has sent an NS for the default router’s IPv6 address. The actual traffic flow could easily be the following:
    for the NS packet: station sends multicast packet -> AP -> (over a tunnel protocol to) WLC -> Ethernet -> SVI
    for the corresponding NA: unicast packet all the way back, but potentially slightly different path (AP).
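
On the MTU point: the arithmetic is trivial, but it can be worth writing down during design. A minimal sketch, in which the overhead value is a placeholder to be replaced with whatever your tunnel encapsulation actually adds (I'm deliberately not claiming a number for any specific product):

```python
IPV6_MIN_MTU = 1280  # minimum link MTU every IPv6 link must support (RFC 8200)


def inner_mtu(underlay_mtu: int, tunnel_overhead: int) -> int:
    """MTU left for the client's packets once tunnel headers are added."""
    return underlay_mtu - tunnel_overhead


def tunnel_supports_ipv6(underlay_mtu: int, tunnel_overhead: int) -> bool:
    """True if the tunneled path can still carry minimum-size IPv6 links."""
    return inner_mtu(underlay_mtu, tunnel_overhead) >= IPV6_MIN_MTU


# Hypothetical overhead of 60 bytes on a 1500-byte underlay.
print(inner_mtu(1500, 60))
print(tunnel_supports_ipv6(1500, 60))
```

Anything between 1280 and the clients' assumed 1500 still means somebody has to signal the reduced MTU (or Packet Too Big messages have to make it back to the sender), which is where things tend to go wrong.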

tl;dr: In Wi-Fi networks usually a number of techniques can be found which interact with IPv6 in various ways. As a network designer you should be familiar with those technologies. When doing troubleshooting in such networks it might be helpful to keep them in mind, too ;-).

Thank you for reading so far. I’m always happy to receive feedback, either here on the blog or via Twitter. Happy IPv6 networking to you all!

Disaggregated Security Enforcement / Self-service ACLs

In large environments security controls based on packet filtering, such as firewalls and ACLs on network devices, often face an unfortunate dilemma: there’s a gap between the parties understanding the communication needs of an application (say: the application owners) and the parties implementing the actual security enforcement (e.g. the firewall ops team). Those also have different motivations: “it has to work” (see RFC 1925 rule 1 😉) for the former group versus “it has to be secure = fulfill certain security objectives” for the latter. This gap can manifest in many socio-technical ways, which is the reason why ‘firewall rule management’ has been the subject of many discussions over recent years. In another post which I wrote a few years ago I stated that going for the upper-right quadrant in the following diagram usually requires high operational effort (which can actually produce the opposite outcome due to added process complexity), a high level of automation, accepting trade-offs, or a combination of these.

That’s why several organizations are considering another approach or have already started deploying it. Here I’ll call it ‘self-service ACLs’, and it can be summarized as follows:

  • move (the enforcement function) of packet filters towards the hosts (e.g. via ip[6]tables running locally or some set running on a network device ‘just in front’ of a group of hosts, e.g. a VPC).
  • provide a nice web-based management interface to these rules
  • store all rules in a centralized database
  • allow application teams themselves to manage the rules. Besides the technical decentralization (or to put into more familiar lingo from the networking space: ‘disaggregation’), this one constitutes the main paradigm shift.
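
To make the model a bit more tangible, here is a minimal sketch of what a centrally stored rule and its rendering for host-level enforcement could look like. Everything in it is an assumption for illustration: the `Rule` fields, the team name, and the choice of rendering into an `ip6tables` command for the INPUT chain. A real deployment would add persistence, authorization and rule ordering.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One allow rule as it might sit in the central database (hypothetical)."""
    owner: str      # application team that manages the rule
    source: str     # IPv6 prefix allowed to connect
    protocol: str   # "tcp" or "udp"
    dport: int      # destination port on the protected host


def to_ip6tables(rule: Rule) -> str:
    """Validate a rule and render it as an ip6tables command line."""
    # IPv6Network raises ValueError on malformed prefixes or set host bits.
    src = ipaddress.IPv6Network(rule.source)
    if rule.protocol not in ("tcp", "udp"):
        raise ValueError(f"unsupported protocol: {rule.protocol}")
    if not 0 < rule.dport < 65536:
        raise ValueError(f"invalid port: {rule.dport}")
    return (f"ip6tables -A INPUT -s {src} -p {rule.protocol} "
            f"--dport {rule.dport} -j ACCEPT")


print(to_ip6tables(Rule("web-team", "2001:db8:1::/64", "tcp", 443)))
```

Note that the `owner` field is the interesting part here: it is what the rest of the lifecycle (who sees the logs, who reviews the rule) would have to key on.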

The underlying idea is simple: “let the owners of an asset/a service handle what they need, in a flexible manner”, without all those organizational or process-induced gaps, and it seems like a good idea to solve the issues I laid out above.

Alas – as so often with simple ideas seemingly solving complex problems – there are some often over-looked pitfalls. These are going to be the subject of this post.
Quick disclaimer re: terminology: I’ll use the terms ‘rules’, ‘firewall rules’, and ‘ACLs’ interchangeably. Just think of rules being part of a larger rule set, implementing packet filtering based on a traditional approach of sources, destinations and services with the former represented by IP addresses/ranges and the latter by protocols or ports in some notation.

Let’s start with looking at the lifecycle/dimensions of such a rule:

  • (1) a ‘management’ step/function, like the creation of the rule (by some party) or the modification of the rule (by some party)
  • (2) the actual enforcement function of the rule
  • (3) logging (of certain enforcement events, e.g. dropping a packet)
  • (4) analysis of a rule (e.g. as an intellectual exercise performed in certain life situations 😉 or as a contributing element to metrics)
  • (5) troubleshooting network communication flows which often involves the functions (3) and (4).
  • (6) review

In the ‘traditional model’ most of these were performed by the same party (‘firewall ops team’), but here the self-service model induces significant changes. The expected benefit is centered around moving (1) (into the hands of the app owners holding the contextual intelligence) and (2) (topographically towards the assets actually needing the protection). Sadly this also brings changes to the other functions, with some interesting effects. Let’s look at two affected functions/lifecycle elements: logging and review.


Logging

In infosec circles there’s an old adage ‘each security layer should provide logging’. Let’s assume the log files are still written to a central place (this is what most organizations do, for a variety of reasons, and I for one think that this makes sense). This can create interesting situations:

  • The new owners of the rules will, somewhat legitimately, think that they own the logs, too. (“These are our rules, we manage them, and we should be able to see what’s happening”).
  • How do they then get access to the (centralized) log files?
  • More importantly: get access in a tenant-proper way? (you don’t want the database team to be able to see the log files of the authentication servers, do you?)

I’ve yet to see an organization which has solved this problem in a way that fulfills the requirements of the different parties. So one might have to accept some trade-off here (e.g. the loss of visibility into log files for one of the involved parties).

Rule Review

A similar conflict of interests arises in the context of rule review. Can one reasonably expect the party whose main interest is essentially ‘it has to work’ to perform a review of rules based on corporate policy, PCI requirements or the like? Again, this would be an inherent dilemma only solvable by a high degree of collaboration (self-service ACLs are often supposed to reduce needed collaboration). On the other hand rule review might be a bit of a dysfunctional process in some organizations anyway, as this recent Twitter poll seems to imply 😉

General Paradigm Shift

Finally one should keep in mind that the introduction of self-service ACLs can mean a cultural shift for application teams, from opening tickets (which includes a partial transfer of responsibility) to managing rules (= security controls) in their own hands (which in turn also requires developing security skills & practice). Not all app teams might be happy with that; especially those running core applications with high availability requirements might be a bit risk averse in this context ;-).

tl;dr: While self-service ACLs can address long-existing process-level deficiencies in some organizations, they might well introduce new ones. Understanding the demarcations of the individual functions within a rule’s lifecycle, and the incentives of the different involved parties, will be crucial for a successful deployment of the approach.

IPv6 Reporting

I know that some of the readers of this blog are IPv6 cheerleaders in their respective organizations, and as such they might occasionally face questions along the lines of “what’s the state of IPv6 in our company?” or “are we progressing IPv6-wise?” (the latter in particular when dedicated resources are spent on the IPv6 transition on a regular basis, as opposed to those 50-person-days “let’s get ready for IPv6 (by writing some documents)” projects once every few years).
In this post I’ll discuss some approaches to make IPv6 progress within an environment visible.

For those interested in IPv6 progress in the global Internet or on a country level, an overview of the main sources for numbers can be found here. When undertaking a similar exercise for an individual organization let’s keep in mind that in general reporting in corporate life has three main aspects:

  • a question (to be answered, or: which message do we want to convey by means of the reporting effort?)
  • audience (who’s the recipient of the message?)
  • method (which approach/tooling/communication channel etc.)

A main differentiator between other types of measurements commonly found in large enterprises like “number of systems being on the latest OS patch level” and IPv6 reporting is that in our case usually two dimensions are of interest:

  • (1) to what extent are we ready/prepared for IPv6?
  • (2) to what extent is IPv6 (traffic) really happening?

On the one hand those are fundamentally different dimensions (or: ‘questions’ to be answered to the audience of the respective reporting efforts), on the other hand there are relationships between them. Still I think it’s important to understand the differences. I know quite a few companies where the question ‘do we support IPv6?’ asked to the ‘network infrastructure folks’ would be answered with a resounding ‘yes, of course, we do!’. However that does not necessarily mean that much real-life application traffic happens over IPv6 in those environments. As long as there are no applications or services actually being IPv6-enabled, that nice positive answer from the network people might not be of much practical value, unfortunately (don’t get me wrong dear networking colleagues, of course I know it all starts with the network…). And enabling IPv6 for applications and services usually depends on certain core infrastructure services (those in the realm of authentication, provisioning, monitoring, or security are common examples; see also the ‘three dimensions of IP addresses’ discussed here). From that perspective displaying the current (IPv6-related) state of those (as I call them) dependencies might be more important – at least for a certain audience – than just coming up with numbers of ‘dual-stacked hosts’ (which might not use IPv6 at all as there’s nobody reachable to talk to over IPv6 ;-).

Let’s look at some typical parameters of the two above categories.
In the space of (1) the following ones come to mind:

  • State of important dependencies, maybe even in a non-numeric way (like, per dependency: ‘no ongoing v6 efforts’, ‘has v6 in dev’, ‘has v6 in production’ etc.)
  • Number of dual-stacked hosts. Assuming that the default preference of OSs is to use IPv6 when possible (of course it’s more complicated considering Happy Eyeballs and Java options/preferences) this gives an idea of ‘which percentage of our systems would use IPv6 if they had the opportunity [read: when they connect to proper IPv6 endpoints]?’.
  • When many users work from home (like in quite a few organizations nowadays, in October 2021), looking at some stats related to VPN connections might be of interest. (hint: in such times bringing IPv6 to your VPN should be a high priority effort anyway if you want to drive IPv6 across your organization).
  • number of IPv6-enabled VIPs on your load balancers in case those are infrastructure elements being used in your organization. This is a particularly interesting one as one can assume that such a VIP is only created when other infrastructure requirements are already met (read: when an application team is serious about supporting IPv6 connectivity to their respective application, and some dependencies as for authentication, monitoring or logging are already solved).
  • subnets which have IPv6. Again, this one does not tell anything about actual IPv6 traffic, but about the ‘level of preparedness’ (which is the overall purpose of this category).
  • number of AAAA records in DNS. Depending on the point of time (within an application’s journey towards IPv6) when those are created this might be an interesting indicator. For example during LinkedIn’s IPv6 initiative they deliberately did this rather late during the infrastructure services transition (slides and video of their related talk at the UK IPv6 Council can be found here), so the mere existence of an AAAA record meant serious (IPv6) business.
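
As an example for a category (1) number, the share of dual-stacked hosts can be derived from whatever inventory lists addresses per host. A minimal sketch, assuming (my assumption, not a given) that the inventory can be exported as a mapping of host name to address strings, and that link-local addresses should not count as ‘having IPv6’:

```python
import ipaddress


def dual_stack_share(inventory: dict) -> float:
    """Fraction of hosts with an IPv4 and a non-link-local IPv6 address.

    `inventory` maps a host name to a list of address strings
    (hypothetical export format).
    """
    if not inventory:
        return 0.0
    dual = 0
    for addresses in inventory.values():
        parsed = [ipaddress.ip_address(a) for a in addresses]
        has_v4 = any(a.version == 4 for a in parsed)
        has_v6 = any(a.version == 6 and not a.is_link_local for a in parsed)
        if has_v4 and has_v6:
            dual += 1
    return dual / len(inventory)


hosts = {
    "app01": ["192.0.2.1", "2001:db8::1"],
    "app02": ["192.0.2.2", "fe80::1"],   # link-local only: not counted
    "db01": ["192.0.2.3"],
    "k8s-node": ["2001:db8::4"],         # IPv6-only: not dual-stacked
}
print(dual_stack_share(hosts))
```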

For some of the above values dashboards with ongoing data might be proper (display) instruments, but that would then depend on the ability of collecting the respective numbers in an automated manner, and on the audience the reporting is intended for (not everybody likes clicking on a link to a dashboard and interpreting numbers on their own, and some recipients might prefer to get a quarterly e-mail with some core numbers which can be digested in 10 minutes, on an iPad).

Typical category (2) parameters, to show that IPv6 is actually happening (as in: end-to-end IPv6 connections take place over the network), include the following:

  • all sorts of traffic statistics (e.g. from NetFlow) which usually show the ratio of IPv6 traffic to overall traffic. Here one should keep in mind that a single traffic-heavy application getting IPv6 might suggest significant progress which has not necessarily happened in the space of infrastructure dependencies. This is exactly why both main categories (and being transparent with regard to their differences) are important.
  • similar stats for connections to major applications or infrastructure services (e.g. authentication servers). Bonus: measure associated RTT or latency from vantage points or sth similar, as a follow-up question from the audience might be: “ok, and now tell us how application performance compares between IPv6 and IPv4”.
  • number of IPv6-only hosts. Assuming that those hosts establish or receive whatever type of network connections (and why else would they exist in the first place ;-), this one can really provide insight into IPv6 progress within an environment. To avoid misinterpretations in rapidly growing environments, ideally put this in relation to overall number of hosts, in case you have that number at hand (see the poll at the beginning of this post on the APNIC blog why I mention this ;-). Also please note that I use the term ‘host’ here in a loose manner which encompasses containers and other types of ephemeral entities having an IP address.
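The caveat about a single traffic-heavy application can be made visible by reporting the IPv6 share per application next to the overall share. The following is a minimal sketch; the flow tuples (application, IP version, bytes) are a made-up simplification and not the format of any particular NetFlow collector.

```python
from collections import defaultdict

def ipv6_share(flows):
    """Return (overall IPv6 share, per-application IPv6 share), in percent."""
    totals = defaultdict(lambda: {4: 0, 6: 0})
    for app, version, byte_count in flows:
        totals[app][version] += byte_count

    def share(v4_bytes, v6_bytes):
        total = v4_bytes + v6_bytes
        return round(100 * v6_bytes / total, 1) if total else 0.0

    overall_v4 = sum(t[4] for t in totals.values())
    overall_v6 = sum(t[6] for t in totals.values())
    per_app = {app: share(t[4], t[6]) for app, t in totals.items()}
    return share(overall_v4, overall_v6), per_app

# Hypothetical sample: one bulk application dominates the IPv6 byte count.
flows = [
    ("backup", 6, 900_000),
    ("backup", 4, 100_000),
    ("auth", 4, 50_000),
    ("mail", 4, 150_000),
    ("web", 4, 280_000),
    ("web", 6, 20_000),
]
overall, per_app = ipv6_share(flows)
print(overall)   # 61.3
print(per_app)   # backup is at 90.0 while auth and mail are still at 0.0
```

In this made-up sample the overall ratio looks like solid progress, while the per-application view shows that authentication and mail have not moved at all.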

I hope the above provides some inspiration for those dealing with the task of making IPv6 progress within an environment visible. Always happy to receive feedback of any type either here or on Twitter. Thank you for reading so far, and good luck with your IPv6 efforts.

The Role of IP Addresses in Security Processes

Reflecting on IP addresses, and about factors contributing to having a proper inventory of active ones, recently led me to putting up a Twitter poll. Here are the results:

Looking at these numbers it seems that quite a few organizations struggle with maintaining a more or less accurate inventory of active addresses in their networks. At this point infosec purists may stress the importance of a thing called asset management (I mean there’s a reason why it’s the 1st category in the 1st function of the NIST Cybersecurity Framework, right? ;-), but I for one just felt I should reflect a bit more on the role of IP addresses for certain processes within an organization. Not least as related aspects and questions may become even more important once a whole new class of addresses enters the corporate infosec ring: enter IPv6.
Let’s hence imagine there’s a certain number of systems in your organization’s network, and some of those systems are now getting IPv6. Which security processes could potentially be affected (read, in a sufficiently large organization: which teams should know about the ongoing IPv6 effort? ;-).

From a simplified perspective security processes can be grouped into the following categories (those interested in other categories find some here):

  • preventative ones (‘avoid that a threat can materialize against an asset’). For our purpose here let’s take filtering of network traffic and patch management as examples.
  • detective ones: these can include the detection of deviations of a desired security state like the detection of vulnerabilities (=> vulnerability management) or the detection of security violations (e.g. by system-local mechanisms like log files or agents, or by means of network [security] telemetry).
  • reactive security processes, e.g. incident response.

Traffic filtering is usually mainly done in one of these two flavors:

  • gateway-based filtering (firewalls or router ACLs). Once this is used it may be of (security) interest if there’s at least one active IP/system of a specific address family (here: IPv6) in use within a given subnet.
  • local packet filtering. Once the respective rules are centrally managed those in charge better know about active IP speakers, I’d say. Once those are not centrally managed (which is the case in many environments), first+foremost one might ask: who manages them at all? 🤔😂
    Kidding aside, evidently for those entities responsible for the configuration of local packet filters knowledge of active addresses of all address families is probably valuable.

To properly perform patch management one usually has to know: which systems are out there? And how does one identify those systems? For example, in environments using MS Active Directory probably all domain members can be identified by some AD-inherent logic, and other systems might be identifiable by means of their FQDNs. Still it’s a safe assumption that at least some fraction of to-be-patched systems are identified by their IP address(es), so having a proper inventory can be of help here, too.
Evidently in the course of patch management one might not only be interested in actively running systems, but also & namely in their OS + software components and their respective versions. Coming up with that type of information commonly falls into the realm of vulnerability management. The vast majority of vulnerability management tools & frameworks I’m aware of primarily use addresses as identifiers of systems. From that angle an accurate inventory of addresses certainly helps. The advent of IPv6 might bring some extra challenges here, as simply scanning IP subnets for active systems (in order to subsequently subject those to vuln scanning) doesn’t work any longer with IPv6, so the need of a proper source of truth becomes more crucial (I already discussed this in the IPv6 Security Best Practices post).
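To put the 'scanning doesn't work any longer' statement into numbers, here is a small back-of-the-envelope calculation. The probe rate of one million addresses per second is an arbitrary (and rather generous) assumption, and the prefixes are documentation examples.

```python
import ipaddress

SECONDS_PER_YEAR = 365 * 24 * 3600

def sweep_time_seconds(prefix, probes_per_second):
    """Time needed to probe every address of a prefix once."""
    network = ipaddress.ip_network(prefix)
    return network.num_addresses / probes_per_second

RATE = 1_000_000  # assumed probes per second

v4 = sweep_time_seconds("192.0.2.0/24", RATE)           # 256 addresses
v6 = sweep_time_seconds("2001:db8:320:104::/64", RATE)  # 2**64 addresses

print(f"IPv4 /24: {v4:.6f} seconds")
print(f"IPv6 /64: {v6 / SECONDS_PER_YEAR:,.0f} years")  # roughly 585,000 years
```

Even a single /64 is out of reach for a brute-force sweep, which is why the list of addresses to scan has to come from an inventory or another source of truth.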

Finally in the context of incident response generally situational awareness (what’s happening, which assets are affected, what’s the source of certain things going on etc.) is needed. I could imagine that the ability to map IP addresses to systems can be helpful here (for identification, evaluation, follow-up), so a proper IP address inventory might be of value, so to say.

tl;dr: having an understanding of active IP addresses within an organization affects (at least) the following security controls/processes:

  • (potentially) traffic filtering
  • (potentially) patch management
  • (definitely) vulnerability management
  • (definitely) detection of security violations
  • (definitely) incident response

So when deploying IPv6 in an environment, talking to the owners of these processes is needed, in order to make sure that IPv6 does not lead to an increased risk exposure.

Quick Intro to IPv6

This post strives to provide an overview of where (and why) IPv6 is different from IPv4. The intended audience are folks with a solid understanding of IPv4 but not too much exposure to IPv6 so far (I hear such an audience still exists ;-), and the post is intentionally kept short (regular readers of this blog may imagine that I’d love to extensively rant on several of the below items; some of them would deserve full posts on their own). Also I won’t go into technical detail too much.
In a nutshell the post tries to summarize why, under the hood, IPv6 is quite different from IPv4, and what those differences are.

Design Objectives

In order to understand certain elements of IPv6, it’s helpful to keep in mind that it was mainly developed in the mid-90s. It hence tried to solve some of the issues & challenges found in networking at the time, besides introducing general new ideas.
For the purpose of this post the following objectives are of interest:

  • autonomy: hosts should be able to come up with the configuration of basic IP parameters on their own, without the need for human intervention/administration or the need for additional services (like DHCP).
  • (restoration of the) end-to-end principle: hosts should be able to communicate with each other without ‘the network’ providing functions besides simple packet forwarding.
  • optimization: come up with some changes that ‘make network communications more efficient and hence faster’ such as replacing broadcast by multicast, or simplifying the IP header.

And, yes, the one main reason why IPv6 gets deployed in many environments today, that is larger address space, played a role, too.

The Unpleasant Reality

In case you’ve already been working a bit with IPv6, and you have scratched your head while reading the above section, thinking sth along the lines: “wait a second, those ideas haven’t really worked out, they’ve only been implemented half-baked, or they have even added a lot of complexity and operational pain”, you’re fully right.
Several factors have contributed to the mess we have today, like:

  • Dynamics within standard bodies based on (seemingly) voluntary work like the IETF, including the composition of working groups, their politics plus the associated way of finding compromises etc.
  • Over-engineering in general, further fueled by certain incentives within those working groups
  • Lack of ‘feedback from the field’: during the first 15 years or so after the initial specification not much deployment happened, so nobody told those well-intentioned and smart – seriously, no irony here: they are, but the ecosystem is complex in itself – engineers that what many networks needed was just more address space, and that all the other shiny enhancements and features primarily introduced complexity and operational efforts. And things got worse with every year that passed, with regard to protocol complexity, and with regard to the inability to make fundamental changes of the design.

I’m aware that some of these things and developments are hard to understand from a technical perspective or from a 2021 point-of-view, but protocol development doesn’t happen in a vacuum, and of course hindsight is always 20/20.
Point is: from a deployment perspective, accept IPv6 as it is (the boat for significant changes has long sailed), and draw the right conclusions for your operational decisions.

Technical Elements & Changes

In this section I’ll list some of the main technical elements of IPv6 which are new for those coming from IPv4, and which play a huge role in both the way an IPv6 stack works (by default) and how ‘an IPv6-enabled network’ behaves in general:

  • Router advertisements (‘RAs’): these are packets sent out by routers to their adjacent networks which carry information that enables hosts to perform autoconfiguration (remember the above autonomy objective). Understanding these packets, and their operational implications is crucial for smooth operations of the vast majority of IPv6 networks. I might add here that, based on some of the factors of the 2nd section, RAs are super-complex packets themselves, so they are somewhat metaphorical for the state of IPv6 ;-).
  • The link-local address (‘LLA’): in contrast to IPv4 where one and the same address is usually used both for communication within a subnet and with remote hosts, IPv6 strictly differentiates between local communication and non-local communication (the latter happening through a router/’the default gateway’). This differentiation includes a special address only used for local purposes. It uses the prefix fe80::/10.
  • Multicast: the approach of ‘general broadcasting’ when communication with multiple or ‘unknown’ hosts in the local subnet is needed, was replaced by using multicast groups (their addresses start with ‘ff’) for these types of communication. Combined with new/additional interactions (like RAs), at least in the local network (the ‘local link’ in IPv6 terms) one will usually see a lot of multicast traffic with different addresses, and for different purposes. Evidently this has a number of operational implications (which, again, are outside the scope of this post).
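The prefixes mentioned above (fe80::/10 for link-local addresses, 'ff' for multicast) can be checked with Python's standard ipaddress module. This is just a small sketch; the sample addresses are illustrative.

```python
import ipaddress

def describe(text):
    """Classify an IPv6 address by the two special ranges discussed above."""
    addr = ipaddress.IPv6Address(text)
    if addr.is_multicast:      # ff00::/8
        return "multicast"
    if addr.is_link_local:     # fe80::/10
        return "link-local unicast"
    return "other"

for sample in ("fe80::1", "ff02::1", "ff02::2", "2001:db8::1"):
    print(f"{sample:12} {describe(sample)}")
```

ff02::1 (all nodes) and ff02::2 (all routers) are two of the link-scoped multicast groups one will typically see on a local link, e.g. as destination of RAs and router solicitations.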

To some lesser extent one could add IPv6 extension headers (EHs) to this list, but – luckily – there’s a fair chance that many of you joining the IPv6 world in 2021 won’t ever see them in operational practice (besides security filters dropping them), so no need to discuss them further here.

As one can see, quite a few architectural changes have happened between IPv4 and IPv6. Understanding them can help to make well-informed decisions during the deployment of IPv6.

IPv6 Duplicate Address Detection

In this post I’ll take a closer look at IPv6 Duplicate Address Detection (aka ‘DAD’, which evidently invites all types of jokes and wordplays). While the general mechanism should be roughly familiar to everybody working with IPv6 there are some interesting intricacies under the hood, some of which might even have operational implications.

DAD was already part of the initial specification of SLAAC in RFC 1971 (dating from 1996), which was then obsoleted by RFC 2462. RFC 4429 describes a modification called ‘Optimistic Duplicate Address Detection’. Neighbor discovery and SLAAC, incl. DAD, were later updated/specified in the RFCs 4861 and 4862 which are considered the main standards as of today. Finally DAD was enhanced in RFC 7527 but that’s of minor relevance here.

Its goal is to avoid address conflicts (within the scope of a respective address). To do so it is supposed to perform a specific verification procedure (‘ask a certain question’) and subsequently to act on the result of that procedure. However, as we will see, namely the latter can depend on a number of circumstances, in particular on the type of the address/IID.

How to ask the question?

Generally speaking a host is expected to perform the following (for a given unicast address):

  • send a Neighbor Solicitation (ICMPv6 type 135) message.
  • use the unspecified address (::, see RFC 4291, section 2.5.2) as source address, the requested unicast address’s Solicited-Node multicast address (SNMA, see RFC 4291, section 2.7.1) as destination address and put the to-be-used unicast address as target address into the ICMPv6 payload.
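The addresses involved in such a Neighbor Solicitation can be derived with a few lines of code. The following minimal sketch computes the Solicited-Node multicast address (ff02::1:ff plus the low-order 24 bits of the unicast address, as of RFC 4291) and the corresponding Ethernet multicast address (33:33 plus the low-order 32 bits of the IPv6 multicast address, as of RFC 2464). The unicast address used is just a documentation-prefix example.

```python
import ipaddress

def solicited_node_multicast(unicast):
    """ff02::1:ffXX:XXXX built from the low-order 24 bits of the address."""
    low24 = int(ipaddress.IPv6Address(unicast)) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)

def ethernet_multicast_mac(group):
    """33:33 followed by the low-order 32 bits of the IPv6 multicast address."""
    low32 = int(ipaddress.IPv6Address(group)) & 0xFFFFFFFF
    octets = [0x33, 0x33] + list(low32.to_bytes(4, "big"))
    return ":".join(f"{octet:02x}" for octet in octets)

tentative = "2001:db8:320:104::9"   # example address to be verified by DAD
snma = solicited_node_multicast(tentative)

print("IPv6 source:      ::")
print("IPv6 destination:", snma)                          # ff02::1:ff00:9
print("Ethernet dest.:  ", ethernet_multicast_mac(snma))  # 33:33:ff:00:00:09
print("ICMPv6 target:   ", tentative)
```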

This can look like this (ref. RFC 2464 for the ’33:33′ in the Ethernet multicast address):

It should be noted that RFC 4862 states that “Duplicate Address Detection MUST be performed on all unicast addresses prior to assigning them to an interface, regardless of whether they are obtained through stateless autoconfiguration, DHCPv6, or manual configuration”, but in practice this can be turned off on the OS level (and there might even exist situations where this could be desirable, see below). Still, the general verification procedure is mostly identical on the vast majority of operating systems.

Shall we wait for a response?

This is where the differences between scenarios start. As stated above RFC 4429 describes a thing called ‘Optimistic DAD’. The idea here is to put an address into an ‘optimistic’ state right after sending out the NS and thereby make the address operational pretty much immediately (with some minor restrictions like not sending certain packets with a Source Link-Layer Address Option [SLLAO] from said address). This optimization is supposed to be used when – as of RFC 4429 section 3.1 – “the address is based on a most likely unique interface identifier” such as an EUI-64 generated one, a randomly generated one (Privacy Extensions, RFC 4941, more info here), a Cryptographically Generated Address (as for example used by Apple devices, see here) or a DHCPv6 address (note that the concept of ‘stable’ addresses as of RFC 7217 did not exist at the time). Optimistic DAD explicitly “SHOULD NOT be used for manually entered addresses”.
As of today it’s a fair assumption that all ‘client operating systems’ use Optimistic DAD, as can be observed in the above example, but this does not apply to servers using static addresses. This is what it looks like on macOS Big Sur (note that the router solicitation is sent just two milliseconds after the DAD neighbor solicitation)

What if the response indicates a conflict?

This is where things (differences) become really interesting. While RFC 4429 has a dedicated section on the ‘Collision Case’ (sect. 4.2), it remains relatively vague, includes terms like ‘hopefully’ 😉, and states that an address collision “may incur some penalty to the ON [optimistic node], in the form of broken connections, and some penalty to the rightful owner of the address” (which doesn’t sound right to me…).
RFC 4862 mandates (in “5.4.5.  When Duplicate Address Detection Fails”) that in case of a collision of an EUI-64 generated address the IPv6 operation of the respective interface “SHOULD be disabled”, but “MAY be continued” in other (address generation) scenarios. Furthermore “the node SHOULD log a system management error”.
An interface with a static address where DAD failed could look like this:

inet6 2001:db8:320:104::9/64 scope global tentative dadfailed 
valid_lft forever preferred_lft forever
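For monitoring purposes such a state can be picked up from the output of ‘ip -6 addr show’. The following minimal sketch assumes the output format shown above; the second address in the sample is made up for illustration, and the function name is hypothetical.

```python
import re

ADDR_LINE = re.compile(
    r"^\s*inet6\s+(?P<addr>\S+)\s+scope\s+(?P<scope>\S+)(?P<flags>.*)$"
)

def dad_problems(ip_output):
    """Return {address: [flags]} for addresses still tentative or dadfailed."""
    problems = {}
    for line in ip_output.splitlines():
        match = ADDR_LINE.match(line)
        if not match:
            continue  # e.g. the 'valid_lft ...' continuation lines
        hits = [flag for flag in match.group("flags").split()
                if flag in ("tentative", "dadfailed")]
        if hits:
            problems[match.group("addr")] = hits
    return problems

sample = """\
    inet6 2001:db8:320:104::9/64 scope global tentative dadfailed
       valid_lft forever preferred_lft forever
    inet6 fe80::1/64 scope link
       valid_lft forever preferred_lft forever
"""
print(dad_problems(sample))
```

On a live system the input could come from something like subprocess.run(["ip", "-6", "addr", "show"], capture_output=True, text=True).stdout.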

So, overall no guidance is provided here how to proceed in case of a detected conflict for addresses based on RFC 3972 (CGAs), RFC 4941 (Privacy Extensions) or RFC 7217 (‘Stable IIDs’), but this may be specified in other places (see below), and/or might be left to the implementors of individual OS stacks. Many years ago Christopher Werny and I performed some testing for Windows and Linux, creating various scenarios with address collisions, and off the top of my head I recall that their behavior was both quite different and not necessarily intuitive (sorry I don’t remember details).

CGAs have a dedicated Collision Count parameter which can be “incremented during CGA generation to recover from an address collision detected by duplicate address detection” (RFC 3972, section 3).

RFC 4941 includes this (with the TEMP_IDGEN_RETRIES defaulting to the value 3):

RFC 8415 on DHCPv6 specifies as follows (with a DEC_MAX_RC parameter limiting how often the client retransmits its Decline message; it defaults to the value 4):

Furthermore the DHCPv6 server “SHOULD mark the addresses declined by the client so that those addresses are not assigned to other clients”.
I’m not sure about the exact sequence of things when the client uses optimistic DAD (which in turn should be the default for DHCPv6 addresses).

tl;dr of this section: the exact behavior of reacting to an address collision might not always be the same, and it might depend on several circumstances.

Operational Implications (1): Service Bindings

As laid out above optimistic DAD is not supposed to be performed when static IPv6 addresses are used. This can create issues when during system boot a service is to be bound to an address which is still in ‘tentative’ state (during DAD), as discussed in this thread (also interesting comment there at the bottom, on the differences re: DAD between FreeBSD and NetBSD).
This could look like this:

2020/09/26 10:08:22 [emerg] 11298#11298: bind() to [2001:db8:104:1700::12]:80 failed (99: Cannot assign requested address)

Apparently this may be fixed by touching the following sysctl but I don’t fully understand its mechanism, so this might only work in certain scenarios:

sysctl net.ipv6.ip_nonlocal_bind=1

In any case the delay induced by DAD (with static addresses) should be considered for service bindings during startup.
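One way to take that delay into account on the application side is to retry the bind for a short time instead of failing right away. Here is a minimal sketch of this idea. It is not how nginx or any particular service handles it, the function and class names are hypothetical, and the retry parameters are arbitrary. On Linux, error 99 from the log line above corresponds to EADDRNOTAVAIL.

```python
import errno
import time

def bind_with_retry(sock, address, attempts=10, delay=0.5, sleep=time.sleep):
    """Call sock.bind(address), retrying while the address is not yet usable.

    Returns the number of attempts needed. Other errors are re-raised
    immediately, EADDRNOTAVAIL is re-raised once the attempts are exhausted.
    """
    for attempt in range(1, attempts + 1):
        try:
            sock.bind(address)
            return attempt
        except OSError as exc:
            if exc.errno != errno.EADDRNOTAVAIL or attempt == attempts:
                raise
            sleep(delay)

class TentativeAddressSocket:
    """Stand-in for a socket whose address leaves 'tentative' after N binds."""
    def __init__(self, failures=2):
        self.failures = failures
        self.bound = None

    def bind(self, address):
        if self.failures > 0:
            self.failures -= 1
            raise OSError(errno.EADDRNOTAVAIL, "Cannot assign requested address")
        self.bound = address

# Simulation only: the first two bind() calls fail, the third one succeeds.
demo = TentativeAddressSocket(failures=2)
print(bind_with_retry(demo, ("2001:db8:104:1700::12", 80), delay=0))  # 3
```

With a real service one would pass a socket.socket(socket.AF_INET6, socket.SOCK_STREAM) object instead of the stand-in. Ordering the service start after the network is fully up is the other obvious option.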

Operational Implications (2): cni0 interface stuck in DAD

I once heard of a case where the cni0 bridge interface on Kubernetes clusters was stuck in DAD when initialized by standard CentOS initscripts (which in turn was difficult to troubleshoot as it only had veth members and wasn’t bound to any physical interface). This could presumably only be solved by disabling DAD as a whole. That might be a debatable approach (I for one think this is perfectly doable even in other settings once one has sufficient control over the [static] address assignment mechanisms), but for completeness’ sake here’s the relevant sysctl (from the current Linux kernel documentation):
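As a minimal sketch, assuming the setting meant here is the per-interface accept_dad sysctl from the kernel's ip-sysctl documentation (0 disables DAD, 1 enables it, 2 enables it and additionally disables IPv6 operation when a MAC-based duplicate link-local address is found), the current values on a Linux system can be listed like this. The function name is hypothetical.

```python
from pathlib import Path

ACCEPT_DAD_MEANING = {
    0: "DAD disabled",
    1: "DAD enabled",
    2: "DAD enabled, IPv6 disabled on MAC-based duplicate link-local address",
}

def read_accept_dad(conf_root="/proc/sys/net/ipv6/conf"):
    """Return {interface: (value, meaning)} for every accept_dad entry found."""
    result = {}
    root = Path(conf_root)
    if not root.is_dir():          # e.g. not Linux, or IPv6 not available
        return result
    for entry in sorted(root.iterdir()):
        setting = entry / "accept_dad"
        if setting.is_file():
            value = int(setting.read_text().strip())
            result[entry.name] = (value, ACCEPT_DAD_MEANING.get(value, "unknown"))
    return result

for iface, (value, meaning) in read_accept_dad().items():
    print(f"{iface}: accept_dad={value} ({meaning})")
```

Setting the value for a single interface would then be something along the lines of ‘sysctl net.ipv6.conf.cni0.accept_dad=0’, with cni0 being the interface from the case described above.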

Suffice to say that DAD might kick in in various ways and in the context of different dependencies, so one has to be aware of its inner workings and of its role during interface initialization.
To contribute to such an understanding was the exact point of this post ;-). Thank you for reading so far, and as always I’m happy to receive feedback on any channel incl. Twitter.

A Holistic Look on SLAAC and DHCPv6

At first a very happy new year to all readers, and all the best for 2021!

While I wrote a few posts about IPv6-related topics in the past – for many years here and later on the present blog – it seems I never contributed to the ‘classic SLAAC vs. DHCPv6’ debate, besides documenting the behavior of different OSs in a long-expired Internet Draft + in this talk at RIPE70 and besides some ranting about DHCPv6.
Let’s change this today ;-). Here’s a post on those two. It’s a bit inspired by Fernando‘s recent talk on CVE-2020-16898 at the UK IPv6 Council Annual Meeting 2020 (some notes from the event here) in which he briefly covered the differences between both provisioning approaches incl. this slide:

I’d like to add some further perspective on the two. For that purpose I think it’s helpful to differentiate between the (network- or ‘environment level’-) operator point of view and the host perspective, as different parties (potentially pursuing very different objectives) can be responsible for those. Also it might make sense to differentiate between deployment scenarios like data centers or campus networks (some initial discussion of those here).
Let’s first look at the host/the ‘client’ which receives parameters or ‘instructions’ from one of the mechanisms (or from both, in certain settings). From its perspective the following objectives come to mind:

  • Completeness: for the IP-level operation of a host certain parameters (some discussion here) and information bits are needed, and those have to be provided by ‘the environment’, that is from network components, additional systems offering certain services and the like.
  • Simplicity: whatever happens keep it as simple as possible, from a general configuration perspective but also when it comes to components (like pieces of software, e.g. daemons, which have to be started, configured, maintained, patched etc.).
  • Security: the security attributes (or lack thereof) of the two mechanisms can play a role for the provisioning strategy in a given environment. At this point it should be noted that, in general, I think that inherent security issues of RAs and DHCPv6 packets should be addressed on the infrastructure level and not by hosts themselves (hence rules 5 and 6 in this post).
  • Support: this is seemingly evident, but still worth a mention/consideration. The configuration approach has to be supported at all (at least in an operationally feasible way, e.g. without installing additional 3rd party components) by the client platform(s) in question for a specific network or use case. Here it should be noted that as of RFC 8504 IPv6 Node Requirements SLAAC is a mandatory element of an IPv6 stack (‘MUST’ be supported, in RFC 2119 terminology) whereas DHCPv6 ‘SHOULD’ be supported (while this capital-letter statement in an RFC is a much stronger claim than a ‘common language’ interpretation of ‘should’, it’s still weaker than a ‘MUST’). For example it’s well known that Android does not support DHCPv6, which might have implications on the overall configuration approach of an environment.

From the perspective of the ‘operator’ (of the network infrastructure and/or of the servers being part of an overall provisioning picture, like DHCPv6 or TFTP servers for UEFI netboot) the following objectives may play a role:

  • Functionality: it has to work ;-).
  • Simplicity: whatever happens keep it as simple as possible.
  • Security: the fewer components involved, the less vulnerability exposure.
  • Support: different use cases and client platforms.
  • Traceability: in many environments the capability to identify systems involved in security violations (be they security incidents, be they violations of intellectual property regulations, e.g. in university networks) is an important consideration. Traditional thinking often stated DHCPv6 would strictly be needed for this, but this might not be the case.
  • Control: for system management and inventory purposes in data centers there’s often the desire to assign IP(v6) addresses to systems in a controlled way. Usually this precludes the use of SLAAC.

Now let’s have a look at both mechanisms in different scenarios.

Client perspective, campus networks

In the vast majority of networks I’ve seen in the last years going with SLAAC (with RDNSS) was/is considered fully sufficient. Recent examples of v6-only Wi-Fi networks (like this one or this one) took this route, and I’ve also seen this in large conference networks (e.g. Cisco Live Europe 2019). I’m aware that back in 2019 Chris Werny and I recommended (in our talk on IPv6-only in Wi-Fi networks at Troopers) going with an approach of both SLAAC+RDNSS and stateless DHCPv6 in parallel, but I don’t think this is still needed in 2021, except for networks

  • where support for clients without RDNSS abilities is needed (namely Windows before Win10)
  • where IP parameters other than address(es), default gateway and DNS resolvers have to be provisioned by the network infrastructure (as opposed to being provisioned via other centralized mechanisms influencing the behavior of hosts like MS Active Directory/Group Policies or the like). VoIP telephones are an often-cited example of systems possibly needing DHCPv6, but I’ve yet to encounter such a deployment irl.

In campus networks going with SLAAC clearly wins with regard to the objectives ‘simplicity’ and ‘support’.

Client perspective, data center networks

Let’s first assume that we usually see a very different mix of operating systems in such networks (than in campus networks). Let’s further assume that from the operator side DHCPv6 might be a preferred option (pursuing the ‘simplicity’ objective and assuming that at least *some* systems in the DC might require DHCPv6 for netboot purposes). From my perspective there are two aspects/objectives that deserve special attention here:

  • Security: while probably most dual-stacked clients in campus networks run a DHCP client, this is not necessarily the case in DC environments. Adding a DHCP(v6) client on systems can increase their vulnerability exposure, e.g. see CVE-2018-1111 in RHEL discovered a while ago by Felix Wilhelm with whom I had the pleasure to work for a few years.
  • Support: I have seen a few special-purpose appliances (both virtual and physical ones) commonly found in DC environments which only supported SLAAC, but not DHCPv6. Obviously ‘just installing dhclient’ might not be an option for such systems.

For the above reasons SLAAC might be a viable, or even required, option for hosts in data center networks, too, but this might collide with an organization’s system management & control approach. On the other hand SLAAC and particularly container networks have a lot in common (think existence of ephemeral entities and reliance on DNS).

Operator perspective, campus networks

In case the operator is responsible only for the network infrastructure, but not for the client environments (which is a common case in ISP scenarios), there’s usually a strong focus on ‘support’ (of as many different client platforms as possible) but not on any other of the above objectives. With regard to this one SLAAC is the better option.

A network operator can choose to provide both mechanisms at the same time, in order to support as many heterogeneous clients as possible (which can lead to interesting results on the client side, e.g. see the Comcast approach described here. I still don’t understand the reasoning for doing SLAAC [with A-flag] and managed DHCPv6 in parallel).
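For readers less familiar with the flags involved: which mechanism(s) a network offers is signalled in the router advertisements, by the M and O flags of the RA itself and by the A flag of each prefix information option (RFC 4861/4862). The following sketch maps a flag combination to what is being offered. It deliberately says nothing about what a given OS then actually does, and the function name and wording of the results are made up for illustration.

```python
def offered_mechanisms(managed_flag, other_flag, autonomous_flag):
    """Map RA flags (M, O) and the prefix A flag to the offered mechanisms."""
    mechanisms = []
    if autonomous_flag:
        mechanisms.append("SLAAC address from the advertised prefix")
    if managed_flag:
        # With M set, the O flag is redundant: DHCPv6 provides everything.
        mechanisms.append("stateful DHCPv6 (addresses and other parameters)")
    elif other_flag:
        mechanisms.append("stateless DHCPv6 (other parameters only)")
    return mechanisms

# SLAAC only (DNS would then typically come via RDNSS in the RA):
print(offered_mechanisms(managed_flag=False, other_flag=False, autonomous_flag=True))
# Both in parallel, i.e. A flag plus managed DHCPv6:
print(offered_mechanisms(managed_flag=True, other_flag=False, autonomous_flag=True))
```

In the second case a client supporting both mechanisms can end up with a SLAAC address and a DHCPv6 address from the same prefix.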

In case the operator is also responsible for the hosts (e.g. enterprise networks), the picture is usually more nuanced. Here often traceability plays a major role. DHCPv6 is often considered helpful in this context, but it might not be the only option.

I’d hence say that going with SLAAC (only) provides advantages from the (enterprise) operator perspective as well, provided the traceability angle can be solved without DHCPv6. It might also facilitate things in the security context (focus on rogue RAs only instead of addressing DHCPv6 related threats as well).

Operator perspective, data center

As of 2021, UEFI netboot over IPv6 practically always requires DHCPv6 (option 59 et al.) so it’s a fair assumption that DHCPv6 can be found in many data centers.

As I discussed in this post on ‘IPv6 configuration approaches for servers’ there might furthermore be the approach to (ab)use DHCPv6 reservations as a mechanism to assign specific addresses to specific systems, based on a centralized database (or several ;-). In this case the infrastructure (DHCPv6 relay agents running on ToR switches, and the DHCPv6 servers themselves) often needs to support RFC 6939, which, as far as I can see, might still not be easy as of today, given the lack of support of RFC 6939 in network gear. Another potential obstacle: the (from the DHCPv6 process perspective) ‘client’ – which actually is a ‘server’ 😉 – could show up with different MAC addresses during different stages of the netboot process which can create additional challenges… I may discuss intricacies of this in another post ;-). Overall it seems that IPv6-based netboot is a real PITA in many environments, e.g. see this presentation on v6-only at the Imperial College London.

Still, for the purpose of our discussion we have to note that DHCPv6 may very well be a mandatory requirement in many DC environments. The operator then has to decide between pursuing the ‘simplicity’ objective (by going with DHCPv6 as the only possible approach) and prioritizing the ‘support’ objective (by allowing both mechanisms for hosts in the DC, which ofc increases the overall complexity of the environment).

Personally I think that supporting SLAAC in DC environments might be inevitable in the mid-term due to the considerations laid out above in the ‘Client perspective in DCs’ section. As an additional note I have to admit I’ve never been a big fan of DHCPv6 at all as I’ve seen too many interoperability and instability issues during my years performing projects in various large organizations (pre mid-2019), but things might have changed for the better in the interim. Happy (seriously!) to hear your success story of a stable large-scale deployment of DHCPv6, either in a campus network or in a data center.

Thanks for reading so far, and happy IPv6 networking in 2021 to you all!

Notes from the UK IPv6 Council Annual Meeting 2020

Today the UK IPv6 Council held their annual meeting. These have been great events for many years (e.g. see 2019, 2018, 2017). Many thanks to Veronika McKillop and Tim Chown for organizing it! In the following I’ll discuss some of the talks (full agenda here).

Colin Donohue & Ian Hallissy: The AIT Experience with IPv6 Only Wifi

Colin and Ian work at the Athlone Institute of Technology (AIT), a technical university in the Irish Midlands with 6000+ students. They started their IPv6 journey back in 2012, and decided to use a 2020 refresh of the Wifi infrastructure to switch from dual stack to IPv6 only incl. the management plane and the control plane. This would make it easier to identify IPv6 issues, allow them to manage one protocol instead of two, and provide a live IPv6 environment for research, testing and development. Here are some relevant technical details of their approach:

They implemented the infrastructure with Aruba components (related vendor material with some technical details on the case study here ;-), and the NAT64 function is handled by Fortigate firewalls, with seemingly sufficient performance. With some small exceptions everything could be done in a v6-only way:

Ondřej Caletka from RIPE NCC asked how they (plan to) achieve their stated goal of traceability (e.g. in case of security incidents) given they only use SLAAC (yes, DHCPv6 might not be needed to achieve that objective). They responded that they expect (Aruba) ClearPass to handle this.
Another question was if they had a fallback plan in case some important application didn’t work properly without IPv4. Such a plan does not really exist – their stance was: “if urgently needed, we could add IPv4 on a VLAN basis, but we strive to avoid this”.

Overall a super-interesting presentation which clearly showed, like Jen Linkova’s recent talk at RIPE81, that v6-only Wifi (even for a heterogeneous user base) is doable in 2020.

Slides can be found here.

Pavel Odintsov: DDoS Challenges in IPv6 environment

Pavel is the author of FastNetMon. He started his talk laying out protocol-level DoS techniques on layer 3 and layer 4. He then discussed IPv6-related challenges from a DDoS protection perspective

and how he dealt with those when adding full IPv6-support to the latest FastNetMon community editions.

Slides can be found here.

Sam Defriez: Community Fibre IPv6 Update

This was one of the provider case study talks I really love at the event. Community Fibre is a London-focused (commercial) broadband ISP. The main objective of their IPv6 deployment was simply to avoid spending a fortune on IPv4, but instead use the money on “putting fibre in the ground”, as Sam phrased it. Seems reasonable and much more sustainable to me as well ;-). He discussed a few challenges during their IPv6 implementation, nicely illustrated by this chart:

They do DHCPv6 based on ISC’s Kea, which had a few teething problems. Furthermore about 20% of their BNGs didn’t support DHCPv6 relay, which was later fixed by Huawei, supposedly due to pressure from another, bigger ISP. Finally they suffered from the Cogent-Google (de-)peering issue (for those not aware of this see this discussion on the NANOG mailing list), which they solved by peering directly with Google.

Slides can be found here.

Erik Nygren (Akamai): CDN Provider’s view of IPv6 in 2020

Erik explained Akamai’s IPv6 efforts and developments since they first enabled it for HTTP traffic back in 2011. He then discussed in which spaces they see below-average IPv6 deployment: enterprise organizations (presumably due to backend systems & supporting processes not being IPv6-ready) and in the gaming industry (probably no incentives to add IPv6 once initial launch was v4-only).
He also mentioned a recent peak of 28 Tbps IPv6 traffic due to a specific event. (I wonder if that one happened in July? 🤔😉). Here are some interesting numbers he showed:

Asked if they see DDoS attacks happening over IPv6 he mentioned that they see a few, but they don’t consider this a relevant issue so far (not least as Prolexic, their DDoS protection service, supports IPv6 now).

Fernando Gont: Deeper dive into the recent Windows ICMPv6 vulnerability 

Fernando spoke about CVE-2020-16898, an RCE vulnerability in the Windows 10 IPv6 stack due to (mis-)handling of router advertisements (for the initial Microsoft advisory and some related discussion see here). The details of this vulnerability are quite interesting, and they were laid out in two independent posts, one from Francisco Falcon and the other one from Adam Zabrocki. In a nutshell the underlying problem is incorrect parsing of the (length of the) RDNSS option, which in turn leads to improper buffer management. This can only be exploited if the malicious RA is sent heavily fragmented. Such a (fragmented) local ICMPv6 datagram wouldn’t be processed at all if the Windows IPv6 stack correctly implemented RFC 6980, which Fernando authored in 2013 (a related post about the latest Windows IPv6 stack here).
A very similar vulnerability was just recently identified by Francisco Falcon in FreeBSD (see here).
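
To make the RDNSS length problem a bit more tangible, here is a minimal sketch of the sanity check a parser could apply before touching the option body. It relies on the option layout from RFC 8106 (type 25, Length in units of 8 octets, 8 octets of header plus 16 octets per DNS server address, so the value has to be odd and at least 3). The function and constant names are made up for illustration; this is not how any particular stack implements it.

```python
# Sketch: sanity-check the Length field of an RDNSS option (RFC 8106).
# All names here are illustrative, not taken from a real IPv6 stack.

RDNSS_OPTION_TYPE = 25


def rdnss_length_is_valid(option: bytes) -> bool:
    """Return True if `option` looks like a well-formed RDNSS option."""
    if len(option) < 2 or option[0] != RDNSS_OPTION_TYPE:
        return False
    length_units = option[1]  # Length field, in units of 8 octets
    # 8 octets of header (type, length, reserved, lifetime) plus 16 octets
    # per address: the value must be odd and at least 3.
    if length_units < 3 or length_units % 2 == 0:
        return False
    # The declared length must match the bytes actually present.
    return len(option) == length_units * 8
```

An option announcing one address (Length 3, 24 octets) passes, while one with an even Length value is rejected before any buffer handling takes place.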

The slides can be found here.

Another great event from the UK IPv6 Council. Apparently 48 people joined, and there were some really good discussions among IPv6 practitioners. I expect the videos to be published soon, and I will update this post once that has happened.

Happy holidays to you all. Stay safe and healthy!

IPv6 Security Best Practices

That’s an ambitious title, in many regards.
Still, late 2020 might finally be time that we, as the IPv6 community, try to come up with a set of simple IPv6 security best practices to be used both as guidance and in a checklist manner.
One of the earliest of such efforts goes back to my friend Eric Vyncke, yet that one dates from 2007 (btw, Eric is also the maintainer of the useful ‘IPv6 Deployment Aggregated Status’ site). I had the pleasure to co-author an Internet-Draft on “Operational Security Considerations for IPv6 Networks”, but after years of discussions on the mailing list it has been stuck in ‘IETF review hell’ for a while (I paraphrase this term from Geoff Huston’s essay on “The Making of an RFC in today’s IETF”). Currently it doesn’t feel like any of the authors, incl. myself, has any motivation or energy left to drive this further.
I’ll hence use this blog to formulate some ideas. As I have done IPv6 projects predominantly in large enterprise environments (before taking over my current role in mid-2019) there’ll be a certain focus on that space, though many of the thoughts should be applicable in various types of organizations. Also I’ll try to keep it simple, which isn’t always easy for me 😂.

Harden your servers

I recently wrote a piece about hardening of IPv6 stacks. For most systems the security benefits gained from it might not be worth the associated operational effort, but for systems in data centers hardening should be on your list of IPv6 security measures, especially when those systems are configured with static (global) addresses. Simply not offering certain interactions usually strengthens the security posture of systems or network devices.

Review packet filtering rules

In pretty much all environments packet filtering, either on the network level (via firewalls or ACLs on routers) or on the host level contributes to the overall security stance. In the course of your IPv6 deployment you might want to check those rule sets, as IPv6 brings technical changes which in turn require adapted rules, like

  • Different functionalities of ICMPv6 (some additional info here, here, and here)
  • IPv6 extension headers (see also next section)
  • Different network behavior within subnets, that is – in IPv6 lingo – ‘on the
    local link’
  • An altered addressing architecture (more on this here, here, or here)

An organization’s IPv6 transition might also provide a good opportunity to review rule sets from a broader perspective (“do we still need all those rules that have been sitting in there for 15 years?”), but maybe you shouldn’t overload an IPv6 deployment effort with the ambition to cure deficiencies in your company’s security governance space at the same time ;-). Usually the former is complex enough.

Drop packets with extension header types 0, 43, 60 at network borders

There’s an active, and worth a read, Internet-Draft on “Operational Implications of IPv6 Packets with Extension Headers” (authored by Gert Doering, Fernando Gont and others). It includes this section:

This nicely summarizes the potential security issues of those headers, more information here or here. The main point, however, is that there’s no compelling (or other) use case for any production use of EH types 0, 43, and 60 at all. If there are no practical benefits of a certain functionality, but many associated risks – why would one ever allow such stuff to enter one’s network?
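
For those who like to see the rule in code: the following is a minimal sketch of what “look for EH types 0, 43, 60” means on the wire. It walks the Next Header chain of a raw IPv6 packet (Hop-by-Hop = 0, Routing = 43, Destination Options = 60, plus Fragment = 44 and AH = 51 so the walk can continue past them). The names `extension_headers`, `should_drop_at_border` and `DROP_AT_BORDER` are my own, and a real filter would of course do this in hardware or in the firewall’s rule language, not in Python.

```python
# Sketch: walk the extension header chain of a raw IPv6 packet and decide
# whether a border filter following the rule above would drop it.
# DROP_AT_BORDER and both function names are illustrative assumptions.

HOP_BY_HOP, ROUTING, FRAGMENT, AUTH, DEST_OPTS = 0, 43, 44, 51, 60
DROP_AT_BORDER = {HOP_BY_HOP, ROUTING, DEST_OPTS}


def extension_headers(packet: bytes) -> list[int]:
    """Return the extension header types found, in order of appearance."""
    found = []
    if len(packet) < 40:
        return found
    next_header = packet[6]  # Next Header field of the fixed header
    offset = 40              # the fixed IPv6 header is 40 octets long
    while next_header in (HOP_BY_HOP, ROUTING, FRAGMENT, AUTH, DEST_OPTS):
        if offset + 8 > len(packet):  # truncated chain: stop parsing
            break
        found.append(next_header)
        if next_header == FRAGMENT:
            frag_offset = int.from_bytes(packet[offset + 2:offset + 4], "big") >> 3
            if frag_offset != 0:      # non-first fragment: rest is payload
                break
            size = 8                                 # fixed size
        elif next_header == AUTH:
            size = (packet[offset + 1] + 2) * 4      # length in 4-octet units
        else:
            size = (packet[offset + 1] + 1) * 8      # length in 8-octet units
        next_header = packet[offset]
        offset += size
    return found


def should_drop_at_border(packet: bytes) -> bool:
    """True if the packet carries at least one of EH types 0, 43 or 60."""
    return any(eh in DROP_AT_BORDER for eh in extension_headers(packet))
```

Note that a packet which only carries a Fragment header is not dropped by this particular check; fragments are a separate decision, see the next section.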

Strongly consider dropping all inbound IPv6 fragments

Security problems related to IPv6 extension headers are often amplified when fragmentation comes into play (see here or here). There’s hope that the number of fragmented IPv6 packets on the Internet will decrease significantly thanks to the recent DNS flag day 2020. In recent years the majority of IPv6 fragments has been DNS traffic; nowadays one should only see very few TCP fragments, if at all, given TCP has the MSS to deal with situations that require splitting a payload, and at the same time the black hole detection of most OSs has become quite mature.
You might face some resistance when coming up with this measure. It could be an idea to perform an extra amount of testing incl. documentation, to closely observe counters of related ACLs for a while, and/or to come up with some training to identify MTU related connection issues during troubleshooting or via network telemetry. I also encourage all involved parties to read RFC 8900 “IP Fragmentation Considered Fragile”.

In shared L2 subnets perform risk analysis for RAs, NDP, MLD, and DHCPv6

All these packet types can cause security issues on the local link. Over the years there’s been some debate about whether the operational effort related to deploying associated controls on the switch port level (often called “First Hop Security” features) is worth the resulting risk reduction, in particular as in my experience several vendor approaches have shown serious teething problems. One may also keep in mind that at least in the past some of these features could be easily evaded.
There are RFCs on addressing rogue RAs (RFC 6105) and on protecting against unsolicited DHCPv6 packets (RFC 7610); see also next section. Eric Vyncke, Antonios Atlasis and myself once came up with the idea of an ‘MLD-guard’ feature but this didn’t take off, presumably because bad things doable with MLD are considered less severe than those related to the other packet types. In the projects mentioned above I primarily saw RA Guard and DHCPv6 Guard getting implemented (hence this post from 2015).

Understanding the risks associated with these functionalities & packet types, and if or how to protect against those, is part of your security-related homework of the IPv6 journey. I’ll discuss some of this in the following (yes, some overlap of the rules here ;-).

In shared L2 subnets with Ethernet drop RAs and server-side DHCPv6 messages on all access ports by default. Same for Wi-Fi on the controller or AP level

As stated on other occasions this is simply basic network hygiene. Many network operators integrate this into their ‘access port security templates’ (for physical switches, for virtual ones it’s a different story), and barring very specific conditions it’s not only safe but highly advisable to drop these packets. Going by RFC 8415 section 7.3, ‘server-side DHCPv6 messages’ should encompass Advertise, Reply & Reconfigure messages. Common DHCPv6-Shield (that’s the ‘official’ name of the feature as per RFC 7610) implementations like Cisco’s DHCPv6 Guard probably filter exactly those.
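
Expressed as a tiny sketch, the classification behind such an access port template could look as follows. The message type values are the ones from RFC 8415 section 7.3 (Advertise = 2, Reply = 7, Reconfigure = 10), and a Router Advertisement is ICMPv6 type 134. The function names are hypothetical, and this only illustrates the matching logic, not what a given vendor’s guard feature does internally.

```python
# Sketch: identify the packet types an access port should drop by default.
# Function names are illustrative; the numeric values are from RFC 8415
# (DHCPv6 message types) and RFC 4861 (ICMPv6 type 134 = RA).

ICMPV6_ROUTER_ADVERTISEMENT = 134
DHCPV6_SERVER_SIDE = {2: "Advertise", 7: "Reply", 10: "Reconfigure"}


def is_router_advertisement(icmpv6_message: bytes) -> bool:
    """The ICMPv6 type is the first octet of the ICMPv6 message."""
    return len(icmpv6_message) >= 1 and icmpv6_message[0] == ICMPV6_ROUTER_ADVERTISEMENT


def is_server_side_dhcpv6(udp_payload: bytes) -> bool:
    """The DHCPv6 msg-type is the first octet of the UDP payload."""
    return len(udp_payload) >= 1 and udp_payload[0] in DHCPV6_SERVER_SIDE
```

A Solicit (msg-type 1) coming from a client is not matched, so regular clients on the port keep working.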

In Wi-Fi networks such filtering can often be done on the controller or on the AP level, and evidently even fewer legit scenarios with such packets exist there than, say, in campus Ethernet networks. The related configuration approaches may be quite different though, e.g. see this talk by Christopher Werny from the Troopers 2016 IPv6 Security Summit, or this guide for Cisco 9800s. Last year I did some cursory testing on its effectiveness on a specific platform (results here).
Given the differences of commercial enterprise Wi-Fi solutions re: the exact configuration steps and their default settings, I recommend going with a combined approach of auditing (which, indeed, you periodically do anyway, right? 😉 ) and configuration steps (to happen in an automated manner during device deployment, maybe).

Think hard about a source of IPv6 truth for processes like asset management or vulnerability management

The importance of asset management and of a proper source of truth for it is, of course, also true for IPv4 environments. There’s a reason why the NIST Cybersecurity Framework starts with the Identify function…
On the other hand most of us know that the simple question “what do we have?” might not be simple to answer, even if executives or regulators may think so. However, in IPv4 networks one could always try to cover blind spots by simply scanning address ranges in a sequential manner, in order to identify ‘alive’ systems (which are then subjected to further enumeration steps). Surprisingly many organizations still do this today. Point is that such an approach is just not feasible in IPv6 subnets. Consequently one has to come up with different methods (such as ingesting data from neighbor tables, incorporating DNS AAAA records etc. – I’ll discuss some of these in another post), and I recommend starting to think about this sooner rather than later.
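
Just to put a number on “not feasible”, here is a back-of-the-envelope sketch using Python’s `ipaddress` module. The scan rate of one million probes per second is purely an assumption for the sake of the arithmetic, and the prefixes are documentation prefixes.

```python
import ipaddress

PROBES_PER_SECOND = 1_000_000      # assumed scan rate, pick your own
SECONDS_PER_YEAR = 365 * 24 * 3600


def years_to_scan(prefix: str, rate: int = PROBES_PER_SECOND) -> float:
    """Time needed to probe every address of `prefix` once, in years."""
    network = ipaddress.ip_network(prefix)
    return network.num_addresses / rate / SECONDS_PER_YEAR
```

With these assumptions `years_to_scan("2001:db8::/64")` comes out at roughly 585,000 years for a single /64, whereas an IPv4 /24 is done in a fraction of a second.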

Take care that the vulnerability management framework is IPv6-ready

As an IPv6 practitioner one definitely wants to avoid being responsible for increasing the risk exposure of an environment, so taking care of the IPv6-readiness of an organization’s vulnerability management process makes sense. This usually includes three main fields of activities:

  • A source of truth which provides ‘alive’ IPv6 systems/addresses, see above
  • IPv6 capabilities of the tools & techniques used for the vulnerability scanning itself
  • IPv6 reachability of all to-be-scanned endpoints from the scanner networks

Reflect on network security telemetry and on threat detection in the age of IPv6

I’ve discussed ‘Organizational security implications for IPv6’ on the APNIC blog a few months ago. Furthermore I wrote down some initial thoughts on threat detection in IPv6 networks back in 2016 here, but overall I feel this needs more critical reflection on another occasion (read: stay tuned for an upcoming post ;-).

Suffice to say that, whatever type of smart machinery and/or human intelligence you may be using to look at large sets of security-related data, you’ll have to re-think and adapt some of that for IPv6.

Evaluate the implications of dual-stack and of v6-only + NAT64 on security processes

Some overlap here with the previous point, still this deserves a dedicated mention in the list. Both deployment strategies bring some changes/challenges when it comes to analyzing and, in particular, correlating connection data and traffic flows. In dual-stack environments one endpoint might show up with different addresses, from different families and with differing ‘lifetimes’. In networks using translation techniques one and the same connection might have an IPv4 part and an IPv6 part (initially IPv6 and subsequently IPv4 in case of translation ‘close to the client’ = with NAT64, or initially IPv4 and then IPv6 in case of translation ‘close to the server’, e.g. on load balancers or reverse proxies). Either way all of these scenarios require re-evaluating the idea of a ‘connection identity’.
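
One small building block for such correlation in NAT64 setups: with a /96 NAT64 prefix the IPv4 destination sits in the last 32 bits of the synthesized IPv6 address (RFC 6052), so the IPv6 part of a flow can be mapped to the IPv4 part seen behind the translator. The sketch below assumes the well-known prefix 64:ff9b::/96; a network-specific prefix of a different length embeds the address differently and is not handled here. The function name is made up.

```python
import ipaddress

# RFC 6052 well-known prefix; replace with your own /96 if you use an NSP.
NAT64_WKP = ipaddress.ip_network("64:ff9b::/96")


def embedded_ipv4(address: str, prefix=NAT64_WKP):
    """Return the IPv4 address embedded in a NAT64-synthesized IPv6 address,
    or None if the address is outside the given /96 prefix."""
    addr = ipaddress.IPv6Address(address)
    if prefix.prefixlen != 96 or addr not in prefix:
        return None
    # With a /96 prefix the IPv4 address is simply the low 32 bits.
    return ipaddress.IPv4Address(int(addr) & 0xFFFFFFFF)
```

For example `embedded_ipv4("64:ff9b::c000:201")` yields 192.0.2.1, which can then be matched against IPv4 flow records on the far side of the NAT64 function.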

Tl;dr: (not only) from a security perspective the advent of IPv6 requires technical implementation steps and process changes. Here’s a summary of the rules I discussed: