Ready for the Root Zone Key Rollover?

During the ICANN At-Large discussions the question arose how an ordinary user can find out whether they will run into problems during the DNSSEC Root Zone Key Rollover. Determining this is either surprisingly difficult or extremely easy.

The Rollover Problem

DNSSEC basically works by pointing from the parent zone to the active key in the child zone. By carefully updating this Delegation Signer (DS) record in the parent zone, rolling a key in the child zone is trivial.

But for the Root it's hard, because there is no parent zone for the Root itself. The key material for the Root is stored and managed on the hard disk of each of the DNS resolvers out there. But where does it come from originally?

The most basic method is a manual import. In the early days, this was the most common solution. The key material was distributed in newspapers, on web pages and so on. Nowadays the keys are preinstalled by the software and operating system vendors. Another possibility is to fetch the keys from the (unvalidated) DNS during installation and store them locally.

Now the problem is to change those keys. Millions of servers need to learn the new keys. Normally that would be an impossible task. For several years now, however, trust anchor keys can be updated automatically using RFC 5011. The approach is to trust every new key that the resolver can validate using the existing ones. This way the resolver can learn, store, and trust new keys.
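To make the idea more tangible, here is a minimal sketch of that learning step. It is a heavily simplified model and not an implementation of RFC 5011: the class name `TrustAnchors`, the string labels for the keys, and the way a key set is passed in are all assumptions made for illustration. The one detail taken from the RFC is the hold-down period: a new key is not trusted immediately, but only after it has been seen continuously for 30 days.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# Hold-down period before a newly seen key is trusted (simplified).
ADD_HOLD_DOWN = timedelta(days=30)


@dataclass
class TrustAnchors:
    """Hypothetical, simplified trust anchor store of a resolver."""
    trusted: set[str]
    pending: dict[str, datetime] = field(default_factory=dict)

    def observe(self, keyset: set[str], signer: str, now: datetime) -> None:
        """Process one key set that was signed by the key `signer`."""
        if signer not in self.trusted:
            # Not validated by an existing anchor: ignore it completely.
            return
        # A pending key that vanished from the key set loses its timer.
        for key in list(self.pending):
            if key not in keyset:
                del self.pending[key]
        # Remember new keys and promote those that waited long enough.
        for key in keyset - self.trusted:
            first_seen = self.pending.setdefault(key, now)
            if now - first_seen >= ADD_HOLD_DOWN:
                self.trusted.add(key)
                del self.pending[key]


start = datetime(2017, 7, 11)
anchors = TrustAnchors(trusted={"KSK-2010"})
anchors.observe({"KSK-2010", "KSK-2017"}, "KSK-2010", start)
anchors.observe({"KSK-2010", "KSK-2017"}, "KSK-2010", start + timedelta(days=31))
print(sorted(anchors.trusted))
```

The important property is that the new key is only ever accepted through the old one. A resolver that cannot persist the result of this step keeps only the old key and fails once that key is retired.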

But does it work in reality? Nobody has done a comprehensive test so far. The Root Zone KSK Rollover will be this test. And of course, the test might fail. The reasons for failure are numerous: for example, the configuration may be write-protected on the local hard disk.

If something goes wrong, the resolver will no longer answer any queries. Consider for a moment that this resolver is the central one of a large ISP with lots of broadband customers: there will be a total blackout.

Consequences

How many resolvers out there are affected? Nobody knows. Therefore ICANN decided to postpone the rollover planned for last October.

New methods to determine whether a resolver has already learned the new key were developed during the last few months. Unfortunately, they are not yet widely deployed. Only well-maintained systems are likely to support the new queries, but those systems are not the problem anyway. So even this approach will not shed much more light on the situation.

Now we have to answer the question of whether and when a rollover should take place. New data to reason about is not to be expected.

On the other hand there is a problem with new devices. More and more small, networked devices are pushed into the market. Most IoT vendors do not care about updates or support. But the probability rises that even those devices will contain validating resolvers that are unable to update their configuration. Delaying the rollover will allow such vendors to ignore the problem. Only changing the Root Zone KSK periodically will force the industry and operators to behave correctly.

Fears

It's an emotional topic:

  • Am I at risk of losing my Internet?
  • Who will sue me (as an operator) for outages?
  • Who will sue me (as ICANN) for outages?
  • How many people will simply disable DNSSEC, instead of fixing the setup?

So the main interest is a personal one: Am I at risk?

Luckily there are only two real possibilities:

  • My ISP (resolver) is validating DNSSEC. It is possible that it will fail during the rollover.
  • My ISP (resolver) is ignoring DNSSEC. Then the change will affect neither it nor me.

In order to distinguish those basic cases, Alan Greenberg recommended setting up a simple web page that shows the local state on the user's system.

How to determine the local state of DNSSEC?

Of course a web page cannot make the browser speak arbitrary DNS natively in order to distinguish the various cases. Active content (scripting) is not an option.

It is possible, however, to split the page into different sections and load them separately. One special section is not resolvable if the DNS resolver is validating. I chose CSS for the differentiation:

<link rel="stylesheet" href="dnssec.css">
<link rel="stylesheet" href="http://css.fail.donnerhacke.de/dnssec-fail.css">

The second style sheet overrides the first one:

$ cat dnssec.css
.failed { display: none; }
.dnssec { display: block; }
$ cat dnssec-fail.css
.failed { display: block; }
.dnssec { display: none; }

This second style sheet should not be loadable if the resolver is validating.
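The resulting behaviour can be modelled in a few lines. This is only a sketch of the cascade for these two files, not of CSS in general; the function name `visible_sections` and the flag `override_loaded` are made up for illustration.

```python
# Declarations per style sheet, copied from the two files shown above.
BASE = {".failed": "none", ".dnssec": "block"}       # dnssec.css
OVERRIDE = {".failed": "block", ".dnssec": "none"}   # dnssec-fail.css


def visible_sections(override_loaded: bool) -> list[str]:
    """Return the selectors that end up displayed on the page."""
    display = dict(BASE)
    if override_loaded:
        # Same selectors, linked later in the page: the later rule wins.
        display.update(OVERRIDE)
    return sorted(sel for sel, value in display.items() if value != "none")


print(visible_sections(override_loaded=False))  # resolver refused the name
print(visible_sections(override_loaded=True))   # resolver answered anyway
```

So a validating resolver leaves only the `.dnssec` section visible, while a non-validating one flips the page over to the `.failed` section.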

So I have to introduce a stable error in my DNSSEC-signed zones. The error has to be stable in the sense that the automatic repair scripts are unable to fix it (and should not complain about it). I chose an erroneous delegation:

$ORIGIN donnerhacke.de.
fail            NS      avalon.iks-jena.de.
                NS      broceliande.iks-jena.de.
                DS      12345 8 2 1234...

The child zone is not signed, although the manually crafted DS record in the parent zone claims otherwise.
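For readers who do not look at DS records every day, the four fields can be decoded with a small helper. This is just an illustrative sketch: `describe_ds` is a made-up name, the tables list only some of the registered algorithm and digest numbers, and the truncated digest is passed through exactly as it appears in the zone snippet.

```python
# A few of the registered DNSSEC algorithm and DS digest type numbers.
ALGORITHMS = {
    5: "RSASHA1",
    7: "RSASHA1-NSEC3-SHA1",
    8: "RSASHA256",
    10: "RSASHA512",
    13: "ECDSAP256SHA256",
    14: "ECDSAP384SHA384",
    15: "ED25519",
}
DIGEST_TYPES = {1: "SHA-1", 2: "SHA-256", 4: "SHA-384"}


def describe_ds(rdata: str) -> dict[str, str]:
    """Split DS record data into key tag, algorithm, digest type, digest."""
    key_tag, algorithm, digest_type, digest = rdata.split(maxsplit=3)
    return {
        "key_tag": key_tag,
        "algorithm": ALGORITHMS.get(int(algorithm), f"unknown ({algorithm})"),
        "digest_type": DIGEST_TYPES.get(int(digest_type), f"unknown ({digest_type})"),
        "digest": digest,  # left untouched, including any truncation
    }


print(describe_ds("12345 8 2 1234..."))
```

The record therefore promises a child key with tag 12345 using RSASHA256, identified by a SHA-256 digest. Since no such key exists in the child zone, a validator has nothing to match it against.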

In reality this results in the following picture:

[Figure: fail.donnerhacke.de]

A validating resolver cannot validate this delegation and responds with SERVFAIL:

;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 10476
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 4096
;; QUESTION SECTION:
;css.fail.donnerhacke.de. IN AAAA

;; Query time: 49 msec
;; SERVER: 100.100.100.100#53

A non-validating resolver will come to a different conclusion:

donnerhacke.de.         NS      avalon.iks-jena.de.
donnerhacke.de.         NS      broceliande.iks-jena.de.
;; Received 185 bytes from 2001:678:2::53#53(a.nic.de) in 25 ms

css.fail.donnerhacke.de. CNAME  pro.donnerhacke.de.
pro.donnerhacke.de.     AAAA    2001:4bd8:1:1:209:6bff:fe49:79ea
donnerhacke.de.         NS      broceliande.iks-jena.de.
donnerhacke.de.         NS      avalon.iks-jena.de.
;; Received 219 bytes from 2001:4bd8:52:1:20a:e4ff:fe80:bec8#53(broceliande.iks-jena.de) in 1 ms

So the rendering of the web page depends on the ability of the resolver to validate DNSSEC.
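The same decision can be made outside the browser. The following is a rough sketch under a few assumptions: it uses the system resolver through `socket.getaddrinfo`, it takes `pro.donnerhacke.de` from the output above as a name that should always resolve, and the helper names `resolves` and `classify` are invented for this example. The host names may of course change or disappear over time.

```python
import socket
from typing import Callable

CONTROL_NAME = "pro.donnerhacke.de"      # expected to resolve everywhere
BROKEN_NAME = "css.fail.donnerhacke.de"  # behind the deliberately broken DS


def resolves(name: str) -> bool:
    """Ask the system resolver whether `name` has any address."""
    try:
        socket.getaddrinfo(name, 80)
        return True
    except socket.gaierror:
        return False


def classify(lookup: Callable[[str], bool] = resolves) -> str:
    """Classify the resolver by comparing a good and a broken name."""
    if not lookup(CONTROL_NAME):
        # No working DNS at all, so no statement about DNSSEC is possible.
        return "unknown"
    if lookup(BROKEN_NAME):
        return "non-validating"
    return "validating"


# Offline demonstration with a stand-in for a validating resolver.
print(classify(lambda name: name == CONTROL_NAME))
```

Calling `classify()` without an argument performs real lookups. The control name matters: without it, a machine with no network connection would look exactly like one behind a validating resolver.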

And now go and test your resolver: *click*
