DNS: The sky is falling

In April I speculated about the impending doom of the DNS.

Now we know what was in the works, and yes, it’s not a pretty picture.

My idea from April doesn’t apply 1:1, as the attacker doesn’t attack a single target, but rather arbitrary other hostnames in the same domain.

Anyway, I spent the last few days analyzing data from the .at nameservers regarding patch discipline in Austria. You can read the depressing results here.

New I-D on ENUM for loose-route SIP

I’ve submitted a new I-D defining an enumservices subtype for loose-route SIP according to J. Rosenberg’s UA loose route (which right now is one of multiple proposals to address one problem).

The basic idea is the following: a SIP proxy should distinguish between “retargeting” and “routing”.

  • “retargeting” is done whenever a proxy decides that this call should be directed towards a new destination. Consequently, the Request-URI changes.
  • “routing” happens when a proxy forwards the SIP message to the next hop, but does not change the identification of the target.

In the context of ENUM, the destination of the call is identified by an E.164 phone number (whether that number is encoded in the user-part of a sip: URI or in a tel: URI doesn’t really matter). That number is the key for the ENUM lookup, which returns (in most cases) a SIP URI.
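How the phone number becomes a lookup key can be sketched in a few lines of Python. This is only an illustration of the usual ENUM convention (strip everything but the digits, reverse them, separate them with dots, append the apex); the function name and the default suffix parameter are my own choices for this sketch.

```python
def e164_to_enum_domain(number, suffix="e164.arpa."):
    """Map an E.164 number such as '+441154960496' to its ENUM domain name.

    Hypothetical helper: keeps only the ASCII digits, reverses them,
    joins them with dots and appends the ENUM apex.
    """
    digits = [c for c in number if c in "0123456789"]
    if not digits:
        raise ValueError("no digits found in %r" % number)
    return ".".join(reversed(digits)) + "." + suffix


print(e164_to_enum_domain("+441154960496"))
```

The NAPTR records for the number are then queried at the resulting name.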

The current RFCs define this as a retargeting operation: the phone number is mapped to a SIP AoR, and from now on the call is towards that URI and the original phone number is no longer relevant.

If you look at what is currently done in private/carrier-ENUM settings, then this is not how ENUM is used: in most cases there, ENUM returns the next hop for this call towards the phone number. That next hop re-extracts the phone number from the Request-URI and applies its own number/prefix-based routing to the call. In other words, this is a “routing” operation.

My draft makes this explicit: if the service field in the NAPTR is “sip:lr”, then this record contains a next hop and does not rewrite the destination of the call. Here is the example from the I-D:

   To visualize the difference between how "sip" and "sip:lr" entries
   are interpreted, consider the following entries:

             $ORIGIN 6.9.4.0.6.9.4.5.1.1.4.4.e164.arpa.
             @  IN NAPTR  ( 100 10 "u"
                            "E2U+sip"
                            "!^.*$!sip:alice@example.com!" .
                          )
             @  IN NAPTR  ( 100 10 "u"
                            "E2U+sip:lr"
                            "!^.*$!sip:p1.example.com;lr!" .
                          )

   A SIP proxy dealing with a call to tel:+441154960496 can select
   either record.  The first leads to

           INVITE sip:alice@example.com SIP/2.0

   being sent to the proxy responsible for example.com.  If the sip:lr
   record is used, then

           INVITE tel:+441154960496 SIP/2.0
           Route: <sip:p1.example.com;lr>

   is sent to p1.example.com.
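The difference between the two record types can also be sketched in Python. This is a simplified illustration, not code from the I-D: the function names are made up, the regexp field is split naively on its delimiter (so escaped delimiters inside the pattern are not handled), and only the start line and the Route header of the request are produced.

```python
import re


def apply_naptr_regexp(regexp_field, number):
    """Apply a NAPTR regexp field like '!^.*$!sip:alice@example.com!'.

    The first character is the delimiter; the field then holds the
    pattern, the replacement and optional flags ('i' = ignore case).
    """
    delim = regexp_field[0]
    _, pattern, replacement, flags = regexp_field.split(delim)
    match = re.search(pattern, number, re.IGNORECASE if flags == "i" else 0)
    if match is None:
        raise ValueError("regexp does not match %r" % number)
    # The result is the replacement with back-references filled in.
    return match.expand(replacement)


def build_invite(service, regexp_field, number):
    """Return the request lines a proxy would send for one NAPTR record."""
    uri = apply_naptr_regexp(regexp_field, number)
    if service == "E2U+sip":
        # Retargeting: the Request-URI becomes the SIP URI from ENUM.
        return ["INVITE %s SIP/2.0" % uri]
    if service == "E2U+sip:lr":
        # Routing: the Request-URI keeps the number, ENUM gives the next hop.
        return ["INVITE tel:%s SIP/2.0" % number, "Route: <%s>" % uri]
    raise ValueError("unsupported service %r" % service)


for line in build_invite("E2U+sip:lr", "!^.*$!sip:p1.example.com;lr!",
                         "+441154960496"):
    print(line)
```

With the "E2U+sip" record the same call yields a single INVITE line whose Request-URI is sip:alice@example.com.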

DNS: What to do if the sky is falling?

Having been treated to another iteration of the “we need to deploy DNSSEC, otherwise DNS will fall apart due to rampant forgeries” talk recently, I started to think about what other options resolvers have to protect themselves. See also Bert’s draft.

First of all, is it possible to implement a successful forgery attack without the client noticing that anything is wrong? IMHO not. Most scenarios assume that the attacker sends a burst of forged replies to the recursor in the hope of hitting the right query ID before the real answer arrives. So, even in a successful attack, a lot of answers with mismatched IDs will arrive (and be rejected) before the forgery actually succeeds.

This spike in ID mismatches towards a query should tell the recursor that something fishy is going on. TODO: Instrument bind/powerdns/… to actually log such ID mismatches.
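What such instrumentation might look like, as a rough Python sketch: count rejected replies per outstanding question in a sliding time window and raise a flag once a threshold is crossed. Nothing here is an existing bind or powerdns interface; the class name, the threshold of 10 and the 2 second window are arbitrary assumptions.

```python
from collections import defaultdict, deque


class MismatchMonitor:
    """Track replies rejected for a wrong query ID, per question.

    Hypothetical helper; threshold and window are made-up defaults.
    """

    def __init__(self, threshold=10, window=2.0):
        self.threshold = threshold
        self.window = window
        self._events = defaultdict(deque)

    def record_mismatch(self, question, now):
        """Note one mismatch; return True if an attack is suspected."""
        events = self._events[question]
        events.append(now)
        # Forget mismatches that are older than the window.
        while events and now - events[0] > self.window:
            events.popleft()
        return len(events) >= self.threshold


monitor = MismatchMonitor()
question = ("www.example.com.", "A")
flags = [monitor.record_mismatch(question, 0.01 * i) for i in range(12)]
print(flags.index(True) + 1)  # number of mismatches seen before the alarm
```

A recursor would call record_mismatch() wherever it currently drops a reply silently and switch to a more careful mode for that question once it returns True.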

If this works, then the 99.9999% of normal DNS lookups which are not targeted by a forgery attack can continue without modification. The only question now is: what can a resolver do if it has a clear suspicion that it is the target of a DNS forgery attack?

There is a trivial solution: Redo the query a few times, preferably utilizing all authoritative servers.

Let’s do some simple math: assume that the forger has a 1% chance of a successful spoofing attack on a single query. What are his chances if the recursor only accepts the answer if it got the same data from n queries?

n = 2: 0.01 % chance
n = 3: 0.0001 % chance
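The numbers follow from multiplying the per-query chance n times, which assumes that the attacker’s attempts on the individual queries are independent of each other. A small Python sketch of that arithmetic (the function name is just for illustration):

```python
def forgery_success_probability(p_single, n):
    """Chance that all n independent queries are spoofed successfully."""
    return p_single ** n


for n in (1, 2, 3):
    chance = forgery_success_probability(0.01, n)
    print("n = %d: %.4f %% chance" % (n, chance * 100))
```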

That doesn’t sound so bad, does it? The downsides sound rather minimal: no impact on queries which are not the target of an attack, a few more requests during an attack, and a slight delay before answering the client (the queries have to be sent sequentially; you can’t do that in parallel).
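The “ask again and compare” step could look roughly like this in Python. The query_once callable is a stand-in for whatever actually sends a single query to one authoritative server and returns its records; it, the function name and the default of three rounds are assumptions made for this sketch.

```python
def resolve_with_consensus(query_once, servers, n=3):
    """Ask n times, one query after the other, rotating over the servers.

    Returns the record set if all n answers agree, otherwise None.
    query_once(server) is a hypothetical callable returning an
    iterable of records.
    """
    first = None
    for i in range(n):
        server = servers[i % len(servers)]
        answer = frozenset(query_once(server))
        if first is None:
            first = answer
        elif answer != first:
            return None  # inconsistent answers: do not trust any of them
    return set(first)


# Stand-in for the network: every server gives the same answer.
def honest(server):
    return ["192.0.2.1"]


print(resolve_with_consensus(honest, ["ns1.example.", "ns2.example."]))
```

Returning None leaves the decision to the caller, who might retry, fall back to another transport, or answer with SERVFAIL.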

A few optimizations come to mind: Bert’s draft mentions a few other options, e.g. source port (or even IP address) variability. These come with some added cost, so perhaps they could be utilized just for these cases.

What did I miss?

[Update: 2008/04/22]

After a chat with Klaus and Alex, two new thoughts on this:

  • There might be legitimate cases where multiple queries might give varying results. E.g. global load-balancing like Akamai or www.google.com. This might break the “ask multiple times and check for consistency” idea.
  • Another option is the following: if a forgery attack is detected, simply switch to TCP for this query / target nameserver.

SPIT: Where do we stand?

Spam over Internet Telephony (SPIT) is a strange beast: everybody knows it is a threat, but in real-life SIP installations it has hardly been observed. In other networks it is not uncommon: Skype needs to police its users to get a grip on abuse, and all Instant Messaging networks have to deal with IM spam and, once they offer voice capability, SPIT as well.

So, SPIT over SIP hasn’t been a problem so far, even though there are millions of SIP devices out there replacing PSTN phone service. Why?

  • In many end-user installations, SIP is just a replacement for the old analog last-mile technology. There may be SIP accounts provisioned into these devices (ATAs, hardphones), but the user does not see them. Nor do they know their own SIP AoR. None of the people calling them knows their SIP AoR. It wouldn’t help them anyway, as these networks are usually not reachable via SIP from the Internet.
  • The same is true for IP-enabled PBXs: They talk SIP on the inside, and maybe to other PBXs in other offices of the same company (usually via some QoS-enabled VPN). In rare cases, people build SIP trunks to upstream carriers or even establish manual cross-links to the PBXs of other companies.
  • Fully-featured SIP installations which implement inter-domain SIP routing as standardized by the IETF in RFC 3263 are the exceptions. They are the enthusiast’s Asterisk installations, some enterprising VoIP operators (perhaps even using User-ENUM) and that’s it.

Summary: the vast majority of SIP devices out there is not reachable over the public Internet.

Corollary: There is no SPIT problem, as the target audience is too small to make SPIT profitable.

This is some sort of vicious circle: Operators don’t want to open up their SIP ingress elements, because they fear SPIT. On the other hand, it is hard to build defenses against SPIT, whose characteristics and weak points you don’t really know. And it is completely impossible to prove that certain clever anti-SPIT measures work if you can’t test them in practice.

There has been no shortage of clever ideas concerning SPIT defenses. Academic papers galore, Internet-Drafts, and even products claiming to solve the problem.

We are still in the same vicious circle.

Will RUCUS help?