Intermittent outages on .org sites (e.g. ncsasports.org)

We’re tracking a problem that shows up as intermittent outages on .org domains.  It appears that the .org DNS servers sometimes return an empty response (an SOA record with no referral) instead of pointing to the domain’s authoritative name servers.  Our local DNS servers then cache that empty result, and the site appears down until the cache entry expires and the full recursive lookup happens again.

Here’s an example of a failed recursive lookup:

sfrazer-mbp:~ sfrazer$ dig +trace www.prairie.org

; <<>> DiG 9.4.2-P2 <<>> +trace www.prairie.org
;; global options:  printcmd
.            79601    IN    NS    l.root-servers.net.
.            79601    IN    NS    j.root-servers.net.
.            79601    IN    NS    c.root-servers.net.
.            79601    IN    NS    k.root-servers.net.
.            79601    IN    NS    i.root-servers.net.
.            79601    IN    NS    d.root-servers.net.
.            79601    IN    NS    b.root-servers.net.
.            79601    IN    NS    f.root-servers.net.
.            79601    IN    NS    a.root-servers.net.
.            79601    IN    NS    m.root-servers.net.
.            79601    IN    NS    e.root-servers.net.
.            79601    IN    NS    h.root-servers.net.
.            79601    IN    NS    g.root-servers.net.
;; Received 449 bytes from 192.168.0.21#53(192.168.0.21) in 11 ms

org.            172800    IN    NS    C0.ORG.AFILIAS-NST.INFO.
org.            172800    IN    NS    D0.ORG.AFILIAS-NST.org.
org.            172800    IN    NS    A0.ORG.AFILIAS-NST.INFO.
org.            172800    IN    NS    A2.ORG.AFILIAS-NST.INFO.
org.            172800    IN    NS    B0.ORG.AFILIAS-NST.org.
org.            172800    IN    NS    B2.ORG.AFILIAS-NST.org.
;; Received 435 bytes from 192.58.128.30#53(j.root-servers.net) in 31 ms

org.            0    IN    SOA    a0.org.afilias-nst.info. noc.afilias-nst.info. 2008502420 1800 900 604800 86400
;; Received 96 bytes from 199.19.56.1#53(A0.ORG.AFILIAS-NST.INFO) in 49 ms

sfrazer-mbp:~ sfrazer$

Instead of the SOA record, A0.ORG.AFILIAS-NST.INFO should have returned a referral listing our DNS servers, which would then have been queried for the final answer.
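To make the difference between a healthy hop and the failed one easier to spot, here is a minimal Python sketch that classifies one hop of `dig +trace` output by the record types it contains. The function name `classify_hop` and the labels it returns are made up for illustration, and the rule it uses (NS records mean a referral, an SOA on its own means no referral) is a simplification based on the trace above, not a full DNS response parser.

```python
def classify_hop(record_lines):
    """Classify one hop of `dig +trace` output by the record types it contains.

    Hypothetical helper: the labels are illustrative, not dig terminology.
    """
    types = set()
    for line in record_lines:
        line = line.strip()
        # Skip blank lines and dig's ";" comment lines.
        if not line or line.startswith(";"):
            continue
        fields = line.split()
        # Record layout: name, TTL, class, type, data...
        if len(fields) >= 4:
            types.add(fields[3].upper())
    if types & {"A", "AAAA", "CNAME"}:
        return "answer"
    if "NS" in types:
        return "referral"
    if "SOA" in types:
        return "no-referral"
    return "empty"


if __name__ == "__main__":
    root_hop = [
        ".            79601    IN    NS    l.root-servers.net.",
        ".            79601    IN    NS    j.root-servers.net.",
    ]
    failed_hop = [
        "org.            0    IN    SOA    a0.org.afilias-nst.info. "
        "noc.afilias-nst.info. 2008502420 1800 900 604800 86400",
    ]
    print(classify_hop(root_hop))    # prints: referral
    print(classify_hop(failed_hop))  # prints: no-referral
```

Fed the last hop of the trace above, it reports `no-referral`, which is the symptom described in this post.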

In short, the issue is out of our control.  Our DNS servers remain healthy and are serving the correct records, and the websites themselves are still up, even though some people will be unable to reach them.

Because we set the Time To Live on our DNS zones to 5 minutes, the outages generally don’t last long (the cache expires quickly and is refilled), but lookups against the .org servers happen more often, so people are more likely to see the problem.  The alternative would be longer TTL settings, which would reduce how often people see the problem but would lengthen the time until it resolves itself.
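The trade-off can be put into rough numbers. The sketch below is a back-of-the-envelope model, not a measurement: it assumes a single busy resolver that re-queries the .org servers as soon as its cached entry expires, that a bad response is cached for the same TTL as a good one, and a made-up 1% failure rate. The function name `ttl_tradeoff` and its parameters are hypothetical.

```python
def ttl_tradeoff(ttl_seconds, failure_rate, window_seconds=3600):
    """Estimate how often, and for how long, one resolver sees the outage.

    Assumptions (illustrative only): the resolver refreshes once per TTL,
    and a bad response stays cached for one full TTL.
    """
    lookups = window_seconds / ttl_seconds      # refreshes in the window
    expected_outages = lookups * failure_rate   # refreshes that come back bad
    return {
        "lookups": lookups,
        "expected_outages": expected_outages,
        "outage_length_seconds": ttl_seconds,
    }


if __name__ == "__main__":
    # 5 minutes (our setting) versus 1 hour, with an assumed 1% failure rate.
    for ttl in (300, 3600):
        print(ttl, ttl_tradeoff(ttl, failure_rate=0.01))
```

Under these assumptions a 5 minute TTL means 12 lookups an hour and roughly 0.12 expected outages of 5 minutes each, while a 1 hour TTL means one lookup and roughly 0.01 expected outages of an hour each. The short TTL fails more often but recovers faster.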

Update: The problem has apparently been resolved.  More information here.
