Posted on Wednesday, January 7th, 2009

Where Botted PCs Go – 2008 Q4

by Jose Nazario

I’m digging through our fourth quarter 2008 malware analysis data set to see which netblocks stand out, using a number of different tools. Perhaps the most interesting one is Aguri, from Kenjiro Cho of Sony Labs in Tokyo. When I have a pile of IPs to analyze, one of the things I do is run it through Aguri to look for common netblocks. Ptacek wrote a nice piece on how Aguri works, “Aguri: Coolest Data Structure You’ve Never Heard Of”. What I do is take a list of IPs, make a fake Aguri report with one IP per line, and crunch it through Aguri:

$ cat /tmp/2008q4_ips_connect.in | blacklist2aguri | aguri -l 65535

My AWK script “blacklist2aguri” just fakes up an Aguri report. I get results that look like this:

[src address] 36570 (100.00%)
 0.0.0.0/2          559 (1.53%/28.53%)
    4.0.0.0/7       535 (1.46%/4.73%)
     5.0.0.0/10     410 (1.12%/1.12%)
     5.64.0.0/10    411 (1.12%/1.12%)
     5.128.0.0/10   374 (1.02%/1.02%)
   8.0.0.0/5        556 (1.52%/5.68%)
    12.0.0.0/7      392 (1.07%/4.16%)
     13.0.0.0/10    387 (1.06%/1.06%)
     13.64.0.0/10   373 (1.02%/1.02%)
     13.192.0.0/10  371 (1.01%/1.01%)
    16.0.0.0/8      394 (1.08%/1.08%)
     58.0.0.0/9     563 (1.54%/1.54%)
     58.128.0.0/9   502 (1.37%/1.37%)
...

From there I can start to pick out the most interesting, most specific CIDRs or even IPs. If this many IPs survived this much data set reduction, something is afoot. Aguri is far from perfect (for instance, it’s order dependent), but it’s a great way to start dissecting some data.
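To make the idea of prefix aggregation concrete, here is a minimal Python sketch. It is not Aguri’s own algorithm: it is a simple batch pass that folds any prefix holding less than a chosen share of the addresses into its parent, so it does not reproduce Aguri’s order dependence. The function names, the min_share parameter, and the sample addresses (taken from the documentation ranges) are all assumptions made for illustration.

```python
from collections import Counter
from ipaddress import ip_network


def aggregate(ips, min_share=0.01):
    """Bottom-up prefix aggregation (simplified; not Aguri's algorithm).

    A prefix holding fewer than min_share of all addresses is folded into
    its parent (one bit shorter). Prefixes at or above the cutoff are kept.
    Each kept count excludes addresses already reported under a kept,
    more specific prefix.
    """
    level = Counter(ip_network(ip) for ip in ips)  # every IP starts as a /32
    cutoff = min_share * sum(level.values())
    kept = {}
    for _ in range(32):  # walk from /32 up to /1
        parents = Counter()
        for net, count in level.items():
            if count >= cutoff:
                kept[net] = count
            else:
                parents[net.supernet()] += count
        level = parents
    kept.update(level)  # any remainder ends up at 0.0.0.0/0
    return kept


def most_specific(kept, top=5):
    """Longest prefixes first, ties broken by higher count."""
    return sorted(kept.items(), key=lambda kv: (-kv[0].prefixlen, -kv[1]))[:top]


# Hypothetical input: documentation-range addresses, not real data.
sample = ["192.0.2.10"] * 5 + ["192.0.2.11", "198.51.100.1", "203.0.113.7"]
for net, count in most_specific(aggregate(sample, min_share=0.3)):
    print(net, count)
```

A lower min_share keeps more, and more specific, prefixes; a higher one pushes everything toward a handful of broad blocks.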

The first data set covers the connections that all of the infected boxes made during our analysis, over 36,000 of them this quarter. After discarding some of the malformed ones, we can see a few CIDRs pop out, and from there look at the associated ASNs:

  • 66.220.17.128/25, belonging to AS6939, HURRICANE – Hurricane Electric, Inc. HE has a lot of hosting and picks up a lot of crap; my experience is that they’re decent about remediation.
  • 91.203.92.96/30, belonging to AS44997 BTG12-AS BTG route block. The connectivity chain has me greatly concerned, as they’re behind UkrTelegroup. We know them from the same sphere as CERNEL and such.
  • 193.142.244.0/25, belonging to a provider in the Netherlands, UABSIP-NET. I don’t have much data on them at this point, so I’ll have to keep digging to see if they’re cause for concern.
  • 210.51.47.128/25, our old friends in CHINANET-BACKBONE. We’ve been looking at a lot of malware in .cn this past year. Some juicy stuff has come out of it, and we hope to send out some public writeups soon.

Everything else is a lot less specific and very broadly based. This isn’t surprising; it just means that we analyzed a lot of malware going to a lot of different places. It may also be due to the large number of scans that we have recorded in the connection table.

The next data set we looked at was the URLs that our malcode contacts. A URL can be a second stage payload for a downloader, a check-in site for a POST, or a benign fetch to Google, Microsoft, or some other site to check for Internet connectivity. Here are the most specific netblocks we saw and our brief analysis of them:

The 92,000 URLs we analyzed are far more concentrated than the connection data above, for reasons that aren’t always obvious. Hosting providers, repeat visits to the same places, and similar factors are all reasons why this data set may be more tightly clustered.
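One rough way to put a number on that concentration is to count how often each host appears and see what share of all URLs the top few hosts account for. The sketch below assumes the URLs are available as a list of strings; host_counts and top_share are hypothetical helpers, and resolving hosts to IPs or netblocks is left out.

```python
from collections import Counter
from urllib.parse import urlsplit


def host_counts(urls):
    """Count URLs per hostname, skipping entries that do not parse."""
    hosts = Counter()
    for url in urls:
        try:
            host = urlsplit(url.strip()).hostname
        except ValueError:
            continue  # malformed URL
        if host:
            hosts[host] += 1
    return hosts


def top_share(hosts, n=10):
    """Fraction of all URLs that point at the n most common hosts."""
    total = sum(hosts.values())
    if total == 0:
        return 0.0
    return sum(count for _, count in hosts.most_common(n)) / total
```

A top_share close to 1.0 for a small n means most URLs point at a handful of hosts.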

Based on the above data, I have a few more netblocks to keep an eye on and to dig into in our data. If new hotbeds of “badness” are emerging, this is one of the ways we can discover where they may be.
