Why Domain Generating Algorithms (DGAs)?

Datetime: 2016-08-23 03:11:52          Topic: Algorithm

Command-and-control (C&C) infrastructure plays an essential role in coordinating botnets and malware. Attackers set up C&C servers to distribute commands to, or harvest sensitive data from, victims’ computers.

Simple C&C setups based on hardcoded IP addresses or domains give defenders an opening: they can blacklist the C&C server’s IP address or domain, and law enforcement agencies may take down the server. Using dynamic domain names instead of IP addresses gives the C&C infrastructure a certain degree of resilience by presenting a moving target. Domains are dynamically bound to different IPs as needed by the bot herder.

Many sophisticated malware families use a Domain Generating Algorithm (DGA). These algorithms provide the bot herder with domains that change constantly yet remain predictable to anyone who knows the algorithm and its seed. Because these domains are short-lived, blacklists will not be effective. The domains are often quite numerous, with upwards of tens of thousands generated per day by a single malware family. Malicious actors need only register one of the domains in order to carry out C&C, whereas defenders need to be aware of and block every generated domain that is registered in order to completely eliminate C&C activity.

What is a DGA?

A Domain Generating Algorithm is a class of algorithm that takes a seed as input, outputs a string, and appends a top-level domain (TLD) such as .com, .ru, .uk, etc. in order to form a possible domain name. The seed is a piece of information accessible to both the bot herder and the infected host now acting as a bot. Often the current date is used as a seed, as evidenced in the long-lived and still highly active botnet Conficker. Sometimes seeds are static integer values that are packed with the bot code, allowing different values to be used for different subsets of the botnet. In both of these cases, access to a bot sample can allow reverse engineering of the DGA, thereby giving researchers the ability to pre-generate the domains the bot could use. Occasionally seeds use dynamic information that is impossible to predict in advance, such as the least significant digits of foreign exchange rates, as leveraged by the Bedep botnet. In all cases, malicious actors are using DGAs to provide themselves with predictable rendezvous points for C&C use.
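To make the definition concrete, here is a minimal sketch of a date-seeded DGA in Python. It is a toy for illustration and does not reproduce Conficker, Bedep, or any other real malware family: the SHA-256 hashing step, the TLDS list, and the generate_domains name and parameters are all assumptions made for this example.

```python
import hashlib
from datetime import date

# Hypothetical TLD list, chosen for illustration only.
TLDS = [".com", ".ru", ".uk"]


def generate_domains(seed_date: date, count: int = 5, length: int = 12) -> list[str]:
    """Toy date-seeded DGA: derive `count` candidate domains from a date.

    Anyone who knows the algorithm and the date (bot, herder, or a
    researcher who reverse engineered a sample) gets the same list.
    """
    domains = []
    for index in range(count):
        # Seed material: the date plus a counter, so each index yields a new label.
        material = f"{seed_date.isoformat()}-{index}".encode("ascii")
        digest = hashlib.sha256(material).digest()
        # Map the first `length` digest bytes onto lowercase letters (length <= 32).
        label = "".join(chr(ord("a") + byte % 26) for byte in digest[:length])
        # Pick a TLD deterministically from the last digest byte.
        tld = TLDS[digest[-1] % len(TLDS)]
        domains.append(label + tld)
    return domains


if __name__ == "__main__":
    for domain in generate_domains(date(2016, 8, 23)):
        print(domain)
```

Because the output depends only on the date, the same function also serves a defender: calling it for future dates yields the list of domains that would need to be blocked or pre-registered.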

Why do take-downs such as Operation Tovar (GameOver Zeus take-down) fail?

As discussed in the blog The FBI vs. GameOver Zeus: Why The DGA-Based Botnet Wins, when botnets are uncovered, the law enforcement and security communities often take steps to take them down. One approach is to sinkhole the DGA domains the botnet uses for C&C (thus forcibly preventing malicious DNS entries from resolving to their intended hosts). For date-based DGAs, domains can be pre-registered as a part of the sinkhole, thereby preventing their use as C&C domains. However, this approach has some limitations:

  • It only works for seedless or static seed-based DGAs. DGAs using dynamic seeds are resistant to this approach. A given seed may be sinkholed, but use of a new or undiscovered seed will render the sinkhole ineffective.
  • The sinkhole must be maintained in perpetuity, across all of the relevant registrars. This requires coordination with all of the registrars for the top-level domains that the DGA uses. If the maintenance expires or fails to cover all registrars, the botnet can reanimate and herders can regain control of their bots.
  • The botnet can be modified to change the top-level domains (TLDs) it uses. This is a simple modification to a DGA, and effective sinkholing then requires the participation of new registrars.

What protections can Trend Micro TippingPoint provide?

TippingPoint provides an evolving set of filters that recognize the unique DGAs employed by specific malware families. This technology, known as DGA Defense, provides customers exclusive protection against command and control, data exfiltration, and further malware propagation via infected end users.

Additionally, TippingPoint provides filters that defend against techniques commonly used in DGAs, catching previously unknown and unseen DGA C&C. These filters make use of statistical properties of language and identify patterns that are likely non-linguistic. The technology does not require knowledge of the malware family, its DGA implementation, seeds, or top-level domains used in order to determine whether a domain is randomly generated. In this manner these DGA-agnostic filters provide protection that sinkholes, blacklists and takedowns cannot match, and can do so for malware families and threats that are still unknown (zero-day).
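The general idea of flagging non-linguistic names from statistical properties alone can be sketched as follows. This is not TippingPoint's filter logic, which is not described here; it is a deliberately simple heuristic that combines character entropy with vowel ratio. The function names, the thresholds, and the choice to inspect only the leftmost label are assumptions made for illustration.

```python
import math
from collections import Counter

VOWELS = set("aeiou")


def shannon_entropy(label: str) -> float:
    """Shannon entropy (bits per character) of the character distribution."""
    counts = Counter(label)
    total = len(label)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def vowel_ratio(label: str) -> float:
    """Fraction of alphabetic characters that are vowels."""
    letters = [c for c in label if c.isalpha()]
    if not letters:
        return 0.0
    return sum(c in VOWELS for c in letters) / len(letters)


def looks_generated(domain: str,
                    entropy_threshold: float = 3.0,
                    min_vowel_ratio: float = 0.25) -> bool:
    """Heuristic guess at whether a domain's leftmost label is machine-generated.

    Thresholds are illustrative assumptions, not tuned values.
    """
    label = domain.lower().split(".")[0]
    if len(label) < 8:
        return False  # too short to judge with these statistics
    # Conservative rule: require BOTH high entropy and few vowels.
    return (shannon_entropy(label) > entropy_threshold
            and vowel_ratio(label) < min_vowel_ratio)


if __name__ == "__main__":
    for name in ["wikipedia.org", "google.com", "xkqzjwvbtrplmn.com"]:
        print(name, looks_generated(name))
```

Note that the check never consults a malware family, seed, or TLD list, which is what makes this style of detection DGA-agnostic. A heuristic this crude will miss some generated names and misjudge some legitimate ones; a production filter would use a richer language model.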

In particular, the previously mentioned blog provides an example set of 8 domains associated with the Zeus trojan. Trend Micro TippingPoint IPS Filters 19665 and 20602 will identify and block 6 of these 8 domains (75%). While this may not seem like a high detection rate, identifying just one attempt to contact C&C is sufficient to quarantine a breached host, perform mitigation and prevent any further malicious activity. Since DGA-based malware often attempts to contact C&C thousands of times per day or more, a probabilistic detection will quite effectively identify these attempts before any malicious activity occurs.
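The arithmetic behind this argument can be sketched in a few lines. Assume, purely for illustration, that each C&C lookup is detected independently with the same probability p; real lookups are not strictly independent, so treat the result as a rough guide. The chance of catching at least one of n attempts is then 1 - (1 - p)^n. The function name below is hypothetical.

```python
def detection_probability(per_attempt_rate: float, attempts: int) -> float:
    """Probability that at least one of `attempts` lookups is detected,
    assuming each is detected independently with `per_attempt_rate`."""
    return 1.0 - (1.0 - per_attempt_rate) ** attempts


if __name__ == "__main__":
    # 0.75 is the 6-of-8 rate quoted above; attempt counts are illustrative.
    for n in (1, 2, 4, 8):
        print(n, detection_probability(0.75, n))
```

Under that assumption, a 75% per-attempt rate already gives a 93.75% chance of catching at least one of two attempts, and the miss probability shrinks by a factor of four with every additional attempt.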

Customer Use Case: DGA Defense via Trend Micro TippingPoint IPS
