Domain Categorization

Domain categorization is the practice of classifying a domain or website based on the content it hosts. For instance, https://sleepy-student.co falls under the Training and Tools category, according to Palo Alto Networks.

You can use various other tools or proxies to look up a website’s category, but if you want to use the one from the figure, here is the link: https://urlfiltering.paloaltonetworks.com/

Domain categorization varies depending on the proxy. Some tools use web crawlers and machine learning or pattern matching to categorize a domain based on a web page’s content. Some tools require a website to contain a minimum amount of content before they will categorize its domain. Domains can also be submitted to a proxy for categorization, and the verification process varies by tool.
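
To make the pattern-matching idea concrete, here is a minimal sketch of a content-based categorizer. It is only an illustration: the keyword lists, the MIN_WORDS threshold, and the category names other than Training and Tools are assumptions made up for this example, and real vendors rely on far richer signals that are not public.

```python
import re
from collections import Counter

# Hypothetical keyword lists, invented for this sketch.
CATEGORY_KEYWORDS = {
    "Training and Tools": {"course", "tutorial", "lesson", "training"},
    "Financial Services": {"bank", "loan", "invest", "mortgage"},
    "Social Networking": {"friends", "followers", "share", "profile"},
}

# Assumed minimum amount of content required before categorizing.
MIN_WORDS = 20


def categorize(page_text: str) -> str:
    """Return the best-matching category, or "Uncategorized"."""
    words = re.findall(r"[a-z]+", page_text.lower())
    if len(words) < MIN_WORDS:
        # Too little content to make a call.
        return "Uncategorized"
    counts = Counter(words)
    # Score each category by how often its keywords appear on the page.
    scores = {
        category: sum(counts[keyword] for keyword in keywords)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "Uncategorized"
```

A page with only a handful of words, or one that matches none of the keywords, stays uncategorized, which mirrors how a freshly registered domain with a placeholder page tends to be treated.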

 

How Businesses Use Domain Categorization Today

When a company finds a malicious URL involved in an incident, that URL is typically added to a blocklist and blocked by the company’s firewalls. After-action steps like these are reactive and can take some time to propagate. We want a proactive approach that helps prevent users from navigating to less-than-credible sites before an incident occurs. That is why we started leveraging domain categorization.


It is common practice for companies to block uncategorized domains or to explicitly allow only a select few categories. This can help prevent distractions at work, but the main objective is to keep users from unknowingly browsing to a malicious site.
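
As a rough sketch of what such a policy looks like in logic, the function below allows a request only when the domain has a category and that category is on an allow list. The category names and the ALLOWED_CATEGORIES set are hypothetical; a real proxy would pull the category from its vendor's database.

```python
from typing import Optional

# Hypothetical allow list for this sketch.
ALLOWED_CATEGORIES = {"Search Engines", "Webmail", "Financial Services"}


def is_allowed(category: Optional[str], allowed=ALLOWED_CATEGORIES) -> bool:
    """Decide whether a request to a domain in `category` may pass.

    `category` is None when the domain is uncategorized.
    """
    if category is None:
        # Uncategorized domains are blocked outright.
        return False
    return category in allowed
```

With this policy, is_allowed(None) and is_allowed("Social Networking") are both False, while is_allowed("Webmail") is True.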

Newly registered domains are typically uncategorized. Blocking uncategorized domains can be a pain point for attackers who rely on typosquatting, that is, purchasing new domains that are spelled similarly to their target domain. However, this will not halt a motivated adversary forever.
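
To illustrate what "spelled similarly" means in practice, here is a small sketch that flags a candidate domain when it is within a couple of character edits of a target domain. The max_distance threshold of 2 and the example domains in the note below are assumptions for illustration, not a recommended detection rule.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(
                prev[j] + 1,         # delete char_a
                curr[j - 1] + 1,     # insert char_b
                prev[j - 1] + cost,  # substitute or match
            ))
        prev = curr
    return prev[-1]


def looks_like_typosquat(candidate: str, target: str, max_distance: int = 2) -> bool:
    """True when candidate is close to, but not identical to, target."""
    distance = edit_distance(candidate.lower(), target.lower())
    return 0 < distance <= max_distance
```

For example, looks_like_typosquat("examp1e.com", "example.com") is True because only one character differs, while the identical domain and a completely unrelated one both return False.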


How Attackers Evade This Defense Mechanism

There are several ways attackers can obtain categorized domains to use in their attack infrastructure. Knowing which method is the best fit may require some situational awareness.


Purchasing Categorized Domains

Attackers can wait for categorized domains to expire and then purchase them. Tools such as domainhunter and catmyfish can find such domain names. For this to work, the domain must belong to a category that the target’s proxy will accept. For instance, assume an expired domain, toteslegit.com, is available for registration and is categorized as a “low risk, social media” website. If the target organization only allows Search Engines, Webmail, and financial websites through its proxy, toteslegit.com is not very useful, because users will not be allowed to visit it.

However, if the target organization allows access to social media sites, toteslegit.com would be a totes legit option. The attacker must opt for an existing domain with an existing reputation (category). In this situation, typosquatting is an unlikely option, because a newly registered lookalike domain would still be uncategorized.


Categorizing Your Own Domains

As defense mechanisms and techniques change, so do attack methods. There are ways attackers and red teamers can “force” categorization of their domains.


Forcing Categorization via Tools

Chameleon is a tool that helps quickly categorize infrastructure under arbitrary categories. Chameleon supports proxies such as Blue Coat, McAfee TrustedSource, and IBM X-Force. It works by submitting your domain for categorization and providing each proxy with the information required to fool it into assigning the desired category.

Using HTTP Redirects

Another way to bypass domain categorization is to use HTTP redirects. 

URL redirection, also called URL forwarding, sends incoming traffic for one domain to another domain. A redirect works by returning a 3xx status code with the new URL in the Location header of the HTTP response. The extra round trip adds a small amount of latency, but the lag is rarely noticeable.
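
The mechanism described above can be sketched with nothing but the Python standard library. The example below starts a throwaway local server that answers every GET with a 302 and a Location header, then requests it with a client that does not follow redirects so the raw response is visible. TARGET_URL, RedirectHandler, fetch_redirect, and demo are all names made up for this sketch, and https://example.com/ is a placeholder destination.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder destination for this sketch.
TARGET_URL = "https://example.com/"


class RedirectHandler(BaseHTTPRequestHandler):
    """Answer every GET with a temporary redirect to TARGET_URL."""

    def do_GET(self):
        self.send_response(302)
        self.send_header("Location", TARGET_URL)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_redirect(host: str, port: int, path: str = "/"):
    """Return (status, Location header) without following the redirect."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        response.read()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def demo():
    # Port 0 lets the OS pick a free port on the loopback interface.
    server = HTTPServer(("127.0.0.1", 0), RedirectHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return fetch_redirect("127.0.0.1", server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

A browser receiving this response would immediately request the URL in the Location header, which is why the user ends up on the destination site.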

We will not dive too deeply into various redirect codes in this blog entry; however, different 3xx status codes have different meanings. There is a notion of permanent and temporary redirect codes. 
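
For a quick reference on the permanent versus temporary distinction, here is a small sketch that labels the common redirect codes using the reason phrases from Python's http.HTTPStatus. The grouping into the two sets and the function names are choices made for this example.

```python
from http import HTTPStatus

PERMANENT_CODES = {301, 308}       # Moved Permanently, Permanent Redirect
TEMPORARY_CODES = {302, 303, 307}  # Found, See Other, Temporary Redirect


def redirect_kind(code: int) -> str:
    """Label a status code as a permanent redirect, a temporary one, or other."""
    if code in PERMANENT_CODES:
        return "permanent"
    if code in TEMPORARY_CODES:
        return "temporary"
    return "other"


def describe(code: int) -> str:
    """Combine the code, its standard reason phrase, and its redirect kind."""
    return f"{code} {HTTPStatus(code).phrase} ({redirect_kind(code)})"


if __name__ == "__main__":
    for code in (301, 302, 303, 307, 308):
        print(describe(code))
```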

A 301 redirect is a permanent, or strong, redirect code. The text in a 301 response typically contains “Moved Permanently.” You may notice this if you make a curl request to google.com like so:


curl http://google.com




If you typed http://google.com in your browser, it would redirect your request to https://google.com.

HTTP redirection tells browsers and search engines where you want users to end up when they request a particular URL. It is commonly used for redirecting HTTP to HTTPS. URL redirection is useful when a website has been reorganized to reprioritize web pages. It can also be helpful when a new website replaces an old one and the site owner wants to redirect traffic from the old URL to the new one.

We can forward all incoming traffic of our malicious example domain to a more credible domain using an HTTP redirect. Leveraging an HTTP redirect would eventually cause our domain to be categorized similarly to that credible domain we are forwarding traffic to.


Key Takeaways

  • Domain categorization is the system of classifying domains based on the content they host.

  • New domains are most often uncategorized.

  • Today, some businesses use domain categorization as a security control to keep their users from browsing malicious websites. They do this by blocking websites that are uncategorized, categorized as malicious, or categorized as content the business does not want employees browsing on its network.

  • As new defense mechanisms are developed, attackers learn new ways to bypass such defense mechanisms. We must practice defense in depth.
