XXE: Entity injection attacks

Entity injection attacks target applications that parse XML and can lead to data breaches. They are possible when the XML parser is configured insecurely. The most dangerous variant is XXE, short for XML eXternal Entity injection, in which the parser is made to retrieve external resources, such as arbitrary files on the server or resources elsewhere on the network.

What is XML

XML (eXtensible Markup Language) is a language for storing data in a structured way. The format is agreed in advance, which makes it readable for both humans and machines. XML is used in many ways; in the form of XHTML, for example, it tells a browser how a page is constructed. An example XML file looks as follows:

<students>
  <student>
    <name>John Doe</name>
    <id>1</id>
    <school>Groningen University</school>
  </student>
</students>

A school system might offer an option to upload new students as an XML file. A user can save the example above as students.xml and upload it; the system can then read the file and query it much like a database. This makes XML well suited to exchanging structured data.
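To make the "read and query" step concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The element names (students, student, name, id, school) and the helper load_students are assumptions made for illustration; a real school system would define its own format.

```python
import xml.etree.ElementTree as ET

# Hypothetical contents of students.xml; the element names are assumptions.
STUDENTS_XML = """<students>
  <student>
    <name>John Doe</name>
    <id>1</id>
    <school>Groningen University</school>
  </student>
</students>"""


def load_students(xml_text):
    """Return one dict per <student> element, keyed by child tag name."""
    root = ET.fromstring(xml_text)
    return [
        {child.tag: child.text for child in student}
        for student in root.findall("student")
    ]


students = load_students(STUDENTS_XML)
print(students)
```

Each student becomes a plain dictionary, so the uploaded data can be filtered or stored like any other record.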

What are entities

The XML format also supports a kind of variable. Within XML, such a variable is called an “entity.” Suppose all students attend the same school: to avoid typing “Cyber University” 1,000 times, the name can be defined once as an entity and referenced wherever it is needed:

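<!DOCTYPE students [
  <!ENTITY school "Cyber University">
]>
<students>
  <student>
    <name>John Doe</name>
    <id>1</id>
    <school>&school;</school>
  </student>
</students>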

At the top, the entity “school” is declared, and it is then referenced in the document as &school;.
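The effect of an internal entity can be observed with a small sketch, assuming Python's standard xml.etree.ElementTree module, which expands entities declared inside the document itself. The document layout and the second student are hypothetical.

```python
import xml.etree.ElementTree as ET

# The entity "school" is declared once in the DOCTYPE and referenced twice.
# Element names and the second student are assumptions for illustration.
DOC = """<!DOCTYPE students [
  <!ENTITY school "Cyber University">
]>
<students>
  <student><name>John Doe</name><school>&school;</school></student>
  <student><name>Jane Roe</name><school>&school;</school></student>
</students>"""

root = ET.fromstring(DOC)

# After parsing, every &school; reference has been replaced by its value.
schools = [student.findtext("school") for student in root.findall("student")]
print(schools)
```

The application code never sees the reference itself, only the expanded text, which is why entity handling is easy to overlook.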

External entities

It can be even more convenient: if the text “Cyber University” is supplied in a separate file, the name can be loaded from there, so the XML file itself never needs to change. The source can be a web address or another local file, for example. Such an entity is called an “external entity”; abusing this feature is what is known as XXE.

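<!DOCTYPE students [
  <!ENTITY school SYSTEM "school.txt">
]>
<students>
  <student>
    <name>John Doe</name>
    <id>1</id>
    <school>&school;</school>
  </student>
</students>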

In the above example, the contents of the file “school.txt” are read and inserted wherever &school; appears. This is convenient, but it is also a security problem: when the XML file is processed on a server, the same mechanism can read files that live on that server. If there is a file called “passwords.config” in the same directory, an attacker can point the entity at it and retrieve the passwords, and the same goes for system files such as /etc/passwd. This could cause a huge data breach!
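One way to spot such a document before it reaches the rest of the application is to list the external entities it declares, without resolving any of them. The following is a rough sketch built on Python's standard xml.parsers.expat module; the function name find_external_entities and the sample documents are assumptions, not part of any particular product.

```python
from xml.parsers import expat

# Hypothetical malicious upload: the entity points at a system file.
XXE_DOC = """<!DOCTYPE students [
  <!ENTITY school SYSTEM "file:///etc/passwd">
]>
<students><student><school>&school;</school></student></students>"""

# Harmless variant: the entity value is written inline.
INTERNAL_DOC = """<!DOCTYPE students [
  <!ENTITY school "Cyber University">
]>
<students><student><school>&school;</school></student></students>"""


def find_external_entities(xml_text):
    """Return (name, system_id) for every external entity declaration."""
    found = []

    def on_entity_decl(name, is_parameter, value, base,
                       system_id, public_id, notation):
        # External entities carry no inline value, only a system identifier.
        if value is None and system_id is not None:
            found.append((name, system_id))

    parser = expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    # No external entity handler is installed, so nothing is fetched.
    parser.Parse(xml_text, True)
    return found


print(find_external_entities(XXE_DOC))
print(find_external_entities(INTERNAL_DOC))
```

A check like this is useful for logging or alerting, but it is not a substitute for configuring the parser that actually processes the upload.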

Entity expansions

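Another problem is “entity expansion.” This attack is aimed less at stealing data and more at making the server unavailable. A well-known example is the “billion laughs attack,” shown below:

<?xml version="1.0"?>
<!DOCTYPE lolz [
  <!ENTITY lol "lol">
  <!ENTITY lol1 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
  <!ENTITY lol2 "&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;">
  <!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
  <!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;">
  <!ENTITY lol5 "&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;">
  <!ENTITY lol6 "&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;">
  <!ENTITY lol7 "&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;">
  <!ENTITY lol8 "&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;">
  <!ENTITY lol9 "&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;">
]>
<lolz>&lol9;</lolz>

What happens here is that an entity “lol” is created with the value “lol”. A second entity, lol1, is then defined as ten references to the first, so it expands to “lollollollollollollollollollol”. Each following entity repeats the previous one ten times, until lol9 contains the word “lol” a billion times. At 1 byte per letter, the expanded entity takes up 3 billion bytes, roughly 3 GB. This is a huge drain on the server’s resources, since the result not only has to be held in memory but also processed. A parser without expansion limits will typically exhaust memory or hang, and should the server survive, the attack is easily scaled up by adding a few more lines.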
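The arithmetic can be checked with a short sketch that builds the payload for a given number of levels and compares its size with the size of the expanded text. The helper names build_payload and expanded_size are assumptions, and only a harmless two-level payload is actually parsed, using Python's standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET


def build_payload(levels, fanout=10):
    """Build a billion-laughs style document with the given nesting depth."""
    lines = ["<!DOCTYPE lolz [", '  <!ENTITY lol "lol">']
    previous = "lol"
    for level in range(1, levels + 1):
        name = f"lol{level}"
        references = f"&{previous};" * fanout
        lines.append(f'  <!ENTITY {name} "{references}">')
        previous = name
    lines.append("]>")
    lines.append(f"<lolz>&{previous};</lolz>")
    return "\n".join(lines)


def expanded_size(levels, fanout=10, seed="lol"):
    """Number of characters the top-level entity expands to."""
    return len(seed) * fanout ** levels


# Parse only a tiny instance: two levels expand to 3 * 10**2 characters.
small_text = ET.fromstring(build_payload(levels=2)).text
print(len(small_text), expanded_size(2))

# The full nine-level payload is built here but deliberately never parsed.
full_payload = build_payload(levels=9)
print(len(full_payload), expanded_size(9))
```

The nine-level document is well under a kilobyte on disk, yet it expands to 3,000,000,000 characters, which is what makes the attack so cheap for the attacker.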

Prevention

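There is no single, universal way to prevent entity injection and the billion laughs attack, because the fix depends on the parser in use. Most XML parsers do have settings that disable DTD processing, external entities, or entity expansion. Consult your parser’s documentation and switch these features off unless the application genuinely needs them.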
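As an illustration of what such a setting amounts to, the sketch below refuses any document that contains a DOCTYPE declaration before handing it to the real parser. It assumes Python's standard library; parse_without_dtd and DTDForbidden are hypothetical names. Rejecting DTDs outright is only one possible policy and is too strict for applications that rely on them. The third-party defusedxml package takes a similar approach.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DTDForbidden(ValueError):
    """Raised when a document contains a DOCTYPE declaration."""


def parse_without_dtd(xml_text):
    """Parse xml_text, but refuse any document that declares a DOCTYPE."""

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Entities can only be declared inside a DTD, so stop right here.
        raise DTDForbidden(f"DOCTYPE {name!r} is not allowed")

    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = on_doctype
    # The handler fires before any entity reference in the body is reached.
    checker.Parse(xml_text, True)
    return ET.fromstring(xml_text)


safe = parse_without_dtd(
    "<students><student><name>John Doe</name></student></students>"
)
print(safe.findtext("student/name"))

try:
    parse_without_dtd(
        '<!DOCTYPE students [<!ENTITY school SYSTEM "school.txt">]>'
        "<students>&school;</students>"
    )
except DTDForbidden as error:
    print("rejected:", error)
```

Because the check happens at the DOCTYPE itself, the same guard rejects both the external entity example and the billion laughs payload.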

Entity injection is common in larger software packages and custom software. During a pen test, CyberAnt checks for this.