CWE 117: Improper Output Sanitization for Logs

Flaw

CWE 117: Improper Output Sanitization for Logs is a logging-specific example of CRLF Injection. It occurs when a user maliciously or accidentally inserts line-ending characters (CR [Carriage Return], LF [Line Feed], or CRLF [a combination of the two]) into data that is written to a log. Because a line break is the record separator for log events, unexpected line breaks can cause problems when parsing logs, or can be used by attackers to forge log entries.

NB: You may have seen an LF character expressed as "\n", and a CR character expressed as "\r" in various programming guides. These escape codes represent the single LF and CR characters.

The following example shows what can happen if an application fails to handle CRLFs while processing data.

<form id="login" runat="server">
    <div>
        <asp:TextBox ID="UserName" runat="server"></asp:TextBox>
        <asp:TextBox ID="Password" TextMode="Password" runat="server"></asp:TextBox>
        <asp:Button ID="buttonLogin" runat="server" Text="Login" OnClick="buttonLogin_Click" />
        <asp:Label ID="errorMessage" runat="server" Text="Label"></asp:Label>
    </div>
</form>



public partial class Login : System.Web.UI.Page
{
    ...
    protected void buttonLogin_Click(object sender, EventArgs e)
    {
        Response.AppendToLog($"Authenticating User '{UserName.Text}'");
        if (PerformAuthentication(UserName.Text, Password.Text))
        {
            Response.AppendToLog("User logged in.");
            Response.Redirect("Index.aspx");
        }
        else
        {
            errorMessage.Text = "Unable to authenticate based on given credentials.";
            Response.AppendToLog("Unable to authenticate user.");
        }
    }
    ...
}

This login page behaves normally: if a user enters valid credentials, the site logs them in and directs them to another page. Meanwhile, a logger records the event and its outcome through calls to Response.AppendToLog(). Attackers anticipate that such logs exist and that they could contain evidence of a crime. For example, attackers who expect their breach to be investigated may want to redirect attention toward another suspect by populating the log file with false information. If the login form allows CRLFs in its input, attackers can use them to force parts of what they enter onto separate lines when the logger records it, and so write fraudulent entries to the log.

Attackers have to know what the log entries for a successful login look like compared to those for a failed login. For this example, we assume they have made an educated guess based on prior experience with common loggers. They must then design a "payload": data crafted to cause a malicious effect when entered in a particular context.

Whenever a login attempt occurs, three lines are written to the log. The username in the first line is the only variable part. Users can enter whatever they want as a username (here we use john), so they control the data in this bounded space. A successful login by john produces:

[INFO] - WebApplication - Login.aspx - Authenticating User 'john'
[INFO] - WebApplication - AuthenticationService - Validating credentials.
[INFO] - WebApplication - Login.aspx - User logged in.

The attacker enters the following malicious data into the username field (the attacker's line breaks are encoded as %0D%0A, see below):

james%27%0D%0A%5BINFO%5D+-+WebApplication+-+AuthenticationService+-+Validating+credentials.%0D%0A%5BINFO%5D+-+WebApplication+-+Login.aspx+-+User+logged+in.%0D%0A%5BINFO%5D+-+WebApplication+-+Login.aspx+-+Authenticating+User+%27joe

NB: Form fields are URL-encoded, even when part of a POST body, which means %0D%0A is an encoded CR followed by an encoded LF. Likewise, + is a space, %27 is a single quote, %5B and %5D are square brackets, and so on.
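To make the decoding step concrete, here is a minimal sketch that decodes the example payload and counts the line breaks it contains. The class and method names are hypothetical, and WebUtility.UrlDecode is used only as a stand-in that approximates how the framework decodes form fields.

```csharp
using System;
using System.Net;

public static class PayloadDecodingDemo
{
    // The URL-encoded username from the example (split across literals for readability).
    public const string EncodedUserName =
        "james%27%0D%0A%5BINFO%5D+-+WebApplication+-+AuthenticationService+-+Validating+credentials." +
        "%0D%0A%5BINFO%5D+-+WebApplication+-+Login.aspx+-+User+logged+in." +
        "%0D%0A%5BINFO%5D+-+WebApplication+-+Login.aspx+-+Authenticating+User+%27joe";

    // Turns '+' into a space and each %XX sequence into the character it encodes.
    public static string Decode(string encoded) => WebUtility.UrlDecode(encoded);

    // Counts the CRLF pairs in the decoded value; each one starts a new line in the log.
    public static int CountLineBreaks(string decoded) =>
        decoded.Split(new[] { "\r\n" }, StringSplitOptions.None).Length - 1;
}
```

Calling PayloadDecodingDemo.Decode(PayloadDecodingDemo.EncodedUserName) yields a single string that spans four lines once written to a file, because it contains three CRLF pairs.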

Because this encoded data contains CRLF line breaks, and the code does nothing to strip them, the decoded data is written directly to the log, which then looks like this:

[INFO] - WebApplication - Login.aspx - Authenticating User 'james'
[INFO] - WebApplication - AuthenticationService - Validating credentials.
[INFO] - WebApplication - Login.aspx - User logged in.
[INFO] - WebApplication - Login.aspx - Authenticating User 'joe'
[INFO] - WebApplication - AuthenticationService - Validating credentials.
[INFO] - WebApplication - Login.aspx - Unable to authenticate user.

  • The payload is the value of UserName.Text. This value is part of the first line written to the log for a failed login (which this payload will generate).
  • For the username itself, the attacker picks james, another user who will be framed. The single quote that follows (%27) closes the quoted username in the log line.
  • Next comes a CRLF (%0D%0A) to force the rest of the data onto a new line in the log.
  • On the next lines, a fake record of a successful login for james follows, making it look as if James was logged in to the system. Each forged line ends with another CRLF.
  • After the payload is written to the log, the remaining genuine lines of the failed login attempt follow. If these lines appeared without the first line of the three, it would be obvious that something is wrong with the logs, so the payload ends with a forged "Authenticating User 'joe" line. The application appends the closing quote, and the genuine lines then read as a failed login attempt by joe.

Now James appears to have logged in at the time of the actual breach, so he becomes a suspect in the subsequent investigation. If it later emerges that someone tampered with the logs, this might clear his name, but it would also undermine trust in the integrity of all log files: how would you know which logs have not been altered? This is the effect of CRLF injection in logs.

Fix

First and foremost, always validate and sanitize untrusted data before writing it to a log file. Validate the input provided by UserName.Text and check that it meets the system's expectations. Most systems limit usernames to alphanumeric characters.
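A minimal sketch of such an allowlist check follows. The UserNameValidator class and the 1 to 32 character limit are assumptions made for illustration, not part of the example application; adjust the pattern to your own username policy.

```csharp
using System.Text.RegularExpressions;

public static class UserNameValidator
{
    // Hypothetical policy: 1 to 32 ASCII letters or digits.
    // \A and \z anchor the match to the whole string; unlike $, \z does not
    // tolerate a trailing line feed.
    private static readonly Regex AllowedUserName =
        new Regex(@"\A[A-Za-z0-9]{1,32}\z", RegexOptions.CultureInvariant);

    // Returns true only when the entire value consists of allowed characters.
    public static bool IsValid(string userName) =>
        userName != null && AllowedUserName.IsMatch(userName);
}
```

With this check in place, buttonLogin_Click could reject the request before anything derived from UserName.Text reaches the log.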

Alternatively, you could prevent the two characters that make up a CRLF from reaching the log file by removing them from the input. You can achieve this by replacing every occurrence of \r and \n (separately) with something else.
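The replacement approach might look like the sketch below. LogSanitizer is a hypothetical helper, and substituting visible escape text for the control characters is one possible choice among several (an underscore or an empty string would also work).

```csharp
public static class LogSanitizer
{
    // Replaces each CR and LF (handled separately) with the two-character
    // text "\r" or "\n", so the value always stays on a single log line.
    public static string NeutralizeLineBreaks(string value)
    {
        if (value == null)
        {
            return string.Empty;
        }

        return value.Replace("\r", "\\r").Replace("\n", "\\n");
    }
}
```

Some log viewers may treat other characters as line separators too, so consider whether your environment needs a broader replacement list.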

A third option is to URL-encode the data before it is written to the logger. An encoded CRLF sequence does not produce a new line when written to a file. Look at the following example.

...
     protected void buttonLogin_Click(object sender, EventArgs e)
     {
-        Response.AppendToLog($"Authenticating User '{UserName.Text}'");
+        Logger.LogInfo($"Authenticating User '{HttpUtility.UrlEncode(UserName.Text)}'");
         if (PerformAuthentication(UserName.Text, Password.Text))
         {
-            Response.AppendToLog("User logged in.");
+            Logger.LogInfo("User logged in.");
             Response.Redirect("Index.aspx");
         }
         else
         {
+            Logger.LogInfo("Unable to authenticate user.");
             errorMessage.Text = "Unable to authenticate based on given credentials.";
-            Response.AppendToLog("Unable to authenticate user.");
         }
     }
     ...
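To illustrate the effect of the encoding in the fix above, the following sketch formats the first log message for an arbitrary username. EncodedLogging is a hypothetical helper, and it uses WebUtility.UrlEncode rather than HttpUtility.UrlEncode only so that it runs without a reference to System.Web; both encode CR and LF as percent sequences.

```csharp
using System.Net;

public static class EncodedLogging
{
    // Builds the "Authenticating User" message with the username URL-encoded,
    // so CR, LF, and quote characters appear as %XX text instead of raw characters.
    public static string FormatAuthenticationEntry(string userName)
    {
        string safe = WebUtility.UrlEncode(userName ?? string.Empty);
        return $"Authenticating User '{safe}'";
    }
}
```

For the attack payload, the resulting message stays on one line, and the attempted forgery remains visible in the log as encoded text.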

NB: Most .NET logging libraries can write log data to different storage types, ranging from files on the system and databases to the Windows Event Log. These libraries are highly configurable and can perform encoding or sanitization on the data elements written to them. Keep in mind that there remains a risk of an attacker manipulating the log data directly. For example, Windows Event Log entries have a length limit, which could allow an attacker to truncate log entries and possibly remove important information.

Log file injection is the basis of the above example, but CRLF injection can also appear in other forms, such as HTTP Response Splitting (CWE 113). This flaw rarely appears in a readily exploitable form, but if a fix is required, you can use the same strategy: encode CRLF characters before processing them. ASP.NET has several mitigating controls that reduce the risk of CRLF injection. The solution to this common attack type is to combine these controls with your own validation of untrusted data and, where needed, replacement or encoding of CRLF sequences.

