Directory Traversal

Setup #

There is nothing to download or install for this lab. You can access the lab environment by visiting astarml.com.

Introduction #

Also known as Path Traversal, Directory Traversal is a vulnerability that lets users read arbitrary files from the filesystem of the server hosting the application. These files might include:

Source code of the application
Configuration files containing credentials
Sensitive operating system files

Identifying Directory Traversal #

Let’s take a look at our lab site again, specifically /marketing. At first glance, this seems like a normal webpage.

However, if we click one of the links, such as About or Services, we’ll see that the URL looks a little different from what we are used to: https://astarml.com/marketing?file=about

Let’s look at the source code to see what this page is doing.

app.get('/marketing', (req, res) => {
    
    // 1) Grabbing the value of the query param 'file' 
    const fileParam = req.query.file || 'home';
    
    // 2) Setting that parameter's value to the variable 'fileName'
    let fileName = fileParam;
    
    // 3) If the provided file path doesn't include .. or / or \, .html gets appended to the file name. If any of those exist, it's allowed as-is.
    if (!fileParam.includes('..') && !fileParam.includes('/') && !fileParam.includes('\\')) {
        fileName = `${fileParam}.html`;
    }
    
    // 4) The directory name of the current module, "pages", and then the file name are all joined into the full path.
    const filePath = path.join(__dirname, 'pages', fileName);
    
    // 5) This code checks if the file exists. If it does, it is returned to the client.
    if (fs.existsSync(filePath)) {
        const content = fs.readFileSync(filePath, 'utf8');
        // Replace /style.css references to work with the /marketing prefix
        const modifiedContent = content.replace(/href="\/style\.css"/g, 'href="/style.css"');
        res.send(modifiedContent);
    } 
    // 6) If the file doesn't exist, then a 404 page is returned to the client.
    else {
        res.status(404).send(`
            <!DOCTYPE html>
            <html>
            <head>
                <title>404 - Page Not Found</title>
                <link rel="stylesheet" href="/style.css">
            </head>
            <body>
                <div class="container">
                    <h1>404 - Page Not Found</h1>
                    <p>The page "${fileParam}" could not be found.</p>
                    <a href="/marketing?file=home">Return Home</a>
                </div>
            </body>
            </html>
        `);
    }
});

What we are particularly interested in here are steps 3 and 4. If the user supplies a file name of ../server.js, for example, step 3 leaves it untouched because it contains .. and /, so no .html is appended. Step 4 then builds the full path <app_directory>/pages/../server.js. This is a valid path, and thus the file ‘server.js’ gets returned.

We can see why this works by looking at the structure of the project and breaking down the path. We start in the app folder. The path takes us into the pages folder. The ../ brings us back to the app folder, and then we end on server.js, located in the app folder.

We don’t actually have to remain within the project, though. If we passed in ../../etc/passwd, we’d again start in the app directory and enter the pages folder, but then we’d back up two levels and end up in the root directory of the server (assuming the app directory is located in the root directory, which is common for containerized applications). We’d then enter the “etc” directory and end up on the passwd file. Because this file exists, as checked in step 5, the application returns it to us.
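The path arithmetic described above can be reproduced without the web server. The sketch below copies only steps 3 and 4 of the handler into a standalone function. The /app directory and the buildPath name are assumptions made for illustration, and path.posix is used so the output is the same on any operating system.

```javascript
const path = require('path');

// Hypothetical application directory, chosen only for this illustration.
const APP_DIR = '/app';

// Mirrors steps 3 and 4 of the /marketing handler: append .html only when the
// input has no "..", "/" or "\", then join everything into one path.
function buildPath(fileParam) {
  let fileName = fileParam;
  if (!fileParam.includes('..') && !fileParam.includes('/') && !fileParam.includes('\\')) {
    fileName = `${fileParam}.html`;
  }
  // path.join normalizes the result, so "pages/.." cancels out.
  return path.posix.join(APP_DIR, 'pages', fileName);
}

console.log(buildPath('about'));            // /app/pages/about.html
console.log(buildPath('../server.js'));     // /app/server.js
console.log(buildPath('../../etc/passwd')); // /etc/passwd
```

Only the first call stays inside the pages folder. The other two resolve to files the page was never meant to serve.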

Writing to files on the system #

I don’t have a running example of this because I did not want to allow write access from the open internet, but this vulnerability is also often found in file upload functionality when the user is able to specify their own file name.

// 1) The file name is grabbed from the request body
const userProvidedFileName = req.body.filename;

// 2) The path where the file will be placed is built
const destinationPath = path.join(UPLOAD_DIR, userProvidedFileName);

// 3) The file is placed in that location
fs.renameSync(req.file.path, destinationPath);

In this case, if “UPLOAD_DIR” is <app folder>/uploads and the user provides a file with the name ../malicious.txt, the full path would end up as <app folder>/uploads/../malicious.txt, which would actually place the file directly in the <app folder>. If we can do this with more sensitive targets, such as ../../etc/shadow, and the application runs with enough privileges to write there, we may have an avenue to log directly into the server by overwriting that file with our own.
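To see where an upload would land, step 2 can be run on its own. This is a sketch only: it assumes the app folder is /app, the destinationFor helper is a hypothetical name, and nothing is written to disk.

```javascript
const path = require('path');

// Hypothetical upload directory, assuming the app folder is /app.
const UPLOAD_DIR = '/app/uploads';

// Same join as step 2 of the upload snippet; no file is touched.
function destinationFor(userProvidedFileName) {
  return path.posix.join(UPLOAD_DIR, userProvidedFileName);
}

console.log(destinationFor('report.txt'));       // /app/uploads/report.txt
console.log(destinationFor('../malicious.txt')); // /app/malicious.txt
console.log(destinationFor('../../etc/shadow')); // /etc/shadow
```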

Preventing Directory Traversal #

Use specific IDs not tied to filenames #

When requesting a document or resource, utilize an ID assigned to that resource and tracked in a database, rather than calling the source directly.

# Safer:
/download?fileID=123

# Not Safe:
/download?file=../../etc/passwd
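A minimal sketch of the ID approach follows. The in-memory Map stands in for the database table, and the directory, file names, and the resolveFileById function are all hypothetical. The point is that user input is only ever used as a lookup key and never becomes part of a path.

```javascript
const path = require('path');

// Hypothetical storage directory and ID-to-file table. In a real application
// this mapping would be a database query.
const FILES_DIR = '/var/www/files';
const filesById = new Map([
  ['123', 'annual-report.pdf'],
  ['124', 'price-list.pdf'],
]);

// Returns the server-side path for a known ID, or null for anything else.
function resolveFileById(fileId) {
  const storedName = filesById.get(String(fileId));
  if (storedName === undefined) {
    return null; // unknown ID: nothing to serve
  }
  // Only the server-controlled name is ever joined into the path.
  return path.posix.join(FILES_DIR, storedName);
}

console.log(resolveFileById('123'));              // /var/www/files/annual-report.pdf
console.log(resolveFileById('../../etc/passwd')); // null
```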

Enforce the root path #

In cases where you absolutely need to allow direct access to a resource (this is common for images and other media), enforce the root directory of the path and refuse any attempts to escape it.

# Specify the base path
base = /var/www/files

# Build the full path
requested = base + "/" + userInput

# Normalize the full path using framework-specific functionality (realpath, Path.resolve(), etc.)
resolved = normalize(requested)

# Reject anything that resolves outside the base path
Check: if (resolved != base && !resolved.startsWith(base + "/")) { reject }
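In Node.js, the check above might look like the following sketch. BASE_DIR and resolveInsideBase are hypothetical names. The check is purely lexical: it does not follow symbolic links, so a real handler would likely also pass the result through fs.realpathSync before comparing.

```javascript
const path = require('path');

// Hypothetical base directory, matching the pseudocode above.
const BASE_DIR = '/var/www/files';

// Returns the normalized absolute path if it stays inside BASE_DIR, otherwise null.
function resolveInsideBase(userInput) {
  // resolve() normalizes "." and ".." and honors absolute input such as /etc/passwd.
  const resolved = path.posix.resolve(BASE_DIR, userInput);
  // relative() gives the route from BASE_DIR to the target. A route that
  // begins with ".." leaves the base directory.
  const relative = path.posix.relative(BASE_DIR, resolved);
  const escapes =
    relative === '..' || relative.startsWith('../') || path.posix.isAbsolute(relative);
  return escapes ? null : resolved;
}

console.log(resolveInsideBase('images/logo.png'));  // /var/www/files/images/logo.png
console.log(resolveInsideBase('../../etc/passwd')); // null
console.log(resolveInsideBase('../files_evil/x'));  // null
```

Comparing the relative route avoids a common mistake with plain prefix checks, where a sibling directory such as /var/www/files_evil would pass a startsWith('/var/www/files') test.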

Remove encoding and escape attempts #

Before opening/returning a file:

Resolve . and .. segments. That means decoding the input first to catch %2e%2e, %2f, ..%2f, double-encoding, etc.
Apply checks after decoding and normalization, because attackers will try all sorts of crazy combinations to bypass validation.
Use framework/OS functions that give you the canonical path; don’t roll your own string hacks.
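The decoding step can be sketched as a loop that keeps decoding until the value stops changing, so that double-encoded input is fully unwrapped before any check runs. The fullyDecode name and the limit of five rounds are arbitrary choices for this example. Note that repeated decoding would also alter a legitimate name containing a literal percent sign, which is a trade-off to weigh for your application.

```javascript
// Decodes percent-encoding repeatedly until the value is stable.
// Returns null for malformed encoding or input that never settles.
function fullyDecode(input, maxRounds = 5) {
  let current = input;
  for (let i = 0; i < maxRounds; i += 1) {
    let decoded;
    try {
      decoded = decodeURIComponent(current);
    } catch (err) {
      return null; // malformed percent-encoding such as "%zz"
    }
    if (decoded === current) {
      return current; // nothing left to decode
    }
    current = decoded;
  }
  return null; // still changing after maxRounds: treat as hostile
}

console.log(fullyDecode('%2e%2e%2fserver.js'));       // ../server.js
console.log(fullyDecode('%252e%252e%252fserver.js')); // ../server.js
```

Validation and path normalization should then run on the returned value, never on the raw input.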

Reject suspicious input #

One of the very first checks you can do after decoding any input is to look for weird patterns like .. or leading slashes, because the odds of those appearing in a valid filename are slim to none. Also check for things like null bytes (%00), because the odds of those appearing in a valid filepath are zero.
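These early checks can be sketched as a single predicate. The isSuspicious name is hypothetical, and the function assumes it receives input that has already been decoded.

```javascript
// Returns true when an already-decoded file name contains patterns that
// almost never appear in legitimate names.
function isSuspicious(name) {
  return (
    name.includes('\0') ||   // null byte (%00 before decoding)
    name.includes('..') ||   // parent-directory reference
    name.startsWith('/') ||  // leading slash
    name.startsWith('\\')    // leading backslash
  );
}

console.log(isSuspicious('about.html'));   // false
console.log(isSuspicious('../server.js')); // true
console.log(isSuspicious('/etc/passwd'));  // true
```

A check like this is a quick first filter, not a replacement for enforcing the root path after normalization.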

Use out-of-band storage mechanisms #

Instead of storing files on the local system at all, utilize out-of-band storage mechanisms such as Amazon S3, Google Cloud Storage, a database, etc.

Utilize Least-Privilege #

Run the application as a low-privileged user with restricted permissions. With proper permissions applied to that user, even if an attacker gained access to the system as that user, they would only have access to the application’s own folder and files. They wouldn’t be able to read /etc/shadow, other users’ SSH keys, or other protected files on the system. Note that /etc/passwd is world-readable on most Linux systems, so least privilege alone does not hide it.

Use Framework-specific mechanisms #

Most modern frameworks have systems for retrieving and uploading files that have been validated for security. Utilize them before attempting to roll your own system.