PHP Security Basics: Input Validation

Validation is a technique to ensure that input is secure before using it in your code.

When validating data, you are verifying that it corresponds to what the program needs. This only works if you have a list of criteria that you can check to determine that the data is valid.

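In this article, we’re going to look at whitelisting, qualifying data, choosing the right qualifications, and the validation functions that WordPress and PHP offer.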

Whitelisting

The simplest validation method is whitelisting. This only works when there is a precise set of possible values that the data can have.

Let’s look at how whitelisting can be used for validating a theme option controlling the position of the sidebar.

Here is the code that we used to create the setting and the control:

$wp_customize->add_setting(
  'sidebar-position',
  [
    'default' => 'left',
    'sanitize_callback' => 'wpdc_validate_sidebar_position',
  ]
);

$wp_customize->add_control(
  'sidebar-position-control',
  [
    'label'    => esc_html__( 'Sidebar Position', 'wpdc' ),
    'section'  => 'theme',
    'settings' => 'sidebar-position',
    'type'     => 'radio',
    'choices'  => [
        'left'  => esc_html__( 'Left', 'wpdc' ),
        'right' => esc_html__( 'Right', 'wpdc' ),
    ],
  ]
);

The user only has two choices: left or right. This means that in wpdc_validate_sidebar_position(), we can check whether the submitted option is one of the two possible values.

function wpdc_validate_sidebar_position( $sidebar_position ) {
    // Accept the value only if it is exactly one of the allowed positions.
    if ( in_array( $sidebar_position, [ 'left', 'right' ], true ) ) {
        return $sidebar_position;
    }

    // Anything else is rejected.
    return null;
}

To do this, we use the in_array() PHP function. This function returns true when the needle, the submitted value for the position of the sidebar, is in the haystack, the list of possible positions.

The third parameter of in_array() enables strict type comparison. We pass true to turn it on. This is important because loose type comparison in PHP can lead to unexpected results.
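To make the risk concrete, here is a small sketch in plain PHP (no WordPress required). The boolean true is only an illustrative input, chosen because a loose comparison between true and any non-empty string succeeds.

```php
<?php
$allowed = [ 'left', 'right' ];

// Loose comparison: true == 'left' evaluates to true, so the check passes.
$loose = in_array( true, $allowed );

// Strict comparison: true is not identical to either string, so the check fails.
$strict = in_array( true, $allowed, true );

var_dump( $loose );  // bool(true)
var_dump( $strict ); // bool(false)
```

With the loose check, a value that is not a sidebar position at all would be accepted as valid.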

So whitelisting simply means that we compare the submitted data against a list of acceptable values. This works well for controls such as checkboxes, radio buttons, selects, and dropdowns.
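The same pattern can be wrapped in a small reusable helper. The sketch below is one possible shape, assuming you prefer to fall back to a default instead of rejecting the value; wpdc_whitelist() and the layout names are hypothetical and not part of WordPress or PHP.

```php
<?php
/**
 * Returns $value if it is one of the allowed values, otherwise $default.
 * Hypothetical helper, not part of WordPress or PHP.
 */
function wpdc_whitelist( $value, array $allowed, $default ) {
    return in_array( $value, $allowed, true ) ? $value : $default;
}

// Illustrative list of choices for a select control.
$layouts = [ 'grid', 'list', 'masonry' ];

echo wpdc_whitelist( 'list', $layouts, 'grid' ), "\n";     // list
echo wpdc_whitelist( 'carousel', $layouts, 'grid' ), "\n"; // grid
```

Whether to fall back to a default or to reject the value outright depends on the setting; the fallback is a design choice made for this example only.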

But how can we validate data for which we don’t know the possible values? Let’s have a look at validating data according to a set of qualifications.

Qualifying data

When qualifying data, we try to find out whether it meets a precise set of criteria. Let’s look at an example of validating data.

Imagine that you have a meta box that allows users to enter a value for the width (in pixels) of the content area of a particular post. While not being a super useful feature in a theme, this example allows us to demonstrate the use of filter_input().

The filter_input() PHP function gets a variable and validates it. The function accepts four arguments: the type of input, the name of the variable to get, the filter (validation) to apply, and an optional array of options.

<?php
$content_area_width = filter_input(
  INPUT_POST,
  'content_area_width',
  FILTER_VALIDATE_INT,
  [
    'options' => [
      'default' => 500,
      'min_range' => 100,
      'max_range' => 1000,
    ]
  ]
);
?>

Although the code for this function might seem verbose, it’s much shorter and clearer than writing it all out:

<?php
// Warning: This code does not work correctly.
if ( isset( $_POST['content_area_width'] ) && is_int( $_POST['content_area_width'] ) && $_POST['content_area_width'] >= 100 && $_POST['content_area_width'] <= 1000 ) {
    $content_area_width = $_POST['content_area_width'];
} else {
    $content_area_width = 500;
}
?>

You might wonder why there is a warning about this code not working. It looks fine at first glance, right? The problem is that is_int( $_POST['content_area_width'] ) always returns false, so $content_area_width always ends up as 500.

This is because data retrieved from the $_GET and $_POST super globals is always of the type string. Using the filter_input() function allows us to get around this limitation of the PHP language.
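The difference is easy to try out with filter_var(), which applies the same filters to a value you pass in directly. The following is a minimal sketch; wpdc_validate_content_area_width() is a hypothetical wrapper that exists only so the example can run outside of a form submission.

```php
<?php
// Hypothetical helper mirroring the filter_input() call, but taking the raw
// value as a parameter so it can be run outside of a form submission.
function wpdc_validate_content_area_width( $raw ) {
    return filter_var(
        $raw,
        FILTER_VALIDATE_INT,
        [
            'options' => [
                'default'   => 500,
                'min_range' => 100,
                'max_range' => 1000,
            ],
        ]
    );
}

var_dump( is_int( '250' ) );                            // bool(false): it is a string
var_dump( wpdc_validate_content_area_width( '250' ) );  // int(250)
var_dump( wpdc_validate_content_area_width( '5000' ) ); // int(500): out of range
var_dump( wpdc_validate_content_area_width( 'wide' ) ); // int(500): not an integer
```

Note that a valid value comes back as a real integer, so later code no longer has to deal with a string.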

Choosing the right qualifications

When validating data, it’s crucial that you choose the right set of qualifications, and express this correctly in the code.

Imagine that you have a Customizer setting in your theme for entering a link to a Twitter profile. You want to have a valid URL for this setting, so you use the filter_var() PHP function with the FILTER_VALIDATE_URL filter.

<?php
// Warning: Insecure code!
function wpdc_validate_twitter_profile_url( $url ) {
    return filter_var( $url, FILTER_VALIDATE_URL );
}
?>

The next thing you do is output the validated URL in your theme:

<?php
// Warning: Insecure code!
echo '<a href="' . $twitter_url . '">' . esc_html__( 'Twitter', 'wpdc' ) . '</a>';
?>

In the four lines of code that we have seen so far, we have made two crucial mistakes:

  1. We trusted the filter_var() function to validate the URL to the Twitter profile.
  2. We didn’t escape the URL on output.

We are going to look at escaping in a later part of this series. For now let’s look at why the validation was too weak to be secure.

The problem is that javascript://test%0Aalert(321) passes as a valid URL. As soon as a user clicks on the Twitter link on the front end of the site, a JavaScript dialog appears.
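One way to see what is wrong with such a value is to look at its scheme. The sketch below uses PHP's parse_url() for that; wpdc_has_allowed_scheme() and its default list of schemes are assumptions made for this illustration, not an existing WordPress or PHP API.

```php
<?php
// Hypothetical helper: true only when the URL's scheme is on the allowlist.
function wpdc_has_allowed_scheme( $url, array $schemes = [ 'https' ] ) {
    $scheme = parse_url( $url, PHP_URL_SCHEME );

    // parse_url() returns null when there is no scheme and false on a malformed URL.
    if ( ! is_string( $scheme ) ) {
        return false;
    }

    return in_array( strtolower( $scheme ), $schemes, true );
}

var_dump( parse_url( 'javascript://test%0Aalert(321)', PHP_URL_SCHEME ) ); // string(10) "javascript"
var_dump( wpdc_has_allowed_scheme( 'javascript://test%0Aalert(321)' ) );   // bool(false)
var_dump( wpdc_has_allowed_scheme( 'https://twitter.com/wordpress' ) );    // bool(true)
```

A syntactically valid URL tells you nothing about where it points or what it does when clicked, which is why the scheme has to be checked as well.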

We need to add additional checks to our function:

function wpdc_validate_twitter_profile_url( $url ) {
    // Reject anything that does not start with the expected prefix.
    if ( 0 !== strpos( $url, 'https://twitter.com/' ) ) {
        return;
    }

    return filter_var( $url, FILTER_VALIDATE_URL, FILTER_FLAG_PATH_REQUIRED );
}

This function now verifies that the data meets three qualifications:

  1. The URL starts with https://twitter.com/.
  2. The URL is valid according to the RFC 2396 standard.
  3. The URL has a path component (as in http://example.org/path).
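If you want to narrow the qualifications further, you could also check the individual URL components. The following is only a sketch under stated assumptions: wpdc_is_twitter_profile_url() is a hypothetical function, and the path pattern (1 to 15 letters, digits, or underscores) is an assumption about what a Twitter handle looks like.

```php
<?php
// Hypothetical stricter validator. The path pattern (1 to 15 letters, digits,
// or underscores) is an assumption about Twitter handles.
function wpdc_is_twitter_profile_url( $url ) {
    if ( ! is_string( $url ) || false === filter_var( $url, FILTER_VALIDATE_URL ) ) {
        return false;
    }

    $parts = parse_url( $url );

    // All three components must be present and match exactly.
    return isset( $parts['scheme'], $parts['host'], $parts['path'] )
        && 'https' === $parts['scheme']
        && 'twitter.com' === $parts['host']
        && 1 === preg_match( '#^/[A-Za-z0-9_]{1,15}/?\z#', $parts['path'] );
}

var_dump( wpdc_is_twitter_profile_url( 'https://twitter.com/wordpress' ) );     // bool(true)
var_dump( wpdc_is_twitter_profile_url( 'https://twitter.com/' ) );              // bool(false)
var_dump( wpdc_is_twitter_profile_url( 'https://example.org/wordpress' ) );     // bool(false)
```

This variant returns a boolean instead of the URL, so it answers only the question of whether the value qualifies.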

Validation functions

WordPress validation functions

WordPress only has a handful of validation functions.

  • is_email(): Checks whether the data is a valid email address. The validation done by the function does not comply with the RFC 822 standard, and does not work with internationalized domain names.
  • wp_validate_boolean(): Despite the name, this function not only validates, but also sanitizes the data passed to it. So the return value will always be a boolean. You can use filter_var( $var, FILTER_VALIDATE_BOOLEAN, FILTER_NULL_ON_FAILURE ) as an alternative, as it returns NULL when the passed data is not valid.
  • sanitize_hex_color(): This is actually a validation function, as it returns null if the color code isn’t valid. It is only available in the Customizer context, but it’s a small function so you can copy the code to your own validation function if needed.
  • sanitize_hex_color_no_hash(): The same as sanitize_hex_color() but for values without a leading #.
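  • rest_is_ip_address(): Checks whether the data is a valid IP address.
  • rest_is_boolean(): Checks whether the data is a boolean or a boolean-like value.
  • is_serialized(): Checks whether the data is a serialized string.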
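Two of the items above can be tried out in plain PHP. The first part of this sketch shows how FILTER_NULL_ON_FAILURE separates a valid false from an invalid value. The second part is a hypothetical stand-in for a hex color check, wpdc_validate_hex_color(); it is written for this example and is not a copy of the core function.

```php
<?php
// FILTER_NULL_ON_FAILURE distinguishes "false" from "not a boolean at all".
$flags = FILTER_NULL_ON_FAILURE;

var_dump( filter_var( 'yes', FILTER_VALIDATE_BOOLEAN, $flags ) );   // bool(true)
var_dump( filter_var( 'off', FILTER_VALIDATE_BOOLEAN, $flags ) );   // bool(false)
var_dump( filter_var( 'maybe', FILTER_VALIDATE_BOOLEAN, $flags ) ); // NULL

// Hypothetical stand-in for a hex color check.
// Accepts #rgb and #rrggbb; returns null for anything else.
function wpdc_validate_hex_color( $color ) {
    if ( is_string( $color ) && 1 === preg_match( '/^#(?:[A-Fa-f0-9]{3}){1,2}\z/', $color ) ) {
        return $color;
    }

    return null;
}

var_dump( wpdc_validate_hex_color( '#ff0000' ) ); // string(7) "#ff0000"
var_dump( wpdc_validate_hex_color( 'red' ) );     // NULL
```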

PHP validation functions

PHP offers a number of validation functions. As we have seen previously, using them can be a bit tricky. So make sure to read the documentation carefully, including the notes.

  • is_bool(): Returns true if the passed variable is of the type boolean.
  • is_float(): Returns true if the passed variable is of the type float.
  • is_int(): Returns true if the passed variable is of the type integer.
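  • is_numeric(): Returns true if the passed variable is a number or a numeric string. Keep in mind that numeric strings may carry a sign, a decimal part, and an exponent, so '-1.5e3' is valid. As of PHP 7, hexadecimal strings such as '0x1A' are no longer considered numeric, and binary strings such as '0b11' never were.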
  • strtotime(): Not a validation function strictly speaking, but can be used as such to validate dates. The function returns false if the passed data cannot be converted into a timestamp.
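A quick way to get a feel for these functions is to run them against string input, since that is what form data looks like. The values below are arbitrary examples picked for this sketch.

```php
<?php
// Form input arrives as strings, so the is_*() type checks fail on it.
var_dump( is_int( '42' ) );         // bool(false)
var_dump( is_numeric( '42' ) );     // bool(true)
var_dump( is_numeric( '-1.5e3' ) ); // bool(true)
var_dump( is_numeric( 'abc' ) );    // bool(false)

// strtotime() returns false when it cannot parse the string,
// and an integer timestamp when it can.
var_dump( strtotime( 'not a date' ) );           // bool(false)
var_dump( is_int( strtotime( '2024-01-15' ) ) ); // bool(true)
```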

Next we have a family of functions that have been specifically designed to validate data.

  • filter_input(): Retrieves an external variable (from $_GET, $_POST, $_SERVER,…) and applies the specified filter.
  • filter_input_array(): Works the same as filter_input(), but allows multiple values to be retrieved with one call.
  • filter_var(): Filters the variable passed as an argument.

When using these functions, you need to indicate a filter to use. The validation filters can be combined with flags to achieve a specific behavior. Some filters also accept additional options.

It’s the combination of the right filter, with the right flags, and the right options that makes these functions do their work correctly.
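As an illustration of how filter, flags, and options interact, here is a short sketch using filter_var(). The color channel scenario and the $channel_args name are made up for this example.

```php
<?php
// Filter + flag + options: accept decimal or hexadecimal notation, 0 to 255.
$channel_args = [
    'flags'   => FILTER_FLAG_ALLOW_HEX,
    'options' => [
        'min_range' => 0,
        'max_range' => 255,
    ],
];

var_dump( filter_var( '0xff', FILTER_VALIDATE_INT, $channel_args ) );  // int(255)
var_dump( filter_var( '0x1ff', FILTER_VALIDATE_INT, $channel_args ) ); // bool(false): above max_range
var_dump( filter_var( '0xff', FILTER_VALIDATE_INT ) );                 // bool(false): hex not allowed

// Filter + flag: only IPv4 addresses pass.
var_dump( filter_var( '192.168.0.1', FILTER_VALIDATE_IP, FILTER_FLAG_IPV4 ) ); // string(11) "192.168.0.1"
var_dump( filter_var( '::1', FILTER_VALIDATE_IP, FILTER_FLAG_IPV4 ) );         // bool(false)
```

The same input, '0xff', is valid or invalid depending on the flag, which is why the combination matters more than the filter alone.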

Conclusion

After reading this, you should have a solid grasp of how validating data works, and how you can use it to make your WordPress code more secure.

If anything is unclear, or if you have a suggestion, please don’t hesitate to reach out to me.
