Introduction
In my last post, I facetiously said that the problem with databases is that they do what we tell them to. Then I introduced you to SQL Injection and demonstrated how to protect yourself against it if you’re running a Java application (for example, under Apache Tomcat 8.0.39). That technique relied on JDBC and parameterized SQL queries, which means we rely on a mechanism between Java and the database engine itself to help protect against SQL Injection. Java has a really robust database abstraction layer called JDBC. Unfortunately, PHP doesn’t have an equivalent, though a technology called PHP Data Objects (PDO) makes the attempt. It’s nowhere near as mature or robust, but for some databases (like MySQL), it’s great. An article on the site Grow the Future gives a concise picture of what databases PHP supports.
Since PDO’s MySQL support is so strong, I’ll show you how to use PDO to implement parameterized queries on a Linux, Apache, MySQL, PHP (LAMP) installation — and thereby add a layer of protection against SQL Injection.
What’s SQL Injection?
This section contains the same information as “What’s SQL Injection” in my last post. I’m reproducing it here in case you weren’t interested in this topic for Java. But, if you read my last post, you can skip this section!
The Open Web Application Security Project (OWASP) maintains a list of the most pervasive/popular vulnerabilities that afflict applications. Sitting at the very top in the number 1 position is Injection. There are different kinds of injection, and this post concentrates on SQL Injection. We’ll talk about other types later.
What’s SQL Injection?
Let’s say we have a database, and in that database, we have a table called asfg_pi_input02_works. That table stores information about DVDs and Blu Ray disks. If I wanted to list all of the contents in that table, I could use a query like:
SELECT * FROM asfg_pi_input02_works
The title is in a column named api02_name_of_work. If I wanted to show the table row or rows whose title is “Attack on Titan’ (yeah, I like Anime!), I could use a query like:
SELECT * FROM asfg_pi_input02_works WHERE api02_name_of_work = 'Attack on Titan'
So far, so good, right? No nefarious code here! But let’s say that we put a form on the web that prompts a user to enter the title. The web form has a field called search_title. In my code, let’s further say that I take the user’s input and build the query using string concatenation, like this:
$strTitle = $_POST['search_title']; $strSQL = "SELECT * FROM asfg_pi_input02_works WHERE api02_name_of_work = '" + strTitle + "' ";
If I enter just “Attack on Titan”, the program will build a query that looks just like the previous query:
SELECT * FROM asfg_pi_input02_works WHERE api02_name_of_work = 'Attack on Titan'
But, what if I’m evil? What if I want to list everything? In other words, what if I enter this string?
Attack on Titan' OR 1 = 1
This is what the query will look like when my innocent and unprotected program uses simple string concatenation to build it:
SELECT * FROM asfg_pi_input02_works WHERE api02_name_of_work = 'Attack on Titan' OR 1 = 1
This query, instead of showing just “Attack on Titan”, will show all of the rows in the table. Why? Because, being evil, I intentionally fooled the program by closing the string with a single quote, then adding “OR 1 = 1”. Since 1 always equals 1, that condition is always true. In effect, I’ve told the query engine to show me all of the rows.
This is a table containing entertainment records, so who cares? You should. Because I didn’t have to stop at just “OR 1 = 1”. I would have done something more nefarious, like:
Attack on Titan' ; SHOW TABLES;
Maybe the output of “SHOW TABLES” wouldn’t necessarily get to the browser’s display; or maybe you’re using reflection to automatically interpret the results and write them to the screen?
Or what if I start experimenting with Data Description Language (DDL)? Like creating or dropping tables?
Or, what if I attack your login screen with “OR 1 = 1”. If you’re not protecting yourself, that attack would let anyone log in without having an ID.
That could be a huge problem.
Fortunately, protecting your PHP application from SQL Injection is really easy.
Parameterized Queries Save the Day!
The problem goes back to databases doing whatever we tell them to do. So, we have to come up with a safe way to ask them for data. PHP provides the perfect mechanism: Parameterized Queries within PDO. The example I’m showing was built on PHP 5.4.16, and it should work on any (currently known) newer version as well.
The examples below use these variables:
$strDatabaseHost = "localhost"; $strDatabaseName = "asfg"; $strDatabaseID = "asfgreader"; $strDatabasePW = "password"; $strDatabaseType = "mysql";
Let’s say we have a web page named input_02.php, which includes an HTML form. That form prompts a user for two things:
- The title of a Blu Ray disk (or a partial title); the name of the HTML field is search_title
- The genre of a title, which is fed by a list of possible genres from a table; the field name is search_genre
We post that form to another PHP page called input_02_results.php. That page stores the two HTML fields to variables:
// TODO: Scrub the data; or use a white list to protect it! $strGenre = $_POST['search_genre']; $strTitle = $_POST['search_title'];
This is bad implementation; we should do something like employ a white list to make sure the data’s safe. But for this illustration, we’ll rely on PDO and parameterized queries to protect is. To build a query that will show us a list of possible Blu Ray matches, we’ll use parameterized queries.
Here’s a general overview of how the two pages relate:
From the perspective of code, here’s how we get started:
$arrReturn = array(); if ($strGenre == "") { $strGenre = "0"; } $strNameOfWork = "%" . $strNameOfWork . "%";
This function will return the results in an array named $arrReturn. The prompt for genre on the previous form returns the identifier for the genre or a zero if the user made no selection. The code makes sure that $strGenre isn’t blank by converting a blank to a zero. Then, we prepare the title for a LIKE search. Performance isn’t the greatest, but we’re just demonstrating parameterized queries here.
Now we begin by opening a PDO connection within a try block:
try { $conn = new PDO($strDatabaseType . ":host=" . $strDatabaseHost . ";dbname=" . $strDatabaseName, $strDatabaseID, $strDatabasePW); $conn->setAttribute( PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION );
This logic instructs PHP to create a PDO connection against a MySQL database named “asfg” on the host localhost. It passes in the ID and password. After setting some attributes, we then build the query statement:
$strSQL = "SELECT a.*, b.* FROM asfg_pi_input02_works a, asfg_pi_input02_genre b WHERE api02_name_of_work LIKE ? "; if ($strGenre != "0") { $strSQL .= " AND a.api02_genre = ? "; } $strSQL .= " AND b.api02g_identifier = a.api02_genre ORDER BY a.api02_name_of_work "; $stmt = $conn->prepare($strSQL);
Do you see the question marks? These represent place holders — they’re called parameters. That’s where the phrase “parameterized queries” comes from. The “prepare” statement sends the query, question marks and all, to MySQL. MySQL compiles the statement to reference it in the next step. We’ll tell PHP/PDO what to use in place of the parameters in the next lines of code:
stmt->bindParam(1, $strNameOfWork); if ($strGenre != "0") { $stmt->bindParam(2, $strGenre); } $stmt->setFetchMode(PDO::FETCH_ASSOC); $stmt->execute();
The bindParam statement tells the query engine that the contents of the variable $strNameOfWork as the value for the first question mark. We do the same thing (optionally, if the user specified it) for the genre. Then, we actually execute the query. Notice that there’s no opportunity for an attacker to assault our code with string concatenation. The MySQL query engine just crams whatever’s in the variable $strNameOfWork or $strGenre into the query “as is.” If either variable contains something like “‘ OR 1 = 1′”, the search will just return nothing (unless there’s a title like “My Blu Ray Disk ‘ OR 1 = 1′”.
The rest of the code looks like “normal” PHP/MySQL interaction:
while($row = $stmt->fetch()) { $arrRow = array(); $arrRow['api02_identifier'] = $row['api02_identifier']; $arrRow['api02_name_of_work'] = $row['api02_name_of_work']; $arrRow['api02_ordering_url'] = $row['api02_ordering_url']; $arrRow['api02_price'] = $row['api02_price']; $arrRow['api02_date_ordered'] = $row['api02_date_ordered']; $arrRow['api02_date_received'] = $row['api02_date_received']; $arrRow['api02_genre'] = $row['api02_genre']; $arrRow['api02_rating'] = $row['api02_rating']; $arrRow['api02_date_released'] = $row['api02_date_released']; $arrRow['api02_vendor_identifier'] = $row['api02_vendor_identifier']; $arrRow['api02g_identifier'] = $row['api02g_identifier']; $arrRow['api02g_genre'] = $row['api02g_genre']; $arrReturn[] = $arrRow; }
This block of code walks through the results and, for each row, builds an array containing the data. It then adds that array to the “master” array, which gets returned to the calling application. The calling application (in this case, input_02_results.php), walks through the “master” array and displays the search results.
Some Implications
Parameterized Queries are a powerful addition to your security toolbox (they can also boost batch INSERT statements if the column list is static). However, they aren’t the only protections, and you should consider using multiple layers of defense to implement security in depth.
For example, earlier, I mentioned the possibility of using a white list. If possible, you could only allow letters and numbers, which would eliminate most of the raw material an attacker would need. This would make entering a string like “‘ OR 1 = 1” impossible, since it could exclude single quotes.
Also, consider limiting the power of the ID you use to query the database. In the example above, you’ll see that I used the ID “asfgreader.” When I created the database “asfg,” I issued the statement “GRANT SELECT ON asfg.* TO ‘asfgreader’@’localhost’ IDENTIFIED BY ‘password’. That created asfgreader as a read-only ID. Even though I extensively use Parameterized Queries in my applications, there’s a chance that a given version of PHP’s PDO implementation might have a flaw. That flaw might allow a SQL Injection attack. I can limit the damage, especially from DDL commands, by using a low-power ID.
Does this make sense? Do you think it’ll be easy to implement in your applications? Let me know why or why not in the comments!
by Terrance A. Crow
Terrance has been writing professionally since the late 1990s — yes, he’s been writing since the last century! Though he started writing about programming techniques and security for Lotus Notes Domino, he went on to write about Microsoft technologies like SQL Server, ActiveX Data Objects, and C#. He now focuses on application security for professional developers because… Well, you’ve watched the news. You know why!