An Introduction to CWE-89: Improper Neutralisation of Special Elements used in an SQL Command | Lucideus

SQL Injection has been a major source of cyber attacks as its status as top in the list of Top 25 most dangerous software errors provided by the CWE and SANS Institute. The SQL injections are covered in CWE-89 by mitre, which gives its description as follows:

“The software constructs all or part of an SQL command using externally-influenced input from an upstream component, but it does not neutralize or incorrectly neutralizes special elements that could modify the intended SQL command when it is sent to a downstream component.”

This weakness was first discovered by Jeff Forristal(signing as rain.forest.puppy) in 1998[].In his example, Forristal demonstrated how to hijack the following sql command:

                         SELECT * FROM table WHERE x=%%criteria%% AND y=5.

He pointed out how the criteria "1 SELECT * FROM sysobjects --" will change the equation to the following:
 
           SELECT * FROM table WHERE x=1 SELECT * FROM sysobjects -- AND y=5

commenting out the last part of statement executing the SQL query 
SELECT * FROM sysobjects,

this will enable the intruder access to the information to the sysobjects table without authorization.

What Jeff Forristal discovered became and remained the topmost software vulnerability to date. It is now a requirement for anyone who takes user data to make user that the data is properly cleansed before inserting it into an SQL query. Let us discuss the description given by mitre mentioned above. It is still based on Forristal’s approach. First of all, we are talking about the software which constructs all or part of SQL command using externally-influenced input from upstream components. If such is the case the one will have a chance to modify the SQL command in a way that is not intended by the software developer. What makes this chance a requirement is the second clause of above description: ”..., but it does not neutralize or incorrectly neutralize special elements that could modify the intended SQL command when it is sent to downstream component”. This is the part which turns the necessity to sufficiency. If the upstream component is not properly neutralized then the intruders will be able to insert more than what is intended and get what they are looking for in the downstream component. 

Mitre also gives an extended description of SQL injection as follows:
“Without sufficient removal or quoting of SQL syntax in user-controllable inputs, the generated SQL query can cause those inputs to be interpreted as SQL instead of ordinary user data. This can be used to alter query logic to bypass security checks, or to insert additional statements that modify the back-end database, possibly including execution of system commands.

SQL injection has become a common issue with database-driven web sites. The flaw is easily detected, and easily exploited, and as such, any site or software package with even a minimal user base is likely to be subject to an attempted attack of this kind. This flaw depends on the fact that SQL makes no real distinction between the control and data planes.”

SQL injection is indeed a common issue often encountered in the VAPT assessment done by Lucideus for its clients. It should be noted that this type of attacks emphasized the need for input validation and the requirement for the protection of different channels between different services to be vetted before usage. This can be seen from the relationship chart for SQL Injection from the CWE-89 page as follows:

Relationships


This chart also highlights that this weakness can follow another weakness called missing initialization of a variable, and is itself actually a type of improper neutralization of special elements in data query logic. 


Mode of Introduction
Mitre points that the weakness usually occurs in data-rich applications which stores user inputs in the database, and is caused by the implementation of an architectural security tactic. 

This is especially true for those cases where those user inputs within the database can potentially be reused as part of an SQL query. The specified mode of introduction should be a trigger to call in a check for the controls to prevent this weakness. 


Applicable Platforms
For CWE-89, Mitre states that this weakness occurs for any language but where there is a dependency on the database server. Basically, where ever one is querying the SQL server but with an input from the user. This section gives the trigger that is required for the assessment of the presence of this weakness. 


Common Consequence 

The common consequences of this particular weakness area listed as follows:

Scope: Confidentiality and Technical Impact: Read Application Data
SQL injection enables one the access to the restricted part of the database and that generally leads to the access to the confidential information. 

Scope: Access Control and Technical Impact: Bypass Protection Mechanism
Usually, access controls require certain interaction with the SQL database. It is possible through SQL injection to bypass these controls and gain illicit access to controlled areas. 

Scope: Integrity and Technical Impact: Modify Application Data
A successful SQL injection is not just interacting and showing the table but is also capable of modifying them. The integrity of a vulnerable application should not be trusted for integrity. 

Through this list of common consequences, it can be seen that the impact of a breach can result in total compromise of an application susceptible to it. 

Likelihood of ExploitMitre writes in just one word that Likelihood of exploit of this weakness is high. This is because the presence of the weakness is pretty easy to detect and further it is not so difficult to exploit this weakness.

Demonstrative Examples
The extent of the past exploits and the extent of the damage caused illustrates the reason why this particular weakness takes the first place in the Top 25 CWEs. Following are the examples given by Mitre

1. The following example illustrates the automatibility of the exploits and ease with which the consequences becomes widespread. 

“In 2008, a large number of web servers were compromised using the same SQL injection attack string. This single string worked against many different programs. The SQL injection was then used to modify the web sites to serve malicious code.”

2. The second example talks about the following Ci# code:
...
string userName = ctx.getAuthenticatedUserName();
string query = "SELECT * FROM items WHERE owner = '" + userName + "' AND itemname = '" + ItemName.Text + "'";
sda = new SqlDataAdapter(query, conn);
DataTable dt = new DataTable();
sda.Fill(dt);

Mitre illustrates that this code is to run the following sql query:
SELECT * FROM items WHERE owner = AND itemname = ;

However, a simple input of itemname string to 
“name' OR 'a'='a”
 changes the above query to 
SELECT * FROM items WHERE owner = 'wiley' AND itemname = 'name' OR 'a'='a';
which is basically same as 
SELECT * FROM items;

This basically turns the request form into a full access to the data contained in the whole items table. 

3. The content of previous example as indicated by mitre is by no means the whole scope of this weakness. As they provide further example to illustrate the alternative consequences as follows:

If, in the above example, instead of the input
name’ OR ‘a’ = ‘a
one chooses an input
name’; Delete FROM items; --
for itemName, then the query converts to the following two queries:
SELECT * FROM items WHERE owner = 'wiley' AND itemname = 'name';
DELETE FROM items;
--' 

Mitre points that many database serves like MS SQL Server 2000 would execute the multiple sql statements separated by semicolons, though others like Oracle would through errors as they do not process batch queries.
Mitre also suggests that these problems can be handled as input validation problem and recommends either whitelist only suggested values, or blacklisting the potentially malicious values. 

They further recommend whitelisting as an effective means to enforce security, but also  highlights that although parameterized sql commands requires less maintenance and can offer more guarantee with respect to security. Blacklisting is, on the other hand full of loopholes that makes it ineffective at preventing SQL injection attempts. One can look at their discussion on this example for more details on their CWE - 89 page.

4. The fourth example given by Mitre takes an adversary outside the database and give access to shell commands themselves. It is about the MS SQL which has a built in function to execute shell commands. They demonstrate how one can exploit a command like:


SELECT ITEM,PRICE FROM PRODUCT WHERE ITEM_CATEGORY='$user_input' ORDER BY PRICE
to execute the following:
SELECT ITEM,PRICE FROM PRODUCT WHERE ITEM_CATEGORY=''; exec master..xp_cmdshell 'dir' --' ORDER BY PRICE
thereby executing the dir command in shell through MS SQL. 

5. The fifth example given by them shows the execution of sql injection through exploitation of cookies instead of fields. The example on PHP is the following: $id = $_COOKIE["mid"]; mysql_query("SELECT MessageID, Subject FROM messages WHERE MessageID = '$id'"); where, the developers might have missed the input validation with an assumption that the cookies cannot be modified by the attacker.

And a simple replacement of cookies may execute the following:
SELECT MessageID, Subject FROM messages WHERE MessageID = '1432' or '1' = '1'

which would produce all the messages instead of showing only the one indexed 1432.

6. This last example given by them indicates a scenario where the developer is required to take the user name which might include hyphen and apostrophe as legitimate characters. In the following query:

$userKey = getUserID();
$name = getUserInput();

# ensure only letters, hyphens and apostrophe are allowed 
$name = whiteList($name, "^a-zA-z'-$");
$query = "INSERT INTO last_names VALUES('$userKey', '$name')";


attacker may use the hyphens to truncate the command and apostrophes to alter the structure of the whole query. 

Observed examples
The examples that are observed over the years have been placed into the National Vulnerability Database(NVD) with proper enumerations. Some of these examples are given on the CWE-89 page as follows:

CVE-2004-0366
chain: SQL injection in library intended for database authentication allows SQL injection and authentication bypass.

CVE-2008-2790
SQL injection through an ID that was supposed to be numeric.

CVE-2008-2223
SQL injection through an ID that was supposed to be numeric.

CVE-2007-6602
SQL injection via user name.

CVE-2008-5817
SQL injection via user name or password fields.

CVE-2003-0377
SQL injection in security product, using a crafted group name.

CVE-2008-2380
SQL injection in authentication library.

CVE-2017-11508

SQL injection in vulnerability management and reporting tool, using a crafted password.

Note that the earliest example given here dates back to 2004 while the latest one is from 2017. It should be noted further that this is not an exhaustive list. The actual listing has around 5000 entries in this category. This shows that the sql injections are actually observed in the works of experienced developers too. 

Potential Mitigation
SQL injection has been a problem security researchers and developers are trying to resolve from a couple of decades already. There are tools that targets it specifically. Some of the potential mitigations recommended by mitre are also demonstrated as good in the earlier examples also. Let us have a look at them one by one. 

Phase: Architecture and Design:

1. Strategy: Libraries or Frameworks
In this strategy, CWE-89 page indicates that one of the ways to mitigate the sql injection is at the Architecture and Design phase by using the libraries or frameworks which either does not allow the weakness to occur or provide some constructors that makes the weakness easier to avoid. The example that they provide in this document is the persistence layer like Hibernate or Enterprise Java Beans. 

2. Strategy: Parametrisation
They recommend, based on availability, the usage of structured mechanisms that automatically separates the data from the code, and are able to provide the relevant quoting, encoding, and validation automatically. This reduces the reliances on the developer to provide this capability where ever an output is produced.

3. Strategy: Environment Hardening
They give a few suggestions for environment hardening so that the code is executed at the lowest possible privilege required to accomplish the necessary task. Some of these suggestions are as follows:

  • Creation of isolated accounts, if possible, with limited privileges that are only used for a single task. This restricts the access of the attacker over the software and its environment, like a normal application would not require administrative privileges for day to day operation.
  • Adherence to the least privilege criteria when creating user account to an SQL database. It is suggested that the user should not have any privilege that is not necessary for their account.
  • Usage of the strictest permissions possible on all database objects, such as execute-only for stored procedure.


4. Strategy: Repetition of client-side security checks onto the server They recommend that the security checks performed on the client-side should be duplicated on the server so that the attacker will not be able to bypass these security checks on client side and feed in malicious values to the server. 5. Strategy: Enforcement by ConversionThey recommend that the cases where the set of acceptable values are known, for example files or urls, then the maps with a fixed number of indices should be used and the rest are prohibited.
Phase: Implementation

1. Strategy: Output Encoding
To address those cases where the dynamically generated queries, which although risky but becomes a requirement, the suggestion given by Mitre is to ensuring that the arguments are quoted and the special characters are escaped. They also point out that the most conservative one can go is to escape or filter all those character that do not pass an extremely strict whitelist. Also, if the special characters are required then one can wrap each argument in quote after an escaping/filtering step. They also points out that these measures might already be available in the database or the programming language. The examples they give are Oracle’s DBMS_ASSERT package and mysql’s mysql_escape_string() api function. It should be noted here that each of the sql query is a channel between the software and the database. If that channel is not properly restricted then it can be taken advantage of and hacked into by adversaries. Thus ensuring output encoding is essentially ensuring that the said channel may not be used in any other way than its intended purpose. 2. Strategy: Input ValidationThe recommendation of Mitre is the assumption that all input is malicious. The usage of “accept known good” or whitelisting, i.e. use of whitelist values which conforms to the strictest of the specification while rejecting all other values is seen as the best solution when available. They suggest that the input validation should have proper consideration to the potentially relevant properties such as “..length, type of input, the full range of acceptable values, missing or extra inputs, syntax, consistency across related fields, and conformance to business rules.”
They take a different stand on blacklisting because of its potential flaws, by saying “do not rely on a blacklist”. The argument given states that any change of environment will likely introduce at least one undesirable input, and this can give an attacker enough room to bypass intended validation. They point that the benefit of blacklist is seen in detection of potential attacks and in that the defence against the inputs that are so malformed that they should directly be stopped by blacklisting. Their recommendation for construction of sql queries other than out validation is the usage of the stringent whitelists that limit the character set based on the expected value of the parameter in request.
Their stance for input validation is still lower than output encoding, escaping, and quoting as the latter is seen as a better practice for preventing sql injection while former is seen to have some flaws like in the example when it is required to introduce some special characters in validation like a single quote in names like O’Really but which can provide attackers opportunity to exploit the sql commands. In their words, “proper output encoding, escaping, and quoting is the most effective solution for preventing SQL injection” Lastly, they point that, if feasible, it might be safest to disallow meta-characters entirely instead of escaping them, because once the data is entered into the database the meta-characters might not be escaped in later processes. 3. Strategy: Properly handle error messagesError messages are recommended by them to have the minimal amount of information. They suggest that the error messages are to strike a balance between being being too cryptic and not being cryptic enough. They should not reveal the process used to generate the error message. This can potentially avoid exposure to the weaknesses in the process or sql queries.
Essentially in context of sql injection they recommend that the error messages or logs should not provide any indication to the structure of the sql query thereby avoiding attackers to take comfort in tailoring successful attacks. 4. Strategy: Environment HardeningWhen using PHP, Mitre warns against the usage of register_globals. They recommend the the applications should be developed without relying on this features. They also suggest to be wary of the implementing an register_globals emulation that are subject to weaknesses like CWE-95, CWE-621, and similar issues.

Phase: Operations 

1. Strategy: Environment Hardening
This is as discussed above, the environment should be properly configured to only give the lowest level of privileges to the users and to avoid the usage of the features that are too malicious to use. 

2. Strategy: Application Firewall

They recommend application firewall for the environment where the application may not be controlled for sql injection like in the case of the third party applications. It should be noted that they show this as an emergency remediation with moderate effectiveness. 

Detection Methods
The detection methods mentioned by Mitre for SQL injection ranges to a number of recommendation ranging from Manual Detection for source code to Dynamic detection when does not have the access to it. Let us list down the recommendation given by them one by one:

Static Analysis

1. Usage of third party tools
They point that this weakness can be detected using the modern tools that use the data flow analysis or constraint based techniques to minimize the number of false positive in the determination of the sql injection. But they might not be able to recognize where the proper input validation is performed which leads to false positives again. Another problem is that they might not be able to scan the APIs which might be using the sql queries indirectly and hence leading to false negatives.

2. Manual Source Code Analysis
When the source code is available the first recommendation given by Mitre and SOAR is to do manual code review which is highly cost effective and is overall effective in determining the sql injection. Another approach is to go through focused manual spot-checks which is cost effective for partial coverage and is also highly effective.

3. Automated Source Code Analysis
They recommend the usage of the source code weakness analyzer or context configured source code weakness analyzer which as they point out are high cost effective solutions. They alose point that this kind of analysis is highly effective. 

4. Automated Binary or Bytecode Analysis

They point out that the binary/bytecode analysis with decompilation and further source code analysis can be highly effective in resolution as well as cost, and thus a recommended detection mechanism for sql injection.

Dynamic Analysis

1. Automated Dynamic Analysis
They recommend the tools that can detect the sql injection through the dynamic tools and techniques that interact with software using large test suites with many diverse inputs, such as fuzzy testing(fuzzing), robustness testing, and fault injection. They points that the software’s operations might slow down, but it should become unstable, crash, or generate incorrect results. They rate this mechanism as the one with moderate effectiveness in detection. 

2. With Manual Result Interpretation
They recommend Fuzz tester and Framework-based fuzzers for the manual result interpretation. This is suggested as something that will good at partial effectiveness suggested by SOAR.

3. With Automated Result Interpretation
The solution that is recommended at the highest level of cost effectiveness is the Database Scanners which will work with a high level of effectiveness. For other instances where it is acceptable to get just a rough idea would be Web Application Scanner or Web Service Scanner which can be used cost effectively at partial coverage. 


It is also mentioned that these techniques can produce results at high effectiveness.

Architecture and Design Review

Formal Methods / Correct-by-Construction
They recommend approaches which either look for formal methods or those methods which are themselves correct-by-construction for the detection of sql injections. The effectiveness of this type of approach is high according to them, in part due to the effectiveness of the third party manually doing the reviews based on these standards. 
Inspection


They recommend that the inspection done according to the IEEE 1028 standards done for requirements, design, source code, etc. can be cost effective for partial coverage, and can produce results which have high effectiveness.