Using Multi-Byte Characters to Nullify SQL Injection Sanitizing

Datetime: 2016-08-23 00:14:39         Topic: SQL Injection, SQL

Using multiple character sets, and multi-byte character sets in particular, can expose web applications to a number of hazards. This article examines the normal method of sanitizing strings in SQL statements, research into multi-byte character sets, and the hazards they can introduce.

SQL Injection and Sanitizing

Web applications commonly sanitize the apostrophe ( ' ) character in user-supplied strings passed to SQL statements by prefixing it with an escape ( \ ) character. The hex code for the escape character is 0x5c. When an attacker puts an apostrophe into a user input, the ' is turned into \' during the sanitizing process. The DBMS does not treat \' as a string delimiter, so the attacker (in normal circumstances) is prevented from terminating the string and injecting malicious SQL into the statement.
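The escaping step described above can be sketched in a few lines of Python. This is an illustration only, not the code of any particular framework: the function names, the table name and the column name are all hypothetical. The sketch also escapes the backslash itself, which a real sanitizer has to do so that an attacker cannot neutralize the added escape.

```python
def escape_for_sql(user_input: str) -> str:
    """Backslash-escape characters that could end a quoted SQL string.

    Hypothetical helper for illustration; the backslash is escaped first
    so that the escapes added for apostrophes are not doubled.
    """
    return user_input.replace("\\", "\\\\").replace("'", "\\'")


def build_query(name: str) -> str:
    # The table and column names are made up for this example.
    return "SELECT * FROM users WHERE name = '" + escape_for_sql(name) + "'"


if __name__ == "__main__":
    # Every apostrophe in the input arrives at the DBMS as \' and so
    # cannot terminate the string literal.
    print(build_query("x' OR '1'='1"))
```

Under normal circumstances the input stays inside the quoted string, which is the behavior the rest of the article sets out to defeat.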

If a multi-byte character supported by the server ends in the byte 0x5c, an attacker can insert the leading byte (the prefix) of that character immediately before the apostrophe. The sanitizer then adds the escape byte 0x5c, which combines with the prefix to form a different character altogether, so the single quote is left unescaped and terminates the string. While this idea isn't new, research online that includes a complete list of the affected character sets and characters is hard to find. This article attempts to put the research and tools in one place.
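The effect can be reproduced at the byte level. The sketch below assumes a sanitizer that works on raw bytes without knowing the character set, and a DBMS that interprets those bytes as GBK; both are assumptions chosen for illustration, and the helper names are hypothetical. The byte 0xbf is used as the prefix because 0xbf 0x5c is a valid two-byte GBK character.

```python
def escape_bytes(raw: bytes) -> bytes:
    """Charset-unaware escaping: operates on single bytes only (hypothetical)."""
    return raw.replace(b"\\", b"\\\\").replace(b"'", b"\\'")


def as_seen_by_dbms(raw: bytes, charset: str = "gbk") -> str:
    """Decode the sanitized bytes the way a DBMS using `charset` would."""
    return escape_bytes(raw).decode(charset)


if __name__ == "__main__":
    payload = b"\xbf' OR 1=1 -- "
    # After escaping, the bytes start with 0xbf 0x5c 0x27.
    print(escape_bytes(payload)[:3].hex())
    # In GBK, 0xbf 0x5c is one character, so the apostrophe is bare again.
    print(as_seen_by_dbms(payload))
```

The sanitizer did add its escape byte, but the DBMS consumed it as the second half of a multi-byte character, leaving the apostrophe free to close the string.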

Researching Multi-byte Character Sets

A small Python script was devised to determine which character sets contain multi-byte characters ending in 0x5c. The script iterates over all installed character sets and inspects the hexadecimal byte values of each character. A list of the character sets found to contain valid multi-byte characters ending in 0x5c is provided in Figure A. Additionally, a video of the script running, showing what the output should look like, is provided in Figure B.

Figure A: Character sets containing valid multi-byte characters ending in 0x5c
Used in Taiwan, Hong Kong, and Macau for "Traditional Chinese"
Hong Kong's Big5 Supplementary Character Set
Windows-31J (Japanese)
Microsoft's implementation of Big5
Chinese National Character Set
Simplified Chinese
Korean Legacy Encoding
Shift Japanese Industrial Standards
Figure B: Multi-byte Inspection Script Video
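The original script is not reproduced in this copy of the article. A minimal sketch of the same idea follows; it is not the author's code. It assumes that Python's codec alias table (encodings.aliases) is an acceptable stand-in for "all installed character sets", probes only two-byte sequences, and skips codecs in which a lone apostrophe byte does not decode to an apostrophe (for example UTF-16), since byte-level escaping would not apply to them in the same way. All function names are hypothetical.

```python
import encodings.aliases

BACKSLASH = 0x5C


def is_ascii_compatible(codec: str) -> bool:
    """True if a lone apostrophe byte decodes to an apostrophe in this codec."""
    try:
        return b"'".decode(codec) == "'"
    except (LookupError, ValueError):
        # LookupError: codec unavailable or not a text encoding.
        # ValueError covers UnicodeError and UnicodeDecodeError.
        return False


def sequences_ending_in_5c(codec: str) -> list:
    """Return two-byte sequences ending in 0x5c that decode to ONE character."""
    hits = []
    for lead in range(0x80, 0x100):
        raw = bytes([lead, BACKSLASH])
        try:
            text = raw.decode(codec)  # strict decoding rejects invalid pairs
        except (LookupError, ValueError):
            continue
        if len(text) == 1:  # 0x5c was absorbed as a trail byte
            hits.append(raw)
    return hits


def scan_codecs() -> dict:
    """Map each affected codec name to its list of matching byte sequences."""
    results = {}
    for codec in sorted(set(encodings.aliases.aliases.values())):
        if not is_ascii_compatible(codec):
            continue
        hits = sequences_ending_in_5c(codec)
        if hits:
            results[codec] = hits
    return results


if __name__ == "__main__":
    for codec, hits in scan_codecs().items():
        print(f"{codec}: {len(hits)} two-byte sequences ending in 0x5c")
```

The exact set of codecs reported depends on the Python version and platform, so the output may not match Figure A entry for entry.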


In conclusion, there are hundreds of multi-byte characters that could potentially allow attackers to perform SQL injection despite sanitizing. It is interesting to note that the affected character sets are each intended for use in a specific region of the world. Because this vulnerability only occurs when multiple (and different) character sets are in use, it can be fixed by forcing both the web server and the SQL server to use the same character set. Those looking to do so may find this research useful.
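One way to picture "the same character set on both sides" is a sanitizer that decodes the input with the connection's character set before escaping, so it sees the same characters the DBMS will see. The sketch below is an illustration under that assumption, not a recommendation of a specific API; the function names are hypothetical and GBK is again used as the example character set.

```python
def escape_text(text: str) -> str:
    """Escape at the character level, after decoding (hypothetical helper)."""
    return text.replace("\\", "\\\\").replace("'", "\\'")


def sanitize_for_connection(raw: bytes, charset: str) -> bytes:
    """Decode with the connection charset, escape, then re-encode.

    Strict decoding raises UnicodeDecodeError on malformed sequences,
    such as a GBK lead byte followed directly by an apostrophe.
    """
    text = raw.decode(charset)
    return escape_text(text).encode(charset)


if __name__ == "__main__":
    # A legitimate GBK character ending in 0x5c, followed by an apostrophe:
    # the character is kept whole and the apostrophe is still escaped.
    print(sanitize_for_connection(b"\xbf\x5c'", "gbk").hex())

    # The attack payload is not valid GBK, so it is rejected outright.
    try:
        sanitize_for_connection(b"\xbf' OR 1=1 -- ", "gbk")
    except UnicodeDecodeError as exc:
        print("rejected:", exc.reason)
```

Because the sanitizer and the DBMS now agree on where each character begins and ends, the escape byte can no longer be absorbed into a neighboring character.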

