Despite the many strategies that are being promoted for optimizing MySQL 5 databases there is one basic strategy that I believe to be important, optimization with proper use of data types. Granted that other strategies are great, but they essentially rely on a database with tables designed in an intelligent fashion.
The two primary categories which I am going to focus on are Numeric Data Types and Character String Data Types for these are used most often and are usually the ones which need optimizing the most.
ABC's of Data Types
First, lets get this out of the way. When designing your table's you should keep the ABC's of data types in mind.
A –
Apt – use a data type that is most fitting for the data that will be stored
B –
Brief – try to use the smallest fitting size for your data as it will save storage space and improve performance
C –
Complete – use a type that is most accommodating for the size of the data being stored (keep in mind future data sizes)
This simple guideline will in fact guide you in the right direction when creating your table and its related fields (columns).
Numeric Data Types
When dealing with numeric data types, there are many options in which you can utilize. Again keeping in mind the ABC's of data types, you must ask yourself, “What am I storing?”. Depending on your requirements, you can use one of the following data types:
Integer Types
Numeric data that does not contain fractions (ie. 1.005)
| Type | Bytes | Minimum Value | Maximum Value |
|---|
| | | (Signed/Unsigned) | (Signed/Unsigned) |
| TINYINT | 1 | -128 | 127 |
| | | 0 | 255 |
| SMALLINT | 2 | -32768 | 32767 |
| | | 0 | 65535 |
| MEDIUMINT | 3 | -8388608 | 8388607 |
| | | 0 | 16777215 |
| INT | 4 | -2147483648 | 2147483647 |
| | | 0 | 4294967295 |
| BIGINT | 8 | -9223372036854775808 | 9223372036854775807 |
| | | 0 | 18446744073709551615 |
Optimize: Following the ABC's you want to select the data type that is most fitting for your min an max data size. Simple as that!
Floating Point Types (FPT)
Numeric data that is used to represent approximate (not accurate) value numbers. Don't think about using this for any kind of monetary system such as an accounting or banking system.
Although not accurate, FPT's are very efficient for storage and computation as they are stored in the native binary floating-point format (IEEE 754) used also by the server's CPU. But again, values are subject to rounding error!
| Type | Storage Required | Signed Range | Unsigned Range |
|---|
| FLOAT | 4 bytes | -3.402823466E+38 to -1.175494351E-38 | 0, and 1.175494351E-38 to 3.402823466E+38 |
| DOUBLE | 8 bytes | -1.7976931348623157E+308 to -2.2250738585072014E-308 | 0, and 2.2250738585072014E-308 to 1.7976931348623157E+308 |
Optimize: Following the ABC's you want to select the data type that is most fitting for your min and max data size. If you are using non fractional numbers with out any calculations, resort to Integer Data Types as they will potentially free up additional byes per entry.
Fixed-Point Types (FXPT)
Numeric data that is used to represent exact-value numbers that have a integer part, a fractional part, or both. This type, unlike the floating point type is not subject to rounding error as they are not stored using the processor's native binary format, hence, not calculated as quickly.
Optimize: Use only if you are in fact using some fractional numbers. In the event that you are not you may want to resort to the most suitable Integer Data Type. These may prove to be smaller in storage size freeing up additional space. Also, if you are storing smaller numbers you could also take advantage of Floating Points and utilize their storage format for efficient calculations, but the smallest storage size will be that of 4 bytes, possibly 8 for Doubles.
Character String Data Types
Although the Numeric Data Types are a bit more straight forward, Text Types are a bit more involved as far as how they are stored and handled by the server. Being that most these types are used quite often, there is usually a great amount for room for enhancing performance with these fields.
CHAR
This data type is a fixed-length string that is padded with spaces to the right of the specified length of the field (ie. CHAR(5) will store “Dan” as “Dan ”). When defining a field as CHAR(m) m specifies the max number of characters accepted in addition to the number of bytes (8 bits in a byte) stored regardless of number of characters stored. (ie From the previous example, “Dan” will take up 5 bytes). The max for CHAR(m) is m = 255.
VARCHAR
This data type is a variable-length string that is not padded with spaces to the specified length, but does include an additional byte to account for the length of the stored text. This length byte is read by the server when querying against the table. Thereby, taking up some extra processing time.
Optimize: First, refer to the ABC's to decide what your field requirements are. Second, lets look at the following scenarios to help in you decide which approach is best in optimizing your fields.
- You know that your data will always be a fixed length. Use CHAR and not VARCHAR! VARCHAR will store an additional byte for the length of the text. You don't need this extra byte of information as the server already knows the fixed length of the column. In addition, the server does not have to read each row to figure out the length of the text like it does for VARCHAR.
- If you are storing data that can vary in length but you know the max length, then use VARCHAR and not CHAR. CHAR(m) will pad the text with trailing spaces to accommodate the specified length of the field, always taking up m bytes. VARCHAR on the other hand will always append an additional byte but it will not pad the text with any trailing spaces, keeping the storage down to a minimum.
| Type | Storage Required | Maximum Length |
|---|
| CHAR (m) | m characters | 255 characters |
| VARCHAR (m) | L characters plus 1 byte | 65,535 characters (subject to limitations) |
| TINYTEXT | L characters + 1 byte | 255 characters |
| TEXT | L characters + 2 bytes | 65,535 characters |
| MEDIUMTEXT | L characters + 3 bytes | 16,777,215 characters |
| LONGTEXT | L characters + 4 bytes | 4,294,967,295 characters |
TEXT
This data type and its variations are closely similar to the previously mentioned with one exception, they all include informational bytes ranging from 1 – 4 in addition to the length of the text itself.
Optimize: Taking into consideration the ABC's begin reviewing your options in terms of suitable data types. Then determine if you can use CHAR or VARCHAR for any of your data fields as they may only use 1 or 0 additional bytes for storage. If not, then use the most fitting data type for your requirements.
Comments
Mon, 21.04.2008 13:43
Hi Vikram-- Have you set up your repository in Subversive and successfully connected?
Mon, 21.04.2008 12:56
On checkout as.. dialog you asked to choose "Check out as a project configured using the New Project Wizard." That [...]
Fri, 11.04.2008 06:06
Excellent post, very detailed and I especially like the images. It helped me a lot with my set up.
Thu, 03.04.2008 01:55
How can they say that they're "not responsible" for "death, personal injury, or severe property or environmental [...]
Tue, 01.04.2008 23:29
Hi Marcel-- Make sure you follow the pre-step I have above and ensure you've installed one of the two Subversion [...]
Tue, 01.04.2008 20:21
Please help when I try to import, I don't have Other/Checkout Projects from SVN. Where is my problem?
Thu, 21.02.2008 05:58
Great, this was useful.
Wed, 13.02.2008 09:26
Thanks , I was trying hard & hard with http://www.eclipse.org/pdt/ins tall.php but success came here... thanks for the [...]
Wed, 06.02.2008 15:39
I had to find packages for e Redhat EL3 server, the following worked for me: Catdoc [...]
Wed, 30.01.2008 14:38
Yeah, it does. Just remember that a function is specific to the database in which it's created. If you're trying to call [...]
Wed, 30.01.2008 14:17
Awesome - I didn't know you could calculate and transform fields using functions that way. Does it work with custom [...]
Tue, 22.01.2008 20:49
Hehe... DNS hosting services I toyed with including in the tutorial but it's really a bit too much for the one-man (or [...]
Sat, 19.01.2008 18:51
Awesome, thanks very much for the guide!
Wed, 16.01.2008 14:21
I'd have to say that I like #2, its much more of a natural web experience to me. We don't always have to emulate [...]
Wed, 02.01.2008 16:10
It's fashionable this time of year to create some sort of year-end "best of" or "top ten" list [...]