If you ever need to import a selected number of columns from a data file in MySQL then this how-to is for you!
Recently I had to import a data file (tab delimiter) containing geo-coding data for zip codes into a table that had less columns than the data file and the columns were not in the same order as my database table. Using MySQL's
LOAD DATA INFILE 'path_to_file' syntax and
@User_Variables I was easily able to get the data that I needed into the table.
Data file colums:
The data format is tab-delimited text in utf8 encoding, with the following fields :
country code : iso country code, 2 characters
postal code : varchar(10)
place name : varchar(180)
admin name1 : 1. order subdivision (state) varchar(100)
admin code1 : 1. order subdivision (state) varchar(20)
admin name2 : 2. order subdivision (county/province) varchar(100)
admin code2 : 2. order subdivision (county/province) varchar(20)
admin name3 : 3. order subdivision (community) varchar(100)
latitude : estimated latitude (wgs84)
longitude : estimated longitude (wgs84)
accuracy : accuracy of lat/lng from 1=estimated to 6=centroid
Database table columns:
zip
city
state
latitude
longitude
timezone
dst
country
When doing an import from a data file there are a few things that you need to take note of and adjust your import script and/or table definition accordingly.
- How your columns/fields are terminated.
- How your data (column values) are enclosed.
- How your rows are terminated.
- How large of a entry will your values be and does the column definition of your database table expect the maximum size. If not, your data will be truncated to the maximum allowed length.
Once your table structure has been adjusted to accommodate the data expected to be imported you are ready to write your script.
Rules on importing in MySQL 5:
- The order of the columns are dictated by the data file, not the table.
- You have to account for every column within the data file rows.
So the fields that I want to capture are mapped to their respective table columns:
datafile columns--------------table columns
country code-------------------country
postal code----------------------zip
place name----------------------city
admin name1-------------------
admin code1--------------------
admin name2-------------------
admin code2--------------------
admin name3-------------------
latitude---------------------------latitude
longitude-------------------------longitude
accuracy-------------------------
In order to produce a query statement that will adhere to the rules mentioned above we need use @User_Variables as fillers for the columns/fields that we are ignoring.
User variables are created by simply using the '@' sign in front of a variable name (ie, @user_variable). The naming convention of the user variable really doesnt matter when being used. But I will use @ignore just for readability sake.
Below is the new mapping with the addition of the user variable, @ignore, which again is being used as a simple filler to get to the columns/fields that we want to import.
datafile columns---------------table columns
country code-------------------country
postal code----------------------zip
place name----------------------city
admin name1-------------------@ignore
admin code1--------------------@ignore
admin name2-------------------@ignore
admin code2--------------------@ignore
admin name3-------------------@ignore
latitude---------------------------latitude
longitude-------------------------longitude
accuracy-------------------------
(not need cause we do not need to get to any columns after this one)
With the completed mapping we can write our script as such.
LOAD DATA INFILE '/path_to_file/datafile.txt'
INTO TABLE zipcode
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
(country,zip,city,@ignore,@ignore,@ignore,@ignore,@ignore,latitude,longitude);
Query Explained
LOAD DATA INFILE '/path_to_file/datafile.txt' ---------Points to the location of the data file.
INTO TABLE zipcode ----------------------------------------------Points the the table that we want to import into.
FIELDS TERMINATED BY '\t' ---------------------------------Specifies how the fields are terminated. In our case, they are terminated by a tab, '\t'.
LINES TERMINATED BY '\n' ----------------------------------Specifies how the lines are terminated. In our case, they are terminated by a new line break, '\n''
(country,zip,city,@ignore,@ignore,@ignore,
@ignore,@ignore,latitude,longitude); --------------------Specifies the table columns that we want to map the incoming data to. Again, the order is dictated by the data file.
For more information on importing data in MySQL 5 visit,
http://dev.mysql.com/doc/refman/5.0/en/load-data.html.