A brief Python / SQLite example
This evening I had a situation where I had an SQLite database that had columns holding dates with different or no time zone. Running a query like so:
SELECT created,modified FROM objects;
… returned this:
… 2008-01-21 20:29:10-08:00 | 2008-01-22 10:58:48-08:00 2008-01-22 12:57:43-08:00 | 2008-01-22 13:20:48-08:00 2008-01-29 16:39:17 | 2008-06-12 06:56:33-07:00 2008-01-30 20:27:32-08:00 | 2008-01-30 20:25:49-08:00 2008-02-02 20:58:34 | 2008-06-09 08:46:41-07:00 2008-02-06 01:53:22-08:00 | 2008-02-06 01:51:37-08:00 2008-03-11 12:28:07 | 2008-06-09 08:27:37-07:00 2008-03-19 10:25:58-07:00 | 2008-04-02 03:56:13-07:00 2008-04-15 10:56:43-07:00 | 2008-05-05 17:06:28-07:00 2008-04-23 10:36:11-07:00 | 2008-05-06 18:28:02-07:00 …
Not a pretty sight. That data really needed cleaned up and, fortunately, I didn’t care about the time zone information. The fastest way to clean it up was to just remove those time zones. Below is my solution. I post it not because it’s great or difficult, nor because this is a common problem, but rather in hopes that someone may find it and learn from it.
This Python script makes a few, simple assumptions:
- You are running Python 2.5 and thus have SQLite built in.
- You are in the same directory as the database.
All it does it connect to the database, grab the rows and columns that I need, remove the time zone using a regular expression substitution, and the save the proper dates back to the database.
import re
import sqlite3
# Match the time zone in the string.
tz = re.compile('-[0-2]\d:00$')
# Create a connection to the database.
conn = sqlite3.connect('test.db')
# Create a cursor object to do the interacting.
c = conn.cursor()
# Grab the columns with the time-zoned dates.
old_rows = c.execute('SELECT created,modified,id FROM objects')
# Create an empty list that will hold the new tuples.
new_rows = []
# Iterate over the result, tearing out the time zone as we go.
for row in old_rows:
created = tz.sub('', row[0])
modified = tz.sub('', row[1])
id = row[2]
new_row = (created, modified, id)# Here is the new tuple ...
new_rows.append(new_row)# ... appended to our list.
# Iterate through the new list of tuples and put them in the database.
for row in new_rows:
c.execute('UPDATE objects SET created=?,modified=? WHERE id=?',
(row[0:3]))
# Commit the changes and close everything.
conn.commit()
c.close()
conn.close()