What’s a date?

Leave a comment

What is a date?  This seems like a silly question.  Indeed, if you are an independent business person in Arizona in the USA, it is a silly question.  A date is, for example, the seventh of September, 2011 (“2011-09-07”).  It describes a period of 24 hours that starts at midnight and ends just before midnight. If you only care about dates in Arizona, you’re in good shape.

But Arizona does not switch between daylight savings time and standard time.  So, each day is 24 hours long.  In much of the rest of the world there’s a day in the spring — the “spring forward” day — that’s 23 hours long. The hour from 2am to 3am is simply missing.  There’s another day in the fall that’s 25 hours long. On that day the hour from 2am to 3am repeats itself.

Even if you’re located in Arizona, you probably correspond with people in other time zones. This makes it even harder to pin down what a day is.  So, we need to make it clear when designing databases and the associated software.

When we design, for example, a date_of_sale field for a data base, we need to qualify that by acknowledging that is is in local time at the place the sale was made. Whether or not we explicitly acknowledge that, it’s still true.

So, when we make a sale on 2011-09-07, we can put that particular into a MySQL column with the datatype of DATE.  Then it is available for all kinds of querying and arithmetic.  For example.

The number of days since the sale:

SELECT DATEDIFF(CURDATE(),date_of_sale) AS number_of_days

Sales made on a particular day:

SELECT *
  FROM table
 WHERE date_of_sale = CAST('2011-09-07' AS DATE)

Sales made within seven days of a particular day:

SELECT *
  FROM table
 WHERE date_of_sale >= CAST('2011-09-07' AS DATE)
   AND date_of_sale <  CAST('2011-09-07' AS DATE) + INTERVAL 8 DAY

Notice the way the range of dates is selected in the last example. We might be tempted to use the SQL BETWEEN operator to choose a range of dates.  While this sometimes works, it is a bad habit to get into: it simply doesn’t work when we work with DATETIME information. A particular day starts at midnight, and ends just before midnight on the next day. The form of query in the example recognizes that truth.

Similarly, if we are asking for sales made on a particular day, it would be better practice  to use this query:

SELECT *
  FROM table
 WHERE date_of_sale >= CAST('2011-09-07' AS DATE)
   AND date_of_sale <  CAST('2011-09-07' AS DATE) + INTERVAL 1 DAY

This query will be sure to pull in the whole day.

If we want to display the date of sale in a human-friendly way, we can use, for example,

SELECT DATE_FORMAT(date_of_sale, '%W, %M %e, %Y') AS visible_date
  FROM table

This will display the date as “Wednesday, September 7, 2011”). There are many ways of formatting dates and times, shown here. They adhere to the locale — the local language — in use.  For example, if you issue this command to set the language to Spanish and the locale to Mexico:

SET lc_time_names = 'es_MX'

and then issue the above SQL statement again, you will get “miércoles, septiembre 7, 2011” which is the correct display for Mexican Spanish.

On the other hand, if you issue this command with a text string containing an incorrectly formatted date, you’ll get a NULL result.

SELECT DATE_FORMAT('07/09/2011', '%W, %M %e, %Y') AS visible_date

The important thing is to make sure your date information is stored in your database in a DATE item.  If it is you have access to all kinds of computations.  If your dates are stored as text strings you don’t.

Processing dates and times in SQL

Leave a comment

I’ve spent too many working days figuring out how to handle dates and times in relational data base management systems. From looking at the questions on stackoverflow.com it’s obvious that a lot of people have the same kinds of questions and confusions I have.

I’m a lazy programmer. I’d much rather do this stuff right than do it over, and I don’t think I’m alone.

A big challenge of this date and time work is the lack of useful standardization for the working programmer.  Oracle handles dates and times one way, MySQL handles them a different way, PostgreSQL another way, Microsoft applications another way, Unix and Linux yet another way, and so forth. (My British schoolteacher would have said, etcetera, etcetera, ad nauseam.)  It has ever been so: the ancient Hebrews and Babylonians had their own calendars just like various software systems do now.   Then, on top of that, we have time zones, the international date line, and leap seconds.

The  conflicting ways of denoting the passage of time offer the software developer a stark choice.

  • Swear fealty to one of the systems for marking dates and time, and forget about the others
  • Go all postmodern and multicultural, and try to make software that will work with more than one system

I prefer the postmodern approach.  But, either approach is hard to do right.  The first step to doing any of this the lazy way is to get the concepts right.

There’s a fine book on this topic by Richard T. Snodgrass called Developing Time-Oriented Database Applications in SQL (San Francisco: Morgan Kaufman, 1999).  It’s out of print, but Prof. Snodgrass has generously made it available online.