N., D. and S. have been nice enough to read this blog. But I know, I know, you don’t understand what I’m going on about! It’s not you. It’s just that I don’t really write these things for a general audience.
Anyhow, to make amends, this is a description of something I’ve been working on this morning, for you (if you’re still interested!) and for me (because I’ll doubtless forget this when I come to writing my dissertation). Welcome to SQL, or Structured Query Language, a language used to access data stored in databases.
The problem
Imagine a flat area of land, measuring 30km by 30km, with a town every 10km. A map of Flatland would look something like the image below (although you should be aware that the pesky Flatlanders use a coordinate system with its origin at the top-left, rather than the bottom-left).

Flatland
Incidents happen in each town in Flatland. It doesn’t matter, really, what the incidents are: maybe an incident is the beer supply running low. (Maybe they speak Czech in Flatland?) Each incident has an identification number (incidentID), a start date and time (start), an end date and time (end), the town in which the incident happened, expressed as a point in space (point), and the severity of the beer shortage (severity).
Flatland’s residents created a database to store beer shortage incidents. It looked like this:

Initial database
Interesting. There are nine beer shortage incidents, in the nine towns of Flatland. Three happen between 12:00 and 13:00, three between 13:00 and 14:00 and three between 14:00 and 15:00. Flatland is very orderly. (Maybe they speak Swiss-German, rather than Czech?)
Note how each row has a unique updateID, which is different from the incidentID. This is because the residents of Flatland made a decision when creating their database to “model” beer shortages in this way. The upshot is that an incident can be recorded over several rows, each row representing an update to the incident.
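For the curious, here is a rough sketch of what such a table might look like, using Python's built-in sqlite3 module so that it can be run anywhere. The column names come from the description above; the table name `incidents`, the times, the points and the severity values are all made up for illustration, and the Flatlanders' real database may well differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("""
    CREATE TABLE incidents (
        updateID   INTEGER PRIMARY KEY,  -- unique for every row
        incidentID INTEGER NOT NULL,     -- may repeat across rows
        "start"    TEXT NOT NULL,        -- quoted: END is an SQL keyword,
        "end"      TEXT NOT NULL,        -- so both names are quoted for safety
        point      TEXT NOT NULL,
        severity   INTEGER NOT NULL
    )
""")

# Nine towns on a 3 x 3 grid, three incidents per hour (invented values).
rows = []
for i in range(9):
    hour = 12 + i // 3
    x, y = 10 * (i % 3), 10 * (i // 3)
    rows.append((i + 1, i + 1, f"{hour}:00", f"{hour + 1}:00", f"({x}, {y})", 1 + i % 3))
conn.executemany("INSERT INTO incidents VALUES (?, ?, ?, ?, ?, ?)", rows)

for row in conn.execute("SELECT * FROM incidents ORDER BY updateID"):
    print(row)
```

At this stage every incident has exactly one row, so updateID and incidentID happen to match. They only diverge once updates start arriving.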
Look! Some fresh incident data has come in…

Updated database
Crikey! There are three new rows in the database: the beer shortages with incidentID 1, 2 and 3 have two rows each. This could cause a problem: which rows best characterise the beer shortages with incidentID 1, 2 and 3? We could “collapse” the duplicated rows into one single row, using the SQL GROUP BY clause:

Badly grouped rows
However, note the subtle error: the beer shortages with incidentID 1, 2 and 3 end at 13:00. This should be 14:00.
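A naive GROUP BY of this kind might look something like the sketch below. It uses Python's sqlite3 module and a cut-down, invented table (the name `incidents` and all the values are assumptions). Note that which row "wins" depends on the database: SQLite accepts the query and keeps the values of one arbitrarily chosen row per group, whereas stricter databases such as PostgreSQL refuse to run it at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE incidents (updateID INTEGER PRIMARY KEY, incidentID INTEGER, "
    '"start" TEXT, "end" TEXT, point TEXT, severity INTEGER)'
)
# Invented data: incident 1 is recorded twice, the second row being an update
# that pushes its end time back from 13:00 to 14:00.
conn.executemany(
    "INSERT INTO incidents VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 1, "12:00", "13:00", "(0, 0)", 2),
        (2, 4, "13:00", "14:00", "(0, 10)", 1),
        (3, 1, "12:00", "14:00", "(0, 0)", 4),
    ],
)

# Naive collapse: one row per incidentID. The other columns are not
# aggregated, so the database just keeps the values from a single row.
naive = conn.execute(
    'SELECT incidentID, "start", "end", point, severity '
    "FROM incidents GROUP BY incidentID ORDER BY incidentID"
).fetchall()
for row in naive:
    print(row)
```

Either way, the query never looks at both rows of incident 1 together, so it cannot be relied upon to report the later end time.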
What’s happened? Well, when two or more rows with the same incidentID are grouped, the values in the first row are retained whilst the values in the second (or third, fourth, fifth etc.) row are lost. The Flatlanders require some more sophisticated SQL which makes an intelligent assessment about which values to retain and which values to lose. In this case, it would make sense to retain the earliest recorded start, the latest recorded end, and the maximum severity value. (Flatlanders are worst-case-scenario thinkers: an example of the Chicken Licken effect.)

Grouped rows
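One way to express "earliest start, latest end, maximum severity" is with the aggregate functions MIN and MAX. The sketch below is a minimal illustration using Python's sqlite3 module; the table name `incidents` and the data are invented, and the times are stored as zero-padded text so that they sort correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE incidents (updateID INTEGER PRIMARY KEY, incidentID INTEGER, "
    '"start" TEXT, "end" TEXT, point TEXT, severity INTEGER)'
)
# Invented data: incident 1 has an original row and one update.
conn.executemany(
    "INSERT INTO incidents VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 1, "12:00", "13:00", "(0, 0)", 2),
        (2, 4, "13:00", "14:00", "(0, 10)", 1),
        (3, 1, "12:00", "14:00", "(0, 0)", 4),
    ],
)

# Tell the database explicitly which value to keep for each column:
# earliest start, latest end, highest severity. The point is assumed not to
# change between updates, so it is simply added to the GROUP BY.
grouped = conn.execute(
    'SELECT incidentID, MIN("start") AS "start", MAX("end") AS "end", '
    "point, MAX(severity) AS severity "
    "FROM incidents GROUP BY incidentID, point ORDER BY incidentID"
).fetchall()
for row in grouped:
    print(row)
# (1, '12:00', '14:00', '(0, 0)', 4)
# (4, '13:00', '14:00', '(0, 10)', 1)
```

Because every selected column is now either grouped or aggregated, the result no longer depends on which row the database happens to read first.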
So, there we have it. A slightly better characterisation of a beer shortage incident in Flatland. There are still some issues to consider (is it right to use the maximum incident severity? would a mean (average) be more appropriate?) but they can wait.
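To get a feel for the difference between the two options, the short sketch below computes both the maximum and the mean severity per incident. Again it uses Python's sqlite3 module, with a made-up `incidents` table reduced to the columns that matter here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE incidents "
    "(updateID INTEGER PRIMARY KEY, incidentID INTEGER, severity INTEGER)"
)
# Invented data: incident 1 was recorded with severity 2, then updated to 4.
conn.executemany(
    "INSERT INTO incidents VALUES (?, ?, ?)",
    [(1, 1, 2), (2, 1, 4), (3, 4, 1)],
)

summary = conn.execute(
    "SELECT incidentID, MAX(severity), AVG(severity) "
    "FROM incidents GROUP BY incidentID ORDER BY incidentID"
).fetchall()
for row in summary:
    print(row)
# (1, 4, 3.0)
# (4, 1, 1.0)
```

The maximum reports the worst moment of the shortage, whereas the mean smooths it out, and the mean also depends on how many updates happened to be recorded.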
If you replace “beer shortage” with “traffic”, multiply the number of incidents by a few thousand and move them about a bit then you’ve a better idea about part of what I’m up to!