-
Help on interesting "edges-values" query
I have this requirement:
We have a table containing a list of values with its timestamps. We must select only the "edges" values rows, that is, those rows that contain a different value from the previos one, considered the values order by TimeStamp.
This is an example of the table
RowId Value TimeStamp
1 1 2009-12-1 10:00:00
2 1 2009-12-1 10:30:00
3 0 2009-12-1 12:15:00
4 1 2009-12-1 14:45:00
5 0 2009-12-1 15:25:00
6 0 2009-12-1 18:00:00
7 1 2009-12-1 22:00:00
This is the expected result for "rising edges":
RowId Value TimeStamp
1 1 2009-12-1 10:00:00
4 1 2009-12-1 14:45:00
7 1 2009-12-1 22:00:00
This is the expected result for "falling edges":
RowId Value TimeStamp
3 0 2009-12-1 12:15:00
5 0 2009-12-1 15:25:00
This is the expected result for "all edges":
RowId Value TimeStamp
1 1 2009-12-1 10:00:00
3 0 2009-12-1 12:15:00
4 1 2009-12-1 14:45:00
5 0 2009-12-1 15:25:00
7 1 2009-12-1 22:00:00
I can think of a stored procedure to calculate the "edges" set, given an algorithm like this.
But I wonder if this can be easily done with a single SQL Query.
Any suggestions?
thanks a lot!!
-
SELECT *
FROM
(
--for every edge date get the starting point
SELECT value, MIN(rowid) as edge_rowid
FROM
(
--for every timestamp find the immediate edge date
SELECT t1.Timestamp, t1.rowid, t1.value,
MIN(t2.Timestamp) AS immediate_edge_date
FROM atable AS t1
LEFT OUTER JOIN atable AS t2 ON t1.Timestamp < t2.Timestamp AND t1.value <> t2.value
GROUP BY t1.Timestamp, t1.rowid, t1.value
) AS dates_values_change
GROUP BY immediate_edge_date, value
) AS edge_rowids
--HTH--
-
Thanks for the quick answer mikr0s!
It works just great!
Seems like it has inspirede me or something, because I've come up with another possible solution :
SELECT i.rowId, i.value, i.Timestamp as initTs
FROM atable i
WHERE i.value <>
(SELECT j.value FROM atable j
WHERE j.Timestamp < i.Timestamp
ORDER BY j.Timestamp DESC
LIMIT 1)
This one however has some issues:
1- it would not select the first row. This is not that important in my real case, it is just a matter of the time-frame used for the Query...
2- It uses the LIMIT 1 clause, which is MySQL dependant. How would you write that in standard SQL?
3- Performance: would you give me any advice on how both of the approaches would scale, from a performance point of view? Which one do you think works best?
Thanks again!
-
assuming there is an index on rowid, and sorting on rowid is the same as sorting on timestamp( a lower rowid points to an earlier date) your suggestion scales much better :
SELECT rowid, value, Timestamp
FROM atable where ROWID = (SELECT MIN(rowID) FROM atable)
UNION
SELECT i.rowId, i.value, i.Timestamp as initTs
FROM atable i
WHERE i.value <>(
select value from atable where rowid = (SELECT max(rowid) FROM atable j WHERE j.rowid < i.rowid)
)
--HTH--
-
although it should verify, in general I cannot assume that sorting on rowid is the same as sorting on timestamp. To get rid of the 'LIMIT 1' clause, I need to do the MAX() as you suggest, but do it on the timeStamp. It can be assumed that we will not have two different values for the same timeStamp:
SELECT i.rowId, i.value, i.Timestamp as initTs
FROM atable i
WHERE i.value <>
(
select value from atable
where Timestamp = (SELECT max(Timestamp) FROM atable j WHERE j.Timestamp < i.Timestamp )
)
I have enriched the query a bit to include the 'pulse' duration:
SELECT i.rowId, i.value, i.Timestamp as initTs,
(SELECT TIMESTAMPDIFF(MINUTE, i.ts, jj.ts) FROM atable j
WHERE j.ts > i.ts
AND j.value <> i.value
ORDER BY j.ts ASC
LIMIT 1) as durationMin
FROM atable i
WHERE i.value <>
(
select value from atable
where Timestamp = (SELECT max(Timestamp) FROM atable j WHERE j.Timestamp < i.Timestamp )
)
But in this case I'm confused on how to get rid of the 'LIMIT 1' clause. Any hint?
Thanks a lot again!!
Last edited by geesas; 12-09-2009 at 08:01 AM.
Reason: forgot to include explanation
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
|