BIK Information Services, Inc.

Oracle PL/SQL snippets
BIK - Constructing SQL that selects the latest versions of records
Problem at hand:

We have a table where several versions of the same record may exist (for instance, person_name along with person_position. We may only be interested in the last position the person occupied). Usually transaction date is the determining factor.
We need to select only latest version of records.

Example
Let's consider a table like this (We are using Oracle 9i syntax just because we don't want to actually create a table, but if you still use 8i you can create this table yourself):
with mytable as (
select 1 key1, 2 key2, 'abc' col1, to_date('20050701','YYYYMMDD') txn_dt from dual
union
select 1 key1, 2 key2, 'bcd' col1, to_date('20050702','YYYYMMDD') txn_dt from dual
union
select 1 key1, 3 key2, 'ert' col1, to_date('20050703','YYYYMMDD') txn_dt from dual
union
select 2 key1, 1 key2, 'qwe' col1, to_date('20050701','YYYYMMDD') txn_dt from dual
union
select 2 key1, 1 key2, 'plo' col1, to_date('20050702','YYYYMMDD') txn_dt from dual
union
select 3 key1, 4 key2, 'dfg' col1, to_date('20050701','YYYYMMDD') txn_dt from dual
union
select 4 key1, 4 key2, 'sdf' col1, to_date('20050702','YYYYMMDD') txn_dt from dual
)
SELECT * FROM mytable;
We also assume that the table has two keys that will make our records different once we get rid of older versions).
It would display results like this:

      KEY1       KEY2 COL1 TXN_DT
---------- ---------- ---- ---------
         1          2 abc  01-JUL-05
         1          2 bcd  02-JUL-05
         1          3 ert  03-JUL-05
         2          1 plo  02-JUL-05
         2          1 qwe  01-JUL-05
         3          4 dfg  01-JUL-05
         4          4 sdf  02-JUL-05
notice to records with key1, key2 pair 1,2 (7/1/2005 and 7/2/2005), two records with key1, key2 pair 2,1 (7/2/2005 and 7/1/2005).
What we want is just the latest record for each of them.

Solution1. (Standard).
Let's rewrite this statement as follows:
with mytable as (
select 1 key1, 2 key2, 'abc' col1, to_date('20050701','YYYYMMDD') txn_dt from dual
union
select 1 key1, 2 key2, 'bcd' col1, to_date('20050702','YYYYMMDD') txn_dt from dual
union
select 1 key1, 3 key2, 'ert' col1, to_date('20050703','YYYYMMDD') txn_dt from dual
union
select 2 key1, 1 key2, 'qwe' col1, to_date('20050701','YYYYMMDD') txn_dt from dual
union
select 2 key1, 1 key2, 'plo' col1, to_date('20050702','YYYYMMDD') txn_dt from dual
union
select 3 key1, 4 key2, 'dfg' col1, to_date('20050701','YYYYMMDD') txn_dt from dual
union
select 4 key1, 4 key2, 'sdf' col1, to_date('20050702','YYYYMMDD') txn_dt from dual
)
SELECT * FROM mytable WHERE (key1, key2, txn_dt) IN 
(SELECT key1, key2, max(txn_dt) FROM mytable GROUP BY key1, key2)
;
It produces results like this:
      KEY1       KEY2 COL1 TXN_DT
---------- ---------- ---- ---------
         1          2 bcd  02-JUL-05
         1          3 ert  03-JUL-05
         2          1 plo  02-JUL-05
         3          4 dfg  01-JUL-05
         4          4 sdf  02-JUL-05

Solution. (Unusual).
with mytable as (
select 1 key1, 2 key2, 'abc' col1, to_date('20050701','YYYYMMDD') txn_dt from dual
union
select 1 key1, 2 key2, 'bcd' col1, to_date('20050702','YYYYMMDD') txn_dt from dual
union
select 1 key1, 3 key2, 'ert' col1, to_date('20050703','YYYYMMDD') txn_dt from dual
union
select 2 key1, 1 key2, 'qwe' col1, to_date('20050701','YYYYMMDD') txn_dt from dual
union
select 2 key1, 1 key2, 'plo' col1, to_date('20050702','YYYYMMDD') txn_dt from dual
union
select 3 key1, 4 key2, 'dfg' col1, to_date('20050701','YYYYMMDD') txn_dt from dual
union
select 4 key1, 4 key2, 'sdf' col1, to_date('20050702','YYYYMMDD') txn_dt from dual
)
SELECT * FROM (
	   SELECT mytable.*
   	   ,row_number() OVER (PARTITION BY key1, key2 ORDER BY txn_dt DESC) rn
	   FROM mytable
	) WHERE rn<2
;
This solution produces the following results (note the last column):
      KEY1       KEY2 COL1 TXN_DT            RN
---------- ---------- ---- --------- ----------
         1          2 bcd  02-JUL-05          1
         1          3 ert  03-JUL-05          1
         2          1 plo  02-JUL-05          1
         3          4 dfg  01-JUL-05          1
         4          4 sdf  02-JUL-05          1
This solution is a bit unisual but may have some other uses.
For instance, if you try the inner part of this sql:
with mytable as (
select 1 key1, 2 key2, 'abc' col1, to_date('20050701','YYYYMMDD') txn_dt from dual
union
select 1 key1, 2 key2, 'bcd' col1, to_date('20050702','YYYYMMDD') txn_dt from dual
union
select 1 key1, 3 key2, 'ert' col1, to_date('20050703','YYYYMMDD') txn_dt from dual
union
select 2 key1, 1 key2, 'qwe' col1, to_date('20050701','YYYYMMDD') txn_dt from dual
union
select 2 key1, 1 key2, 'plo' col1, to_date('20050702','YYYYMMDD') txn_dt from dual
union
select 3 key1, 4 key2, 'dfg' col1, to_date('20050701','YYYYMMDD') txn_dt from dual
union
select 4 key1, 4 key2, 'sdf' col1, to_date('20050702','YYYYMMDD') txn_dt from dual
)
--SELECT * FROM (
	   SELECT mytable.*
   	   ,row_number() OVER (PARTITION BY key1, key2 ORDER BY txn_dt DESC) rn
	   FROM mytable
--	) WHERE rn<2
;
you will get results like this:
      KEY1       KEY2 COL1 TXN_DT            RN
---------- ---------- ---- --------- ----------
         1          2 bcd  02-JUL-05          1
         1          2 abc  01-JUL-05          2
         1          3 ert  03-JUL-05          1
         2          1 plo  02-JUL-05          1
         2          1 qwe  01-JUL-05          2
         3          4 dfg  01-JUL-05          1
         4          4 sdf  02-JUL-05          1
Notice what is happening here is that column RN is giving you a sequence number within its group. This may be useful by itself. The outer SQL just selects the first record from each group.


We hope you found this snippet useful.
If so drop us a note to Boris Knizhnik  borisk@bikinfo.com.
We appreciate any links to this site as they show your appreciation and work as free ads.


Last modified on


Back to BIK Information Services Home page Back to BIK Information Services home page