18 November 2013

PIVOT and UNPIVOT

The PIVOT and UNPIVOT operators were introduced in Oracle 11g and the other day I helped out a friend using these operators.
He wanted the data in the columns to be "shifted to the left". To explain a little more, take a look at the following data:
        ID V1    V2    V3    V4    V5
---------- ----- ----- ----- ----- -----
         1 a     b                 e
         2 aaa   bbbb  cccc
         3             a     b     e
         4 a           c           e
         5       b           d
Above is the data as it appeared in the table.
The requirement was to shift the data to the left like the data below:
        ID C1    C2    C3    C4    C5
---------- ----- ----- ----- ----- -----
         1 a     b     e
         2 aaa   bbbb  cccc
         3 a     b     e
         4 a     c     e
         5 b     d

The data in the columns is moved over one or more columns to the left. First things first: setting up the table and some test data:
drop table test purge
;
 
create table test
(id number 
,v1   varchar2(5)
,v2   varchar2(5)
,v3   varchar2(5)
,v4   varchar2(5)
,v5   varchar2(5)
)
;
 
insert into test values (1,'a', 'b', null, null, 'e');
insert into test values (2,'aaa', 'bbbb', 'cccc', null, null);
insert into test values (3,null, null, 'a', 'b', 'e');
insert into test values (4,'a', null, 'c', null, 'e');
insert into test values (5,null, 'b', null, 'd', null);

The question that my friend had was
"I've written this query and it's huge... Can't it be done in a easier way?" Following is the query that he wrote which gets the required result:
select coalesce (v1, v2, v3, v4, v5) v1
--
,      case when v1 is not null
            then coalesce(v2, v3, v4, v5)
            else case when v2 is not null
                      then coalesce(v3, v4, v5)
                      else case when v3 is not null
                                then case when v4 is not null
                                          then v4
                                          else v5
                                     end
                                else case when v4 is not null
                                          then v5
                                          else null
                                     end
                           end
                 end
       end v2
--
,      case when (v1 is null or v2 is null)
            then case when ((v4 is not null)
                        and ((v1 is not null and v3 is not null)
                          or (v2 is not null and v3 is not null)))
                      then v4
                      else case when ((v1 is null and v2 is null and v3 is null)
                                  or  (v1 is null and v2 is null and v4 is null)
                                  or  (v1 is null and v2 is null and v5 is null)
                                  or  (v1 is null and v3 is null and v4 is null)
                                  or  (v1 is null and v3 is null and v5 is null)
                                  or  (v1 is null and v4 is null and v5 is null)
                                  or  (v2 is null and v3 is null and v4 is null)
                                  or  (v2 is null and v3 is null and v5 is null)
                                  or  (v2 is null and v4 is null and v5 is null)
                                  or  (v3 is null and v4 is null and v5 is null))
                                then null
                                else v5
                           end
                 end
            else coalesce (v3, v4, v5)
       end v3
--
,      case when (v1 is null or v2 is null or v3 is null)
            then case when ((v1 is null and v2 is null)
                        or  (v1 is null and v3 is null)
                        or  (v1 is null and v4 is null)
                        or  (v2 is null and v3 is null)
                        or  (v2 is null and v4 is null)
                        or  (v3 is null and v4 is null))
                      then null
                      else v5
                 end
            else coalesce (v4, v5)
       end v4
--
,      case when v1 is null or v2 is null or v3 is null or v4 is null
            then null
            else v5
       end v5
--
from   test
;
Quite an enormous query.

PIVOT and UNPIVOT


Now using the PIVOT and UNPIVOT operators. First the UNPIVOT: to get all the data down the page instead of across:
SQL> select id
  2        ,value_type
  3        ,cols
  4        ,row_number() over (partition by id order by cols) rn
  5    from test
  6    unpivot
  7    (cols
  8      for value_type in ( v1, v2, v3, v4, v5)
  9    )
 10  /

        ID VA COLS          RN
---------- -- ----- ----------
         1 V1 a              1
         1 V2 b              2
         1 V5 e              3
         2 V1 aaa            1
         2 V2 bbbb           2
         2 V3 cccc           3
         3 V3 a              1
         3 V4 b              2
         3 V5 e              3
         4 V1 a              1
         4 V3 c              2
         4 V5 e              3
         5 V2 b              1
         5 V4 d              2

14 rows selected.
The UNPIVOT clause will get the data from each of the columns and lists them in a single column - named "cols".
To keep track of the order of the data that it needs to appear in in the final result, I added an analytic function (line 4 of the SQL statement).
After unpivotting the results, we want to pivot them back again to get the desired results.
Moving the previous query to an inline view and applying the PIVOT would be the next step.
SQL> select id
  2        ,c1
  3        ,c2
  4        ,c3
  5        ,c4
  6        ,c5
  7    from (select id
  8                ,value_type
  9                ,cols
 10                ,row_number() over (partition by id order by cols) rn
 11            from test
 12          unpivot
 13            (cols
 14                for value_type in ( v1, v2, v3, v4, v5)
 15            )
 16         )
 17   pivot (max (cols) for
 18            rn in (1 as c1
 19                  ,2 as c2
 20                  ,3 as c3
 21                  ,4 as c4
 22                  ,5 as c5)
 23         )
 24   order by id
 25 /

        ID C1    C2    C3    C4    C5
---------- ----- ----- ----- ----- -----
         1 a
         1       b
         1             e
         2 aaa
         2       bbbb
         2             cccc
         3 a
         3       b
         3             e
         4 a
         4       c
         4             e
         5 b
         5       d

14 rows selected.
As you can see the data appears in the correct columns, but there are too many rows shown for each ID.
This is caused by the value_type being selected on line 8. When we remove this column from the query, we end up with the required result:
SQL> select id
  2        ,c1
  3        ,c2
  4        ,c3
  5        ,c4
  6        ,c5
  7    from (select id
  8                ,cols
  9                ,row_number() over (partition by id order by cols) rn
 10            from test
 11          unpivot
 12            (cols
 13                for value_type in ( v1, v2, v3, v4, v5)
 14            )
 15         )
 16   pivot (max (cols) for
 17            rn in (1 as c1
 18                  ,2 as c2
 19                  ,3 as c3
 20                  ,4 as c4
 21                  ,5 as c5)
 22         )
 23   order by id
 24 /

        ID C1    C2    C3    C4    C5
---------- ----- ----- ----- ----- -----
         1 a     b     e
         2 aaa   bbbb  cccc
         3 a     b     e
         4 a     c     e
         5 b     d

No comments:

Post a Comment