
16 October 2024

Display JSON in APEX using APEX_MARKDOWN and JSON_TRANSFORM #JoelKallmanDay

Markdown is a plain text format for writing structured documents, and in APEX there is a built-in package that can transform Markdown notation into HTML.
With Markdown it is also possible to show pieces of code or JSON documents in a fixed-width font, which makes a JSON document in APEX a lot easier to read than a proportional font would.
The idea is to add Markdown to the JSON document and use the APEX_MARKDOWN package to convert the whole string into HTML.
Something like:

  ```json
  {"this":"that"}
  ```

The first line in the Markdown starts with three backticks (```) followed by "json", then the JSON document starts on a new line, and the block is closed with three backticks again.

To set up the example, use the following script to create a table and add some sample data. The sample data consists of three different JSON documents. Note that the first document includes a PDF file; the Base64 value representing the PDF has been shortened for brevity. In reality the Base64 value is enormous, but there is a solution for that in this post.

create table t 
(outgoing_message json) -- Oracle Database 23ai introduced the JSON datatype
/
insert into t (outgoing_message) 
values (
  '{
    "orderNumber" : "APX000328573",
    "invoiceNumber" : 2023300028,
    "salesChannel" : "Amazon",
    "file" :
    {
      "mime" : "application/pdf",
      "data" : "255044462D312E340A25C3A4C3BCC3B6C39F0A322030206F626A0A3C3C2F4C656E6774682033203020522F46696C7465722F466C6"
    }
  }'
);

insert into t (outgoing_message) 
values (
  '{
    "status" : "success",
    "message" : "The order update was successfully received."
  }
  '
);

insert into t (outgoing_message) 
values (
  '{
    "orderNumber" : "APX000328573",
    "timeStamp" : "2023-04-13T14:49:50.453269Z",
    "salesChannel" : "Amazon",
    "financialStatus" : "paid",
    "logisticStatus" : "ready for production",
    "payments" :
    [
      {
        "provider" : "Mollie",
        "date" : "2023-04-13T16:49:50",
        "amount" : 569.95,
        "currency" : "EUR"
      }
    ]
  }');
commit;

Displaying the complete Base64 string representing the PDF document does not add any value for the user looking at the JSON document and is most likely irrelevant.
Using the JSON_TRANSFORM function it is trivial to replace the complete Base64 string with something that is good enough for the user; in this example we will replace it with "<Base64Encoded PDF Document>".
The syntax has changed slightly between Oracle Database 19c and Oracle Database 23ai, so depending on the version of the database that you're using, pick the matching snippet:

-- Oracle Database 19c 
select json_transform (ann.outgoing_message
          ,set '$.file.data' = '<Base64Encoded PDF Document>'  ignore on missing
          returning clob 
         ) as outgoing_message
  from t ann
;


-- Oracle Database 23ai
select json_transform (ann.outgoing_message
          ,nested path '$' (
             set '@.file.data' = '<Base64Encoded PDF Document>' ignore on missing
            )
         ) as outgoing_message
  from t ann
;

The next step is to wrap the resulting JSON document with the Markdown tags and a new line break:

apex_markdown.to_html ('```json'||chr(10)|| ... ||chr(10)||'```')

This results in the following query and result:

   select apex_markdown.to_html (
             '```json'||chr(10)||
             json_transform (ann.outgoing_message
                ,nested path '$' (
                   set '@.file.data' = '<Base64Encoded PDF Document>' ignore on missing
                 )
                returning clob
             )||chr(10)||'```'
          ) as outgoing_message
     from t ann;
  
  
<pre><code class="language-json">{
  "orderNumber" : "APX000328573",
  "invoiceNumber" : 2023300028,
  "salesChannel" : "Amazon",
  "file" :
  {
    "mime" : "application/pdf",
    "data" : "<Base64Encoded PDF Document>"
  }
}
</code></pre>


01 February 2023

Parse CSV-file in different Character Set with APEX_DATA_PARSER

When there are special characters in a CSV-file, and the file is in a different character set than the one you expect, parsing it can give unexpected results.
From different suppliers we get CSV-files, which need to be parsed and stored in the database. This is done using APEX_DATA_PARSER.
APEX_DATA_PARSER is very flexible and allows several file formats: JSON, XML, CSV and even native Excel XLSX.
The files are uploaded to the database (in the background) and processed. And this works like a charm... until one of the suppliers provided a CSV-file in a different character set, with special characters.

The file in this example is a simple CSV with German city names, containing special characters such as umlauts and ß, in the first column.

This file is uploaded to a database table with the following structure:

Name          Null?  Type
------------  -----  -------------
MIME_TYPE            VARCHAR2(255)
FILENAME             VARCHAR2(400)
BLOB_CONTENT         BLOB

The following query is used to parse the file and extract the data:
SQL> select p.col001
  2        ,p.col002
  3        ,p.col003
  4    from a
  5   cross
  6    apply apex_data_parser.parse (p_content => a.blob_content
  7                                 ,p_file_name => a.filename
  8                                 ) p
  9* /

COL001                     COL002     COL003     
__________________________ __________ __________ 
R�sselsheim am Main        3.000      624.760    
Gnutz                      1.000      139.490    
�ach-Palenberg             1.000      139.490    
Syke                       1.000      242.600    
H�kirchen-Siegertsbrunn    1.000      136.980    
Eberhardzell               1.000      233.810    
heinsberg                  1.000      67.840     
Hohndorf                   3.000      304.800    
Nidderau                   3.000      385.890    
N�n-Hardenberg             1.000      239.680    
Kl��1.000                  185.000               
K                          2.000      237.880    
W�rselen                   1.000      196.500    
Gl�ckstadt                 1.000      136.980    
Dessau-Ro�au               1.000      5.900      

15 rows selected. 
As you can see in the results above, the special characters have been removed. When a word in the first column ends in a special character, the parsing of the rest of the line is also messed up: there is a NULL in column 3 where there should be a value.
Of course this has everything to do with the characterset of the uploaded file.

With the following command in a Terminal-window on Mac, it is possible to get information about the file, including the characterset:

  file -I name of the file
  
In my case that would yield the following results:
file -I charset.csv
charset.csv: text/plain; charset=iso-8859-1
  
This means that the file is in the characterset: ISO-8859-1

To find out which characterset this maps to in the database, the following query can help:

  SQL> select value
  2    from v$nls_valid_values
  3   where parameter  ='CHARACTERSET'
  4     and value like '%8859%'
  5* /

VALUE             
_________________ 
WE8ISO8859P1      
EE8ISO8859P2      
SE8ISO8859P3      
NEE8ISO8859P4     
CL8ISO8859P5      
AR8ISO8859P6      
EL8ISO8859P7      
IW8ISO8859P8      
WE8ISO8859P9      
NE8ISO8859P10     
WE8ISO8859P15     
BLT8ISO8859P13    
CEL8ISO8859P14    
AZ8ISO8859P9E     

14 rows selected.
  
From this list WE8ISO8859P1 is selected and passed into the APEX_DATA_PARSER function:

SQL> select p.col001
  2        ,p.col002
  3        ,p.col003
  4    from all_csvs a
  5   cross
  6    apply apex_data_parser.parse (p_content => a.blob_content
  7                                 ,p_file_name => a.filename
  8                                 ,p_file_charset => 'WE8ISO8859P1'
  9                                 ) p
 10* /

COL001                        COL002    COL003     
_____________________________ _________ __________ 
Rüsselsheim am Main           3.000     624.760    
Gnutz                         1.000     139.490    
Übach-Palenberg               1.000     139.490    
Syke                          1.000     242.600    
Höhenkirchen-Siegertsbrunn    1.000     136.980    
Eberhardzell                  1.000     233.810    
heinsberg                     1.000     67.840     
Hohndorf                      3.000     304.800    
Nidderau                      3.000     385.890    
Nörten-Hardenberg             1.000     239.680    
Klüß                          1.000     185.000    
Köln                          2.000     237.880    
Würselen                      1.000     196.500    
Glückstadt                    1.000     136.980    
Dessau-Roßlau                 1.000     5.900      

15 rows selected.

And now all is well.

11 October 2022

Intervals go ISO-8601 #JoelKallmanDay

Another year, and it's time to write an entry for the #JoelKallmanDay.

My recent discovery in Oracle Database 21c, the innovation release, is that intervals can also be expressed in ISO-8601 format.
The ISO-8601 format is an international standard for describing Date and Time formats in data exchanges.
There are two flavors of interval in the Oracle Database, "Day to Second" and "Year to Month" and there are specialized functions to convert strings in a certain format to an interval. The following example creates a "Day to Second" interval of 4 days, 3 hours, 25 minutes and 42 seconds:

SQL> select to_dsinterval ('4 3:25:42')
  2    from dual
  3* /

TO_DSINTERVAL('43:2
-------------------
+04 03:25:42.000000
Feeding this "Day to Second" interval into a JSON_SCALAR function will return the ISO-8601 standard for this interval:
SQL> select json_scalar (to_dsinterval ('4 3:25:42'))
  2    from dual
  3* /

JSON_SCALAR(TO_DSINTERVAL('43:25:42'))                                   
-------------------------------------------------------------------------
"P4DT3H25M42S" 
The reverse is also true: it is possible to feed an ISO-8601 interval into the TO_DSINTERVAL function and get an Oracle interval returned:
SQL> select to_dsinterval ('P7DT4H12M53S')
  2    from dual
  3* /

TO_DSINTERVAL('P7DT
-------------------
+07 04:12:53.000000
This works the same for "Year to Month" intervals.
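A quick sketch (untested here) of what that looks like for a "Year to Month" interval of 2 years and 6 months:

-- Oracle notation in, ISO-8601 out, and back again
select to_yminterval ('2-6')                as oracle_notation   -- +02-06
      ,json_scalar (to_yminterval ('2-6'))  as iso_8601          -- "P2Y6M"
      ,to_yminterval ('P2Y6M')              as back_to_oracle    -- +02-06
  from dual;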

02 March 2022

Modify a JSON structure with a single Update

Recently there was a question on the Oracle Community Forum related to JSON, more specifically how to change the structure of it. The link to the Question is at the bottom of this post.
The starting point is the following JSON:

  {
   "tktKey": "123456",
   "attlist1": [
      {
         "attkey": "123",
         "attval": "abc"
      },
      {
         "attkey": "456",
         "attval": "def"
      }
   ],
   "attlist2": [
      {
         "attkey": "098",
         "attval": "xyz"
      }
   ]
}
  
The task at hand was to combine both attlist1 and attlist2 into a single attlist.
To do this, the JSON object contained in attlist2 needs to be extracted and placed in attlist1. After that the element attlist2 can be removed from the structure. When this is done, the element attlist1 contains three JSON objects. Finally the element attlist1 needs to be renamed to attlist.
All of these actions can be done with a single SQL statement using JSON_TRANSFORM.

Let's first create a table to hold the original JSON
   SQL> create table js
  2  (str clob 
  3  )
  4* /

Table JS created.
   
And insert the JSON Object in there.
   SQL> insert into js values ('{ "tktKey": "123456", "attlist1": [ { "attkey": "123",   "attval": "abc"  },  { "attkey": "456","attval": "def"  } ],  "attlist2": [{"attkey": "098", "attval": "xyz"  } ]}')
  2* /

1 row inserted.
   
Here comes the magic moment:
   SQL> update js
  2     set str= json_transform
  3                (js.str
  4                ,append '$.attlist1'
  5                    = json_query (js.str, '$.attlist2[*]')
  6                ,remove '$.attlist2'
  7                ,rename '$.attlist1' ='attlist')
  8* /

1 row updated.
   
With JSON_QUERY (line 5) I extract the JSON Object from attlist2, which is added (APPEND, line 4) to attlist1.
Attlist2 is removed from the JSON object (line 6) and finally on line 7 attlist1 is renamed to attlist.

And to view the results, in a nice readable format:
SQL> select json_query (j.str, '$' returning clob pretty) as result
  2    from js j
  3* /

RESULT 
_______ 
{
  "tktKey" : "123456",
  "attlist" :
  [
    {
      "attkey" : "123",
      "attval" : "abc"
    },
    {
      "attkey" : "456",
      "attval" : "def"
    },
    {
      "attkey" : "098",
      "attval" : "xyz"
    }
  ]
}    

The more I use the JSON functionality that the Oracle database has to offer, the more I like them!


Links

14 July 2021

SQLCL 21.2: APEX LIST Command gives error (and a work around)

Recently I've upgraded to the latest SQLCl release, currently 21.2, and found a small bug.
To quickly export APEX applications I often use the "APEX"-commands in SQLCl. Exporting an APEX application is done by issuing the following command:

  apex export 1000
  
where you pass in the number of the application that you want to export.
This still works fine, but I don't always remember the application ID, so I use "apex list" to show a list of APEX applications. Unfortunately this command is failing in this release.
To demonstrate that this is happening:
SQL> show version
Oracle SQLDeveloper Command-Line (SQLcl) version: 21.2.0.0 build: 21.2.0.169.1529
  
The "APEX" and "APEX list" commands are interchangeable.
SQL> apex
2021-07-14 08:55:35.327 SEVERE oracle.dbtools.raptor.newscriptrunner.ScriptExecutor run java.base/java.util.Collections.nCopies(Collections.java:5027)
java.lang.IllegalArgumentException: List length = -3
        at java.base/java.util.Collections.nCopies(Collections.java:5027)
        at oracle.dbtools.raptor.utils.AnsiColorListPrinter.printHeaders(AnsiColorListPrinter.java:90)
        at oracle.dbtools.raptor.utils.ListPrinter.print(ListPrinter.java:96)
        at oracle.dbtools.raptor.utils.ListPrinter.print(ListPrinter.java:68)
        at oracle.dbtools.raptor.utils.ListPrinter.print(ListPrinter.java:62)
        at oracle.dbtools.raptor.utils.ListPrinter.printListofList(ListPrinter.java:208)
        at oracle.dbtools.raptor.newscriptrunner.commands.ApexCmd.printList(ApexCmd.java:250)
        at oracle.dbtools.raptor.newscriptrunner.commands.ApexCmd.handleEvent(ApexCmd.java:173)
        at oracle.dbtools.raptor.newscriptrunner.CommandRegistry.fireListeners(CommandRegistry.java:346)
        at oracle.dbtools.raptor.newscriptrunner.ScriptRunner.run(ScriptRunner.java:226)
        at oracle.dbtools.raptor.newscriptrunner.ScriptExecutor.run(ScriptExecutor.java:344)
        at oracle.dbtools.raptor.newscriptrunner.ScriptExecutor.run(ScriptExecutor.java:227)
        at oracle.dbtools.raptor.scriptrunner.cmdline.SqlCli.processLine(SqlCli.java:341)
        at oracle.dbtools.raptor.scriptrunner.cmdline.SqlCli.runSqlcl(SqlCli.java:1049)
        at oracle.dbtools.raptor.scriptrunner.cmdline.SqlCli.main(SqlCli.java:369)
  
As you can see using the "APEX" command in SQLCl raises some kind of Java-exception.
And if you run into an exception, what do you do? Of course you turn to Twitter to tweet about it... knowing that Jeff Smith, Product Manager at Oracle, is always listening (or at least it seems that way). He replied with a solution to my problem.
And using the ALTER SESSION command to set the NLS_DATE_FORMAT, things work again.
SQL> alter session set nls_date_format = 'DD-MON-RR HH:MI:SSpm'
  2* /

Session altered.
SQL> apex
    WORKSPACE_ID    WORKSPACE    APPLICATION_ID    APPLICATION_NAME      BUILD_STATUS       LAST_UPDATED_ON
________________    _________    ______________    __________________    ________________   ______________________
1500550789501687    ALEX                   1000    Master Application    Run and Develop    18-JUN-21 11:42:49am
And no, it doesn't have to be that particular format mask for NLS_DATE_FORMAT. You can also use a different format mask, like 'dd-mm-yyyy hh24:mi'; it only changes how the last column is formatted in your output.

Thanks to Turloch O'Tierney for coming up with this solution.

12 November 2020

Automatically Not Null when adding a DEFAULT Clause

Today I learned something new about the DEFAULT ON NULL clause.
Since Oracle database release 12 it is possible to define a DEFAULT ON NULL clause which is used when a NULL is provided as value. The thing I didn't know is that the column is automatically defined as NOT NULL.
Is this documented behaviour? Yes, it is. Quote from the documentation:

When you specify ON NULL, the NOT NULL constraint and NOT DEFERRABLE constraint state are implicitly specified.

SQL> select banner
  2    from v$version
  3  /

BANNER
--------------------------------------------------------------------------------
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
SQL> create table t 
  2  (col1 varchar2(10) default on null 'hello')
  3  /

Table T created.

SQL> desc t
Name Null?    Type         
---- -------- ------------ 
COL1 NOT NULL VARCHAR2(10) 
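A quick check (a sketch; output abbreviated) shows that an explicitly provided NULL is indeed replaced by the default value:

SQL> insert into t (col1) values (null);

1 row created.

SQL> select col1 from t;

COL1
----------
hello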

And in case you are wondering if you can change the column afterwards to NULL, the answer is "no".

SQL> alter table t 
  2  modify (col1 null)
  3  /

Error starting at line : 1 in command -
alter table t 
modify (col1 null)
Error report -
ORA-01451: column to be modified to NULL cannot be modified to NULL
01451. 00000 -  "column to be modified to NULL cannot be modified to NULL"
*Cause:    the column may already allow NULL values, the NOT NULL constraint
           is part of a primary key or check constraint.
*Action:   if a primary key or check constraint is enforcing the NOT NULL
           constraint, then drop that constraint.

05 June 2020

Conditional Compilation; One Dynamic Package Specification

When you want to make a distinction between different databases (like Development, Test, Acceptance, and Production) and have a single code base, conditional compilation can help out. Most of the time I read about conditional compilation when the use case is to eliminate code from the production environment, like instrumentation, which is of course a legit scenario. Some functionality should never run on Production, but likewise some functionality should never run on Dev, Test or Acceptance.

For example: we send out emails to customers about their order statuses. This functionality should only run on the production database, and never ever on any of the other databases. Initially there was a toggle in the application dictating whether the emails should go to customers or to a dummy email address within the company. How fool-proof is this solution? Short answer: it's not. Sooner or later someone will set the toggle to "Send to Customers" in the wrong environment. Of course there are many more examples like this one.

Anyway, to determine the environment, which can then be used for conditional compilation, I use a package specification that is compiled dynamically. The value is determined using the database name, which can be found in the USERENV context. To output a textual representation of a boolean value I use LOGGER.TOCHAR, which outputs "true", "false" or "null", but of course you can also write your own boolean-to-text translation.

declare
   l_pack_spec varchar2(32767);
   l_env varchar2(25);
begin
   l_env := sys_context ('userenv', 'db_name');
   l_pack_spec := 'create or replace package environment_pkg'||chr(10);
   l_pack_spec := l_pack_spec ||'is'||chr(10);
   l_pack_spec := l_pack_spec ||'   --=='||chr(10);
   l_pack_spec := l_pack_spec ||'   -- Environment Information, useful for conditional compilation'||chr(10);
   l_pack_spec := l_pack_spec ||'   development constant boolean := '||lower (logger.tochar (l_env = 'DEV'))||';'||chr(10);
   l_pack_spec := l_pack_spec ||'   test        constant boolean := '||lower (logger.tochar (l_env = 'TST'))||';'||chr(10);
   l_pack_spec := l_pack_spec ||'   acceptance  constant boolean := '||lower (logger.tochar (l_env = 'ACC'))||';'||chr(10);
   l_pack_spec := l_pack_spec ||'   production  constant boolean := '||lower (logger.tochar (l_env = 'PRD'))||';'||chr(10);
   l_pack_spec := l_pack_spec ||'   --=='||chr(10);
   l_pack_spec := l_pack_spec ||'end environment_pkg;'||chr(10);
   execute immediate l_pack_spec;
end;

When you run the code above, the resulting code will be something like:

package environment_pkg
is
   --==
   -- Environment Information, useful for conditional compilation
   development constant boolean := true;
   test        constant boolean := false;
   acceptance  constant boolean := false;
   production  constant boolean := false;
   --==
end environment_pkg;

Indicating that the current database is the Development database.

Now you can use conditional compilation to include or exclude certain code sections, like:

$if environment_pkg.production $then
   -- Here comes the code to send an email to customers
   ....
$else
   -- Here comes the code to send an email to a dummy internal address
   ....
$end

To see the effect of conditional compilation (what does the compiled code actually look like?), you can use the built-in procedure DBMS_PREPROCESSOR.PRINT_POST_PROCESSED_SOURCE, which writes the result to DBMS_OUTPUT:

set serveroutput on
begin
   dbms_preprocessor.print_post_processed_source (object_type    => 'PACKAGE BODY'
                                                 ,schema_name    => user
                                                 ,object_name    => 'MAIL_PKG' -- package where you use the conditional compilation
                                                 );
end;
/

24 April 2020

First and Last Day of the ISO-year

The customer had the request for a report regarding the sales figures Year to Date, per week. More specifically: ISO-week.
For more information about ISO-weeks, go to Wikipedia: ISO-week
Initially I didn't think too much about it, I know Oracle provides the format mask "IW" to get the ISO-week number, and thought that would do. But there is a little bit more to consider.
More information on format masks in Oracle Database 19c: Format Masks

Formatting the date to reflect the ISO-week is easy enough:

to_char (sysdate, 'iw')
which would give you output like '03' or '14'.
Just formatting the dates with this format mask could be sufficient... but there might be a snag to it.

The customer in my case wanted to see the data per year, so the first selection would be the year. Then a report would be shown with the number of orders per ISO-week.
To get all the relevant data from the database I would add a filter like this one:
where order_date between to_date ('01-01-'||:P_YEAR, 'dd-mm-yyyy')
                     and to_date ('31-12-'||:P_YEAR, 'dd-mm-yyyy')
to select all the data for a given calendar year. The benefit of using calendar years is that you know they will always start on January first and end on December 31. Simply concatenate the year that the customer selected and turn that string into a proper DATE.

While I was testing the reports for the customer, I noticed that when I selected 2019 as the year, I would also see data for week 01. What's so strange about that? Well, the system didn't go live until August of 2019. There shouldn't be any data for the first week of 2019. Searching through the database backed up my assumption: the earliest orders in the system started in August 2019.

And this is the trap I fell into: the combination of a calendar year with ISO-weeks.
Let's take a closer look at the end of 2019 and the start of 2020.

SQL> with test as 
  2  (select date '2019-12-29' as dt from dual
  3  union all select date '2019-12-30' as dt from dual
  4  union all select date '2019-12-31' as dt from dual
  5  union all select date '2020-01-01' as dt from dual
  6  union all select date '2020-01-02' as dt from dual
  7  union all select date '2020-01-03' as dt from dual
  8  union all select date '2020-01-04' as dt from dual
  9  union all select date '2020-01-05' as dt from dual
 10  )
 11  select dt
 12        ,to_char (dt, 'iw') as iso_week
 13        ,to_char (dt, 'ww') as calendar_week
 14   from test
 15 /

DT       ISO_WEEK   CALENDAR_WEEK  
-------- ---------- ---------------
29-12-19 52         52             
30-12-19 01         52             
31-12-19 01         53             
01-01-20 01         01             
02-01-20 01         01             
03-01-20 01         01             
04-01-20 01         01             
05-01-20 01         01             

8 rows selected.
Note the difference in the ISO_WEEK and the CALENDAR_WEEK columns. The first ISO_WEEK starts on December 30 2019, while the first CALENDAR_WEEK starts on January 01 2020.
Also note that Calendar week 53 in 2019 is 1 day long.

Because I was selecting all order data in the selected calendar year, up to December 31 2019, and then formatting it using ISO-weeks, the report would show week 01... it just wasn't week 01 of 2019, but of 2020.

Getting the First Day of an ISO-year

I decided that to get all the data correctly aligned, I would use the first and last day of the ISO-year. The disadvantage is that the first day of an ISO-year is not fixed: it might fall in the year before. The same is true for the last day of the ISO-year, which might fall in the year after the one you selected.
Using the format masks (see the documentation link at the top of this post), it is also possible to truncate a date to the first day of its ISO-year.

SQL> select trunc (date '2020-01-01', 'iyyy')
  2    from dual
  3  /

TRUNC(DA
--------
30-12-19
This seems to work, but it doesn't ... or at least not for the purposes that I need it.
If I pass in January 01 2020 I get the first day of the ISO-year, which is December 30 2019. That's correct.
When passing in January 01 2021, I get the following result:
SQL> select trunc (date '2021-01-01', 'iyyy')
  2    from dual
  3  /

TRUNC(DA
--------
30-12-19
And this is not what I expected... Turns out that January 01 2021 is still in ISO-year 2020, and the first day of that ISO-year is December 30 2019.

The "trick" is to not pass in a date that might belong to a different ISO-year than its calendar year: don't choose a date in the last calendar week of December or the first calendar week of January.
Picking a random date in the middle of the year will give you the first day of the ISO-year:

SQL> select trunc (date '2021-08-05', 'iyyy')
  2    from dual
  3  /

TRUNC(DA
--------
04-01-21

Getting the Last Day of an ISO-year

Use the same trick as for getting the first day of the ISO-year:
get the first day of the ISO-year following the one you're interested in and subtract one second:

SQL> select trunc (date '2022-08-05', 'iyyy') - interval '1' second
  2    from dual
  3  /

TRUNC(DA
--------
02-01-22

And the WHERE-clause with the bind variable from the page:

between trunc (to_date (:P_YEAR||'-08-05', 'yyyy-mm-dd'), 'iyyy')
    and trunc (to_date (to_char (to_number (:P_YEAR) + 1)||'-08-05', 'yyyy-mm-dd'), 'iyyy') - interval '1' second
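For example, with :P_YEAR set to 2019 this evaluates to the boundaries of ISO-year 2019 (a quick check):

select trunc (date '2019-08-05', 'iyyy')                        as first_day_iso_2019  -- 31-12-2018
      ,trunc (date '2020-08-05', 'iyyy') - interval '1' second  as last_day_iso_2019   -- 29-12-2019 23:59:59
  from dual;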

25 March 2020

Taming The Queue - when the third party can only handle so many connections

We had a problem with the awesome power that the Oracle database offers... or to be more specific the third party had a problem with that awesome power.

Setting the stage

Our application handles many, many orders per day, and mostly they are processed in waves during the day, meaning there are peaks and valleys in the number of orders that need to be processed. When the orders are processed they are sent to a third-party application using a REST service. Informing the third party doesn't have to be "real time", as long as they get the information.
The way we set up our side of the application is as follows:

  • when the order reaches a certain status, enough information is placed on a Queue (Advanced Queuing)
  • a callback function on the Queue constructs the JSON and does the REST call (using APEX_WEB_SERVICE)
The advantage of this method is that the operational system can continue working, without having to wait for the REST service to complete. Informing the other system is done in the background.

The Problem

The advantage of having the database handle the Queue is that it "auto-scales". Because it runs in the background it can spin up as many "sessions" as it wants to perform the callback function and clear the queue. If the system gets busy, it will scale down the processing of the Queue so that the operational system doesn't slow down to a crawl. Simply put: awesome.
However, the third party could only handle a limited number of connections, four to be precise. The consequence of this was that when the database would spin up many sessions, a lot of them would get an error: "Concurrent request limit exceeded. Request blocked." and the orders had to be retried.

Solutions?

As we knew that the third party could only handle four "sessions", we preferred to limit the number of sessions that the database would spin up. However there is (as far as I could tell) no way to tell Oracle "you can only use at most four sessions to clear the Queue".
Reaching out on Twitter suggested adjusting the AQ_TM_PROCESSES parameter but that controls the time, not the number of "sessions".
Another suggestion was to tweak JOB_QUEUE_PROCESSES, but we didn't try that to be honest. There are other processes that utilize DBMS_SCHEDULER (e.g. for importing orders) that we didn't want this parameter to interfere with.
Thanks to Vidar Eidissen and Philipp Salvisberg for your responses on Twitter.

Our Solution

The method that we implemented was as follows:

  1. Get an exclusive lock
  2. Construct the necessary JSON and do the REST Call
  3. Release the lock

To facilitate this method, I created a table which is just to coordinate the locks:

create table lock_handles
   (handle varchar2(10)
   ,nr    number
   ,constraint lh_pk primary key (handle)
   );

insert into lock_handles values ('LOCK1', 0);
insert into lock_handles values ('LOCK2', 0);
insert into lock_handles values ('LOCK3', 0);
insert into lock_handles values ('LOCK4', 0);

commit;
The only purpose this table serves is locking; thinking about it a little bit more, I doubt that I need more than one column though.

Next, a function to get a lock. Of course this function is placed in a package (for the example, the name of the package is "MYPKG").

  function get_lock
    return lock_handles.handle%type
  is
    e_locked exception;
    pragma exception_init (e_locked, -54);
    function get_lock (p_handle in lock_handles.handle%type)
        return lock_handles.handle%type
    is
        l_nr  lock_handles.nr%type;
    begin
        select lh.nr
          into l_nr
          from lock_handles lh
         where lh.handle = p_handle
           for update nowait;
        return p_handle;
    end get_lock;
  begin
     for i in 1..4
     loop
        begin
           return get_lock ('LOCK'||to_char (i));
        exception
        when e_locked
        then
          continue;
        end;
      end loop;
      -- wait for a little bit before attempting 
      -- to get the lock again by recursively 
      -- calling this function
      sys.dbms_session.sleep (10);
     return mypkg.get_lock;
  end get_lock;
When calling this function, at first it will attempt to lock the row in the table with handle "LOCK1". Instead of waiting (FOR UPDATE NOWAIT) when that row is locked, an exception is raised. With the Pragma Exception Init, the exception is named E_LOCKED and thus can be handled by name.
When the exception is raised, another session has a lock on that row, it will attempt to lock the row in the table with handle "LOCK2", and so on.
If no lock can be obtained (an exception is raised on each attempt), the function will wait for a little bit (10 seconds) and then call itself recursively to start all over again.
Let's say a certain call obtains a lock on a row; the calling program can then continue and do the REST call.
When the REST Call is completed, the lock on the table is released by ending the transaction by issuing a commit.
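Put together, the callback code that processes a queue message roughly follows this pattern (a sketch; the REST call itself and the queue plumbing are left out):

declare
   l_handle lock_handles.handle%type;
begin
   -- blocks (with retries) until one of the four rows is locked
   l_handle := mypkg.get_lock;
   -- construct the JSON and do the REST call here,
   -- for example with apex_web_service.make_rest_request
   commit;  -- end the transaction, which releases the row lock
end;
/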

Does it work?

So far the theory, but does it work? Which is of course a valid question, and I was shocked by my first tests. I tried processing a mere 15 orders and to my surprise I discovered that only the first lock handle, LOCK1, was used.
It dawned on me that such a queue could easily be handled by a single session, so the callback would just serialize the processing.
Increasing the load on the Queue did indeed show that all lock handles were used. Also nice to see is that after a certain time the usage of the lock handles decreases again. It scales to whatever Oracle deems a good method of processing the Queue.

Could it be handled differently? Yes, probably and I'm open for suggestions.
Could the code be improved? Yes, probably and I'm open for suggestions.

Doh!

Just as I was finishing up this blogpost, Connor McDonald chimed in on Twitter.
Well, yes Connor, yes, it certainly does...
The rewritten method is as follows:
   function get_lock
      return varchar2
   is
      l_lock integer;
   begin
      loop
         for i in 1..4 -- max number of concurrent sessions
         loop
            l_lock := dbms_lock.request (id                => to_number ('223344'||to_char (i)) -- arbitrary locknr
                                        ,lockmode          => dbms_lock.x_mode -- exclusive mode
                                        ,timeout           => 0 -- don't wait to obtain a lock
                                        ,release_on_commit => true -- release when transaction ends
                                        );
            if l_lock = 1 -- Timeout
            then
               continue; -- try the next lock
            else
               return 'LOCK'||to_char (i);
            end if;
         end loop;
         -- A variation to the suggestion that
         -- Connor made in the comments.
         -- I've added an extra pause here
         -- before attempting to get another lock
         sys.dbms_session.sleep (10);
      end loop;
   end get_lock;
No need for the extra table, and the code is a lot shorter.

26 November 2019

Human readable JSON, stored in BLOB

Currently Oracle (we're using Oracle 18c at the moment) still recommends storing your JSON documents in a BLOB column: more efficient, no character set conversions, and probably some more advantages.

When there is a need to look at the JSON document, having it as a BLOB is not very useful.

An easy way to make the JSON Blob readable is to use the following query:

select json_query (i.json_payload, '$' returning clob pretty)
  from imports i
Omitting "Pretty" will output the JSON object without formatting. Including Pretty makes it look... well... pretty.
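On Oracle 19c and later, JSON_SERIALIZE can be used for the same purpose; a sketch (not tested on this 18c system):

select json_serialize (i.json_payload returning clob pretty)
  from imports i;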

10 October 2019

OGB Appreciation Day: APEX_DATA_PARSER: Flexible, Powerful, Awesome (#ThanksOGB #OrclAPEX)

For this year's OGB Appreciation Day I wanted to highlight the very awesome APEX_DATA_PARSER.

At my current project one of the requirements is that CSV-files are uploaded by the user and the application has to figure out what type of file it is and process it accordingly.
So the challenge with this is:

  • Which type of file is it?
  • What delimiter was used?

Uploading the file is straightforward enough with Application Express (APEX) and out of scope for this blog. The files are stored in an APEX Collection before they are processed.

The original procedure to determine what type of file it was and how to process it was quite cumbersome.
The file was parsed and the first line extracted. That string, which effectively was the header line of the CSV file, was compared with a predefined string, and based on this comparison it was determined what type of file it was.
The problem was that sometimes the header line would look like this:

      DATE_,PAYMENT_METHOD,CURRENCY,AMOUNT,STATUS,ID,DESCRIPTION
         
and sometimes it would like this:
      "DATE_","PAYMENT_METHOD","CURRENCY","AMOUNT","STATUS","ID","DESCRIPTION"
         
or maybe even sometime like this:
      "DATE_";"PAYMENT_METHOD";"CURRENCY";"AMOUNT";"STATUS";"ID";"DESCRIPTION"
         
The header line could be enclosed in quotes, or not... The delimiter was a comma, or not...
I've simplified the example; the real header lines were a lot longer.
You can imagine that parsing the rest of the file was quite tricky, as the file content could have commas as delimiter or semi-colons or ...
Because of all the variations the procedure was quite lengthy and error prone.

Simplifying with APEX_DATA_PARSER

With APEX_DATA_PARSER things got a lot cleaner, not to mention a lot faster. The example below will only give you the gist of the procedure.

First, the different header lines were declared as collections of type APEX_T_VARCHAR2.

         l_file1_header apex_t_varchar2 := apex_t_varchar2 ('DATE_'
                                                           ,'PAYMENT_METHOD'
                                                           ,'CURRENCY'
                                                           ,'AMOUNT'
                                                           ,'STATUS'
                                                           ,'DESCRIPTION'
                                                           );
         l_file2_header apex_t_varchar2 := apex_t_varchar2 ('ID'
                                                            ,'REF'
                                                            ,'ORDER_'
                                                            ,'STATUS'
                                                            ,'LIB'
                                                            );
         
Second we need to get the header line of the file that was uploaded through the APEX application.
         select column_name
           bulk collect
           into l_file_header
           from table (apex_data_parser.get_columns
                         (apex_data_parser.discover (p_content   => r_files.file_content
                                                    ,p_file_name => r_files.filename
                                                    )));
         
The APEX_DATA_PARSER package offers a function to discover information about the file. The function returns a file profile containing information about the file encoding, the file type, how many rows it parsed, what type of delimiter was used in case of CSV, etc. This function is aptly named DISCOVER.
The APEX_DATA_PARSER.GET_COLUMNS function uses the file profile created by the DISCOVER function to obtain information about the content: which columns are in the file, what their datatypes might be, what format mask was used, and in which order the columns appear.
With the simple SELECT statement above we know the header columns in the uploaded file.

We're almost there..

Now that the header columns are known, we can determine how to proceed from here with a straightforward CASE statement.
Why a CASE statement, you might ask? It gives an easy opportunity to raise an exception in case an unknown file is uploaded, i.e. it will raise a CASE_NOT_FOUND exception.
         case l_file_header
         when l_file1_header
         then
            process_file1;
         when l_file2_header
         then
            process_file2;
         end case;
         
The uploaded files are moved from the APEX collection to a regular heap table called ALL_UPLOADED_FILES (omitted from this blogpost).

To process the content of the file we can now do an INSERT-SELECT and drop the content into the destination table:
         insert into destination_tbl
            (date_
            ,payment_method
            ,currency
            ,amount
            ,status
            ,description
            )
         select to_date (col001, 'yyyy-mm-dd hh24:mi:ss')
               ,col002
               ,col003
               ,col005
               ,col006
               ,col007
           from all_uploaded_files fle
               ,table (apex_data_parser.parse (p_content   => fle.csv
                                              ,p_file_name => fle.csv_filename
                                              ,p_skip_rows => 1 -- skip the header line
                                              ))      
         

On a sidenote: we're using DropZone, so the file(s) are in an APEX collection which is read with the following query:
         select c001    as filename
               ,c002    as mime_type
               ,d001    as date_created
               ,n001    as file_id
               ,blob001 as file_content
           from apex_collections
          where collection_name = 'DROPZONE_UPLOAD'
         

A big thank you to the APEX team for implementing this awesome functionality! Thanks ODC!

11 September 2018

JSON, BLOB column and Check Constraint

Last week I attended Neil Chandler's session on JSON during the POUG conference in beautiful Sopot, Poland.
The JSON Developer's Guide recommends using BLOB for storing data, and this is what Neil also recommended.
I was under the (FALSE!) impression that it was not possible to put an IS JSON check constraint on a BLOB column, simply because I tried once and got an exception. After the session I asked Neil if this was the case, but (of course) he said that it was possible.
"Trust, but verify", so here is my test case.

My tests are done on a 12.1.0.2.0 release of the Oracle database, as can be seen in the output below.

SQL> select banner
  2    from v$version
  3  /

BANNER
--------------------------------------------------------------------------------
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
PL/SQL Release 12.1.0.2.0 - Production
CORE    12.1.0.2.0      Production
TNS for Solaris: Version 12.1.0.2.0 - Production
NLSRTL Version 12.1.0.2.0 - Production

First of all create a table with a BLOB column to hold the JSON data.

SQL> create table t
  2  (json_data blob)
  3  /

Table T created.

The easiest way of placing an IS JSON check constraint on the column is to use the following statement:

SQL> alter table t add constraint json_chk check (json_data is json)
  2  /

Error starting at line : 1 in command -
alter table t add constraint json_chk check (json_data is json)
Error report -
ORA-40499: no format defined for binary data type
Because it failed, I was under the assumption that it is not possible to place an IS JSON check constraint on the column.
Turns out that my check constraint definition was not complete (guess I should have read the exception and the documentation more closely).

When you place the check constraint in the following manner on the column, the IS JSON constraint is added.

SQL> alter table t add constraint json_chk check (json_data is json format json)
   2  /
 
 Table T altered.
In later versions of the database (I've also tried 12.2 and 18c) it is possible to use the former syntax; there is no need to use "FORMAT JSON" in the check constraint definition.

Let's see the BLOB column with the check constraint in action.
First attempt to insert a valid JSON-string into the table:

   SQL> insert into t values ('{"this":"that"}' )
      2  /
    
    Error starting at line : 1 in command -
    insert into t values ('{"this":"that"}' )
    Error report -
    ORA-01465: invalid hex number
Even though the string is valid JSON, it is not a binary representation. For this to work UTL_RAW is required:
      SQL> insert into t values (utl_raw.cast_to_raw('{"this":"that"}') )
      2  /
    
    1 row inserted.
Trying to create an entry in the table which is not JSON will result in a violation of the JSON check constraint.
      SQL> insert into t values (utl_raw.cast_to_raw('just a string') )
      2  /
    
    Error starting at line : 1 in command -
    insert into t values (utl_raw.cast_to_raw('just a string') )
    Error report -
    ORA-02290: check constraint (ALEX.JSON_CHK) violated
When the value that you're trying to insert is not a binary value, it will always fail with "ORA-01465: invalid hex number":
      SQL> insert into t values ('just a string' )
      2  /
    
    Error starting at line : 1 in command -
    insert into t values ('just a string' )
    Error report -
    ORA-01465: invalid hex number

One of the benefits of placing an IS JSON check constraint on the column is that you can use a "dot-notation" to query out the data:

      SQL> select tst.json_data.this
      2    from t tst
      3  /
    
    THIS
    --------------------------------------------------------------------------------
    that
Keep in mind that you must use a table alias for this to work. Omitting the table alias will result in an error.
      SQL> select json_data.this
      2    from t tst
      3  /
    
    Error starting at line : 1 in command -
    select json_data.this
      from t tst
    
    Error at Command Line : 1 Column : 8
    Error report -
     SQL Error: ORA-00904: "JSON_DATA"."THIS": invalid identifier

09 January 2018

JSON_TABLE and the Top 2000

Oracle database 12c provides native JSON parsing. It is relatively easy to go from data in a JSON format to a relational representation.
In this example I will use data from the Top 2000. Every year in The Netherlands a playlist is produced by popular vote and fully broadcast between Christmas and the last day of the year, with the number one being played just before the New Year. You can see the complete list of 2017 on the official NPO website.
The Top 2000 list that I'll use for this example is the one from 2017.

The data in the JSON file is structured as follows; only the first part of the file, with the first two songs, is of interest here.
At the highest level there are three name-value pairs ("success", "message" and "messages") and an array named "data".
The "data" array contains another array with JSON objects containing information about the songs.
Each of these JSON objects contains name-value pairs with very short, non-descriptive names, like "s" and "a". You might guess what these names mean.
Even though the "data" attribute is a nested array, at the top level it is only one level deep.
The complete JSON-file can be downloaded by following this link.
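Based on that description, the beginning of the file looks roughly like this (the values below are made up for illustration only):

{
  "success": true,
  "message": "",
  "messages": [],
  "data": [
    [
      { "s": "Song title 1", "a": "Artist 1", "yr": 1975, "pos": 1, "prv": 1 },
      { "s": "Song title 2", "a": "Artist 2", "yr": 1977, "pos": 2, "prv": 3 }
    ]
  ]
}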

First I want to get the JSON file into the database, before extracting the values that I need.
Create the table and a check constraint to make sure that the content is JSON.

create table temp2000 
(complete_list clob);

alter table temp2000
add constraint list_is_json check (complete_list is json)
/
This table will hold the complete JSON file containing the Top2000. The check constraint on the column will verify that only correctly formatted JSON is allowed.
Now add the JSON file to the table; the easiest way to do this is by adding a row using SQL Developer and copy-pasting the complete JSON.

What I want to extract from the complete JSON file is the information about the artist, songtitle, release year, current position in the list, as well as the position in the list from last year.
My guess would be that "a", "s", "yr", "pos" and "prv" are the attributes that I need.
With a CTAS (Create Table as SELECT) and the JSON_TABLE operator I can transform the JSON to relational representation.

create table top2000
as
select songtitle
      ,artist
      ,release_year
      ,curr_position
      ,prev_position
  from temp2000 t
      ,json_table (t.complete_list format json, '$.data[0][*]'
         columns (
           songtitle     varchar2(150) path '$.s'
          ,artist        varchar2(150) path '$.a'
          ,release_year  number path '$.yr'
          ,curr_position number path '$.pos'
          ,prev_position number path '$.prv'
         )
      )
/
Because the song information is contained in the "data" array, and only in the nested array, I need to address that array as follows:
'$.data[0][*]'
Zero (in square brackets) representing the first array in the "data" attribute, and a wildcard (the asterisk) to address the other objects in the nested array.

To help with the discovery of the structure (and the paths to the values) of the JSON, Oracle Database 12c Release 2 introduced the JSON_DATAGUIDE function. Unfortunately I don't have Release 2 at my disposal right now, so I leave that for a later time.


10 October 2017

ODC Appreciation Day: EMP #ThanksODC

Here is my very short entry for the Oracle Developer Community Appreciation Day 2017.

Very often I use the good ol' EMP table to demonstrate even the latest features of SQL and PL/SQL.
Everybody seems to know the EMP table, and some even know some of the content off the top of their head (yes, I'm guilty of that too). Whenever I need to write a hierarchical query and am not really sure which column goes on which side of the CONNECT BY clause, I will use EMP to test and verify my assumptions. Something I did just this afternoon.
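For the record, that sanity check looks something like this (KING, having no manager, is the root of the hierarchy):

select lpad (' ', 2 * (level - 1)) || ename as ename
      ,empno
      ,mgr
  from emp
 start with mgr is null
connect by prior empno = mgr;  -- a child's MGR points to its parent's EMPNO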

I found this old screenshot showing that the EMP table was around in Oracle database version 4.1.1.. history doesn't tell whatever happened to Carter though...

Update

Apparently I'm not the only one who's a fan of the EMP, there is even a store and a museum...

28 March 2017

Good old BIN_TO_NUM to check the overall status

A good while ago Chris Saxon, member of the AskTom answer team, asked on twitter which datatype you use when defining tables when you need a Boolean-representation. As you might know there is no Boolean datatype in SQL.
A lot of discussion followed which I'm not going to repeat.
Usually I use a VARCHAR2(1) with a check constraint for Y and N, but for a recent requirement I decided to use a NUMBER instead.

The requirement that I needed to implement was the following:

A number of tasks need to be fulfilled, the order doesn't matter, and they need to be checked by a supervisor. Before the supervisor gives his/her stamp of approval, the tasks all need to be completed.
For this example I will leave the authorization out, it is irrelevant.
create table tasks
(task_date       date   not null
,task_a          number not null check (task_a in (0,1))
,task_b          number not null check (task_b in (0,1))
,task_c          number not null check (task_c in (0,1))
,tasks_checked   number not null check (tasks_checked in (0,1))
)
/
The table contains a date for which the tasks need to be completed, indicators for the individual tasks (A,B, and C) and the Tasks Checked indicator.
All indicators are NOT NULL, of datatype NUMBER, and only allowed to contain a zero or a one. A zero indicates FALSE, a one indicates TRUE.

The requirement states that the verification of the tasks should not take place before the tasks are done. There are several ways to implement this requirement, and this time I'm going to use the combination of zeros and ones to create a binary representation.
I will add the overall status as a virtual column to the table using the built-in function BIN_TO_NUM.

alter table tasks
add (bin_status as (bin_to_num (task_a, task_b, task_c, tasks_checked)))
/
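BIN_TO_NUM treats its first argument as the most significant bit, so TASKS_CHECKED ends up as the least significant bit. A quick sanity check of the values this produces:

select bin_to_num (1,1,1,0) as all_done_not_checked   -- 14
      ,bin_to_num (1,1,1,1) as all_done_and_checked   -- 15
      ,bin_to_num (0,1,0,1) as checked_too_early      -- 5, an odd number
  from dual;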

Now the (not so) magic part:
When the binary status equals 14, all tasks are completed and can be checked by a supervisor.
When the binary status is an odd number, the checking of the tasks is done before the tasks are completed and this is not allowed. The only exception is when the binary status equals fifteen (15), then all tasks are done and it is checked by a supervisor.
When the binary status is an even number, all is good.
All this can easily be captured in a CHECK constraint:

alter table tasks
add constraint chk check (
   case when bin_status = 15 then 1
        when mod (bin_status, 2) = 0 then 1 
        else 0 end = 1
)
/

Finally two inserts to show that it works as expected.

insert into tasks
   (task_date
   ,task_a
   ,task_b
   ,task_c
   ,tasks_checked   
   )
values
   (sysdate
   ,0
   ,0
   ,1
   ,0
   );

1 row created.

insert into tasks
   (task_date
   ,task_a
   ,task_b
   ,task_c
   ,tasks_checked   
   )
values
   (sysdate
   ,1
   ,0
   ,1
   ,1
   );

insert into tasks
*
ERROR at line 1:
ORA-02290: check constraint (ALEX.CHK) violated

For the first time I've used LiveSQL to create the scripts that go along this blogpost, you can find that right here.


11 October 2016

OTN Appreciation Day: Analytic Functions

This is my contribution to the OTN Appreciation Day, which was initiated by Tim Hall.

One of my favorite features of the Oracle Database are Analytic Functions. They were introduced with Oracle Database 8.1.6 Enterprise Edition, and have been in the Standard Edition since version 9.

With analytic functions you can do inter-row calculations, aggregate over multiple dimensions, and assign ranks based on values. All this without a GROUP BY clause.
The syntax might take some getting used to, but you'll find that it's not that hard.
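As a small taste, a query on the classic EMP table can show every salary next to its department total and an overall rank, without any GROUP BY (a sketch):

select ename
      ,deptno
      ,sal
      ,sum (sal) over (partition by deptno) as dept_total   -- aggregate per department, repeated on every row
      ,rank () over (order by sal desc)     as overall_rank -- ranking over the whole result set
  from emp;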

While I was still working at AMIS, I followed their 7UP training (all new features starting from Oracle 7 onwards) and that's when I first learned about Analytic Functions.
A little bit after that training, while doing consulting work for T-Mobile, I was prompted by a colleague to come up with an analytic function solution to the problem at hand. That was the moment that I really fell in love with Analytic Functions. It provided a very elegant and extremely performant solution.
The blog I wrote back then is still available at the AMIS-site: Analytic Power

#ThanksOTN

26 September 2016

Celebrate Batman Day with SQL

Because today is Batman Day, it calls for a celebration of the SQL type:
SQL> select listagg (b)
  2        within group (order by null) ||' Batman'
  3    from (select 0f/0 b
  4        from dual
  5      connect by level <= 14
  6      );

LISTAGG(B)WITHINGROUP(ORDERBYNULL)||'BATMAN'
--------------------------------------------------------------------------------
NanNanNanNanNanNanNanNanNanNanNanNanNanNan Batman

01 September 2016

RegExp: Constraint to prevent spaces at the beginning or end.

Even though a space is a regular character, the client didn't want spaces at the beginning or end of a string. Any spaces in the middle were fine.
Of course this could be handled by the application, but it must also be implemented in the database. Using a check constraint with a regular expression will prevent the end user from entering unwanted data.

To try things out, let's just start with a simple table with a single column.

   create table test
   (str varchar2(10));

Now the magic part: add a check constraint using a regular expression

   alter table test
   add constraint str_chk check (
      regexp_like (str, '^[^[:space:]].+[^[:space:]]$')
    );

The regular expression reads: the string should start (the first caret) with any character which is not in the character class [:space:], followed by one or more characters (the period and plus sign), and it should end (the dollar sign) with any character as long as it's not in the character class [:space:].

UPDATE - 07-Sept-2016

The expression used in the above constraint will also prevent a single allowed character, or a string of just two characters, from being entered. This omission has been corrected by David Grimberg. The correct regular expression is:

^[^[:space:]](.*[^[:space:]])?$
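
Applied to the example, the constraint would be recreated along these lines (a sketch):

   alter table test drop constraint str_chk;

   alter table test
   add constraint str_chk check (
      regexp_like (str, '^[^[:space:]](.*[^[:space:]])?$')
    );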

To test the constraint, the following insert statements were used.

   insert into test values ('hello');
   insert into test values ('hel lo');
   -- Not allowed:
   --   starting with a space
   insert into test values (' hello');
   --   ending with a space
   insert into test values ('hello ');
   --   just a space
   insert into test values (' ');
   --   multiple spaces
   insert into test values ('   ');
   --   Tab
   insert into test values (chr(9));
   --   Line feed
   insert into test values (chr(10));
   --   Carrige Return
   insert into test values (chr(13));

31 May 2016

Top N- queries: using the 12c syntax.

One of the new features in Oracle Database 12c is the new syntax for Top N queries and pagination. Did we really need this? Should you choose the new syntax over the way we used to do it, with an inline view? I think so: it simply adds syntactic clarity to the query, and in this blogpost I will show the difference between the "old" and the "new".

For the examples I will use my all time favourite demo data: the EMP table.

SQL> select ename
  2        ,sal
  3    from emp
  4   order by sal desc
  5  /

ENAME             SAL
---------- ----------
KING             5000
FORD             3000
SCOTT            3000
JONES            2975
BLAKE            2850
CLARK            2450
ALLEN            1600
TURNER           1500
MILLER           1300
WARD             1250
Widlake          1250
ADAMS            1100
JAMES             950
SMITH             800

14 rows selected.
As you can tell from the output above, KING has the highest salary and FORD and SCOTT have the same salary which is the second highest.

If you wanted to write a Top N query before Oracle database 12c, let's say the Top 2 of most earning EMP, you would probably have written something with an inline view.

SQL> select ename
  2        ,sal
  3    from (select ename
  4                ,sal
  5            from emp
  6           order by sal desc
  7         )
  8   where rownum <= 2
  9  /

ENAME             SAL
---------- ----------
KING             5000
SCOTT            3000
This query might do the job most of the time, but the results might not be what you were looking for. In this case the requirement is "the Top 2 of most earning EMP", and SCOTT and FORD should both have been in the results as they have the same salary.
To resolve this, you would have to rewrite your query using an Analytic Ranking Function in the inline view:
SQL> select ename
  2        ,sal
  3    from (select ename
  4                ,sal
  5                ,rank() over (order by sal desc) rn
  6            from emp
  7         )
  8   where rn <= 2
  9  /

ENAME             SAL
---------- ----------
KING             5000
SCOTT            3000
FORD             3000

Using the new Top N syntax in Oracle database 12c, the query is a lot easier to understand.

SQL> select ename
  2        ,sal
  3    from emp
  4   order by sal desc
  5   fetch first 2 rows with ties
  6  /

ENAME             SAL
---------- ----------
KING             5000
SCOTT            3000
FORD             3000
On line 4 the results are sorted based on the salary (highest on top) and line 5 instructs to get the first two rows. Because of the addition "with ties" both SCOTT and FORD are shown in the results.
To see what happens under the covers, an explain plan is created for this query.
SQL> explain plan for
  2  select ename
  3        ,sal
  4    from emp
  5   order by sal desc
  6   fetch first 2 rows with ties
  7  /

Explained.

SQL> select *
  2    from table (dbms_xplan.display())
  3  /

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3291446077

---------------------------------------------------------------------------------
| Id  | Operation                | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |      |    14 |   644 |     4  (25)| 00:00:01 |
|*  1 |  VIEW                    |      |    14 |   644 |     4  (25)| 00:00:01 |
|*  2 |   WINDOW SORT PUSHED RANK|      |    14 |   154 |     4  (25)| 00:00:01 |
|   3 |    TABLE ACCESS FULL     | EMP  |    14 |   154 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("from$_subquery$_002"."rowlimit_$$_rank"<=2)
   2 - filter(RANK() OVER ( ORDER BY INTERNAL_FUNCTION("SAL") DESC )<=2)

16 rows selected.
The output above shows that the Analytic Function RANK is used.
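The same FETCH syntax also covers pagination; a sketch that skips the first two rows and returns the next three:

select ename
      ,sal
  from emp
 order by sal desc
offset 2 rows fetch next 3 rows only;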

18 May 2016

Rounding amounts, divide cents over multiple lines

In previous articles I wrote about dealing with a missing cent when you need to divide a certain amount over multiple lines. In these articles (links are at the bottom) I described a method to put the difference on the last row.
Then a question arose (as a comment):
What if for example i have 42 records and i wish to divide 100 by 42. I would get a rounded value of 2.38. If i multiply this by 42 it would amount to just 99.96. What if want to spread the .04 difference on 4 records, not just the last record. In effect i'll be having 4 records with 2.39. Right now i'm doing this via cursor. Im kinda hoping i can do this using sql or analytic functions
Let's create a table first, called T:
create table t
as
select rownum id
  from dual
 connect by level <= 42;
In order to determine the number of rows that get the extra cent, we need three pieces of information:
  1. The amount that we need to divide
  2. The total number of rows in the set
  3. The difference between the rounded amount and the amount that we need to divide
select id
      ,100 amount
      ,count(*) over () entries
      ,round (100 / count(*) over (), 2) rounded
  from t;
To calculate how many rows will have the extra cent added or subtracted, the following formula is used:
abs (amount - (entries * rounded)) * 100
In the example this is abs (100 - (42 * 2.38)) * 100 = 0.04 * 100 = 4, so four rows get the extra cent.
When you want the rounding done on the "first" rows, a simple CASE expression can be used to mark the rows
case 
when rownum <= abs ((amount - (entries * rounded)) * 100)
then 'x'
end as indicator
The query now looks like the following, with the first ten rows:
with amounts
as
( select id
      ,100 amount
      ,count(*) over () entries
      ,round (100 / count(*) over (), 2) rounded
    from t
)
select id
      ,amount
      ,entries
      ,rounded
      ,case 
       when rownum <= abs ((amount - (entries * rounded)) * 100)
       then 'x'
       end as indicator
 from amounts
;
        ID     AMOUNT    ENTRIES    ROUNDED F
---------- ---------- ---------- ---------- -
         1        100         42       2.38 x
         2        100         42       2.38 x
         3        100         42       2.38 x
         4        100         42       2.38 x
         5        100         42       2.38
         6        100         42       2.38
         7        100         42       2.38
         8        100         42       2.38
         9        100         42       2.38
        10        100         42       2.38
The last piece of the puzzle is to determine if we need to add or subtract the cent. Using the SIGN function is an easy way to determine this.
sign (amount - (entries * rounded)) as pos_neg
Putting everything together will give you the following query (with the first 10 rows)
with amounts
as
( select id
      ,100 amount
      ,count(*) over () entries
      ,round (100 / count(*) over (), 2) rounded
    from t
)
,indicators as 
(
select id
      ,amount
      ,entries
      ,rounded
      ,case 
       when rownum <= abs ((amount - (entries * rounded)) * 100)
       then 'x'
       end as indicator
      ,sign (amount - (entries * rounded)) as pos_neg
 from amounts
)
select id
      ,rounded +
       (nvl2 (indicator, 0.01, 0) * pos_neg) final_amount
  from indicators
;
        ID FINAL_AMOUNT
---------- ------------
         1         2.39
         2         2.39
         3         2.39
         4         2.39
         5         2.38
         6         2.38
         7         2.38
         8         2.38
         9         2.38
        10         2.38

And there it is. The rounding is divided over the first four rows.


Don't want to use the first rows, but the last rows instead? Use the following expression to set the indicator
case 
when rownum >
 case sign ((amount - (entries * rounded)))
   when -1 then entries - abs ((amount - (entries * rounded)) * 100)
   else entries - (amount - (entries * rounded)) * 100
   end
then 'x'
end as indicator

Links