25 March 2020

Taming The Queue - when the third party can only handle so many connections

We had a problem with the awesome power that the Oracle database offers... or to be more specific the third party had a problem with that awesome power.

Setting the stage

Our application handles many, many orders per day, mostly processed in waves during the day, meaning there are peaks and valleys in the number of orders that need to be processed. When the orders are processed they are sent to a third party application using a REST service. Informing the third party doesn't have to be "real time", as long as they get the information.
The way we set up our side of the application is as follows:

  • when the order has a certain status, enough information is placed on a Queue (Advanced Queueing)
  • A callback function on the Queue will construct the JSON and do the REST Call (using APEX_WEBSERVICE)
The advantage of this method is that the operational system can continue working, without having to wait for the REST service to complete. Informing the other system is done in the background.
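For context, a callback like this is wired up by registering a PL/SQL procedure against the queue's subscriber. The sketch below shows the general shape only; the queue, subscriber and package names are invented for illustration, not the actual ones from our system.

```sql
-- Hypothetical registration of a PL/SQL callback on an AQ queue.
-- All names (queue, subscriber, package, procedure) are made up.
begin
   dbms_aq.register (
      sys.aq$_reg_info_list (
         sys.aq$_reg_info (
            name      => 'MYSCHEMA.ORDER_QUEUE:ORDER_SUBSCRIBER'
           ,namespace => dbms_aq.namespace_aq
           ,callback  => 'plsql://MYSCHEMA.MYPKG.ORDER_CALLBACK'
           ,context   => hextoraw ('FF')
         )
      )
     ,1 -- number of entries in the list
   );
   commit;
end;
/
```

From that point on, the database invokes MYPKG.ORDER_CALLBACK in background sessions whenever a message arrives on the queue.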

The Problem

The advantage of having the database handle the Queue is that it "auto-scales". Because it runs in the background, it can spin up as many "sessions" as it wants to perform the callback function and clear the queue. If the system gets busy, it will scale down the processing of the Queue so that the operational system doesn't slow down to a crawl. Simply put: awesome.
However, the third party could only handle a limited number of connections, four to be precise. The consequence was that when the system spun up many sessions, a lot of them would get an error: "Concurrent request limit exceeded. Request blocked." and the orders had to be tried again.


As we knew that the third party could only handle four "sessions", we preferred to reduce the number of sessions that the database would spin up. However, there is (as far as I could tell) no way to tell Oracle "you can use at most four sessions to clear the Queue".
Reaching out on Twitter brought the suggestion to adjust the AQ_TM_PROCESSES parameter, but that parameter controls the time manager processes, not the number of callback "sessions".
Another suggestion was to tweak JOB_QUEUE_PROCESSES, but to be honest we didn't try that. There are other processes that utilize DBMS_SCHEDULER (e.g. for importing orders) that we didn't want to interfere with through this parameter.
Thanks to Vidar Eidissen and Philipp Salvisberg for your responses on Twitter.

Our Solution

The method that we implemented was as follows:

  1. Get an exclusive lock
  2. Construct the necessary JSON and do the REST Call
  3. Release the lock

To facilitate this method, I created a table which is just to coordinate the locks:

create table lock_handles
   (handle varchar2(10)
   ,nr     number
   ,constraint lh_pk primary key (handle)
   );
insert into lock_handles values ('LOCK1', 0);
insert into lock_handles values ('LOCK2', 0);
insert into lock_handles values ('LOCK3', 0);
insert into lock_handles values ('LOCK4', 0);

This table serves no purpose other than coordinating the locks; thinking about it a little more, I doubt that I even need the NR column.

Next, a function to get a lock. Of course this function is placed in a package (for the example, the name of the package is "MYPKG").

  function get_lock
    return lock_handles.handle%type
  is
    e_locked exception;
    pragma exception_init (e_locked, -54);
    --
    function get_lock (p_handle in lock_handles.handle%type)
      return lock_handles.handle%type
    is
      l_nr lock_handles.nr%type;
    begin
      select lh.nr
        into l_nr
        from lock_handles lh
       where lh.handle = p_handle
         for update nowait;
      return p_handle;
    end get_lock;
  begin
    for i in 1..4
    loop
      begin
        return get_lock ('LOCK'||to_char (i));
      exception
        when e_locked
        then
          null; -- row is locked by another session, try the next handle
      end;
    end loop;
    -- wait for a little bit before attempting
    -- to get the lock again by recursively
    -- calling this function
    sys.dbms_session.sleep (10);
    return mypkg.get_lock;
  end get_lock;
When calling this function, at first it will attempt to lock the row in the table with handle "LOCK1". Instead of waiting (FOR UPDATE NOWAIT) when that row is locked, an exception is raised. With the Pragma Exception Init, the exception is named E_LOCKED and thus can be handled by name.
When the exception is raised, another session has a lock on that row, so it will attempt to lock the row in the table with handle "LOCK2", and so on.
If no lock can be obtained and an exception is raised each time, the function waits for a little bit (10 seconds) and then calls itself recursively to start all over again.
Let's say that a certain function call obtains a lock on the row, the calling program can continue and do the REST call.
When the REST Call is completed, the lock on the table is released by ending the transaction by issuing a commit.
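Putting it together, the callback body is shaped roughly like this (the procedure layout and comments are a sketch, not the actual production code):

```sql
-- Sketch of how the callback uses get_lock; names are assumptions.
declare
   l_handle lock_handles.handle%type;
begin
   l_handle := mypkg.get_lock; -- blocks until one of the four rows is locked
   -- construct the JSON payload and do the REST call here,
   -- e.g. with apex_web_service.make_rest_request
   commit;                     -- end the transaction: releases the row lock
end;
/
```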

Does it work?

So far the theory, but does it work? Which is of course a valid question, and my first tests surprised me. I tried processing a mere 15 orders and discovered that only the first lock handle, LOCK1, was used.
It dawned on me that such a queue could easily be handled by a single session, so the callback would simply serialize the processing.
Increasing the load on the Queue did indeed show that all lock handles were used. It was also nice to see that after a certain time the usage of the lock handles decreased again. It scales to whatever Oracle deems a good method of processing the Queue.

Could it be handled differently? Yes, probably and I'm open for suggestions.
Could the code be improved? Yes, probably and I'm open for suggestions.


Just as I was finishing up this blogpost, Connor McDonald chimed in on Twitter.
Well, yes Connor, yes, it certainly does...
The rewritten method is as follows:

   function get_lock
      return varchar2
   is
      l_lock integer;
   begin
      loop
         for i in 1..4 -- max number of concurrent sessions
         loop
            l_lock := dbms_lock.request (id                => to_number ('223344'||to_char (i)) -- arbitrary locknr
                                        ,lockmode          => dbms_lock.x_mode -- exclusive mode
                                        ,timeout           => 0 -- don't wait to obtain a lock
                                        ,release_on_commit => true -- release when transaction ends
                                        );
            if l_lock = 1 -- Timeout
            then
               continue; -- try the next lock
            end if;
            return 'LOCK'||to_char (i);
         end loop;
         -- A variation to the suggestion that
         -- Connor made in the comments.
         -- I've added an extra pause here
         -- before attempting to get another lock
         sys.dbms_session.sleep (10);
      end loop;
   end get_lock;
No need for the extra table, and the code is a lot shorter.

26 November 2019

Human readable JSON, stored in BLOB

Currently Oracle (we're using Oracle 18c at the moment) still recommends storing your JSON documents in a BLOB column: it's more efficient, there are no character conversions, and probably some more advantages.

When there is a need to look at the JSON document, having it as a BLOB is not very useful.

An easy way to make the JSON Blob readable is to use the following query:

select json_query (i.json_payload, '$' returning clob pretty)
  from imports i
Omitting "Pretty" will output the JSON object without formatting. Including Pretty makes it look... well... pretty.
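If you want to try this without the IMPORTS table, a self-contained variation against DUAL works too. This is my own illustration (the JSON string is invented), relying on the same implicit handling of binary JSON input that the query above uses on 18c:

```sql
select json_query (utl_raw.cast_to_raw ('{"testing":"information"}')
                  ,'$' returning clob pretty) as readable
  from dual;
```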

10 October 2019

OGB Appreciation Day: APEX_DATA_PARSER: Flexible, Powerful, Awesome (#ThanksOGB #OrclAPEX)

For this year's OGB Appreciation Day I wanted to highlight the very awesome APEX_DATA_PARSER package.

At my current project, one of the requirements is that CSV files are uploaded by the user and the application has to figure out what type of file it is and process it accordingly.
So the challenge with this is:

  • Which type of file is it?
  • What delimiter was used?

Uploading the file is straightforward enough with Application Express (APEX) and out of scope for this blog. The files are stored in an APEX collection before they are processed.

The original procedure to determine what type of file it was, and how to process it, was quite cumbersome.
The file was parsed and the first line extracted. That string, which effectively was the header line of the CSV file, was compared with a predefined string, and based on this comparison it was determined what type of file it was.
The problem was that the header line varied: sometimes it was enclosed in quotes, sometimes not; sometimes the delimiter was a comma, sometimes a semicolon.
I've simplified the example; the real header lines were a lot longer.
You can imagine that parsing the rest of the file was quite tricky, as the file content could have commas as delimiter, or semicolons, or ...
Because of all the variations the procedure was quite lengthy and error prone.
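To give an idea, the variations looked roughly like this (the column names beyond the leading DATE_ and ID are invented for illustration):

```
DATE_,ID,NAME
"DATE_","ID","NAME"
DATE_;ID;NAME
```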

Simplifying with APEX_DATA_PARSER

With APEX_DATA_PARSER things got a lot cleaner, not to mention a lot faster. The example below will only give you the gist of the procedure.

First, the different header lines were declared as variables of datatype APEX_T_VARCHAR2 (the column lists are truncated here):

         l_file1_header apex_t_varchar2 := apex_t_varchar2 ('DATE_', ...);
         l_file2_header apex_t_varchar2 := apex_t_varchar2 ('ID', ...);
Second, we need to get the header line of the file that was uploaded through the APEX application.
         select column_name
           bulk collect
           into l_file_header
           from table (apex_data_parser.get_columns
                         (apex_data_parser.discover (p_content   => r_files.file_content
                                                    ,p_file_name => r_files.filename
                                                    )));
APEX_DATA_PARSER offers a function to discover information about the file. This function, aptly named DISCOVER, returns a file profile containing information about the file encoding, the file type, how many rows it parsed, what type of delimiter is used in case of CSV, etc. etc.
The APEX_DATA_PARSER.GET_COLUMNS function uses the file profile created by the DISCOVER function to obtain information about the content: which columns are in the file, what their datatypes might be, which format mask was used, and in which order the columns appear.
With the simple SELECT statement above we know the header columns in the uploaded file.

We're almost there..

Now that the header columns are known, we can determine how to proceed from here with a straightforward CASE statement.
Why a CASE statement, you might ask? It gives an easy opportunity to raise an exception in case an unknown file is uploaded, i.e. it will raise a CASE_NOT_FOUND exception.
         case l_file_header
         when l_file1_header
         then
            ... -- process the first type of file
         when l_file2_header
         then
            ... -- process the second type of file
         end case;
The uploaded files are moved from the APEX collection to a regular heap table called ALL_UPLOADED_FILES (omitted from this blogpost).

To process the content of the file we can now do an INSERT-SELECT and drop the content into the destination table:
         insert into destination_tbl
         select to_date (col001, 'yyyy-mm-dd hh24:mi:ss')
               ,... -- remaining columns omitted
           from all_uploaded_files fle
               ,table (apex_data_parser.parse (p_content   => fle.csv
                                              ,p_file_name => fle.csv_filename
                                              ,p_skip_rows => 1 -- skip the header line
                                              ));

On a sidenote: we're using DropZone, so the file(s) are in an APEX collection which is read with the following query:
         select c001    as filename
               ,c002    as mime_type
               ,d001    as date_created
               ,n001    as file_id
               ,blob001 as file_content
           from apex_collections
          where collection_name = 'DROPZONE_UPLOAD'

A big thank you to the APEX team for implementing this awesome functionality! Thanks ODC!

12 March 2019

The elegant NUMTODSINTERVAL to transform numbers

Data from an external source is loaded into a database table.
The data arrives as a CSV file and contains three columns representing the date, the hours and the minutes. The CSV file is loaded into a staging table with three columns:

  • one for the date, datatype: DATE
  • one for the hours, datatype NUMBER
  • one for the minutes, datatype NUMBER

When loading the information into the destination table, there is only one column to hold this data. The three staging columns should be transformed into a "real" date, meaning a DATE datatype as we know it.

One way is to concatenate all the columns together and apply the TO_DATE function with the appropriate format mask. With this method you need to take care of leading zeros for the hours and minutes.
This method would result in something like the following:

to_date( to_char( e.csv_date, 'DD-MM-YYYY') || ' ' || to_char( lpad( e.hour, 2, '0')) || ':' || to_char( lpad( e.minute, 2, '0')), 'DD-MM-YYYY HH24:MI') as clock_time

As you can imagine, it was only after exceptions occurred that we realized the hours and minutes should be left padded with zeros.
There is another, easier, way to transform the three columns into a single DATE column:

e.csv_date + numtodsinterval (e.hour, 'HOUR') + numtodsinterval (e.minute, 'MINUTE') as clock_time
Using the NUMTODSINTERVAL function for the hours and the minutes makes the transformation trivial. No need to worry about leading zeros (or lack thereof).
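As a quick illustration (with invented values), the interval arithmetic can be checked against DUAL:

```sql
-- 09:05 on 25 March 2020, built from separate hour and minute numbers
select to_char (date '2020-03-25'
                + numtodsinterval (9, 'HOUR')
                + numtodsinterval (5, 'MINUTE')
               ,'yyyy-mm-dd hh24:mi:ss') as clock_time
  from dual;
-- 2020-03-25 09:05:00
```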

NUMTODSINTERVAL is a great little function that makes code so much more elegant.

11 October 2018

ODC Appreciation Day - Comparing JSON, they thought of everything...

Oracle 12c introduced JSON in the database and one of the most recent additions is a function to compare JSON, JSON_EQUAL.

From the documentation:

Oracle recommends that you use BLOB storage.
When I first learned about this recommendation it surprised me. A BLOB doesn't even remotely look like JSON, so how does JSON_EQUAL deal with a comparison between a BLOB and a string?
I mean, how does a chunk of binary data compare to a plain JSON string in text form?
Let's try and find out.

create table t  
   (json_data blob);

insert into t values ( 
   utl_raw.cast_to_raw ('{"testing":"information"}'));

select case  
       when json_equal (json_data, '{"testing":"information"}') 
       then 'same' else 'different' 
       end as result 
  from t;
-- result: same

select case  
       when json_equal (json_data, '{"testing":"different"}') 
       then 'same' else 'different' 
       end as result 
  from t;
-- result: different

How cool is that?
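Worth knowing (my own aside, not part of the original test case): JSON_EQUAL compares the documents as JSON values, so member order and insignificant whitespace don't affect the outcome:

```sql
select case
       when json_equal ('{"a":1, "b":2}', '{"b":2,"a":1}')
       then 'same' else 'different'
       end as result
  from dual;
-- result: same
```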

Later I learned from Chris Saxon about the reason behind the BLOB recommendation: it avoids character set conversions!
So, there are no excuses for not using BLOB as your default choice when storing JSON documents.

A big thank you to the team that implemented JSON in the database and for thinking of everything! Thanks ODC!


11 September 2018

JSON, BLOB column and Check Constraint

Last week I attended Neil Chandler's session on JSON during the POUG conference in beautiful Sopot, Poland.
The JSON Developer's Guide recommends using BLOB for storing data, and this is what Neil also recommended.
I was under the (false!) impression that it was not possible to put an IS JSON check constraint on a BLOB column, simply because I had tried it once and got an exception. After the session I asked Neil if this was the case, but (of course) he said it was possible.
"Trust, but verify", so here is my test case.

My tests are done on a release of the Oracle database, as can be seen in the output below.

SQL> select banner
  2    from v$version
  3  /

Oracle Database 12c Enterprise Edition Release - 64bit Production
PL/SQL Release - Production
CORE      Production
TNS for Solaris: Version - Production
NLSRTL Version - Production

First of all create a table with a BLOB column to hold the JSON data.

SQL> create table t
  2  (json_data blob)
  3  /

Table T created.

The easiest way to place an IS JSON check constraint on the column is the following statement:

SQL> alter table t add constraint json_chk check (json_data is json)
  2  /

Error starting at line : 1 in command -
alter table t add constraint json_chk check (json_data is json)
Error report -
ORA-40499: no format defined for binary data type
Because it failed, I assumed it was not possible to place an IS JSON check constraint on the column.
Turns out my check constraint definition was incomplete (I guess I should have read the exception and the documentation more closely).

When you place the check constraint on the column in the following manner, the IS JSON constraint is added.

SQL> alter table t add constraint json_chk check (json_data is json format json)
  2  /

Table T altered.
In later versions of the database (I've also tried 12.2 and 18c) it is possible to use the former syntax; there is no need for "FORMAT JSON" in the check constraint definition.

Let's see the BLOB column with the check constraint in action.
First attempt to insert a valid JSON-string into the table:

   SQL> insert into t values ('{"this":"that"}' )
      2  /
    Error starting at line : 1 in command -
    insert into t values ('{"this":"that"}' )
    Error report -
    ORA-01465: invalid hex number.
Even though the string is valid JSON, it is not a binary representation. For this to work UTL_RAW is required:
      SQL> insert into t values (utl_raw.cast_to_raw('{"this":"that"}') )
      2  /
    1 row inserted.
Trying to create an entry in the table which is not JSON will result in a violation of the JSON check constraint.
      SQL> insert into t values (utl_raw.cast_to_raw('just a string') )
      2  /
    Error starting at line : 1 in command -
    insert into t values (utl_raw.cast_to_raw('just a string') )
    Error report -
    ORA-02290: check constraint (ALEX.JSON_CHK) violated.
When the value that you're trying to insert is not a binary value, it will always fail with "ORA-01465: invalid hex number":
      SQL> insert into t values ('just a string' )
      2  /
    Error starting at line : 1 in command -
    insert into t values ('just a string' )
    Error report -
    ORA-01465: invalid hex number.

One of the benefits of placing an IS JSON check constraint on the column is that you can use a "dot-notation" to query out the data:

      SQL> select tst.json_data.this
      2    from t tst
      3  /
Keep in mind that you must use a table alias for this to work. Omitting the table alias will result in an error.
      SQL> select json_data.this
      2    from t tst
      3  /
    Error starting at line : 1 in command -
    select json_data.this
      from t tst
    Error at Command Line : 1 Column : 8
    Error report -
    SQL Error: ORA-00904: "JSON_DATA"."THIS": invalid identifier

09 January 2018

JSON_TABLE and the Top 2000

Oracle database 12c provides native JSON parsing. It is relatively easy to go from data in a JSON format to a relational representation.
In this example I will use data from the Top 2000. Every year in The Netherlands a playlist is produced by popular vote and fully broadcast between Christmas and the last day of the year, with the number one being played just before the New Year. You can see the complete list of 2017 on the official NPO website.
The Top 2000 list that I'll use for this example is the one from 2017.

The data from the JSON file looks like this:

Only the first part of the JSON file is shown, with the first two songs.
At the highest level there are three name-value pairs ("success", "message" and "messages") and an array named "data".
The "data" array contains another array with JSON objects containing information about the songs.
Each of these JSON objects contains name-value pairs with very short, non-descriptive names, like "s" and "a". You might guess what these names mean.
Even though the "data" attribute is a nested array, at the top level it is only one level deep.
The complete JSON-file can be downloaded by following this link.
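A sketch of that structure, with values invented for illustration (only the attributes used later, "s", "a", "yr", "pos" and "prv", are shown):

```json
{
  "success": true,
  "message": "",
  "messages": [],
  "data": [
    [
      { "s": "Song title 1", "a": "Artist 1", "yr": 1975, "pos": 1, "prv": 1 },
      { "s": "Song title 2", "a": "Artist 2", "yr": 1977, "pos": 2, "prv": 3 }
    ]
  ]
}
```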

First I want to get the JSON file into the database, before extracting the values that I need.
Create the table and a check constraint to make sure that the content is JSON.

create table temp2000 
(complete_list clob);

alter table temp2000
add constraint list_is_json check (complete_list is json)
This table will hold the complete JSON file containing the Top2000. The check constraint on the column will verify that only correctly formatted JSON is allowed.
Now add the JSON-file to the table, the easiest way to do this is by adding a row using SQL Developer and copy-paste the complete JSON.

What I want to extract from the complete JSON file is the information about the artist, songtitle, release year, current position in the list, as well as the position in the list from last year.
My guess would be that the "a", "s", "yr", "pos", "prv" are the attributes that I need.
With a CTAS (Create Table as SELECT) and the JSON_TABLE operator I can transform the JSON to relational representation.

create table top2000
as
select songtitle
      ,artist
      ,release_year
      ,curr_position
      ,prev_position
  from temp2000 t
      ,json_table (t.complete_list format json, '$.data[0][*]'
         columns (
           songtitle     varchar2(150) path '$.s'
          ,artist        varchar2(150) path '$.a'
          ,release_year  number path '$.yr'
          ,curr_position number path '$.pos'
          ,prev_position number path '$.prv'
         ));
Because the song information is contained in the nested array inside "data", and only in the nested array, I need to address it as '$.data[0][*]':
zero (in square brackets) representing the first array in the "data" attribute, and a wildcard (the asterisk) to address the objects in the nested array.

To help with the discovery of the structure (and the paths to the values) of the JSON, Oracle Database 12c Release 2 introduced the JSON_DATAGUIDE function. Unfortunately I don't have Release 2 at my disposal right now, so I'll leave that for another time.