11 October 2018

ODC Appreciation Day - Comparing JSON, they thought of everything...

Oracle 12c introduced JSON in the database and one of the most recent additions is a function to compare JSON, JSON_EQUAL.

From the documentation:

Oracle recommends that you use BLOB storage.
When I first learned about this recommendation it surprised me. Looking at a BLOB doesn't even remotely look like JSON, so how does JSON_EQUAL deal with a comparison between a BLOB and a string?
I mean how does this:
7B2274686973223A2274686174227D
compare to this:
{"testing":"information"}
Let's try and find out.

create table t  
   (json_data blob)
/
insert into t values ( 
   utl_raw.cast_to_raw ( 
      '{"testing":"information"}' 
   ) 
);
        
select case  
       when json_equal (json_data, '{"testing":"information"}') 
       then 'same' else 'different' 
       end as result 
  from t
/

RESULT
---------
same
        
select case  
       when json_equal (json_data, '{"testing":"different"}') 
       then 'same' else 'different' 
       end as result 
  from t
/

RESULT
----------
different
        
How cool is that?

Later I learned from Chris Saxon about the reason behind the BLOB recommendation: it avoids character set conversions!
So, there are no excuses for not using BLOB as your default choice when storing JSON documents.

A big thank you to the team that implemented JSON in the database and for thinking of everything! Thanks ODC!

Links

11 September 2018

JSON, BLOB column and Check Constraint

Last week I attended Neil Chandler's session on JSON during the POUG conference in beautiful Sopot, Poland.
The JSON Developer's Guide recommends using BLOB for storing data, and this is what Neil also recommended.
I was under the (FALSE!) impression that it was not possible to put an IS JSON check constraint on a BLOB column, simply because I tried once and got an exception. After the session I asked Neil if this was the case, but (of course) he said that it was possible.
"Trust, but verify", so here is my test case.

My tests are done on a 12.1.0.2.0 release of the Oracle database, as can be seen in the output below.

SQL> select banner
  2    from v$version
  3  /

BANNER
--------------------------------------------------------------------------------
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
PL/SQL Release 12.1.0.2.0 - Production
CORE    12.1.0.2.0      Production
TNS for Solaris: Version 12.1.0.2.0 - Production
NLSRTL Version 12.1.0.2.0 - Production

First of all create a table with a BLOB column to hold the JSON data.

SQL> create table t
  2  (json_data blob)
  3  /

Table T created.

The most easiest way of placing an IS JSON check constraint on the column is to use the following statement:

SQL> alter table t add constraint json_chk check (json_data is json)
  2  /

Error starting at line : 1 in command -
alter table t add constraint json_chk check (json_data is json)
Error report -
ORA-40499: no format defined for binary data type
Because it failed, I was under the assumption that it is not possible to place an IS JSON check constraint on the column.
Turns out that I was not complete in my check constraint definition (guess I should have read the exception and documentation more closely).

When you place the check constraint in the following manner on the column, the IS JSON constraint is added.

SQL> alter table t add constraint json_chk check (json_data is json format json)
   2  /
 
 Table T altered.
In later versions of the database (I've also tried on 12.2 and 18c) it is possible to use the former syntax, there is no need to use "FORMAT JSON" in the check constraint definition.

Let's see the BLOB column with the check constraint in action.
First attempt to insert a valid JSON-string into the table:

   SQL> insert into t values ('{"this":"that"}' )
      2  /
    
    Error starting at line : 1 in command -
    insert into t values ('{"this":"that"}' )
    Error report -
    ORA-01465: Ongeldig hexadecimaal getal.
Even though the string is valid JSON, it is not a binary representation. For this to work UTL_RAW is required:
      SQL> insert into t values (utl_raw.cast_to_raw('{"this":"that"}') )
      2  /
    
    1 row inserted.
Trying to create an entry in the table which is not JSON will result in a violation of the JSON check constraint.
      SQL> insert into t values (utl_raw.cast_to_raw('just a string') )
      2  /
    
    Error starting at line : 1 in command -
    insert into t values (utl_raw.cast_to_raw('just a string') )
    Error report -
    ORA-02290: CHECK-beperking (ALEX.JSON_CHK) is geschonden.
When the value that you're trying to insert is not a binary value, it will always fail with "ORA-01465: invalid hex number":
      SQL> insert into t values ('just a string' )
      2  /
    
    Error starting at line : 1 in command -
    insert into t values ('just a string' )
    Error report -
    ORA-01465: Ongeldig hexadecimaal getal.

One of the benefits of placing an IS JSON check constraint on the column is that you can use a "dot-notation" to query out the data:

      SQL> select tst.json_data.this
      2    from t tst
      3  /
    
    THIS
    --------------------------------------------------------------------------------
    that
Keep in mind that you must use a table alias for this to work. Omitting the table alias will result in an error.
      SQL> select json_data.this
      2    from t tst
      3  /
    
    Error starting at line : 1 in command -
    select json_data.this
      from t tst
    
    Error at Command Line : 1 Column : 8
    Error report -
    SQL Error: ORA-00904: "JSON_DATA"."THIS": ongeldige ID

09 January 2018

JSON_TABLE and the Top 2000

Oracle database 12c provides native JSON parsing. It is relatively easy to go from data in a JSON format to a relational representation.
In this example I will use data from the Top 2000. Every year in The Netherlands a playlist is produced by popular vote and fully broadcast between Christmas and the last day of the year, with the number one being played just before the New Year. You can see the complete list of 2017 on the official NPO website.
The Top 2000 list that I'll use for this example is the one from 2017.

The data from the JSON file looks like this:

Only the first part of the JSON file is shown, with the first two songs.
At the highest level there are three name-value pairs ("success", "message" and "messages") and an array named "data".
The "data" array contains another array with JSON objects containing information about the songs.
Each of these JSON objects contain name-value pairs, with very short none descriptive names, like "s" and "a". You might guess what these names would mean.
Even though the "data" attribute is a nested array, at the top level it is only one level deep.
The complete JSON-file can be downloaded by following this link.

At first I want to get the JSON file into the database, before I extract the values that I need.
First create the table and a check constraint to make sure that is JSON.

create table temp2000 
(complete_list clob);

alter table temp2000
add constraint list_is_json check (complete_list is json)
/
This table will hold the complete JSON file containing the Top2000. The check constraint on the column will verify that only correctly formatted JSON is allowed.
Now add the JSON-file to the table, the easiest way to do this is by adding a row using SQL Developer and copy-paste the complete JSON.

What I want to extract from the complete JSON file is the information about the artist, songtitle, release year, current position in the list, as well as the position in the list from last year.
My guess would be that the "a", "s", "yr", "pos", "prv" are the attributes that I need.
With a CTAS (Create Table as SELECT) and the JSON_TABLE operator I can transform the JSON to relational representation.

create table top2000
as
select songtitle
      ,artist
      ,release_year
      ,curr_position
      ,prev_position
  from temp2000 t
      ,json_table (t.complete_list format json, '$.data[0][*]'
         columns (
           songtitle     varchar2(150) path '$.s'
          ,artist        varchar2(150) path '$.a'
          ,release_year  number path '$.yr'
          ,curr_position number path '$.pos'
          ,prev_position number path '$.prv'
         )
      )
/
Because the song information is contained in the "data" array, and only in the nested array, I need to address that array as follows:
'$.data[0][*]'
Zero (in square brackets) representing the first array in the "data" attribute, and a wildcard (the asterisk) to address the other objects in the nested array.

To help with the discovery of the structure (and the paths to the values) of the JSON, Oracle Database 12c Release 2 introduced the JSON_DATAGUIDE function. Unfortunatelly I don't have Release 2 at my disposal right now, so I leave that for a later time.

Links