11 June 2015

Deadlock with a Virtual Column

Update: There is already a bug filed for this issue, it is registered under number: 22591494

Virtual Columns are really cool. I like them a lot. If you've never heard of them, shame on you, learn about them.
In short: a Virtual Column is not a real column, it's an expression that looks like a column... more or less.
While using the Virtual Columns, we ran into a little oddity with them.

First of all let's start with the version of the database that I tested this on. Yes, I know it's an 11 database that's because the client is still running on this release.
These tests were run on the Virtual Box image that is provided by Oracle.
I still need to run these tests on Oracle 12c.
I just ran the script on my Oracle 12c database (in a PDB) and the same deadlock occurs.

   SQL> select *
     2    from v$version
     3  /

   BANNER
   ----------------------------------------------------------------------
   Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - Production
   PL/SQL Release 11.2.0.2.0 - Production
   CORE 11.2.0.2.0 Production
   TNS for Linux: Version 11.2.0.2.0 - Production
   NLSRTL Version 11.2.0.2.0 - Production

The setup for this test is based on a copy of the EMP table.

SQL> create table emp
  2  as
  3  select *
  4    from scott.emp
  5  /

Table created.

To create a Virtual Column on my copy of the EMP table, I need a deterministic function.
This function takes two arguments, one for the ENAME and one for the EMPNO. And what does the function do? Actually nothing, it returns NULL.

   SQL> create or replace
     2  function vc
     3    (p_ename in emp.ename%type
     4    ,p_empno in emp.empno%type
     5    )
     6   return varchar2 deterministic
     7  is
     8  begin
     9   return null;
    10  end vc;
    11  /

   Function created.

The function needs to be deterministic because that is required when you want to define a Virtual Column.
Now we can add the Virtual Column (called VC) to my copy of the EMP table.

SQL> alter table emp
  2  add descr as (vc (ename, empno))
  3  /

Table altered.

So far, no problems. It all works.
The trouble began when you execute a TRUNCATE TABLE statement.

   SQL> truncate table emp
     2  /
   truncate table emp
          *
   ERROR at line 1:
   ORA-04020: deadlock detected while trying to lock object ALEX.EMP
   

To be honest, this is not the first deadlock that I created and it probably won't be the last :)
The snag with deadlocks is trying to figure out what caused it in the first place.
Of all things that I thought would happen, a deadlock is not one of them.
How can it? It is my own personal VirtualBox and I am the only one using it.

The first step investigating a deadlock is usually the alert.log, however there was nothing in it regarding the deadlock... honest.

After a bit of googling, I found a note on deadlocks by Yong Huang (link at the bottom) describing the causes of deadlocks.
In that article he points out that you can get more insight if you set a certain event, and that's what I did.

   SQL> alter session set events '4020 trace name processstate forever, level 10'
  2  /
  
Session altered.
   
To find out where the trace file was located, and the name of it, I used a query by Tanel Poder (link at the bottom).
      SQL> select value ||'/'||(select instance_name from v$instance) ||'_ora_'||
     2 (select spid||case when traceid is not null then '_'||traceid else null end
     3       from v$process where addr = (select paddr from v$session
     4       where sid = (select sid from v$mystat
     5           where rownum = 1
     6      )
     7         )
     8 ) || '.trc' tracefile
     9* from v$parameter where name = 'user_dump_dest'
   SQL> /

   TRACEFILE
   -------------------------------------------------------------------------------------------------
   /home/oracle/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_3791.trc
   

In that trace file was the following information:

   A deadlock among DDL and parse locks is detected.
   This deadlock is usually due to user errors in
   the design of an application or from issuing a set
   of concurrent statements which can cause a deadlock.
   This should not be reported to Oracle Support.
   The following information may aid in finding
   the errors which cause the deadlock:
   ORA-04020: deadlock detected while trying to lock object ALEX.EMP
   --------------------------------------------------------
    object   waiting  waiting       blocking blocking
    handle   session     lock mode   session     lock mode
   --------  -------- -------- ----  -------- -------- ----
   0x31ae0d7c  0x3aab3cf4 0x31afc4c0    X  0x3aab3cf4 0x31aeb718    S
   
As you can see in the text (taken from the trace file), you can see that the sessions involved in the deadlock is the same, both the waiting and the blocking session are 0x3aab3cf4.

So at least my assumptions were correct, I was blocking myself.
Not that it got me any further...

After quite a long time fiddling around, I discovered the following.
If I change the function like below, the deadlock doesn't occur. See if you can spot the difference.

   SQL> create or replace
  2  function vc
  3    (p_ename in varchar2
  4    ,p_empno in number
  5    )
  6   return varchar2 deterministic
  7  is
  8  begin
  9   return null;
10  end vc;
11  /

Function created.

SQL> truncate table emp
  2  /

Table truncated.

Did you spot the difference?
The function at first used anchored datatypes for the arguments (%TYPE) and later on just a simple type (NUMBER and VARCHAR2).
Using the simple types, the truncate works.
There are some oddities when it exactly occurs and I haven't figured out yet when the deadlock occurs exactly. It seems that when an argument is anchored (%TYPE) and the underlying datatype is a NUMBER, the deadlock occurs...
Like I said I haven't really figured out what causes it.

Links

  1. Two common Deadlocks by Yong Huang
  2. Tanel Poder: Querying the current tracefile name, using SQL – with tracefile_identifier
  3. Oracle Base on Virtual Columns

6 comments:

  1. Nice catch, Alex.
    I've just repeated your test case on 11.2.0.4, and the problem still occurs there too.
    By the way, since 11g you can get the trace file name for the current session by

    select value from v$diag_info where name='Default Trace File'

    Thanks,
    Oren.

    ReplyDelete
    Replies
    1. Excellent tip regarding the v$diag_info,... it's also in the update on Tanel's blog :) ... guess it's a matter of reading the whole thing, instead of just copy-paste what you need :)

      Delete
  2. Interesting.
    Thanks for highlighting.
    Sounds like bug.
    Have you raised SR?

    ReplyDelete
    Replies
    1. Long story short: no, not yet.
      Currently working for the government and it takes some time to get through to the DBA and get them to take action.

      Delete
  3. One more interesting behavior on "library cache lock".

    It seems a 'library cache lock' (cycle) deadlock.
    During the lock (about 9 seconds), try to query v$wait_chains,
    or a select on v$libcache_locks (see Foreign Keys and Library Cache Locks: Analysis,
    http://ksun-oracle.blogspot.ch/2016/01/foreign-keys-and-library-cache-locks.html).

    ReplyDelete
  4. Coming late to the game but, as a throwaway comment, when I've hit ORA-04020 in the past one of the methods I use to supply more information is to create a DDL trigger that captures the contents of x$kgllk and x$kglpn before and after the call.

    ReplyDelete