For this example I use the dataset as AnthonyJ used in the comments. The explanation for the regular expression can be found in the original post.
SQL> with test as
2 (
3 select 1 id, 'joey,anthony,marvin' str from dual union all
4 select 5 id, 'tony,glenn' str from dual union all
5 select 8 id, 'john' str from dual
6 )
7 select id
8 , str
9 , regexp_substr (str, '[^,]+', 1, rn) split
10 from test
11 cross
12 join (select rownum rn
13 from (select max (length (regexp_replace (str, '[^,]+'))) + 1 mx
14 from test
15 )
16 connect by level <= mx
17 )
18 where regexp_substr (str, '[^,]+', 1, rn) is not null
19 order by id
20 ;
ID STR SPLIT
---------- ------------------- -------------------
1 joey,anthony,marvin joey
1 joey,anthony,marvin marvin
1 joey,anthony,marvin anthony
5 tony,glenn tony
5 tony,glenn glenn
8 john john
The trick here is in lines 11 through 17. The Cross Join is there to create multiple records, but no more than the longest intended individual words (three in this case).
Because each ID would result in three records, which is alright for ID 1, it will also create three records for ID 5 and ID 8. Line 18 removes these extra records.
If you use Oracle 11g, you can also use REGEXP_COUNT instead of the combination of REGEXP_REPLACE and LENGTH, which would look like this:
cross
join (select rownum rn
from (select max (regexp_count (str, ',') + 1) mx
from test
)
connect by level <= mx
)
Regexp_Count


I was looking for something exactly like this. Thanks a lot, it works like a charm.
ReplyDeleteHi there Alex. Thank you for sharing this. I found your post when looking for material for writing about delimited strings in PL/SQL. I am linking to your post in my tutorial (http://oracletuts.net/tutorials/how-to-tokenize-or-parse-a-string-in-plsql/)
ReplyDeleteI have been on your blog some times earelier. Keep up the good work!
Thank you very much for this, it was just what I needed.
ReplyDeleteThanks... works good!
ReplyDeleteThanks, very helpful for me.
ReplyDeleteIn my case I also needed it to have outer join behaviour so made a slight variation as below:
with test as
(
select 1 id, 'joey,anthony,marvin' str from dual union all
select 5 id, 'tony,glenn' str from dual union all
select 8 id, 'john' str from dual union all
select 9 id, null str from dual
)
select id
, str
, regexp_substr (str, '[^,]+', 1, rn) split
from test
left outer join (select rownum rn
from (select max (length (regexp_replace (str, '[^,]+'))) + 1 mx
from test
)
connect by level <= mx) splits on splits.rn <= length (regexp_replace (str, '[^,]+'))
order by id
;
This could also be done using an or rn = 1 in the where clause instead
Thank you Anonymous for your suggestion with the Left Outer Join. However, from the example you posted "John" is missing in the split-column. When you amend the second to last line to:
ReplyDeleteon splits.rn <= nvl (length(regexp_replace(str,'[^,]+')), 1)
you will see John show up.
The NVL was added to that line.
Hi,
ReplyDeletedumb I know but I have a dataset with some records that have a second name in the field EG: Rob (Bobby). So I need to leave Rob in the FNAME field and put Bobby in the AFNAME field.
I have landed on
TRIM(REGEXP_REPLACE(FNAME, '\((.*?)\)', '') )'
To update the FNAME field removing " (Bobby)
I was trying to use:
REGEXP_SUBSTR(FNAME, ' \((.*?)\)') but that of course return "(Bobby)"
which is where I landed on your blog...
I need an elegant way to only return Bobby without the delimiters.
Update table
Set AFNAME= ???,
SET FNAME= TRIM(REGEXP_REPLACE(FNAME, '\((.*?)\)', '') )'
Where FNAME like ‘%(%’
…
Update:
DeleteNot necessarily elegant but it works:
trim(translate(REGEXP_SUBSTR(FNAME, ' \((.*?)\)'), '()', ' ' ))
This query saved my bacon. There are plenty of examples out there using CONNECT BY, but it only works for one row at a time. Thank you so much.
ReplyDeleteHi. One Q regardig the first split post in http://nuijten.blogspot.no/2009/07/splitting-comma-delimited-string-regexp.html
ReplyDeleteNow the output is shown in one column and rows.
Is it possible to send them into a variabel for each of them ?
Cheers Mate
I'm not sure what you mean. If I understand correctly: you want the results of the query and place each value that you get back into a separate variable? Then you would have to know ahead of time how many variables you would need... unless you select them into a collection (like an Associative Array, Nested Table or a Varray).
Delete