I ran into a requirement today where I had to combine a number of strings (varchars) into a delimited list. Some strings would be NULL and should be ignored.
I played around and found a bunch of different approaches which I’d like to share.
Let’s assume we have a package info_util
with the function person
which will combine a number of information about a person to a delimited list (to simplify the example all information is passed by parameter):
dbms_output.put_line(info_util.person(
i_name => 'Luke Skywalker',
i_alignment => 'light',
i_comment => 'Most powerful jedi of all times'));
-- Output: Luke Skywalker, light, Most powerful jedi of all times
dbms_output.put_line(info_util.person(
i_name => 'Vader',
i_title => 'Darth',
i_alignment => 'dark',
i_comment => 'Pretty evil'));
-- Output: Vader, Darth, dark, Pretty evil
Basic String concatenation
The simplest solution is just to go over everything with some if-logic and concatenating the strings
function person_concatenate(
i_name varchar2,
i_title varchar2,
i_alignment varchar2,
i_comment varchar2
) return varchar2 deterministic
as
l_result varchar2(32000);
begin
if i_name is not null then
l_result := l_result || i_name;
end if;
if i_title is not null then
if l_result is not null then
l_result := l_result || ', ';
end if;
l_result := l_result || i_title;
end if;
if i_alignment is not null then
if l_result is not null then
l_result := l_result || ', ';
end if;
l_result := l_result || i_alignment;
end if;
if i_comment is not null then
if l_result is not null then
l_result := l_result || ', ';
end if;
l_result := l_result || i_comment;
end if;
return substr(l_result, 1, 4000);
end;
But this is very hard to read, understand and maintain. If you don’t believe me, how long do you need to make sure there’s no difference between handling of i_alignment
and i_comment
?
Its also tedious and boring.
Using LISTAGG
The LISTAGG
functionality does more or less exactly what we want here, so why don’t we use it?
function person_listagg(
i_name varchar2,
i_title varchar2,
i_alignment varchar2,
i_comment varchar2
) return varchar2 deterministic
as
/* We have to collect all parts
in a nested table collection first
*/
l_parts sys.odcivarchar2list := new sys.odcivarchar2list();
l_result varchar2(4000);
/* This function makes our code more
readable: just extend the collection and add an item
*/
procedure add_part( i_string varchar2 )
as
begin
l_parts.extend;
l_parts(l_parts.last) := i_string;
end;
begin
/* Now we can simple add the parts */
add_part(i_name);
add_part(i_title);
add_part(i_alignment);
add_part(i_comment);
/* And can use LISTAGG to aggregate */
select listagg(column_value, ', ') within group (order by rownum)
into l_result
from table(l_parts);
return l_result;
end;
Just remember: whenever you are switching between SQL and PL/SQL, you will need to factor in the cost of a context switch.
However, unless the function is called a lot or you require really fast response times, it may not be a real issue.
Concatenate in a PL/SQL Loop
So maybe we can handle the concatenation in a PL/SQL Loop instead of using LISTAGG.
function person_plsql_comb(
i_name varchar2,
i_title varchar2,
i_alignment varchar2,
i_comment varchar2
) return varchar2 deterministic
as
/* So lets take the thing that improves readability */
l_parts sys.odcivarchar2list := new sys.odcivarchar2list();
l_result varchar2(4000);
procedure add_part( i_string varchar2 )
as
begin
l_parts.extend;
l_parts(l_parts.last) := i_string;
end;
begin
add_part(i_name);
add_part(i_title);
add_part(i_alignment);
add_part(i_comment);
/* But handle the concatenation in PL/SQL */
for i in l_parts.first..l_parts.last loop
if l_parts(i) is not null then
if l_result is not null then
l_result := l_result || ', ';
end if;
l_result := l_result || l_parts(i);
end if;
end loop;
return l_result;
end;
That looks kinda nice to me. But what about the performance?
The following code runs the function under test 20’000 times (10’000 for each use-case). To eliminate any advantage of the DETERMINISTIC
keyword, I added the run to each name, so no caching involved.
declare
l_ts timestamp := current_timestamp;
l_info varchar2(4000);
begin
for i in 1..10000 loop
l_info := info_util.person_concatenate('Luke Skywalker'||to_char(i), null, 'light', 'Most powerful jedi of all times');
l_info := info_util.person_concatenate('Vader'||to_char(i), 'Darth', 'dark', 'Pretty evil');
end loop;
dbms_output.put_line('Basic concatenation: ' || to_char(current_timestamp-l_ts));
end;
/
Basic concatenation | 0.02 seconds |
LISTAGG | 0.8 seconds |
PL/SQL Loop | 0.1 seconds |
So we see that the context-switches of LISTAGG indeed slow down our functionality. The looping and collection initialization stuff also comes at a performance cost, but it’s not that bad.
What if we use the DETERMINISTIC
advantage and basically just call 2 different statements over and over again?
Basic concatenation | 0.0008 seconds |
LISTAGG | 0.0009 seconds |
PL/SQL Loop | 0.0008 seconds |
There is hardly any difference, which means that if your data is homogenous, the additional performance cost of the Collection-with-Loop approach won’t be noticable (Performance always depends on your data).
It also shows that in many situations, concerns about performance-loss due to context-switches between SQL and PL/SQL won’t have the huge negative impact we might expect.
PL/SQL Object
What we could do now is to extract the whole aggregation part into a PL/SQL object (and yes, of course we can improve the readability and still not use objects, but they are very handy, especially in these cases):
create or replace type t_string_aggregator force is object
(
/* The nested table is now part of our object */
c_parts sys.odcivarchar2list,
constructor function t_string_aggregator return self as result,
member procedure add_string( i_string varchar2 ),
member function get_aggregate( i_delimiter varchar2 default ', ' )
return varchar2
);
/
create or replace type body t_string_aggregator as
constructor function t_string_aggregator return self as result
as
begin
/* Lets not forget to initialize the collection */
c_parts := new sys.odcivarchar2list();
return;
end;
/* This is basically the same as the internal procedure
we used in the other approaches
*/
member procedure add_string( i_string varchar2 )
as
begin
c_parts.extend;
c_parts(c_parts.last) := i_string;
end;
/* We can even make the delimiter dynamic */
member function get_aggregate( i_delimiter varchar2 )
return varchar2
as
l_result varchar2(4000);
begin
/* Little tweak if we dont have any items */
if c_parts.count < 0 then
return null;
end if;
for i in c_parts.first..c_parts.last loop
if c_parts(i) is not null then
if l_result is not null then
l_result := l_result || i_delimiter;
end if;
l_result := l_result || c_parts(i);
end if;
end loop;
return l_result;
end;
end;
/
The final function would then look like this:
function person_plsql_obj(
i_name varchar2,
i_title varchar2,
i_alignment varchar2,
i_comment varchar2
) return varchar2 deterministic
as
l_aggregator t_string_aggregator := new t_string_aggregator();
begin
l_aggregator.add_string(i_name);
l_aggregator.add_string(i_title);
l_aggregator.add_string(i_alignment);
l_aggregator.add_string(i_comment);
return l_aggregator.get_aggregate();
end;
Pretty small and readable once we know what t_string_aggregator
is good for, right?
Performance-wise, it is about 10% slower than the PL/SQL Loop approach – which again is not noticable when using the DETERMINISTIC
advantage – depending on your data.
Warning: This solution has severe problems once the sum of the string-parts exceed 4000 chars. But that’s another topic and another example.
You can run the full example on LiveSQL and get the sourcecode from my GitHub repository.
Thoughts
So, why do I like this last approach via PL/SQL Objects so much?
From my point of view, it modularizes and encapsulates my functionality pretty good. This leads to the following advantages:
- it’s easier to understand when I use it in other places
- it has a clearly defined scope
- it is easier to expand (e.g. overflow-mechanics)
- it enables DRY (Don’t repeat yourself)
- it is very easy to test.
Looking forward to hear what you folks think about it.
0 Comments