The UDF pragma tells the compiler that the PL/SQL unit is a user defined function that is used primarily in SQL statements, which might improve its performance.I though it would be very cool to try it out with my HyperLogLog post I did recently and see if it results in any measurable performance improvement.
Test Table
I'll use the same test table as I did in my original post:
SQL> create table z_hll_test as 2 select dbms_random.string('x', 4)||rpad('x', 500, 'x') n 3 from dual 4 connect by level <= 1000000; Table created SQL> alter table z_hll_test cache; Table alteredNote that I'm explicitly setting the table to cache in order to make in-memory PQ kick in and eliminate disk I/O as a factor from my test case. For each test I made sure that no physical I/O has happened (including temp I/O for native distinct test).
Regular Function
create or replace function num_zeroes( p_n binary_integer ) return binary_integer deterministic is l_t binary_integer; l_b binary_integer; begin --assume 32-bit hash value, 10-bits for bucket if (p_n = 0) then return 22; end if; l_t := 1; l_b := 0; while ( bitand(p_n,l_t) = 0 ) loop l_t := l_t*2; l_b := l_b+1; end loop; return l_b; end num_zeroes; SQL> select 2 case 3 when hll <= 2560 and zeroes > 0 then round(1024*ln(1024*1/zeroes)) 4 when hll > 1/30 * power(2,32) then round(-power(2,32) * ln(1 - hll/power(2,32))) 5 else round(hll) 6 end num_distinct 7 from ( 8 select 0.72054 * (1048576/(current_sum+zeroes)) hll, zeroes 9 from ( 10 select sum(1/power(2, val)) current_sum, (1024-count(*)) zeroes 11 from ( 12 select /*+ parallel(z 4) */ mod(ora_hash(n), 1024) bucket, 13 max(num_zeroes(trunc(ora_hash(n)/1024)))+1 val 14 from z_hll_test z 15 group by mod(ora_hash(n), 1024) 16 ) 17 ) 18 ); NUM_DISTINCT ------------ 748175 Executed in 0.889 secondsPragma UDF Function
create or replace function num_zeroes( p_n binary_integer ) return binary_integer deterministic is pragma udf; l_t binary_integer; l_b binary_integer; begin --assume 32-bit hash value, 10-bits for bucket if (p_n = 0) then return 22; end if; l_t := 1; l_b := 0; while ( bitand(p_n,l_t) = 0 ) loop l_t := l_t*2; l_b := l_b+1; end loop; return l_b; end num_zeroes; SQL> select 2 case 3 when hll <= 2560 and zeroes > 0 then round(1024*ln(1024*1/zeroes)) 4 when hll > 1/30 * power(2,32) then round(-power(2,32) * ln(1 - hll/power(2,32))) 5 else round(hll) 6 end num_distinct 7 from ( 8 select 0.72054 * (1048576/(current_sum+zeroes)) hll, zeroes 9 from ( 10 select sum(1/power(2, val)) current_sum, (1024-count(*)) zeroes 11 from ( 12 select /*+ parallel(z 4) */ mod(ora_hash(n), 1024) bucket, 13 max(num_zeroes(trunc(ora_hash(n)/1024)))+1 val 14 from z_hll_test z 15 group by mod(ora_hash(n), 1024) 16 ) 17 ) 18 ); NUM_DISTINCT ------------ 748175 Executed in 0.593 secondsPragma UDF gives us ~33% performance boost which is not too bad considering we didn't have to do anything else. However, that's not the most interesting part -- let's take a look at the native distinct next.
Native Distinct
SQL> select /*+ parallel(z 4) */ count(distinct n) from z_hll_test z; COUNT(DISTINCTN) ---------------- 753204 Executed in 0.983 seconds
Note that this was an optimal execution!
Summary
Let's summarize results in a table:
Regular Function | Pragma UDF Function | Native Distinct |
0.889 | 0.593 | 0.983 |
As you can see pragma udf actually beats native implementation by a considerable margin which is very impressive given the fact that distict had an optimal execution.