0

I have a stored procedure that I just converted to Snowflake Javascript from PL/SQL. It inserts about 100 records a minute. The total record count is about 700. Because it is so difficult to know where a problem is in Snowflake, I insert log statements as the overall functionality progresses. I also push messages to an array that gets returned at the bottom. However, I do the insert into log table type things in PL/SQL and it barely makes a performance difference. I'll admit that my progress loading slows down the process, but am doubtful that it's the primary contributor.

The script makes a table that, given a date, shows the fiscal quarter that it corresponds to. This is helpful for other queries not shown. I have a simple loop that goes from the beginning of the first quarter to the end of the last and puts the corresponding quarter in the lookup table.

It took 9 minutes to run as written, but in Oracle, takes less than a second.

I'd like to know how to make this run faster:

create or replace procedure periodic_load()
    RETURNS varchar
    LANGUAGE javascript
    execute as owner 
    as 
    $$
    var result = "";
    var messages = new Array();
    try {
        /**
        Constants shared between functions
        */
        var SINGLE_QUOTE_CHAR="'";
        var DOUBLE_QUOTE_CHAR="\"";
        var COMMA_CHAR=",";
        var LEFT_PARENTHESIS="(";
        var RIGHT_PARENTHESIS=")";
        var ESCAPED_SINGLE_QUOTE_CHAR="\\'";
        var ESCAPED_DOUBLE_QUOTE_CHAR="\\\"";
        var CONSOLE_LOG_USED = true;
        var IS_SNOWFLAKE = false; 


        /*
        Execute Snowflake SQL or simulate the execution thereof
        @parmam  sqlTextIn,binds...
        sqlTextIn: String of the sql command to run.
        binds: zero or more parameters to bind to the execution of the command.
        */
        function execute_with_log() {
            var result = null;
            messages.push('@@@'+"execute_with_log()");
            messages.push('@@@'+"EXECUTE_WITH_LOG(BP1)");

            var argumentsArray = Array.prototype.slice.apply(arguments); 
            var sqlTextIn = argumentsArray[0];
            messages.push('@@@'+'EXECUTE_WITH_LOG argument count: '+arguments.length);
            if(!IS_SNOWFLAKE) { 
                messages.push('@@@'+ "EXECUTE_WITH_LOG(BP2)");
                console.log('SKIPPING SNOWFLAKE SQL: '+sqlTextIn);
            } else {
                messages.push('@@@'+ " EXECUTE_WITH_LOG(BP3)");
                var statementResult;
                var logMessage = sqlTextIn;
                if(argumentsArray.length==1) {
                    messages.push('@@@'+ " EXECUTE_WITH_LOG(BP4)");
                    messages.push('@@@'+" ** NO BIND PARAMETERS DETECTED **");
                } else {
                    messages.push('@@@'+ " EXECUTE_WITH_LOG(BP5)");
                    for(var bindParmCounter = 1; bindParmCounter < argumentsArray.length; bindParmCounter++) {
                        messages.push('@@@'+" ,"+argumentsArray[bindParmCounter]);    
                    }
                }
                messages.push('@@@'+ " EXECUTE_WITH_LOG(BP6)");
                log_message('I',logMessage);
                if(argumentsArray.length===1) {
                    messages.push('@@@'+ " EXECUTE_WITH_LOG(BP7)");
                    statement = snowflake.createStatement( { sqlText: sqlTextIn });
                } else {
                    messages.push('@@@'+ " EXECUTE_WITH_LOG(BP8)");
                    var bindsIn = argumentsArray.slice(1,argumentsArray.length);
                    for(var bindParmCounter = 0; bindParmCounter < bindsIn.length; bindParmCounter++) {
                        messages.push('@@@bindsIn['+bindParmCounter+"]="+bindsIn[bindParmCounter]);    
                        messages.push('@@@bindsIn['+bindParmCounter+"] type ="+bindsIn[bindParmCounter].getName());    
                    }
                    statement = snowflake.createStatement(
                        {
                            sqlText: sqlTextIn,
                            binds: bindsIn
                        }
                    );
                }

                messages.push('@@@'+ " EXECUTE_WITH_LOG(BP9) sqlTextIn="+sqlTextIn);
                result = statement.execute();
                messages.push('@@@'+ " After execute BP10 =");
                commit();
                messages.push('@@@'+ " After commit BP11 =");

            }
            return result;
        }

        function commit() {
            messages.push('@@@'+ " commit");
            statement = snowflake.createStatement(
                {
                    sqlText: 'commit'
                }
            );
            statement.execute();
            return messages;
        }

        function log_message(severity,message) {
            messages.push('@@@'+"log_message(severity,message): severity="+severity+" message="+message);
            var result = null;
            if(!IS_SNOWFLAKE) {
                console.log(severity+": "+message);
                messages.push('@@@'+severity+": "+message);
            } else {
                var record = {'severity': severity,'date_time': {value: 'current_timestamp::timestamp_ntz',useQuote:false},message:message};
                try {
                    var escapeStep1=message.replaceAll(SINGLE_QUOTE_CHAR,ESCAPED_SINGLE_QUOTE_CHAR);
                    var escapeStep2=escapeStep1.replaceAll(DOUBLE_QUOTE_CHAR,ESCAPED_DOUBLE_QUOTE_CHAR);
                    quotedValue=SINGLE_QUOTE_CHAR+escapeStep2+SINGLE_QUOTE_CHAR;

                    var quotedSeverity = SINGLE_QUOTE_CHAR+severity+SINGLE_QUOTE_CHAR;

                    var sql_command = "insert into LOG_MESSAGES(severity,date_time,message) values("+quotedSeverity+",current_timestamp::timestamp_ntz,"+quotedValue+")";
                    statement = snowflake.createStatement( { sqlText: sql_command});

                    var sql_command = "commit";
                    statement = snowflake.createStatement( { sqlText: sql_command});
                } catch(error) {
                    messages.push('@@@'+'FAILURE: '+error);
                }
            }
            return result;
        }

        function truncate_table(tableName) {
            messages.push('@@@'+"(truncate_table()");
            var result = execute_with_log("truncate table "+tableName); 
            messages.push('@@@'+'I','End truncate_table()');
            return result;
        }

        function fql() {
            messages.push('@@@'+"begin fql()");
            log_message('I','Begin fql()');
            var table_name='fiscal_quarter_list';
            truncate_table(table_name);
            execute(
                "insert into fiscal_quarter_list (fiscal_quarter_id,fiscal_quarter_name,fiscal_year,start_date,end_date,last_mod_date_stamp) ("
                +"    select fiscal_quarter_id,fiscal_quarter_name,fiscal_year,min(start_date) start_date,max(end_date) end_date,current_date from cdw_fiscal_periods cfp"
                +"    where (cfp.start_date >= add_months(sysdate(),-24) and  sysdate() >= cfp.end_date ) or "
                +"          (cfp.start_date <= sysdate() and sysdate() < cfp.end_date)  "
                +"    group by fiscal_quarter_id,fiscal_quarter_name,fiscal_year "
                +"    order by fiscal_quarter_id desc "
                +"    fetch first 8 rows only "
                +")"
            );
            log_message('I','End fql()');
        }


        /*
        Function to increment a Date object by one standard day
        Sourced from https://stackoverflow.com/questions/563406/add-days-to-javascript-date
        */
        function addDaysInJs(dateIn, days) {
            var result = new Date(dateIn);
            result.setDate(result.getDate() + days);
            return result;
        }


        function dtfq()  {
            messages.push('@@@'+"dtfq()");
            tableName = 'date_to_fiscal_quarter';
            var firstDate;
            var runningDate;

            log_message('I','Begin dtfq');
            truncate_table(tableName);
            var result = null;
            var resultSet = execute_with_log(" SELECT FISCAL_QUARTER_ID, FISCAL_QUARTER_NAME,try_to_date(START_DATE) as START_DATE, try_to_date(END_DATE)  as END_DATE"
                                                    + " FROM FISCAL_QUARTER_LIST "
                                                    + " ORDER BY START_DATE  ");

            log_message('D','resultSet ='+resultSet);
            log_message('D','resultSet typeof='+typeof resultSet);
            while(resultSet.next()) {
                messages.push('@@@'+"bp1 dtfq() loop start_date="+resultSet.getColumnValue("START_DATE")+" end_date="+resultSet.getColumnValue("END_DATE"));
                firstDate = resultSet.getColumnValue("START_DATE");
                lastDate = resultSet.getColumnValue("END_DATE");
                runningDate=new Date(firstDate);
                lastDate = new Date(lastDate);
                log_message('D','Start date='+firstDate);
                while (runningDate <= lastDate) {
                    var fiscalQuarterId=resultSet.getColumnValue("FISCAL_QUARTER_ID")
                    var fiscalQuarterName=resultSet.getColumnValue("FISCAL_QUARTER_NAME")
                    messages.push('@@@'+"bp2 dtfq() runningDate="+runningDate+' fiscalQuarterId='+fiscalQuarterId+' fiscalQuarterName='+fiscalQuarterName);
                    log_message('D','Fiscal quarter id='+fiscalQuarterId);
                    /*
                    execute_with_log(" insert into sc_hub_date_to_fiscal_quarter(date_stamp,) "
                                            +" values(try_to_date(?)) "
                                            ,runningDate.toISOString());
                                            */
                    execute_with_log(" insert into sc_hub_date_to_fiscal_quarter(date_stamp,fiscal_quarter_id,fiscal_quarter_name) "
                                            +" values(?,?,?)"
                                            ,runningDate.toISOString()
                                            ,fiscalQuarterId
                                            ,fiscalQuarterName);
                                         
                    runningDate = addDaysInJs(runningDate, 1);
                }

            }

            log_message('I','End dtfq Success');
            return result;
        }

        /*
        Execute Snowflake SQL or simulate the execution thereof
        @parmam  sqlTextIn,binds...
        sqlTextIn: String of the sql command to run.
        binds: zero or more parameters to bind to the execution of the command.
        */
        function execute() {
            messages.push('@@@'+"execute():");
       
            var result = null;
            var argumentsArray = Array.prototype.slice.apply(arguments); 
            var sqlTextIn = argumentsArray[0];
            if(!IS_SNOWFLAKE) { 
                console.log('SKIPPING SNOWFLAKE SQL: '+sqlTextIn);
                messages.push('@@@'+'SKIPPING SNOWFLAKE SQL: '+sqlTextIn);
            } else {
                messages.push('@@@'+'USING SNOWFLAKE SQL: '+sqlTextIn);
                var statementResult;
                if(argumentsArray.length>2) {
                    messages.push('@@@'+'Has bind arguments: ');
                    var bindsIn = argumentsArray.slice(2,argumentsArray.length);
                    statement = snowflake.createStatement(
                        {
                        sqlText: sqlTextIn,
                        binds: bindsIn
                        }
                    );
                } else {
                    messages.push('@@@'+'Has no bind arguments: ');
                    messages.push('@@@'+'###sqlText='+sqlTextIn+'###');
                    statement = snowflake.createStatement( { sqlText: sqlTextIn });
                }
                result = statement.execute();
                messages.push('@@@'+'statement.execute succeeded');
                log_message('I',sqlTextIn);
            }
            return result;
        }
        

        String.prototype.replaceAll = function(target, replacement) {
          return this.split(target).join(replacement);
        };

        Object.prototype.getName = function() { 
           var funcNameRegex = /function (.{1,})\(/;
           var results = (funcNameRegex).exec((this).constructor.toString());
           return (results && results.length > 1) ? results[1] : "";
        };

        dtfq(); 
    } catch(error) {
        messages.push('@@@'+error);
    } finally {
        result = messages.join("\n");
    }

    

    return result;
    $$
    ;


call periodic_load()




Woodsman
  • 901
  • 21
  • 61
  • It appears as if you're doing row-by-row inserts in the logging and perhaps elsewhere. Whether you do that wrapped with begin and commit or not, the performance will be slow. If you cache them in a variable and put all the inserts in a single transaction the performance will be faster. If you get too many rows to insert all at once, you can bundle them 1000 or 100 at a time depending on how large the rows are. – Greg Pavlik Jun 30 '20 at 00:12

1 Answers1

0

The use-case isn't entirely stated here, but it appears that your stored procedure merely generates (explodes) and inserts a series of dates into a table, for each date range encountered in a source table input row.

This can be achieved with SQL (with recursive CTEs) directly, which would run far more efficiently than a linear stored procedure iteration:

create table destination_table (fiscal_quarter_id integer, fiscal_quarter_name string, date_stamp date);

insert into destination_table
with source_table(fiscal_quarter_id, fiscal_quarter_name, start_date, end_date) as (

  select 1, 'Q1', '2020-01-01'::date, '2020-03-31'::date union all
  select 2, 'Q2', '2020-04-01'::date, '2020-06-30'::date union all
  select 3, 'Q3', '2020-07-01'::date, '2020-09-30'::date union all
  select 4, 'Q4', '2020-10-01'::date, '2020-12-31'::date

), recursive_expand as (

  select
    fiscal_quarter_id, fiscal_quarter_name, start_date, end_date,
    start_date as date_stamp
  from source_table
  
  union all
  
  select 
    fiscal_quarter_id, fiscal_quarter_name, start_date, end_date,
    dateadd(day, 1, date_stamp)::date date_stamp
  from recursive_expand
  where date_stamp < end_date

)

select fiscal_quarter_id, fiscal_quarter_name, date_stamp
from recursive_expand
order by date_stamp asc;

The example inserts 366 rows into the destination_table (2020 being a leap year) covering dates of all four quarters.

@Greg Pavlik's comment covers why the stored procedure is slow due to executing whole statements (each independently submitted, compiled, planned, executed, and returned from the snowflake query processing service adds a lot of overhead). If you'd still like to proceed with the stored procedures API for your use-case, an idea is to make two specific changes:

  1. Store all generated data rows into an array instead of inserting them directly, like so (this is only practical for a few hundred rows, not beyond, due to memory constraints):
function dtfq() {
  var all_rows = [];

  // … iteration and other logic here …
      all_rows.push([fiscalQuarterId, fiscalQuarterName, runningDate]);
  // … iteration and other logic ends here (minus inserts) …

  return all_rows;
}
  1. Insert the list of n rows generated using a single generated INSERT statement with n value containers. An example of such code can be seen in this answer.