
I have a bash script (load_data.sh) that invokes the sqlldr command to load data from a .csv file into a table (data_import). On a recent invocation, I noticed that even though the command completed, the table didn't contain the data from the .csv file. I say this because the next step in the bash script (process_data.sh) tried to run a stored procedure that threw the error
ORA-01403: no data found.

I learned that the commit happens right after the file load. So, I'm wondering what's causing this error and how I can avoid it in the future.

Here are my scripts:

load_data.sh

#!/usr/bin/bash -p

# code here #

if [[ -f .st_running ]]
then
    echo "Exiting: looks like another instance of the script is running"
    exit
fi

touch .st_running

# ... #

# deletes existing data in the table
./clean.sh

sqlldr user/pwd@host skip=1 control=$CUR_CTL.final data=$fpath log=${DATA}_data.log rows=10000 direct=true errors=999

# accesses the newly loaded data in the table and processes it
./process_data.sh

rm -f .st_running
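One thing the script above never does is check the exit status of sqlldr before calling process_data.sh. A minimal sketch of that check, where run_load is a hypothetical stand-in for the real sqlldr invocation (on Unix, sqlldr conventionally exits 0 on success, 1 on failure, 2 on warnings, 3 on fatal errors, though that's worth verifying for your version):

```shell
#!/bin/bash

# Hypothetical stand-in for the real sqlldr command line;
# here it simulates a "warnings" exit (status 2).
run_load() {
  return 2
}

run_load
rc=$?
if [ "$rc" -ne 0 ]; then
  msg="sqlldr exited with status $rc; would skip process_data.sh"
else
  msg="load OK; would run process_data.sh"
fi
echo "$msg"
```

With the simulated status 2 this prints the "would skip" branch; in the real script the non-zero branch would exit before process_data.sh runs.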

clean.sh/process_data.sh

# code here #
# ... #

sqlplus user/pwd@host  <<EOF
set serveroutput on
begin
schema.STORED_PROC;
commit;
end;
/
exit;
EOF

# code here #
# ... #

STORED_PROC run by process_data.sh:

SELECT count(*) INTO l_num_to_import FROM data_import;

IF (l_num_to_import = 0) THEN RETURN;
END IF;

/* the error (`ORA-01403: no data found`) happens at this statement: */
SELECT upper(name) INTO name FROM data_import WHERE ROWNUM = 1;

Control file

LOAD DATA
APPEND
INTO TABLE DATA_IMPORT
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
TRAILING NULLCOLS
(
...
...
)

Edits

  1. The input file had 8 rows and the logs from both runs stated that 8 rows were successfully inserted.
  2. Interesting behavior: the script ran fine (without the error) the 2nd time I ran it on the same file. So, during the first run, the sqlldr command doesn't seem to have completed before the next sqlplus command was executed.
mauryat
  • Did the `sqlldr` appear to complete - did you check the logs to see if it ran, and how many rows it inserted/rejected? You're not checking the exit code at the moment, which would have been useful. – Alex Poole Jan 09 '14 at 17:42
  • @BobJarvis No, the input file had 8 rows. – mauryat Jan 09 '14 at 17:42
  • OK, then back to Alex's question - what did the log file have to say? And it might be useful to edit the question and include the data file and the sqlldr control file. – Bob Jarvis - Слава Україні Jan 09 '14 at 17:43
  • @AlexPoole Yes, I checked the logs which stated that 8 rows were inserted. See edit. – mauryat Jan 09 '14 at 17:45
  • @BobJarvis Added control file. Can't mention the data fields as it is proprietary. Also can't upload the data file as it is confidential. – mauryat Jan 09 '14 at 17:53
  • It said 8 inserted on the first run, or the successful second one - sorry, not clear from your naming whether the first log would have been preserved, or if you checked it. If they were inserted, is it possible something else deleted the records in between the `sqlldr` and `sqlplus` calls (even this script running twice simultaneously)? There is no way that data actually added by `sqlldr` would not be visible after the program exits, so this is rather confusing. (Well, unless it inserts to a temporary table, but then the second run wouldn't have worked). – Alex Poole Jan 09 '14 at 17:54
  • @AlexPoole It said it inserted 8 rows on both runs. The first log no longer exists, but I remember seeing it. Yes the records get deleted each time the `clean.sh` script is run. So, the 2nd run is like running it afresh. – mauryat Jan 09 '14 at 17:57
  • Hmmm...any chance there were two invocations of this process running simultaneously, where the second invocation could have deleted the data inserted by the first invocation? – Bob Jarvis - Слава Україні Jan 09 '14 at 18:00
  • @BobJarvis Not that I know of. However, even if the 2nd invocation had deleted the data inserted by the 1st invocation, it would have still continued to load data afresh and process it. And since it's the one that's running 2nd, the logs should reflect the 2nd invocation. – mauryat Jan 09 '14 at 18:05
  • Is there perhaps another process which also deletes data from the same temp table, which might have been running at the same time? – Bob Jarvis - Слава Україні Jan 09 '14 at 18:35
  • @BobJarvis There's no other process running. I added a few more lines of the code for the STORED_PROC. It reaches the select statement in question only if there are more than 0 rows in the table. It's strange how it passes those if statements, but then complains that there's no data in the table. – mauryat Jan 09 '14 at 18:52
  • @user640378 - that is only possible if another process (or session, or job) deleted from the table between the two `select` calls. It's a small window - even if you've omitted other code in between, and assuming *that* doesn't delete - but certainly feasible. Is it possible you have a clean-up script that's supposed to run after this, but it ran early - either this time or always and you've just been lucky up until now? – Alex Poole Jan 09 '14 at 18:57
  • Y'know, the `WHERE ROWNUM=1` makes me kind of nervous. I can't say exactly why - but something about this looks questionable. @AlexPoole - got any thoughts on this? What am I missing here..? – Bob Jarvis - Слава Україні Jan 09 '14 at 19:03
  • Thinking about this a bit - what's the point of the `WHERE ROWNUM=1`? If the intent is to get the row corresponding to the first line in the file, it's not guaranteed to do so. The database is free to return rows in any order, unless there's an ORDER BY clause. No `ORDER BY` here, thus no guarantee of the results being ordered. But that still doesn't explain the "no rows found". Hmmm... – Bob Jarvis - Слава Україні Jan 09 '14 at 19:09
  • @AlexPoole There is no clean-up script other than the `clean.sh`. I've edited the original script in the question with code that seems to be disallowing multiple instances of the scripts from running. – mauryat Jan 09 '14 at 19:10
  • @BobJarvis - the `rownum` check seems fine - it will just return the first row, same as `where rownum < 2`, and I don't see how that can fail if there is any data in the table. Where 'first' is arbitrary, as you said. I can't imagine how those two selects can behave like that without a deletion in between. – Alex Poole Jan 09 '14 at 19:21
  • @user640378 - you have a race condition in the 'lock' check too; there is a window (albeit small) where the script can pass the `test` but someone else can have created the file by the time you do the `touch`. That isn't really protecting you. You'd probably know if you'd submitted it manually twice that close together; but maybe this is run through `cron` and you have two clashing crontab entries, so it was submitted twice simultaneously? (Maybe it always is but the test has protected you so far). – Alex Poole Jan 09 '14 at 19:26
  • @user640378 - [this might be of interest](http://stackoverflow.com/q/169964/266304). – Alex Poole Jan 09 '14 at 19:35
  • @AlexPoole Thanks for the link on flock. I'll try to implement it. We don't have it on cron. So, I guess somebody ran it simultaneously (unlikely, but since we don't have any other thing to point fingers at!). – mauryat Jan 09 '14 at 19:43
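
Following up on the flock suggestion in the comments, the test-then-touch lock in load_data.sh can be replaced with an atomic lock. A sketch assuming the util-linux flock utility is available (the lock file path and fd number are illustrative):

```shell
#!/bin/bash

# Atomic replacement for the racy "[[ -f .st_running ]] ... touch" pattern.
# flock ties the lock to an open file descriptor in one step, so two
# instances cannot both pass the check.
exec 200>/tmp/load_data.lock
if ! flock -n 200; then
  echo "Exiting: another instance is running"
  exit 1
fi
echo "lock acquired"
# ... sqlldr and process_data.sh would run here; the lock is
# released automatically when the script exits ...
```

Unlike the marker file, the lock is released even if the script crashes before its cleanup line, so a stale .st_running can never block future runs.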

1 Answer


If you capture the PID of the sqlldr command and wait for it to complete, you can be sure it has finished. You can also add a datestamp (or a timestamp, if it runs multiple times a day) to the log file, then loop with a sleep, checking until the log prints its final completion line, and only then run the next step.
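A rough sketch of that polling idea. The log name and the final "Run ended" summary line are assumptions about a typical sqlldr log; here a fake log file stands in for a real run so the loop exits immediately:

```shell
#!/bin/bash

LOG=example_data.log

# Stand-in for a real sqlldr log; a real one typically finishes
# with a "Run ended on ..." summary line.
printf 'Table DATA_IMPORT:\n  8 Rows successfully loaded.\nRun ended on Thu Jan 09\n' > "$LOG"

# Poll until the completion line appears, then continue to the next step.
until grep -q '^Run ended' "$LOG"; do
  sleep 5
done
echo "load complete; safe to run process_data.sh"
rm -f "$LOG"
```

Note that since sqlldr runs in the foreground of load_data.sh anyway, checking its exit status directly is usually simpler than polling the log; polling is mainly useful if the load is launched in the background or from another process.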

ssalih