
A question regarding best practices for queries and Perl.

Is there a performance issue, or is it 'bad practice', to have a query inside a Perl for loop? Is it a problem to send so many statements to the DB in rapid fire?

The code is quasi pseudo-code.

@line has 5000 lines

foreach my $elem( @line ){
    SQL = INSERT IGNORE INTO <table> ( column1, .. , column10 ) VALUES ( 'a', .. , 'j' )
}

What about deletes and/or updates?

foreach my $elem( @line ) {
    my $UN = substr($elem, 0, 10 );
    SQL = UPDATE <table> SET <column> = $UN;
}

foreach my $elem( @line ) {
    my $UN = substr($elem, 0, 10 );
    SQL = DELETE from <table> WHERE <column> = $UN
}

Also, a question in the same arena: I have 5000 items I am checking, and my database has anywhere from 1 to 5000 elements at any given time. Is it acceptable to loop through my 5000 items in Perl and delete each ID in the database, or should there be a check first to see if the ID exists before issuing the delete command?

foreach my $elem ( @line ){
    my $ID = substr( $elem, 5 );
    SQL = DELETE FROM <table> WHERE id = $ID;
}

or should it be something like:

foreach my $elem ( @line ){
    my $ID = substr( $elem, 5 );
    SQL = DELETE FROM <table> WHERE id = $ID if ID exists;
}

Thanks, --Eherr

3 Answers


As for inserts in rapid succession, not a problem. The server is tailored to handle that.

Caution should be taken with INSERT IGNORE for other reasons, though: program logic that ought to address a failure never gets the chance to address a failure you just told the server to ignore.
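
For example, a minimal sketch of the alternative (a plain INSERT plus an error check), assuming @line holds the 5000 input lines as in the question; the `users` table, the `$host`/`$status` values, and the connection details are placeholders:

    use DBI;

    # Placeholder DSN and credentials; RaiseError is left off so execute()
    # returns false on failure instead of dying.
    my $dbh = DBI->connect( 'dbi:mysql:database=test;host=localhost',
        'user', 'password', { RaiseError => 0, PrintError => 0 } )
        or die $DBI::errstr;

    my ( $host, $status ) = ( '127.0.0.1', 'active' );    # hypothetical values

    my $ins = $dbh->prepare(
        'INSERT INTO users ( username, ip, status ) VALUES ( ?, ?, ? )'
    );

    for my $elem (@line) {
        my $un = substr( $elem, 0, 10 );
        # A plain INSERT reports a duplicate-key failure; with INSERT IGNORE
        # this branch would never run for ignored rows.
        unless ( $ins->execute( $un, $host, $status ) ) {
            warn "insert failed for $un: " . $ins->errstr . "\n";
            # ...program logic that addresses the failure goes here...
        }
    }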

As for the particular UPDATE you showed, it does not make much sense in a loop (or perhaps at all), because you are not specifying a WHERE clause. Why loop, say, 1000 times, with each pass updating every row because there is no WHERE clause? Maybe that was just a typo.
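
If the intent was 5000 targeted updates rather than 5000 whole-table updates, a hedged sketch with a WHERE clause might look like this (table, column, and value names are placeholders, and $dbh is the connected handle from the sketch above):

    # Sketch only: 'status' is the column being set and 'username' the key.
    my $new_status = 'active';    # hypothetical value
    my $upd = $dbh->prepare('UPDATE users SET status = ? WHERE username = ?');

    for my $elem (@line) {
        my $un = substr( $elem, 0, 10 );
        $upd->execute( $new_status, $un );   # WHERE limits each update to one row
    }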

As for deletes, there is no problem doing those in a loop either, in general. If you are looking to empty a table, look into TRUNCATE TABLE: it is faster, and not logged, if that is ever a desire. Note, though, that TRUNCATE is disallowed on a table that is the referenced table in a foreign key constraint (in those situations the distinction is between the referencing table and the referenced table).
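
From DBI, emptying a table that way is a single call (hypothetical table name):

    # Removes every row at once; faster than a row-by-row DELETE, but not
    # allowed if another table references this one through a foreign key.
    $dbh->do('TRUNCATE TABLE users');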

Other general comments: care should be taken to ensure that any referential integrity that is in place (or should be in place) is honored. An INSERT IGNORE, UPDATE, or DELETE can fail due to foreign key constraints. Also, checking for the existence of a row that you are about to delete anyway is overkill: the delete has to march down a B-tree to find the row regardless, so why do that march twice? On a table scan it would be even more added pain.
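
So rather than a SELECT-then-DELETE round trip, just issue the DELETE and look at how many rows it reports; a sketch, again with placeholder names:

    my $del = $dbh->prepare('DELETE FROM users WHERE id = ?');

    for my $elem (@line) {
        my $id   = substr( $elem, 5 );    # hypothetical ID extraction
        my $rows = $del->execute($id);    # rows deleted; '0E0' (numerically 0) if none
        warn "no row with id $id\n" if defined $rows && $rows == 0;
    }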

Lastly, when you are in a situation of massive bulk insert, loops are never up to the task in any programming language compared to LOAD DATA INFILE performance. Several of your peers have seen 8-to-10-hour operations reduced to 2 minutes by using LOAD DATA (references available if you ask).
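
A hedged sketch of the LOAD DATA route from Perl: write the rows to a temporary file, then load it in one statement. Table and column names are placeholders, and LOCAL loading has to be enabled on both client and server:

    use File::Temp qw(tempfile);

    # Write the rows to a tab-delimited temporary file.
    my ( $fh, $tmpfile ) = tempfile();
    for my $elem (@line) {
        my $un = substr( $elem, 0, 10 );
        print {$fh} join( "\t", $un, $host, $status ), "\n";
    }
    close $fh;

    # One statement loads everything. LOCAL requires local_infile enabled on
    # the server and mysql_local_infile => 1 in the DBI connect attributes.
    my $quoted = $dbh->quote($tmpfile);
    $dbh->do(
        "LOAD DATA LOCAL INFILE $quoted INTO TABLE users
         FIELDS TERMINATED BY '\\t'
         ( username, ip, status )"
    );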


Drew
  • yeah it was a typo in the update, I was trying to get across 5000 rapid updates. I will edit that accordingly. – eherr9633 Dec 04 '15 at 18:11
  • INSERT IGNORE is used because the table already has a unique key, so instead of INSERT ... ON DUPLICATE KEY UPDATE, I went with INSERT IGNORE – eherr9633 Dec 04 '15 at 18:12
  • Note taken for the general comments for future applications, however in this particular case, my DB is very minimal and does not contain Foreign Keys or any relations in said environment. It merely stores username, IP address, and status for a specific application with the username being the unique key – eherr9633 Dec 04 '15 at 18:14
  • Might want to add that if using `prepare`, it should be called before the loop. Although on second thought, the OP didn't mention DBI, so maybe they're not even aware of it. – ThisSuitIsBlackNot Dec 04 '15 at 18:26
  • hey @ThisSuitIsBlackNot "I like !!" – Drew Dec 04 '15 at 18:27
  • `my $sql = " INSERT IGNORE INTO ( username, ip, status ) VALUES ( '$UN[0]', '$HOST', '$STATUS' )"; my $sth = $dbh->prepare( $sql ); $sth->execute or die "SQL ERROR: $DBI::errstr\n";` – eherr9633 Dec 04 '15 at 18:47
  • [Programming Perl with DBI](http://rio.murmansk.ru/doc_/dbi/ch05_03.htm), in particular Binding Parameters to Statements – Drew Dec 04 '15 at 18:47
  • So with bind_param, are you saying to put the SQL statement in the prepare outside the foreach loop, then inside the loop construct all the $sth->bind_param($UN[0], $HOST, $STATUS) calls, and then outside the foreach loop issue one $sth->execute()? – eherr9633 Dec 04 '15 at 19:05
  • I see that @ThisSuitIsBlackNot is a Perl guy; he is best to answer that one. I just gave a link to read. It has to do with a re-used `prepare` that does not have to re-endure the prepare overhead inside the loop, just the bind/execute part inside the loop (a sketch of that pattern is below). – Drew Dec 04 '15 at 19:10
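
To make that comment thread concrete, here is a minimal sketch of the re-used prepare: prepare() is called once outside the loop, execute() is still called once per row inside it, and only the bound values change (the table, columns, and $host/$status values echo the comments above and are placeholders):

    # prepare() once, outside the loop: the statement is parsed a single time.
    my $sth = $dbh->prepare(
        'INSERT IGNORE INTO users ( username, ip, status ) VALUES ( ?, ?, ? )'
    );

    # execute() once per row, inside the loop. Passing the values to execute()
    # binds them the same way explicit bind_param calls would.
    for my $elem (@line) {
        my $un = substr( $elem, 0, 10 );
        $sth->execute( $un, $host, $status )
            or die "SQL ERROR: $DBI::errstr\n";
    }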

In my opinion, it is a bit slow to make multiple queries. Better to construct a single UPDATE, INSERT, SELECT, or DELETE query and fire it once.

A few tips to consider before choosing between multiple queries and a single query:

1) If the database is configured to kill any query that takes longer than a specified time, then a single query that is too large can end up being killed.

2) Also, if a user is waiting for the response, it can be done with pagination, i.e., fetch a few records now and the rest later, but not one by one.
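
As a hedged sketch of that single-query idea for the insert case, chunked so no single statement grows unbounded (placeholder table and column names; $dbh, @line, $host, and $status as in the earlier sketches):

    my $chunk_size = 1000;    # keep each statement well under max_allowed_packet

    # Note: splice consumes @line as it goes.
    while ( my @chunk = splice @line, 0, $chunk_size ) {
        # One ( ?, ?, ? ) group per row in this chunk.
        my $placeholders = join ', ', ('( ?, ?, ? )') x @chunk;
        my @values = map { my $un = substr( $_, 0, 10 ); ( $un, $host, $status ) } @chunk;

        $dbh->do(
            "INSERT IGNORE INTO users ( username, ip, status ) VALUES $placeholders",
            undef, @values
        );
    }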

Naruto
  • There's also [`max_allowed_packet`](https://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html#sysvar_max_allowed_packet) to worry about. For large inserts, `LOAD DATA INFILE` will generally be the fastest approach, as [Drew mentioned](http://stackoverflow.com/a/34094639/176646). – ThisSuitIsBlackNot Dec 04 '15 at 19:06
  • but true, chunks of 100 or 1000 are the way to go by all means (short of a load data infile). insert into tableA (colA,colB) values (1,1), (7,4), (8,-19), .... got my support for sure – Drew Dec 04 '15 at 19:08

5000 queries into a database shouldn't be a performance bottleneck. You're fine. You can always benchmark a read-only run.
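
For instance, a rough way to time such a read-only run with core Perl's Time::HiRes (hypothetical query and table; $dbh and @line as before):

    use Time::HiRes qw(gettimeofday tv_interval);

    my $sth = $dbh->prepare('SELECT status FROM users WHERE username = ?');

    my $t0 = [gettimeofday];
    for my $elem (@line) {
        my $un = substr( $elem, 0, 10 );
        $sth->execute($un);
        $sth->fetchrow_arrayref;    # read-only: fetch and discard
        $sth->finish;
    }
    printf "%d lookups took %.2f seconds\n", scalar @line, tv_interval($t0);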

lee