I have a scenario where I have to obfuscate data(=scramble, for testing purposes, so it is not possible to see the real data, there is no need on unscramble/unobfuscate it) in database. There are several tables that are referencing the address_table. I can not obfuscate the address_table, so I figured that I simply change the references in those tables with random other address_table ID-s. The address_table contains 6M+ records. So I would create a temp table with all the address ID-s and then, when needed call some function to get a random one from there. So I could possibly generate a random value and take that row like:
Select * From (
Select Id, Rownum Rn From myTempTable )
WHERE RN = x;
where x is some random value generated by dbms_random. Now, although this is what I need, it does not perform anything near to what I expect. Other thing I have tried is to call the sample() function, this (at least on small table) performs I bit better, but it is not good enough.
I know there are several threads on this matter like this or this on mySql, but they do not directly answer it in terms of performance.
Also, I am not limited in using pl/sql. I know a very little of pl/sql, how is it in terms of performance? I mean, it is just another process in DB server processing queue, perhaps i could get better performance doing the processing (i mean generating the update scripts, populating randoms etcetc) on client side using something like python, even considering network latency etc? Does anybody have any experience on this?