Recently, I have been trying to use the Google DLP API in Python 3 to classify the content of tables. I started by testing the API on small examples, which all worked perfectly. However, when I tried to send larger tables (1,000 rows x 18 columns, which is smaller than the 50,000 quota), the request would crash. After reducing the table to 100 rows I did manage to get it to run, but a single request of 100 rows takes approximately 10 seconds. Most values are fairly short; some of the columns are listed below, and a sketch of how I build the request follows the list:
- Address
- Date of birth
- First Name
- Gender
- Job Position
- Last Name
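For context, my request looks roughly like this (a minimal sketch, assuming the google-cloud-dlp 2.x client; the project ID, info types and the `headers`/`rows` variables are placeholders for my actual data):

```python
from google.cloud import dlp_v2

# Placeholders standing in for the real data described above.
project_id = "my-project"
headers = ["Address", "Date of birth", "First Name", "Gender", "Job Position", "Last Name"]
rows = [
    ["1 Main St", "1990-01-01", "Jane", "F", "Engineer", "Doe"],
    # ... ~100-1000 rows like this
]

dlp = dlp_v2.DlpServiceClient()
parent = f"projects/{project_id}"

# Build the structured DLP table item from the column names and row values.
item = {
    "table": {
        "headers": [{"name": name} for name in headers],
        "rows": [
            {"values": [{"string_value": str(cell)} for cell in row]}
            for row in rows
        ],
    }
}

inspect_config = {
    "info_types": [
        {"name": "STREET_ADDRESS"},
        {"name": "DATE_OF_BIRTH"},
        {"name": "FIRST_NAME"},
        {"name": "GENDER"},
        {"name": "LAST_NAME"},
    ],
    "include_quote": True,
}

# A single call like this takes ~10 seconds for 100 rows.
response = dlp.inspect_content(
    request={"parent": parent, "inspect_config": inspect_config, "item": item}
)

for finding in response.result.findings:
    print(finding.info_type.name, finding.quote)
```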
Furthermore, after some experimentation, I have noticed that if the same table is provided as a string in CSV format (columns separated by "," and rows by "\n"), the running time is reduced by a factor of 10.
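A minimal sketch of that faster variant, assuming the same client, config and placeholder data as above and only swapping the structured table for a plain string item:

```python
# Flatten the same data into one CSV-style string: columns joined by ","
# and rows joined by "\n".
csv_text = "\n".join(
    ",".join(str(cell) for cell in row) for row in [headers] + rows
)

# Same call, but with a plain text item instead of a structured table;
# this version runs roughly 10x faster for me.
response = dlp.inspect_content(
    request={
        "parent": parent,
        "inspect_config": inspect_config,
        "item": {"value": csv_text},
    }
)
```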
Is this normal behaviour? Or am I perhaps using the API incorrectly, which would explain the poor performance?
I hope my question is clear enough. Thanks for taking the time to read this! :)