
I have some variables that I need to insert into a Postgres database. Each variable is 4-D, with the dimensions time, level, latitude, and longitude.

For example:

print(sulphate_aerosol[0][1][400][367])
>> 3.539193384838e-06

I have 13 variables (for now!) and I need to iterate over each one, extract the data and insert it into a Postgres database.

I could do:

for i in range(datalength):
    for j in range(latlength):
        for k in range(longlength):
            for l in range(levellength):
                insert(myVar[i][j][k][l])

But that is probably going to be slower than some of the faster methods you Python gurus can come up with.

I also think it would probably be a good idea to store the values in an array and do a bulk insert, as shown here, so any advice on that would also be appreciated.

pookie

1 Answer


Well, if you cannot improve insert(..) so that it works with bulk data, the time complexity will of course remain the same: you cannot iterate over a full 4-D array without visiting each of its elements.

You can however improve the constant factor a bit, since here, for each element you perform:

myVar[i][j][k][l]

that is four index lookups per element. They are not all necessary: in the innermost loop, myVar[i][j][k] always stays the same, so you can avoid the repeated lookups by iterating over the subarrays directly:

for myvar_i in myVar:
    for myvar_ij in myvar_i:
        for myvar_ijk in myvar_ij:
            for myvar_ijkl in myvar_ijk:
                insert(myvar_ijkl)
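If the target table also needs the time/level/latitude/longitude indices as columns (which the question's attribute list suggests), `enumerate` keeps them available while still avoiding the repeated lookups. A minimal sketch with a toy nested list standing in for the real variable:

```python
# Toy 2x2x2x2 dataset standing in for a real variable like sulphate_aerosol.
myvar = [[[[round((i + j + k + l) * 0.1, 1) for l in range(2)]
           for k in range(2)]
          for j in range(2)]
         for i in range(2)]

# Collect (time, level, lat, lon, value) rows; each subarray is looked up once.
rows = []
for i, myvar_i in enumerate(myvar):
    for j, myvar_ij in enumerate(myvar_i):
        for k, myvar_ijk in enumerate(myvar_ij):
            for l, value in enumerate(myvar_ijk):
                rows.append((i, j, k, l, value))
```

The resulting `rows` list maps directly onto a five-column table.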

As for the bulk insert, you can indeed construct a list first. Something like:

result = []
for myvar_i in myvar:
    for myvar_ij in myvar_i:
        for myvar_ijk in myvar_ij:
            # extend result with the innermost 1-D slice of values
            result += myvar_ijk
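If the variables are NumPy arrays (typical when the data comes from NetCDF files, though the question does not say), the flattening needs no Python loops at all. A sketch, assuming `myvar` is an `ndarray`:

```python
import numpy as np

# Toy 4-D array; real data would come from the NetCDF variable.
myvar = np.arange(2 * 2 * 2 * 2, dtype=float).reshape(2, 2, 2, 2)

# ravel() flattens in C order (last axis varying fastest), which matches
# the element order produced by the nested loops above.
result = myvar.ravel().tolist()
```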

And then call it with:

bulk_insert(result)
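For the database round trip itself, one possibility (the question names no driver, so this is an assumption) is psycopg2's `execute_values`, which sends many rows per statement; the table and column names below are placeholders:

```python
def as_rows(values):
    """Shape a flat list of floats into 1-tuples for execute_values."""
    return [(v,) for v in values]

def bulk_insert(values, dsn="dbname=climate"):
    # Deferred import so as_rows stays usable without psycopg2 installed.
    import psycopg2
    from psycopg2.extras import execute_values
    with psycopg2.connect(dsn) as conn:
        with conn.cursor() as cur:
            execute_values(
                cur,
                "INSERT INTO sulphate_aerosol (value) VALUES %s",
                as_rows(values),
            )
```

`execute_values` batches the rows (100 per statement by default), which is far cheaper than one `INSERT` per element.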
Willem Van Onsem