This is a follow-up to "Read SHP file from SFTP using pysftp"; see that question for more context.

I am trying to use pyshp and pysftp to read a shapefile and convert it to a GeoPandas GeoDataFrame. This worked for every file I tested until the following error occurred.

Code:

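import shapefile  # pyshp
from shapely.geometry import shape

# shp, shx and dbf are file-like objects opened over SFTP (see the linked question)
r = shapefile.Reader(shp=shp, shx=shx, dbf=dbf)
# Skip the first field, which is the DeletionFlag
fields = [field[0] for field in r.fields[1:]]

attributes = []
geometry = []

for row in r.shapeRecords():
    geometry.append(shape(row.shape.__geo_interface__))
    attributes.append(dict(zip(fields, row.record)))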

Error:


~\miniconda3\lib\site-packages\shapefile.py in __shape(self)
   1039         record = Shape()
   1040         nParts = nPoints = zmin = zmax = mmin = mmax = None
-> 1041         (recNum, recLength) = unpack(">2i", f.read(8))
   1042         # Determine the start of the next record
   1043         next = f.tell() + (2 * recLength)

~\miniconda3\lib\site-packages\paramiko\file.py in read(self, size)

~\miniconda3\lib\site-packages\paramiko\sftp_file.py in _read(self, size)

~\miniconda3\lib\site-packages\paramiko\sftp_client.py in _request(self, t, *arg)

~\miniconda3\lib\site-packages\paramiko\sftp_client.py in _async_request(self, fileobj, t, *arg)

~\miniconda3\lib\site-packages\paramiko\message.py in add_int64(self, n)

error: int too large to convert

Is there a way to convert this int or perform this in chunks to avoid the error? The file that caused the error is not especially large (<2MB).
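For context on the last frame of the traceback: the message matches what Python's `struct` module raises when an integer does not fit a fixed-width field. The sketch below assumes that paramiko's `add_int64` packs its argument as an unsigned 64-bit value (`">Q"`), which would mean the value being sent (a read offset, for example) was negative or above 2**64 - 1. That is an assumption, not something the traceback confirms, and `fits_uint64` is a hypothetical helper used only for illustration.

```python
import struct


def fits_uint64(n):
    """Return True if n can be packed as a big-endian unsigned 64-bit integer."""
    try:
        struct.pack(">Q", n)
        return True
    except struct.error:
        # Raised for negative values and for values above 2**64 - 1.
        return False


print(fits_uint64(2_000_000))  # a plausible offset inside a <2 MB file
print(fits_uint64(-1))         # a negative offset
print(fits_uint64(2**64))      # one past the largest unsigned 64-bit value
```

If that assumption holds, the size of the file matters less than the offset or length being requested from the server, which may be why a small file can still trigger the error.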

jtownsend

2 Answers

The issue seemed to be the handling of NULL geometries in the shapefile.

While I don't fully understand why this caused an issue, according to the documentation (here) pyshp "handles shapes with no coords and represent as geojson with no coords (GeoJSON null-equivalent)".

Skipping the null geoms when adding records to the 'attributes' and 'geometry' lists seemed to fix the error:

fields = [field[0] for field in r.fields[1:]]

attributes = []
geometry = []

for row in r.shapeRecords():
    # Shape type 0 is a null shape: skip the whole row
    if row.shape.shapeType == 0:
        continue
    geometry.append(shape(row.shape.__geo_interface__))
    attributes.append(dict(zip(fields, row.record)))

This is not the most satisfactory solution, but it may be useful, and someone else may be able to explain why it works.
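A variation on the loop above, in case dropping rows is undesirable: keep the attributes of null shapes and store `None` as their geometry, which GeoPandas treats as a missing geometry. This is only a sketch and has not been checked against the failing file. `split_rows` and its `to_geometry` parameter are hypothetical names, and the stand-in objects mimic only the attributes of pyshp's shape records used in this answer (`shape.shapeType`, `shape.__geo_interface__`, `record`).

```python
from types import SimpleNamespace

NULL_SHAPE = 0  # shape type code for a null shape, as tested in the answer


def split_rows(shape_records, fields, to_geometry):
    """Build parallel attribute/geometry lists, keeping null shapes as None."""
    attributes, geometry = [], []
    for row in shape_records:
        if row.shape.shapeType == NULL_SHAPE:
            geometry.append(None)  # keep the row, mark its geometry as missing
        else:
            geometry.append(to_geometry(row.shape.__geo_interface__))
        attributes.append(dict(zip(fields, row.record)))
    return attributes, geometry


# Stand-ins for pyshp records so the sketch runs without a shapefile.
point = SimpleNamespace(
    shapeType=1,
    __geo_interface__={"type": "Point", "coordinates": (1.0, 2.0)},
)
null = SimpleNamespace(shapeType=0)
rows = [
    SimpleNamespace(shape=point, record=["a", 1]),
    SimpleNamespace(shape=null, record=["b", 2]),
]

# With real data, pass r.shapeRecords() and shapely.geometry.shape instead.
attributes, geometry = split_rows(rows, ["name", "value"], to_geometry=dict)
print(attributes)
print(geometry)
```

Because both lists stay the same length, they can still be combined into one GeoDataFrame, with the null rows carrying attributes but no geometry.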

jtownsend

This may be because a variable exceeds sys.maxsize. Compare sys.maxsize with the size of the variable:

import sys
sys.maxsize

The output I got:

9223372036854775807

To check the size of the variable:

import sys
sys.getsizeof(r)
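One caveat when reading these numbers, shown as a small sketch: `sys.getsizeof` reports an object's memory footprint in bytes, not the magnitude of an integer, so the two checks above measure different things. Python integers are also arbitrary precision, so a value can exceed `sys.maxsize` without any error until it is packed into a fixed-width field. The variable `big` is purely illustrative.

```python
import sys

big = sys.maxsize + 1      # Python ints grow as needed, so this does not overflow
print(big > sys.maxsize)   # compares the numeric values
print(sys.getsizeof(big))  # bytes used by the int object, not its value
print(big.bit_length())    # bits needed to represent the value
```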

To fix this, please redeclare the variable's type as int64:

dtype=variable.int64

If this does not solve the error, please let me know.