
I've been using Python 2.7 to scrape data from the web and store it in MongoDB. Some of the data needs to be serialized (pandas DataFrames), so I've been pickling it in Python 2.7.

I've now written some new scripts in Python 3, but I'm having compatibility issues unpickling the data (as noted in other posts such as Unpickling a python 2 object with python 3). Those solutions don't work for me because they focus on reading the pickle file from disk, whereas my data is coming directly out of Mongo.

Here is some example code:

Storing data in Python 2.7:

pickled_data = pickle.dumps(scraped_data)
local_city.update({'location_name':'Boston'}, {"$set": {"Weather": pickled_data}})
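
For completeness, here is a variant of the storing code I could switch to on the 2.7 side (an untested sketch): wrapping the pickle in bson.Binary and using pickle protocol 2 should make Mongo hand the data back as bytes in Python 3.

import pickle
from bson import Binary  # Binary ships with PyMongo's bson package

# Protocol 2 is the newest pickle protocol that both Python 2.7 and Python 3 can read;
# wrapping the result in Binary stores it as BSON binary data instead of a string.
pickled_data = Binary(pickle.dumps(scraped_data, protocol=2))
local_city.update({'location_name': 'Boston'}, {"$set": {"Weather": pickled_data}})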

(Attempted) unpickling of the data in Python 3:

db_cursor = local_city.find_one({"location_name": 'Boston'})
unpickled_data = pickle.loads(db_cursor["Weather"], fix_imports=True)

I tried using:

unpickled_data = pickle.loads(db_cursor["Weather"], fix_imports=True)

Error msg

TypeError: a bytes-like object is required, not 'str'

unpickled_data = pickle.loads(db_cursor["Weather"], fix_imports=True, encoding='bytes')

Error msg

TypeError: file must have 'read' and 'readline' attributes

unpickled_data = pickle.loads(db_cursor["Weather"], fix_imports=True, encoding='latin1')

Error msg

TypeError: file must have 'read' and 'readline' attributes

So I am wondering if there is a way to pickle in 2.7 (and store the result in Mongo) that can easily be unpickled in Python 3.
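
For what it's worth, one workaround I have been considering (not tested yet) is to turn the returned string back into bytes on the Python 3 side before unpickling, on the assumption that the protocol-0 pickle got stored as a plain string in Mongo:

import pickle

raw = db_cursor["Weather"]
if isinstance(raw, str):
    # latin-1 maps code points 0-255 straight back to the original byte values
    raw = raw.encode('latin-1')
unpickled_data = pickle.loads(raw, fix_imports=True, encoding='latin-1')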

Thanks

  • A hint of how you're handling your pickles and what problems exactly you're facing wouldn't hurt. In other words a [mcve]. – Ilja Everilä Jan 06 '18 at 12:22
  • Added example code – Thomas Carlson Jan 06 '18 at 21:34

1 Answer


If I understood your question correctly, you can keep storing the scraped data in Mongo with Python 2.7 and then export the records to a text file (be careful about the structure of the data when writing it out; use a standard format such as CSV). You can then read that text file from Python 3.

mongoexport --host localhost --db dbname --collection collectionname --type=csv --out name.txt --fields name,id,etc (list the fields separated by commas, with no spaces)

This command saves the data to a text file in standard CSV format (text files can hold large amounts of content).
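
On the Python 3 side you could then read the exported file back in, for example with pandas (a rough sketch, assuming the export above produced name.txt with a header row):

import pandas as pd

# mongoexport's CSV output includes a header line by default
weather_df = pd.read_csv('name.txt')
print(weather_df.head())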

jas01203