I posted a similar question about an hour ago, but have since deleted it after realising I was asking the wrong question. I have the following pickled defaultdict:
ccollections
defaultdict
p0
(c__builtin__
list
p1
tp2
Rp3
V"I love that"
p4
(lp5
S'05-Aug-13 10:17'
p6
aS'05-Aug-13 10:17'
When using Hadoop, the input is always read in using:
for line in sys.stdin:
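For context, a bare-bones Streaming mapper normally looks something like this (the key/value it emits is just a placeholder):

#!/usr/bin/env python
import sys

# Hadoop Streaming pipes each line of the input split to stdin.
for line in sys.stdin:
    line = line.strip()
    # emit a tab-separated key/value pair for the reducer
    print '%s\t%s' % (line, 1)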
I tried reading the pickled defaultdict using this:
myDict = pickle.load(sys.stdin)
for text, date in myDict.iteritems():
But to no avail. The rest of the code works as I tested it locally using .load('filename.txt'). Am I doing this wrong? How can I load the information?
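For reference, the local test that works looks roughly like this (the filename and the open() call are placeholders for however the file is actually opened):

import pickle

# Local version: unpickle from an open file object rather than sys.stdin.
with open('filename.txt', 'rb') as f:
    myDict = pickle.load(f)

for text, dates in myDict.iteritems():
    print text, dates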
Update:
After following an online tutorial, I have amended my code to this:
def read_input(file):
    for line in file:
        print line

def main(separator='\t'):
    myDict = read_input(sys.stdin)
This prints out each line, showing it is successfully reading the file; however, no semblance of the defaultdict structure is kept, with this output:
p769
aS'05-Aug-13 10:19'
p770
aS'05-Aug-13 15:19'
p771
as"I love that"
Obviously this is no good. Does anybody have any suggestions?
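To be clear, what I am ultimately trying to do inside the mapper is roughly this (untested sketch; it assumes the whole stdin stream is a single pickle dump):

import sys
import pickle

# Read the entire stream in one go instead of line by line,
# then unpickle it back into the defaultdict.
data = sys.stdin.read()
myDict = pickle.loads(data)

for text, dates in myDict.iteritems():
    # each value should be a list of date strings like '05-Aug-13 10:17'
    print '%s\t%s' % (text, dates)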