In [1]: import json

In [2]: path = 'ch02/usagov_bitly_data2012-03-16-1331923249.txt'

In [3]: from pandas import DataFrame, Series;

In [4]: records = [json.loads(line) for line in open(path)]

In [5]: frame = DataFrame(records)

In [6]: frame['tz'][:10]
Segmentation fault: 11

Any access to `frame` results in a segfault. I have already upgraded to Python 2.7.6 RC1; the same crash occurred on 2.7.5, and it also occurs outside IPython. What can I do?

rjurney
  • How big is the `usagov` data? Could you make a smaller version of it that can reproduce the problem? – David Robinson Nov 01 '13 at 07:44
  • This is a tutorial in a book. That is not the issue, the data is not big. The data is visible here: https://github.com/pydata/pydata-book/blob/master/ch02/usagov_bitly_data2012-03-16-1331923249.txt – rjurney Nov 01 '13 at 09:03
  • Can't reproduce on Mac OS 10.7.5, Python 2.7.3. – David Robinson Nov 01 '13 at 09:21
  • This is a Mavericks-specific issue (OS X 10.9) :( http://stackoverflow.com/questions/19531969/segmentation-fault-11-in-os-x – rjurney Nov 01 '13 at 09:27
  • This does not appear to be related to the 10.9 readline segfault problem noted in the other question. Most likely it has something to do with how pandas or numpy or some other package used by pandas was built and installed. – Ned Deily Nov 01 '13 at 17:13
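
The smaller reproduction suggested in the comments could be produced along these lines. This is only a sketch: the helper name `write_head` and the default of 100 lines are assumptions made for illustration, not something from the original post.

```python
import itertools


def write_head(src_path, dst_path, n_lines=100):
    """Copy the first n_lines lines of src_path to dst_path.

    Returns the number of lines actually copied (fewer than n_lines
    if the source file is shorter).
    """
    copied = 0
    with open(src_path) as src, open(dst_path, "w") as dst:
        # islice stops after n_lines without reading the rest of the file
        for line in itertools.islice(src, n_lines):
            dst.write(line)
            copied += 1
    return copied
```

Loading the shortened file with the same `json.loads` / `DataFrame` steps would show whether the crash depends on the data at all, or happens with any input.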

1 Answer

This appears to be a bug in NumPy that has been fixed recently. See https://github.com/numpy/numpy/issues/3962
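
To check which builds are actually being picked up before and after upgrading, something like the following may help. It is a minimal sketch, not an official diagnostic: the helper name `installed_versions` is hypothetical, and it only reports each package's `__version__` attribute, which has to be compared by hand against the linked issue.

```python
import importlib


def installed_versions(names=("numpy", "pandas")):
    """Map each module name to its __version__ string.

    A module that cannot be imported maps to None; a module that
    imports but has no __version__ attribute maps to "unknown".
    """
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
            continue
        versions[name] = getattr(module, "__version__", "unknown")
    return versions
```

Calling `installed_versions()` in the same interpreter that segfaults shows the NumPy and pandas versions in use there, which matters if more than one Python installation is present on the machine.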

Ned Deily