They both seem exceedingly similar and I'm curious as to which package would be more beneficial for financial data analysis.
3 Answers
pandas provides high-level data manipulation tools built on top of NumPy. NumPy by itself is a fairly low-level tool, similar to MATLAB. pandas, on the other hand, provides rich time series functionality, data alignment, NA-friendly statistics, groupby, merge and join methods, and lots of other conveniences. It has become very popular in recent years in financial applications. I will have a chapter dedicated to financial data analysis using pandas in my upcoming book.
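As a rough illustration of a few of the conveniences listed above (data alignment, NA-friendly statistics, groupby), here is a minimal sketch. The series values, dates, and ticker names are made up for the example and are not from any real data set.

```python
import numpy as np
import pandas as pd

# Two hypothetical series with only partially overlapping dates.
a = pd.Series(
    [10.0, 11.0, 12.0],
    index=pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"]),
)
b = pd.Series(
    [1.0, 2.0, 3.0],
    index=pd.to_datetime(["2024-01-02", "2024-01-03", "2024-01-04"]),
)

# Data alignment: values are matched on their date labels, not on position.
# Dates present in only one series come out as NaN.
total = a + b

# NA-friendly statistics: pandas skips NaN by default,
# while a plain NumPy mean over the same values propagates it.
pandas_mean = total.mean()
numpy_mean = np.mean(total.to_numpy())

# groupby: aggregate a column per label (hypothetical tickers).
trades = pd.DataFrame(
    {"ticker": ["AAA", "BBB", "AAA", "BBB"], "qty": [100, 50, 200, 150]}
)
qty_by_ticker = trades.groupby("ticker")["qty"].sum()

print(total)
print(pandas_mean, numpy_mean)
print(qty_by_ticker)
```

With plain NumPy arrays, `a + b` would add element by element regardless of which dates the values belong to, so the alignment step would have to be done by hand.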

- You ought to have mentioned that you're the primary author of pandas. :) The book in question: http://shop.oreilly.com/product/0636920023784.do – Yktula Aug 21 '13 at 04:45
- Would it be fair to say that numpy primarily provides efficient arrays, whereas pandas provides efficient dictionaries? (In both cases, limited to consistent data type rather than free form.) To me (I am just beginning to look into it now), this strikes me as the underlying difference: handling of label-paired data (in 1d aka dicts and 2d aka tables). Data alignment, join, etc. all become *possible* due to this, but for people who don't grok that underlying difference it's not even clear what those mean (e.g., what is "data alignment" of two numpy arrays?). – Brandyn Jul 22 '14 at 19:56
- Maybe a goofy question, but what do you mean by `NA-friendly statistics`, mentioned in your answer? – Adil Abbasi Sep 16 '16 at 05:40
- I think he refers to statistics that take missing data (NA, "Not Available") into account. – Siva-Sg Sep 22 '16 at 09:46
- Cold thread, but what about performance differences between a complex operation in numpy, for instance, and the same operation simplified syntactically in pandas? Is there a performance cost to going the high-level, easy-syntax path? – 3pitt Mar 24 '18 at 20:14
- Failed to mention differences between numpy and pandas, like that numpy is much faster for many low-level applications. I am unsure why this was upvoted. – Matthaeus Gaius Caesar Jul 11 '21 at 18:54
NumPy is required by pandas (and by virtually all numerical tools for Python). SciPy is not strictly required for pandas but is listed as an "optional dependency". I wouldn't say that pandas is an alternative to NumPy and/or SciPy. Rather, it's an extra tool that provides a more streamlined way of working with numerical and tabular data in Python. You can use pandas data structures but freely draw on NumPy and SciPy functions to manipulate them.
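To make the last point concrete, here is a small sketch of NumPy functions applied to a pandas object. The prices are invented for illustration, and the example uses NumPy only so that it runs without SciPy installed.

```python
import numpy as np
import pandas as pd

# Hypothetical daily closing prices.
prices = pd.Series(
    [100.0, 102.0, 101.0, 105.0],
    index=pd.date_range("2024-01-01", periods=4, freq="D"),
    name="close",
)

# A NumPy ufunc applied to a Series returns a Series with the same index.
log_prices = np.log(prices)

# Log returns via pandas; the first value is NaN because there is no prior day.
log_returns = log_prices.diff()

# Dropping down to the underlying ndarray works too, at the cost of the labels.
raw = prices.to_numpy()
raw_log_returns = np.diff(np.log(raw))

print(log_returns)
print(raw_log_returns)
```

Both routes give the same numbers; the pandas version keeps the dates attached to each return, while the NumPy version yields a bare array one element shorter.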

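pandas offers a great way to manipulate tables: it makes binning easy (see "binning a dataframe in pandas in Python") and lets you calculate statistics. Another great feature is the Panel class, which lets you join a series of layers with different properties and combine them using the groupby function. Note that Panel was deprecated in pandas 0.20 and removed in 0.25; in current versions, a DataFrame with a MultiIndex covers the same use case.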
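A minimal sketch of the binning-plus-statistics workflow mentioned above, using `pd.cut` and `groupby`. The column names, bin edges, and labels are assumptions chosen for the example.

```python
import pandas as pd

# Hypothetical table of prices and traded volumes.
df = pd.DataFrame(
    {
        "price": [5.0, 15.0, 25.0, 35.0, 45.0, 55.0],
        "volume": [100, 200, 300, 400, 500, 600],
    }
)

# Bin prices into three right-closed intervals: (0, 20], (20, 40], (40, 60].
df["bucket"] = pd.cut(
    df["price"], bins=[0, 20, 40, 60], labels=["low", "mid", "high"]
)

# Per-bin statistics on volume; observed=True keeps only bins that occur.
stats = df.groupby("bucket", observed=True)["volume"].agg(["mean", "sum"])

print(stats)
```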
