I have stored 4 ndarrays in a dictionary dictPrices and would like to generate a boolean ndarray for each of 2 cases: (1) element-wise, whether the number in any of the 4 ndarrays exceeds x; (2) element-wise, whether the number in all of the 4 ndarrays exceeds x.

dictPrices[1] >= x works, but when I tried (dictPrices[1] >= x | dictPrices[2] >= x), it failed. (dictPrices[1] >= x or dictPrices[2] >= x) failed too.

As the ndarrays can be huge (they come from a Monte Carlo simulation), I was hoping for a vectorized approach rather than looping through each ndarray element-wise.

Thank you!

AiRiFiEd
  • Are you sure that all 4 arrays are the same shape? In what way does the example you tried fail? – wim Dec 14 '16 at 02:55
  • Hi wim, yup, they are of shape (7, 250000) as I was simulating 4 different price sets. The error thrown was `ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()`. `print((dictPrices[1]>=x or dictPrices[2]>=x).any())` does not work either. – AiRiFiEd Dec 14 '16 at 03:06
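
A note on why the two attempts in the question fail, as a minimal sketch only: the small arrays `a` and `b` below are made-up stand-ins for `dictPrices[1]` and `dictPrices[2]`. In Python, `|` binds more tightly than `>=`, so `a >= x | b >= x` is parsed as the chained comparison `a >= (x | b) >= x`, and `or` asks for the truth value of a whole array, which NumPy refuses to give. Parenthesizing each comparison and using `|` or `&` avoids both problems.

```python
import numpy as np

# Hypothetical stand-ins for dictPrices[1] and dictPrices[2]
a = np.array([[1.0, 5.0], [3.0, 0.5]])
b = np.array([[4.0, 0.0], [1.0, 0.2]])
x = 2.0

# Parenthesize each comparison so that | and & combine two boolean arrays
either = (a >= x) | (b >= x)
both = (a >= x) & (b >= x)

# `or` needs a single truth value for the left operand, which raises ValueError
try:
    (a >= x) or (b >= x)
    or_failed = False
except ValueError:
    or_failed = True
```

This pairwise form is fine for two arrays; chaining it by hand across four arrays works too, but gets verbose.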

1 Answer

I think you want this:

np.logical_or.reduce([prices >= x for prices in dictPrices.values()])

This is explained in some detail here: Numpy `logical_or` for more than two arguments

And of course for the second case you can use logical_and instead of logical_or.
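
For illustration, here is a small self-contained sketch of both cases. The random data, the shape (7, 5), and the threshold value are assumptions made up for the example; only `dictPrices` and `x` come from the question.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for dictPrices: 4 arrays of identical shape
dictPrices = {k: rng.uniform(0.0, 10.0, size=(7, 5)) for k in range(1, 5)}
x = 5.0  # assumed threshold

# One boolean mask per array; this loop runs only 4 times
masks = [prices >= x for prices in dictPrices.values()]

# Case (1): True where at least one of the 4 arrays is >= x
any_exceeds = np.logical_or.reduce(masks)

# Case (2): True where all 4 arrays are >= x
all_exceed = np.logical_and.reduce(masks)
```

Both results have the same shape as each input array, because `reduce` combines the masks along the first axis of the list (the axis of length 4).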

John Zwinck
  • Hey John, thanks for pointing me in the right direction! The code works and seems extremely efficient, with run time increasing by only 0.006s to go through 4 sets of (7, 250000). I am still trying to figure out from the link how np.logical_or.reduce works for multidimensional arrays. Also, it seems from your answer that there is actually a for loop in there, but in my limited experience a normal for loop is usually very expensive. Can I ask why this method is so fast to execute (I understand this as "functional programming")? Thank you! – AiRiFiEd Dec 14 '16 at 03:17
  • @AiRiFiEd: Well, there is a "for" loop in my code, but only over the 4 elements of `dictPrices`. A for loop over 4 items is not slow at all; what's slow is iterating over thousands or millions of rows. If you want to eliminate the for loop completely, you can rework your data structure into a single array with an extra dimension of length 4. But copying your data into that won't be worth it if the only reason is to avoid one or two loops. – John Zwinck Dec 14 '16 at 04:36
  • Thanks for the explanation! As I was recently reading a Python book on "functional programming", I tried to change your solution a little by using the `map` function: `np.logical_and.reduce( map(lambda prices:prices>=x, dictPrices.values()))`. Currently it's returning a memory address (``). Do you know if this would work, and would you expect an improvement in performance? Thanks so much for your help with this!! – AiRiFiEd Dec 14 '16 at 05:48
  • @AiRiFiEd: Don't bother. `reduce()` expects a concrete sequence, not a lazy iterator like `map()` gives you. You won't gain anything by heading down this path. But if you insist, you can do `reduce(list(map(...)))`. – John Zwinck Dec 14 '16 at 07:27
  • Apologies for the late reply. You are right: I tried both methods and the runtimes were about the same, so it's definitely not worth the effort. Thanks again for your guidance on these matters! – AiRiFiEd Dec 14 '16 at 14:52
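
The two variants discussed in the comments can be sketched as follows. This is only an illustration under assumed data (random arrays of shape (7, 5) and a made-up threshold), not a recommendation: stacking copies all the data into a new array, and the `map` version is wrapped in `list(...)` as suggested above.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-in for dictPrices: 4 arrays of identical shape
dictPrices = {k: rng.uniform(0.0, 10.0, size=(7, 5)) for k in range(1, 5)}
x = 5.0  # assumed threshold

# Variant A: one array with an extra leading axis of length 4 (copies the data)
stacked = np.stack(list(dictPrices.values()))  # shape (4, 7, 5)
any_stacked = (stacked >= x).any(axis=0)       # True where any array is >= x
all_stacked = (stacked >= x).all(axis=0)       # True where all arrays are >= x

# Variant B: map, materialized into a list before reducing
all_mapped = np.logical_and.reduce(
    list(map(lambda prices: prices >= x, dictPrices.values()))
)
```

Both variants should give the same masks as the list-comprehension version in the answer; the choice between them is mostly about how the data is already stored.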