0

I have the following nested-for-loop:

for i in range(len(prices)):
    for j in allCats:
        if prices[i] == 1.0 and j in categories[i]:
            prices[i] = allCats[j][1]

How can I refactor this to improve time complexity/performance?

Edit: prices is a list of float values, categories is a list of strings and allCats is a dict of lists.

javaMan
  • 35
  • 5
  • 1
    Make `categories[i]` a set. That will substantially improve the search time. I do not think you can improve anything else. – DYZ Jul 17 '21 at 23:16
  • Maybe have a look at multi threading, you could run multiple iterations of the loop in separate threads. Allowing iterations to execute at the same time –  Jul 17 '21 at 23:26
  • 1
    You probably could make the inner loop go through `categories[i]`, which I guess is shorter then allCats. But it is difficult to say exactly without seeing the structure of your data. – zch Jul 17 '21 at 23:36
  • 2
    Note that edit history is maintained on your post, so you do not need to write "edit", and can instead just edit your post as needed. However, please try to provide a [mre] as describing what your variables are is vague. Try to edit your code block so that it says `prices = [...]`, etc... – Kraigolas Jul 17 '21 at 23:50
  • 1
    javaMan are you sure you want to use `j in categories[i]` and not `j == categories[i]`? In your code, if `j` is 'apple' and `categories[i]` is 'pineapple', the price will be updated (assuming the price equaled 1.0) – Bart van Otterdijk Jul 18 '21 at 01:19

1 Answers1

2

List comprehensions are usually faster than equivalent logic in for loops, you can find lots of discussions on this topic, these are just examples: Why list comprehension can be faster than map() in Python? and https://www.linkedin.com/pulse/list-comprehension-python-always-faster-than-alex-falkovskiy/.

So my first thought is to rewrite your logic as a list comprehension.

If I understood your code correctly, you want to update prices for items priced at 1.0 to what is the new category price for that item.

Item category is in the categories list, and new prices for categories is in the AllCats dict, which you are retrieving using AllCats[j][0] where j is category name.

Next, I will use numpy arrays to create a filter based on list price, np_filter = np_prices == 1.0. Filter is then used in a list comprehension to retrieve updated price

np_prices[np_filter] = [allCats[c][0] for c in np_categories[np_filter]]

full code:

allCats = { 'a' : [7,2,3], 'b':[2,3,4]}
prices=[2,1.0,1.1,1.0,1.0]
categories = ['b','a','b','b','a']

import numpy as np
np_prices = np.array(prices)
np_categories = np.array(categories)

np_filter = np_prices == 1.0
np_prices[np_filter] = [allCats[c][0] for c in np_categories[np_filter]]

ofcourse you need to ensure that your allCats has entries for all values in categories so you dont get an out of bound error.

Mutaz-MSFT
  • 756
  • 5
  • 20