1

I have a long list of floats (a few million). What is the fastest way to add a constant value to a specific subset of them, such as every eighth float? As far as I could find out, a for loop seems to be the way to go; is this really true? I tried to do it with slices, but I was not able to find a syntax that lets me modify the data. Maybe there is a better data container available than a list; unfortunately, I am not able to use numpy.

I am looking for something like this, but I could not find out how to do it:

float_values[slice(0,len(float_values),8)] += 5.0

If this is not possible, I could also split the list into eight separate lists and add a constant value to all floats in each of the eight lists.

  • 4
    "I am not able to use numpy" why not? – Julien Jul 25 '23 at 04:59
  • 3
    There is no way around using a loop. Either a slow native Python `for` loop, or speedy C++ vectorisation (still a `for` loop under the hood), e.g. numpy. – Julien Jul 25 '23 at 05:01
  • Not to be pedantic, but you said "add a constant value to all floats" while your example adds a constant to only every 8th value. Those are very different requirements. That your problem statement and example implementation differ on such a critical detail suggests you need to be more precise. As @Julien pointed out, pure Python does not support an efficient solution to do what you want. You have to use something like Numpy. – Kurtis Rader Jul 25 '23 at 05:21
  • @KurtisRader: Sorry, my fault. I will edit the question. – python_noob Jul 25 '23 at 05:37
  • @python_noob Regardless of whether you need to efficiently add a constant to every value, or to every 8th value, in a list, this is fundamentally a niche case for Python. An efficient solution requires something like a GPU to perform those operations in parallel. The Python language does not provide any means of expressing that intent (at least as I write this comment). You need to use something like the Numpy package to do that operation efficiently, or use a different language. – Kurtis Rader Jul 25 '23 at 05:46
  • If you have to use a list, there is nothing better than a for loop, which is extremely slow. A much faster option would be to use numpy. If that is not possible (why?), then use some other specific solution which will solve the problem at a lower level, but since you have not explained why numpy is not an option, it's impossible to guess what is an option. If nothing else works, the solution might be to not do the addition at all, but do some sort of a lazy addition instead, but again, we don't have enough info. – zvone Jul 25 '23 at 05:54
  • 1
    this could be an [XY Problem](https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem) - what are you really trying to do? – ti7 Jul 25 '23 at 06:10
  • I upvoted this question. OP may or may not want to share why they cannot use numpy, but as things stand, "what is the fastest way to [perform the described operation] without numpy" is a valid question. – Felix Fourcolor Jul 25 '23 at 06:10
  • 1
    Sorry, I thought I had explained in a comment why I am not able to use numpy, but something got lost. I cannot use it because I am not working in a pure Python environment but in a self-written interpreter that uses Python and C++ libraries included via bindings. Qt is also available. I cannot give the whole technical background because I do not know it, but the developers of this framework told me that numpy cannot be used within it. I am interested in this "lazy addition" but could not find anything about it on the web. – python_noob Jul 25 '23 at 08:21
  • @zvone: Could you explain to me what you mean with "lower level" or "lazy addition"? Maybe this could solve my problem? – python_noob Jul 26 '23 at 04:33
  • Lazy addition would mean not adding at all, but instead having an object which remembers that a value should be added to every eighth number. Whether that makes sense here depends on what you are doing, but you have not provided anywhere near enough information for that to be clear. – zvone Jul 26 '23 at 06:50

2 Answers

0

The comments referring to "lazy" addition likely mean that you could redefine retrieval of the values so that the increment is added at the point of reading a value by index (or while iterating), rather than going through every member up front.

You likely want to avoid

class LazyIncrementList(list):
    """ implement a list which can lazily increment members
    """
    def __init__(self, /, *args, **kwargs):
        self.period    = kwargs.pop("period", None)     # every Nth value
        self.increment = kwargs.pop("increment", None)  # to add or subtract
        self.offset    = kwargs.pop("offset", 0)        # roll the period
        if any((self.period, self.increment)) and not all((self.period, self.increment)):
            raise TypeError(f"period({self.period}) and increment({self.increment}) must be given together if either is provided")
        # consider just all() or raise for None as caller could just use a normal list
        super().__init__(*args, **kwargs)  # list init
    def __repr__(self):
        return f"{type(self).__name__}<{len(self)}>"
    def __getitem__(self, idx, /):
        if isinstance(idx, slice):
            # apply the increment per element; note this returns a plain list
            return [self[n] for n in range(*idx.indices(len(self)))]
        value = super().__getitem__(idx)  # raises IndexError when out of range
        if self.period is None:
            return value
        if idx < 0:
            idx += len(self)  # normalize negative indexes before the period check
        if (idx + self.offset) % self.period == 0:
            return value + self.increment
        return value
    def __iter__(self):
        if self.period is None:
            return super().__iter__()
        return iter(self[n] for n in range(len(self)))
    def __contains__(self, key):
        if self.period is None:
            return super().__contains__(key)
        # caution: this will be much slower than a list!
        #   consider NotImplementedError or internal set() wrapper for small spreads
        for value in self:
            if key == value:
                return True
        return False

Example Usage

>>> list(LazyIncrementList((1, 2, 3), period=5, increment=10, offset=5))
[11, 2, 3]
>>> l = LazyIncrementList((3.0, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.7, 3.8, 3.9), period=8, increment=0.5)
>>> l
LazyIncrementList<10>
>>> list(l)
[3.5, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.7, 4.3, 3.9]
>>> l[0], l[8]
(3.5, 4.3)
>>> 3.8 in l
False
>>> 4.3 in l
True
>>> l[8] = 12.1
>>> l[8]
12.6
>>> l[8]
12.6

Note that with a few million values, and if you eventually use every value, just looping and incrementing is probably fine: it is surprisingly quick (a few seconds at most) and also obvious. However, there might be a lot of middle ground, such as lazily reading the values from disk line by line, but we need more details about your problem to know whether this would be an improvement.

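def read_values(path):
    """lazily yield floats from a file holding one value per line"""
    with open(path) as fh:
        for index, line in enumerate(fh):
            value = float(line)
            if index % 8 == 0:
                value += 0.5  # increment every eighth value while reading
            yield value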
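
For the plain in-memory case, the loop only needs to visit every eighth index rather than test every element. A minimal sketch, assuming the data is an ordinary list; the function name `add_to_every_nth_loop` and its parameters are made up for illustration:

```python
def add_to_every_nth_loop(values, constant, step=8, start=0):
    """Add `constant` in place to values[start], values[start + step], ..."""
    # range() with a step skips the untouched elements entirely,
    # so only len(values) / step iterations are executed
    for i in range(start, len(values), step):
        values[i] += constant
    return values


float_values = [0.0] * 20
add_to_every_nth_loop(float_values, 5.0)  # touches indexes 0, 8 and 16
```

Whether this is fast enough depends on the interpreter in use, so it is worth timing on real data before trying anything more elaborate.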

Finally, what's "best" would change further if you had very large in-memory objects rather than humble floats (say, images you were manipulating), if your values were located on a slow network resource, or if there was too much data for your system memory. Then you might consider breaking the work up into jobs for threads or an async pool to work through, or calling out to a database.

ti7
-1

Using numpy would be more efficient, but if you cannot use numpy, you can use a list comprehension. If you do not mind building a new list instead of modifying the original in place, you can try the code below:

constant = 5.0

float_values = [val + constant if i % 8 == 0 else val for i, val in enumerate(float_values)]

This creates a new list where every 8th value is increased by the constant, and all other values are kept the same.

If you want to add the constant to every 8th element counting the first element as 1 (i.e., indices 7, 15, 23, ... in 0-based indexing), change the check to (i+1) % 8 == 0:

constant = 5.0

float_values = [val + constant if (i+1) % 8 == 0 else val for i, val in enumerate(float_values)]

Though, I agree with the comments that using numpy will be easier.
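
Because the comprehension builds a second full-size list, an in-place variant may be worth trying: extended slice assignment reads and writes only every eighth element. This is a sketch rather than a measured recommendation. The helper name `add_to_every_nth` is hypothetical, and the `array` part assumes the standard-library `array` module is available in your interpreter, which may not hold in a custom environment.

```python
from array import array  # standard library; its availability here is an assumption


def add_to_every_nth(values, constant, step=8, start=0):
    """Add `constant` in place to values[start], values[start + step], ..."""
    picked = values[start::step]               # copy of only the affected elements
    updated = [v + constant for v in picked]
    if isinstance(values, array):
        # an array slice only accepts another array of the same typecode
        updated = array(values.typecode, updated)
    values[start::step] = updated              # extended slice assignment, same length
    return values


float_values = [float(i) for i in range(20)]
add_to_every_nth(float_values, 5.0)            # changes indexes 0, 8 and 16

packed = array("d", float_values)              # compact C doubles instead of float objects
add_to_every_nth(packed, 5.0)
```

The temporary copies hold only one eighth of the data, so the extra memory is much smaller than for a full new list. An `array("d")` also stores the values more compactly than a list of float objects, which addresses the "better data container" part of the question if the module can be imported.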

aspen
  • The question was how to do it more efficiently than using a for loop. A list comprehension won't help. In this specific case, it actually makes it slower and requires double the memory. – zvone Jul 25 '23 at 05:56