
I have a text file containing, for example, 1084 values, which I read into a list.

import csv

a = []
with open('example.txt', 'r') as csvfile:
    # Tab-separated file; the values are in the second column.
    reader = csv.reader(csvfile, delimiter='\t')
    for row in reader:
        a.append(int(row[1]))
print(a)

[144, 67, 5, 23, 64...456, 78, 124]

Next, I need to take the average of every hundred elements of the list, plus the average of the last 84 elements, and put the results in a new list. How exactly can I do this? Maybe with numpy?

smci
Prap Prep
  • To chunk your list of length 1084 into chunks of size 100, use this solution [How do you split a list into evenly sized chunks?](https://stackoverflow.com/a/22045226/202229). The rest is trivial. You don't need numpy/scipy at all. – smci Nov 19 '18 at 11:30
  • @bunji: OP doesn't need numpy/scipy. See the solution I referenced for chunking a list. – smci Nov 19 '18 at 11:32
  • OP: If the solution **must** use numpy(/pandas), please edit that into the title to make it clear why this should not be closed as a duplicate. If not, do we mark this as a dupe, or else leave it open to reflect both (base Python + numpy) approaches? – smci Nov 19 '18 at 11:53

3 Answers


This will work in Python 3, which I'm going to assume you're using since you wrote print as a function. In Python 2 you'll need to make sure that the division is floating point division.

# a = [...]

# Separate the groups. The last slice will be fine with less than 100 numbers.
groups = [a[x:x+100] for x in range(0, len(a), 100)]

# Simple math to calculate the means
means = [sum(group)/len(group) for group in groups]

If you do want to do this in Python 2, you can replace len(group) in the last line with float(len(group)), so that it forces floating point division; or, you can add from __future__ import division at the top of your module, although doing that will change it for the whole module, not just this line.
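As a quick sanity check of the slicing approach above, here is a minimal sketch in Python 3. The input list is a hypothetical stand-in (the integers 1 to 1084) rather than the data from the question's file, and `CHUNK_SIZE` is an illustrative name.

```python
# Hypothetical stand-in for the list read from the file: 1084 integers.
a = list(range(1, 1085))

CHUNK_SIZE = 100  # illustrative name, not from the original answer

# Slice the list every CHUNK_SIZE elements; the final slice is simply shorter.
groups = [a[i:i + CHUNK_SIZE] for i in range(0, len(a), CHUNK_SIZE)]
means = [sum(group) / len(group) for group in groups]

print(len(groups))           # 11 (ten full groups plus one partial group)
print(len(groups[-1]))       # 84
print(means[0], means[-1])   # 50.5 1042.5
```

The last group needs no special handling because slicing past the end of a list just returns the elements that remain.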


Here's a nice, reusable version that works with iterators, and is itself a generator:

from itertools import islice

def means_of_slices(iterable, slice_size):
    iterator = iter(iterable)
    while True:
        # Named `chunk` to avoid shadowing the built-in `slice`.
        chunk = list(islice(iterator, slice_size))
        if not chunk:
            return  # iterator exhausted
        yield sum(chunk) / len(chunk)

a = [1, 2, 3, 4, 5, 6]

means = list(means_of_slices(a, 2))

print(means)
# [1.5, 3.5, 5.5]

You shouldn't make code much more general than you actually need, but for bonus experience points you could make this even more general and composable. The above should be more than enough for your use, though.
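One possible way to make it more composable, sketched here as an assumption about what that could look like, is to separate the chunking from the averaging. The names `chunks` and `mean` are hypothetical helpers, not part of the answer above.

```python
from itertools import islice

def chunks(iterable, size):
    """Yield successive lists of up to `size` items from any iterable."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return  # nothing left to yield
        yield chunk

def mean(values):
    """Arithmetic mean of a non-empty sequence (true division in Python 3)."""
    return sum(values) / len(values)

a = [1, 2, 3, 4, 5, 6, 7]
means = [mean(c) for c in chunks(a, 2)]
print(means)  # [1.5, 3.5, 5.5, 7.0]
```

With the two pieces split apart, the same `chunks` generator can feed any other per-chunk function, such as `max` or `sum`.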

Cyphase
  • It's always better to add the future import, rather than casting to float. The future import even works on very old versions, at least 2.4, if not earlier. – PM 2Ring Nov 19 '18 at 11:59
  • Agreed, if you don't have a reason not to do it (e.g. legacy code), then the import is better. Sometimes you have to draw a line in what you mention in an answer though, or else you get to, "But really you should just be using Python 3 because.." I'm glad you mentioned it here. Thanks. – Cyphase Nov 19 '18 at 12:05

Simply slice the list into chunks and average each one.

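import math

# a = [...]
averages = []
CHUNK_SIZE = 100

# ceil() rounds up so that the final, shorter chunk is included.
for i in range(math.ceil(len(a) / CHUNK_SIZE)):
    start_index = i * CHUNK_SIZE
    end_index = start_index + CHUNK_SIZE
    # Slicing past the end of the list is safe; the last chunk is just shorter.
    chunk = a[start_index:end_index]
    averages.append(sum(chunk) / len(chunk))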
amirathi

You can do this in base Python using the recipe from "How do you split a list into evenly sized chunks?" by senderle:

from itertools import islice

a = range(1, 1085)  # 1084 elements: 1..1084

def chunk(it, size):
    it = iter(it)
    return iter(lambda: tuple(islice(it, size)), ())

means = [sum(chk)/len(chk) for chk in chunk(a, size=100)]

# [50.5, 150.5, 250.5, 350.5, 450.5, 550.5, 650.5, 750.5, 850.5, 950.5, 1042.5]
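The recipe relies on the two-argument form of `iter`, which keeps calling the lambda until it returns the sentinel value, here the empty tuple `()`. A minimal sketch on a small, hypothetical input shows the chunks it produces:

```python
from itertools import islice

def chunk(it, size):
    it = iter(it)
    # iter(callable, sentinel): call the lambda repeatedly and stop as soon
    # as it returns the sentinel (), i.e. when the input is exhausted.
    return iter(lambda: tuple(islice(it, size)), ())

pieces = list(chunk(range(1, 8), size=3))
print(pieces)  # [(1, 2, 3), (4, 5, 6), (7,)]
```

The final tuple is shorter than `size`, which is why the last mean in the answer is taken over fewer elements.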
smci