I have this code:

dd = input("What is your desired degree for slices? ")
listnum = int(360 / dd)

a = OrderedDict()
for x in range(0,listnum):
    for j in range(len(posvcs)):
        begin = 0
        end = dd
        if (posvcs[j,2] >= begin) & (posvcs[j,2] < end):
            a["dataset{0}".format(x)] = posvcs[begin:end,2]
            begin = begin + end
            end = begin + dd

It's supposed to look through column 2 of "posvcs", row by row, for values between two bounds in degrees. For example, if I want to cut my pie into 8 pieces in 45-degree increments and my data set (posvcs) has points with degree values between 0 and 360, I want it to first look only at angles from 0 to 45, take the corresponding rows and columns, and put that data into "dataset0". Then it should put the data from 45 to 90 degrees into "dataset1".

What it actually does is take "dd" rows (so in my example, the first 45 rows) and save only that data. And instead of the "begin" and "end" values changing to move on to the next angle increment, it stays at 45.

Any help is greatly appreciated!

EDIT: Typo with indentations. It is now written as it is in my program.

jwodder
layces
  • What error are you getting? I'm assuming that at least some of your issues are that you're trying to index the second column using the index 2 (`posvcs[j,2]`) rather than index 1 (`posvcs[j,1]`). Remember that indexing in Python is zero-based! – Suever Jan 15 '16 at 00:04
  • also the `for j in...` row and below need to be indented one indent more, maybe that was just a typo in your question? – alexanderbird Jan 15 '16 at 00:05
  • Just a side comment. Often if you find yourself including a number in your variable or dictionary key names, it indicates that maybe you should consider using a different data structure to hold your data. In this case, maybe a list rather than a dict with keys of `dataset1`, `dataset2`, etc. – Suever Jan 15 '16 at 00:06
  • So i'm actually not getting an error, it's just not doing what i know it needs to, ya know? I called it column 2 but it's technically column 3 (0 and 1 are other values), haha i had that trouble in a past part of my code and make sure to remember the zero-based indexing. The first 45 rows in my 2800+ row data (posvcs) file run from angles (in column 2) 1.2 to 13.199 and those are the only values it's saving into my premade & prenamed arrays (dataset0, dataset1, etc.) I'm needing it to begin at 1.2 and end at, but not equal, 45. – layces Jan 15 '16 at 00:10
  • @Suever, I am not sure how to create premade and prenamed lists similar to what dict does? I need it to, using what my "dd" value is (i.e. 45 degrees), automatically create that many arrays (360/dd) and name them 0,1,2,etc. – layces Jan 15 '16 at 00:13
  • Is this Python 2 or Python 3? `input` behaves differently in each of them. – jwodder Jan 15 '16 at 00:13
  • @TC I guess that's my point, it doesn't appear that it needs a name if your name is just going to be `dataset1`. If that's the case then you'd just create a list called datasets and `dataset1` would more logically be `dataset[0]`. Just a note. – Suever Jan 15 '16 at 00:15
  • @jwodder I have Python 2.7 – layces Jan 15 '16 at 00:15
  • @Suever Oh, so i would have a list of lists similar to this: http://stackoverflow.com/questions/11487049/python-list-of-lists ? How could i have it append the data to the next list (from 0 to 1 for example) automatically? – layces Jan 15 '16 at 00:22
  • I think `&` is "bitwise and" in Python, you probably want the `and` keyword in your `if` test... – TessellatingHeckler Jan 15 '16 at 00:23
  • Would changing your `&` to `and` in your `if` do anything helpful? – Craig Estey Jan 15 '16 at 00:23
  • @CraigEstey Unfortunately it didn't, it still only runs over the first 45 rows (angles 1.2 to 13.199) instead of running between rows with angles 1.2 to 44.999 – layces Jan 15 '16 at 00:27
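
A minimal sketch of the list-of-lists idea raised in the comments above, assuming `posvcs` is a sequence of rows with the angle in column 2 and every angle between 0 and 360. The `bin_rows` helper and the sample rows are made up for illustration; they are not from the question.

```python
def bin_rows(rows, dd, angle_col=2):
    """Group rows into angular slices; slice k holds angles in [k*dd, (k+1)*dd)."""
    nbins = int(360 // dd)
    datasets = [[] for _ in range(nbins)]   # one empty list per slice
    for row in rows:
        k = int(row[angle_col] // dd)       # which slice this angle falls in
        k = min(k, nbins - 1)               # clamp so an angle of exactly 360 goes in the last slice
        datasets[k].append(row)
    return datasets

# Hypothetical sample data: (x, y, angle_in_degrees)
sample = [(0.1, 0.2, 1.2), (0.3, 0.4, 44.9), (0.5, 0.6, 45.0), (0.7, 0.8, 359.0)]
datasets = bin_rows(sample, 45)
print(datasets[0])   # rows with angles in [0, 45)
print(datasets[1])   # rows with angles in [45, 90)
```

Because the slice index is computed from the angle itself, each row is appended to the right list automatically and no `begin`/`end` bookkeeping is needed.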

3 Answers

1

You could solve this pretty easily with numpy, which is ideal for manipulating numbers in this way. It also lets you index with boolean arrays, so you can remove a lot of your loops.

import numpy as np
dd = input("What is your desired degree for slices? ")

# Bin edges; list() so that append() also works on Python 3, where range is not a list
limits = list(range(0, 360, int(dd)))

# Append 361 just so we get the last group
limits.append(361)

datasets = list()

# Convert to a numpy array
pos = np.array(posvcs)

# Now group everything
for k in range(len(limits) - 1):
    inrange = np.logical_and(pos[:,2] >= limits[k], pos[:,2] < limits[k+1])
    datasets.append(pos[inrange,2])

If you really insist on having lists, you can convert them like this:

datalists = [d.tolist() for d in datasets]

I recommend keeping them as numpy arrays though if you're going to be performing more calculations from them:

# Calculating the mean of each group
means = [d.mean() for d in datasets]

# Standard deviation of each group
stdevs = [d.std() for d in datasets]

ADVANCED

If you really want to go down the numpy path, you can replace the first part with something like this:

group_ids = np.digitize(pos[:,2], limits)
datasets = [pos[group_ids == k,2] for k in range(1, len(limits))]
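
To see what `np.digitize` returns here, a small sketch with made-up angles may help (the `angles` array and `dd = 45` are illustrative assumptions, not the asker's data):

```python
import numpy as np

dd = 45
limits = list(range(0, 360, dd)) + [361]   # same bin edges as above
angles = np.array([1.2, 13.199, 44.999, 45.0, 200.0, 360.0])  # made-up values

# digitize returns i such that limits[i-1] <= angle < limits[i]
group_ids = np.digitize(angles, limits)
print(group_ids)

# Group k (1-based) collects the angles that fall in the k-th slice
groups = [angles[group_ids == k] for k in range(1, len(limits))]
print(groups[0])
```

The ids start at 1 rather than 0, which is why the list comprehension runs over `range(1, len(limits))`.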
Suever
  • Holy crap it worked!! THANK YOU. So now, if i wanted to access each one, i'd call it datasets[0] or whatever? – layces Jan 15 '16 at 00:47
  • @TC Yes that's how you'd access them. Now, remember they will be numpy arrays **not** lists anymore. I'm not sure what you want to do with them, but if it's anything number-ish I'd recommend keeping them that way otherwise you can convert to lists as needed – Suever Jan 15 '16 at 00:48
  • @Suever Ah okay, which one would i use the tolist() function on, datasets or pos? – layces Jan 15 '16 at 00:52
  • I just need to be able to apply a bunch of statistical analysis functions to the separate arrays (i.e. binning them, weighted mean, etc.) Is it best to use lists or keep it as arrays? – layces Jan 15 '16 at 00:53
  • I have updated my answer to demonstrate how to convert to a list of lists. If you're doing things like means, etc. I would definitely keep them as numpy arrays as that functionality is builtin. Check out the numpy documentation. (added some examples to response) – Suever Jan 15 '16 at 00:55
  • This is exciting, and so much easier wow. What if i needed to do the weighted means in each array if i knew the column the values were in and the column the associated weights were in? Is that just http://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.average.html ? – layces Jan 15 '16 at 01:01
  • I'm not quite sure what you're asking. But you can do lots of things with numpy. Just play with it and you'll see. – Suever Jan 15 '16 at 01:15
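
On the weighted-mean question in the comments: `np.average` accepts a `weights` argument, so one possible approach is to keep whole rows per group and average one column weighted by another. This is only a sketch; the column indices `VALUE_COL` and `WEIGHT_COL` and the sample array are assumptions for illustration, and only the angle being in column 2 comes from the question.

```python
import numpy as np

ANGLE_COL, VALUE_COL, WEIGHT_COL = 2, 0, 1   # VALUE_COL / WEIGHT_COL are hypothetical

# Made-up rows: value, weight, angle in degrees
pos = np.array([
    [10.0, 1.0,  5.0],
    [20.0, 3.0, 30.0],
    [ 7.0, 2.0, 50.0],
])

dd = 45
limits = list(range(0, 360, dd)) + [361]

weighted_means = []
for lo, hi in zip(limits[:-1], limits[1:]):
    in_slice = (pos[:, ANGLE_COL] >= lo) & (pos[:, ANGLE_COL] < hi)
    rows = pos[in_slice]                   # keep all columns, not just the angle
    if rows.shape[0] == 0:
        weighted_means.append(np.nan)      # empty slice: no mean to report
    else:
        weighted_means.append(
            np.average(rows[:, VALUE_COL], weights=rows[:, WEIGHT_COL])
        )

print(weighted_means[:2])
```

Note that this keeps full rows in each group (`pos[in_slice]`) instead of only column 2, since the weights live in a different column from the angles.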
0

I haven't looked at the details, but there's one easy error:

if (posvcs[j,2] >= begin) & (posvcs[j,2] < end):

In Python, use the keyword `and` instead of `&`, which is the bitwise-and operator, i.e.:

if (posvcs[j,2] >= begin) and (posvcs[j,2] < end):
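
A small sketch of where `&` and `and` actually differ, using made-up values. `&` binds more tightly than the comparison operators, so the parentheses matter; with the parentheses in place, both forms give the same result for plain booleans.

```python
angle, begin, end = 50, 0, 45   # made-up values; 50 is outside [0, 45)

with_and = (angle >= begin) and (angle < end)   # logical and of two bools
with_amp = (angle >= begin) & (angle < end)     # bitwise and of two bools

# Without parentheses, & is applied first, so this is parsed as
# angle >= (begin & angle) < end, a chained comparison on different operands.
no_parens = angle >= begin & angle < end

print(with_and, with_amp, no_parens)
```

So `and` is the clearer choice for scalar tests, but in a parenthesised condition like the one above the two operators should behave the same.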
Riaz
  • Unfortunately when i changed that, the "&" to "and", it didn't change it.. It's still running over 45 rows instead of the values themselves within the rows (which should go up until 45 degrees.) – layces Jan 15 '16 at 00:33
0

Use this instead:

begin = 0
end = dd
for x in range(0,listnum):
    a["dataset{0}".format(x)] = []
    for j in range(len(posvcs)):
        if (posvcs[j][2] >= begin) and (posvcs[j][2] < end):
            a["dataset{0}".format(x)].append(posvcs[j][2])
    begin += dd
    end += dd
SoreDakeNoKoto
  • This is crazy, it is still giving me the same thing as before. How come it is going until row #45 instead of looking at the VALUES, and whether they equal 45 or not? – layces Jan 15 '16 at 00:37
  • posvcs is a list of lists, right? Try moving the `begin` and `end` declarations outside the outermost loop – SoreDakeNoKoto Jan 15 '16 at 00:38
  • So i put in your first answer and it hadn't worked, but now (with your second one) it is giving me values between angle 32.78 to 36 which are another 45 lines/rows of data, but not the ones i need.. – layces Jan 15 '16 at 00:40
  • What is posvcs? I use python 3 and the way u index it in two different ways: `[x,y]` and `[x:y,z]` looks weird – SoreDakeNoKoto Jan 15 '16 at 00:40
  • posvcs is my 2800 row, 6 column data that i need to access column 2 to look at each row's value (column 2 are my angles, they're sorted from 0 to 360) – layces Jan 15 '16 at 00:43
  • I'm not familiar with py 2.7, but as a list of lists, shouldn't you be indexing it this way: `posvcs[j][2]`...and what's the colon in `posvcs[begin:end,2]` for? Slicing? – SoreDakeNoKoto Jan 15 '16 at 00:49
  • I had thought that it was: datafile[row,column] and that datafile[startrow:endrow,column] would access only certain rows but that you could still access whatever columns you wanted? I think i was very wrong in trying to do that though.. – layces Jan 15 '16 at 00:55
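
To make the distinction in the last few comments concrete, here is a sketch on a made-up array, assuming `posvcs` is a 2-D NumPy array (which the `posvcs[j,2]` indexing suggests). `posvcs[begin:end, 2]` selects rows by position, whereas a boolean mask selects rows by value.

```python
import numpy as np

# Made-up data: 6 rows, angle in column 2
posvcs = np.array([
    [0, 0,   1.2],
    [0, 0,  13.0],
    [0, 0,  44.0],
    [0, 0,  46.0],
    [0, 0,  90.0],
    [0, 0, 200.0],
])
begin, end = 0, 45

# Slicing: rows with index 0..44, whatever their angle is (here, every row)
by_position = posvcs[begin:end, 2]

# Boolean mask: rows whose angle is in [begin, end), wherever they sit
mask = (posvcs[:, 2] >= begin) & (posvcs[:, 2] < end)
by_value = posvcs[mask, 2]

print(by_position)
print(by_value)
```

This would explain the behaviour described in the question: the slice always returns the first `dd` rows, regardless of the angles stored in them.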