
I'm trying to make a boxplot of mpg grouped by number of cylinders, using the auto data. This is the code I have:

cyls = list(set(np.array(auto.cylinders)))
data = []
for val in cyls:
    d = np.array(auto.loc[auto['cylinders'] == val].mpg)
    data.append(d)
fig, ax = plt.subplots()
ax.boxplot(data, positions = cyls);

This was done in a Jupyter notebook. It works fine, but it feels like a roundabout solution, especially since this is apparently a lot easier in R. Is there a more concise way of doing this?

1 Answer


It looks like the most roundabout part of your solution is the for-loop. I understand the pain, since MATLAB also has concise ways of accessing array elements.

You can make your solution a little more concise by replacing the for-loop with a list comprehension, something like this:

data = [auto.loc[auto['cylinders'] == val, 'mpg'] for val in cyls]

where cyls is the list of distinct cylinder counts you already build in your code.

(See Access multiple elements of list knowing their index)

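But depending on how you have set up the auto object, you may not need to build data by hand at all. If auto is a pandas DataFrame (your use of auto.loc suggests it is), it's possible that you could simply use

auto.boxplot(column='mpg', by='cylinders')

to draw one box of mpg values per cylinder count.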
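For reference, here is a minimal end-to-end sketch of the grouping step using groupby instead of an explicit filter per cylinder count. It assumes auto is a pandas DataFrame with cylinders and mpg columns. The small frame below is made-up stand-in data so the snippet runs on its own, not the real dataset.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in for the real `auto` data (illustrative values only).
auto = pd.DataFrame({
    "cylinders": [4, 4, 4, 6, 6, 8, 8, 8],
    "mpg": [31.0, 28.5, 33.0, 21.0, 19.5, 15.0, 14.0, 16.5],
})

# Iterating over a grouped column yields (key, Series) pairs.
# groupby sorts the keys by default, so cylinder counts come out ascending.
cyls, data = [], []
for n_cyl, mpg in auto.groupby("cylinders")["mpg"]:
    cyls.append(n_cyl)
    data.append(mpg.to_numpy())

# Same plotting call as in the question: one box per cylinder count.
fig, ax = plt.subplots()
ax.boxplot(data, positions=cyls)
ax.set_xlabel("cylinders")
ax.set_ylabel("mpg")
```

Compared with list(set(...)), this has the side benefit that the cylinder counts are in a guaranteed sorted order, and each group is collected in a single pass over the frame.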