Print specific rows from dataframe using a condition

Question

Note that I'm doing this inside a function, and I've already consulted a pretty good thread.

Here's the Python function; the parameter passed in is taken from user input:

def recommend(uid):
    ds = pd.read_csv("pred_matrix-full_ubcf.csv")
    records = ds.loc[ds['uid'] == uid]
    for recom in records:
        print recom

Data Format:

uid iid     rat
344 1189    5
344 1500    5
344 814     5
736 217     3.3242361285
736 405     3.3238380154
736 866     3.323500531
331 1680    2
331 1665    2
331 36      1.999918585

Referred: this1, this2

I can't figure out where I'm going wrong. I'm following the this1 thread and still can't get it to work.

@cᴏʟᴅsᴘᴇᴇᴅ I'm not getting the error, all I get is 'uid iid rat' on whatever number I give as input — T3J45, Jul 08 '17 at 09:15
Tejas, now I understand. Take a look at my answer. You need to call `records.iterrows()`. — cs95, Jul 08 '17 at 09:21

cs95 · Accepted Answer · 2017-07-10T12:03:36.550

8

To iterate over your rows, use df.iterrows():

In [53]: records = df[df['uid'] == query]

In [54]: for index, row in records.iterrows():
    ...:     print(row['uid'], row['iid'], row['rat'])
    ...: 
344.0 1189.0 5.0
344.0 1500.0 5.0
344.0 814.0 5.0
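
For reference, here is a minimal sketch of the function from the question rewritten around iterrows(). It assumes a small inline CSV (a made-up stand-in for pred_matrix-full_ubcf.csv) and, unlike the original, takes the frame as a second argument so it can run without the file. It also shows why the original loop printed only 'uid iid rat': iterating a DataFrame directly walks its column labels.

```python
import io
import pandas as pd

# Hypothetical inline sample standing in for pred_matrix-full_ubcf.csv
CSV_TEXT = """uid,iid,rat
344,1189,5
344,1500,5
344,814,5
736,217,3.3242361285
331,1680,2
"""

def recommend(uid, ds):
    """Return the (iid, rat) pairs stored for one user id."""
    records = ds.loc[ds['uid'] == uid]
    results = []
    # iterrows() yields (index, row) pairs; each row is a Series
    for index, row in records.iterrows():
        results.append((int(row['iid']), float(row['rat'])))
    return results

ds = pd.read_csv(io.StringIO(CSV_TEXT))

# Iterating the frame itself yields the column labels, not the rows
column_labels = [label for label in ds]
print(column_labels)

recs = recommend(344, ds)
for iid, rat in recs:
    print(iid, rat)
```

Returning a list instead of printing inside the function is only a convenience for checking the result; printing in the loop works the same way.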

There are two ways to select your data. You can use boolean indexing:

In [4]: query = 344

In [7]: df[df['uid'] == query]
Out[7]: 
   uid   iid  rat
0  344  1189  5.0
1  344  1500  5.0
2  344   814  5.0

You can also use the DataFrame.query function:

In [8]: df.query('uid == %d' %query)
Out[8]: 
   uid   iid  rat
0  344  1189  5.0
1  344  1500  5.0
2  344   814  5.0
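
As a rough sketch of both selection styles with a value that comes from the user: input() returns a string, so the value needs converting to int before it is compared with an integer 'uid' column. The sample data and the variable names raw and uid are assumptions for illustration. query() can also reference a local variable with the @ prefix instead of string formatting.

```python
import io
import pandas as pd

# Hypothetical sample data in the same layout as the question
CSV_TEXT = """uid,iid,rat
344,1189,5
344,1500,5
344,814,5
736,217,3.3242361285
"""
df = pd.read_csv(io.StringIO(CSV_TEXT))

raw = "344"      # what input() would hand back: a string
uid = int(raw)   # convert before comparing with the integer column

by_mask = df[df['uid'] == uid]        # boolean indexing
by_query = df.query('uid == @uid')    # @ refers to the local variable uid

# Comparing the integer column with the unconverted string matches nothing
string_match = df[df['uid'] == raw]

print(by_mask)
print(len(string_match))
```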

edited Jul 10 '17 at 12:03

answered Jul 08 '17 at 08:56


Hey, I did try them. Yet not successful in it. Here's on what I get. – T3J45 Jul 10 '17 at 11:22

`records = ds[ds['uid'] == id] for recom in records: print recom` gives output as `Enter User ID:344 uid iid rat` – T3J45 Jul 10 '17 at 11:24
@Tejas Your code looks nothing like mine. You should use `for index, row in records.iterrows()`. Did you see my answer? – cs95 Jul 10 '17 at 11:25
@Tejas You didn't even copy it correctly brother. Check my edit. Look at the first code snippet. Please copy it exactly like that. Why are you making a mistake while doing ctrl+c ctrl+v? – cs95 Jul 10 '17 at 12:03
@coldspeed Ok bro, I'll try but I've got to change variables as per Nomenclature I'm using. – T3J45 Jul 10 '17 at 12:08
Changing variables is fine, but you are changing the functionality of my code and then telling me it doesn't work. – cs95 Jul 10 '17 at 12:11
@Tejas _**What is "query"**_? What are you passing to it? These issues are simple. It seems you are trying to compare strings and ints, which is wrong. In my example, 'uid' is assumed to be integer. – cs95 Jul 10 '17 at 12:21
Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/148782/discussion-between-tejas-and-cs). – T3J45 Jul 10 '17 at 12:23

score 2 · Answer 2 · answered Apr 17 '19 at 20:24

You could also use the where() method on the DataFrame object right away. You can provide the condition to this method as the first argument. See the following example:

dataset.where(dataset['class']==0)

This gives the following output:

        f000001   f000002   f000003  ...     f000102   f000103  class
0      0.000000  0.000000  0.000000  ...    0.000000  0.080000    0.0
1      0.000000  0.000000  0.000000  ...    0.000000  0.058824    0.0
2      0.000000  0.000000  0.000000  ...    0.000000  0.095238    0.0
3      0.029867  0.000000  0.012769  ...    0.000000  0.085106    0.0
4      0.000000  0.000000  0.000000  ...    0.000000  0.085106    0.0
5      0.000000  0.000000  0.000000  ...    0.000000  0.085106    0.0
6      0.000000  0.000000  0.000000  ...    0.000000  0.127660    0.0
7      0.000000  0.000000  0.000000  ...    0.000000  0.106383    0.0
8      0.000000  0.000000  0.000000  ...    0.000000  0.127660    0.0
9      0.000000  0.000000  0.000000  ...    0.000000  0.106383    0.0
10     0.000000  0.000000  0.000000  ...    0.000000  0.085106    0.0
11     0.021392  0.000000  0.000000  ...    0.000000  0.042553    0.0
12    -0.063880 -0.124403 -0.102466  ...    0.000000  0.042553    0.0
13     0.000000  0.000000  0.000000  ...    0.000000  0.021277    0.0
14     0.000000  0.000000  0.000000  ...    0.000000  0.000000    0.0
15     0.000000  0.000000 -0.060884  ...    0.000000  0.000000    0.0

[18323 rows x 104 columns]

(I got rid of the rest of the output for brevity of the answer)

An advantage of this method over plain indexing is that you can replace the values that don't match the condition using the other argument, and apply the change to the dataframe itself rather than to a copy using the inplace argument. Basically, you can reconstruct the rows of your dataframe as desired.

Additionally, because this function returns a dataframe of the same shape, with the rows that don't match the condition set to NaN, you can select a specific column from the result, such as

dataset.where(dataset['class']==0)['f000001']

This returns the 'f000001' (first feature) column, keeping its values where the class label is 0 and holding NaN elsewhere.
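
To make the difference from boolean indexing concrete, here is a small sketch on a made-up four-row frame (the values are illustrative, not taken from the dataset above). where() keeps every row and masks the non-matching ones, so a dropna() is needed to get the same rows that boolean indexing returns.

```python
import pandas as pd

# Hypothetical toy frame reusing the column names from the answer
dataset = pd.DataFrame({
    'f000001': [0.1, 0.2, 0.3, 0.4],
    'class': [0, 1, 0, 1],
})

cond = dataset['class'] == 0

masked = dataset.where(cond)             # same shape, non-matching rows become NaN
filtered = dataset[cond]                 # boolean indexing drops non-matching rows
dropped = masked.dropna()                # removes the rows that where() masked out
filled = dataset.where(cond, other=-1)   # replace non-matching values with -1

print(masked)
print(filtered)
print(filled)
```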
