"receiverlist" is a pandas DataFrame containing the unique receivers in a bank statement and their total amounts. I've also initialized a new column "Category" whose values are all 0 at first.

    Receiver     Amount Category
140 abcreceiver  5000   0
39  xyzreceiver  3000   0
103 asdreceiver  562.51 0
148 ertreceiver  416.98 0
62  yuireceiver  231.00 0

Goal:

I want to run a function over every receiver in the "Receiver" column so that every receiver that contains a given string (searchterm) gets a given category (givecategory) in the "Category" column. However, my current function only works if the value matches fully, and I don't know how to make a partial match sufficient.

The current function looks like this:

def categorize(searchterm, givecategory):
    receiverlist["Category"] = np.where(receiverlist["Receiver"] == searchterm, givecategory, receiverlist["Category"])
    return receiverlist["Category"]

I then run the categorize-function:

receiverlist["Category"] = categorize("xyz", "Receiver Xyz")

So the question is: how can I give every receiver that contains the given "xyz" (or any other partial searchterm) the given givecategory in its Category column?

KMFR
  • [`pd.Series.str.contains('your_search_term')`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.str.contains.html) – ALollz Jan 06 '19 at 17:15
  • I tried that first but it gives "True", and running the function gives "ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()" – KMFR Jan 06 '19 at 17:18
  • `receiverlist.loc[receiverlist.Receiver.str.contains('your_search_term'), 'Category'] = your_category` is how you would set just the subset. You should read the page on indexing and selecting data in pandas, specifically the section on [Boolean indexing](http://pandas.pydata.org/pandas-docs/stable/indexing.html#boolean-indexing) and the use of `.loc` – ALollz Jan 06 '19 at 17:26
  • Thank you, that did it. – KMFR Jan 06 '19 at 19:35
  • By the way, I don't think this question is a duplicate of the one you linked, that one is talking about filtering rows when this one is about filtering columns. – KMFR Jan 10 '19 at 21:17
  • No, you are assigning value to a column based on a filtered set of rows. One of your biggest issues is that you are not filtering (obtaining the Boolean mask) properly. You do `receiverlist["Receiver"] == searchterm`, which is wrong in this instance. You want `receiverlist["Receiver"].str.contains(searchterm)`, which that duplicate fully explains. – ALollz Jan 10 '19 at 22:31
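
For reference, the approach from the comments can be put together as a minimal sketch. The sample frame below only reuses a few rows from the question, and the `df` parameter on `categorize` is an assumption added for illustration (the original function worked on the global `receiverlist`). `regex=False` is also an assumption, chosen so that the search term is treated as a plain substring rather than a regular expression.

```python
import pandas as pd

# Small sample resembling the frame in the question (values copied from it).
receiverlist = pd.DataFrame(
    {
        "Receiver": ["abcreceiver", "xyzreceiver", "asdreceiver"],
        "Amount": [5000.0, 3000.0, 562.51],
        "Category": [0, 0, 0],
    },
    index=[140, 39, 103],
)

# The column will hold both the initial 0 and string labels, so store it
# as object dtype to avoid dtype conflicts on assignment.
receiverlist["Category"] = receiverlist["Category"].astype(object)


def categorize(df, searchterm, givecategory):
    """Set Category to givecategory on rows whose Receiver contains searchterm."""
    # Boolean mask: True where the receiver name contains the substring.
    # na=False treats missing receiver names as non-matching.
    mask = df["Receiver"].str.contains(searchterm, regex=False, na=False)
    # .loc with a row mask and a column label assigns only to matching rows.
    df.loc[mask, "Category"] = givecategory
    return df


categorize(receiverlist, "xyz", "Receiver Xyz")
print(receiverlist)
```

Because the assignment happens in place through `.loc`, there is no need to assign the result back to `receiverlist["Category"]` as in the original call. Rows that do not match keep whatever category they already had.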
