How do I replace the entire value of a cell in a column if it contains a number, or if it contains a specific character such as a comma?

For example, if a cell contains a comma, meaning it holds more than one item, I want to replace it with text such as "ENM".

If a cell holds a numeric value, I want to replace it with 'UNM'.

Brooney

2 Answers

As you have not provided examples of your current and expected output, I'm making some assumptions below. It seems you want to iterate through every value in a column and, if the value meets certain conditions, change it to something else.

A general pointer: iterating through DataFrames has performance implications at larger sizes. Read through this answer for more insight.

Start by defining a function you want to use to check the value:

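def has_comma(value):
    # Only strings can contain a comma. Without this check, non-string cells
    # (e.g. NaN, which is a float) raise "argument of type 'float' is not iterable".
    return isinstance(value, str) and ',' in value

Then use the pandas.Series.replace method to make the change.

for i in df['column_name']:
    if has_comma(i):
        df['column_name'] = df['column_name'].replace([i], 'ENM')
    else:
        # Assumption: every value without a comma becomes 'UNM'.
        df['column_name'] = df['column_name'].replace([i], 'UNM')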
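
A plain `',' in value` test raises a TypeError on non-string cells such as NaN, which pandas stores as a float. As a minimal sketch of a type-aware alternative, the function below classifies each cell and is applied with Series.map. The sample data and the `classify` helper are made up for illustration, and leaving missing values untouched is an assumption about what is wanted.

```python
import numbers

import pandas as pd

# Hypothetical sample data; 'column_name' is the placeholder name used above.
df = pd.DataFrame({"column_name": ["a,b", "c", 3.5, 7, float("nan")]})


def classify(value):
    """Return 'ENM' for strings with a comma, 'UNM' for numbers, else the value."""
    if isinstance(value, str):
        return "ENM" if "," in value else value
    # Assumption: missing values (NaN/None) are left as they are.
    if pd.isna(value):
        return value
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(value, numbers.Number) and not isinstance(value, bool):
        return "UNM"
    return value


df["column_name"] = df["column_name"].map(classify)
print(df["column_name"].tolist())
```

Strings without a comma, such as "c", pass through unchanged here, which differs from the loop's else branch.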
codingray
  • The output is "TypeError: argument of type 'float' is not iterable" – Brooney Mar 18 '22 at 04:03
  • @Brooney That's strange. A column should not be seen as a float, even if it contains floats. Could you point out which line this error is pointing to and if you're in fact using the column syntax, not a specific value? – codingray Mar 19 '22 at 04:15
  • I am trying to iterate through each cell in the column and change the whole cell if a substring occurs in the cell – Brooney Mar 20 '22 at 05:10

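Say you have a column, i.e. a pandas Series, called col.

The following code maps values containing a comma to "ENM", as per your example. Passing na=False makes non-string cells (such as floats) count as not matching, instead of producing NaN in the condition:

col.mask(col.str.contains(',', na=False), "ENM")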

You can overwrite your original column with this result if that's what you want to do. This approach will be much faster than looping through each element.
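
As a small sketch of writing the result back, assuming a DataFrame `df` with a column named `items` (both names and the data are hypothetical):

```python
import pandas as pd

# Hypothetical frame; 'items' stands in for your real column name.
df = pd.DataFrame({"items": ["apple,pear", "plum", "fig,kiwi"]})

col = df["items"]
# mask returns a new Series, so assign it back to overwrite the column.
df["items"] = col.mask(col.str.contains(",", na=False), "ENM")
print(df["items"].tolist())
```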

For mapping floats to "UNM", as per your example, the following would work:

col.mask(col.apply(isinstance, args=(float,)), "UNM")

Hopefully you get the idea. See https://pandas.pydata.org/docs/reference/api/pandas.Series.mask.html for more info on masking.
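
Putting the two masks together might look like the sketch below. It rests on a few assumptions that are not in the question: "contains a number" is read as either a numeric cell or a string with at least one digit, a comma takes precedence over a digit, and the sample data is invented.

```python
import pandas as pd

# Hypothetical sample data mixing text, numbers, and text that contains digits.
col = pd.Series(["a,b", "room 12", 3.5, "plain", "1,2"])

# na=False makes non-string cells count as "no match" instead of NaN.
has_comma = col.str.contains(",", na=False)
has_digit = col.str.contains(r"\d", na=False)
# Cells that are numbers, or strings that parse fully as a number.
is_number = pd.to_numeric(col, errors="coerce").notna()

# Assumption: a comma takes precedence, so "1,2" becomes "ENM", not "UNM".
to_unm = (is_number | has_digit) & ~has_comma
result = col.mask(has_comma, "ENM").mask(to_unm, "UNM")
print(result.tolist())
```

Building all conditions from the original column first keeps the second mask from reacting to values the first one already replaced.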

Riley