2

Please be gentle, I'm a total Python newbie. I'm currently writing a script that is turning out to be very long, and I thought there must be a for-loop method to make this easier. I'm going through a CSV, pulling the header titles, and placing each one into a str.replace line manually.

df['Col 1'] = df['Col 1'].str.replace('text','replacement')

I figured it would start like this.. but no idea how to proceed!

Import pandas as pd
df = pd.read_csv('file.csv')
for row in df.columns:
   if (df[,:] =... 

Sorry I know this probably looks terrible, but this is all I could fathom with my limited knowledge!

Thanks!

Umar.H
  • Why not use `df = df.replace('text','replacement')`? – jezrael Feb 19 '18 at 14:23
  • Welcome to StackOverflow. Please take the time to read this post on [how to provide a great pandas example](http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) as well as how to provide a [minimal, complete, and verifiable example](http://stackoverflow.com/help/mcve) and revise your question accordingly. These tips on [how to ask a good question](http://stackoverflow.com/help/how-to-ask) may also be useful. – jezrael Feb 19 '18 at 14:27
  • What's the difference? This works for me when I test it in Jupyter (str.replace), so it's not an issue. – Umar.H Feb 19 '18 at 14:33
  • `df['Col 1'] = df['Col 1'].str.replace('text','replacement')` replaces in only one column; `df = df.replace('text','replacement')` replaces in all columns of the dataframe. – jezrael Feb 19 '18 at 14:34

4 Answers

1

No worries! We've all been there.

Your import statement should be lowercase: import pandas as pd

In your for loop, I think there's a misunderstanding of what you'll be iterating over. The for row in df.columns will iterate over the column names, not the rows.
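To make the distinction concrete, here is a small sketch on a made-up two-column frame (the column names and values are illustrative only, not taken from the question's CSV):

```python
import pandas as pd

# Hypothetical toy data standing in for the question's CSV.
df = pd.DataFrame({"Col 1": ["text a", "b"], "Col 2": ["c", "text d"]})

# Iterating over df.columns yields the column labels, one per column.
column_names = [col for col in df.columns]

# Iterating over df.iterrows() yields (index, row) pairs, one per row.
row_indexes = [idx for idx, row in df.iterrows()]

print(column_names)
print(row_indexes)
```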

Is it correct to say that you'd like to convert the column names to strings?

David Stevens
  • My column names are already strings and are listed correctly in the CSV; however, the values within the columns are wrong, so I'm just doing some tidying up. What I want to do is something like `import pandas as pd`, `df = read_csv('file.csv')`, `df.columns` `['col a', 'col b', 'col c']` * 210 (these are all str values). I would like to write a loop/function that writes each column name into the str.replace line above, if that makes sense? – Umar.H Feb 19 '18 at 14:30
  • Gotcha. So I would do `for col in df.columns: df[col] = df[col].replace('text', 'replacement')` to answer your original question. You may also want to check out the comment that suggests `df.replace(...)` on the whole dataframe instead of iterating over columns. – David Stevens Feb 19 '18 at 14:35
  • Many thanks. However, the change in each column is unique (for example, col 1 = replacement.a, and so forth), which is probably why I went with str.replace as opposed to df.replace. – Umar.H Feb 19 '18 at 14:44
1
df = pd.read_csv('file.csv',usecols=['List of column names you want to use from your csv'],
names=['list of names of column you want your pandas df to have'])

You should read the docs and identify the fields that are important in your case.

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html

Skandix
drew_psy
1

jezrael's comment solved it much more elegantly.

But, in case you needed specific code for each column it would go something like this:

import pandas as pd

df = pd.read_csv('file.csv')

for column in df.columns:
    df[column] = df[column].str.replace('text','replacement')
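
If each column needs its own replacement, as the asker mentions in a comment, one possible sketch keeps the per-column pairs in a dictionary and loops over that. The data, column names, search strings, and replacements below are all placeholders, and `regex=False` is passed so the search text is treated literally rather than as a pattern.

```python
import pandas as pd

# Hypothetical data; in the question this would come from pd.read_csv('file.csv').
df = pd.DataFrame({
    "Col 1": ["yes please", "no"],
    "Col 2": ["yes", "maybe yes"],
})

# Placeholder mapping: column name -> (text to find, replacement).
replacements = {
    "Col 1": ("yes", "agreed"),
    "Col 2": ("yes", "confirmed"),
}

for column, (old, new) in replacements.items():
    # .str.replace works on substrings; regex=False treats `old` as plain text.
    df[column] = df[column].str.replace(old, new, regex=False)

print(df)
```

Columns not listed in the mapping are left untouched, so only the entries that actually need changing have to be written down.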
joaoavf
1

You can do a multiple-column replacement in one shot with replace by passing in a dictionary.

Say you want to replace t1 with r1 in column a; t2 with r2 in column b, you can do

df.replace({"a":{"t1":"r1"}, "b":{"t2":"r2"}})
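
As a rough illustration with made-up data (the column names `a` and `b` follow the example above; the values and the `pairs` helper are invented), the nested dictionary can also be assembled from a simpler per-column mapping instead of being typed out in full. Note that this form of `replace` matches whole cell values, not substrings.

```python
import pandas as pd

# Hypothetical toy data.
df = pd.DataFrame({"a": ["t1", "x"], "b": ["t2", "t1"]})

# Hypothetical per-column pairs: column -> (value to find, replacement).
pairs = {"a": ("t1", "r1"), "b": ("t2", "r2")}

# Build the nested {column: {old: new}} dictionary that DataFrame.replace accepts.
mapping = {col: {old: new} for col, (old, new) in pairs.items()}

# replace returns a new frame; the "t1" in column b is left alone
# because the mapping for b only mentions "t2".
result = df.replace(mapping)
print(result)
```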
Tai
  • This seems much more logical. One question: is there a way to index the columns to see which letter corresponds to each column, or must you list each column name manually? For example, my CSV has 300+ columns; I'm deleting a few and merging others, and the most manual part is changing most answers from "yes" to a more appropriate string. – Umar.H Feb 19 '18 at 15:35
  • @Datanovice Here the letters are the column names, so it will look into columns `a` and `b`. Does that answer your question? – Tai Feb 19 '18 at 15:37
  • @Datanovice I think you need to build the dictionary manually. Your problem isn't entirely clear to me: I'm not sure whether doing the replacement all at once would have side effects. You can certainly replace every "yes" with something in a single call, but I don't know whether that would change parts you don't want changed, since you need column-by-column replacement. – Tai Feb 19 '18 at 15:41
  • That does indeed answer my questions. Thanks for your help, this was extremely educational. – Umar.H Feb 19 '18 at 16:19