2

I am attempting to build a Python program using pandas that takes data in any layout and converts it into a standard format. The program also takes two columns and replaces their values with specific codes according to what each column contains. This works fine with small files, but with larger files the replace function does not work at all and gives no error message either. What could be going wrong here? Here is some sample code:

data.columnheadder.replace("1|Generic input", "101", regex=True, inplace=True)
James
enton91

1 Answer

0

regex=True does slow down the operation. Why not try this:

data.columnheadder = data.columnheadder.replace("1", "101", regex=False).replace("Generic input", "101", regex=False)

This is only advisable if you have a small number of strings to replace.
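One caveat worth checking before switching: with `regex=False`, `replace` matches whole cell values only, while `regex=True` substitutes the pattern wherever it occurs inside a string. The sketch below uses made-up sample values (an assumption, not the asker's data) to show where the two calls differ:

```python
import pandas as pd

# Hypothetical sample data; "columnheadder" mirrors the column name in the question.
data = pd.DataFrame({"columnheadder": ["1", "Generic input", "other", "11"]})

# regex=False: only cells equal to "1" or "Generic input" are replaced,
# so "11" is left untouched.
exact = (
    data["columnheadder"]
    .replace("1", "101", regex=False)
    .replace("Generic input", "101", regex=False)
)

# regex=True: every occurrence of the pattern inside each string is replaced,
# so "11" has both of its "1" characters substituted.
pattern = data["columnheadder"].replace("1|Generic input", "101", regex=True)

print(exact.tolist())
print(pattern.tolist())
```

If the column holds longer strings that merely contain "1", the two versions will give different results, so it is worth confirming which behaviour is actually wanted.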

For further performance enhancement, please see @unutbu's answer to a similar question I posed.
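For a larger, variable number of strings, one option is to keep the replacements in a dictionary and look each value up once. This is a minimal sketch under assumptions: the `codes` dictionary and the sample values are hypothetical, it only handles whole-value matches, and it is not necessarily the approach taken in the linked answer.

```python
import pandas as pd

# Hypothetical sample data; "columnheadder" mirrors the column name in the question.
data = pd.DataFrame({"columnheadder": ["1", "Generic input", "other", "1"]})

# Hypothetical lookup table; in practice it could hold anywhere from 10 to 1000 entries.
codes = {"1": "101", "Generic input": "101"}

# map() looks each value up in the dictionary; values with no entry become NaN,
# so fillna() restores the original value for those rows.
mapped = data["columnheadder"].map(codes)
data["columnheadder"] = mapped.fillna(data["columnheadder"])

print(data["columnheadder"].tolist())
```

Assigning the result back to the column, rather than relying on `inplace=True`, also makes it easier to verify that the DataFrame itself was updated.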

jpp
  • Oh ok. Well, what would be a better option for a large number of strings? The number of strings would be indeterminate, but no less than 10 and no more than 1000. This is really the issue I am running into. I need a method that handles both a small number of strings and a large number of strings when needed. – enton91 Feb 12 '18 at 20:31
  • @enton91, included link in my response to an answer which might help. – jpp Feb 13 '18 at 09:13