0

Having issues with building a find and replace tool in python. Goal is to search a column in an excel file for a string and swap out every letter of the string based on the key value pair of the dictionary, then write the entire new string back to the same cell. So "ABC" should convert to "BCD". I have to find and replace any occurrence of individual characters.

The below code runs without debugging, but newvalue never creates and I don't know why. No issues writing data to the cell if newvalue gets created.

input: df = pd.DataFrame({'Code1': ['ABC1', 'B5CD', 'C3DE']})

expected output: df = pd.DataFrame({'Code1': ['BCD1', 'C5DE', 'D3EF']})


mycolumns = ["Col1", "Col2"]
mydictionary = {'A': 'B', 'B': 'C', 'C': 'D'}

   for x in mycolumns:

        # 1. If the mycolumn value exists in the headerlist of the file
        if x in headerlist:

            # 2. Get column coordinate
            col = df.columns.get_loc(x) + 1

            # 3. iterate through the rows underneath that header
            for ind in df.index:

                # 4. log the row coordinate
                rangerow = ind + 2
                # 5. get the original value of that coordinate
                oldval = df[x][ind]

                for count, y in enumerate(oldval):
                    # 6. generate replacement value
                    newval = df.replace({y: mydictionary}, inplace=True, regex=True, value=None)
                    print("old: " + str(oldval) + " new: " + str(newval))
                    # 7. update the cell
                    ws.cell(row=rangerow, column=col).value = newval
                else:
                    print("not in the string")
            else:

                # print(df)
                print("column doesn't exist in workbook, moving on")

        else:
            print("done")

        wb.save(filepath)
        wb.close()

I know there's something going on with enumerate and I'm probably not stitching the string back together after I do replacements? Or maybe a dictionary is the wrong solution to what I am trying to do, the key:value pair is what led me to use it. I have a little programming background but ery little with python. Appreciate any help.

kellie A
  • 3
  • 4
  • 1
    Can you provide sample input and expected output? Please see:https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples – David Erickson Dec 16 '20 at 18:14
  • `newvalue never creates` - Can you explain that? When you inspected/printed values and/or conditions at various points in your program was there an obvious place where it was misbehaving? If you are using an IDE **now** is a good time to learn its debugging features Or the built-in [Python debugger](https://docs.python.org/3/library/pdb.html). ... [What is a debugger and how can it help me diagnose problems?](https://stackoverflow.com/questions/25385173/what-is-a-debugger-and-how-can-it-help-me-diagnose-problems) – wwii Dec 16 '20 at 18:14
  • @wwii Sure, when it hits step #6, oldvalue is found successfully, but newvalue just says 'None' on every loop. My expectation was that enumerate would go through each character of the original value, use it for the dictionary key, swap in the dictionary value, and repeat that until the end of the string in the cell. So a string like A54C would become B54D (I'm only replacing letters). I'm using pycharm as my IDE and running the debugger and nothing explicitly breaks, but since newvalue remains a null my only guess is that it's not able to do the replace action successfully – kellie A Dec 16 '20 at 18:31
  • @DavidErickson, yes I'll edit the post, thanks for the link. – kellie A Dec 16 '20 at 18:31
  • `replace` will not return anything when using `inplace=True` - [https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.replace.html](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.replace.html) – wwii Dec 16 '20 at 18:38
  • @wwii I updated inplace to false and now newvalue isn't null, which seems good, but the newvalue isn't what I thought it would be. I got back oldvalue: ABC1, newvalue: {'Code1': ['ABC1', 'B5CD', 'C3DE'.......]}. Then it returned "raise ValueError("Cannot convert {0!r} to Excel".format(value))" – kellie A Dec 16 '20 at 18:54
  • Please ask a new question - reduce your code to a [mre] - just what is needed to reproduce the problem and be sure to include an example input and output. – wwii Dec 16 '20 at 19:21

1 Answers1

0

newvalue never creates and I don't know why.

DataFrame.replace with inplace=True will return None.

>>> df = pd.DataFrame({'Code1': ['ABC1', 'B5CD', 'C3DE']})
>>> df = df.replace('ABC1','999')
>>> df
  Code1
0   999
1  B5CD
2  C3DE
>>> q = df.replace('999','zzz', inplace=True) 
>>> print(q)
None
>>> df
  Code1
0   zzz
1  B5CD
2  C3DE
>>>

An alternative could b to use str.translate on the column (using its str attribute) to encode the entire Series

>>> df = pd.DataFrame({'Code1': ['ABC1', 'B5CD', 'C3DE']})
>>> mydictionary = {'A': 'B', 'B': 'C', 'C': 'D'}
>>> table = str.maketrans('ABC','BCD')
>>> df
  Code1
0  ABC1
1  B5CD
2  C3DE
>>> df.Code1.str.translate(table)
0    BCD1
1    C5DD
2    D3DE
Name: Code1, dtype: object
>>>
wwii
  • 23,232
  • 7
  • 37
  • 77
  • Whoa thank you! This was much better than what I was trying to do. Had to read up on your links to get the hang of it but it works perfectly and is much simpler to follow. Thanks again. – kellie A Dec 16 '20 at 23:28