0

The CSV file I'm trying to change has 2 columns

Part Number.    Qty
h36fsde (1)      6
gr3jgr  (1)      2
qmdd12  (2)      9

It is basically a list of part numbers in the first column followed by quantities in the second column. I need to remove the (number in brackets) in column one as above (1) (1) (2) because this relates to page numbers and it causes a problem when I try to import it.

Ideally my list should be:

Part Number.    Qty
h36fsde          6
gr3jgr           2
qmdd12           9

So far the code I have allows me to choose the file I wish to change with Tkinter, then I'm trying to replace "(1)" with " " and then create a new directory if it doesn't exist called quotations and write the csv to that directory. So far it loads the file okay and writes a file new5.csv to the quotations folder, however it does not replace the "(1)" with a " " so please can someone let me know what I am doing wrong? I either need to simply remove the bracketed numbers all together or replace them with a " "

Please see my code so far below. I will try to tidy it up with def functions, however I'd like to get it working first and then see how I can improve it and make it more modular.

import tkinter as tk
from tkinter.filedialog import askopenfilename
import pandas as pd
import os  
from pathlib import Path  

root = tk.Tk()
root.withdraw() #Prevents the Tkinter window to come up
csvpath = askopenfilename()
root.destroy()

df = pd.read_csv(csvpath)
df.replace('(1)', ' ', inplace=True)

filepath = Path('quotations/new5.csv')  
filepath.parent.mkdir(parents=True, exist_ok=True)  
df.to_csv(filepath)  

Once I have it working I will need to set it to remove (numbers in brackets) from (1) to (20). Plus is there an easy way for me to write the edited file straight back over the top of the file that was opened rather than keep saving the edited file as "New5.csv" and then renaming it in explorer?

halfer
  • 19,824
  • 17
  • 99
  • 186
  • Show us a sample of the dataframe resulting from ```df = pd.read_csv(csvpath)``` – itprorh66 Feb 08 '22 at 00:29
  • You can normally write the edited file back to the same path you read it from, but your operating system will often prevent you from doing so if you have it open in an application like Excel. – Stuart Feb 08 '22 at 01:09
  • Thanks guys and @Stuart, I really appreciate your help. As soon as I changed it to your recommendation of df.replace(r" \(1\)", "", regex=True, inplace=True then it worked straight away. Really appreciate the help and support. I'm going to try and tidy it up now and add a box to confirm it's been done and put it in to functions and then I'll put up here for any comments of what I should be thinking about to improve it. Thanks again! – BristolPip Feb 08 '22 at 09:19
  • Thanks @itprorh66, the output was exactly the same as the input, however it has been resolved now. Thanks for your help and interest though, very much appreciated. – BristolPip Feb 08 '22 at 09:21
  • Similar question: https://stackoverflow.com/questions/28986489/how-to-replace-text-in-a-string-column-of-a-pandas-dataframe – Stuart Feb 08 '22 at 10:36

1 Answers1

0

df.replace finds and replaces whole cells, not text within them, unless the string to find is given as a regular expression:

df.replace(r" \(1\)", "", regex=True, inplace=True)

Alternatively, use Series.str.replace instead.

df["Part Number."] = df["Part Number."].str.replace('(1)', '')

To replace all the bracketed numbers in one go, use a more general regex:

df["Part Number."] = df["Part Number."].str.replace(r" \(\d\d?\)", '', regex=True)
Stuart
  • 9,597
  • 1
  • 21
  • 30