I searched for existing answers before asking this. I am running a Windows 11 machine. The CSV file I received has a stray " character in some lines, which causes an error when importing into MongoDB, so I want to remove it. I found that the sed command does this very quickly. Many of you may recommend Python's replace function, but that is not feasible here because the file is 5 GB in size, and when I tested both methods sed was much faster.
On my system I have to type bash in the command terminal to enter bash mode, and then run the sed command from there.
How should I call subprocess.run() to achieve this? Below is my code:
import subprocess
p = subprocess.run('bash' | r"sed -i 's/\"/-/g' D:\Backupfiles\MAY2021\Names.csv", shell=True, capture_output=True, check=True)
print(p.returncode)
Below is the error I get when running the above code:
"C:\Users\AEC Office Kollam\anaconda3\envs\SDR Project\python.exe" "C:/Users/AEC Office Kollam/Documents/Atom/Python/MongoDB/SDR Project/subprocesstutorial.py"
Traceback (most recent call last):
File "C:\Users\AEC Office Kollam\Documents\Atom\Python\MongoDB\SDR Project\subprocesstutorial.py", line 3, in <module>
p = subprocess.run('bash' r"sed -i 's/\"/-/g' D:\Backupfiles\MAY2021\SDR1\BSNL\BSNL-DEC2020-EKYCC.csv", shell=True, capture_output=True, check=True)
File "C:\Users\AEC Office Kollam\anaconda3\envs\SDR Project\lib\subprocess.py", line 528, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'bashsed -i 's/\"/-/g' D:\Backupfiles\MAY2021\SDR1\BSNL\BSNL-DEC2020-EKYCC.csv' returned non-zero exit status 1.
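The error message shows the command as 'bashsed ...': the two adjacent string literals were joined into one word, so no program called bash was ever started. One possible way around this is to pass the whole sed command as a single argument to bash -c. The following is only a minimal sketch under several assumptions: bash is on PATH, the sed inside it is GNU sed (the \x22 hex escape for the double quote is a GNU extension, used here to keep literal quotes out of the Windows command line), and bash sees drive D: as /mnt/d, which is the WSL convention and may differ for other bash installations. The helper names build_sed_command and run_sed are hypothetical.

```python
import shlex
import subprocess


def build_sed_command(bash_path):
    """Return an argument list that runs sed in place through `bash -c`.

    `bash_path` is the file path as bash sees it, not the Windows path
    (assumption: something like /mnt/d/... under WSL).
    """
    # \x22 is the hex code of the double-quote character (GNU sed escape).
    sed_script = r"s/\x22/-/g"
    # shlex.quote adds POSIX single quotes where needed, so spaces or
    # backslashes survive bash's own parsing.
    inner = f"sed -i {shlex.quote(sed_script)} {shlex.quote(bash_path)}"
    # "bash" and "-c" are separate list items, so they cannot be fused
    # into one word the way 'bash' r"sed ..." was.
    return ["bash", "-c", inner]


def run_sed(bash_path):
    """Run the command and return the CompletedProcess for inspection."""
    # No shell=True: the list is handed to bash directly.
    # check=False so stderr can be read instead of raising immediately.
    return subprocess.run(
        build_sed_command(bash_path),
        capture_output=True,
        text=True,
        check=False,
    )
```

It could be called as p = run_sed("/mnt/d/Backupfiles/MAY2021/Names.csv"), then print(p.returncode, p.stderr) to see what sed reported. Since sed -i rewrites the file in place, trying it on a small copy first seems safer.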