I've read through the threads listed below:
But they are not exactly what I am looking for.
What I am trying to accomplish here is to rename files while converting them from Excel into csv. My conversion code works, BUT I also want to get rid of the unnecessary words in my output file names.
Let's say my file names are:
"Sample_file_2016-4-30.xlsx", "Hello_world_2014-5-30.xlsx", "Great_day_2015-1-14.xlsx"
I want my output to be (with all characters before the date deleted):
"2016-4-30.csv", "2014-5-30.csv", "2015-1-14.csv"
Here's what I've already done (and the code works):
import csv
import os

import xlrd

def xslx_to_csv():
    files = os.listdir(r"~\files to be converted")
    current_path = os.getcwd()
    os.chdir(r"~\files to be converted")
    for file in files:
        print file
        # Base name without the .xlsx extension
        filename = os.path.splitext(file)[0]
        wb = xlrd.open_workbook(file)
        sh = wb.sheet_by_index(0)
        new_ext = 'csv'
        new_name = (filename, new_ext)
        csvfile = open(".".join(new_name), 'wb')
        wr = csv.writer(csvfile, quoting=csv.QUOTE_ALL)
        # Copy every row of the first sheet into the csv file
        for rownum in xrange(sh.nrows):
            wr.writerow(sh.row_values(rownum))
        csvfile.close()
However, this code only gives me the following output:
"Sample_file_2016-4-30.csv", "Hello_world_2014-5-30.csv", "Great_day_2015-1-14.csv"
What I've tried so far:
I've tried using os.rename() and str.replace() (as suggested by Djizeus), and I've also tried slicing at a static string position, e.g. new_name[14:35], to get the partial name.
But I need a more dynamic method. How can I recognize and remove all characters before the date, which is in yyyy-mm-dd format?
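To make the goal concrete, here is a minimal sketch of the kind of dynamic approach meant here, using a regular expression to find where the date starts. It assumes the date is always a 4-digit year followed by a 1-2 digit month and day, separated by dashes; the helper name strip_prefix and the constant DATE_RE are hypothetical and not part of the conversion code above.

```python
import os
import re

# Assumed date shape: 4-digit year, then 1-2 digit month and day, dash-separated.
DATE_RE = re.compile(r"\d{4}-\d{1,2}-\d{1,2}")

def strip_prefix(filename, new_ext="csv"):
    """Return the name starting at the date, with the extension swapped."""
    base = os.path.splitext(filename)[0]
    match = DATE_RE.search(base)
    if match is None:
        # No date found: keep the original base name rather than guess.
        return base + "." + new_ext
    # Keep everything from the first character of the date onward.
    return base[match.start():] + "." + new_ext

names = ["Sample_file_2016-4-30.xlsx",
         "Hello_world_2014-5-30.xlsx",
         "Great_day_2015-1-14.xlsx"]
new_names = [strip_prefix(n) for n in names]
```

In the conversion loop, the result of a helper like this could presumably be passed to open() in place of ".".join(new_name).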
Bonus question: I'd like to take this a bit further. Instead of just REMOVING the extra parts of the file names, how can I ALTER the file names? For example, in this case, the desired output could be:
"Bonus_file_2016-4-30.csv", "Bonus_file_2014-5-30.csv", "Bonus_file_2015-1-14.csv"
So basically, I want to replace the words at the beginning with a fixed prefix such as "Bonus_file".
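The bonus case could be sketched with re.sub and a lookahead, so that everything before the date is swapped for a fixed prefix. As before, this assumes the date looks like yyyy-m-d with dashes; the function name replace_prefix and its default prefix "Bonus_file_" are hypothetical choices made only for illustration.

```python
import os
import re

# Lazily match everything from the start of the name up to (not including) the date.
# Assumed date shape: 4-digit year, then 1-2 digit month and day, dash-separated.
PREFIX_RE = re.compile(r"^.*?(?=\d{4}-\d{1,2}-\d{1,2})")

def replace_prefix(filename, new_prefix="Bonus_file_", new_ext="csv"):
    """Swap whatever precedes the date for new_prefix and change the extension."""
    base = os.path.splitext(filename)[0]
    # If no date is present the pattern does not match and base is left unchanged.
    renamed = PREFIX_RE.sub(new_prefix, base, count=1)
    return renamed + "." + new_ext

names = ["Sample_file_2016-4-30.xlsx",
         "Hello_world_2014-5-30.xlsx",
         "Great_day_2015-1-14.xlsx"]
bonus_names = [replace_prefix(n) for n in names]
```

Passing new_prefix="" would reduce this to the plain "remove everything before the date" case.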