How can you replace characters in a string under the condition re.IGNORECASE?

Question

sentence = 'this is a book.pdf'

sentence.replace( 'pdf' or 'PDF' ,'csv' )

sentence.replace('pdf','csv',re.IGNORECASE)

How can I replace the characters regardless of case, so that variants such as Pdf or PDF are also matched, or so that case is ignored altogether?

How about using lower() on the sentence, i.e. sentence.lower().replace(...)? – Paras, Jul 09 '20 at 02:59
You should explicitly state that case sensitivity is not actually important, and that your actual goal is replacing any file extension with .csv, which is another question with a very large number of duplicates. This is a case of the [XY Problem](https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem) along with an ambiguously stated goal that does not and will not match its answers or keywords. This would no doubt pollute search results. – user120242, Jul 09 '20 at 05:05

Daniel Butler · Answer 1 · 2020-07-09T04:09:55.673

0

I’m going to assume you are doing this to a string:

sentence = sentence.lower()

Better yet, just calling sentence.lower() at the point where you next use sentence could do the trick. It is hard to say without more context.
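As a rough sketch of the trade-off (the mixed-case sample string and the helper name `has_extension` are assumptions made for illustration, not part of the question): lowercasing the whole string makes `replace` match, but it also changes the rest of the text, whereas lowercasing only for the comparison leaves the original untouched.

```python
sentence = 'This Is A Book.PDF'

# Lowercasing everything lets replace() match, but the whole name changes case.
lowered = sentence.lower().replace('pdf', 'csv')

def has_extension(name, ext):
    """Case-insensitive extension check that does not modify `name` (hypothetical helper)."""
    return name.lower().endswith('.' + ext.lower())

print(lowered)                         # this is a book.csv
print(has_extension(sentence, 'pdf'))  # True
print(sentence)                        # This Is A Book.PDF
```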

edited Jul 09 '20 at 04:09

answered Jul 09 '20 at 02:58

Daniel Butler


Although not best practice, it definitely shortens the code and works well – quadhd Jul 09 '20 at 03:50
Agreed. With more context about what it is used for, a better example could be given – Daniel Butler Jul 09 '20 at 04:10
@DanielButler I think it's fair to say that lowercasing the entire filename is, in the majority of use cases, not the intended result and could cause mangling of filenames. The only valid cases are if you actually need it in lowercase or you are using it only as an intermediate checking string (whereas OP is assigning it to a variable "sentence", which implies it will be reused in a language context where case sensitivity is probably explicitly needed) – user120242 Jul 09 '20 at 04:55

score 0 · Answer 2 · answered Jul 09 '20 at 03:13

0

If you are doing this for multiple kinds of files, you can find the index of the period (.), delete everything after it, and add the new file extension to the end:

# strings cannot be subtracted, so slice up to and including the period instead
sentence = sentence[:sentence.index(".") + 1]
sentence += "csv"
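If the name may contain more than one dot, splitting on the last dot is safer. The following is only a sketch: `swap_extension` is a hypothetical helper name, and `PurePath.with_suffix` from the standard library is shown as an alternative that was not part of the original answer.

```python
from pathlib import PurePath

def swap_extension(name, new_ext):
    """Replace whatever follows the last dot with new_ext (hypothetical helper)."""
    stem, dot, _old_ext = name.rpartition('.')
    if not dot:
        # No dot in the name at all: just append the new extension.
        return f'{name}.{new_ext}'
    return f'{stem}.{new_ext}'

print(swap_extension('this is a book.PDF', 'csv'))     # this is a book.csv
print(swap_extension('report.v2.pdf', 'csv'))          # report.v2.csv
print(PurePath('report.v2.pdf').with_suffix('.csv'))   # report.v2.csv
```

Because the old extension is discarded without being inspected, its case never matters here.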

answered Jul 09 '20 at 03:13

TheTriad


What if there are multiple dots in the name? Is there a way of specifying the last dot? – quadhd Jul 09 '20 at 03:36
Use `.rindex` to get the index of the last one – user120242 Jul 09 '20 at 04:47

user120242 · Accepted Answer · 2020-07-09T05:04:34.510

0

It looks as if you want to strip whatever file extension is found and add .csv. I would recommend using \w{1,5} (one to five word characters) instead of \w+ (one or more), because of files with names like an12n512n5125.1125n125n125, which I have often had in my own file blobs.

Match a period followed by one or more word characters (letters, digits, or underscore) at the end of the string ($) and replace it with .csv. Case sensitivity no longer matters:

import re
sentence = 'this is a book.pdf'
ext2 = 'csv'
sentence = re.sub(rf'\.\w+$', f'.{ext2}', sentence)
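For comparison, here is a minimal sketch of the bounded \w{1,5} variant recommended above; the function name `to_csv` is hypothetical. Only a trailing run of one to five word characters after a dot is treated as an extension, so the long numeric tail in the example name is left alone.

```python
import re

def to_csv(name):
    # Treat only 1-5 trailing word characters after a dot as an extension.
    return re.sub(r'\.\w{1,5}$', '.csv', name)

print(to_csv('this is a book.PDF'))          # this is a book.csv
print(to_csv('an12n512n5125.1125n125n125'))  # an12n512n5125.1125n125n125 (unchanged)

# With the unbounded \w+ the same name would lose everything after the dot:
print(re.sub(r'\.\w+$', '.csv', 'an12n512n5125.1125n125n125'))  # an12n512n5125.csv
```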

Slice off the end of the string, compare it in lowercase to .pdf, and replace .pdf with .csv. This uses string interpolation (f"") for customizable extensions, although the fixed [-4:] slice assumes a three-letter extension:

sentence = 'this is a book.pdf'
ext1 = 'pdf'
ext2 = 'csv'
sentence = sentence[:-4]+f'.{ext2}' if sentence[-4:].lower()==f'.{ext1}' else sentence
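The same idea can be sketched without the hard-coded length of 4 by deriving the slice from the extension itself. The helper name `replace_ext` and the sample file names are assumptions made for illustration.

```python
def replace_ext(name, old_ext, new_ext):
    """Swap old_ext for new_ext only when name ends with it, ignoring case (hypothetical helper)."""
    suffix = '.' + old_ext
    if name.lower().endswith(suffix.lower()):
        # Cut exactly as many characters as the old suffix has.
        return name[:-len(suffix)] + '.' + new_ext
    return name

print(replace_ext('this is a book.Pdf', 'pdf', 'csv'))  # this is a book.csv
print(replace_ext('Scan.JPEG', 'jpeg', 'png'))          # Scan.png
print(replace_ext('PDFEscape.exe', 'pdf', 'csv'))       # PDFEscape.exe (unchanged)
```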

Use a regex with $ to match the end of the string, together with re.IGNORECASE. This also uses string interpolation for customizable extensions:

import re
sentence = 'this is a book.pdf'
ext1 = 'pdf'
ext2 = 'csv'
sentence = re.sub(rf'\.{ext1}$', f'.{ext2}', sentence, flags=re.IGNORECASE)
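If the extension comes from user input, it may contain characters that are special in a regex (a dot in tar.gz, for example). A hedged sketch that wraps the same call in a function and escapes the extension with `re.escape` follows; the function name `replace_ext_re` is hypothetical.

```python
import re

def replace_ext_re(name, old_ext, new_ext):
    """Case-insensitive extension swap anchored at the end of the string (hypothetical helper)."""
    # re.escape keeps characters such as '.' in old_ext from acting as regex syntax.
    pattern = rf'\.{re.escape(old_ext)}$'
    # A function replacement inserts new_ext literally, with no backslash processing.
    return re.sub(pattern, lambda m: f'.{new_ext}', name, flags=re.IGNORECASE)

print(replace_ext_re('this is a book.PDF', 'pdf', 'csv'))  # this is a book.csv
print(replace_ext_re('archive.TAR.GZ', 'tar.gz', 'zip'))   # archive.zip
print(replace_ext_re('PDFEscape.exe', 'pdf', 'csv'))       # PDFEscape.exe (unchanged)
```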

edited Jul 09 '20 at 05:04

answered Jul 09 '20 at 03:23

user120242


The initial solution is not good practice, as it assumes you know where .pdf is located, but the latter is more plausible – quadhd Jul 09 '20 at 04:03
I don't understand that assertion; is the use case not for file extensions? File extensions must be at the end of the string, so you would want to explicitly allow a match only at the end of the string, which is the reasoning for using [:-4]. Unless you want to allow .pdf anywhere in the string, which in my own usage is actually a source of false positives: for example (a real one), PDFEscape.exe => csvEscape.exe would cause issues. If a case like file.pdf.zip also needs to be handled, a slightly more complicated regex would be needed to account for it – user120242 Jul 09 '20 at 04:23
If this is a simple batch script where all your file names are known to be within the domain where .lower works, there's no reason not to just use .lower. It really has nothing to do with "best practice"; if you're just creating a .bat/.sh script or something similar for a restricted use case, cutting this corner doesn't matter. – user120242 Jul 09 '20 at 04:30
oh yes,I get what you mean now – quadhd Jul 09 '20 at 05:44
