I have a bunch of data that looks like this:
Bigtable,[4] MariaDB[5]
How do I use Python's re library to remove those [4] citation markers?
You can use re.sub to remove those citation markers:
>>> import re
>>> s = "Bigtable,[4] MariaDB[5]"
>>> re.sub(r'\[.*?\]', '', s)
'Bigtable, MariaDB'
The regex \[.*?\] matches any substring that starts with [ and ends with ], with as few characters inside the brackets as possible.
If you only want to remove square brackets with numbers inside, use this regex instead: \[\d+\]
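To see how the two patterns differ, here is a small sketch. The sample string with a non-numeric [note] marker is made up for illustration and is not from the question:

```python
import re

# Hypothetical sample: two numeric markers plus one non-numeric bracket.
s = "Bigtable,[4] MariaDB[5] [note]"

# Non-greedy "anything in brackets": removes every bracketed group.
all_removed = re.sub(r'\[.*?\]', '', s)

# Digits only: removes [4] and [5] but keeps [note].
digits_removed = re.sub(r'\[\d+\]', '', s)

print(repr(all_removed))     # 'Bigtable, MariaDB '
print(repr(digits_removed))  # 'Bigtable, MariaDB [note]'
```

Which one you want depends on whether your data can contain brackets that are not citation numbers.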
It is slightly unclear what you want in your output: drop only the markers that follow a comma, like [4], or drop all of them, such as [4], [5].
The two Python examples further down show how to handle these two scenarios. However, if you want to use the command line, here is what you can do using echo + sed.
echo "Bigtable,[4] MariaDB[5]" | sed -E "s/\[[0-9]+\]//g"
## Output
Bigtable, MariaDB
Python
Assuming that you only want to remove the citation markers that directly follow a comma, like [4], and keep the trailing ones, like [5], this should work for you.
See example in Regex101.com: scenario-1
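import re
# text is the test string defined at the end of this answer
# define regex pattern(s): group 1 is the [n] marker after a comma, group 2 the whitespace after it
pattern1 = r"(?:,\s*)(\[\d+\])(?:(\s*)?)"
# compile regex pattern(s) for speed
pat1 = re.compile(pattern1)
# evaluate regex substitution: keep the comma and the trailing whitespace, drop the marker
result1 = pat1.sub(r',\g<2>', text)
print(result1)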
With ,\g<2> we are replacing ,[4] with , (followed by whatever whitespace the second group captured) but leaving [5] untouched. See here for more details.
Output:
Bigtable, MariaDB[5]
Bigtable, MariaDataBase[15]
Bigtable, GloriaDB[51]
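If the group numbering in the pattern above is hard to follow, the sketch below inspects a single match to show what each group holds. It reuses the same pattern on one line of the test text; the variable names whole, marker and trailing are only illustrative:

```python
import re

pat1 = re.compile(r"(?:,\s*)(\[\d+\])(?:(\s*)?)")

line = "Bigtable, [14] GloriaDB[51]"
m = pat1.search(line)

# Group 0 is the whole match, including the comma and surrounding whitespace.
whole = m.group(0)      # ', [14] '
marker = m.group(1)     # '[14]'
trailing = m.group(2)   # ' '

# Rebuilding the match as "," + group 2 drops the marker and the space before it.
result = pat1.sub(r',\g<2>', line)
print(result)           # Bigtable, GloriaDB[51]
```

[51] survives because it is not preceded by a comma, so the pattern never matches it.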
Python
Removing all such [4], [5] citation markers.
See example in Regex101.com: scenario-2
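import re
# text is the test string defined at the end of this answer
# define regex pattern(s): any [n] marker, wherever it appears
pattern2 = r"(\[\d+\])"
# compile regex pattern(s) for speed
pat2 = re.compile(pattern2)
# evaluate regex substitution: replace every marker with the empty string
result2 = pat2.sub('', text)
print(result2)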
Output:
Bigtable, MariaDB
Bigtable, MariaDataBase
Bigtable,  GloriaDB
Note the double space in the last line: removing [14] leaves behind the space on each side of it.
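Since the question mentions a bunch of data, here is a minimal sketch that applies the same idea to a list of strings. It assumes the rows are already loaded into a Python list; the helper name strip_citations is hypothetical. The pattern also consumes any whitespace in front of a marker, so a marker with a space on both sides does not leave two spaces behind:

```python
import re

# A [n] marker plus any whitespace directly before it.
CITATION = re.compile(r"\s*\[\d+\]")

def strip_citations(line):
    """Remove every [n] marker (and whitespace before it) from one string."""
    return CITATION.sub("", line)

rows = [
    "Bigtable,[4] MariaDB[5]",
    "Bigtable,[40] MariaDataBase[15]",
    "Bigtable, [14] GloriaDB[51]",
]
cleaned_rows = [strip_citations(r) for r in rows]
print(cleaned_rows)
# ['Bigtable, MariaDB', 'Bigtable, MariaDataBase', 'Bigtable, GloriaDB']
```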
# we will use this text for testing the regex; define it before running the Python examples above
text = """\
Bigtable,[4] MariaDB[5]
Bigtable,[40] MariaDataBase[15]
Bigtable, [14] GloriaDB[51]
"""