I have a text string and I want to replace two-word phrases with a single token. For example, I want to replace "artificial intelligence" with "artificial_intelligence". This needs to be done for a list of 200 phrases on a text file of about 5 MB. I tried str.replace(), but it handles only one phrase at a time, not the whole list.

Example

Text = 'Artificial intelligence is useful for us in every situation of deep learning.'

List a : List b
Artificial intelligence : artificial_intelligence
Deep learning : deep_learning
...

Text.replace('Artificial intelligence', 'Artificial_intelligence') works. But

for i in range(len(list_a)):
    Text = Text.replace(list_a[i], list_b[i])

doesn't work.

James Z
Aartika Sethi
  • 200 words doesn't look like much. Have you tried using Sublime Text and doing a simple find/replace? – Edilson Borges Jun 08 '17 at 12:37
  • Actually, I have 98 such files, so it would take too long to update them all manually :) – Aartika Sethi Jun 08 '17 at 12:39
  • OK, but is this just a one-time replace? If so, I still recommend using Sublime Text: press Ctrl + Shift + F and replace all occurrences. – Edilson Borges Jun 08 '17 at 12:41
  • @AartikaSethi Why not use a regex to search for the phrase and then replace it? – Kumar Jun 08 '17 at 12:41
  • What is actually not working? Do you get an error message? Is the result erroneous? – Tryph Jun 08 '17 at 12:44
  • @kumar I thought of regex too, but couldn't come up with a syntactically correct pattern that matches phrases containing spaces. Can you recommend a good link? – Aartika Sethi Jun 08 '17 at 12:45
  • @tryph The terms are not getting replaced: "deep learning" is still "deep learning" and not "deep_learning". – Aartika Sethi Jun 08 '17 at 12:47
  • @AartikaSethi Since you have "deep learning" in your test string and "Deep learning" (with an uppercase D) in your list, that is expected. – Tryph Jun 08 '17 at 12:52

2 Answers


I would suggest using a dict for your replacements:

text = "Artificial intelligence is useful for us in every situation of deep learning."
replacements = {"Artificial intelligence" : "Artificial_intelligence",
                "deep learning" : "deep_learning"}

Then your approach works (although it is case-sensitive):

>>> for rep in replacements:
...     text = text.replace(rep, replacements[rep])
...
>>> print(text)
Artificial_intelligence is useful for us in every situation of deep_learning.

For other approaches (such as the regex approach suggested in the comments), have a look at the SO question "Python replace multiple strings".
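
With around 200 phrases, one possible variant of the regex approach is to compile all phrases into a single alternation and replace them in one pass over the text. The following is a minimal sketch under some assumptions: every phrase starts and ends with a word character (so that \b anchors make sense), phrases are separated by exactly one space in the text, and matching should be case-insensitive. The helper name build_replacer is hypothetical.

```python
import re

# Replacement table; keys are matched case-insensitively.
replacements = {
    "artificial intelligence": "artificial_intelligence",
    "deep learning": "deep_learning",
}

def build_replacer(table):
    """Return a function that applies every replacement in a single pass."""
    # Lower-case the keys so a match can be looked up regardless of its case.
    lookup = {key.lower(): value for key, value in table.items()}
    # Try longer phrases first so they win over shorter phrases sharing a prefix.
    phrases = sorted(lookup, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(p) for p in phrases) + r")\b",
        flags=re.IGNORECASE,
    )

    def replace_match(match):
        found = match.group(0)
        # Fall back to the matched text if the lower-cased form is not a key.
        return lookup.get(found.lower(), found)

    return lambda text: pattern.sub(replace_match, text)

replace_all = build_replacer(replacements)
text = "Artificial intelligence is useful for us in every situation of deep learning."
print(replace_all(text))
# artificial_intelligence is useful for us in every situation of deep_learning.
```

Because the pattern is compiled once and the text is scanned once, the cost does not grow with one full pass per phrase, which may matter for larger files.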

Christian König

Since the case differs between your list entries and your string, you could use the re.sub() function with the re.IGNORECASE flag to get what you want:

import re

list_a = ['Artificial intelligence', 'Deep learning']
list_b = ['artificial_intelligence', 'deep_learning']
text = 'Artificial intelligence is useful for us in every situation of deep learning.'

for from_, to in zip(list_a, list_b):
    # re.escape() ensures the phrase is matched literally, not as a regex pattern
    text = re.sub(re.escape(from_), to, text, flags=re.IGNORECASE)

print(text)
# artificial_intelligence is useful for us in every situation of deep_learning.

Note the use of the zip() function, which lets you iterate over the two lists at the same time.
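
To illustrate what zip() produces, here is a small sketch using the same two lists. Keep in mind that zip() stops at the shorter input, so if the lists differ in length the extra entries are silently dropped.

```python
list_a = ['Artificial intelligence', 'Deep learning']
list_b = ['artificial_intelligence', 'deep_learning']

# zip() pairs the i-th element of list_a with the i-th element of list_b.
pairs = list(zip(list_a, list_b))
print(pairs)
# [('Artificial intelligence', 'artificial_intelligence'), ('Deep learning', 'deep_learning')]

# The same pairs can be turned into a dict directly.
subs = dict(zip(list_a, list_b))
print(subs['Deep learning'])
# deep_learning

# With inputs of unequal length, the unmatched third element is ignored.
print(len(list(zip(['a', 'b', 'c'], ['x', 'y']))))
# 2
```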


Also note that Christian is right: a dict would be more suitable for your substitution data. With a dict, the previous code becomes the following, with exactly the same result:

import re

subs = {'Artificial intelligence': 'artificial_intelligence',
        'Deep learning': 'deep_learning'}
text = 'Artificial intelligence is useful for us in every situation of deep learning.'

for from_, to in subs.items():
    # re.escape() ensures the phrase is matched literally, not as a regex pattern
    text = re.sub(re.escape(from_), to, text, flags=re.IGNORECASE)

print(text)
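
The comments on the question mention 98 such files. The same loop could be applied file by file. This is a minimal sketch, assuming UTF-8 encoded .txt files in a single directory and replacement strings that contain no backslashes. The names replace_in_file and replace_in_directory and the 'corpus' directory are hypothetical and do not come from the original posts.

```python
import re
from pathlib import Path

subs = {'Artificial intelligence': 'artificial_intelligence',
        'Deep learning': 'deep_learning'}

def replace_in_file(path, subs):
    """Rewrite one text file in place, applying every substitution."""
    text = path.read_text(encoding='utf-8')
    for from_, to in subs.items():
        # re.escape() keeps characters such as '.' or '+' in a phrase literal.
        text = re.sub(re.escape(from_), to, text, flags=re.IGNORECASE)
    path.write_text(text, encoding='utf-8')

def replace_in_directory(directory, subs):
    """Apply the substitutions to every .txt file directly inside `directory`."""
    for path in sorted(Path(directory).glob('*.txt')):
        replace_in_file(path, subs)

# Usage (hypothetical directory name):
# replace_in_directory('corpus', subs)
```

Since the files are overwritten in place, it is probably wise to run this on a copy of the data first.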
Tryph