I have been running a Pywikibot bot on Marathi Wikipedia for almost a month now. The bot's only task is find and replace. You can find general details of Pywikibot at: pywikibot. The details of the find-and-replace operation itself are at replace.py and fixes.py, and further examples of fixes are here.
The following is part of my source code. When running the bot on Marathi Wikipedia, I am running into a problem caused by the Marathi script. All of the replacements work fine except one. To illustrate, I will use English words instead of Marathi.
The first fix in the following code, "name", searches for "{{PAGENAME}}" and replaces it with "{{subst:PAGENAME}}". The msg parameter is the edit summary.
The second fix, "man", finds "man" and replaces it with "gent". The problem is that it also changes "human" to "hugent", "craftsmanship" to "craftsgentship", and so on.
    fixes = {
        'name': {
            'regex': True,
            'nocase': True,
            'msg': {'mr': '{{PAGENAME}} → पानाचे मूळ नाव (base name of page)'},
            'replacements': [
                (r'{{ *PAGENAME *}}', '{{subst:PAGENAME}}'),
            ],
        },
        'man': {
            'regex': True,
            'msg': {'mr': 'man → gent'},
            'replacements': [
                ('man', 'gent'),
            ],
        },
    }
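The unwanted behaviour can be reproduced outside the bot with Python's `re` module alone. This is only a minimal sketch: it assumes that a fix with 'regex': True ends up being applied much like `re.sub`, which I have not verified against the replace.py source.

```python
import re

samples = ["man", "human", "craftsmanship"]

# A bare 'man' pattern matches anywhere in the text,
# including in the middle of longer words.
results = [re.sub('man', 'gent', word) for word in samples]

print(results)  # ['gent', 'hugent', 'craftsgentship']
```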
So I tried changing the replacement pair from ('man', 'gent') to ('man ', 'gent ') (a space at the end), and then to (' man ', ' gent ') (a space at both ends). But with either change nothing was replaced at all, not even the standalone "man".
So how do I change "He was a good man - a true humanitarian" to "He was a good gent - a true humanitarian" without turning the last word into "hugentitarian"?
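To make the goal concrete, here is a sketch of the behaviour I am after, written in plain Python rather than as a Pywikibot fix. The helper name `replace_whole_word` and the lookaround pattern are illustrative assumptions, not anything taken from replace.py or fixes.py, and I have not checked whether the same pattern behaves this way inside the bot or on Marathi text.

```python
import re

def replace_whole_word(text, old, new):
    # Hypothetical helper: match `old` only when it is not directly
    # preceded or followed by a non-whitespace character, i.e. when it
    # is delimited by whitespace or by the start/end of the text.
    pattern = r'(?<!\S)' + re.escape(old) + r'(?!\S)'
    return re.sub(pattern, new, text)

sentence = "He was a good man - a true humanitarian"
print(replace_whole_word(sentence, "man", "gent"))
# He was a good gent - a true humanitarian
```

Note that this sketch treats only whitespace as a delimiter, so "man" followed directly by punctuation (for example "man.") would be left unchanged.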