
I have a table that contains tokens and their translations separated by an '=' (a line of it would be 'ACTION_PLAN=Action Plan'). I need to parse another file and replace every token in it with the corresponding value.

I managed to create a dict that has all tokens as keys and the phrases as values with the following code:

with open(dictionaryFileName) as d:
    commands = dict(line.split('=', 1) for line in d)

This does what I intended: it produces a dict of the form {'TOKEN': 'Phrase'}.

However, I now need to use this dict to substitute all the tokens in another file (a csv).

Each line of this file looks like 'ACTION_PLAN,GROUP_ANALYTICAL_MAP_REPORT,READ', with exactly one token per comma-separated field, so I tried the following:

data = pd.read_csv(permissionFileName)

data["module_name"] = data["module_name"].str.translate(commands)

print(data)

where "module_name" is the name of the first column.

But it returns exactly the same data: no change at all, and no exceptions either. I did some research and found that the dict needs to have Unicode characters as keys. Is there any way to work around this other than writing the method myself?

Expected behavior for this specific block of code, given this input:

module_name, group_name, perm_name
ACTION_PLAN,GROUP_ANALYTICAL_ACTION_PLAN_REPORT,READ
ACTION_PLAN,GROUP_ANALYTICAL_MAP_REPORT,READ

output:

Action Plan,GROUP_ANALYTICAL_ACTION_PLAN_REPORT,READ
Action Plan,GROUP_ANALYTICAL_MAP_REPORT,READ

dictionary:

ACTION_PLAN=Action Plan
help-ukraine-now
brightpants

1 Answer


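As stated in this post, you can use either of the following; map is said to be faster:

data["module_name"].replace(commands)
data["module_name"].map(commands)

Note that the two differ for values that have no entry in the dict: replace leaves them unchanged, while map turns them into NaN.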
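
A minimal sketch of both calls on the question's sample data. The in-memory strings standing in for the dictionary file and the CSV, and the extra UNKNOWN_TOKEN row, are assumptions added for illustration. The dict is built with the trailing newline stripped from each line, since otherwise every value would end in '\n'.

```python
import io

import pandas as pd

# Stand-ins for the dictionary file and the CSV (assumed contents).
dictionary_text = "ACTION_PLAN=Action Plan\n"
csv_text = (
    "module_name,group_name,perm_name\n"
    "ACTION_PLAN,GROUP_ANALYTICAL_MAP_REPORT,READ\n"
    "UNKNOWN_TOKEN,GROUP_ANALYTICAL_MAP_REPORT,READ\n"
)

# Build the token -> phrase dict, dropping the trailing newline
# and skipping any line that has no '=' in it.
commands = dict(
    line.rstrip("\n").split("=", 1)
    for line in io.StringIO(dictionary_text)
    if "=" in line
)

data = pd.read_csv(io.StringIO(csv_text))

# replace: values equal to a key are substituted, all others are kept.
replaced = data["module_name"].replace(commands)

# map: every value is looked up in the dict; missing keys become NaN.
mapped = data["module_name"].map(commands)

print(replaced.tolist())
print(mapped.tolist())
```

If the column may contain tokens that are missing from the dictionary, replace is probably the safer choice, because map would silently turn them into NaN.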

If you need partial replacements (for anyone else reading this), you can use data["module_name"].replace(commands, regex=True), which does two things, so use with caution:

  • Enables partial (substring) replacement
  • Treats the dict keys as regular expressions
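
A short sketch of why caution is warranted with regex=True. The token "A.B" and the sample Series are hypothetical, chosen only to show a key that contains a regex metacharacter. Escaping the keys with re.escape is one way to keep partial replacement while matching the tokens literally.

```python
import re

import pandas as pd

commands = {"ACTION_PLAN": "Action Plan"}
groups = pd.Series(["GROUP_ANALYTICAL_ACTION_PLAN_REPORT", "ACTION_PLAN"])

# With regex=True the key also matches inside longer values.
partial = groups.replace(commands, regex=True)

# Hypothetical token with a metacharacter: '.' matches any character.
risky = {"A.B": "x"}
tokens = pd.Series(["A.B", "AXB"])
unescaped = tokens.replace(risky, regex=True)

# Escaping the keys makes them match literally.
escaped_keys = {re.escape(key): value for key, value in risky.items()}
escaped = tokens.replace(escaped_keys, regex=True)

print(partial.tolist())
print(unescaped.tolist())
print(escaped.tolist())
```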
Laurens Koppenol