-2

I have been working on this project for a couple of months: taking generated xls and xlsx documents and using a combination of the csv module and pandas (Python) to rearrange the data so it can be manually uploaded to a system that requires a specific column order for a correct import.

No stress. There are several different documents, each with its own original structure, as well as many templates for the import. Besides rearranging the data, I have also needed to add some internal codes to some of the documents, which we need for local student work management, and rename some columns to match the import requirements. I have all of this working except for one thing: student names and instructor names are currently listed as [lastname,firstname] and NEED to be [lastname, firstname], with a SPACE added after the comma, before the first name. All my efforts to code this have failed.

I have messed around with a for loop using regex and something as simple as

df.replace(',', ', ', regex=True) 

OR

df["Column Name"].str.replace(',', ', ') (which I think is way wrong, but tried it anyway)

What else might I try to accomplish what I would think of as being simple? Nothing seems to be working. I run the script and get no errors, and yet the change is not being made. I have looked all over Stack Overflow, but am not having success.

Thank you in advance.

jyssyl
  • 2
    So many words, so little really useful information. Please check [ask]. Show [mre], incl. sample data. Check [how to make good reproducible pandas examples](https://stackoverflow.com/q/20109391/4046632) – buran Nov 29 '21 at 13:53
  • 2
    You might just need to set the default `inplace=False` argument to `inplace=True` – Andre Nov 29 '21 at 13:54
  • 2
    Also, neither of your code snippets works in place. – buran Nov 29 '21 at 13:54

2 Answers

1

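You could use a regular expression to ensure there is always exactly one space after the comma:

import pandas as pd

# Sample names with zero, one, or several spaces after the comma
data = [['flintstone,fred'], ['flintstone, wilma'], ['rubble,     barney']]
df = pd.DataFrame(data, columns=['Name'])

# Replace the comma and any spaces that follow it with ", ",
# and assign the result back, since str.replace returns a new Series
df['Name'] = df['Name'].str.replace(', *', ', ', regex=True)

print(df)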

Giving you:

                Name
0   flintstone, fred
1  flintstone, wilma
2     rubble, barney
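
The same pattern can be applied to several columns at once, which may be closer to the student/instructor case in the question. This is only a minimal sketch: the column names `Student`, `Instructor` and `Code` are hypothetical (the question does not give the real ones), and `\s*` is used instead of ` *` so that any whitespace after the comma is collapsed, not just spaces.

```python
import pandas as pd

# Hypothetical sample data; the column names are assumptions for illustration.
df = pd.DataFrame({
    "Student": ["flintstone,fred", "rubble,   barney"],
    "Instructor": ["slate, mr", "slaghoople,pearl"],
    "Code": ["A,1", "B,2"],  # a non-name column that should stay untouched
})

# Only these columns get the "last, first" normalization.
name_columns = ["Student", "Instructor"]

for col in name_columns:
    # Replace a comma plus any following whitespace with ", "
    # and assign the result back to the column.
    df[col] = df[col].str.replace(r",\s*", ", ", regex=True)

print(df)
```

Limiting the loop to an explicit list of columns avoids changing commas in other fields, such as internal codes.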
Martin Evans
0

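`replace` returns a new DataFrame rather than modifying the existing one, so if you do further operations on that df, assign the result back. Also note that without `regex=True`, `DataFrame.replace` only matches whole cell values, not a comma inside a longer string:

df = df.replace(',', ', ', regex=True)

Otherwise, try adding more lines of your code to show what you do after the replace operation.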
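
To illustrate why a script can run without errors and still leave the data unchanged, here is a small sketch. The sample data and the variable names (`unchanged`, `whole_cell`, `fixed`) are made up for the example; it assumes the names contain no space after the comma to begin with.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["flintstone,fred", "rubble,barney"]})

# 1) The result is discarded: str.replace returns a new Series,
#    so df itself is not modified by this line.
df["Name"].str.replace(",", ", ", regex=False)
unchanged = df["Name"].tolist()

# 2) DataFrame.replace without regex=True compares whole cell values,
#    so a comma inside a longer string is never matched.
whole_cell = df.replace(",", ", ")["Name"].tolist()

# 3) Substring matching enabled, and the result kept in a variable.
fixed = df.replace(",", ", ", regex=True)

print(unchanged)
print(whole_cell)
print(fixed["Name"].tolist())
```

Note that a plain `','` to `', '` replacement would double the space in names that already have one, which is why a pattern that also consumes the following spaces is safer for mixed data.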

mucio