Hello below please view my VBA code that works. I am trying to re-write it in Pandas, but my Pandas script is not working half way in ( My Pandas attempted script is below the VBA) Could anyone help me finish this off if possible ( which I think it is)
Sub mymacro()
Columns(19).Replace "DFHD", "SFD"
Columns(19).Replace "DFBG", "SFD"
Columns(19).Replace "DFVD", "SFD"
Columns(19).Replace "MFUB", "BFD"
Columns(19).Replace "MFBD", "BFD"
Columns(19).Replace "DFBD", "BFD"
Columns(19).Replace "UFNC", "CFD"
Columns(19).Replace "UFNC", "CFD"
Columns(19).Replace "BFYD", "BFD"
'Having trouble starting below here'
Columns("T:AC").Select
Selection.EntireColumn.Hidden = True
ActiveSheet.Range("$A$1:$AS$1000000").AutoFilter Field:=19, Criteria1:=Array( _
"U*"), Operator:=xlFilterValues
ActiveWindow.SmallScroll Down:=-100
ActiveSheet.Range("$A$1:$AS$1000000").AutoFilter Field:=30, Criteria1:=Array( _
"350", "B*"), Operator:=xlFilterValues
ActiveWindow.SmallScroll Down:=-100
Range("S3").Select
ActiveCell.FormulaR1C1 = "BD"
Range("S3").Select
Selection.Copy
Range(Selection, Selection.End(xlDown)).Select
ActiveSheet.Paste
Range("S3").Select
Application.CutCopyMode = False
ActiveSheet.ShowAllData
ActiveSheet.Range("$A$1:$AS$1000000").AutoFilter Field:=19, Criteria1:="=UND", Operator:=xlOr, Criteria2:="=UNH"
ActiveWindow.SmallScroll Down:=-21
ActiveSheet.Range("$A$1:$AS$1000000").AutoFilter Field:=30, Criteria1:=Array( _
"DR9", "DV0", "DV5", "DV8", "DV9", "DVG", "DV*"), Operator:=xlFilterValues
ActiveWindow.SmallScroll Down:=-36
Range("S11").Select
ActiveCell.FormulaR1C1 = "SD"
Range("S11").Select
Selection.Copy
Range(Selection, Selection.End(xlDown)).Select
ActiveSheet.Paste
Range("S11").Select
Application.CutCopyMode = False
ActiveSheet.ShowAllData
ActiveWindow.SmallScroll Down:=-10
ActiveSheet.Range("$A$1:$AS$1000000").AutoFilter Field:=19, Criteria1:="UNH"
ActiveWindow.SmallScroll Down:=-27
Range("S1815").Select
ActiveCell.FormulaR1C1 = "FUHD"
Range("S1815").Select
Selection.Copy
Range(Selection, Selection.End(xlDown)).Select
ActiveSheet.Paste
Range("S1815").Select
Application.CutCopyMode = False
ActiveWindow.SmallScroll Down:=-30
ActiveSheet.ShowAllData
ActiveWindow.SmallScroll Down:=-240
Below is my Pandas script, note where I start having trouble is commented , because the first 12 lines of code work great.
import pandas as pd
import numpy as np
data = pd.read_excel("orsthrufirstarticledeltion.xlsx", encoding = "ISO-8859-1", dtype=object)
data.loc[data.Format == 'DFHD', 'Format'] = 'SFD'
data.loc[data.Format == 'DFBG', 'Format'] = 'SFD'
data.loc[data.Format == 'DFVD', 'Format'] = 'SFD'
data.loc[data.Format == 'MFUB', 'Format'] = 'BFD'
data.loc[data.Format == 'MFBD', 'Format'] = 'BFD'
data.loc[data.Format == 'DFBD', 'Format'] = 'BFD'
data.loc[data.Format == 'UFNC', 'Format'] = 'CFD'
data.loc[data.Format == 'BFYD', 'Format'] = 'BFD'
# Trouble starts below
data.loc[(data["Fmt"] != str) & (data["Format"] == "UN*"), "Format"] = 'BD' # the UN* did not work
#data.loc[(data["Fmt"] == '350') & (data["Format"] == "UNB"), "Format"] = 'BD'
#data.loc[(data["Fmt"] != str) & (data[data.Format.str.startswith('UN',na=False)]), "Format"] = 'BD'
#
writer = pd.ExcelWriter('mstrplc2.xlsx', engine='xlsxwriter')
data.to_excel(writer, sheet_name='Sheet1')
writer.save()
----- New Attempt At Getting A Solution ---------
Below please view the sample dataframe with the raw data we will start with, I have the code to export into excel if you wish.
import pandas as pd
startdf = pd.DataFrame({'Column_A':['DFHD', 'DFBG', 'DFVD', 'MFUB', 'MFBD', 'DFBD', 'UFNC', 'UFNC', 'BFYD',
'UNFZ', 'UNT', 'UNIX', 'UNFZ', 'UNT', 'UNIX','UNFZ', 'UNT', 'UNIX', 'UNFZ', 'UNT', 'UNIX','UNFZ', 'UNT', 'UNIX'],
'Column_B':['test','test','test','test','test','test','test','test','test','B50','DVG','DV9','DV5','DV0','B25','U66','U1C','350','357','BVG','DBG','BUG','UVG','DV8']})
writer = pd.ExcelWriter('testdf.xlsx', engine='xlsxwriter')
startdf.to_excel(writer, sheet_name='Sheet1')
The First step will be to take all of the values in Column A and replace the existing values with the new values as listed below (so were just editing Column A)
- "DFHD" -> "SFD" "DFBG"->"SFD" "DFVD"-> "SFD" "MFUB"-> "BFD" "MFBD"-> "BFD" "DFBD"->"BFD" "UFNC"-> "CFD" "UFNC"->"CFD" "BFYD"-> "BFD"
After writing in this logic the data should look like this:
df2 = pd.DataFrame({'Column_A':['SFD', 'SFD', 'SFD', 'BFD', 'BFD', 'BFD', 'CFD', 'CFD', 'BFD',
'UNFZ', 'UNT', 'UNIX', 'UNFZ', 'UNT', 'UNIX','UNFZ', 'UNT', 'UNIX', 'UNFZ', 'UNT', 'UNIX','UNFZ', 'UNT', 'UNIX'],
'Column_B':['test','test','test','test','test','test','test','test','test','B50','DVG','DV9','DV5','DV0','B25','U66','U1C','350','357','BVG','DBG','BUG','UVG','DV8']})
Now we will continue to edit just Column A, but use value in Column B to dictate what Column A values should be, so think of each value row by row. First filter out SFD, BFD and CFD from Column A, so the remaining values will be 'UNFZ', 'UNT', 'UNIX', 'UNFZ', 'UNT', 'UNIX','UNFZ', 'UNT', 'UNIX', 'UNFZ', 'UNT', 'UNIX','UNFZ', 'UNT', 'UNIX'. For these remaining values, we will look to column B to decide how to change what is in column A. Logic below:
- Values starting with B or that are a number in Column B, should mean that the matching row value in Column A should now change to BFD
- Values starting with D or OPT in Column B should mean that the matching row value in Column A should now change to SFD
- Values starting with U or that are number in Column B, should mean that the matching row value in Column A should now change to UHFD
After this logic the final output dataframe should be
resultdf = pd.DataFrame({'Column_A':['SFD', 'SFD', 'SFD', 'BFD', 'BFD', 'BFD', 'CFD', 'CFD', 'BFD',
'BFD', 'SFD', 'SFD', 'SFD', 'SFD', 'BFD','UHFD', 'UHFD', 'BFD', 'BFD', 'BFD', 'SFD','BFD', 'UHFD', 'SFD'],
'Column_B':['test','test','test','test','test','test','test','test','test','B50','DVG','DV9','DV5','DV0','B25','U66','U1C','350','357','BVG','DBG','BUG','UVG','DV8']})
writer = pd.ExcelWriter('finalresult.xlsx', engine='xlsxwriter')
resultdf.to_excel(writer, sheet_name='Sheet1')