3

First of all, I'll discuss my need from this post..... Now I need to update existing excel xlsm file from another excel csv file, as this csv file is an exported csv from outlook mail inbox and this is the template of my outlook mail csv

|---------------------|------------------|------------------|------------|
|      Subject        |       Body       |    From: (Name)  | To: (Name) |
|---------------------|------------------|------------------|------------|
|blabla LCAIN5678 bla |bla bla bla bla   |bla bla bla bla   |bla bla bla |
|---------------------|------------------|------------------|------------|

As I working and analyze on this csv file, by using str.extract after I read with pandas and this is my extracting code as to extract specific data from the csv file, as to extract specific string like this LCAIN5678 contains five characters and five numbers and then use dropna() with,

This code

# this object extract 5 chars and 5 numbers from specific column in csv
replaced_sbj_value = myOutlook_inBox['Subject']
.str.extract(pat='(L(?:DEL|CAI|SIN).\d{5})').dropna()

This is my xlsm file read by pandas

gov_tracker_sheet = pd.read_excel(r'' + mydi
                              sheet_name
                              header=1) 

and this is my csv file read by pandas

myOutlook_inBox = pd.read_csv(r'' + mydir + 'test.CSV
                          encoding='latin-1')    

Then I inisialize it like this

myOutlook_inBox["Subject"] = replaced_sbj_value
print (replaced_sbj_value)

To be like this way

|-----------------|
|    Subject      |
|-----------------|
|   LCAIN5678     |
|-----------------|

Then I create a condition using some functions like loc str.contains

Like this condition as this conditions is filters for some texts in the csv

# Condition 1: any mail from mowafy to te
frm_mwfy_to_te = myOutlook_inBox.loc[myOutlook_inBox['From: 
(Name)'].str.contains("mowafy", na=False) \
                                 & myOutlook_inBox['To: 
(Name)'].str.contains("te", na=False)] \
.drop_duplicates(keep=False)

Then I create a variable by using join method to join variable frm_mwfy_to_te to the excel xlsm file I want to update

filtered_data_cond1 = gov_tracker_sheet.loc[
gov_tracker_sheet['SiteCode']
.str.contains('|'.join(frm_mwfy_to_te.Subject))]
print(filtered_data_cond1)

as the out of this just get me the rows I want to update from the xlsm excel file

So finally here a create a tuple value as to update the Dataframe by using this variable And this is my tuple variable

values = tuple(filtered_data_cond1['Pending  '].values.tolist())

As this is the output of my tuple value

(u'TE', u'PP', u'TE', u'TE', u'TE', u'TE', u'TE', u'TE', u'TE')

So from here I started using Regex by using replace method.

And this is my replace variable I created

updated_gov_tracker = gov_tracker_sheet.replace(to_replace=values,
                                            value='xxxxxxxxxxxx', 
regex=False)

This code works but It just replaces all rows in the xlsm file which contains TE I need to update the needed rows only

and I used another way some think like this but it's doesn't work fine

updated_gov_tracker=re.sub(values,"xxxxxxxxx",gov_tracker_sheet)

Second Thing I need I want to replace my new Dataframe with the old xlsm table sheet without losing my macros in the excel

Edited

and this is my existing excel file I want to update looks like

enter image description here

and this is my csv Outlook inbox excel looks like this

enter image description here

Now after searching I've found xlwings also could help me, by selecting the range of the row and write the new updatedvalue, So I need now to take the tuple variable, as this variable contains a strings of column called site code, finally I need now to update the excel xlsm first sheet depending on values rows in this tuple

Example

I have a tuple value something like this LCAIN12345 LCAIN54632 LCAIN78965 blablabla

Now this values already existing in the xlsm sheet in column called SiteCode in sheet called Gov_Tracker I want to update some columns depending rows on this values, like I want to update in columns called Pending, Pending Status and blablabla

Now I want to modify this columns rows values like change the old value in Pending, Pending Status, blablabla equals to the new values TE, Waiting TE Acceptance` blablabla

I hope this info is clear enough

  • gov_tracker_sheet.. ? – iamklaus Jul 08 '19 at 10:11
  • @iamklaus actually `gov_tracker_sheet` is just my master sheet in the `xlsm` just please check my post edits –  Jul 08 '19 at 10:20
  • You should split this problem into two different questions. Your first problem, is still unclear to me: How does your current data look and how do you want it to look? Your second problem is different, you want to create (or update, but I suggest create) a xlsm with the xslxwriter. – Frieder Jul 08 '19 at 10:20
  • Your question isn't clear: "but It just replaces all rows in the xlsm file which contains TE I need to update the needed rows only" - what are the needed rows? Can you post several rows, and explain which are not "needed" and should be left as-as is, and which are "needed" and what they should look like after `replace`? – Jack Fleeting Jul 08 '19 at 10:20
  • @Jack Fleeting @Frieder actually the needed rows is in the `tuple` value I put my need `rows` in the `tuple` value and my second question is just I want to replace the old master which is the `gov_tracker_sheet ` new sheet in the `xlsm` file with the new updated `Dataframe` but without losing macros in the xlsm file because I run macro after updating the `xlsm` file –  Jul 08 '19 at 10:22
  • could you please some data for gov_tracker_sheet ? – iamklaus Jul 08 '19 at 10:26
  • @iamklaus check my edit please –  Jul 08 '19 at 10:31
  • PD can only *read* Excel, but not write it. XlsxWriter can only *write* xlsx, but not *update* files. So if you want to edit your file, you need to switch to another library that supports updating existing xlsm files. – Frieder Jul 08 '19 at 10:42

1 Answers1

0

Easiest solution for creating a xlsm with XlsxWriter is to manually change the file ending, and then include the macros in an extra step (see here):

import pandas as pd

df = pd.DataFrame({'First' : [5, 2, 0, 10, 4], 
                   'Second' : [9, 8, 21, 3, 8]})

writer = pd.ExcelWriter('test.xlsx', engine='xlsxwriter')

df.to_excel(writer, sheet_name='Sheet1')

workbook  = writer.book
workbook.filename = 'test.xlsm'
workbook.add_vba_project('./vbaProject.bin')

writer.save()

See this answer. You can extract the macros (vbaProject.bin) with the vba_extract.py utility which is shipped by the XslxWriter package.

$ vba_extract.py macro_file.xlsm
Extracted: vbaProject.bin
Frieder
  • 1,208
  • 16
  • 25
  • your code means that I should create a new `excel` from scratch I need just modify please check my edit I posted some picture –  Jul 08 '19 at 10:32
  • 1
    This is not possible with xlsxwriter. See https://xlsxwriter.readthedocs.io/introduction.html `It cannot read or modify existing Excel XLSX files.` You could try https://openpyxl.readthedocs.io/en/stable/index.html#module-openpyxl, but I have no expierence with it and I don't know if they support macros. – Frieder Jul 08 '19 at 10:36
  • So I actually here I use `replace` function as to replace the needed values want to change in the `dataframe` so now the dataframe could be updated.... as I have more than 45 columns and more than 1000 rows so it's would be take a lot of time as to create these rows manually..... you got what I mean? –  Jul 08 '19 at 10:41
  • please could you check my edit I just posted pictures of templates excel files I am using to be clear enough, I really appreciate for your help....Thanks a lot –  Jul 08 '19 at 10:43
  • 1
    Your dataframe is a pandas object, but has nothing to do with your original excel file. The dataframe just holds all the data which is in the excel file, but changing data in the dataframe object does _not_ change your excel file. – Frieder Jul 08 '19 at 10:44
  • Yeah I got you now, so finally of the result I should put my all data rows and columns manually in the python code then I use writer to update right? –  Jul 08 '19 at 10:46
  • 1
    No! I just said, if you want to _modify_ an existing excel file, you can _NOT_ use xlxwriter, because this library can only create new files. I don't know which library supports this, but you can try openpyxl. – Frieder Jul 08 '19 at 10:50
  • so there's any materials here you could give me how can I use `openpyxl` to solve my case? I just want to modify only not to create a new excel –  Jul 08 '19 at 10:52
  • 1
    Yes there is, [original documentation](https://openpyxl.readthedocs.io/en/stable/tutorial.html) and [some question](https://stackoverflow.com/questions/17675780/how-to-save-xlsm-file-with-macro-using-openpyxl) on SO on modifying a xlsm file. – Frieder Jul 08 '19 at 10:56
  • really thanks a lot you helped me a lot...... any way I tried these sequence I posted with sql database it works fine but that's will take a lot of time by using `database` `sqlAlchemy` –  Jul 08 '19 at 11:09