16

It's possible with xlsxwriter to save variables to existing excel files and read them after, though the problem is that the variables are stored as strings in my excel file.

Let's say I have a list of many different variables with various types (pd.datetimerange, pd.df, np.arrays, etc.), if I save them to the excel file the variable type would be lost.

Also the Python script is called from my Excel file so I can't change anything in it without writing a VBA script. Which would temporarily close my workbook, dump the chars (with say pickle strings) and reopen it.

Is it possible to retrieve the types from the excel file without writing a parsing function first (which would take the excel string and yield me with the equivalent variable type)?

Bython
  • 1,135
  • 1
  • 12
  • 21
  • 2
    Saving as a `str` and using `eval(str)` when re-opening the file may be a solution, i.e.: saving: `x = '[123,456]'` opening: `eval(x)` – Pedro Lobito Dec 07 '18 at 05:31
  • 1
    Exactly that's the kind of function type I am looking for, but eval works only for build-in types. Let's say `x = "[123,DatetimeIndex(['2018-12-04','2018-12-05', '2018-12-06'],dtype='datetime64[ns]', freq='D')]"`. Then `eval(x)` would yield an error name DatetimeIndex not defined (as it's a pandas variable type)... – Bython Dec 07 '18 at 06:09
  • What have you tried so far? – Pedro Lobito Dec 07 '18 at 06:11
  • I am still thinking, but my approach would be to read the string, write a custom parsing function which filters for 50 different string types and yields me the desired variable as a return, but that would be quite tedious... Easiest approach would be to use pickle, but I have the restriction to do everything in one excel file – Bython Dec 07 '18 at 06:15
  • 2
    You could try dumping the bytecode of the list using `pickle` and storing it as a string in a cell. Painful just to think about it. – hilberts_drinking_problem Dec 07 '18 at 06:23
  • 1
    Also does not look like you could store more than 8192 bytes in one cell, though I may be wrong on that. – hilberts_drinking_problem Dec 07 '18 at 06:26
  • 1
    @trailing_whitespace tried doing it this way... Yep, excel cell stores only 32k chars ... also the python script is called from the excel file so writing to excel would be even more bothersome i.e. adding overhead VBA code. Do you have any other ideas? – Bython Dec 08 '18 at 18:39
  • Well, it's hard to say without knowing your full setup. I recall that the package xlwings had an easy way to call python from vba. As for the space limitation, sounds like your only option is to split the string into parts, possibly using some lossless compression. – hilberts_drinking_problem Dec 08 '18 at 20:56
  • can you dump pickle outside of the excel file so you can reload them at the python level? i,e. you build excel with the variable in it and you save the pickles at the same time, then if you open python you could reload everything – Steven G Dec 09 '18 at 20:58
  • @StevenG Sadly no, or else it would be much easier. But as it seems that's the only solution... – Bython Dec 10 '18 at 05:27
  • is there scope to have an extra column to use to indicate the type? you could have a wrapper class that you register all your types with that allocates them a number that when written into excel writes that number into the next column? then when reading back passes the string to the constructor of that type and returns the created object? – James Kent Dec 10 '18 at 13:40
  • @Bython do you require this in excel format so that it can be inspected/modified or just as a storage format? if you only need it for inspection purposes you could pickle the objects and add the pickled file to the excel file as xlsx is basically a zipfile with specific contents. then you could have the display version in the worksheet and the pickle version in the same file – James Kent Dec 11 '18 at 09:14
  • @JamesKent I knew .xlsx was a zip file because if you rename the file extension you can unzip it. But how do you save into the file whilst Excel has it open? Surely it would be locked. – S Meaden Dec 12 '18 at 09:29
  • @SMeaden my apologies I didn't think of that, you are correct that excel locks the file so that wouldn't work. – James Kent Dec 12 '18 at 09:46
  • @JamesKent no worries, I suspect it is possible via some arcane and ancient API. I was hoping you knew how. – S Meaden Dec 12 '18 at 10:28

1 Answers1

5

As per your comment, you can get eval to correctly process the symbols that are local to some module by passing the appropriate dict of locals into eval, along with your string. Here's a workable solution:

import pandas as pd

def getlocals(obj, lcls=None):
    if lcls is None: lcls = dict(locals().items())

    objlcls = {k:v for k,v in obj.__dict__.items() if not k.startswith('_')}
    lcls.update(objlcls)

    return lcls

x = "[123,DatetimeIndex(['2018-12-04','2018-12-05', '2018-12-06'],dtype='datetime64[ns]', freq='D')]"
lcls = getlocals(pd)

result = eval(x, globals(), lcls)
print(result)

Output:

[123, DatetimeIndex(['2018-12-04', '2018-12-05', '2018-12-06'], dtype='datetime64[ns]', freq='D')]

As a Responsible Person, it is also my duty to warn you that using eval for your application is ridiculously unsafe. There are many discussions of the dangers of eval, and none of them suggest there's a way to completely mitigate those dangers. Be careful if you choose to use this code.

tel
  • 13,005
  • 2
  • 44
  • 62