Thanks in advance for any assistance y'all can offer.
I'm attempting to create a Pandas data frame from a .dat file (DBISAM table) generated by the Retail Edge POS software. My problem is similar to this question, and with the code from it I was able to get a result, whereas my other attempts to load the data failed entirely.
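import pandas as pd

# fname is the path to the .dat file
with open(fname, "rb") as f:  # binary mode
    data = pd.DataFrame(
        # decode each whitespace-separated token; a lone 0xA0 byte becomes None
        [e.decode("latin-1") if e != b'\xa0' else None for e in l.strip().split()]
        for l in f
    )
print(data.shape)
print(data.ndim)
print(data.head())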
The results:
DF shape: (31626, 115)
DF dimensions: 2
Sample returned data: 0 É9☺ ♠¾Y#dË@=qÒã¼dÐ☺
According to the Database System Utility I use to query store data, this table should have 27 columns and 39,310 rows as of my latest check, so the shape above does not match.
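One possible reason for the mismatch, sketched here under assumptions: iterating over a file opened in binary mode splits on every 0x0A byte, and `split()` splits on any whitespace byte, so a binary table is cut at arbitrary points. The 8-byte record layout below is hypothetical and is not the real DBISAM format; it only shows that the number of "lines" and "fields" need not match the number of records.

```python
import io

# Hypothetical fixed-length binary records (8 bytes each).
# This is NOT the real DBISAM layout, just bytes that happen to
# contain newline (0x0A), space (0x20) and tab (0x09) values.
records = [bytes([i, 0x0A, 0x20, 0x41, 0x09, 0x42, 0x0A, 0x43]) for i in range(1, 4)]
blob = b"".join(records)

# Iterating a binary stream yields chunks ending at each 0x0A byte,
# which is how the loop in the question builds its rows.
lines = list(io.BytesIO(blob))

# Tokenize each chunk the same way as the question's code does.
fields_per_line = [len(line.strip().split()) for line in lines]

print("records:", len(records))
print("lines:", len(lines))
print("fields per line:", fields_per_line)
```

If the real file behaves like this, the (31626, 115) shape would reflect where newline and whitespace bytes happen to fall rather than the table's actual rows and columns.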
I used chardet to try to determine the correct encoding, and it identifies the file as Windows-1254. When I swap that in for Latin-1, I get an error instead of a result: 'charmap' codec can't decode byte 0x8e in position 11: character maps to <undefined>
Similarly, when I swap in UTF-8 encoding: 'utf-8' codec can't decode byte 0xc9 in position 0: invalid continuation byte
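For reference, the three outcomes above can be reproduced on a handful of bytes. This is a minimal sketch: the `sample` bytes are made up for illustration (they are not taken from the real file) and `try_decode` is a hypothetical helper. Latin-1 assigns a character to every byte value from 0 to 255, so it never raises, which means a successful Latin-1 decode says nothing about whether the content is actually text.

```python
# Made-up sample bytes, chosen to include 0xC9 followed by an ASCII byte
# and the value 0x8E; they are not copied from the real .dat file.
sample = bytes([0xC9, 0x39, 0x01, 0x8E])


def try_decode(raw, encoding):
    """Return 'ok' if raw decodes with the given codec, else the error reason."""
    try:
        raw.decode(encoding)
        return "ok"
    except UnicodeDecodeError as exc:
        return exc.reason


# Compare the three encodings tried in the question.
results = {enc: try_decode(sample, enc) for enc in ("latin-1", "cp1254", "utf-8")}
for enc, outcome in results.items():
    print(f"{enc}: {outcome}")
```

Only Latin-1 decodes without an error here, which is consistent with the file being binary data rather than text in any particular encoding.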
I have worked comfortably with Pandas on CSV and txt files, but I feel way out of my depth here. I've also tried using the StringIO and BytesIO methods, but haven't managed to retrieve the data in a meaningful form. This is my first step toward visualizing inventory and sales data for a farmer-owned grocery store, so I'm not bringing professional IT/coding abilities to the table. I'm grateful for any suggestions.