When I read this file with pandas, only 15 of the 44 rows are read, with no apparent pattern to which rows are kept. Cell E31 is read as NaN, and I'm unable to find out why.
How to reproduce:
% pip show pandas
Name: pandas
Version: 1.5.1
(...)
% python3
Python 3.10.7 (main, Sep 15 2022, 01:51:29) [Clang 14.0.0 (clang-1400.0.29.102)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas
>>> test = pandas.read_excel("Downloads/invalid_file.xlsx")
>>> test
name super object ... comment_rm gui_element gui_attributes
0 hasSignature hasValue TextValue ... NaN SimpleText NaN
1 hasCreator hasValue TextValue ... NaN SimpleText NaN
2 hasProvenience hasValue TextValue ... NaN SimpleText NaN
3 hasDate hasValue TextValue ... NaN SimpleText NaN
4 hasLanguage hasValue TextValue ... NaN SimpleText NaN
5 hasCopyright hasValue TextValue ... NaN SimpleText NaN
6 hasComment hasValue TextValue ... NaN Textarea NaN
7 hasDescription hasValue TextValue ... NaN Richtext NaN
8 hasLatinName hasValue, dcterms:title TextValue ... NaN SimpleText NaN
9 hasSynonym hasValue Synonyme ... NaN SimpleText NaN
10 hasPagenum seqnum IntValue ... NaN SimpleText NaN
11 partOf isPartOf :Manuscript ... NaN Searchbox NaN
12 hasTitle hasValue TextValue ... NaN SimpleText NaN
13 hasFolio hasValue TextValue ... NaN SimpleText NaN
14 hasPlant hasLinkTo :Plant ... NaN Searchbox NaN
[15 rows x 15 columns]
>>> test.iloc[9,4]
nan
I carefully checked the docs to see whether I need to set a particular parameter, but I found nothing.
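One way to narrow this down, offered as a rough sketch and not a confirmed diagnosis: an .xlsx file is a zip archive, so the rows stored in the sheet XML can be counted directly with the standard library, independently of pandas. The helper name `summarize_sheet_rows` is hypothetical, and the part name `xl/worksheets/sheet1.xml` is an assumption (the first sheet is usually stored there, but a given workbook may use a different name).

```python
import zipfile
import xml.etree.ElementTree as ET

# SpreadsheetML main namespace used by .xlsx worksheet parts.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def summarize_sheet_rows(xlsx_path, sheet_part="xl/worksheets/sheet1.xml"):
    """Return (row_numbers, hidden_row_numbers) found in the raw sheet XML.

    Hypothetical helper. `sheet_part` is an assumption: adjust it if the
    workbook stores its first sheet under a different part name.
    """
    with zipfile.ZipFile(xlsx_path) as zf:
        root = ET.fromstring(zf.read(sheet_part))

    rows, hidden = [], []
    for row in root.iterfind("m:sheetData/m:row", NS):
        # The "r" attribute holds the 1-based row number; it is optional,
        # so fall back to 0 when it is missing.
        number = int(row.get("r", "0"))
        rows.append(number)
        if row.get("hidden") in ("1", "true"):
            hidden.append(number)
    return rows, hidden
```

Calling `summarize_sheet_rows("Downloads/invalid_file.xlsx")` and comparing the length of the first list with the 15 rows pandas returns would show whether the rows are missing from the file's XML itself or are dropped later during parsing. The second list shows whether any rows are flagged as hidden, which may or may not be related.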