Is it possible to do dataframe to xml in python without iteration?

Question

How can I convert a DataFrame to XML in Python without explicit iteration (no for/while loop)?

Input Dataframe:

    A   B   C   D
    aa  ab  ac  ad
    aaa abb acc add

Output in XML:

    <A>aa</A>
    <B>ab</B>
    <C>ac</C>
    <D>ad</D>
    <A>aaa</A>
    <B>abb</B>
    <C>acc</C>
    <D>add</D>

Possible duplicate, https://stackoverflow.com/a/18576067/4985099 — sushanth, Jun 26 '20 at 13:40
Please understand my question. I am asking about without for/while loop iteration — Rajeshkanna Purushothaman, Jun 26 '20 at 13:43
Does this answer your question? [How do convert a pandas dataframe to XML?](https://stackoverflow.com/questions/18574108/how-do-convert-a-pandas-dataframe-to-xml) — iacob, Mar 25 '21 at 09:29

score 2 · Answer 1 · answered Jun 26 '20 at 15:30

Given a DataFrame x:

>>> import pandas as pd
>>> x = pd.DataFrame([['aa','ab','ac','ad'],['aaa','abb','acc','add']],columns=['A','B','C','D'])
>>> x
     A    B    C    D
0   aa   ab   ac   ad
1  aaa  abb  acc  add

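You can use the function below. It contains no explicit for/while loop, but that does not guarantee loop-free execution: the pandas and NumPy functions it calls may loop internally, and `apply` with a Python lambda calls the lambda once per column of the frame it is applied to.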

>>> def to_xml(df):
...     # repeat the column names once per row: A, B, C, D, A, B, C, D, ...
...     cols = df.columns.tolist() * len(df.index)
...     # flatten the values row by row into a single 1-D vector
...     values = df.to_numpy().reshape(-1)
...     # build a 2-row frame (row 0: values, row 1: tags) and format each
...     # column as <tag>value</tag>; values are not XML-escaped
...     listlike = pd.DataFrame([values, cols]).apply(lambda x: '<{0}>{1}</{0}>'.format(x[1], x[0])).tolist()
...     # join the formatted elements with newline characters
...     return '\n'.join(listlike)

Output:

>>> print(to_xml(x))
<A>aa</A>
<B>ab</B>
<C>ac</C>
<D>ad</D>
<A>aaa</A>
<B>abb</B>
<C>acc</C>
<D>add</D>
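
To make the caveat about internal loops concrete, here is a small sketch that records how often the formatting function is invoked by `apply`. The names `fmt` and `calls` are made up for this illustration, and the exact call count may vary between pandas versions; the point is only that the function runs at least once per cell.

```python
import pandas as pd

calls = []  # hypothetical bookkeeping list: one entry per invocation of fmt


def fmt(col):
    # col is one column of the 2-row helper frame:
    # col[0] holds the cell value, col[1] holds the tag name
    calls.append(col[1])
    return '<{0}>{1}</{0}>'.format(col[1], col[0])


values = ['aa', 'ab', 'ac', 'ad', 'aaa', 'abb', 'acc', 'add']
tags = ['A', 'B', 'C', 'D'] * 2

# same construction as in to_xml: row 0 = values, row 1 = tags
result = pd.DataFrame([values, tags]).apply(fmt).tolist()

# the formatter ran at least once for each of the 8 cells
print(len(calls))
```

So the loop is hidden inside `apply` rather than removed.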

Hi @Jan Musil, your program does produce the expected output. However, `cols = df.columns.tolist()*len(df.index)` duplicates the column headers for the full dataset. That will increase memory use and make the program slow for huge data. If this cannot be avoided, I am fine with that — Rajeshkanna Purushothaman, Jun 27 '20 at 11:09
I believe it should still be very fast and should not noticeably increase memory usage. — Jan Musil, Jun 27 '20 at 22:09
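
On the concern about repeating the column headers: one possible variant, sketched here and not taken from the answer, lets NumPy broadcasting pair each column name with its values so that no repeated list of headers is built by hand. The function name `to_xml_broadcast` is hypothetical. It assumes every value can be converted to a string, it does no XML escaping, and whether it is actually faster or lighter on memory would need to be measured on real data.

```python
import numpy as np
import pandas as pd


def to_xml_broadcast(df):
    # column names as a 1-D string array, shape (ncols,)
    cols = np.asarray(df.columns, dtype=str)
    # cell values as a 2-D string array, shape (nrows, ncols)
    vals = df.to_numpy().astype(str)

    # build the tags once per column, not once per cell
    open_tags = np.char.add(np.char.add('<', cols), '>')
    close_tags = np.char.add(np.char.add('</', cols), '>')

    # broadcasting (ncols,) against (nrows, ncols) lines each tag up with its column
    cells = np.char.add(np.char.add(open_tags, vals), close_tags)

    # ravel() flattens row by row, giving A, B, C, D, A, B, C, D, ...
    return '\n'.join(cells.ravel())


x = pd.DataFrame([['aa', 'ab', 'ac', 'ad'], ['aaa', 'abb', 'acc', 'add']],
                 columns=['A', 'B', 'C', 'D'])
result = to_xml_broadcast(x)
print(result)
```

The string operations and the final `join` still iterate internally, so this only removes the explicit repetition of headers, not iteration as such.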
