1

I am trying to build a two-dimensional (2-D) data structure in Python from an imported Matlab structure.

When I use pandas.DataFrame, each cell contains a matrix, but the matrices are displayed as nested lists. I would like them to be displayed as matrices instead.

The following code produces a DataFrame that looks similar. (It is not identical: the real data is imported from Matlab and has a different type, which I could not recreate in Python.)

import pandas as pd
from IPython.display import display  # Jupyter provides display() implicitly

k = [[0, 1, 2, 3, 4, 5, 6]]
# Use object dtype so that each cell can hold a nested list.
df = pd.DataFrame(k).astype(object)
df.at[0, 0] = [[1]]
df.at[0, 1] = [[1.0, 2.0], [2.0, 4.0], [8.0, 3.0], [9.0, 7.0]]
df.at[0, 2] = [[0.487], [1.532], [1.544], [1.846]]
df.at[0, 3] = [[3.0]]
df.at[0, 4] = [[3.0]]
df.at[0, 5] = [[-1]]
df.at[0, 6] = [[]]
display(df)

Which results in:

Result_of_the_code

(You can get a similar result by running the following snippet.)

<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>0</th>
      <th>1</th>
      <th>2</th>
      <th>3</th>
      <th>4</th>
      <th>5</th>
      <th>6</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>[[1]]</td>
      <td>[[1.0, 2.0], [2.0, 4.0], [8.0, 3.0], [9.0, 7.0]]</td>
      <td>[[0.487], [1.532], [1.544], [1.846]]</td>
      <td>[[3.0]]</td>
      <td>[[3.0]]</td>
      <td>[[-1]]</td>
      <td>[[]]</td>
    </tr>
  </tbody>
</table>

As you can see, each cell is displayed as a list, for example:

Displayed_matrix_as_list_form

(You can get a similar result by running the following snippet.)

<body>
    [[1.0, 2.0], [2.0, 4.0], [8.0, 3.0], [9.0, 7.0]]
</body>

I am trying to change it to something like:

Intended_result

(You can get a similar result by running the following snippet.)

.matrix {
    position: relative;
}
.matrix:before, .matrix:after {
    content: "";
    position: absolute;
    top: 0;
    border: 1px solid #000;
    width: 6px;
    height: 100%;
}
.matrix:before {
    left: -10px;
    border-right: 0;
}
.matrix:after {
    right: -10px;
    border-left: 0;
}
<div align=center>
  <table class="matrix">
    <tr>
      <td>1</td>
      <td>2</td>
    </tr>
    <tr>
      <td>2</td>
      <td>4</td>
    </tr>
    <tr>
      <td>8</td>
      <td>3</td>
    </tr>
    <tr>
      <td>9</td>
      <td>7</td>
    </tr>
  </table>
</div>

Thank you.

2 Answers

1

The default pandas output cannot render cells this way. However, you can use the pandas Styler (available as DataFrame.style) to convert each matrix to an HTML table, store those HTML strings in an outer DataFrame, and then render that DataFrame with the CSS styles you provided:

import pandas as pd

data = [
    [[1]],
    [[1.0, 2.0], [2.0, 4.0], [8.0, 3.0], [9.0, 7.0]],
    [[0.487], [1.532], [1.544], [1.846]],
    [[3.0]],
    [[3.0]],
    [[-1]],
]

# Render each matrix as a bare HTML table (no index, no column headers).
# Styler.hide(axis=...) needs pandas >= 1.4; older versions used
# hide_index() and hide_columns() instead.
df = pd.DataFrame([
    [(pd.DataFrame(x)
        .style
        .hide(axis="index")
        .hide(axis="columns")
        .set_table_attributes('class="matrix"')
        .to_html()
     ) for x in data]
], dtype="object")

# Style the outer table so that each nested .matrix table is drawn with brackets.
df.style.set_table_styles([
    {"selector": ".matrix", "props": "position: relative;"},
    {"selector": ".matrix:before, .matrix:after",
     "props": 'content: ""; position: absolute; top: 0; border: 1px solid #000; width: 6px; height: 100%;'
    },
    {"selector": ".matrix:before", "props": "left: -0px; border-right: 0;"},
    {"selector": ".matrix:after", "props": "right: -0px; border-left: 0;"}
])
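
The inner Styler call above is only used to turn each matrix into an HTML table string. As a rough sketch, the same step can be done with plain string formatting, which also copes with cells of different shapes. Here matrix_to_html is a hypothetical helper name (it is not part of pandas), and each cell is assumed to be a nested list with one inner list per row.

```python
from html import escape


def matrix_to_html(rows, css_class="matrix"):
    """Build an HTML table string from a nested list (hypothetical helper)."""
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(v))}</td>" for v in row) + "</tr>"
        for row in rows
    )
    return f'<table class="{css_class}">{body}</table>'


# Cells may have different shapes; each one becomes its own small table.
cells = [[[1]], [[1.0, 2.0], [2.0, 4.0]], [[]]]
html_cells = [matrix_to_html(c) for c in cells]
```

The resulting strings could then be placed in the outer DataFrame instead of the inner Styler output, so the same .matrix CSS rules should still apply.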


Attack68
  • Hello @Attack68, thank you so much for your solution. This is indeed a very nice and helpful piece of code. However, there is one problem: it works perfectly when the input is a 3-D matrix, but my input is not a list. It is a Matlab structure imported using scipy, and Matlab structures can have different shapes in different rows. For example, the cell in the second row and first column of this example could be a 2x2 matrix. I have written some very messy and inefficient code, which does not work perfectly, and will share it below. – Mohammad Badri Ahmadi Aug 23 '21 at 17:12
  • @MohammadBadriAhmadi, you are asking for a highly specific output from a highly specific input. I have provided the skeleton for the output, and I doubt you will be able to do this any other way. Therefore, I suggest you work towards standardising your input into the necessary format. – Attack68 Aug 23 '21 at 18:16
  • @Attack68, that is very true. I seriously loved your code's output style. Unfortunately, I mainly work with datasets that are stored in Matlab structures. Thank you so much for your help anyway. I learned a lot from your code. – Mohammad Badri Ahmadi Aug 23 '21 at 19:34
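
One way to approach the suggested standardisation is to convert every cell to a nested list before building the HTML. The following is a minimal sketch, assuming each imported cell is a plain numeric NumPy array or a scalar (nested struct fields would need unpacking first). The name to_nested_list is hypothetical, and the function relies only on the .tolist() method that NumPy arrays provide.

```python
def to_nested_list(cell):
    """Return `cell` as a list of rows (hypothetical helper)."""
    # NumPy arrays and NumPy scalars expose .tolist(); plain Python values do not.
    value = cell.tolist() if hasattr(cell, "tolist") else cell
    if not isinstance(value, (list, tuple)):
        return [[value]]              # scalar -> 1x1 matrix
    if not value or not isinstance(value[0], (list, tuple)):
        return [list(value)]          # empty or 1-D -> a single row
    return [list(row) for row in value]


print(to_nested_list(3.0))
print(to_nested_list([1, 2]))
print(to_nested_list([[1.0, 2.0], [2.0, 4.0]]))
```

After this step every cell has the same nested-list layout, regardless of its original shape.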
0

@Attack68, here is the code I mentioned in reply to your beautiful answer. Keep in mind that, as I mentioned, the real data is imported from a Matlab structure. This means the code does not work with the data provided in the question itself, but it works fine with Matlab structures imported into Python using scipy.io. I wrote this code with the help of @Valdi_Bo's answer on link and @Paul Panzer's answer on link.

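import numpy as np
import pandas as pd
from IPython.display import display

# `data` is the Matlab structure imported with scipy.io (loading code not shown).
df = pd.DataFrame(data)
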
def pretty_col(data):
    # Format one column as text and swap ASCII brackets for tall bracket pieces.
    data = np.array(data)
    if data.size <= 1:
        return format(data)
    return (
        format(data[:, None])[1:-1]
        .replace('[', '\u23A1', 1)
        .replace(' [', '\u23A2', data.size - 2)
        .replace(' [', '\u23A3')
        .replace(']', '\u23A4', 1)
        .replace(']', '\u23A5', data.size - 2)
        .replace(']', '\u23A6')
    )
def pretty_cols(data, comma=False):
    if comma:
        f='\n'.join(line[0] + line + line[-1] for line in map(str.join, data.shape[0] // 2 * ('  ',) + (', ',) + (data.shape[0] - 1) // 2 * ('  ',), zip(*map(str.split, map(pretty_col, data.T), data.shape[1]*('\n',)))))
    else:
        f='\n'.join(line[0] + line + line[-1] for line in map(''.join, zip(*map(str.split, map(pretty_col, data.T), data.shape[1]*('\n',)))))
    return f

def myFmt(txt):
    # Turn a cell's text into HTML, with newlines rendered as <br>.
    if txt=="":
        return "[]"
    else:
        q=r'<font>bananas\n</font>'
        q=q.replace("bananas", repr(txt))
        q=q.replace("'", '')
        return q.replace(r'\n', '<br>')
def ttest(x):
    # Round every entry of a nested list to two decimals, in place.
    for i,k in enumerate(x):
        for j,l in enumerate(k):
            x[i][j]=float(format(l, '.2f'))
    return x

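def transform(tdf,prec):
    # Replace every cell with its bracketed text form, rounded to `prec` decimals.
    # Assigning whole columns avoids chained assignment (tdf[col][j] = ...).
    for col in tdf.columns:
        tdf[col] = tdf[col].apply(
            lambda cell: fixing_newline(pretty_cols(cell), prec)
        )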


print(df[df.columns[0]])
def fixing_newline(string,prec):
    string=string.replace("⎡⎡", ' aa ').replace("⎤⎤", ' bb ').replace("\n", ' cc ').replace("⎢⎢", ' dd ').replace("⎥⎥", ' ee ').replace("⎣⎣", ' ff ').replace("⎦⎦", ' gg ').replace("[[", ' hh ').replace("]]", ' kk ')
    chunks = string.split(' ')
    string=""
    for k in chunks:
        try:
            # Numeric chunks are re-rendered with `prec` decimal places.
            string += f"{float(k):.{prec}f}"
        except ValueError:
            string += k
    string=string.replace("aa", "⎡⎡").replace("bb", '⎤⎤').replace("cc", '\n').replace("dd", '⎢⎢').replace("ee", '⎥⎥').replace("ff", '⎣⎣').replace("gg", '⎦⎦').replace("hh", '[[').replace("kk", ']]')
    return string


transform(df,3)
df=df.style.format(myFmt)

display(df)

This results in something like the following: Results (I need help with inlining images, since I do not have enough reputation.)

However, the code is inefficient and does not always work correctly.
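
For comparison, the bracket-drawing part can be written without chained string replacements. This is only a sketch, assuming each cell has already been converted to a rectangular nested list of numbers. The name bracket_matrix is hypothetical, and the bracket glyphs are the same U+23A1 to U+23A6 characters used above.

```python
def bracket_matrix(rows, prec=3):
    """Render a nested list as text framed by tall Unicode brackets (sketch)."""
    if not rows or not rows[0]:
        return "[]"
    # Format every entry with a fixed number of decimals.
    cells = [[f"{float(v):.{prec}f}" for v in row] for row in rows]
    # Right-align each column to the width of its widest entry.
    widths = [max(len(r[j]) for r in cells) for j in range(len(cells[0]))]
    lines = [" ".join(c.rjust(w) for c, w in zip(r, widths)) for r in cells]
    if len(lines) == 1:
        return "[" + lines[0] + "]"
    n = len(lines)
    left = ["\u23A1"] + ["\u23A2"] * (n - 2) + ["\u23A3"]
    right = ["\u23A4"] + ["\u23A5"] * (n - 2) + ["\u23A6"]
    return "\n".join(lft + body + rgt for lft, body, rgt in zip(left, lines, right))


print(bracket_matrix([[1.0, 2.0], [2.0, 4.0], [8.0, 3.0], [9.0, 7.0]], prec=1))
```

To show the result in a styled DataFrame, the newlines would still need to be turned into <br> tags, as myFmt does above.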