Is it possible to convert an HTML table into a pipe table with pandoc (or with any other tool)?

I tried pandoc bla.html --to markdown+pipe_tables and pandoc bla.html --to markdown+pipe_tables-simple_tables but both seem to produce simple tables.

bla.html contains:

<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>age</th>
      <th>workclass</th>
      <th>education</th>
      <th>gender</th>
      <th>hours-per-week</th>
      <th>occupation</th>
      <th>income</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>39</td>
      <td>State-gov</td>
      <td>Bachelors</td>
      <td>Male</td>
      <td>40</td>
      <td>Adm-clerical</td>
      <td>&lt;=50K</td>
    </tr>
  </tbody>
</table>

If I use -t markdown_github, as suggested here, the output is HTML again.

Andreas Mueller

1 Answer

I realized that -t markdown_github does produce the right result once I enter something into the first <th> cell. The empty header cell seems to trip up pandoc.
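
For the "any other tool" part of the question, here is a minimal sketch in plain Python that sidesteps pandoc entirely. It is not what pandoc does internally. The names TableToPipe and html_table_to_pipe are made up for this example, and it assumes a single flat table (no colspan/rowspan, no nested tables) whose first row is the header. An empty header cell, like the first <th> in bla.html, simply becomes an empty pipe-table cell.

```python
from html.parser import HTMLParser


class TableToPipe(HTMLParser):
    """Collect the text of every <th>/<td> cell, row by row (hypothetical helper)."""

    def __init__(self):
        super().__init__()  # convert_charrefs=True, so &lt; arrives as "<"
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("th", "td") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (indentation between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def html_table_to_pipe(html):
    """Return a pipe table; assumes the first <tr> is the header row."""
    parser = TableToPipe()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        raise ValueError("no table rows found")
    header, *body = parser.rows

    def line(cells):
        # Escape literal pipes so they do not split a cell.
        return "| " + " | ".join(c.replace("|", "\\|") for c in cells) + " |"

    lines = [line(header), line(["---"] * len(header))]
    lines += [line(row) for row in body]
    return "\n".join(lines)
```

Usage would be something like print(html_table_to_pipe(open("bla.html").read())). Alignment and column widths are not handled. If staying with pandoc is preferable, the workaround above (putting some text into the empty <th>) is the simpler route.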

Andreas Mueller