
I have a text file that I got by converting a PDF file with tables. The table structure is shown below (run the snippet to see it rendered):

<table>
  <tr>
    <th>name</th>
    <th>monday</th>
    <th>tuesday</th>
    <th>wednesday</th>
    <th>thursday</th>
    <th>friday</th>
  </tr>
  <tr>
    <td>XXX</td>
    <td>14:30</td>
    <td>14:30</td>
    <td>     </td>
    <td>14:30</td>
    <td>14:30</td>
  </tr>
</table>

As you can see, Wednesday is empty, and I need to insert the times into a database. I have no idea how to make the empty field count as empty.
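
One possible way to keep an empty cell in its column is to slice each row at the character offsets where the header words start, rather than splitting on whitespace. This is only a sketch: it assumes the converted text keeps every column aligned under its header and padded with spaces, which the question does not confirm. The sample data, the function name `slice_by_header`, and the `EMPTY` placeholder are all hypothetical.

```shell
#!/bin/sh
# Hypothetical sample: columns aligned under the header, padded with spaces.
# The real converted file may be laid out differently.
fmt='%-7s%-8s%-9s%-11s%-10s%s\n'
sample=$(
  printf "$fmt" name monday tuesday wednesday thursday friday
  printf "$fmt" XXX 14:30 14:30 '' 14:30 14:30
)

# Slice every row at the offsets where the header words start.
# Prints the cells joined with "|"; a blank cell becomes "EMPTY".
slice_by_header() {
  awk '
    NR == 1 {
      # Record the start offset of each header word.
      n = 0; rest = $0; pos = 0
      while (match(rest, /[^ ]+/)) {
        start[++n] = pos + RSTART
        pos += RSTART + RLENGTH - 1
        rest = substr(rest, RSTART + RLENGTH)
      }
      next
    }
    {
      out = ""
      for (i = 1; i <= n; i++) {
        # A column runs up to the next column start; the last one runs to end of line.
        if (i < n) cell = substr($0, start[i], start[i + 1] - start[i])
        else       cell = substr($0, start[i])
        gsub(/^ +| +$/, "", cell)        # trim the padding
        if (cell == "") cell = "EMPTY"
        out = out (i > 1 ? "|" : "") cell
      }
      print out
    }
  '
}

printf '%s\n' "$sample" | slice_by_header
# prints: XXX|14:30|14:30|EMPTY|14:30|14:30
```

If the columns in the real file do not line up consistently, this approach will not work as written, and a reliable delimiter in the conversion output would be needed instead.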

  • How do you know it's Wednesday and not another day? (And where are Sunday and Friday?) – Dennis Williamson Aug 12 '19 at 21:08
  • It's an example of how it can look; it differs for every name. It's a schedule for a school. – Alfred tärning Aug 12 '19 at 21:15
  • Now you've taken out the example data and say "run code" but there's no code. – Dennis Williamson Aug 12 '19 at 21:17
  • Haha, sorry, I'm new at this. It's back now. – Alfred tärning Aug 12 '19 at 21:20
  • Right now I use cut to grab the times, but it doesn't feel right. The times and names in the file do not line up all the time. – Alfred tärning Aug 12 '19 at 21:24
  • Ah! You need to use a tool that is intended for parsing (X)HTML! [Don't use the wrong tools.](https://stackoverflow.com/a/1732454/26428). For example, use [xpath](https://manpages.ubuntu.com/manpages/disco/en/man1/xpath.1p.html). – Dennis Williamson Aug 12 '19 at 21:32
  • No, it has nothing to do with HTML. It's a bash script that needs to know where there is an empty space, so that when the times are put into a database they don't go into the wrong column. – Alfred tärning Aug 12 '19 at 21:43
  • The data you show looks an awful lot like an HTML table to me. – Dennis Williamson Aug 12 '19 at 21:50
  • Haha, yeah, but the HTML code above just shows what it looks like when the PDF file is converted to a txt file. – Alfred tärning Aug 12 '19 at 22:06
  • You know, in bash when I type "awk '{print $4}'" I want the time from Wednesday, but it's empty. So I want the script to say "wednesday time empty". But instead of taking the Wednesday time, "awk '{print $4}'" takes the Thursday time. – Alfred tärning Aug 12 '19 at 22:09
  • It's not helpful for people to see data which is not the data you're working with. If the HTML table above is "Point A" in your processing and the data that you had previously posted is "Point B" then there's no way to go from Point B to the desired result, but there is if you start at Point A and forget about B completely. – Dennis Williamson Aug 12 '19 at 22:23
  • 5 days, four values. How do you know which one is missing? Maybe you have 2 tabs, spaces or other delimiters? – Walter A Aug 12 '19 at 22:42
  • Sorry, but I can't post the real data because of GDPR. It's a schedule for a school. – Alfred tärning Aug 13 '19 at 08:38
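
The `$4` behaviour described in the comments can be reproduced with a small sketch. It assumes, hypothetically, that the converted text separates columns with single tab characters, which is one of the possibilities raised above and is not confirmed by the real data. The sample row and the function name `wednesday_or_empty` are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample row: fields separated by single tabs, Wednesday left empty.
row=$(printf 'XXX\t14:30\t14:30\t\t14:30\t14:30')

# Default splitting treats any run of blanks or tabs as one separator,
# so the empty Wednesday cell vanishes and $4 is Thursday's time.
printf '%s\n' "$row" | awk '{ print $4 }'
# prints: 14:30

# With an explicit tab separator every tab delimits a field, so $4 stays empty.
wednesday_or_empty() {
  awk -F'\t' '{ print ($4 == "" ? "wednesday time empty" : $4) }'
}

printf '%s\n' "$row" | wednesday_or_empty
# prints: wednesday time empty
```

If the file only has runs of spaces between columns, an explicit separator will not help, because a missing value then leaves nothing but more spaces behind.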
