
I am new to bash scripting. I have an HTML file that I want to read and display in the terminal with formatting.

My HTML file:

<table>
<tr><th >Country Name</th><th >City1</th><th >City2</th><th>City3</th></tr>
<tr><td>CHINA</td><td>500</td><td>700</td><td>1200</td></tr>
<tr><td>USA</td><td>400</td><td>600</td><td>1000</td></tr>
</table>

How can I format the terminal output? I mean, how do I control the spacing between column 1 and column 2?

Muhammad Rashid
  • Don't put text in images. – Cyrus Sep 25 '20 at 19:54
  • It is not text; I just took a screenshot of the HTML file and the terminal so that I could explain my problem easily. – Muhammad Rashid Sep 25 '20 at 19:56
  • There is no such thing as "awk bash". Awk is one programming language. Bash is a different one. A bash script that calls awk (or the inverse) is a script that has different parts written in different languages, run by completely independent interpreters. – Charles Duffy Sep 25 '20 at 19:57
  • ...anyhow, if you want to extract content from XML or HTML, there are dedicated tools for that. I strongly recommend using something that leverages XPath, XSLT, and other standardized query languages; my preferred favorite command-line tool (which generates XSLT under the hood in many of its modes) is [xmlstarlet](http://xmlstar.sourceforge.net/doc/UG/xmlstarlet-ug.html). – Charles Duffy Sep 25 '20 at 19:57
  • And if you're going to use `printf` in awk, use it for _both_ values -- you can have it pad out the string to a specific column length. – Charles Duffy Sep 25 '20 at 20:00
  • Now I see that editing counts for re-open, I had no intention to re-open, only to improve the title. – thanasisp Sep 26 '20 at 15:53

1 Answer

Option 1: Using column to format your existing code's output

Pipe the script's output through the column tool, which aligns the columns for you:

$ cat test.sh 
#!/bin/bash

pre="<tr><td>"
post="<\/td><\/tr>"
mid="<\/td><td>"

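# Keep only the data rows, strip the tags, then sum the three city values.
# $1 is passed as a printf argument so a "%" in a name cannot corrupt the format.
grep "<td>" myfile.html | sed -e "s/^$pre//g;s/$post$//g;s/$mid/ /g" | awk '{ sum=($2+$3+$4); printf "%s %.0f\n", $1, sum }'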

$ cat myfile.html 
<table>
<tr><th >Country Name</th><th >City1</th><th >City2</th><th>City3</th></tr>
<tr><td>CHINA</td><td>500</td><td>700</td><td>1200</td></tr>
<tr><td>USA</td><td>400</td><td>600</td><td>1000</td></tr>
</table>

$ ./test.sh | column -t
CHINA  2400
USA    2000

Option 2: Updating your existing code's use of printf

If we know the longest possible country-name length, we can tell printf to pad to it. Changing only the awk part of your existing code (in this case, telling it to left-justify the name and pad it to 8 characters):

grep "<td>" myfile.html \
  | sed -e "s/^$pre//g;s/$post$//g;s/$mid/ /g" \
  | awk '{ sum=($2+$3+$4); printf "%-8s %.0f\n", $1, sum}'

...we get output:

CHINA    2400
USA      2000
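
If the longest name is not known in advance, one possible variation is to buffer the rows in awk and compute the padding width from the data before printing. The following is only a sketch under assumptions: the function name format_totals and the temporary sample file are hypothetical, and, like the pipeline above, it assumes country names contain no spaces.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original answer): print each country
# with the sum of its three city values, padded to the longest name seen.
format_totals() {
  local pre='<tr><td>' post='<\/td><\/tr>' mid='<\/td><td>'
  grep '<td>' "$1" \
    | sed -e "s/^$pre//;s/$post\$//;s/$mid/ /g" \
    | awk '
        {
          name[NR] = $1                  # buffer every row until the end
          sum[NR]  = $2 + $3 + $4
          if (length($1) > width) width = length($1)
        }
        END {
          fmt = "%-" width "s  %.0f\n"   # e.g. "%-5s  %.0f\n"
          for (i = 1; i <= NR; i++) printf fmt, name[i], sum[i]
        }'
}

# Sample input copied from the question, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<table>
<tr><th >Country Name</th><th >City1</th><th >City2</th><th>City3</th></tr>
<tr><td>CHINA</td><td>500</td><td>700</td><td>1200</td></tr>
<tr><td>USA</td><td>400</td><td>600</td><td>1000</td></tr>
</table>
EOF

format_totals "$sample"
# CHINA  2400
# USA    2000
rm -f "$sample"
```

Building the format string from the measured width avoids hard-coding 8, at the cost of holding all rows in memory until the input ends.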
Charles Duffy
lucasgrvarela
  • This is already covered in [How can I align the columns of tables in bash?](https://stackoverflow.com/questions/12768907/how-can-i-align-the-columns-of-tables-in-bash), which is a member of the duplicate list. (And as covered there, the `column` tool is not available everywhere bash is, so `printf` is often better). – Charles Duffy Sep 25 '20 at 20:04
  • @CharlesDuffy in that case should I remove my answer? I'm new here 2 days only, not sure about all the "rules". – lucasgrvarela Sep 25 '20 at 20:05
  • _shrug_. If it were me, I'd just flag it community-wiki, but that's a strictly voluntary action -- I can't tell you to do it (but disowning gaining any reputation from an answer, as a community-wiki flag does, tends to make answers that edge up against the rules more acceptable). – Charles Duffy Sep 25 '20 at 20:06
  • @CharlesDuffy done, in that case because I believe this answer can be improved to answer this specific scenario with the printf | awk tool. if someone wants to put more effort on it :) – lucasgrvarela Sep 25 '20 at 20:14
  • Heh. Glad to demo using awk's printf for alignment. – Charles Duffy Sep 25 '20 at 20:23
  • @thanaisp, I'd have to search meta, and honestly it's something of an artifact of the old days of the site -- it may well turn out to no longer be a matter of consensus, but back when CW was more actively encouraged this kind of thing was one of the use cases. And you're right -- downvoting just for being a duplicate isn't fair if a question is clear, useful and well-written (and duplicates can well be upvoted should they be those things). I haven't placed any votes on this question in either direction. – Charles Duffy Sep 26 '20 at 12:47
  • @thanaisp, ...btw, I'm not quite sure I follow the part about an upvoting swarm. I will admit to being a little conflicted on this one because this answer _doesn't_ meld all the duplicates, and particularly leaves out some of the important ones about using HTML- or XML-aware tools for parsing; but Lucas was making a reasonable effort to be helpful, and I wanted to reward that. – Charles Duffy Sep 26 '20 at 12:52
  • That said, I can free this one from offtopic content. Have a nice day, thanks, also for your good posts here. – thanasisp Sep 26 '20 at 14:39