I created a text file named hello_world.rtf that contains only the following two lines:

Hello
World

I am trying to read this file from the terminal with the bash script below:

while test= read -r line; do
  echo "The text read from file is: $line"
done < hello_world.rtf

and it returns the following:

The text read from file is: {\rtf1\ansi\ansicpg1252\cocoartf1671\cocoasubrtf500
The text read from file is: {\fonttbl\f0\fswiss\fcharset0 Helvetica;}
The text read from file is: {\colortbl;\red255\green255\blue255;}
The text read from file is: {\*\expandedcolortbl;;}
The text read from file is: \paperw12240\paperh15840\margl1440\margr1440\vieww10800\viewh8400\viewkind0
The text read from file is: \pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural\partightenfactor0
The text read from file is: 
The text read from file is: \f0\fs24 \cf0 Hello\

Any suggestions on what is wrong here and how I can get a clean result?

user1934643
  • [RTF](https://en.wikipedia.org/wiki/Rich_Text_Format) means Rich Text Format. It is a language for text formatting (developed and used mostly by Microsoft and deprecated for a while). The output of your code shows what the file actually contains: the words _"Hello"_ and _"World"_ but also formatting instructions. Save the file as plain text, not RTF, and it will contain only the text you typed in it. – axiac Oct 26 '20 at 21:54
  • `test=` in front of `read` does not have any effect. – axiac Oct 26 '20 at 21:56
  • Thank you @axiac, I removed "test=" and saved the file as a .txt file, and now it returns only one line, i.e. "The text read from file is: Hello". I am expecting a similar line for "World" as well. – user1934643 Oct 26 '20 at 22:01
  • Make sure the second line ends with a newline character. `read` returns `false` when it reaches the end of the file, so your code exits the `while` loop without displaying the last value read by `read`. If the file ends with a newline character, the last line (the one that is read but not printed by the code) is empty, so nothing is lost. It is a recommended practice for text files to always end with a newline character. Alternatively, you can print the value of `line` again after the loop. – axiac Oct 26 '20 at 22:15
  • Super!!! it works like a charm now. Thank you so much @axiac – user1934643 Oct 26 '20 at 22:29

1 Answer

RTF means Rich Text Format. It is a markup language for text formatting, developed and used mostly by Microsoft, and it has been deprecated for a while.

The output of your code shows what the file actually contains: the words "Hello" and "World", but also formatting instructions.

Save the file as plain text, not RTF, and it will contain only the text you typed in it.

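`test=` in front of `read` does not have any effect in this context. It only assigns an empty value to a variable named `test` for the duration of that one `read` command, and nothing uses that variable. You can remove it.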
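For comparison, the prefix that is commonly seen in this position is `IFS=`, which does change how `read` behaves. The sketch below is only an illustration; the variable names `input`, `with_test` and `with_ifs` are made up for it and do not come from the question.

```shell
# A line with leading spaces, used to compare the two prefixes.
input='   indented text'

# `test=` sets an unused variable for this one command only,
# so `read` trims the leading whitespace as it normally does.
test= read -r with_test <<EOF
$input
EOF

# `IFS=` empties the field separator for this one command only,
# so `read` keeps the leading whitespace.
IFS= read -r with_ifs <<EOF
$input
EOF

printf '[%s]\n' "$with_test"   # prints [indented text]
printf '[%s]\n' "$with_ifs"    # prints [   indented text]
```

Whether you need `IFS=` depends on whether leading and trailing whitespace in the lines matters to you; for the two-word file in the question it makes no difference.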

Make sure the last line of the file ends with a newline character. `read` returns a non-zero exit status (which the shell treats as false) when it reaches the end of the file before finding a newline. The `while` loop then exits without running its body, so the text that `read` has just stored in `line` is never displayed. If the file ends with a newline character, that final failing `read` stores only an empty string, so nothing is lost.
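A small sketch of this behaviour, assuming a temporary file created with `mktemp` and hypothetical variable names (`first`, `second` and the two `*_status` variables):

```shell
# Build a two-line file whose second line has no trailing newline.
tmp=$(mktemp)
printf 'Hello\nWorld' > "$tmp"

# Call `read` twice on the same input and record each exit status.
{
  read -r first;  first_status=$?
  read -r second; second_status=$?
} < "$tmp"
rm -f "$tmp"

# The first read succeeds. The second one hits end of file before a
# newline: it reports failure but has still stored "World".
echo "first='$first' status=$first_status"
echo "second='$second' status=$second_status"
```

The second `read` fills its variable and fails at the same time, which is why a plain `while read` loop silently drops an unterminated last line.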

It is a recommended practice for text files to always end with a newline character.


Alternatively, you can print the value of `line` once more after the loop. At that point it holds the unterminated last line of the file (everything from the last newline character to the end of the file).
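A widely used variation on printing after the loop is to extend the loop condition, so that the body also runs for an unterminated last line. This is a sketch rather than the only way to do it; the `read_lines` function and the temporary sample file are hypothetical names introduced for the example.

```shell
# Sample file whose last line has no trailing newline.
sample=$(mktemp)
printf 'Hello\nWorld' > "$sample"

# Print every line of the file given as $1.
# When `read` fails at end of file but has still stored some text,
# `[ -n "$line" ]` succeeds and the loop body runs one more time.
read_lines() {
  while IFS= read -r line || [ -n "$line" ]; do
    echo "The text read from file is: $line"
  done < "$1"
}

output=$(read_lines "$sample")
echo "$output"
rm -f "$sample"
```

With this condition the loop handles files both with and without a final newline, so the script no longer depends on how the file was saved.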

axiac