1
#!/bin/bash
# Make a txt copy of any html files

for value in $1/*.html
do
        if [[ $value == *.html ]]; then
            cp $value $1/$( basename -s .html $value ).txt
        fi
done

ERROR:
cp: cannot stat '/*.html': No such file or directory
cp: failed to access 'index.html/*.txt': Not a directory

jww
Claudio Lopez
  • pass part of the filename as command line parameter – hashbrown Jul 05 '18 at 01:45
  • I did, I am passing the whole file name as ./convert_html_to_txt.sh index.html for example and throws the error – Claudio Lopez Jul 05 '18 at 01:47
  • @ClaudioLopez, the cause of the error is clear: you are giving a file name where you should give a directory name. For example, `/index.html/a.html` does not exist, so the system complains that it is not there. I believe you want to copy a set of HTML files to another directory; if so, please confirm. – RavinderSingh13 Jul 05 '18 at 01:50
  • @ClaudioLopez, could you please check my solution once and let me know if this helps you? – RavinderSingh13 Jul 05 '18 at 01:59
  • `Convert .html file to .txt using cp` Huh? `cp` doesn't convert anything. Are you simply talking about copying to another file with a different *extension*? If you want to strip the html tags, then there is a utility `html2txt` that does a good job (make sure you look at the options, e.g. `-utf8`, etc.., if you use it) – David C. Rankin Jul 05 '18 at 02:50
  • Also see [How to use Shellcheck](https://github.com/koalaman/shellcheck), [How to debug a bash script?](https://unix.stackexchange.com/q/155551/56041) (U&L.SE), [How to debug a bash script?](https://stackoverflow.com/q/951336/608639) (SO), [How to debug bash script?](https://askubuntu.com/q/21136) (AskU), [Debugging Bash scripts](http://tldp.org/LDP/Bash-Beginners-Guide/html/sect_02_03.html), etc. – jww Jul 05 '18 at 11:50

2 Answers

1

The $1 in the statement below is the first command-line parameter.

$1/*.html

Your code expects it to be the parent directory that contains the HTML files. Suppose your parent directory is /home/user/my_html_files. If you pass this as the command-line parameter, all the HTML files inside this directory will be processed.

# ./convert_html_to_txt.sh /home/user/my_html_files

The above will result in /home/user/my_html_files/*.html in your code. If your HTML files are in the current directory, just pass . as the command-line parameter (. denotes the current directory).
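
To make the directory requirement explicit, here is a minimal sketch of the same loop wrapped in a function. The name copy_html_to_txt is hypothetical and not part of the original script; the sketch assumes bash and that the .txt copies should sit next to the .html files. It rejects an argument that is not a directory (such as index.html) and skips the case where no .html file matches.

```shell
#!/bin/bash
# Sketch only: copy_html_to_txt is a hypothetical helper name.
copy_html_to_txt() {
    local dir=${1:-.}    # assumption: default to the current directory
    local file

    # A file name such as index.html is not a valid argument here.
    if [[ ! -d $dir ]]; then
        echo "error: '$dir' is not a directory" >&2
        return 1
    fi

    for file in "$dir"/*.html; do
        # When nothing matches, the pattern stays literal; skip it.
        [[ -e $file ]] || continue
        # Replace the .html suffix with .txt and copy.
        cp -- "$file" "${file%.html}.txt"
    done
}

# Example call, using the path from this answer:
# copy_html_to_txt /home/user/my_html_files
```

Quoting "$dir" and "$file" keeps names with spaces intact, which the unquoted $1 and $value in the question would split.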

hashbrown
  • I believe that is not the only problem: the user is passing a file as the argument, and is also doing an unnecessary check on each file (since the for loop only goes through html files, I believe this check could be skipped here). – RavinderSingh13 Jul 05 '18 at 01:57
  • may be. only OP can clarify :=) – hashbrown Jul 05 '18 at 02:00
  • @hashbrown you were right. I just copied this for loop from somewhere else and did not realize $1 was used before any file is checked against the if statement inside the code. Thank you! – Claudio Lopez Jul 06 '18 at 04:02
0

First, you need to pass a directory name (with its complete path) to the script. Second, since the loop already goes through only the HTML files in that directory, you do not need to check the extension again; instead, you could check whether each file was copied to .txt successfully. Most probably you are looking for this kind of solution.

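cat script.ksh
for value in "$1"/*.html
do
   # Strip the directory and the .html suffix to get the bare file name
   temp=$(basename "$value" .html)
   # Dry run: remove the leading echo to actually copy
   echo cp "$value" "$1/$temp.txt"
   # While the echo is in place, $? reflects echo, not cp
   if [[ $? -eq 0 ]]
   then
       echo "File named $value copied successfully to $1/$temp.txt"
   else
       echo "Please check: file named $value NOT copied to $1/$temp.txt"
   fi
done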

Then run it as script.ksh "/directory_name/with_full_path". I have put echo before the cp command so that you can first confirm the printed commands are correct; once they are, remove the echo.
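
Once the echo is removed, the success check can test cp directly instead of reading $? afterwards. The following is a rough sketch of that variant, assuming bash; copy_with_report is a hypothetical name, and the messages are only modelled on the ones above.

```shell
#!/bin/bash
# Sketch only: copy_with_report is a hypothetical helper name.
copy_with_report() {
    local dir=$1 value target status=0

    for value in "$dir"/*.html; do
        # When nothing matches, the pattern stays literal; skip it.
        [[ -e $value ]] || continue
        target=${value%.html}.txt
        # Use cp's own exit status as the condition.
        if cp -- "$value" "$target"; then
            echo "Copied $value to $target"
        else
            echo "Please check: $value NOT copied to $target" >&2
            status=1
        fi
    done
    # Non-zero if any copy failed.
    return "$status"
}

# Example call:
# copy_with_report "/directory_name/with_full_path"
```

Testing the command in the if condition avoids the situation where another command, such as echo, runs in between and overwrites $?.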

RavinderSingh13