I have a bunch of LaTeX files which use the \input{filename.tex} macro (it works like #include in C), and I want to resolve them so that I can output everything to a single .tex file. Each file must be pasted in place of its \input{} macro; it is safe to assume that each file is referenced only once.

Example:

tesis.tex:

My thesis.
\input{chapter1.tex}
More things
\input{chapter2.tex}

chapter1.tex:

Chapter 1 content.

chapter2.tex:

Chapter 2 content.
\include{section2-2.tex}

section2-2.tex:

Section 1.

The desired result should be:

My thesis.
Chapter 1 content.
More things
Chapter 2 content.
Section 1.

If there were only one level of \input{foo.tex}, I could solve this with the following AWK program:

/\\input\{.*\}/{
    sub(/^[^{]*{/,"",$0)
    sub(/}[^}]*$/,"",$0)
    system("cat " $0)
    next
}

{
    print $0
}

Is there any way to read files recursively in AWK?

(I am open to doing it in any other language, but the more POSIX-compliant the better.)

Thanks!

James Brown

2 Answers

Here's a solution in awk that uses getline inside a recursive function. I assumed chapter2.tex uses \input rather than \include:

Chapter 2 content.
\input{section2-2.tex}

Code:

$ cat program.awk
function recurse(file) {              # the recursive function definition
    while((getline line<file) >0) {   # read parameter given file line by line
        if(line~/^\\input/) {         # if line starts with \input 
            gsub(/^.*{|}.*$/,"",line) # read the filename from inside {}
#           print "FILE: " line       # debug
            recurse(line)             # make the recursive function call
        }
        else print line               # print records without \input
    }
    close(file)                       # after file processed close it
}
{                                     # main program used to just call recurse()
    recurse(FILENAME)                 # called
    exit                              # once called, exit
}

Run it:

$ awk -f program.awk tesis.tex
My thesis.
Chapter 1 content.
More things
Chapter 2 content.
Section 1.

This solution expects \input to be at the beginning of the record, with no other data on that line.
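If \input can appear in the middle of a line, one way to lift that restriction is to locate the macro with match() and print the text on either side of it. The following is only a sketch under a few assumptions: the wrapper name flatten_tex is made up for illustration, there is at most one \input per line, the file name contains no braces, and a file that cannot be opened is skipped silently.

```shell
#!/bin/sh
# flatten_tex FILE: print FILE with each \input{...} expanded in place.
# (flatten_tex is a hypothetical name, not part of the answer above.)
flatten_tex() {
    awk '
    # file is the argument; line, name and rest are per-call locals,
    # so the text after the macro survives the recursive call
    function expand(file,    line, name, rest) {
        while ((getline line < file) > 0) {
            if (match(line, /\\input[{][^}]*[}]/)) {
                # text in front of the macro, if any
                if (RSTART > 1) print substr(line, 1, RSTART - 1)
                # "\input{" is 7 characters, the closing brace is 1 more
                name = substr(line, RSTART + 7, RLENGTH - 8)
                rest = substr(line, RSTART + RLENGTH)
                expand(name)
                # text behind the macro, if any
                if (rest != "") print rest
            } else {
                print line
            }
        }
        close(file)
    }
    BEGIN { expand(ARGV[1]) }
    ' "$1"
}
```

Called as flatten_tex tesis.tex, this writes the merged document to standard output. Text before and after the macro is printed on separate lines, so the line structure of the result can differ slightly from the original.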

James Brown

Since you also tagged this bash, something like the following could work in bash, but it is not tested:

#!/bin/bash
function texextract {
    local line filename
    # matches \input{...} or \include{...} at the start of a line; may need fine-tuning
    local re='^[[:space:]]*\\(input|include)\{([^}]+)\}'
    while IFS= read -r line || [[ -n $line ]]; do   # also reads a last line without newline
        if [[ $line =~ $re ]]; then
            filename="${BASH_REMATCH[2]}"   # text between { and } ---> filename=section2-2.tex
            texextract "$filename"          # call itself with the referenced file
        else
            printf '%s\n' "$line" >>commonbigfile   # appends; remove commonbigfile before a re-run
        fi
    done <"$1"   # $1 holds the filename sent by the caller
}

texextract tesis.tex   # master file

A bash function can call itself (this works in bash 4.4 and is not specific to that version), which is what this script relies on.
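Since the question asks for something as POSIX as possible, the same recursive idea can also be sketched in plain sh. This is an illustration only: the name texflatten is made up, it writes to standard output rather than to a fixed file, and it assumes the macro is the only thing on its line (no leading or trailing whitespace). POSIX sh has no local variables, but line and name are not needed again after the recursive call returns, so sharing them is harmless here.

```shell
#!/bin/sh
# texflatten FILE: print FILE, replacing each line of the form
# \input{name} or \include{name} with the contents of name.
# (texflatten is a hypothetical name used only for this sketch.)
texflatten() {
    # "|| [ -n "$line" ]" keeps a final line that lacks a trailing newline
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            '\input{'*'}'|'\include{'*'}')
                name=${line#*\{}      # drop everything up to the first {
                name=${name%\}}       # drop the trailing }
                texflatten "$name"    # recurse into the referenced file
                ;;
            *)
                printf '%s\n' "$line"
                ;;
        esac
    done < "$1"
}
```

It could be run as texflatten tesis.tex > merged.tex, where merged.tex is an arbitrary output name.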

George Vasiliou