I'm looking for an efficient way to replace a bunch of placeholders/tokens in a user-supplied text file with their corresponding values, which are stored in a simple map or in environment variables. Because the template file is supplied by the end user, I need a "safe" way to do only the variable replacements, without any risk of code execution.

Go's standard "text/template" would work for the replacement itself but imposes specific formatting requirements (e.g. dot "." before the Key) and opens up other possibilities with its function invocations, pipelines, etc.

So what I'm looking for, ideally, is a function that can parse a text file with configurable delimiters ("{{}}" or "${}" or "$##$") and replace all the detected tokens with lookups into a supplied map or their env var values. Similar to what Python's string.Template (https://docs.python.org/2.6/library/string.html?highlight=string.template#string.Template) does.

Is there an easy way to configure or reuse the text/template library for this? Are there any other approaches that would fit the use case better? I've looked into non-Go options as well (like envsubst, awk and sed scripts, etc.), so feel free to go outside of Go if something fits better.

Sample input file ('template.properties'):

var1=$#VAR_1#$
var2=$#VAR_2#$

Sample input data:

VAR_1 = apples
VAR_2 = oranges

Expected output after processing:

var1=apples
var2=oranges
Ike
  • You could manually read the file and perform successive replace operations for the variables/replacements, and you can do this efficiently by building the output on-the-fly. It can be done with fairly few lines of code (~30). See this question+answer which shows this in Java: [Alternative to successive String.replace](http://stackoverflow.com/questions/26735276/alternative-to-successive-string-replace) – icza Jun 10 '15 at 17:09
  • Thanks icza. Implementing my own replacer is definitely an option but I was hoping to find an efficient and flexible library that can do it. Replacing tokens in a string seems like a generic enough task that someone else would have solved well already. – Ike Jun 10 '15 at 17:54
  • Would something like [Mustache](https://mustache.github.io/) work for you? – n0741337 Jun 10 '15 at 17:57
  • n0741337, Mustache is a great option. I didn't think to look for a Go implementation. It still supports "Sections", however, with syntax like `{{#`. Do you know if there's a way to disable that piece? – Ike Jun 10 '15 at 18:10
  • You could pre-filter the input (good idea anyway) and reject it if it contains {{# – mpez0 Jun 10 '15 at 18:23
  • Regarding Mustache, looks like the javascript and go projects are licensed for you to fork/edit freely. I'm a javascript novice and I've never looked at go code, but they seem simple enough to alter. For mustache.js looks like you could just sabotage the [`tagRe`](https://github.com/janl/mustache.js/blob/master/mustache.js#L71) to remove tags you don't want. For mustache.go it looks like you could intercept the [`tag[0]`](https://github.com/hoisie/mustache/blob/master/mustache.go#L241) to skip tags you don't like in the parse() func. Then make sure the corresponding parse tests fail. – n0741337 Jun 10 '15 at 23:01
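
The single-pass idea from icza's comment (build the output on the fly instead of doing successive replacements) could look roughly like this in Go, using only the standard library. This is a minimal sketch, not an existing library function: the name `Expand` and its parameters are made up for illustration, and it assumes that tokens with unknown names should be left in the output unchanged. Because replacement values are written straight to the output and never rescanned, a value that itself looks like a token is not expanded again.

```go
package main

import (
	"fmt"
	"strings"
)

// Expand replaces every open+NAME+closeDelim token in text with vars[NAME].
// Tokens whose NAME is not in vars are copied through unchanged.
// Expand and its parameter names are hypothetical, chosen for this sketch.
func Expand(text, open, closeDelim string, vars map[string]string) string {
	if open == "" || closeDelim == "" {
		return text // empty delimiters cannot mark a token
	}
	var out strings.Builder
	for {
		start := strings.Index(text, open)
		if start < 0 {
			break
		}
		nameStart := start + len(open)
		end := strings.Index(text[nameStart:], closeDelim)
		if end < 0 {
			break // unterminated token: keep the rest verbatim
		}
		name := text[nameStart : nameStart+end]
		out.WriteString(text[:start])
		if val, ok := vars[name]; ok {
			out.WriteString(val)
			text = text[nameStart+end+len(closeDelim):]
		} else {
			// Unknown name: emit the opening delimiter and keep scanning after it.
			out.WriteString(open)
			text = text[nameStart:]
		}
	}
	out.WriteString(text)
	return out.String()
}

func main() {
	vars := map[string]string{"VAR_1": "apples", "VAR_2": "oranges"}
	tmpl := "var1=$#VAR_1#$\nvar2=$#VAR_2#$\n"
	fmt.Print(Expand(tmpl, "$#", "#$", vars))
}
```

The same function handles the other delimiter styles mentioned in the question, for example `Expand(s, "${", "}", vars)` or `Expand(s, "{{", "}}", vars)`. There is no expression evaluation of any kind, so a user-supplied template can only trigger map lookups.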

2 Answers

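This will work as long as your variable names don't contain ERE metacharacters and the replacement values don't contain `&` or backslashes, which sub() treats specially in the replacement string: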

$ cat tst.awk
NR==FNR { var2val[$1] = $NF; next }
{
    for (var in var2val) {
        sub("[$]#"var"#[$]",var2val[var])
    }
    print
}

$ awk -f tst.awk input.data template.properties
var1=apples
var2=oranges

Regarding your comment below about having the mappings in variables instead of in input.data, this might be what you're looking for:

$ cat tst.awk
BEGIN {
    split(vars,tmp)
    for (i in tmp) {
        var2val[tmp[i]] = ENVIRON[tmp[i]]
    }
}
{
    for (var in var2val) {
        sub("[$]#"var"#[$]",var2val[var])
    }
    print
}

This will work with shell variables like:

$ VAR_1=apples VAR_2=oranges gawk -v vars="VAR_1 VAR_2" -f tst.awk template.properties
var1=apples
var2=oranges

or:

$ export VAR_1=apples
$ export VAR_2=oranges
$ gawk -v vars="VAR_1 VAR_2" -f tst.awk template.properties
var1=apples
var2=oranges

or:

$ VAR_1=apples
$ VAR_2=oranges
$ VAR_1="$VAR_1" VAR_2="$VAR_2" gawk -v vars="VAR_1 VAR_2" -f tst.awk template.properties
var1=apples
var2=oranges

Note that this relies on ENVIRON, which POSIX awk specifies (so it is not limited to gawk), and requires VAR_1 etc. to be exported or set on the command line as I have it above.

Or maybe this is what you want:

$ cat tst.awk
BEGIN {
    var2val["VAR_1"] = VAR_1
    var2val["VAR_2"] = VAR_2
}
{
    for (var in var2val) {
        sub("[$]#"var"#[$]",var2val[var])
    }
    print
}

$ VAR_1=apples
$ VAR_2=oranges
$ awk -v VAR_1="$VAR_1" -v VAR_2="$VAR_2" -f tst.awk template.properties
var1=apples
var2=oranges
Ed Morton
  • Thanks for the snippet @EdMorton. I currently have the source var/values in either a data structure (in a Go or Python program) or Environment Variables. I can write the var data to a temp file for this to work as suggested. Is there a trivial way to "pass in" the data or have the awk script look it up from exported shell Environment Variables that wouldn't require creating a temp file with it? – Ike Jun 10 '15 at 20:35
  • Sorry I've never heard of "Go" and couldn't tell a Python program from a hole in the ground. When you say "env vars" - are you talking about shell variables or something else? Whatever it is please edit your question to show it. – Ed Morton Jun 10 '15 at 20:37
  • @Ike I edited my answer to show a couple of possible solutions to one possible interpretation of your comment about having values in env vars. – Ed Morton Jun 10 '15 at 20:45
  • Thanks again @EdMorton. This looks like a good, pragmatic approach. I didn't think of passing the var names in the command line but it's easy enough to do programmatically and also helps as an input filter. I'll test it with a couple of real-world template files and report back. – Ike Jun 10 '15 at 20:50
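
For readers who want the same "pass in the allowed names, look the values up in the environment" pattern inside a Go program rather than awk, here is a hedged sketch built on the standard library's strings.NewReplacer and os.LookupEnv. The helper name `BuildReplacer` and its parameters are invented for this example. It assumes that a name missing from the environment should leave its token untouched. Since strings.Replacer matches literal strings, the caveats about ERE metacharacters and `&` do not apply here.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// BuildReplacer returns a strings.Replacer that maps open+name+closeDelim to
// the value lookup reports for each allowed name. Names for which lookup
// returns false are skipped, so their tokens stay in the output as-is.
// BuildReplacer is a hypothetical helper, not part of any library.
func BuildReplacer(names []string, open, closeDelim string,
	lookup func(string) (string, bool)) *strings.Replacer {
	pairs := make([]string, 0, 2*len(names))
	for _, name := range names {
		if val, ok := lookup(name); ok {
			pairs = append(pairs, open+name+closeDelim, val)
		}
	}
	return strings.NewReplacer(pairs...)
}

func main() {
	// Set here only to keep the demo self-contained; normally these
	// would already be exported by the calling shell.
	os.Setenv("VAR_1", "apples")
	os.Setenv("VAR_2", "oranges")

	// The explicit name list doubles as an input filter: only these
	// environment variables can ever be substituted into the template.
	r := BuildReplacer([]string{"VAR_1", "VAR_2"}, "$#", "#$", os.LookupEnv)
	fmt.Print(r.Replace("var1=$#VAR_1#$\nvar2=$#VAR_2#$\n"))
}
```

The `lookup` parameter accepts any function with the shape of os.LookupEnv, so a map-backed closure can be passed instead when the values live in a data structure.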

Just use fasttemplate[1]. It perfectly fits your requirements:

  • Arbitrary start and end delimiters can be used for placeholders.
  • No risk from untrusted input, because there is no logic other than placeholder substitution.
  • Works much faster than text/template (by about 10x).

[1]https://github.com/valyala/fasttemplate

valyala