I have a few lines of code in a file. The code contains newlines, tabs, strings, and pattern strings.

I want to read the file's content into a string so that it can be sent as the value of a parameter in JSON: {param1: "value1", code: "code-content-from-file-should-go-here"}

Let's say the file content is:

function string.urlDecode(str)
  if string.isEmpty(str) then return str end
  str = string.gsub(str, "+", " ")
  str = string.gsub(str, "%%(%x%x)", function(h) return string.char(tonumber(h, 16)) end)
  str = string.gsub(str, "\r\n", "\n")
  return str
end

This should be converted to the following, where the code formatting (newlines, tabs) is preserved as escape sequences and characters such as " and \ are escaped:

function string.urlDecode(str)\n  if string.isEmpty(str) then return str end\n  str = string.gsub(str, \"+\", \" \")\n  str = string.gsub(str, \"%%(%x%x)\", function(h) return string.char(tonumber(h, 16)) end)\n  str = string.gsub(str, \"\\r\\n\", \"\\n\")\n  return str\nend

so that the JSON becomes:

{param1: "value1", code: "function string.urlDecode(str)\n  if string.isEmpty(str) then return str end\n  str = string.gsub(str, \"+\", \" \")\n  str = string.gsub(str, \"%%(%x%x)\", function(h) return string.char(tonumber(h, 16)) end)\n  str = string.gsub(str, \"\\r\\n\", \"\\n\")\n  return str\nend"}

The conversion could be done with sed (as in related Stack Overflow threads such as "How can I replace a newline (\n) using sed?"), but then I would have to handle each case separately: newlines, tabs, ", \, and any other special characters that need to be escaped, which I don't know.

Is there a bash command (or maybe a Python module) that handles all such cases when converting file content to a string?

This seems like a common use case for anyone who wants to send code content in JSON.

lucky
  • Isn't this essentially the same question as [How to escape special characters in building a JSON string?](http://stackoverflow.com/questions/19176024/how-to-escape-special-characters-in-building-a-json-string) – Benjamin W. May 11 '17 at 21:16

1 Answer

If the content is in file.txt:

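# Escape a string so that it can be embedded in a JSON string literal.
function encode {
    local input=$1
    local output= ic oc i
    for ((i=0;i<${#input};i+=1)); do
        ic=${input:$i:1}
        if [[ $ic = $'\n' ]]; then
            oc='\n'
        elif [[ $ic = $'\t' ]]; then
            oc='\t'
        elif  [[ $ic = '\' || $ic = '"' ]]; then
            oc='\'$ic
        # [[ $ic < $'\040' ]] # works only if LC_COLLATE=C or LC_ALL=C
        elif (( $(printf "%d" "'$ic") < 32 )); then
            # JSON has no octal escapes; other control characters become \u00XX
            oc=$(printf '\\u%04x' "'$ic")
        else
            oc=$ic
        fi
        output=$output$oc
    done
    # printf instead of echo, so output starting with "-n" or "-e" is printed as-is
    printf '%s\n' "$output"
}

# Keys are quoted so that the result is valid JSON.
# Note: $(<file.txt) drops trailing newlines from the file.
printf '{"param1": "%s", "code": "%s"}\n' "value1" "$(encode "$(<file.txt)")"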
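
For the Python route the question asks about, the standard library's json module already performs all of this escaping. The following is only a minimal sketch, assuming the file is UTF-8 text; the function names build_payload and build_payload_from_file are made up for illustration and do not come from the answer above.

```python
import json


def build_payload(code, param1="value1"):
    """Return a JSON document with the source text embedded as a string."""
    # json.dumps escapes ", \ and all control characters (newline, tab, ...).
    return json.dumps({"param1": param1, "code": code})


def build_payload_from_file(path, param1="value1"):
    """Read the file at `path` (assumed UTF-8) and wrap it with build_payload."""
    # newline="" disables newline translation, so the file is read verbatim.
    with open(path, encoding="utf-8", newline="") as f:
        return build_payload(f.read(), param1)


if __name__ == "__main__":
    # Hypothetical sample input containing quotes, backslashes, a newline and a tab.
    sample = 'str = string.gsub(str, "\\r\\n", "\\n")\n\treturn str'
    print(build_payload(sample))
```

Calling build_payload_from_file("file.txt") should then give a single-line JSON document that can be sent as-is, and json.loads on the receiving side recovers the original text.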
Nahuel Fouilleul
  • While I don't think the whole slew of "things that JSON doesn't understand" such as all control characters (`\b`, `\r`, `\f` etc.) have to be escaped unless you know they show up in your input file, at least tabs should be, especially if you don't want to remove tab indents in source code. – Benjamin W. May 11 '17 at 20:21
  • @BenjaminW. Control characters are valid in JSON string literal, and it was not in question but the code may be modified easily adding for example `elif [[ $ic < $'\040' ]]; then oc='\0'$(printf "%o" "'$ic");` the only character which may break is NUL '\0', because cannot be set into a bash variable – Nahuel Fouilleul May 11 '17 at 20:54
  • I don't think they're valid in strings, according to the [RFC](https://tools.ietf.org/html/rfc7159): "All Unicode characters may be placed within the quotation marks, except for the characters that must be escaped: quotation mark, reverse solidus, and the control characters (U+0000 through U+001F)." Or try this minimal `jq` example: `jq <<< $'{"a":"a\bb"}'` – Benjamin W. May 11 '17 at 21:13
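
The same check can be made without jq using Python's json module, which in its default strict mode rejects raw control characters inside strings. This is a small sketch; the helper name parses is hypothetical.

```python
import json

# A JSON text whose string value contains a literal backspace (U+0008).
raw = '{"a":"a\bb"}'
# The same character written as a JSON escape sequence instead.
escaped = '{"a":"a\\u0008b"}'


def parses(text):
    """Return True if `text` is accepted by the default (strict) JSON parser."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


if __name__ == "__main__":
    print("raw control character accepted:", parses(raw))
    print("escaped control character accepted:", parses(escaped))
```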