I've written a plugin which writes JSON output to a file in the _data directory:

while current_page <= total_pages do

  url = 'https://web.consonance.app/api/v2/products.json'
  query = {
    'page' => "#{current_page}",
    'q[publishing_status_eq]' => '04' 
  }
  headers = {
    Authorization: "Token token=**************************"
  }
  request = HTTParty.get(url, query: query, headers: headers)

  hash = JSON.parse(request.body)

  hash['products'].each do |item|
    product_array.push(item)
  end

  current_page += 1

end

# open products.json in data dir and write array output converted from hash back to JSON

File.open("./_data/products.json", "w") { |file| 
  file.puts JSON.pretty_generate(product_array)
}

which puts the desired output as a JSON array in the _data directory with the following format:

[
  {
    "id": 100,
    "work_id": 50,
    "full_title": "Title #1"
  },
  {
    "id": 101,
    "work_id": 51,
    "full_title": "Title #2"
  }
]

When I try to build my site, I get the error:

jekyll 3.8.5 | Error:  (/Users/jamiebowman/Documents/web dev/jekyll/press/_data/products.json): control characters are not allowed at line 1 column 1

When I remove the square brackets at the beginning and the end of the JSON file, then the site builds, but I cannot properly access the data without it being an array.

What are control characters in this context and why are they stopping the site from building?

Traceback errors

Traceback (most recent call last):
        30: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/bin/ruby_executable_hooks:24:in `<main>'
        29: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/bin/ruby_executable_hooks:24:in `eval'
        28: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/bin/jekyll:23:in `<main>'
        27: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/bin/jekyll:23:in `load'
        26: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/jekyll-3.8.5/exe/jekyll:15:in `<top (required)>'
        25: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/mercenary-0.3.6/lib/mercenary.rb:19:in `program'
        24: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/mercenary-0.3.6/lib/mercenary/program.rb:42:in `go'
        23: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/mercenary-0.3.6/lib/mercenary/command.rb:220:in `execute'
        22: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/mercenary-0.3.6/lib/mercenary/command.rb:220:in `each'
        21: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/mercenary-0.3.6/lib/mercenary/command.rb:220:in `block in execute'
        20: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/jekyll-3.8.5/lib/jekyll/commands/serve.rb:75:in `block (2 levels) in init_with_program'
        19: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/jekyll-3.8.5/lib/jekyll/commands/serve.rb:93:in `start'
        18: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/jekyll-3.8.5/lib/jekyll/commands/serve.rb:93:in `each'
        17: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/jekyll-3.8.5/lib/jekyll/commands/serve.rb:93:in `block in start'
        16: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/jekyll-livereload-0.2.2/lib/jekyll-livereload/build.rb:30:in `process'
        15: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/jekyll-3.8.5/lib/jekyll/commands/build.rb:36:in `process'
        14: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/jekyll-3.8.5/lib/jekyll/commands/build.rb:65:in `build'
        13: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/jekyll-3.8.5/lib/jekyll/command.rb:28:in `process_site'
        12: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/jekyll-3.8.5/lib/jekyll/site.rb:69:in `process'
        11: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/jekyll-3.8.5/lib/jekyll/site.rb:164:in `read'
        10: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/jekyll-3.8.5/lib/jekyll/reader.rb:18:in `read'
         9: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/jekyll-3.8.5/lib/jekyll/readers/data_reader.rb:20:in `read'
         8: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/jekyll-3.8.5/lib/jekyll/readers/data_reader.rb:38:in `read_data_to'
         7: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/jekyll-3.8.5/lib/jekyll/readers/data_reader.rb:38:in `each'
         6: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/jekyll-3.8.5/lib/jekyll/readers/data_reader.rb:46:in `block in read_data_to'
         5: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/jekyll-3.8.5/lib/jekyll/readers/data_reader.rb:68:in `read_data_file'
         4: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/safe_yaml-1.0.5/lib/safe_yaml/load.rb:157:in `load_file'
         3: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/safe_yaml-1.0.5/lib/safe_yaml/load.rb:157:in `open'
         2: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/safe_yaml-1.0.5/lib/safe_yaml/load.rb:157:in `block in load_file'
         1: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/safe_yaml-1.0.5/lib/safe_yaml/load.rb:143:in `load'
/Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/safe_yaml-1.0.5/lib/safe_yaml/load.rb:143:in `parse': (/Users/jamiebowman/Documents/web dev/jekyll/press/_data/products.json): control characters are not allowed at line 1 column 1 (Psych::SyntaxError)
jbowman
  • `[{'foo': 'bar'}]` is not valid json. You can verify this by pasting your JSON into https://jsonlint.com/. What do you mean by "I cannot properly access the data"? How are you trying to access the file data? – lacostenycoder Jun 05 '19 at 14:10
  • I have pasted my JSON into jsonlint.com and it is valid. I will edit the OP with a clearer sample of data. – jbowman Jun 05 '19 at 14:54
  • By accessing the data I mean via Jekyll's liquid language i.e. `{{ site.data.products }}` – jbowman Jun 05 '19 at 15:00
  • please post full back trace errors – lacostenycoder Jun 05 '19 at 15:21
  • I ran `bundle exec jekyll serve --trace` and posted results. Hope that's what you meant. – jbowman Jun 05 '19 at 15:42
  • @lacostenycoder I don't know why but it seems to have something to do with the number of results in the response. If I run the plugin as `current_page <= 4` with 50 results per page then the site builds successfully. Same with `current_page <= 2` with 100 results per page. So, the build is unsuccessful if more than 200 items are returned from the request. – jbowman Jun 05 '19 at 15:55

2 Answers

I'm on the team that maintains the API you're calling, and I had the same error. It is caused by non-ASCII characters in the response. You can sanitize them like this:

problematic_string.encode(Encoding.find('ASCII'), **encoding_options)

where encoding_options are

  def encoding_options
    {
      :invalid           => :replace,  # Replace invalid byte sequences
      :undef             => :replace,  # Replace anything not defined in ASCII
      :replace           => '',        # Use a blank for those replacements
      :universal_newline => true       # Always break lines with \n
    }
  end

source: How to get rid of non-ascii characters in ruby

The problematic_string will likely be a long text such as a review, production blurb or other descriptive text.
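Since the long text can sit in any field, one way to apply this is to walk the whole parsed response before writing it out. The following is a minimal sketch, not the API's or Jekyll's own method: `deep_sanitize` and `ENCODING_OPTIONS` are hypothetical names, and the sample record is made up for illustration.

```ruby
require 'json'

# Same options as above, held in a constant (hypothetical name).
ENCODING_OPTIONS = {
  invalid: :replace,       # Replace invalid byte sequences
  undef: :replace,         # Replace anything not defined in ASCII
  replace: '',             # Use a blank for those replacements
  universal_newline: true  # Always break lines with \n
}.freeze

# Hypothetical helper: recursively sanitizes every string value
# in nested arrays and hashes. Hash keys are left untouched.
def deep_sanitize(value)
  case value
  when String then value.encode(Encoding::US_ASCII, **ENCODING_OPTIONS)
  when Array  then value.map { |item| deep_sanitize(item) }
  when Hash   then value.transform_values { |item| deep_sanitize(item) }
  else value
  end
end

# Made-up sample standing in for the collected products.
product_array = [{ 'id' => 100, 'full_title' => "Caf\u00e9 \u0080Title" }]
clean_products = deep_sanitize(product_array)
json_output = JSON.pretty_generate(clean_products)
```

Be aware that this drops every non-ASCII character, including accented letters in titles, so it trades fidelity for a file that parses.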

snowangel

It seems like a JSON.parse error. I don't think you want or need pretty formatting in the file. Maybe try just this:

File.open("./_data/products.json", "w") { |file| 
  file.write product_array.to_json
}

But your error seems like it might be related to this issue, so maybe have a look at the fork posted in the issue to see if it helps.
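Before switching to a fork, it may help to find out which characters the parser is rejecting. The traceback in the question shows the file being loaded through safe_yaml and Psych, that is, as YAML rather than through a JSON parser. The sketch below assumes the parser rejects anything outside the YAML 1.1 printable character set and that the text is valid UTF-8; `NON_PRINTABLE` and `find_unprintable` are hypothetical names, and the sample string is made up.

```ruby
# Assumption: characters outside the YAML 1.1 printable set trigger the
# "control characters are not allowed" error. This class matches them.
NON_PRINTABLE = /[^\u0009\u000A\u000D\u0020-\u007E\u0085\u00A0-\uD7FF\uE000-\uFFFD\u{10000}-\u{10FFFF}]/

# Hypothetical helper: returns [line, column, code point] for each
# character in the text that falls outside the printable set.
def find_unprintable(text)
  hits = []
  text.each_line.with_index(1) do |line, line_no|
    line.each_char.with_index(1) do |char, column|
      next unless char.match?(NON_PRINTABLE)

      hits << [line_no, column, format('U+%04X', char.ord)]
    end
  end
  hits
end

# Made-up sample containing a raw U+0080 control character.
sample = "[\n  {\"full_title\": \"Title\u0080 #1\"}\n]\n"
hits = find_unprintable(sample)
hits.each { |line_no, column, code| puts "line #{line_no}, column #{column}: #{code}" }
```

To check the real file, something like `find_unprintable(File.read('./_data/products.json', encoding: 'UTF-8'))` could be used. If it reports hits only in records from later pages, that would fit the observation that the build breaks past a certain number of results.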

lacostenycoder
  • the `.to_json` method results in the same error, but I will try again with the forked Jekyll version – jbowman Jun 05 '19 at 17:53