2

I have data in markup like this:

  <p class="bbook">Lorem</p>
  <p class="bref">
     <a class="ref" href="prin.v.ii.ii.html">2:15</a>
     <a class="ref" href="prin.v.i.v.html">3:17-19</a>
     <a class="ref" href="prin.v.v.html">3:19 </a>
  </p>

 <p class="bbook">Ipsum</p>
 <p class="bref">
     <a class="ref" href="sec.vii.xxii.html">10:18</a>
     <a class="ref" href="sec.vii.ix.html">10:27</a>
     <a class="ref" href="sec.vii.xxiii.html">10:28</a>
 </p>

I'd like to convert it to a JSON object like this:

{
    "Lorem": {
        "prin.v.ii.ii.html": "2:15",
        "prin.v.i.v.html": "3:17-19",
        "prin.v.v.html": "3:19"
    },
    "Ipsum": {
        "sec.vii.xxii.html": "10:18",
        "sec.vii.ix.html": "10:27",
        "sec.vii.xxiii.html": "10:28"
    }
}

I've seen some HTML to JSON solutions here but none that I can find that deal with attributes. I know it might be easier if the markup had ul's but it doesn't. How could I convert this?

nathanbweb

5 Answers

4

Pretty easily, I should think. Here's some example code in jQuery-flavored JavaScript, but you can adjust to taste with the DOM traverser and JSON library in your language of choice. (For example, in Perl, you'd use the HTML::TreeBuilder and JSON modules.)

var json_obj = {};
$('p.bbook').each(function(i,el) {
    var which = $(el).text();
    var refs = {};
    $(el).next('p.bref').find('a.ref').each(function(i,el) {
        var href = $(el).attr('href');
        // trim stray whitespace, e.g. the trailing space in "3:19 "
        var chapter_verse = $(el).text().trim();
        refs[href] = chapter_verse;
    });
    json_obj[which] = refs;
});
var json_result = JSON.stringify(json_obj);

At this point, json_result contains a JSON string whose contents match what you describe in your question.
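
For pages that don't load jQuery, the same traversal can be sketched with standard DOM calls. This is only a minimal sketch under a few assumptions: the function name `collectRefs` and its `root` parameter are hypothetical, and every `p.bbook` is assumed to be immediately followed by its `p.bref`, as in the question's markup. It also trims the text, since one of the sample links ("3:19 ") carries a trailing space.

```javascript
// Hypothetical helper: builds { book: { href: "chapter:verse" } } from the
// question's markup using only standard DOM methods (no jQuery).
function collectRefs(root) {
    var result = {};
    root.querySelectorAll('p.bbook').forEach(function (book) {
        var refs = {};
        // Assumption: the references live in the element right after the book.
        var sibling = book.nextElementSibling;
        if (sibling && sibling.matches('p.bref')) {
            sibling.querySelectorAll('a.ref').forEach(function (a) {
                refs[a.getAttribute('href')] = a.textContent.trim();
            });
        }
        result[book.textContent.trim()] = refs;
    });
    return result;
}
```

In a browser, `JSON.stringify(collectRefs(document), null, 4)` should then give a pretty-printed string in the shape the question asks for.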

Aaron Miller
  • @Musa Thanks for that! I forgot to use `var` to scope my variables properly, and wouldn't have spotted the error had you not linked your fiddle. – Aaron Miller Jul 18 '13 at 16:28
2

Use $.parseJSON() and $.each() from the jQuery framework. Here is an example:

$(document).ready(function () {
    var jsonp = '[{"Lang":"jQuery","ID":"1"},{"Lang":"C#","ID":"2"}]';
    var lang = '';
    var obj = $.parseJSON(jsonp);
    $.each(obj, function () {
        lang += this['Lang'] + "<br/>";
    });
    $('span').html(lang);
});
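
Applied to the data in the question, the same parse-and-iterate idea might look like the sketch below. It uses the native `JSON.parse` (jQuery 3.0 deprecated `$.parseJSON` in its favor) and plain `Object.keys` loops. The `jsonText` sample is a shortened copy of the question's target JSON, and `listRefs` is a hypothetical helper name, not part of any library.

```javascript
// Shortened copy of the question's target JSON, used here as sample input.
var jsonText = '{"Lorem":{"prin.v.ii.ii.html":"2:15"},' +
    '"Ipsum":{"sec.vii.xxii.html":"10:18","sec.vii.ix.html":"10:27"}}';

// Hypothetical helper: flattens the nested object into
// "book chapter:verse -> href" strings.
function listRefs(text) {
    var data = JSON.parse(text);
    var lines = [];
    Object.keys(data).forEach(function (book) {
        Object.keys(data[book]).forEach(function (href) {
            lines.push(book + ' ' + data[book][href] + ' -> ' + href);
        });
    });
    return lines;
}

var refLines = listRefs(jsonText);
```

Each entry of `refLines` could then be joined with `<br/>` and written into the page, as the example above does for `lang`.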

krishwader
jody_lognoul
1

I think you should take a look at Beautiful Soup 4.

Start a Python script, feed the HTML to the soup, and you should be able to get whatever you want into a dictionary; then use json.dumps() at the end to get your JSON.

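import json
from bs4 import BeautifulSoup  # pip install beautifulsoup4

end_json = {}

# html_string is assumed to hold the markup from the question
soup = BeautifulSoup(html_string, 'html.parser')
books = soup.find_all('p', class_='bbook')
for book in books:
    # the references sit in the <p class="bref"> that follows each book
    refs = book.find_next_sibling('p', class_='bref')
    links = refs.find_all('a', class_='ref')
    end_json[book.get_text(strip=True)] = {a['href']: a.get_text(strip=True) for a in links}
print(json.dumps(end_json, indent=4))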

Edit: Don't know how I missed the jQuery in the question title, but BS4 is awesome.

bbill
1

http://jsfiddle.net/wDjhJ/

var result = {};
$('.bbook').each(function(a,b){
    var $this = $(b);
    result[$this.text()] = {};
    $this.next().find('a').each(function(k,v){
        var item = $(v);
        result[$this.text()][item.attr('href')] = item.text();
    });
});

$('body').append(JSON.stringify(result));

Traverse the DOM with a couple of loops.

bluetoft
1

jsFiddle

$(document).ready(function() {
    var O = {}, el, key, a;
    $('.bbook').each(function(index, value) {
        el = $(value);
        key = el.text();
        O[key] = {};
        el.next().find('a').each(function(i, v) {
            a = $(v);
            O[key][a.attr('href')] = a.text();
        });
    });

    console.log(JSON.stringify(O));
});
Manoj Yadav