I want to parse an HTML file into a JSON object in Flask. Say I have a form that accepts uploaded files with the .html extension. The output must be a JSON file whose contents look like this:

{  
   "a":[  
      "https://www.google.hu/",
      "https://www.facebook.com/"
   ]
}

Can someone please share a guide or any advice for this? Thank you for reading.

Imprfectluck

1 Answer

It seems like you have more than one problem here, so let me answer the JSON part first. Flask exposes a json module (from flask import json); use its json.dumps() method to encode the dictionary as a JSON string that you can return in a response. See the sample code below.

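from flask import Flask, json

# Flask() takes the import name; host, port and debug belong to app.run().
app = Flask(__name__)

@app.route('/url', methods=['POST'])
def html_json():
    data = {"a": ["https://www.google.hu/",
                  "https://www.facebook.com/"
                 ]
    }
    # Set the content type so clients treat the body as JSON.
    return json.dumps(data), 200, {'Content-Type': 'application/json'}

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8080, debug=True)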

http://flask.pocoo.org/docs/0.12/patterns/fileuploads/ should help with file upload tasks.

Beautiful Soup

BeautifulSoup is a Python library that can be used to parse HTML. This will give you a structured set of elements that you can iterate over and convert into JSON.

from bs4 import BeautifulSoup  # Beautiful Soup 4: pip install beautifulsoup4
html = "html file as string here"
soup = BeautifulSoup(html, "html.parser")  # name the parser explicitly

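You can then collect the tags you need, for example every <a> tag that has an href attribute.

links = soup.find_all('a', href=True)
hrefs = [link['href'] for link in links]  # the URL strings themselves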
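To get from the tags to the JSON shape in the question, group the href values under the tag name. Here is a minimal sketch of that step. Note that it swaps Beautiful Soup for the standard library's html.parser so that it runs without extra installs; the names LinkCollector and links_to_json are made up for illustration.

```python
import json
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        # attrs is a list of (name, value) pairs; skip <a> tags without href.
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(value)


def links_to_json(html):
    """Return a JSON string of the form {"a": [href, href, ...]}."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return json.dumps({"a": collector.links})


sample = (
    '<p><a href="https://www.google.hu/">Google</a> '
    '<a href="https://www.facebook.com/">Facebook</a> '
    '<a name="top">no href</a></p>'
)
print(links_to_json(sample))
```

In a Flask view you would pass the uploaded file's decoded contents to links_to_json instead of the sample string.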

Here is a question where someone converts an HTML table to JSON (far easier than a whole page): Convert a HTML Table to JSON

I found a blog post that could help. The author created an HTML-to-JSON parser: http://www.xavierdupre.fr/blog/2013-10-27_nojs.html
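If you do want the whole page, one possible approach is to turn every element into a small dictionary with its tag name, attributes and children, and then dump that tree as JSON. The sketch below is not the method from the blog post; it is an illustration built on the standard library's html.parser, and the names HtmlTreeBuilder, html_to_dict and the "tag"/"attrs"/"children" keys are my own choices. It assumes reasonably well-formed HTML.

```python
import json
from html.parser import HTMLParser

# Elements that never have a closing tag in HTML.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


def make_node(tag, attrs):
    """Build the dictionary that represents one element."""
    return {"tag": tag, "attrs": dict(attrs), "children": []}


class HtmlTreeBuilder(HTMLParser):
    """Builds a nested dict tree; self.stack holds the currently open elements."""

    def __init__(self):
        super().__init__()
        self.root = make_node("document", [])
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = make_node(tag, attrs)
        self.stack[-1]["children"].append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)  # stays open until its end tag

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax such as <br/>: add the node, never open it.
        self.stack[-1]["children"].append(make_node(tag, attrs))

    def handle_endtag(self, tag):
        # Close back to the nearest matching open element; ignore stray end tags.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i]["tag"] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text:  # drop whitespace-only text between tags
            self.stack[-1]["children"].append(text)


def html_to_dict(html):
    builder = HtmlTreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


page = '<div id="x"><p>Hi<br>there</p></div>'
print(json.dumps(html_to_dict(page), indent=2))
```

Text nodes are stored as plain strings among the children, so the order of text and nested tags is preserved.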

Edward Williams
  • Yes, actually more than one. And I am stuck now. :) Thank you so much for sharing. –  Feb 22 '17 at 22:41
  • Up-vote if I helped. If you are stuck, please say what you are stuck on so I can help you. – Edward Williams Feb 22 '17 at 22:44
  • My problem is that the JSON format I shared is only an example. How can I extend this to all HTML elements as JSON? Say each tag becomes a JSON object, and the text and everything inside the tag become its elements. –  Feb 22 '17 at 22:46
  • I am so sorry @Edward, my brain has totally frozen. If I didn't explain it correctly, I can rewrite it. –  Feb 22 '17 at 22:50