I've scraped data from a web page and got the content below. How can I convert this object to a Python dict? I can't use json, because it's not a valid JSON structure.

{
    id: 37429,
    debug: true,
    title: '37429',
    filters: {
        3: {
            title: '3',
            all: true,
            values: {
                2006: {
                    title: '2006',
                    order: 0,
                    checked: true
                }
            },
            indicator: false
        },
        58835: {
            title: '58835',
            all: false,
            values: {
                1785924: {
                    title: '1785924',
                    checked: true
                }
            },
            indicator: false
        },
        58423: {
            title: '58423',
            all: false,
            values: {
                1785900: {
                    title: '1785900',
                    checked: true
                }
            },
            indicator: false
        }
    }
}
unknown

1 Answer

It depends on how you want to approach the problem. Do all of the scraped pages return invalid JSON? If so, you should probably write code that corrects the JSON format automatically; that is answered here.
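One way the automatic correction could look is sketched below. This is only a rough sketch, not a general JavaScript parser: the function name `js_object_to_dict` is made up for illustration, and it assumes the input looks like the object in the question (bare identifier or numeric keys, single-quoted strings, and no quotes, commas, colons, or braces inside string values).

```python
import json
import re


def js_object_to_dict(text):
    """Convert a simple JavaScript object literal into a Python dict.

    Hypothetical helper: assumes bare keys and single-quoted strings
    that contain no quotes, commas, colons, or braces.
    """
    # Wrap bare keys (identifiers or numbers) that follow '{' or ',' in double quotes.
    quoted_keys = re.sub(r'([{,]\s*)(\w+)\s*:', r'\1"\2":', text)
    # Turn single-quoted strings into double-quoted ones, as JSON requires.
    double_quoted = re.sub(r"'([^'\\]*)'", r'"\1"', quoted_keys)
    return json.loads(double_quoted)


sample = """
{
    id: 37429,
    debug: true,
    title: '37429',
    filters: {
        3: {
            title: '3',
            all: true,
            values: {
                2006: {
                    title: '2006',
                    order: 0,
                    checked: true
                }
            },
            indicator: false
        }
    }
}
"""

data = js_object_to_dict(sample)
print(data["id"])                                          # 37429
print(data["filters"]["3"]["values"]["2006"]["checked"])   # True
```

Note that JSON only allows string keys, so numeric keys such as 3 and 2006 come back as the strings "3" and "2006". If the real pages contain strings with embedded quotes or punctuation, a regex like this will likely break and a proper parser would be the safer choice.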

On the other hand, if it only happened once and the instance is the one you posted in your question, you could fix the format manually and then call json.loads(your_string).

Personally, I would advise you to figure out why the result of your scraper wasn't in the correct JSON format, to make your life easier in the future. Kudos!

Vincent Pakson
  • No, the result is not supposed to be in JSON format, but the idea of "automatically correcting" it is interesting. – unknown Oct 15 '18 at 08:54
  • So you're saying that the expected result of your scraper is not in JSON format, but it just so happens that what you scraped looked like JSON? Can you also specify what kind of scraper you're using? Maybe that could help. – Vincent Pakson Oct 15 '18 at 09:57