I am writing an app that gets IMDb movie information by scraping the movie page source. Some of the movie data in the page source is in JSON format, using the Movie schema from Schema.org:
{
  "@context": "http://schema.org",
  "@type": "Movie",
  "url": "/title/tt7131622/",
  "name": "Once Upon a Time... in Hollywood",
  "genre": [
    "Comedy",
    "Drama"
  ],
  "actor": [
    {
      "@type": "Person",
      "url": "/name/nm0000138/",
      "name": "Leonardo DiCaprio"
    },
    {
      "@type": "Person",
      "url": "/name/nm0000093/",
      "name": "Brad Pitt"
    },
    {
      "@type": "Person",
      "url": "/name/nm3053338/",
      "name": "Margot Robbie"
    },
    {
      "@type": "Person",
      "url": "/name/nm0386472/",
      "name": "Emile Hirsch"
    }
  ],
  "director": {
    "@type": "Person",
    "url": "/name/nm0000233/",
    "name": "Quentin Tarantino"
  },
  "creator": [
    {
      "@type": "Person",
      "url": "/name/nm0000233/",
      "name": "Quentin Tarantino"
    },
    {
      "@type": "Organization",
      "url": "/company/co0050868/"
    },
    {
      "@type": "Organization",
      "url": "/company/co0452101/"
    },
    {
      "@type": "Organization",
      "url": "/company/co0159772/"
    }
  ]
}
I made a movie class, ImdbJsonMovie, to deserialize the JSON object. It has a property named Director whose type is the person class, ImdbJsonPerson:
internal class ImdbJsonMovie
{
    public string Url { get; set; }
    public string Name { get; set; }
    public string Image { get; set; }
    public List<string> Genre { get; set; }
    public List<ImdbJsonPerson> Actor { get; set; }
    public ImdbJsonPerson Director { get; set; }
    //public string[] Creator { get; set; }
}
This works. The problem is that some movies, such as "The Matrix", have more than one director, so "director" is an array instead of a single object:
{
  "@context": "http://schema.org",
  "@type": "Movie",
  "url": "/title/tt0133093/",
  "name": "The Matrix",
  "genre": [
    "Action",
    "Sci-Fi"
  ],
  "actor": [
    {
      "@type": "Person",
      "url": "/name/nm0000206/",
      "name": "Keanu Reeves"
    },
    {
      "@type": "Person",
      "url": "/name/nm0000401/",
      "name": "Laurence Fishburne"
    },
    {
      "@type": "Person",
      "url": "/name/nm0005251/",
      "name": "Carrie-Anne Moss"
    },
    {
      "@type": "Person",
      "url": "/name/nm0915989/",
      "name": "Hugo Weaving"
    }
  ],
  "director": [
    {
      "@type": "Person",
      "url": "/name/nm0905154/",
      "name": "Lana Wachowski"
    },
    {
      "@type": "Person",
      "url": "/name/nm0905152/",
      "name": "Lilly Wachowski"
    }
  ],
  "creator": [
    {
      "@type": "Person",
      "url": "/name/nm0905152/",
      "name": "Lilly Wachowski"
    },
    {
      "@type": "Person",
      "url": "/name/nm0905154/",
      "name": "Lana Wachowski"
    },
    {
      "@type": "Organization",
      "url": "/company/co0002663/"
    },
    {
      "@type": "Organization",
      "url": "/company/co0108864/"
    },
    {
      "@type": "Organization",
      "url": "/company/co0060075/"
    },
    {
      "@type": "Organization",
      "url": "/company/co0019968/"
    },
    {
      "@type": "Organization",
      "url": "/company/co0070636/"
    }
  ]
}
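So for these movies the Director property must be a list of persons, List<ImdbJsonPerson>, which in turn no longer matches movies that have a single director object: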
internal class ImdbJsonMovie
{
    public string Url { get; set; }
    public string Name { get; set; }
    public string Image { get; set; }
    public List<string> Genre { get; set; }
    public List<ImdbJsonPerson> Actor { get; set; }
    public List<ImdbJsonPerson> Director { get; set; }
    //public string[] Creator { get; set; }
}
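For illustration, one possible way to keep a single List<ImdbJsonPerson> property for both shapes is a custom converter that accepts either one object or an array. This is only a sketch under assumptions: it uses System.Text.Json (the serializer is not stated above), the names SingleOrArrayConverter and MovieParser are hypothetical, and the members of ImdbJsonPerson are guessed from the JSON because that class is not shown here.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical converter: reads either one JSON object or an array of objects into a List<T>.
public class SingleOrArrayConverter<T> : JsonConverter<List<T>>
{
    public override List<T> Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
    {
        if (reader.TokenType == JsonTokenType.StartArray)
        {
            var list = new List<T>();
            // Advance to each element until the closing bracket is reached.
            while (reader.Read() && reader.TokenType != JsonTokenType.EndArray)
            {
                list.Add(JsonSerializer.Deserialize<T>(ref reader, options));
            }
            return list;
        }

        // Not an array: treat the value as a single item and wrap it in a list.
        return new List<T> { JsonSerializer.Deserialize<T>(ref reader, options) };
    }

    public override void Write(Utf8JsonWriter writer, List<T> value, JsonSerializerOptions options)
    {
        // Always write an array, even for a single item.
        writer.WriteStartArray();
        foreach (var item in value)
        {
            JsonSerializer.Serialize(writer, item, options);
        }
        writer.WriteEndArray();
    }
}

// Assumed shape of the person class, based on the "url" and "name" fields in the JSON.
public class ImdbJsonPerson
{
    public string Url { get; set; }
    public string Name { get; set; }
}

public class ImdbJsonMovie
{
    public string Url { get; set; }
    public string Name { get; set; }
    public List<string> Genre { get; set; }
    public List<ImdbJsonPerson> Actor { get; set; }

    // "director" may be a single object or an array; the converter handles both.
    [JsonConverter(typeof(SingleOrArrayConverter<ImdbJsonPerson>))]
    public List<ImdbJsonPerson> Director { get; set; }
}

// Hypothetical helper that wires up the options.
public static class MovieParser
{
    public static ImdbJsonMovie Parse(string json)
    {
        // The JSON keys are lowercase while the C# properties are PascalCase.
        var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };
        return JsonSerializer.Deserialize<ImdbJsonMovie>(json, options);
    }
}
```

With this sketch, MovieParser.Parse should return a Director list with one entry for the first JSON document and two entries for "The Matrix". A similar converter can be written for other serializers, but the API differs.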
Another problem is how to deserialize the creator property, which contains a mix of Person and Organization objects.
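To make the mixed array concrete, here is a minimal sketch of one simple option: map every creator entry to a single class that keeps the "@type" value and leaves Name empty for organizations. It again assumes System.Text.Json, and ImdbJsonCreator and CreatorParser are hypothetical names, not part of my current code.

```csharp
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical class covering both Person and Organization entries.
public class ImdbJsonCreator
{
    // "@type" is not a valid C# identifier, so it is mapped explicitly.
    [JsonPropertyName("@type")]
    public string Type { get; set; }

    public string Url { get; set; }

    // Present for "Person" entries only; stays null for "Organization" entries.
    public string Name { get; set; }
}

// Hypothetical helper that parses the value of the "creator" property on its own.
public static class CreatorParser
{
    public static List<ImdbJsonCreator> Parse(string creatorArrayJson)
    {
        var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };
        return JsonSerializer.Deserialize<List<ImdbJsonCreator>>(creatorArrayJson, options);
    }
}
```

The entries can then be filtered on Type ("Person" or "Organization"). Separate Person and Organization classes would need a type-aware converter instead, which this sketch does not attempt.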
So the question is: how do I deserialize this complex JSON object?
Thank you