Any infobox is a template transcluded by curly brackets. Let's have a look to a template and how it is transcluded in wikitext:
Infobox film
{{Infobox film
| name = Actresses
| image = Actrius film poster.jpg
| alt =
| caption = Catalan language film poster
| native_name = ([[Catalan language|Catalan]]: '''''Actrius''''')
| director = [[Ventura Pons]]
| producer = Ventura Pons
| writer = [[Josep Maria Benet i Jornet]]
| screenplay = Ventura Pons
| story =
| based_on = {{based on|(stage play) ''E.R.''|Josep Maria Benet i Jornet}}
| starring = {{ubl|[[Núria Espert]]|[[Rosa Maria Sardà]]|[[Anna Lizaran]]|[[Mercè Pons]]}}
| narrator = <!-- or: |narrators = -->
| music = Carles Cases
| cinematography = Tomàs Pladevall
| editing = Pere Abadal
| production_companies = {{ubl|[[Canal+|Canal+ España]]|Els Films de la Rambla S.A.|[[Generalitat de Catalunya|Generalitat de Catalunya - Departament de Cultura]]|[[Televisión Española]]}}
| distributor = [[Buena Vista International]]
| released = {{film date|df=yes|1997|1|17|[[Spain]]}}
| runtime = 100 minutes
| country = Spain
| language = Catalan
| budget =
| gross = <!--(please use condensed and rounded values, e.g. "£11.6 million" not "£11,586,221")-->
}}
There are two high level Page
methods in Pywikibot to parse the content of any template inside the wikitext content. Both use mwparserfromhell
if installed; otherwise a regex is used but the regex may fail for nested templates with depth > 3:
raw_extracted_templates
raw_extracted_templates
is a Page
property with returns a list of tuples with two items each. The first item is the template identifier as str, 'Infobox film'
for example. The second item is an OrderedDict with template parameters identifier as keys and their assignmets as values. For example the template fields
| name = FILM TITLE
| image = FILM TITLE poster.jpg
| caption = Theatrical release poster
results in an OrderedDict as
OrderedDict((name='FILM TITLE', image='FILM TITLE poster.jpg' caption='Theatrical release poster')
Now how get it with Pywikibot?
from pprint import pprint
import pywikibot
site = pywikibot.Site('wikipedia:en') # or pywikibot.Site('en', 'wikipedia') for older Releases
page = pywikibot.Page(site, 'Actrius')
all_templates = page.page.raw_extracted_templates
for tmpl, params in all_templates:
if tmpl == 'Infobox film':
pprint(params)
This will print
OrderedDict([('name', 'Actresses'),
('image', 'Actrius film poster.jpg'),
('alt', ''),
('caption', 'Catalan language film poster'),
('native_name',
"([[Catalan language|Catalan]]: '''''Actrius''''')"),
('director', '[[Ventura Pons]]'),
('producer', 'Ventura Pons'),
('writer', '[[Josep Maria Benet i Jornet]]'),
('screenplay', 'Ventura Pons'),
('story', ''),
('based_on',
"{{based on|(stage play) ''E.R.''|Josep Maria Benet i Jornet}}"),
('starring',
'{{ubl|[[Núria Espert]]|[[Rosa Maria Sardà]]|[[Anna '
'Lizaran]]|[[Mercè Pons]]}}'),
('narrator', ''),
('music', 'Carles Cases'),
('cinematography', 'Tomàs Pladevall'),
('editing', 'Pere Abadal'),
('production_companies',
'{{ubl|[[Canal+|Canal+ España]]|Els Films de la Rambla '
'S.A.|[[Generalitat de Catalunya|Generalitat de Catalunya - '
'Departament de Cultura]]|[[Televisión Española]]}}'),
('distributor', '[[Buena Vista International]]'),
('released', '{{film date|df=yes|1997|1|17|[[Spain]]}}'),
('runtime', '100 minutes'),
('country', 'Spain'),
('language', 'Catalan'),
('budget', ''),
('gross', '')])
templatesWithParams()
This is similar to raw_extracted_templates property but the method returns a list of tuples with again two items. The first item is the template as a Page
object. The second item is a list of template parameters. Have a look at the sample:
Sample code
from pprint import pprint
import pywikibot
site = pywikibot.Site('wikipedia:en') # or pywikibot.Site('en', 'wikipedia') for older Releases
page = pywikibot.Page(site, 'Actrius')
all_templates = page.templatestemplatesWithParams()
for tmpl, params in all_templates:
if tmpl.title(with_ns=False) == 'Infobox film':
pprint(tmpl)
This will print the list:
['alt=',
"based_on={{based on|(stage play) ''E.R.''|Josep Maria Benet i Jornet}}",
'budget=',
'caption=Catalan language film poster',
'cinematography=Tomàs Pladevall',
'country=Spain',
'director=[[Ventura Pons]]',
'distributor=[[Buena Vista International]]',
'editing=Pere Abadal',
'gross=',
'image=Actrius film poster.jpg',
'language=Catalan',
'music=Carles Cases',
'name=Actresses',
'narrator=',
"native_name=([[Catalan language|Catalan]]: '''''Actrius''''')",
'producer=Ventura Pons',
'production_companies={{ubl|[[Canal+|Canal+ España]]|Els Films de la Rambla '
'S.A.|[[Generalitat de Catalunya|Generalitat de Catalunya - Departament de '
'Cultura]]|[[Televisión Española]]}}',
'released={{film date|df=yes|1997|1|17|[[Spain]]}}',
'runtime=100 minutes',
'screenplay=Ventura Pons',
'starring={{ubl|[[Núria Espert]]|[[Rosa Maria Sardà]]|[[Anna '
'Lizaran]]|[[Mercè Pons]]}}',
'story=',
'writer=[[Josep Maria Benet i Jornet]]']