The Penn Treebank format does not annotate the internal structure of a noun phrase, and so does not mark its head. For example:
(NP (JJ crude) (NN oil) (NNS prices))
or
(NP
  (NP (DT the) (JJ big) (JJ blue) (NN house))
  (SBAR
    (WHNP (WDT that))
    (S
      (VP (VBD was)
        (VP (VBN built)
          (PP (IN near)
            (NP (DT the) (NN river))))))))
I would like to extract the heads (prices and house). Do you know of any tool that can do this?
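To make the goal concrete, here is a minimal sketch of the kind of heuristic I have in mind. It is a heavily simplified take on Collins-style head rules, not an existing tool: the names `parse`, `np_head`, and `NOUN_TAGS` are made up for illustration, and the rule set (rightmost noun child, otherwise leftmost NP child, otherwise rightmost child) is an assumption that will certainly miss cases such as coordination or appositions.

```python
import re

# Assumption: only these POS tags count as nominal heads.
NOUN_TAGS = {"NN", "NNS", "NNP", "NNPS"}

def parse(text):
    """Parse one bracketed tree into nested (label, children) tuples.
    Leaf words are plain strings."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok != "(":
            return tok                      # a leaf word
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            children.append(read())
        pos += 1                            # consume ")"
        return (label, children)

    return read()

def is_preterminal(node):
    """True for nodes like (NN house): a tag over a single word."""
    return (isinstance(node, tuple) and len(node[1]) == 1
            and isinstance(node[1][0], str))

def np_head(node):
    """Return the head word of an NP using three simplified rules."""
    if isinstance(node, str):
        return node
    label, children = node
    if is_preterminal(node):
        return children[0]
    # Rule 1: the rightmost child that is a noun tag over a word.
    for child in reversed(children):
        if is_preterminal(child) and child[0] in NOUN_TAGS:
            return child[1][0]
    # Rule 2: otherwise descend into the leftmost NP child
    # (function tags such as NP-SBJ are stripped before comparing).
    for child in children:
        if isinstance(child, tuple) and child[0].split("-")[0] == "NP":
            return np_head(child)
    # Fallback: descend into the rightmost child.
    return np_head(children[-1])

flat = "(NP (JJ crude) (NN oil) (NNS prices))"
nested = """(NP (NP (DT the) (JJ big) (JJ blue) (NN house))
  (SBAR (WHNP (WDT that))
    (S (VP (VBD was) (VP (VBN built)
      (PP (IN near) (NP (DT the) (NN river))))))))"""

print(np_head(parse(flat)))    # prices
print(np_head(parse(nested)))  # house
```

This gives the answers I expect for the two trees above, but I would rather use a maintained head finder with a complete rule table than extend this by hand.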