I have a very large knowledge graph stored as a pandas DataFrame, shown below.
This DataFrame, KG, has more than 100 million rows.
KG:
pred subj obj
0 nationality BART USA
1 placeOfBirth BART NEWYORK
2 locatedIn NEWYORK USA
... ... ... ...
116390740 hasFather BART HOMMER
116390741 nationality HOMMER USA
116390743 placeOfBirth HOMMER NEWYORK
I want to select the rows of KG whose subj has a specific value.
Treating the subj column as a Series, I tried indexing KG with a boolean Series generated by the isin()
function, as shown below.
KG[KG['subj'].isin(['BART', 'NEWYORK'])]
My desired output is
pred subj obj
0 nationality BART USA
1 placeOfBirth BART NEWYORK
2 locatedIn NEWYORK USA
116390740 hasFather BART HOMMER
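For reference, here is a minimal sketch that reproduces the setup on a toy frame. The six rows and their index labels are copied from the excerpt above as stand-ins for the real KG, which is far larger; the names mask and result are only for illustration.

```python
import pandas as pd

# Toy stand-in for KG: only the rows visible in the excerpt,
# with the same index labels (the real frame has 100M+ rows).
KG = pd.DataFrame(
    {
        "pred": ["nationality", "placeOfBirth", "locatedIn",
                 "hasFather", "nationality", "placeOfBirth"],
        "subj": ["BART", "BART", "NEWYORK", "BART", "HOMMER", "HOMMER"],
        "obj": ["USA", "NEWYORK", "USA", "HOMMER", "USA", "NEWYORK"],
    },
    index=[0, 1, 2, 116390740, 116390741, 116390743],
)

# Boolean mask: True where subj is one of the requested values.
mask = KG["subj"].isin(["BART", "NEWYORK"])

# Boolean indexing keeps the matching rows and their original index labels.
result = KG[mask]
print(result)
```

On this toy frame the filter keeps the four rows listed in the desired output; the slowness only shows up at the full size of KG.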
I have to repeat the above lookup, but this method takes a long time. Is there a way to do this that is significantly faster?
Thanks!