I am using Watson Personality Insights to analyze a body of text. The results I get from the Node.js Personality Insights demo differ from the results I get with the Python SDK.
Python script:
# personality_insights is an already-configured Personality Insights client (setup not shown)
with open('input_file.txt', encoding='utf-8') as input_file:
    profile = personality_insights.profile(
        input_file.read(), content_type='text/plain;charset=utf-8',
        raw_scores=True, consumption_preferences=True)
print(profile)
Python output (only the agreeableness scores, to stay under the character limit):
{
  "trait_id": "big5_agreeableness",
  "name": "Agreeableness",
  "category": "personality",
  "percentile": 0.2641097108346445,
  "raw_score": 0.717124182764663,
  "children": [
    {
      "trait_id": "facet_altruism",
      "name": "Altruism",
      "category": "personality",
      "percentile": 0.5930367181429955,
      "raw_score": 0.7133462509414262
    },
    {
      "trait_id": "facet_cooperation",
      "name": "Cooperation",
      "category": "personality",
      "percentile": 0.49207238025136585,
      "raw_score": 0.5781918028043768
    },
    {
      "trait_id": "facet_modesty",
      "name": "Modesty",
      "category": "personality",
      "percentile": 0.7504251965616365,
      "raw_score": 0.4840369062774408
    },
    {
      "trait_id": "facet_morality",
      "name": "Uncompromising",
      "category": "personality",
      "percentile": 0.4144135962141314,
      "raw_score": 0.6156094284542545
    },
    {
      "trait_id": "facet_sympathy",
      "name": "Sympathy",
      "category": "personality",
      "percentile": 0.8204286367393345,
      "raw_score": 0.6984933017082747
    },
    {
      "trait_id": "facet_trust",
      "name": "Trust",
      "category": "personality",
      "percentile": 0.5357101531393991,
      "raw_score": 0.5894943830064112
    }
  ]
}
Node.js script:
fs.readFile('input_file.txt', 'utf-8', function (err, data) {
  var params = {};
  params.text = data;
  params.content_type = 'text/plain; charset=utf-8';
  params.raw_scores = true;
  params.consumption_preferences = true;
  personality_insights.profile(params, function (error, response) {
    console.log(JSON.stringify(response));
  });
});
Node.js output:
{
  "id": "Agreeableness",
  "name": "Agreeableness",
  "category": "personality",
  "percentage": 0.2798027409516949,
  "sampling_error": 0.101059064,
  "children": [
    {
      "id": "Altruism",
      "name": "Altruism",
      "category": "personality",
      "percentage": 0.597937110939136,
      "sampling_error": 0.07455418080000001
    },
    {
      "id": "Cooperation",
      "name": "Cooperation",
      "category": "personality",
      "percentage": 0.46813215597029234,
      "sampling_error": 0.0832951302
    },
    {
      "id": "Modesty",
      "name": "Modesty",
      "category": "personality",
      "percentage": 0.7661123497302398,
      "sampling_error": 0.0594182198
    },
    {
      "id": "Morality",
      "name": "Uncompromising",
      "category": "personality",
      "percentage": 0.42178661415240626,
      "sampling_error": 0.0662383546
    },
    {
      "id": "Sympathy",
      "name": "Sympathy",
      "category": "personality",
      "percentage": 0.8252000440378008,
      "sampling_error": 0.1022423736
    },
    {
      "id": "Trust",
      "name": "Trust",
      "category": "personality",
      "percentage": 0.5190032062613837,
      "sampling_error": 0.0600995984
    }
  ]
}
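To make the mismatch concrete, here is a small Python sketch that lines up the two outputs by trait name and prints the gap for each one. The numbers are copied from the outputs above (`percentile` from Python, `percentage` from Node.js); the dictionary names and the `percentile_gaps` helper are only illustrative and not part of either SDK.

```python
# Values copied from the two outputs above, keyed by the "name" field.
python_sdk = {
    "Agreeableness": 0.2641097108346445,
    "Altruism": 0.5930367181429955,
    "Cooperation": 0.49207238025136585,
    "Modesty": 0.7504251965616365,
    "Uncompromising": 0.4144135962141314,
    "Sympathy": 0.8204286367393345,
    "Trust": 0.5357101531393991,
}
node_sdk = {
    "Agreeableness": 0.2798027409516949,
    "Altruism": 0.597937110939136,
    "Cooperation": 0.46813215597029234,
    "Modesty": 0.7661123497302398,
    "Uncompromising": 0.42178661415240626,
    "Sympathy": 0.8252000440378008,
    "Trust": 0.5190032062613837,
}


def percentile_gaps(a, b):
    """Return b[name] - a[name] for every trait name in a (hypothetical helper)."""
    return {name: b[name] - a[name] for name in a}


gaps = percentile_gaps(python_sdk, node_sdk)
for name, gap in gaps.items():
    print(f"{name:15s} {gap:+.4f}")

# Trait with the largest absolute difference between the two outputs.
largest = max(gaps, key=lambda name: abs(gaps[name]))
print("largest gap:", largest)
```

Every trait differs, and the gaps run in both directions.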
The input file for both is the same:
Operations at ports in the U.S. Southeast are shut as the region copes with the changing path of one hurricane even as another is churning toward the region. Hurricane Irma was downgraded to a Category 1 storm as it pushed up through western and central Florida, the WSJ’s Arian Campo-Flores and Joseph De Avila report. That put the Port Tampa Bay in its path but left major trade gateways on the Atlantic coast, including Jacksonville, Georgia’s Port of Savannah and South Carolina Port of Charleston largely outside the storm’s strongest force. The second Category 4 storm to reach the U.S. this season lashed the Miami area with powerful winds and sheets of rain, and both Florida coasts were preparing for severe storm surges and flooding as it headed north and likely toward Georgia. With the storm following so soon after Hurricane Harvey hit the Gulf Coast and a third storm, Jose, heading north, the U.S. issued a rare waiver of the Jones Act, the federal law that prohibits foreign ships from operating in domestic sea routes, the WSJ’s Costas Paris reports. The action will allow foreign tankers to distribute fuel to hurricane-stricken areas.
There is a mismatch between the values returned by the two approaches. The values are the same for both scripts when content_type=text/plain; adding the charset=utf-8 attribute does not seem to make a difference in the results received through the Python code.
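Since the charset attribute is part of the question, one thing that can be checked locally is whether the input contains any non-ASCII characters at all, because only those are affected by how the bytes are decoded. The sketch below uses a short excerpt of the input (it contains the curly apostrophe in "WSJ’s"); the `non_ascii_chars` helper is hypothetical, and I have not confirmed that encoding explains the mismatch.

```python
# Short excerpt of the input text above; it contains a curly apostrophe (U+2019).
sample = "the WSJ’s Arian Campo-Flores"


def non_ascii_chars(text):
    """Return the distinct non-ASCII characters found in text, sorted."""
    return sorted({ch for ch in text if ord(ch) > 127})


print(non_ascii_chars(sample))

# The same bytes read with a different charset yield a different string,
# which is why the declared charset could matter for this input.
utf8_bytes = sample.encode("utf-8")
misread = utf8_bytes.decode("latin-1")
print(len(sample), len(utf8_bytes), len(misread))
print(misread == sample)
```

If the full input file is checked the same way, any characters it reports are the ones for which the declared charset could change what the service sees.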