UPDATE: I opened an issue on github based on a Ivan Mainetti's suggestion. If you want to weigh in there, it is :https://github.com/orientechnologies/orientdb/issues/6757
I am working on a database based on OrienDB and using a python interface for it. I've had pretty good luck with it, but I've run into a problem that seems to be the driver's (pyorient) wonkiness when dealing with certain unicode characters.
The data structure I'm uploading to the database looks like this:
new_Node = {'@Nodes':
{
"Abs_Address":Ono.absolute_address,
'Content':Ono.content,
'Heading':Ono.heading,
'Type':Ono.type,
'Value':Ono.value
}
}
I have created literally hundreds of records flawlessly on OrientDB / pyorient. I don't think the problem is necessarily a pyorient specific question, however, as I think the reason it is failing on a particular record is because the Ono.absolute_address element has a unicode character that pyorient is somehow choking on.
The record I want to create has an Abs_address of /u/c/2/a1–2, but the node I get when I pass the value to the my data structure above is this:
{'@Nodes': {'Content': '', 'Abs_Address': u'/u/c/2/a1\u20132', 'Type': 'section', 'Heading': ' Transferred', 'Value': u'1\u20132'}}
I think that somehow my problem is python is mixing unicode and ascii strings / chars? I'm a bit new to python and not declaring types, so I'm hoping this isn't an issue with pyorient perse given that the new_Node object doesn't output the properly formatted string...? Or is this an instance of pyorient not liking unicode? I'm tearing my hair out on this one. Any help is appreciated.
In case the error is coming from pyorient and not some kind of text encoding, here's the pyorient-related info. I am creating the record using this code:
rec_position = self.pyo_client.record_create(14, new_Node)
And this is the error I'm getting:
com.orientechnologies.orient.core.storage.ORecordDuplicatedException - Cannot index record Nodes{Content:,Abs_Address:null,Type:section,Heading: Transferred,Value:null}: found duplicated key 'null' in index 'Nodes.Abs_Address' previously assigned to the record #14:558
The error is odd as it suggests that the backend database is getting a null object for the address. Apparently it did create an entry for this "address," but it's not what I want it to do. I don't know why address strings with unicode are coming up null in the database... I can create it through orientDB studio using the exact string I fed into the new_Node data structure... but I can't use python to do the same thing.
Someone help?
EDIT:
Thanks to Laurent, I've narrowed the problem down to something to do with unicode objects and pyorient. Whenever a the variable I am passing is type unicode, the pyorient adapter sends a null value to the OrientDB database. I determined the value that is causing the problem is an ndash symbol, and Laurent helped me replace it with a minus sign using this code
.replace(u"\u2013",u"-")
When I do that, however, pyorient gets unicode objects which it then passes as null values... This is not good. I can fix this short term by recasting the string using str(...) and this appears to solve my immediate problem:
str(Ono.absolute_address.replace(u"\u2013",u"-"))
. Problem is, I know I will have symbols and other unusual characters in my DB data. I know the database supports the unicode strings because I can add them manually or use SQL syntax to do what I cannot do via pyorient and python... I am assuming this is a dicey casting issue somewhere, but I'm not really sure where. This seems very similar to this problem: http://stackoverflow.duapp.com/questions/34757352/how-do-i-create-a-linked-record-in-orientdb-using-pyorient-library
Any pyorient people out there? Python gods? Lucky s0bs? =)