
I'm new to Python and am trying to find the most Pythonic way to parse a response from an LDAP query. What I have so far works, but I'd like to make it neater if possible. My response data is this:

"[[('CN=LName\\, FName,OU=MinorUserGroup,OU=MajorUserGroup,DC=my,DC=company,DC=com', {'department': ['theDepartment'], 'mail': ['theEmail@mycompany.com']})]]"

Out of that data, I'm really only interested in the fields within the {}, so that I can put them into a dictionary:

"department:theDepartment,mail:theEmail@mycompany.com"

What I'm doing now feels (and looks) really brute-force, but it works. I've added extra comments showing the result of each step to try to explain this mess.

#Original String
#"[[('CN=LName\\, FName,OU=MinorUserGroup,OU=MajorUserGroup,DC=my,DC=company,DC=com', {'department': ['theDepartment'], 'mail': ['theEmail@mycompany.com']})]]"

#split at open {, take the latter half
myDetails = str(result_set[0]).split('{')
#myDetails[1] = "'department': ['theDepartment'], 'mail': ['theEmail@mycompany.com']})]]"

#split at close }, take the former half
myDetails = str(myDetails[1]).split('}')
#myDetails[0] = "'department': ['theDepartment'], 'mail': ['theEmail@mycompany.com']"

#split at comma to separate the two response fields
myDetails = str(myDetails[0]).split(',')
#myDetails = ["'department': ['theDepartment']", " 'mail': ['theEmail@mycompany.com']"]

#clean up the first response field
myDetails[0] = str(myDetails[0]).translate(None, "'").translate(None," [").translate(None,"]")
#myDetails[0] = "department:theDepartment"

#clean up the second response field
myDetails[1] = str(myDetails[1]).translate(None," '").translate(None, "'").translate(None,"[").translate(None,"]")
#myDetails[1] = "mail:theEmail@mycompany.com"

While I'm a big fan of "if it ain't broke, don't fix it" I'm a bigger fan of efficiency.

EDIT: This ended up working for me, per the accepted answer below by @Mario:

myUser = ast.literal_eval(str(result_set[0]))[0][1] 
myUserDict = { k: v[0] for k, v in myUser.iteritems() }
Dan
  • It looks like something that a JSON parser can do. Also, since you're only interested in what's between the curly braces, you can probably build a simple regex to parse it. – Edward L. Mar 05 '15 at 19:14
  • Consider reviewing the marked answer [to this question](http://stackoverflow.com/questions/13297654/convert-string-into-dictionary-with-python). It seems to work as-is for `x = "{'department': ['theDepartment'], 'mail': ['theEmail@mycompany.com']}"` – jedwards Mar 05 '15 at 19:15
  • If the entire output was valid json this would be easy. ;) – Dan Mar 05 '15 at 19:17
  • Are you aware of Python-ldap? https://pypi.python.org/pypi/python-ldap – oefe Mar 05 '15 at 19:19
  • @oefe python-ldap is the module I'm using already. My LDAP server is giving me the aforementioned string I'm trying to clean up. – Dan Mar 05 '15 at 19:20
  • Doesn't Python-ldap parse the response already? At least, the document for LDAPObject.search_s says: Each result tuple is of the form (dn, attrs), where dn is a string containing the DN (distinguished name) of the entry, and attrs is a dictionary containing the attributes associated with the entry. The keys of attrs are strings, and the associated values are lists of strings. – oefe Mar 05 '15 at 19:53
  • @oefe I'm not sure what to tell you. You can see exactly how I'm building my query on another answer I provided today. http://stackoverflow.com/questions/4784775/ldap-query-in-python/28880749#28880749 – Dan Mar 05 '15 at 20:27
  • I'm not familiar with Python-ldap, but from reading the docs (see my previous comment) it should return already parsed data; the string at the top of your question is the `repr` of those parsed data. There is no need to parse the `repr`. – oefe Mar 05 '15 at 20:52
  • Downvote on a valid question? Care to share why? – Dan Apr 29 '15 at 17:11
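
As a sketch of what oefe describes in the comments above: assuming `result_set` already holds the parsed structure whose `repr` appears at the top of the question, the attributes can probably be read by plain indexing, with no string parsing at all. The literal below is a hand-built stand-in for illustration, not output from a real LDAP server.

```python
# Hand-built stand-in for the parsed search results (an assumption for
# illustration; in the question this value comes from the python-ldap query).
result_set = [[('CN=LName\\, FName,OU=MinorUserGroup,OU=MajorUserGroup,DC=my,DC=company,DC=com',
                {'department': ['theDepartment'], 'mail': ['theEmail@mycompany.com']})]]

# Each entry is a (dn, attrs) tuple; attrs maps attribute names to lists of values.
dn, attrs = result_set[0][0]

# Take the first value of each attribute of interest.
department = attrs['department'][0]
mail = attrs['mail'][0]

print(department)
print(mail)
```

If the real `result_set` is nested differently, only the `[0][0]` indexing would need to change.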

2 Answers


Assuming you trust your input and can count on its strict regularity, this will parse your example data and produce what you're expecting:

import ast

ldapData = "[[('CN=LName\\, FName,OU=MinorUserGroup,OU=MajorUserGroup,DC=my,DC=company,DC=com', {'department': ['theDepartment'], 'mail': ['theEmail@mycompany.com']})]]"

# Using the ast module's function is much safer than using eval. (See below!)
obj = ast.literal_eval(ldapData)[0][0]
rawDict = obj[1]
data = { k: v[0] for k, v in rawDict.iteritems() }

# The dictionary.
print data

The line using the curly brackets is called a dict comprehension.
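
To make the dict comprehension less magical, here is a small sketch that puts it next to an equivalent explicit loop. The variable names are made up for illustration, and it uses `.items()`, which works on both Python 2 and Python 3, whereas `iteritems()` exists only on Python 2.

```python
# Same shape as the attribute dictionary in the question.
raw = {'department': ['theDepartment'], 'mail': ['theEmail@mycompany.com']}

# Dict comprehension: build a new dict, keeping only the first value of each list.
flat = {key: values[0] for key, values in raw.items()}

# The same thing written as an explicit loop.
flat_loop = {}
for key, values in raw.items():
    flat_loop[key] = values[0]

print(flat == flat_loop)
```

Note that taking `values[0]` silently drops any additional values, so this only suits attributes that are known to be single-valued.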


Edit: Another user on this thread suggests using the ast.literal_eval function. After researching this, I have to agree. The eval function will evaluate any expression it is given, including one that runs arbitrary code. If the input were something like this, you'd have a big problem:

eval("__import__('os').system('rm -R *')") 

On the other hand, if the same string is parsed with ast.literal_eval, you get an exception:

>>> import ast
>>> ast.literal_eval("__import__('os').system('rm -R *')")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.7/ast.py", line 80, in literal_eval
    return _convert(node_or_string)
  File "/usr/lib64/python2.7/ast.py", line 79, in _convert
    raise ValueError('malformed string')
ValueError: malformed string
>>> 
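
If you'd rather not let that exception propagate, one option is to wrap the call, roughly as sketched below. `parse_literal` is a hypothetical helper name, not part of the `ast` module, and the harmless `os.getcwd()` call stands in for the destructive command above.

```python
import ast

def parse_literal(text):
    """Return the Python literal described by text, or None if text is
    anything other than a plain literal (hypothetical helper)."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # literal_eval refuses function calls, names, and malformed input.
        return None

safe = parse_literal("{'mail': ['theEmail@mycompany.com']}")
rejected = parse_literal("__import__('os').getcwd()")

print(safe)
print(rejected)
```

One caveat with this design: the string "None" is itself a valid literal, so a caller that needs to tell "parsed None" apart from "rejected" should raise or return a sentinel instead.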

Further discussion can be found here:

http://nedbatchelder.com/blog/201206/eval_really_is_dangerous.html

The module's documentation is here:

https://docs.python.org/2/library/ast.html

Mario
  • Consider using ast.literal_eval over eval – Tim Mar 05 '15 at 19:46
  • I've incorporated your suggestion, but elsewhere in this thread the OP mentions that he's getting the input from the python-ldap module. The module processes the raw input, so I'm going to guess it's already vetted. (I did not see his mention of this until I had edited my response.) I was unaware of the alternative to eval, so thank you for that. – Mario Mar 05 '15 at 20:16
  • I ended up getting this to work. I'm still dissecting how AST works for reference but this was like magic. Thanks for the help! `myUser = ast.literal_eval(str(result_set[0]))[0][1] myUserDict = { k: v[0] for k, v in myUser.iteritems() }` – Dan Mar 06 '15 at 01:54

This relies on ast.literal_eval, so it's not perfect, but it sure is cleaner:

>>> import ast
>>> a = "[[('CN=LName\\, FName,OU=MinorUserGroup,OU=MajorUserGroup,DC=my,DC=company,DC=com', {'department': ['theDepartment'], 'mail': ['theEmail@mycompany.com']})]]"
>>> ast.literal_eval(a)[0][0][1]
{'department': ['theDepartment'], 'mail': ['theEmail@mycompany.com']}
>>> type(ast.literal_eval(a)[0][0][1])
<type 'dict'>
Tim