5

PowerShell can pull the full list of 1492 records. When I use Python with the ldap3 module, I hit a 1000-record limit. Please help me change the Python code so it can exceed that limit.

PowerShell input: get-aduser -filter * -SearchBase "OU=SMZ USERS,OU=SMZ,OU=EUR,DC=my_dc,DC=COM" | Measure-Object

output:
Count    : 1492
Average  :
Sum      :
Maximum  :
Minimum  :
Property :

import json
from ldap3 import Server, \
Connection, \
AUTO_BIND_NO_TLS, \
SUBTREE, \
ALL_ATTRIBUTES

def get_ldap_info(u):
    with Connection(Server('my_server', port=636, use_ssl=True),
                    auto_bind=AUTO_BIND_NO_TLS,
                    read_only=True,
                    check_names=True,
                    user='my_login', password='my_password') as c:

        c.search(search_base='OU=SMZ Users,OU=SMZ,OU=EUR,DC=my_dc,DC=com',
                 search_filter='(&(samAccountName=' + u + '))',
                 search_scope=SUBTREE,
                 attributes=ALL_ATTRIBUTES,
                 size_limit=0,
                 paged_criticality=True,
                 paged_size=None,
                 # attributes=['cn'],
                 get_operational_attributes=True)

        content = c.response_to_json()
    result = json.loads(content)
    i = 0
    for item in result["entries"]:
        i += 1
    print(i)

get_ldap_info('*')
user2978216
  • The AD CMDlets within Powershell use Active Directory Web Services to communicate with a DC so they might behave differently compared to using System.DirectoryServices (where I'm also limited to 1000 objects) – bluuf Nov 27 '17 at 08:49
  • `Get-Aduser` has a `-ResultSetSize` parameter to set limits. Regarding the Python code, check whether this helps: http://www.novell.com/coolsolutions/tip/18274.html – Avshalom Nov 27 '17 at 10:03
  • Seems you should not be setting `paged_size` to `None` if you want paged searching. – Bill_Stewart Nov 27 '17 at 13:39
  • @Bill_Stewart changing paged_size does not solve the issue. If I set it to 10 it's pulling 10 records. If I set it to 1500 it's pulling 1000. – user2978216 Nov 29 '17 at 11:11
  • I believe you have to set both `paged_size` and `size_limit`. But other than that I am not an expert in that particular module. – Bill_Stewart Nov 29 '17 at 13:29

3 Answers

8

If you change your code to use the paged_search method of the extend.standard namespace instead, you should be able to retrieve all the results you are looking for.

Just be aware that you will need to treat the response object differently.

from ldap3 import Server, Connection, AUTO_BIND_NO_TLS, SUBTREE, ALL_ATTRIBUTES

def get_ldap_info(u):
    with Connection(Server('XXX', port=636, use_ssl=True),
                    auto_bind=AUTO_BIND_NO_TLS,
                    read_only=True,
                    check_names=True,
                    user='XXX', password='XXX') as c:

        # paged_search returns a generator that fetches pages lazily
        results = c.extend.standard.paged_search(search_base='dc=XXX,dc=XXX,dc=XXX',
                 search_filter='(&(samAccountName=' + u + '))',
                 search_scope=SUBTREE,
                 attributes=ALL_ATTRIBUTES,
                 # attributes=['cn'],
                 get_operational_attributes=True)

        # Consume the generator while the connection is still open
        i = 0
        for item in results:
            # print(item)
            i += 1
    print(i)

get_ldap_info('*')
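
As a rough illustration of "treating the response differently": with its default settings, paged_search yields plain response dictionaries one at a time, and against Active Directory some of them can be referrals rather than user entries. The sketch below keeps only the items whose type is 'searchResEntry'. The helper name collect_entries and the sample data are hypothetical, and the sample list merely stands in for the real generator so that the snippet runs without a directory server.

```python
from typing import Any, Dict, Iterable, List


def collect_entries(results: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only real entries from a paged_search-style stream of dicts."""
    entries = []
    for item in results:
        # Referrals arrive with type 'searchResRef' and carry no attributes.
        if item.get("type") != "searchResEntry":
            continue
        entries.append({
            "dn": item.get("dn"),
            "attributes": dict(item.get("attributes", {})),
        })
    return entries


# Hypothetical sample standing in for the generator returned by paged_search.
sample = [
    {"type": "searchResEntry", "dn": "CN=Alice,OU=SMZ Users", "attributes": {"cn": "Alice"}},
    {"type": "searchResRef", "uri": ["ldap://example.invalid/DC=x"]},
    {"type": "searchResEntry", "dn": "CN=Bob,OU=SMZ Users", "attributes": {"cn": "Bob"}},
]
users = collect_entries(iter(sample))
print(len(users))
```

In real use, the generator returned by c.extend.standard.paged_search(...) would be passed in place of the sample, while the connection is still open.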
bfloriang
  • It returns more than 1000 records. You have answered my question. Thank you! :D – user2978216 Dec 06 '17 at 11:36
  • 1
    To pull data out from results generator, you can use `import pandas as pd;json_ls=[l["tag"] for l in results]; df = pd.DataFrame(json_ls)`. It will pull out dict from generator and load into pandas dataframe. – Decula May 24 '18 at 23:20
  • This recommendation still stops at the same 1000 records for me. You should repeat the search using the cookie, as in the next answer below. – Jcc.Sanabria Nov 16 '20 at 14:59
1

The solution is available in the following link.

This piece of code will fetch page by page.

from ldap3 import Server, Connection, SUBTREE
total_entries = 0
server = Server('test-server')
c = Connection(server, user='username', password='password')
c.search(search_base = 'o=test',
         search_filter = '(objectClass=inetOrgPerson)',
         search_scope = SUBTREE,
         attributes = ['cn', 'givenName'],
         paged_size = 5)
total_entries += len(c.response)
for entry in c.response:
    print(entry['dn'], entry['attributes'])
cookie = c.result['controls']['1.2.840.113556.1.4.319']['value']['cookie']
while cookie:
    c.search(search_base = 'o=test',
             search_filter = '(objectClass=inetOrgPerson)',
             search_scope = SUBTREE,
             attributes = ['cn', 'givenName'],
             paged_size = 5,
             paged_cookie = cookie)
    total_entries += len(c.response)
    cookie = c.result['controls']['1.2.840.113556.1.4.319']['value']['cookie']
    for entry in c.response:
        print(entry['dn'], entry['attributes'])
print('Total entries retrieved:', total_entries)
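
The loop above repeats the same search until the server returns an empty cookie. A minimal sketch of that control flow, separated from ldap3 so it can be run without a server, might look like the following. The names fetch_all_pages and make_fake_fetch are hypothetical, and the fake fetcher only imitates a server that hands out 1000 records per page; in real code, fetch_page would wrap c.search(..., paged_size=..., paged_cookie=cookie) and return c.response together with the cookie read from c.result.

```python
from typing import Callable, List, Optional, Tuple

Page = Tuple[List[dict], Optional[bytes]]


def fetch_all_pages(fetch_page: Callable[[Optional[bytes]], Page]) -> List[dict]:
    """Call fetch_page repeatedly, feeding back the cookie, until it is empty."""
    all_entries: List[dict] = []
    cookie: Optional[bytes] = None
    while True:
        entries, cookie = fetch_page(cookie)
        all_entries.extend(entries)
        if not cookie:  # an empty cookie means the last page was reached
            break
    return all_entries


def make_fake_fetch(total: int, page_size: int) -> Callable[[Optional[bytes]], Page]:
    """Hypothetical stand-in for a server that returns page_size records per call."""
    def fetch_page(cookie: Optional[bytes]) -> Page:
        start = int(cookie.decode()) if cookie else 0
        end = min(start + page_size, total)
        entries = [{"dn": f"CN=user{i}"} for i in range(start, end)]
        next_cookie = str(end).encode() if end < total else b""
        return entries, next_cookie
    return fetch_page


all_users = fetch_all_pages(make_fake_fetch(total=1492, page_size=1000))
print("Total entries retrieved:", len(all_users))
```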
Hara
0

Instead of using paged_search, we can do the following, which worked for me:

all_users = []
cookie = None
search_filter = AD_USERS_CONFIG.AD_SEARCH_FILTER.replace('_memberof_', memberTemplate)
# Query Active Directory
# We are doing this as LDAP can return max 1000 entries from an AD group
while True:
    conn.search(GMI_AD_USERS_CONFIG.AD_SEARCH_BASE,
                search_filter,
                search_scope=SUBTREE,
                size_limit=0,
                paged_size=1000,
                paged_cookie=cookie,
                attributes=GMI_AD_USERS_CONFIG.AD_ATTRS)
    # Retrieve the current page of search results
    users_page = conn.entries

    # Add the current page to the list of all users
    all_users.extend(users_page)

    # Check if there are more pages to retrieve
    cookie = conn.result['controls']['1.2.840.113556.1.4.319']['value']['cookie']
    if not cookie:
        break
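
The cookie lookup in the loop above assumes the server always sends back the paged-results control (OID 1.2.840.113556.1.4.319). If that control is missing from the result, the chained indexing raises KeyError. A defensive variant, sketched here with a hypothetical helper named get_paged_cookie and made-up sample dictionaries shaped like conn.result, could return None instead so the loop simply ends:

```python
from typing import Any, Dict, Optional

# OID of the LDAP Simple Paged Results control
PAGED_RESULTS_OID = "1.2.840.113556.1.4.319"


def get_paged_cookie(result: Optional[Dict[str, Any]]) -> Optional[bytes]:
    """Return the paging cookie from a result dict, or None if absent or empty."""
    controls = (result or {}).get("controls") or {}
    control = controls.get(PAGED_RESULTS_OID) or {}
    value = control.get("value") or {}
    return value.get("cookie") or None


# Hypothetical result dictionaries for illustration only.
more_pages = {"controls": {PAGED_RESULTS_OID: {"value": {"size": 0, "cookie": b"abc"}}}}
last_page = {"controls": {PAGED_RESULTS_OID: {"value": {"size": 0, "cookie": b""}}}}
no_control = {"result": 0, "description": "success"}

print(get_paged_cookie(more_pages), get_paged_cookie(last_page), get_paged_cookie(no_control))
```

In the loop, the lookup line would then become cookie = get_paged_cookie(conn.result).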
Amar Kumar