I am checking the user accounts in my whole Active Directory to find the ones that don't belong. My code outputs the usernames that occur only once in the User Name column. Although I am analyzing only one column, I want to keep all of the other columns that go with each row.

import pandas as pd
f1 = pd.read_csv('lastlogonuser.txt', sep='\t', encoding='latin1')
f2 = pd.read_csv('UserAccounts.csv', sep=',', encoding ='latin1')
f2 = f2.rename(columns={'Shortname':'User Name'})
f = pd.concat([f1, f2])
counts = f['User Name'].value_counts()
f = counts[counts == 1] 
f 

I get something like this when I run my code:

sample534         1
sample987         1
sample342         1
sample321         1
sample123         1

I would like ALL of the data from the input files to appear in my output, but I still want only the User Name column analyzed. How do I keep the data in all columns? Or do I have to use a different way of counting that includes all columns?

I would like something like:

   User Name    Description
1  sample534    Journal Mailbox managed by         
1  sample987    Journal Mailbox managed by    
1  sample342    Journal Mailbox managed by   
1  sample321    Journal Mailbox managed by 
1  sample123    Journal Mailbox managed by 

Sample of data I am using:

Account User Name User CN                       Description
ENABLED MBJ29     CN=MBJ29,CN=Users             Journal Mailbox managed by  
ENABLED MBJ14     CN=MBJ14,CN=Users             Journal Mailbox managed by
ENABLED MBJ08     CN=MBJ30,CN=Users             Journal Mailbox managed by   
ENABLED MBJ07     CN=MBJ07,CN=Users             Journal Mailbox managed by 
Merlin
JetCorey

1 Answer

Based on your description, I guess you want to take the usernames that occur exactly once and use them to select the matching rows (with all of their columns) from your dataframe. Maybe you can try this:

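counts = f['User Name'].value_counts()
# Usernames that occur exactly once across both files
unique_names = counts[counts == 1].index
# Keep every column for the rows whose username is unique.
# DataFrame.append was removed in pandas 2.0, so filter with isin instead of appending in a loop.
df2 = f[f['User Name'].isin(unique_names)]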
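
To try the idea without the original files, here is a minimal, self-contained sketch. The two small dataframes are made-up stand-ins for lastlogonuser.txt and UserAccounts.csv, and the helper name `rows_with_unique_usernames` is hypothetical. It uses `duplicated(keep=False)` as an alternative to `value_counts()`, which should give the same rows while keeping all columns:

```python
import pandas as pd


def rows_with_unique_usernames(frame, column="User Name"):
    """Return all columns for rows whose value in `column` occurs exactly once."""
    # duplicated(keep=False) flags every occurrence of a repeated value
    repeated = frame[column].duplicated(keep=False)
    return frame[~repeated]


# Made-up sample data (assumption): stands in for the two input files
f1 = pd.DataFrame({
    "User Name": ["MBJ29", "MBJ14", "MBJ08"],
    "Description": ["Journal Mailbox managed by"] * 3,
})
f2 = pd.DataFrame({
    "User Name": ["MBJ29", "MBJ14", "MBJ07"],
    "Description": ["Journal Mailbox managed by"] * 3,
})

# Stack both sources, then keep only rows whose username appears once overall
f = pd.concat([f1, f2], ignore_index=True)
result = rows_with_unique_usernames(f)
print(result)
```

Here MBJ29 and MBJ14 appear in both sources, so only the MBJ08 and MBJ07 rows remain, each with its Description column intact.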
Andreas Hsieh