
I recently wrote a Python script that processes a Microsoft Windows DHCP server dump file and generates an XML file of the current reservations, using spreadsheet XML formatting.

The script opens the file with Python's open() function, iterates over every line (for line in file) and looks for the keyword reservedip. If the keyword is found, the line is broken up into fields using shlex.split().

However, when I run this script on the default dump files of the Microsoft DHCP server, I get no results. Note that I was also unable to search the file with grep on Linux.

I then tried to open the file in gedit and save it as a unix text file. After this was done, I got results and was able to grep within the file. This method however defeats the whole point of writing a script to automate my work.

I have been searching on Google but had no luck finding what I am looking for. I also tried opening the file in binary mode, but that did not help either.

I hope somebody can help me with this.

As per request, here is an example of what the script does (at least the looping part) and the DHCP server output:

Script

import shlex

# Set up an empty dictionary to store the extracted records
records = {}

# Open the DHCP dump file
f = open("dhcp.txt", "r")

# Iterate file line by line
for line in f:

  # Only use line with the word "reservedip" in it
  if "reservedip" in line:

    # Split line into fields by spaces (excluding quoted substrings)
    field = shlex.split(line)

    # Add a new entry for each record, using the 32-bit integer form of the IP address as its key
    records[addr_to_int(field[7])] = [field[7], field[8], field[9], field[10]]

*note: addr_to_int is a function I wrote that converts a dotted IPv4 address to an integer*
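
For readers who want to run the loop above: the real addr_to_int is not shown in the question, so the following is only a minimal sketch of what such a helper could look like. The validation and the error message are assumptions, not the original implementation.

```python
def addr_to_int(addr):
    """Convert a dotted IPv4 address such as '172.16.104.207' to an integer.

    Hypothetical stand-in for the helper mentioned in the question.
    """
    octets = [int(part) for part in addr.split(".")]
    if len(octets) != 4 or any(o < 0 or o > 255 for o in octets):
        raise ValueError("not a dotted IPv4 address: %r" % addr)

    value = 0
    for octet in octets:
        # Shift the accumulated value one byte left and append the next octet
        value = (value << 8) | octet
    return value


print(addr_to_int("172.16.104.207"))
```

Using the integer as the dictionary key means the records can later be sorted numerically instead of as strings.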

DHCP dump

Unfortunately I cannot include the real DHCP server dump due to company policy. But the lines I'm trying to get out of the file look like this:

Dhcp Server \\servername.company.local Scope 172.16.104.0 Add reservedip 172.16.104.207 003386dd00gg "hostname.company.local" "Host Description" "BOTH"
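
To make the field numbering concrete, here is a small sketch that runs shlex.split() on the sample line above, assuming the line has already been read as ordinary text. The variable names are only illustrative.

```python
import shlex

# The sample line from the dump; raw strings keep the two leading backslashes
line = (r'Dhcp Server \\servername.company.local Scope 172.16.104.0 '
        r'Add reservedip 172.16.104.207 003386dd00gg '
        r'"hostname.company.local" "Host Description" "BOTH"')

# shlex.split() splits on whitespace but keeps quoted substrings together
fields = shlex.split(line)

# Fields 7 to 10 are the ones the script stores for each reservation
ip, mac, hostname, description = fields[7:11]
print(ip, mac, hostname, description)
```

Because the quoted description stays in one field, the indices remain stable even when a description contains spaces.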

Thanks in advance, Pascal

JonnyJD
  • So is it a binary file or a text file? – Lev Levitsky Dec 17 '12 at 09:50
  • As I understand it, either the encoding of the generated DHCP file is not what the Python script expects, or the line endings are the issue: end-of-line characters differ between Windows and Unix, and this could change how your for loop splits the file into lines. You said that you saved the very same file as a Unix text file and could then read its content properly, so I think the output file has to be generated in a "Unix compatible" way – dariyoosh Dec 17 '12 at 09:50
  • @LevLevitsky, I'm not sure. The file contains text (sort of like a config file for DHCP, produced by the PowerShell command: netsh dhcp server scope dump > file.txt) – Pascal Van Acker Dec 17 '12 at 10:01
  • @dariyoosh, I had also done some testing with the script. As far as I could tell, it could iterate over the file line by line, but the problem was with searching for a keyword within the line. I was using the statement: if "reservedip" in line: – Pascal Van Acker Dec 17 '12 at 10:02

3 Answers

One way to rule out a problem with the end-of-line characters is to convert them to Unix style using re:

import re

dhcp_file = open( path_to_dhcp_file, 'r' )
for line in dhcp_file:
    # Change the end-of-line characters to Unix style
    line = re.sub( "\r\n", r"\n", line )

    # now do your things on line
Nemelis
  • Thanks, but I have already tried this and it does not work. The problem does not appear to be with the line breaks, but with the string search. (I use **if "reservedip" in line:** and this does not return any lines) – Pascal Van Acker Dec 17 '12 at 10:36
  • @Pascal Van Acker, if possible, post your script code here; it might help us understand what causes the problem. A sample of your DHCP file would also be helpful for testing. – dariyoosh Dec 17 '12 at 10:55
  • @Nemelis, I've added some info to the post itself. Hope it helps – Pascal Van Acker Dec 17 '12 at 11:25

Based on the two lines you provided as an example of the content of your DHCP dump file, I made the following test case. (For clarity, I added l1, l2, l3, ... at the beginning of each line in this example, referring to the line number.)

So here is the dump file that I created on Linux Fedora Core 17 (x86_64) data.txt:

l1: Dhcp Server \\servername.company.local Scope 172.16.104.0 Add reservedip 172.16.104.207 
l2: 003386dd00gg "hostname.company.local" "Host Description" "BOTH"
l3: Dhcp Server \\servername.company.local Scope 172.16.104.0 Add reservedip 172.16.104.207 
l4: 003386dd00gg "hostname.company.local" "Host Description" "BOTH"
l5: Dhcp Server \\servername.company.local Scope 172.16.104.0 Add  172.16.104.207 
l6: 003386dd00gg "hostname.company.local" "Host Description" "BOTH"
l7: Dhcp Server \\servername.company.local Scope 172.16.104.0 Add  172.16.104.207 
l8: 003386dd00gg "hostname.company.local" "Host Description" "BOTH"
l9: Dhcp Server \\servername.company.local Scope 172.16.104.0 Add reservedip 172.16.104.207 
l10: 003386dd00gg "hostname.company.local" "Host Description" "BOTH"  

You said that:

also note that I was unable to use Linux's grep command to search in the file

Here is what I get when I run a grep with the above sample file

$ cat data.txt | grep reservedip
l1: Dhcp Server \\servername.company.local Scope 172.16.104.0 Add reservedip 172.16.104.207 
l3: Dhcp Server \\servername.company.local Scope 172.16.104.0 Add reservedip 172.16.104.207 
l9: Dhcp Server \\servername.company.local Scope 172.16.104.0 Add reservedip 172.16.104.207 
$ 

And here is also the test that I did with a python script in order to check whether the script is able to find the key word "reservedip" in the sample file:

lineNumber = 0
with open("./data.txt") as dhcpDumpFile:
    for line in dhcpDumpFile:
        lineNumber += 1
        if "reservedip" in line:
            print("Found 'reservedip' at the line: ", lineNumber)

And the result that I get is:

$ python -tt myscript.py
("Found 'reservedip' at the line: ", 1)
("Found 'reservedip' at the line: ", 3)
("Found 'reservedip' at the line: ", 9)
$

So, it works for me.

Regards,

Dariyoosh

dariyoosh
  • Thanks for your efforts, @dariyoosh. As stated, I'm sure it's going to be related to the way PowerShell redirects the DHCP dump info to a text file, because when I open the file with gedit and save it in Unix format, it works for me as well. But I'm trying to automate this whole process so I can just display a webpage or spreadsheet with up-to-date DHCP information. – Pascal Van Acker Dec 17 '12 at 12:19

Possibly the strings in the file are not stored in an ASCII-compatible character encoding. UTF-8 and Latin-1 are compatible, since they use exactly one byte for every character that is in ASCII. UTF-16 and UTF-32 are not compatible: they always use more than one byte per character. UTF-16 is not rare in MS files, and sometimes files are even mixed.

Possibly the dump uses 2 bytes per character, even for ASCII characters. Then you would have r~e~s~e~r~v~e~d~i~p in the file, with ~ being some other byte (it can also be ~r or even ~~, which still encodes to r).
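
A small sketch of that effect, assuming the dump is little-endian UTF-16 (an assumption at this point in the thread): the ASCII bytes of the keyword never appear next to each other in the raw data, so a plain byte search fails until the data is decoded.

```python
keyword = "reservedip"

# Encode a fragment of a dump line the way a UTF-16 (little-endian) file stores it
raw = ("Add " + keyword + " 172.16.104.207").encode("utf-16-le")

# Every ASCII character is followed by a NUL byte, so the plain keyword is absent
ascii_hit = keyword.encode("ascii") in raw
print(ascii_hit)

# Searching for the keyword in the same encoding, or decoding first, does work
utf16_hit = keyword.encode("utf-16-le") in raw
decoded_hit = keyword in raw.decode("utf-16-le")
print(utf16_hit, decoded_hit)
```

This would also explain why grep, which compares bytes, finds nothing in such a file.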

Just a wild guess, since you are not allowed to post the actual file and I don't know anything about MS DHCP server dumps.

What does

file file.txt

give you?

What about

file --mime-type --mime-encoding file.txt

That won't necessarily tell you the encoding if it is a "mixed" binary/strings file, but if it is plain UTF/ASCII text, it should tell you.
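
If the check should happen inside the script rather than on the command line, a rough equivalent is to look for a byte order mark (BOM) at the start of the file. This is only a sketch: sniff_bom is a hypothetical helper name, and a file without a BOM simply yields None, so it is much less thorough than the file utility.

```python
import codecs


def sniff_bom(data):
    """Return a codec name guessed from a leading byte order mark, or None."""
    # UTF-32 is checked before UTF-16 because the UTF-32-LE BOM
    # begins with the same two bytes as the UTF-16-LE BOM.
    candidates = [
        (codecs.BOM_UTF32_LE, "utf-32"),
        (codecs.BOM_UTF32_BE, "utf-32"),
        (codecs.BOM_UTF8, "utf-8-sig"),
        (codecs.BOM_UTF16_LE, "utf-16"),
        (codecs.BOM_UTF16_BE, "utf-16"),
    ]
    for bom, name in candidates:
        if data.startswith(bom):
            return name
    return None


# In-memory stand-in for the first bytes of a UTF-16 dump file
sample = codecs.BOM_UTF16_LE + "reservedip".encode("utf-16-le")
print(sniff_bom(sample))
```

For a real file, the first four bytes read in binary mode (open(path, "rb").read(4)) are enough input for this check.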

JonnyJD
  • server1.txt: UTF-8 Unicode text; server2.txt: Little-endian UTF-16 Unicode text, with CRLF, CR line terminators. (The first txt file is the one I converted using gedit, so it looks like you are right and the encoding of the file is the problem. Any idea how to solve it?) – Pascal Van Acker Dec 17 '12 at 12:47
  • Nevermind the above: http://stackoverflow.com/questions/8827419/converting-utf-16-utf-8-and-remove-bom – Pascal Van Acker Dec 17 '12 at 12:50
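
Given the UTF-16 result reported above, one way to avoid the manual gedit step is to let Python decode the file while reading it. The following is a minimal sketch, not the approach from the linked question: it assumes the dump starts with a BOM (which the "utf-16" codec consumes), the function name reserved_lines is hypothetical, and the temporary file is only a stand-in for the real dump.

```python
import io
import os
import tempfile


def reserved_lines(path, encoding="utf-16"):
    """Return the lines of a text file that contain 'reservedip'."""
    with io.open(path, "r", encoding=encoding) as dump:
        return [line.rstrip("\r\n") for line in dump if "reservedip" in line]


# Stand-in for the real dump: one reservation line and one unrelated line
sample = (u"Dhcp Server \\\\servername.company.local Scope 172.16.104.0 "
          u"Add reservedip 172.16.104.207 003386dd00gg "
          u"\"hostname.company.local\" \"Host Description\" \"BOTH\"\r\n"
          u"# a line without the keyword\r\n")

fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
try:
    # Write UTF-16 with a BOM and CRLF line endings, like the Windows dump
    with io.open(path, "w", encoding="utf-16", newline="") as out:
        out.write(sample)
    matches = reserved_lines(path)
finally:
    os.remove(path)

print(len(matches))
```

With the file decoded on the way in, the original "reservedip" in line test and the shlex.split() call can stay as they are.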