
I know little about PHP, so I decided that building a web-based tool for generating Red Hat kickstart files would be a good project to learn with. Among other things, the tool will parse a CSV file and generate a table from its data. The input file is in the following format:

host1,10.153.196.248,255.255.255.0,10.153.196.1,00:50:56:ac:69:cb
,10.153.157.113,255.255.255.128,10.153.157.1,
,10.153.157.241,255.255.255.128,10.153.157.129,
,/home,10,,,
,swap,10,,,
,/opt,60,,,
,/data,30,,,
,,,,,
host2,10.153.155.124,255.255.255.128,10.153.155.1,00:50:56:ac:69:ce
,10.153.157.114,255.255.255.128,10.153.157.1,
,10.153.157.242,255.255.255.128,10.153.157.129,
,/home,10,,,
,swap,10,,,
,/opt,60,,,
,/data,30,,,
,,,,,

Each section of text represents the information for one server. The fields are as follows:

hostname,eth0 IP, eth0 netmask, eth0 gateway, eth4 MAC
null,eth1 IP, eth1 netmask, eth1 gateway, null
null,eth2 IP, eth2 netmask, eth2 gateway, null
null,partition name, partition size in GB, null, null
null,partition name, partition size in GB, null, null
null,partition name, partition size in GB, null, null
null,partition name, partition size in GB, null, null
null,null,null,null,null

At the moment, I can parse it and generate a table with each row in the input file being a row in the table. The function that handles this:

function processFile($workFile) {

    if (file_exists($workFile)) {
        print '<table>';
        $fh = fopen($workFile, 'rb');
        if ($fh) {
            // Test the result of fgets() instead of feof(), so that a final
            // line without a trailing newline is not skipped.
            while (($line = fgets($fh)) !== false) {
                $line = trim($line);
                $info = explode(',', $line);
                print '<tr><td>' . $info[0] . '</td><td>' . $info[1] . '</td><td>' . $info[2] . '</td><td>' . $info[3] . '</td></tr>';
            }
            fclose($fh);
        } else {
            print "Failed to open $workFile";
        }
        print '</table>';
    } else {
        print "File $workFile does not exist";
    }

}

Which generates:

host1     eth0 IP      eth0 netmask      eth0 gateway
          eth1 IP      eth1 netmask      eth1 gateway
          eth2 IP      eth2 netmask      eth2 gateway
          partition 1  partition 1 size
          partition 2  partition 2 size
          partition 3  partition 3 size
          partition 4  partition 4 size
host2     eth0 IP      eth0 netmask      eth0 gateway
          eth1 IP      eth1 netmask      eth1 gateway
          eth2 IP      eth2 netmask      eth2 gateway
          partition 1  partition 1 size
          partition 2  partition 2 size
          partition 3  partition 3 size
          partition 4  partition 4 size

This is a start. However, not every server will have four partitions: some will have many more, and some will have one or two fewer. Not knowing the count ahead of time gets in the way of what I want to do, which is to add a row below the partition information for each server and probably break each server out into its own table. Something along the lines of this:

host1     eth0 IP          eth0 netmask      eth0 gateway
          eth1 IP          eth1 netmask      eth1 gateway
          eth2 IP          eth2 netmask      eth2 gateway
          partition 1      partition 1 size
          partition 2      partition 2 size
          partition 3      partition 3 size
          partition 4      partition 4 size
          partition 5      partition 5 size
          partition 6      partition 6 size
          How many disks?  [Text Field]

host2     eth0 IP          eth0 netmask      eth0 gateway
          eth1 IP          eth1 netmask      eth1 gateway
          eth2 IP          eth2 netmask      eth2 gateway
          partition 1      partition 1 size
          partition 2      partition 2 size
          partition 3      partition 3 size
          partition 4      partition 4 size
          How many disks?  [Text Field]

My prevailing thought is that I'll need a field in the CSV file to indicate that a row contains partition information. It seems the easiest approach. I'm wondering, though, whether there is another way that doesn't require altering the format of the input file.

I'll also have to figure out how to use the line containing all null fields as the section delimiter.
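
One way to treat the all-empty line as a section delimiter is sketched below. It is only an illustration: the helper name `splitSections` is hypothetical, and it assumes the file has already been read into an array of lines (for example with `file()`). A row counts as a delimiter when every field is empty, which also covers a completely blank line.

```php
<?php
// Hypothetical helper: group CSV lines into one list of rows per server,
// using any row whose fields are all empty as the section delimiter.
function splitSections(array $lines): array
{
    $sections = [];
    $current  = [];
    foreach ($lines as $line) {
        $fields = str_getcsv(trim($line), ',', '"', '\\');
        // ",,,,," and "" both join to the empty string.
        if (implode('', $fields) === '') {
            if ($current) {
                $sections[] = $current; // close the section in progress
                $current = [];
            }
            continue;
        }
        $current[] = $fields;
    }
    if ($current) {
        $sections[] = $current; // last section may lack a trailing delimiter
    }
    return $sections;
}

// Usage with a shortened version of the sample input.
$lines = [
    'host1,10.153.196.248,255.255.255.0,10.153.196.1,00:50:56:ac:69:cb',
    ',/home,10,,,',
    ',,,,,',
    'host2,10.153.155.124,255.255.255.128,10.153.155.1,00:50:56:ac:69:ce',
    ',/home,10,,,',
    ',swap,10,,,',
    ',,,,,',
];
$sections = splitSections($lines);
echo count($sections), "\n"; // prints 2
```

Because each section is collected before anything is printed, the number of partition rows per server does not need to be known in advance.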

Any thoughts on how I can approach this will be appreciated.

j0k
theillien
  • Why wouldn't you use a MySQL database for this? – nathan hayfield Oct 30 '12 at 21:43
  • Perhaps the data is being imported from another source. – Flosculus Oct 30 '12 at 22:21
  • The purpose of the tool is to take a CSV file which is created from an existing spreadsheet and use it to generate kickstart files. After the files are generated the CSV is unnecessary. It will likely be retained for archaeological purposes, but otherwise, persistence in the form of a database is overkill. If the CSV is lost, it can be recreated from the spreadsheet if it is ever needed again as unlikely as that would be. – theillien Oct 31 '12 at 03:45

2 Answers

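// $fh is the handle already opened with fopen() in the question's function.
echo('<table>');
$i = 0;
// Compare against false explicitly so the loop only stops at end of file.
while (($data = fgetcsv($fh)) !== false) {
    echo('<tr>');
    // The first row is rendered as a header row.
    $cellTag = 'td';
    if ($i == 0) {
        $cellTag = 'th';
    }
    $i++;
    foreach ($data as $cell) {
        // Escape each value; a blank line yields a single null cell.
        echo('<'.$cellTag.'>'.htmlspecialchars((string) $cell).'</'.$cellTag.'>');
    }
    echo('</tr>');
}
echo('</table>');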

Give that a try. Let me know if it doesn't work first time. I'd have tested it, but I don't have a CSV file to hand.

Flosculus
  • That worked. Although, I did comment out the section that creates the table header. I'm not presently concerned about that. I also need to be able to skip the display of the MAC address, though. Otherwise, that seems to allow for arbitrary rows of partition data. Thanks. – theillien Oct 31 '12 at 03:24
  • Now I just need to figure out how to add the one row below each section of host data that allows for the entry of disk number:`print "Number of disks:";` – theillien Oct 31 '12 at 03:34
  • Using another post here at StackOverflow (http://stackoverflow.com/a/5040899/1602022) I figured out how to look for the lines that have no entries and to use those to determine when to insert the row which contains the input field for the number of disks. – theillien Oct 31 '12 at 20:46

As you are parsing data that should match a particular format, it should be regular in nature, allowing you to use regular expressions to strictly match the input you want (and throw away what you don't):

$rows = explode("\n", $input);

// @link http://stackoverflow.com/questions/106179/regular-expression-to-match-hostname-or-ip-address
$regIp = "(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])";
$regHost = "(([a-zA-Z]|[a-zA-Z][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*([A-Za-z]|[A-Za-z][A-Za-z0-9\-]*[A-Za-z0-9])";

$hosts = array();
foreach ($rows as $row) {
  $matches = null;
  if (preg_match("/(?<host>$regHost),(?<ip>$regIp),(?<netmask>$regIp),(?<gateway>$regIp),(?<mac>.*)/", $row, $matches)) {
    $host = $matches['host'];
    $hosts[$host]['name'] = $matches['host'];
    $hosts[$host]['mac'] = $matches['mac'];
    $hosts[$host]['eth'][] = array(
      'ip'      => $matches['ip'],
      'netmask' => $matches['netmask'],
      'gateway' => $matches['gateway'],
    );
  } else if (preg_match("/,(?<ip>$regIp),(?<netmask>$regIp),(?<gateway>$regIp),/", $row, $matches)) {
    $hosts[$host]['eth'][] = array(
      'ip'      => $matches['ip'],
      'netmask' => $matches['netmask'],
      'gateway' => $matches['gateway'],
    );
  } else if (preg_match("/,(?<name>.+),(?<size>\d+),,,/", $row, $matches)) {
    $hosts[$host]['partition'][] = array(
      'name' => $matches['name'],
      'size' => $matches['size'],
    );
  } else if (preg_match("/,,,,,/", $row)) {
    // we already partition output array with value of `$host` variable.
    echo "Found terminating row\n"; 
  } else {
    echo "Unrecognised data on row: $row\n";
  }
}

var_export($hosts);

Since we are dealing with a CSV file, one could also use $fields = str_getcsv($row) inside the loop to get an array containing each field. One would then need to count how many fields the array had, how many were empty, and validate what each field contained.
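
For comparison, here is a rough sketch of that `str_getcsv()` route. The function name `classifyRow` and its return labels are made up for illustration, and the rules are assumptions drawn from the sample input: a row whose second field is an IPv4 address is a host row if the first field is filled in and an interface row otherwise, and a row with an empty first field, a name, and a numeric size is a partition row.

```php
<?php
// Hypothetical helper: label one CSV row as 'host', 'eth', 'partition',
// 'blank' or 'unknown' by inspecting its first three fields.
function classifyRow(string $row): string
{
    $fields = str_getcsv(trim($row), ',', '"', '\\');
    if (implode('', $fields) === '') {
        return 'blank'; // every field empty: the section delimiter
    }
    $name   = (string) ($fields[0] ?? '');
    $second = (string) ($fields[1] ?? '');
    $third  = (string) ($fields[2] ?? '');

    // filter_var() returns false when the value is not a valid IPv4 address.
    $isIp = filter_var($second, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) !== false;
    if ($isIp) {
        return $name !== '' ? 'host' : 'eth';
    }
    // Partition rows: no hostname, a mount point or "swap", and a size in digits.
    if ($name === '' && $second !== '' && preg_match('/^\d+$/', $third) === 1) {
        return 'partition';
    }
    return 'unknown';
}

echo classifyRow(',/home,10,,,'), "\n"; // prints partition
```

This only checks the first three fields; the netmask, gateway, and MAC columns would need their own checks for the same strictness as the regular expressions.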

Using a regular expression directly on the string representation of each CSV row allows us to strictly match all of the above in a single expression. The (?<var>...) parts of each regular expression are named capture groups. Using the input provided by OP, the above script outputs:

Found terminating row
Found terminating row
array (
  'host1' => 
  array (
    'name' => 'host1',
    'mac' => '00:50:56:ac:69:cb',
    'eth' => 
    array (
      0 => 
      array (
        'ip' => '10.153.196.248',
        'netmask' => '255.255.255.0',
        'gateway' => '10.153.196.1',
      ),
      1 => 
      array (
        'ip' => '10.153.157.113',
        'netmask' => '255.255.255.128',
        'gateway' => '10.153.157.1',
      ),
      2 => 
      array (
        'ip' => '10.153.157.241',
        'netmask' => '255.255.255.128',
        'gateway' => '10.153.157.129',
      ),
    ),
    'partition' => 
    array (
      0 => 
      array (
        'name' => '/home',
        'size' => '10',
      ),
      1 => 
      array (
        'name' => 'swap',
        'size' => '10',
      ),
      2 => 
      array (
        'name' => '/opt',
        'size' => '60',
      ),
      3 => 
      array (
        'name' => '/data',
        'size' => '30',
      ),
    ),
  ),
  'host2' => 
  array (
    'name' => 'host2',
    'mac' => '00:50:56:ac:69:ce',
    'eth' => 
    array (
      0 => 
      array (
        'ip' => '10.153.155.124',
        'netmask' => '255.255.255.128',
        'gateway' => '10.153.155.1',
      ),
      1 => 
      array (
        'ip' => '10.153.157.114',
        'netmask' => '255.255.255.128',
        'gateway' => '10.153.157.1',
      ),
      2 => 
      array (
        'ip' => '10.153.157.242',
        'netmask' => '255.255.255.128',
        'gateway' => '10.153.157.129',
      ),
    ),
    'partition' => 
    array (
      0 => 
      array (
        'name' => '/home',
        'size' => '10',
      ),
      1 => 
      array (
        'name' => 'swap',
        'size' => '10',
      ),
      2 => 
      array (
        'name' => '/opt',
        'size' => '60',
      ),
      3 => 
      array (
        'name' => '/data',
        'size' => '30',
      ),
    ),
  ),
)

As we now have an array of properly structured data, it should be trivial to output the data in any format required (e.g. an HTML table).
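
As an illustration of that last step, the sketch below renders one table per host and appends the "How many disks?" row the question asks for. The function name `renderHostTables` and the `disks[...]` input name are assumptions, not part of the original code; the input array is assumed to have the shape shown in the `var_export()` output.

```php
<?php
// Hypothetical renderer: one HTML table per host, built from an array shaped
// like the var_export() output (keys 'name', 'eth', 'partition').
function renderHostTables(array $hosts): string
{
    $esc  = fn($value) => htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');
    $html = '';
    foreach ($hosts as $host) {
        $html .= "<table>\n";
        foreach ($host['eth'] ?? [] as $i => $eth) {
            // Only the first interface row carries the host name.
            $label = $i === 0 ? $host['name'] : '';
            $html .= '<tr><td>' . $esc($label) . '</td><td>' . $esc($eth['ip'])
                . '</td><td>' . $esc($eth['netmask']) . '</td><td>'
                . $esc($eth['gateway']) . "</td></tr>\n";
        }
        // Any number of partition rows is handled by the loop.
        foreach ($host['partition'] ?? [] as $partition) {
            $html .= '<tr><td></td><td>' . $esc($partition['name']) . '</td><td>'
                . $esc($partition['size']) . "</td><td></td></tr>\n";
        }
        // Trailing input row; the field name "disks[<host>]" is an assumption.
        $html .= '<tr><td></td><td>How many disks?</td><td><input type="text" name="disks['
            . $esc($host['name']) . ']"></td><td></td></tr>' . "\n</table>\n";
    }
    return $html;
}

// Usage with a cut-down version of the parsed data.
$hosts = [
    'host1' => [
        'name' => 'host1',
        'mac'  => '00:50:56:ac:69:cb',
        'eth'  => [['ip' => '10.153.196.248', 'netmask' => '255.255.255.0', 'gateway' => '10.153.196.1']],
        'partition' => [['name' => '/home', 'size' => '10'], ['name' => 'swap', 'size' => '10']],
    ],
    'host2' => [
        'name' => 'host2',
        'mac'  => '00:50:56:ac:69:ce',
        'eth'  => [['ip' => '10.153.155.124', 'netmask' => '255.255.255.128', 'gateway' => '10.153.155.1']],
        'partition' => [['name' => '/opt', 'size' => '60']],
    ],
];
$html = renderHostTables($hosts);
echo $html;
```

The MAC address is simply not printed, which matches the requirement in the comments to leave it out of the displayed table.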

deizel.
  • The regex route seems far more complicated than is necessary. I'm already able to output the data in the format I want. I simply need to know how to work with an arbitrary number of rows for each set of partitions. I'll look into your suggestion regarding str_getcsv(). – theillien Oct 31 '12 at 03:21