0

I'm coding a plugin that runs everyday at 5am. It combines multiple csv files (That have a txt extension).

Currently, it is working... HOWEVER, the output format is incorrect.

The input will look like this:

"","","","","email@gmail.com","PARK PLACE 109 AVE","SOME RANDOM DATA","","","",""

And so on. this is only a partial row.

The ouput of this code does not retun the same format. It produces something like this without the " in columns without data

,,,,email@gmail.com,"PARK PLACE 109 AVE","SOME RANDOM DATA",,,,

Here is the part of the function that combines everything:

function combine_and_email_csv_files() {
  // Get the current time and date
  $now = new DateTime();
  $date_string = $now->format('Y-m-d_H-i-s');

  // Get the specified directories
  $source_directory = get_option('csv_file_combiner_source_directory');
  $destination_directory = get_option('csv_file_combiner_destination_directory');

  // Load the CSV files from the source directory
  $csv_files = glob("$source_directory/*.txt");

  // Create an empty array to store the combined CSV data
  $combined_csv_data = array();

  // Loop through the CSV files
  foreach ($csv_files as $file) {
    // Load the CSV data from the file
    $csv_data = array_map('str_getcsv', file($file));

    // Add the CSV data to the combined CSV data array
    $combined_csv_data = array_merge($combined_csv_data, $csv_data);
  }

  // Create the combined CSV file
  $combined_csv_file = fopen("$destination_directory/$date_string.txt", 'w');

  // Write the combined CSV data to the file
  foreach ($combined_csv_data as $line) {
    fputcsv($combined_csv_file, $line);
  }

  // Close the combined CSV file
  fclose($combined_csv_file);
}

No matter, what I've tried... it's not working. I'm missing something simple I know.

JShobbyist
  • 532
  • 3
  • 9
  • 23
Kyle A
  • 11
  • 6
  • Not sure if https://stackoverflow.com/questions/2489553/forcing-fputcsv-to-use-enclosure-for-all-fields helps – Nigel Ren Dec 16 '22 at 19:41
  • Thank you! yes, it took some time... but i re-coded it to use fputs instead of fputcsv. Thanks a bunch for pointing my nose in the right direction.... – Kyle A Dec 16 '22 at 20:42

2 Answers2

0

Thank you Nigel!

So this thread, Forcing fputcsv to Use Enclosure For *all* Fields helped me get there....

Using fputs instead of fputscsv and force "" on null values is the short answer for me. Works beautifully... code is below:


function combine_and_email_csv_files() {
  // Get the current time and date
  $now = new DateTime();
  $date_string = $now->format('Y-m-d_H-i-s');

  // Get the specified directories
  $source_directory = get_option('csv_file_combiner_source_directory');
  $destination_directory = get_option('csv_file_combiner_destination_directory');

  // Load the CSV files from the source directory
  $csv_files = glob("$source_directory/*.txt");

  // Create an empty array to store the combined CSV data
  $combined_csv_data = array();

  // Loop through the CSV files
  foreach ($csv_files as $file) {
    // Load the CSV data from the file
    $csv_data = array_map('str_getcsv', file($file));

    // Add the CSV data to the combined CSV data array
    $combined_csv_data = array_merge($combined_csv_data, $csv_data);
  }

  // Create the combined CSV file
  $combined_csv_file = fopen("$destination_directory/$date_string.txt", 'w');

  // Write the combined CSV data to the file
  foreach ($combined_csv_data as $line) {
    // Enclose each value in double quotes
    $line = array_map(function($val) {
      if (empty($val)) {
        return "\"\"";
      }
      return "\"$val\"";
    }, $line);

    // Convert the line array to a CSV formatted string
    $line_string = implode(',', $line) . "\n";

    // Write the string to the file
    fputs($combined_csv_file, $line_string);
  }

Kyle A
  • 11
  • 6
  • You need to do more than just wrap quotes around strings, you also need to escape quotes _in_ the strings. You may also want to differentiate between empty strings, nulls, empty arrays, and other things that PHP classifies as "empty". eg: `empty(0) === true` – Sammitch Dec 16 '22 at 21:33
  • The current version of this has been sent off for QA testing... We'll see. It seems to look fine though.... I can compare the original txt/csv in my IDE and they are exact... but combined, new lines are appropriate, etc.. checking in excel or the like, loads perfectly, identical information. What exactly do I gain from this extra approach and why is it needed? I'm already checking if it's empty... – Kyle A Dec 16 '22 at 23:12
  • If QA doesn't fail this code, then your QA dept fails at their job. Your code will change `0` and `'0'` to `""` and that is definitively a bug. The other things I mentioned are just good ideas. – Sammitch Dec 16 '22 at 23:19
  • QA of the output file.... Not the code. Sure, I see what you're saying. Could you care to show, where and how you would implement using 'empty(0) === true' ? I'm sorry... I'm fresh; be gentle. I was exited i got it working after the first hurdle tbh. Also, in the description of my problem I give the example input... There is never a time in those input files there would be just 0 or '0' It is possible there could be a "0" but, I'm pretty sure that would be invalid data rejected before it became this txt/csv – Kyle A Dec 17 '22 at 00:14
  • From [the docs](https://www.php.net/manual/en/function.empty.php): "Returns true if var does not exist or has a value that is empty or equal to zero, aka falsey, see [conversion to boolean](https://www.php.net/manual/en/language.types.boolean.php#language.types.boolean.casting)." Practical example: https://3v4l.org/U7A0j – Sammitch Dec 17 '22 at 00:42
  • Would this work then? `$line = array_map(function($val) { if (is_null($val)) { return "\"NULL\""; } elseif (gettype($val) === 'array' && empty($val)) { return "\"ARRAY\""; } elseif (empty($val)) { return "\"\""; } return "\"" . addslashes($val) . "\""; }, $line);` – Kyle A Dec 17 '22 at 00:55
  • I think that an important question is: does the `,,` vs `"","",""` even matter? Both are valid CSV regardless of how they look. Does the unmolested CSV decode properly on the other side of this process? – Sammitch Dec 17 '22 at 01:04
  • It has to be in that spec... It's going into an OLD fulfillment system (Whatever that means, I wish I had more detail) but the output I gave them that was the original code that output `,,,` Unfortunately was rejected immediately, this is the spec. `"","","",""` Hence, my frustration... – Kyle A Dec 17 '22 at 01:10
  • And I'll also clarify... The unmolested CSV from the first version of this code works perfectly outside of the fact when they open the file they see it does not contain `"",""`. That's the first test to pass... Then the second test is QA with prior days worth of csvs and mine... They sent that second versions output to QA testing... :/ – Kyle A Dec 17 '22 at 01:22
  • Ok, looking at your question fresh. You are not modifying the data in any way, you're just concatenating two files, but with extra steps that are breaking the format required by your legacy system. Skip literally everything to do with CSV and just dump all the file contents into the destination file as-is. The only extra thing you might want to do is check that each file ends with a newline so that you don't have the last/first lines joined between files. – Sammitch Dec 17 '22 at 01:25
  • I tried writing an encoder, but just handling the "string contains a `"`, but is not already escaped in one of two different ways" case is a nightmare, my code has deteriorated into an un-runnable mess, I've utterly forgotten that I'm at work, and I can't spend any more time on this right now. – Sammitch Dec 17 '22 at 01:28
  • lol, thank you for the time Sammitch... It's greatly appreciated. My work ended 6 hours ago and I'm still thinking about it.... :/ – Kyle A Dec 17 '22 at 01:43
  • Simplicity is the ultimate sophistication... It works much better this way. Why must I always complicate things... Thank you! I went out on a limb and forwarded this file to QA just to be sure... And honestly, I'm checking file size, this ends up 1KB larger, I'm guessing it did what you said prior. This is the sure fire way for my specific solution anyways. Maybe I can stop thinking about work now.... – Kyle A Dec 17 '22 at 02:16
0

Thank you Sammitch

After much haggling with this problem... Sammitch pointed out why not just concat the files... Simplicity is the ultimate sophistication... right?

*Note: this will only work for my specific circumstance. All I'm doing now is concating the files and checking each file ends with a new line and just plain skipping the csv manipulation.

Code below:

function combine_and_email_csv_files() {
  // Get the current time and date
  $now = new DateTime();
  $date_string = $now->format('Y-m-d_H-i-s');

  // Get the specified directories
  $source_directory = get_option('csv_file_combiner_source_directory');
  $destination_directory = get_option('csv_file_combiner_destination_directory');

  // Load the files from the source directory
  $files = glob("$source_directory/*.txt");

  // Create the combined file
  $combined_file = fopen("$destination_directory/$date_string.txt", 'w');

  // Loop through the files
  foreach ($files as $file) {
    // Read the contents of the file
    $contents = file_get_contents($file);

    // Ensure that the file ends with a newline character
    if (substr($contents, -1) != "\n") {
      $contents .= "\n";
    }

    // Write the contents of the file to the combined file
    fwrite($combined_file, $contents);
  }

  // Close the combined file
  fclose($combined_file);
Kyle A
  • 11
  • 6