2

Hi Im trying to make a function that takes in a CSV file then insert it into the file whitch works but not to what i need it to do.

at the moment i am using explode to get the rows from the file..

explode(",", $linearray);

this workes but if there is something like

field1,field2,field3,field4,"some text, some other text",field6

i get this array

array(
[0]=>field1,
[1]=>field2,
[2]=>field3,
[3]=>field4,
[4]=>"some text,
[5]=>some other text",
[6]=>field6
)

which is not the outcome i want. i know that preg_split can do it for me but im not that good at regular expressions. the outcome i want is.

field1,field2,field3,field4,"some text, some other text",field6

array(
[0]=>field1,
[1]=>field2,
[2]=>field3,
[3]=>field4,
[4]=>some text, some other text,
[5]=>field6
)

please help.

functions for CSV file from a PHP CLASS i have written

    $lineseparator = "\n";
    $fieldseparator = "\n";

function ReadFile(){
    $this->csvcontent = fread($this->_file,$this->size);
    fclose($this->_file);
    return ($this->csvcontent)? true : false ;
}
function InsertFileToSQL(){
    $query = "";
    $i_count = 0;
    $queries = "";
    $linearray = array();
    $file_array = explode($this->lineseparator,$this->csvcontent);
    $lines = count($file_array);
    foreach($file_array as $key => $value) {
        $value = trim($value," \t");
        $value = str_replace("\r","",$value);
        /***********************************************************************************************************
        This line escapes the special character. remove it if entries are already escaped in the csv file
        ************************************************************************************************************/
        $value = str_replace("'","\'",$value);
        $value = str_replace("\"","",$value);
        /***********************************************************************************************************/

        $linearray = explode($this->fieldseparator,$value);

        foreach($linearray as $key2 => $value2){
            // Format all fields that match a date format the Reformat for SQL.
            $date = explode("/", $value2);
            if(count( $date ) == 3 ){
                $linearray[$key2] = $date[2]."-".$date[1]."-".$date[0];
            }
        }

        $linemysql = implode("','",$linearray);
        if($linemysql != "" && $linemysql != NULL){
            if($this->csvheader ){
                if($key != 0){
                    if($this->addauto == 1){
                        $query = "INSERT INTO `$this->db_table` VALUES (NULL,'$linemysql');";
                    }else{
                        $query = "INSERT INTO `$this->db_table` VALUES ('$linemysql');";
                    }
                }else{
                    $lines--;
                }
                $insert = mysql_query($query) or die(mysql_error());
                if($insert){
                    $queries .= $query . "\n";
                    $i_count++;
                }

            }else{
                if($this->addauto == 1){
                    $query = "INSERT INTO `$this->db_table` VALUES (NULL,'$linemysql');";
                }else{
                    $query = "INSERT INTO `$this->db_table` VALUES ('$linemysql');";
                }
                $insert = mysql_query($query) or die((mysql_error()." in QUERY: ".$query));
                if($insert){
                    $queries .= $query . "\n";
                    $i_count++;
                }

            }
        }else{
            $this->null_row++;
            $lines--;
        }


    }
    if($this->save) {
        $f = fopen($this->output_location.$this->outputfile, 'a+');

        if ($f) {
          @fputs($f, $queries);
          @fclose($f);
        }else{
            echo("Error writing to the output file.", 'error');
        }

    }
    $lines--;//fix array count
    $text = "";
    if($i_count - $this->null_row  != 0){$i_count = $i_count - $this->null_row ;$text .= "<br>$i_count Records were inserted Successfully.";}
    echo("Found a Total of $lines Record(s) in this csv file.<br>$this->null_row Record(s) were/are Blank or Null.$text", 'success');
}
MasterT
  • 623
  • 1
  • 9
  • 23
  • 4
    str_getcsv() is better for this - http://www.php.net/manual/en/function.str-getcsv.php; or if you're reading from the CSV file, then use fgetcsv() - http://www.php.net/manual/en/function.fgetcsv.php – Mark Baker May 10 '13 at 12:28
  • 1
    http://stackoverflow.com/questions/2805427/how-to-extract-data-from-csv-file-in-php Remember to look for someone who's already asked the same question! – Wolfman Joe May 10 '13 at 12:29
  • i know that there is alot of questions but none with the answer i need. PHP functions on CSV files works but not correctly on the server its on. that why im asking for the regex way as this is a workaround till the server issues are resolved. – MasterT May 10 '13 at 12:37
  • @MasterT - you should use PHP's CSV functions. Regex is the **wrong answer** for this kind of problem. If you really need an alternative solution, a simple parser would be a better solution than regex, but I still don't get what's wrong with the built-in CSV functions -- If they're not working for you, maybe you should ask for help with that rather than trying to get help with regex. – Spudley May 10 '13 at 13:01
  • Also, another angle: if you're doing this to import data into a mySQL database, be aware that mySQL has a function for importing CSV files directly, which is *much* quicker than doing it via PHP. See [`LOAD DATA INFILE`](http://dev.mysql.com/doc/refman/5.1/en/load-data.html) – Spudley May 10 '13 at 13:03
  • So you are trying to solve a completely different problem instead of trying to solve the original one? `str_getcsv` is available as of PHP 5.3 only, but `fgetcsv` has been around since PHP 4 – and since you want to read from a file, that should do. So, what exactly _is_ your problem with this on the server? Let’s try to solve that problem first, instead of trying to tinker with a “workaround” that might not even be necessary … – CBroe May 10 '13 at 13:03
  • but what is the regex for it though i have this "/[\s]*[,][\s]*/" this will split by comma no matter the spaces all i need is the bit to ignore commas in double quotes, and im happy. im aware of the CSV functions and have used them but they do not work to the desired spec to what i need. – MasterT May 10 '13 at 13:23
  • the server has been reported this mornign it was missing files in PHP-config area of the server but the functions for CSV dont work to desired spec. – MasterT May 10 '13 at 13:24

3 Answers3

2

I think your answer is in here:

exploding a string using a regular expression

As @Casimir et Hippolyte has said in that page:

You can do the job using preg_match_all

$string="a,b,c,(d,e,f),g,'h, i j.',k";

preg_match_all("~'[^']++'|\([^)]++\)|[^,]++~", $string,$result);
print_r($result[0]);

Explanation:

The trick is to match parenthesis before the ,

~          Pattern delimiter
'
[^']       All charaters but not a single quote
++         one or more time in [possessive][1] mode
'
|          or
\([^)]++\) the same with parenthesis
|          or
[^,]       All characters but not a comma
++
~

if you have more than one delimiter like quotes (that are the same for open and close), you can write your pattern like this, using a capture group:

$string="a,b,c,(d,e,f),g,'h, i j.',k,°l,m°,#o,p#,@q,r@,s";

preg_match_all("~(['#@°]).*?\1|\([^)]++\)|[^,]++~", $string,$result);
print_r($result[0]);

explanation:

(['#@°])   one character in the class is captured in group 1
.*?        any character zero or more time in lazy mode 
\1         group 1 content

With nested parenthesis:

$string="a,b,(c,(d,(e),f),t),g,'h, i j.',k,°l,m°,#o,p#,@q,r@,s";

preg_match_all("~(['#@°]).*?\1|(\((?>[^()]++|(?-1)?)*\))|[^,]++~", $string,$result);
print_r($result[0]);
Community
  • 1
  • 1
SAVAFA
  • 818
  • 8
  • 23
0

I'm not going to answer the question that was asked because he's asking for the wrong solution to his problem. However, I hope this solution will be better for him:

Looking at the code in the question, the OP is basically reading a CSV file via PHP and importing it into a mysql database.

MySQL actually offers a method of doing this directly with it's LOAD DATA INFILE syntax, without having to parse the file in PHP at all. It is much quicker than processing it via PHP, and completely avoid the whole problem that the OP is having.

In PHP you simply need to do the following:

$query = <<<eof
  LOAD DATA INFILE {$filename} INTO {$table}
  FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
  LINES TERMINATED BY '\r\n'
  IGNORE 1 LINES
  field1, field2, field3, field4, field5, field6
eof;
mysqli_query($conn, $query);

You may need to modify that query a bit for some of the more complex stuff in your code (ie converting date formats, etc), but once you've got the hang of the LOAD DATA INFILE syntax, you'll find that's fairly simple to incorporate.

I hope that helps.

Spudley
  • 166,037
  • 39
  • 233
  • 307
  • Ok just to clear up some stuff coz your all going off on one now. im doing it this way so that manipulating the data before hand is easier, im doing it the regex way as there is some text formatting and word replacement also being used. the function i pasted in is just the basic version of it so other people can use it if needed. if anyone could just tell me the regex way it would be great. – MasterT May 10 '13 at 13:39
-1

You can use preg_split with the PREG_SPLIT_DELIM_CAPTURE option.

$str = field1,field2,field3,field4,"some text, some other text",field6;

then something like this

$match = preg_split("ypir expression", $str, null, PREG_SPLIT_DELIM_CAPTURE);
chandresh_cool
  • 11,753
  • 3
  • 30
  • 45