11 Commits
0.3.0 ... 0.4.3

Author SHA1 Message Date
zynode
fc22bee7ca parseCSV 0.4.3 beta
- Issue #4. Added an option for setting sorting
  type behavior when sorting data.
  Simply set $csv->sort_type to "regular", "numeric",
  or "string".

- Issue #6. Raw loaded file data is now cleared from
  file_data property when it has been successfully
  parsed to keep parseCSV's memory footprint to a
  minimum. Specifically handy when using multiple
  instances of parseCSV to process large files.

git-svn-id: http://parsecsv-for-php.googlecode.com/svn/trunk@41 339761fc-0c37-0410-822d-8b8cac1f6a97
2008-06-30 21:41:42 +00:00
zynode
9b64cb07c4 Fixed Issue #6 - Automatically clears $file_data property after successful parsing of input data. Set the $keep_file_data property to true to keep it around for debugging.
git-svn-id: http://parsecsv-for-php.googlecode.com/svn/trunk@39 339761fc-0c37-0410-822d-8b8cac1f6a97
2008-06-30 20:38:59 +00:00
zynode
2bfa3de220 Addressed Issue #4. Added option for sorting behavior type.
git-svn-id: http://parsecsv-for-php.googlecode.com/svn/trunk@38 339761fc-0c37-0410-822d-8b8cac1f6a97
2008-06-30 20:01:50 +00:00
zynode
c522cd87a7 fixed a small changelog typo in the 0.4.2 trunk and tag
git-svn-id: http://parsecsv-for-php.googlecode.com/svn/trunk@36 339761fc-0c37-0410-822d-8b8cac1f6a97
2008-05-31 18:12:15 +00:00
zynode
0395362d66 parseCSV 0.4.2
git-svn-id: http://parsecsv-for-php.googlecode.com/svn/trunk@34 339761fc-0c37-0410-822d-8b8cac1f6a97
2008-05-31 17:59:21 +00:00
zynode
20a6a9d1b1 parseCSV 0.4.1 beta
git-svn-id: http://parsecsv-for-php.googlecode.com/svn/trunk@31 339761fc-0c37-0410-822d-8b8cac1f6a97
2008-05-30 12:20:39 +00:00
zynode
70366e3085 parseCSV 0.4 beta
- Error reporting for files/data which is corrupt
  or has formatting errors like using double
  quotes in a field without enclosing quotes. Or
  not escaping double quotes with a second one.

- parse() method does not require input anymore
  if the "$object->file" property has been set.

I'm calling this a beta release due to the heavy
modifications to the core parsing logic required
for error reporting to work. I have tested the
new code quite extensively, I'm fairly confident
that it still parses exactly as it always has.

The second reason I'm calling it a beta release
is cause I'm sure the error reporting code will
need more refinements and tweaks to detect more
types of errors, as it's only picking up two types
of syntax errors right now. However, it seems
these two are the most common errors that you
would be likely to come across.

git-svn-id: http://parsecsv-for-php.googlecode.com/svn/trunk@28 339761fc-0c37-0410-822d-8b8cac1f6a97
2008-04-11 18:13:37 +00:00
zynode
2dfd35b988 added error reporting/validation of parsed data, still needs some more testing before release tho...
git-svn-id: http://parsecsv-for-php.googlecode.com/svn/trunk@27 339761fc-0c37-0410-822d-8b8cac1f6a97
2008-04-07 16:17:44 +00:00
zynode
ae00f949f0 parseCSV 0.3.2
This is primarily a bug-fix release for a critical
bug which was brought to my attention.

- Fixed a critical bug in conditions parsing which
  would generate corrupt matching patterns causing
  the condition(s) to not work at all in some
  situations.

- Fixed a small code error which would cause PHP to
  generate an invalid offset notice when zero length
  values were fed into the unparse() method to
  generate CSV data from an array.

git-svn-id: http://parsecsv-for-php.googlecode.com/svn/trunk@22 339761fc-0c37-0410-822d-8b8cac1f6a97
2008-03-31 22:50:36 +00:00
zynode
4e76da5eff minor fix to a bug which caused notice errors to be generated when _enclose_value() was fed a zero character long string
git-svn-id: http://parsecsv-for-php.googlecode.com/svn/trunk@20 339761fc-0c37-0410-822d-8b8cac1f6a97
2008-03-31 20:55:33 +00:00
zynode
7762e71316 parseCSV 0.3.1
- Small change to default output settings to
  conform with RFC 4180 (http://rfc.net/rfc4180.html).
  Only the LF (line feed) character was used
  by default to separate rows, rather than
  CRLF (carriage return & line feed).

git-svn-id: http://parsecsv-for-php.googlecode.com/svn/trunk@17 339761fc-0c37-0410-822d-8b8cac1f6a97
2007-08-31 22:45:22 +00:00
5 changed files with 294 additions and 56 deletions

@@ -1,3 +1,122 @@
+parseCSV 0.4.3 beta
+-----------------------------------
+Date: 1-July-2008
+- Issue #4. Added an option for setting sorting
+  type behavior when sorting data.
+  Simply set $csv->sort_type to "regular", "numeric",
+  or "string".
+- Issue #6. Raw loaded file data is now cleared from
+  file_data property when it has been successfully
+  parsed to keep parseCSV's memory footprint to a
+  minimum. Specifically handy when using multiple
+  instances of parseCSV to process large files.
+-----------------------------------
+parseCSV 0.4.2 beta
+-----------------------------------
+Date: 31-May-2008
+- IMPORTANT! If you're using the output()
+  method, please note that the first parameter
+  has been completely removed as it was
+  technically just useless. Instead, the second
+  parameter (filename) doubles as its replacement.
+  Simply put, if filename is not set or null, the
+  output() method will not output a downloadable
+  file. Please update your existing code
+  when using 0.4.2 and later :)
+- Small fix to the headers sent by the output()
+  method.
+- Added a download example using the output()
+  method to the examples folder.
+-----------------------------------
+parseCSV 0.4.1 beta
+-----------------------------------
+Date: 29-May-2008
+- Fixed a small bug in how the output() method
+  handles input data.
+-----------------------------------
+parseCSV 0.4 beta
+-----------------------------------
+Date: 11-Apr-2008
+- Error reporting for files/data which is corrupt
+  or has formatting errors like using double
+  quotes in a field without enclosing quotes. Or
+  not escaping double quotes with a second one.
+- parse() method does not require input anymore
+  if the "$object->file" property has been set.
+I'm calling this a beta release due to the heavy
+modifications to the core parsing logic required
+for error reporting to work. I have tested the
+new code quite extensively, I'm fairly confident
+that it still parses exactly as it always has.
+The second reason I'm calling it a beta release
+is cause I'm sure the error reporting code will
+need more refinements and tweaks to detect more
+types of errors, as it's only picking up two types
+of syntax errors right now. However, it seems
+these two are the most common errors that you
+would be likely to come across.
+-----------------------------------
+parseCSV 0.3.2
+-----------------------------------
+Date: 1-Apr-2008
+This is primarily a bug-fix release for a critical
+bug which was brought to my attention.
+- Fixed a critical bug in conditions parsing which
+  would generate corrupt matching patterns causing
+  the condition(s) to not work at all in some
+  situations.
+- Fixed a small code error which would cause PHP to
+  generate an invalid offset notice when zero length
+  values were fed into the unparse() method to
+  generate CSV data from an array.
+Notice: If you have been using the "parsecsv-stable"
+branch as an external in any of your projects,
+please use the "stable/parsecsv" branch from this
+point on as I will eventually remove the former due
+to its stupid naming.
+-----------------------------------
+parseCSV 0.3.1
+-----------------------------------
+Date: 1-Sep-2007
+- Small change to default output settings to
+  conform with RFC 4180 (http://rfc.net/rfc4180.html).
+  Only the LF (line feed) character was used
+  by default to separate rows, rather than
+  CRLF (carriage return & line feed).
+-----------------------------------
 parseCSV 0.3.0
 -----------------------------------
 Date: 9-Aug-2007
@@ -18,6 +137,9 @@ Date: 9-Aug-2007
 - Minor changes and optimizations, and a few
   spelling corrections. Oops :)
+- Included more complex code examples in the
+  parseCSV download.
 -----------------------------------

@@ -13,7 +13,11 @@ $csv = new parseCSV();
 # Parse '_books.csv' using automatic delimiter detection...
 $csv->auto('_books.csv');
-# ...or if you know the delimiter, use the parse() function.
+# ...or if you know the delimiter, set the delimiter character
+# if it's not the default comma...
+// $csv->delimiter = "\t"; # tab delimited
+# ...and then use the parse() function.
 // $csv->parse('_books.csv');

examples/download.php (new file, 34 lines added)

@@ -0,0 +1,34 @@
+<?php
+# include parseCSV class.
+require_once('../parsecsv.lib.php');
+# create new parseCSV object.
+$csv = new parseCSV();
+# Parse '_books.csv' using automatic delimiter detection...
+$csv->auto('_books.csv');
+# ...or if you know the delimiter, set the delimiter character
+# if it's not the default comma...
+// $csv->delimiter = "\t"; # tab delimited
+# ...and then use the parse() function.
+// $csv->parse('_books.csv');
+# now we have data in $csv->data, at which point we can modify
+# it to our heart's content, like removing the last item...
+array_pop($csv->data);
+# then we output the file to the browser as a downloadable file...
+$csv->output('books.csv');
+# ...when the first parameter is given and is not null, the
+# output method will itself send the correct headers and the
+# data to download the output as a CSV file. if it's not set
+# or is set to null, output will only return the generated CSV
+# output data, and will not output to the browser itself.
+?>

@@ -4,7 +4,7 @@ class parseCSV {
     /*
-        Class: parseCSV v0.3.0
+        Class: parseCSV v0.4.3 beta
         http://code.google.com/p/parsecsv-for-php/
@@ -94,6 +94,12 @@ class parseCSV {
     var $sort_by = null;
     var $sort_reverse = false;
+    # sort behavior passed to ksort/krsort functions
+    #  regular = SORT_REGULAR
+    #  numeric = SORT_NUMERIC
+    #  string  = SORT_STRING
+    var $sort_type = null;
     # delimiter (comma) and enclosure (double quote)
     var $delimiter = ',';
     var $enclosure = '"';
@@ -123,12 +129,14 @@ class parseCSV {
     var $output_encoding = 'ISO-8859-1';
     # used by unparse(), save(), and output() functions
-    var $linefeed = "\n";
+    var $linefeed = "\r\n";
     # only used by output() function
     var $output_delimiter = ',';
     var $output_filename = 'data.csv';
+    # keep raw file data in memory after successful parsing (useful for debugging)
+    var $keep_file_data = false;
     /**
      * Internal variables
@@ -140,6 +148,19 @@ class parseCSV {
     # loaded file contents
     var $file_data;
+    # error while parsing input data
+    #  0 = No errors found. Everything should be fine :)
+    #  1 = Hopefully correctable syntax error was found.
+    #  2 = Enclosure character (double quote by default)
+    #      was found in non-enclosed field. This means
+    #      the file is either corrupt, or does not follow
+    #      standard CSV formatting. Please validate
+    #      the parsed data yourself.
+    var $error = 0;
+    # detailed error info
+    var $error_info = array();
     # array of field values in data parsed
     var $titles = array();
@@ -170,6 +191,7 @@ class parseCSV {
      * @return nothing
      */
     function parse ($input = null, $offset = null, $limit = null, $conditions = null) {
+        if ( $input === null ) $input = $this->file;
         if ( !empty($input) ) {
             if ( $offset !== null ) $this->offset = $offset;
             if ( $limit !== null ) $this->limit = $limit;
@@ -202,20 +224,19 @@ class parseCSV {
     /**
      * Generate CSV based string for output
-     * @param   output      if true, prints headers and strings to browser
-     * @param   filename    filename sent to browser in headers if output is true
+     * @param   filename    if specified, headers and data will be output directly to browser as a downloadable file
      * @param   data        2D array with data
      * @param   fields      field names
      * @param   delimiter   delimiter used to separate data
      * @return  CSV data using delimiter of choice, or default
      */
-    function output ($output = true, $filename = null, $data = array(), $fields = array(), $delimiter = null) {
+    function output ($filename = null, $data = array(), $fields = array(), $delimiter = null) {
         if ( empty($filename) ) $filename = $this->output_filename;
         if ( $delimiter === null ) $delimiter = $this->output_delimiter;
         $data = $this->unparse($data, $fields, null, null, $delimiter);
-        if ( $output ) {
+        if ( $filename !== null ) {
             header('Content-type: application/csv');
-            header('Content-Disposition: inline; filename="'.$filename.'"');
+            header('Content-Disposition: attachment; filename="'.$filename.'"');
             echo $data;
         }
         return $data;
@@ -272,12 +293,12 @@ class parseCSV {
             $pch = ( isset($data{$i-1}) ) ? $data{$i-1} : false ;
             // open and closing quotes
-            if ( $ch == $enclosure && (!$enclosed || $nch != $enclosure) ) {
-                $enclosed = ( $enclosed ) ? false : true ;
-            // inline quotes
-            } elseif ( $ch == $enclosure && $enclosed ) {
-                $i++;
+            if ( $ch == $enclosure ) {
+                if ( !$enclosed || $nch != $enclosure ) {
+                    $enclosed = ( $enclosed ) ? false : true ;
+                } elseif ( $enclosed ) {
+                    $i++;
+                }
             // end of row
             } elseif ( ($ch == "\n" && $pch != "\r" || $ch == "\r") && !$enclosed ) {
@@ -311,13 +332,12 @@ class parseCSV {
         // capture most probable delimiter
         ksort($filtered);
-        $delimiter = reset($filtered);
-        $this->delimiter = $delimiter;
+        $this->delimiter = reset($filtered);
         // parse data
         if ( $parse ) $this->data = $this->parse_string();
-        return $delimiter;
+        return $this->delimiter;
     }
@@ -349,6 +369,8 @@ class parseCSV {
             } else return false;
         }
+        $white_spaces = str_replace($this->delimiter, '', " \t\x0B\0");
         $rows = array();
         $row = array();
         $row_count = 0;
@@ -365,22 +387,66 @@ class parseCSV {
             $nch = ( isset($data{$i+1}) ) ? $data{$i+1} : false ;
             $pch = ( isset($data{$i-1}) ) ? $data{$i-1} : false ;
-            // open and closing quotes
-            if ( $ch == $this->enclosure && (!$enclosed || $nch != $this->enclosure) ) {
-                $enclosed = ( $enclosed ) ? false : true ;
-                if ( $enclosed ) $was_enclosed = true;
-            // inline quotes
-            } elseif ( $ch == $this->enclosure && $enclosed ) {
-                $current .= $ch;
-                $i++;
+            // open/close quotes, and inline quotes
+            if ( $ch == $this->enclosure ) {
+                if ( !$enclosed ) {
+                    if ( ltrim($current, $white_spaces) == '' ) {
+                        $enclosed = true;
+                        $was_enclosed = true;
+                    } else {
+                        $this->error = 2;
+                        $error_row = count($rows) + 1;
+                        $error_col = $col + 1;
+                        if ( !isset($this->error_info[$error_row.'-'.$error_col]) ) {
+                            $this->error_info[$error_row.'-'.$error_col] = array(
+                                'type' => 2,
+                                'info' => 'Syntax error found on row '.$error_row.'. Non-enclosed fields can not contain double-quotes.',
+                                'row' => $error_row,
+                                'field' => $error_col,
+                                'field_name' => (!empty($head[$col])) ? $head[$col] : null,
+                            );
+                        }
+                        $current .= $ch;
+                    }
+                } elseif ($nch == $this->enclosure) {
+                    $current .= $ch;
+                    $i++;
+                } elseif ( $nch != $this->delimiter && $nch != "\r" && $nch != "\n" ) {
+                    for ( $x=($i+1); isset($data{$x}) && ltrim($data{$x}, $white_spaces) == ''; $x++ ) {}
+                    if ( $data{$x} == $this->delimiter ) {
+                        $enclosed = false;
+                        $i = $x;
+                    } else {
+                        if ( $this->error < 1 ) {
+                            $this->error = 1;
+                        }
+                        $error_row = count($rows) + 1;
+                        $error_col = $col + 1;
+                        if ( !isset($this->error_info[$error_row.'-'.$error_col]) ) {
+                            $this->error_info[$error_row.'-'.$error_col] = array(
+                                'type' => 1,
+                                'info' =>
+                                    'Syntax error found on row '.(count($rows) + 1).'. '.
+                                    'A single double-quote was found within an enclosed string. '.
+                                    'Enclosed double-quotes must be escaped with a second double-quote.',
+                                'row' => count($rows) + 1,
+                                'field' => $col + 1,
+                                'field_name' => (!empty($head[$col])) ? $head[$col] : null,
+                            );
+                        }
+                        $current .= $ch;
+                        $enclosed = false;
+                    }
+                } else {
+                    $enclosed = false;
+                }
             // end of field/row
-            } elseif ( ($ch == $this->delimiter || ($ch == "\n" && $pch != "\r") || $ch == "\r") && !$enclosed ) {
-                if ( !$was_enclosed ) $current = trim($current);
+            } elseif ( ($ch == $this->delimiter || $ch == "\n" || $ch == "\r") && !$enclosed ) {
                 $key = ( !empty($head[$col]) ) ? $head[$col] : $col ;
-                $row[$key] = $current;
+                $row[$key] = ( $was_enclosed ) ? $current : trim($current) ;
                 $current = '';
+                $was_enclosed = false;
                 $col++;
                 // end of row
@@ -405,6 +471,7 @@ class parseCSV {
                     if ( $this->sort_by === null && $this->limit !== null && count($rows) == $this->limit ) {
                         $i = $strlen;
                     }
+                    if ( $ch == "\r" && $nch == "\n" ) $i++;
                 }
             // append character to current field
@@ -414,11 +481,20 @@ class parseCSV {
         }
         $this->titles = $head;
         if ( !empty($this->sort_by) ) {
-            ( $this->sort_reverse ) ? krsort($rows) : ksort($rows) ;
+            $sort_type = SORT_REGULAR;
+            if ( $this->sort_type == 'numeric' ) {
+                $sort_type = SORT_NUMERIC;
+            } elseif ( $this->sort_type == 'string' ) {
+                $sort_type = SORT_STRING;
+            }
+            ( $this->sort_reverse ) ? krsort($rows, $sort_type) : ksort($rows, $sort_type) ;
             if ( $this->offset !== null || $this->limit !== null ) {
                 $rows = array_slice($rows, ($this->offset === null ? 0 : $this->offset) , $this->limit, true);
             }
         }
+        if ( !$this->keep_file_data ) {
+            $this->file_data = null;
+        }
         return $rows;
     }
@@ -441,7 +517,7 @@ class parseCSV {
         $entry = array();
         // create heading
-        if ( $this->heading && !$append ) {
+        if ( $this->heading && !$append && !empty($fields) ) {
             foreach( $fields as $key => $value ) {
                 $entry[] = $this->_enclose_value($value);
             }
@@ -503,11 +579,11 @@ class parseCSV {
     function _validate_row_conditions ($row = array(), $conditions = null) {
         if ( !empty($row) ) {
             if ( !empty($conditions) ) {
-                $conditions = (strpos($conditions, 'OR') !== false) ? explode('OR', $conditions) : array($conditions) ;
+                $conditions = (strpos($conditions, ' OR ') !== false) ? explode(' OR ', $conditions) : array($conditions) ;
                 $or = '';
                 foreach( $conditions as $key => $value ) {
-                    if ( strpos($value, 'AND') !== false ) {
-                        $value = explode('AND', $value);
+                    if ( strpos($value, ' AND ') !== false ) {
+                        $value = explode(' AND ', $value);
                         $and = '';
                         foreach( $value as $k => $v ) {
                             $and .= $this->_validate_row_condition($row, $v);
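
Switching the split tokens from `'OR'`/`'AND'` to `' OR '`/`' AND '` matters whenever a field name or value happens to contain those letters. The snippet below is a minimal illustration with a made-up condition string; the exact condition syntax parseCSV accepts is not shown in this diff, so treat the string as an assumption.

```php
<?php
// Made-up condition string: the value "HORROR" contains the letters "OR" twice.
$conditions = 'title contains HORROR OR rating > 3';

// Old behavior: splitting on the bare token also cuts the value apart.
$naive = explode('OR', $conditions);

// New behavior: splitting on the space-padded token keeps each condition intact.
$spaced = explode(' OR ', $conditions);

print_r($naive);  // four fragments, none of them a usable condition
print_r($spaced); // two conditions: 'title contains HORROR' and 'rating > 3'
```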
@@ -601,11 +677,13 @@ class parseCSV {
      * @return Processed value
      */
     function _enclose_value ($value = null) {
-        $delimiter = preg_quote($this->delimiter, '/');
-        $enclosure = preg_quote($this->enclosure, '/');
-        if ( preg_match("/".$delimiter."|".$enclosure."|\n|\r/i", $value) || ($value{0} == ' ' || substr($value, -1) == ' ') ) {
-            $value = str_replace($this->enclosure, $this->enclosure.$this->enclosure, $value);
-            $value = $this->enclosure.$value.$this->enclosure;
+        if ( $value !== null && $value != '' ) {
+            $delimiter = preg_quote($this->delimiter, '/');
+            $enclosure = preg_quote($this->enclosure, '/');
+            if ( preg_match("/".$delimiter."|".$enclosure."|\n|\r/i", $value) || ($value{0} == ' ' || substr($value, -1) == ' ') ) {
+                $value = str_replace($this->enclosure, $this->enclosure.$this->enclosure, $value);
+                $value = $this->enclosure.$value.$this->enclosure;
+            }
         }
         return $value;
     }
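
The quoting rule that `_enclose_value()` applies can be sketched as a standalone function. This is a simplified approximation rather than the library's code: `enclose_value()` is a hypothetical name, it uses `strpbrk()` in place of the regular expression in the diff, and it takes the delimiter and enclosure as arguments instead of reading object properties. The early return corresponds to the new empty-value guard, which avoids reading the first character of a zero-length string.

```php
<?php
// Hypothetical standalone approximation of the quoting rule in the hunk above.
function enclose_value($value, $delimiter = ',', $enclosure = '"') {
    // Empty values are returned untouched, so no character offset is ever read.
    if ($value === null || $value === '') {
        return $value;
    }
    $value = (string) $value;
    // Quote when the value contains the delimiter, the enclosure, or a line
    // break, or when it starts or ends with a space.
    $needs_quotes = strpbrk($value, $delimiter . $enclosure . "\r\n") !== false
        || $value[0] === ' '
        || substr($value, -1) === ' ';
    if ($needs_quotes) {
        // Escape embedded enclosure characters by doubling them, then wrap.
        $value = $enclosure . str_replace($enclosure, $enclosure . $enclosure, $value) . $enclosure;
    }
    return $value;
}

// Build one CSV row terminated with CRLF, the default line ending since 0.3.1.
$row  = array('Dune', 'Herbert, Frank', 'He said "hi"', '');
$line = implode(',', array_map('enclose_value', $row)) . "\r\n";

echo $line; // Dune,"Herbert, Frank","He said ""hi""",
```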