7 Commits
0.2.1 ... 0.4.0

Author SHA1 Message Date
zynode
70366e3085 parseCSV 0.4 beta
- Error reporting for files/data which are corrupt
  or have formatting errors, such as using double
  quotes in a field without enclosing quotes, or
  not escaping double quotes with a second one.

- parse() method does not require input anymore
  if the "$object->file" property has been set.

I'm calling this a beta release due to the heavy
modifications to the core parsing logic required
for error reporting to work. I have tested the
new code quite extensively, and I'm fairly confident
that it still parses exactly as it always has.

The second reason I'm calling it a beta release
is because I'm sure the error reporting code will
need more refinements and tweaks to detect more
types of errors, as it's only picking up two types
of syntax errors right now. However, it seems
these two are the most common errors that you
would be likely to come across.

git-svn-id: http://parsecsv-for-php.googlecode.com/svn/trunk@28 339761fc-0c37-0410-822d-8b8cac1f6a97
2008-04-11 18:13:37 +00:00
zynode
2dfd35b988 added error reporting/validation of parsed data, still needs some more testing before release tho...
git-svn-id: http://parsecsv-for-php.googlecode.com/svn/trunk@27 339761fc-0c37-0410-822d-8b8cac1f6a97
2008-04-07 16:17:44 +00:00
zynode
ae00f949f0 parseCSV 0.3.2
This is primarily a bug-fix release for a critical
bug which was brought to my attention.

- Fixed a critical bug in conditions parsing which
  would generate corrupt matching patterns causing
  the condition(s) to not work at all in some
  situations.

- Fixed a small code error which would cause PHP to
  generate an invalid offset notice when zero-length
  values were fed into the unparse() method to
  generate CSV data from an array.

git-svn-id: http://parsecsv-for-php.googlecode.com/svn/trunk@22 339761fc-0c37-0410-822d-8b8cac1f6a97
2008-03-31 22:50:36 +00:00
zynode
4e76da5eff minor fix to a bug which caused notice errors to be generated when _enclose_value() was fed a zero character long string
git-svn-id: http://parsecsv-for-php.googlecode.com/svn/trunk@20 339761fc-0c37-0410-822d-8b8cac1f6a97
2008-03-31 20:55:33 +00:00
zynode
7762e71316 parseCSV 0.3.1
- Small change to default output settings to
  conform with RFC 4180 (http://rfc.net/rfc4180.html).
  Only the LF (line feed) character was used
  by default to separate rows, rather than
  CRLF (carriage return & line feed).

git-svn-id: http://parsecsv-for-php.googlecode.com/svn/trunk@17 339761fc-0c37-0410-822d-8b8cac1f6a97
2007-08-31 22:45:22 +00:00
zynode
e28b3d0f9d parseCSV 0.3.0
- Changed to the MIT license.

- Added offset and limit options.

- Added SQL-like conditions for quickly
  filtering out entries. Documentation on the
  condition syntax is forthcoming.

- Small parsing modification to comply
  with some recent changes to the specifications
  outlined on Wikipedia's Comma-separated values
  article.

- Minor changes and optimizations, and a few
  spelling corrections. Oops :)

git-svn-id: http://parsecsv-for-php.googlecode.com/svn/trunk@14 339761fc-0c37-0410-822d-8b8cac1f6a97
2007-08-09 09:17:54 +00:00
zynode
9c389ed0c1 conditions, limit and offset are functional. conditions needs a rewrite tho.
added examples to trunk.


git-svn-id: http://parsecsv-for-php.googlecode.com/svn/trunk@13 339761fc-0c37-0410-822d-8b8cac1f6a97
2007-08-08 18:48:11 +00:00
7 changed files with 594 additions and 116 deletions


@@ -1,3 +1,97 @@
parseCSV 0.4 beta
-----------------------------------
Date: 11-Apr-2008
- Error reporting for files/data which are corrupt
or have formatting errors, such as using double
quotes in a field without enclosing quotes, or
not escaping double quotes with a second one.
- parse() method does not require input anymore
if the "$object->file" property has been set.
I'm calling this a beta release due to the heavy
modifications to the core parsing logic required
for error reporting to work. I have tested the
new code quite extensively, and I'm fairly confident
that it still parses exactly as it always has.
The second reason I'm calling it a beta release
is because I'm sure the error reporting code will
need more refinements and tweaks to detect more
types of errors, as it's only picking up two types
of syntax errors right now. However, it seems
these two are the most common errors that you
would be likely to come across.
-----------------------------------
parseCSV 0.3.2
-----------------------------------
Date: 1-Apr-2008
This is primarily a bug-fix release for a critical
bug which was brought to my attention.
- Fixed a critical bug in conditions parsing which
would generate corrupt matching patterns causing
the condition(s) to not work at all in some
situations.
- Fixed a small code error which would cause PHP to
generate an invalid offset notice when zero-length
values were fed into the unparse() method to
generate CSV data from an array.
Notice: If you have been using the "parsecsv-stable"
branch as an external in any of your projects,
please use the "stable/parsecsv" branch from this
point on as I will eventually remove the former due
to its stupid naming.
-----------------------------------
parseCSV 0.3.1
-----------------------------------
Date: 1-Sep-2007
- Small change to default output settings to
conform with RFC 4180 (http://rfc.net/rfc4180.html).
Only the LF (line feed) character was used
by default to separate rows, rather than
CRLF (carriage return & line feed).
-----------------------------------
parseCSV 0.3.0
-----------------------------------
Date: 9-Aug-2007
- Changed to the MIT license.
- Added offset and limit options.
- Added SQL-like conditions for quickly
filtering out entries. Documentation on the
condition syntax is forthcoming.
- Small parsing modification to comply
with some recent changes to the specifications
outlined on Wikipedia's Comma-separated values
article.
- Minor changes and optimizations, and a few
spelling corrections. Oops :)
- Included more complex code examples in the
parseCSV download.
-----------------------------------
parseCSV 0.2.1
-----------------------------------
Date: 8-Aug-2007
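
The 0.4 beta entry describes the new error reporting. As a rough sketch of how the recorded errors might be consumed, the helper below formats entries shaped like the `$error_info` array that this changeset adds to the parser (keys `type`, `info`, `row`, `field`, `field_name`). The function name `summarize_csv_errors` and the sample entry are hypothetical and not part of parseCSV.

```php
<?php
// Hypothetical helper (not part of parseCSV): formats entries shaped like
// the $error_info array introduced in 0.4 beta into one readable line each.
function summarize_csv_errors(array $error_info) {
    $lines = array();
    foreach ($error_info as $error) {
        // Fall back to the 1-based field number when no field name is known.
        $field = ($error['field_name'] !== null)
            ? $error['field_name']
            : '#' . $error['field'];
        $lines[] = sprintf(
            'type %d, row %d, field %s: %s',
            $error['type'], $error['row'], $field, $error['info']
        );
    }
    return $lines;
}

// Made-up sample entry, keyed "row-field" as in the parser code.
$sample = array(
    '3-2' => array(
        'type'       => 2,
        'info'       => 'Syntax error found on row 3. Non-enclosed fields can not contain double-quotes.',
        'row'        => 3,
        'field'      => 2,
        'field_name' => 'title',
    ),
);

foreach (summarize_csv_errors($sample) as $line) {
    echo $line, "\n";
}
```

With the real class, the same loop would presumably run over `$csv->error_info` after checking that `$csv->error` is greater than 0.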

19
License.txt Normal file

@@ -0,0 +1,19 @@
Copyright (c) 2007 Jim Myhrberg (jim@zydev.info).
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

15
examples/_books.csv Normal file

@@ -0,0 +1,15 @@
rating,title,author,type,asin,tags,review
0,The Killing Kind,John Connolly,Book,0340771224,,i still haven't had time to read this one...
0,The Third Secret,Steve Berry,Book,0340899263,,need to find time to read this book
3,The Last Templar,Raymond Khoury,Book,0752880705,,
5,The Traveller,John Twelve Hawks,Book,059305430X,,
4,Crisis Four,Andy Mcnab,Book,0345428080,,
5,Prey,Michael Crichton,Book,0007154534,,
3,The Broker (Paperback),John Grisham,Book,0440241588,book johngrisham,"good book, but is slow in the middle"
3,Without Blood (Paperback),Alessandro Baricco,Book,1841955744,,
5,State of Fear (Paperback),Michael Crichton,Book,0061015733,,
4,The Rule of Four (Paperback),Ian Caldwell,Book,0099451956,book bestseller,
4,Deception Point (Paperback),Dan Brown,Book,0671027387,book danbrown bestseller,
5,Digital Fortress : A Thriller (Mass Market Paperback),Dan Brown,Book,0312995423,book danbrown bestseller,
5,Angels & Demons (Mass Market Paperback),Dan Brown,Book,0671027360,book danbrown bestseller,
4,The Da Vinci Code (Hardcover),Dan Brown," Book ",0385504209,book movie danbrown bestseller davinci,

48
examples/basic.php Normal file

@@ -0,0 +1,48 @@
<pre>
<?php
# include parseCSV class.
require_once('../parsecsv.lib.php');
# create new parseCSV object.
$csv = new parseCSV();
# Parse '_books.csv' using automatic delimiter detection...
$csv->auto('_books.csv');
# ...or if you know the delimiter, set the delimiter character
# if its not the default comma...
// $csv->delimiter = "\t"; # tab delimited
# ...and then use the parse() function.
// $csv->parse('_books.csv');
# Output result.
// print_r($csv->data);
?>
</pre>
<style type="text/css" media="screen">
table { background-color: #BBB; }
th { background-color: #EEE; }
td { background-color: #FFF; }
</style>
<table border="0" cellspacing="1" cellpadding="3">
<tr>
<?php foreach ($csv->titles as $value): ?>
<th><?php echo $value; ?></th>
<?php endforeach; ?>
</tr>
<?php foreach ($csv->data as $key => $row): ?>
<tr>
<?php foreach ($row as $value): ?>
<td><?php echo $value; ?></td>
<?php endforeach; ?>
</tr>
<?php endforeach; ?>
</table>

48
examples/conditions.php Normal file

@@ -0,0 +1,48 @@
<pre>
<?php
# include parseCSV class.
require_once('../parsecsv.lib.php');
# create new parseCSV object.
$csv = new parseCSV();
# Example conditions:
// $csv->conditions = 'title contains paperback OR title contains hardcover';
$csv->conditions = 'author does not contain dan brown';
// $csv->conditions = 'rating < 4 OR author is John Twelve Hawks';
// $csv->conditions = 'rating > 4 AND author is Dan Brown';
# Parse '_books.csv' using automatic delimiter detection.
$csv->auto('_books.csv');
# Output result.
// print_r($csv->data);
?>
</pre>
<style type="text/css" media="screen">
table { background-color: #BBB; }
th { background-color: #EEE; }
td { background-color: #FFF; }
</style>
<table border="0" cellspacing="1" cellpadding="3">
<tr>
<?php foreach ($csv->titles as $value): ?>
<th><?php echo $value; ?></th>
<?php endforeach; ?>
</tr>
<?php foreach ($csv->data as $key => $row): ?>
<tr>
<?php foreach ($row as $value): ?>
<td><?php echo $value; ?></td>
<?php endforeach; ?>
</tr>
<?php endforeach; ?>
</table>
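
The example conditions above combine clauses with `OR` and `AND`. Judging by `_validate_row_conditions()` in this changeset, the string is first split on ` OR `, each part is then split on ` AND `, and a row passes when every clause of at least one part holds. A minimal standalone sketch of that grouping follows. The function names are hypothetical, and the single-clause matcher is a simplified stand-in that supports only a few operators and assumes case-insensitive text matching, which may differ from the library's actual behaviour.

```php
<?php
// Simplified stand-in for a single clause such as "rating < 4".
// Supports only: is, contains, does not contain, <, > (an assumption).
function row_matches_clause(array $row, $clause) {
    $pattern = '/^(.+?) (is|does not contain|contains|<|>) (.+)$/i';
    if (!preg_match($pattern, trim($clause), $m)) {
        return false;
    }
    list(, $field, $op, $value) = $m;
    if (!array_key_exists($field, $row)) {
        return false; // unknown field: treated as "no match" in this sketch
    }
    $actual = $row[$field];
    switch (strtolower($op)) {
        case 'is':               return strcasecmp($actual, $value) === 0;
        case 'contains':         return stripos($actual, $value) !== false;
        case 'does not contain': return stripos($actual, $value) === false;
        case '<':                return (float) $actual < (float) $value;
        case '>':                return (float) $actual > (float) $value;
    }
    return false;
}

// OR separates groups; AND separates clauses inside a group.
// A row passes when all clauses of at least one group hold.
function row_matches(array $row, $conditions) {
    foreach (explode(' OR ', $conditions) as $group) {
        $all = true;
        foreach (explode(' AND ', $group) as $clause) {
            if (!row_matches_clause($row, $clause)) {
                $all = false;
                break;
            }
        }
        if ($all) {
            return true;
        }
    }
    return false;
}

// One row from _books.csv.
$row = array('rating' => '5', 'title' => 'The Traveller', 'author' => 'John Twelve Hawks');
var_dump(row_matches($row, 'rating < 4 OR author is John Twelve Hawks'));
var_dump(row_matches($row, 'rating > 4 AND author is Dan Brown'));
```

Because `AND` is resolved inside each `OR` group, `a AND b OR c` reads as `(a AND b) OR c`; there is no parenthesis syntax in this sketch.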

61
examples/limit.php Normal file

@@ -0,0 +1,61 @@
<pre>
<?php
# include parseCSV class.
require_once('../parsecsv.lib.php');
# create new parseCSV object.
$csv = new parseCSV();
# if sorting is enabled, the whole CSV file
# will be processed and sorted and then rows
# are extracted based on offset and limit.
#
# if sorting is not enabled, then the least
# amount of rows to satisfy offset and limit
# settings will be processed. this is useful
# with large files when you only need the
# first 20 rows for example.
$csv->sort_by = 'title';
# offset from the beginning of the file,
# ignoring the first X number of rows.
$csv->offset = 2;
# limit the number of returned rows.
$csv->limit = 3;
# Parse '_books.csv' using automatic delimiter detection.
$csv->auto('_books.csv');
# Output result.
// print_r($csv->data);
?>
</pre>
<style type="text/css" media="screen">
table { background-color: #BBB; }
th { background-color: #EEE; }
td { background-color: #FFF; }
</style>
<table border="0" cellspacing="1" cellpadding="3">
<tr>
<?php foreach ($csv->titles as $value): ?>
<th><?php echo $value; ?></th>
<?php endforeach; ?>
</tr>
<?php foreach ($csv->data as $key => $row): ?>
<tr>
<?php foreach ($row as $value): ?>
<td><?php echo $value; ?></td>
<?php endforeach; ?>
</tr>
<?php endforeach; ?>
</table>
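
The comments in limit.php describe two paths. With sorting enabled, the whole file is parsed and sorted before offset and limit are applied, which this changeset does with `array_slice()`. A small sketch of that sorted path is below; `page_sorted_rows` is a hypothetical name, and the sample uses a handful of titles taken from `_books.csv`.

```php
<?php
// Hypothetical sketch of the sorted path: rows are keyed by the sort field,
// sorted by key, and only then cut down by offset and limit.
function page_sorted_rows(array $rows, $offset = null, $limit = null, $reverse = false) {
    if ($reverse) {
        krsort($rows);
    } else {
        ksort($rows);
    }
    if ($offset !== null || $limit !== null) {
        // A null $limit means "everything from $offset onwards";
        // the final true keeps the string keys intact.
        $rows = array_slice($rows, ($offset === null ? 0 : $offset), $limit, true);
    }
    return $rows;
}

$rows = array(
    'Prey'             => array('rating' => '5'),
    'Crisis Four'      => array('rating' => '4'),
    'The Traveller'    => array('rating' => '5'),
    'The Last Templar' => array('rating' => '3'),
    'The Killing Kind' => array('rating' => '0'),
);

// Skip the first two sorted titles, then return at most three.
print_r(array_keys(page_sorted_rows($rows, 2, 3)));
```

Without sorting, the parser in this changeset instead stops reading once enough rows have been collected, which is what makes the unsorted path cheaper on large files.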


@@ -4,18 +4,40 @@ class parseCSV {
 /*
-Class: parseCSV v0.2.1
+Class: parseCSV v0.4 beta
 http://code.google.com/p/parsecsv-for-php/
+Created by Jim Myhrberg (jim@zydev.info).
 Fully conforms to the specifications lined out on wikipedia:
 - http://en.wikipedia.org/wiki/Comma-separated_values
-Based on the concept of this class:
+Based on the concept of Ming Hong Ng's CsvFileParser class:
 - http://minghong.blogspot.com/2006/07/csv-parser-for-php.html
+Copyright (c) 2007 Jim Myhrberg (jim@zydev.info).
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in
+all copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+THE SOFTWARE.
 Code Examples
 ----------------
 # general usage
@@ -54,28 +76,12 @@ class parseCSV {
 ----------------
-----------
-This program is free software; you can redistributeit and/or modify it
-under the terms of the GNU General Public License as published by the Free
-Software Foundation; either version 2 of the License, or (at your option)
-any later version. http://www.gnu.org/licenses/gpl.txt
-This program is distributed in the hope that it will be useful, but WITHOUT
-ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
-FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
-more details.
-You should have received a copy of the GNU General Public License along
-with this program; if not, write to the Free Software Foundation, Inc., 59
-Temple Place, Suite 330, Boston, MA 02111-1307 USA
-----------
 */
 /**
 * Configuration
+* - set these options with $object->var_name = 'value';
 */
 # use first line/entry as field names
@@ -92,15 +98,24 @@ class parseCSV {
 var $delimiter = ',';
 var $enclosure = '"';
+# basic SQL-like conditions for row matching
+var $conditions = null;
+# number of rows to ignore from beginning of data
+var $offset = null;
+# limits the number of returned rows to specified amount
+var $limit = null;
 # number of rows to analyze when attempting to auto-detect delimiter
 var $auto_depth = 15;
 # characters to ignore when attempting to auto-detect delimiter
 var $auto_non_chars = "a-zA-Z0-9\n\r";
-# prefered delimiter characters, only used when all filtering method
+# preferred delimiter characters, only used when all filtering method
 # returns multiple possible delimiters (happens very rarely)
-var $auto_prefered = ",;\t.:|";
+var $auto_preferred = ",;\t.:|";
 # character encoding options
 var $convert_encoding = false;
@@ -108,7 +123,7 @@ class parseCSV {
 var $output_encoding = 'ISO-8859-1';
 # used by unparse(), save(), and output() functions
-var $linefeed = "\n";
+var $linefeed = "\r\n";
 # only used by output() function
 var $output_delimiter = ',';
@@ -125,6 +140,19 @@ class parseCSV {
 # loaded file contents
 var $file_data;
+# error while parsing input data
+# 0 = No errors found. Everything should be fine :)
+# 1 = Hopefully correctable syntax error was found.
+# 2 = Enclosure character (double quote by default)
+# was found in non-enclosed field. This means
+# the file is either corrupt, or does not
+# standard CSV formatting. Please validate
+# the parsed data yourself.
+var $error = 0;
+# detailed error info
+var $error_info = array();
 # array of field values in data parsed
 var $titles = array();
@@ -137,7 +165,10 @@ class parseCSV {
 * @param input CSV file or string
 * @return nothing
 */
-function parseCSV ($input = null) {
+function parseCSV ($input = null, $offset = null, $limit = null, $conditions = null) {
+if ( $offset !== null ) $this->offset = $offset;
+if ( $limit !== null ) $this->limit = $limit;
+if ( count($conditions) > 0 ) $this->conditions = $conditions;
 if ( !empty($input) ) $this->parse($input);
 }
@@ -151,8 +182,12 @@ class parseCSV {
 * @param input CSV file or string
 * @return nothing
 */
-function parse ($input = null) {
+function parse ($input = null, $offset = null, $limit = null, $conditions = null) {
+if ( $input === null ) $input = $this->file;
 if ( !empty($input) ) {
+if ( $offset !== null ) $this->offset = $offset;
+if ( $limit !== null ) $this->limit = $limit;
+if ( count($conditions) > 0 ) $this->conditions = $conditions;
 if ( is_readable($input) ) {
 $this->data = $this->parse_file($input);
 } else {
@@ -176,7 +211,7 @@ class parseCSV {
 if ( empty($file) ) $file = &$this->file;
 $mode = ( $append ) ? 'at' : 'wt' ;
 $is_php = ( preg_match('/\.php$/i', $file) ) ? true : false ;
-return $this->wfile($file, $this->unparse($data, $fields, $append, $is_php), $mode);
+return $this->_wfile($file, $this->unparse($data, $fields, $append, $is_php), $mode);
 }
 /**
@@ -185,7 +220,7 @@ class parseCSV {
 * @param filename filename sent to browser in headers if output is true
 * @param data 2D array with data
 * @param fields field names
-* @param delimiter delimiter used to seperate data
+* @param delimiter delimiter used to separate data
 * @return CSV data using delimiter of choice, or default
 */
 function output ($output = true, $filename = null, $data = array(), $fields = array(), $delimiter = null) {
@@ -214,24 +249,24 @@ class parseCSV {
 /**
 * Auto-Detect Delimiter: Find delimiter by analyzing a specific number of
-* rows to determin most probable delimiter character
+* rows to determine most probable delimiter character
 * @param file local CSV file
 * @param parse true/false parse file directly
 * @param search_depth number of rows to analyze
-* @param prefered prefered delimiter characters
+* @param preferred preferred delimiter characters
 * @param enclosure enclosure character, default is double quote (").
 * @return delimiter character
 */
-function auto ($file = null, $parse = true, $search_depth = null, $prefered = null, $enclosure = null) {
+function auto ($file = null, $parse = true, $search_depth = null, $preferred = null, $enclosure = null) {
 if ( $file === null ) $file = $this->file;
 if ( empty($search_depth) ) $search_depth = $this->auto_depth;
 if ( $enclosure === null ) $enclosure = $this->enclosure;
-if ( $prefered === null ) $prefered = $this->auto_prefered;
+if ( $preferred === null ) $preferred = $this->auto_preferred;
 if ( empty($this->file_data) ) {
-if ( $this->check_data($file) ) {
+if ( $this->_check_data($file) ) {
 $data = &$this->file_data;
 } else return false;
 } else {
@@ -246,17 +281,17 @@ class parseCSV {
 // walk specific depth finding posssible delimiter characters
 for ( $i=0; $i < $strlen; $i++ ) {
-$ch = $data[$i];
-$nch = ( isset($data[$i+1]) ) ? $data[$i+1] : false ;
-$pch = ( isset($data[$i-1]) ) ? $data[$i-1] : false ;
+$ch = $data{$i};
+$nch = ( isset($data{$i+1}) ) ? $data{$i+1} : false ;
+$pch = ( isset($data{$i-1}) ) ? $data{$i-1} : false ;
 // open and closing quotes
-if ( $ch == $enclosure && (!$enclosed || $nch != $enclosure) ) {
+if ( $ch == $enclosure ) {
+if ( !$enclosed || $nch != $enclosure ) {
 $enclosed = ( $enclosed ) ? false : true ;
-// inline quotes
-} elseif ( $ch == $enclosure && $enclosed ) {
+} elseif ( $enclosed ) {
 $i++;
+}
 // end of row
 } elseif ( ($ch == "\n" && $pch != "\r" || $ch == "\r") && !$enclosed ) {
@@ -283,20 +318,19 @@ class parseCSV {
 $depth = ( $to_end ) ? $n-1 : $n ;
 $filtered = array();
 foreach( $chars as $char => $value ) {
-if ( $match = $this->check_count($char, $value, $depth, $prefered) ) {
+if ( $match = $this->_check_count($char, $value, $depth, $preferred) ) {
 $filtered[$match] = $char;
 }
 }
 // capture most probable delimiter
 ksort($filtered);
-$delimiter = reset($filtered);
-$this->delimiter = $delimiter;
+$this->delimiter = reset($filtered);
 // parse data
 if ( $parse ) $this->data = $this->parse_string();
-return $delimiter;
+return $this->delimiter;
 }
@@ -305,34 +339,6 @@ class parseCSV {
 // ----- [ Core Functions ] ---------------------
 // ==============================================
-/**
-* Load local file or string
-* @param input local CSV file
-* @return true or false
-*/
-function load_data ($input = null) {
-$data = null;
-$file = null;
-if ( $input === null ) {
-$file = $this->file;
-} elseif ( file_exists($input) ) {
-$file = $input;
-} else {
-$data = $input;
-}
-if ( !empty($data) || $data = $this->rfile($file) ) {
-if ( $this->file != $file ) $this->file = $file;
-if ( preg_match('/\.php$/i', $file) && preg_match('/<\?.*?\?>(.*)/ims', $data, $strip) ) {
-$data = ltrim($strip[1]);
-}
-if ( $this->convert_encoding ) $data = iconv($this->input_encoding, $this->output_encoding, $data);
-if ( substr($data, -1) != "\n" ) $data .= "\n";
-$this->file_data = &$data;
-return true;
-}
-return false;
-}
 /**
 * Read file to string and call parse_string()
 * @param file local CSV file
@@ -351,11 +357,13 @@ class parseCSV {
 */
 function parse_string ($data = null) {
 if ( empty($data) ) {
-if ( $this->check_data() ) {
+if ( $this->_check_data() ) {
 $data = &$this->file_data;
 } else return false;
 }
+$white_spaces = str_replace($this->delimiter, '', " \t\x0B\0");
 $rows = array();
 $row = array();
 $row_count = 0;
@@ -363,33 +371,80 @@ class parseCSV {
 $head = ( !empty($this->fields) ) ? $this->fields : array() ;
 $col = 0;
 $enclosed = false;
+$was_enclosed = false;
 $strlen = strlen($data);
 // walk through each character
 for ( $i=0; $i < $strlen; $i++ ) {
-$ch = $data[$i];
-$nch = ( isset($data[$i+1]) ) ? $data[$i+1] : false ;
-$pch = ( isset($data[$i-1]) ) ? $data[$i-1] : false ;
+$ch = $data{$i};
+$nch = ( isset($data{$i+1}) ) ? $data{$i+1} : false ;
+$pch = ( isset($data{$i-1}) ) ? $data{$i-1} : false ;
-// open and closing quotes
-if ( $ch == $this->enclosure && (!$enclosed || $nch != $this->enclosure) ) {
-$enclosed = ( $enclosed ) ? false : true ;
-// inline quotes
-} elseif ( $ch == $this->enclosure && $enclosed ) {
+// open/close quotes, and inline quotes
+if ( $ch == $this->enclosure ) {
+if ( !$enclosed ) {
+if ( ltrim($current, $white_spaces) == '' ) {
+$enclosed = true;
+$was_enclosed = true;
+} else {
+$this->error = 2;
+$error_row = count($rows) + 1;
+$error_col = $col + 1;
+if ( !isset($this->error_info[$error_row.'-'.$error_col]) ) {
+$this->error_info[$error_row.'-'.$error_col] = array(
+'type' => 2,
+'info' => 'Syntax error found on row '.$error_row.'. Non-enclosed fields can not contain double-quotes.',
+'row' => $error_row,
+'field' => $error_col,
+'field_name' => (!empty($head[$col])) ? $head[$col] : null,
+);
+}
+$current .= $ch;
+}
+} elseif ($nch == $this->enclosure) {
 $current .= $ch;
 $i++;
+} elseif ( $nch != $this->delimiter && $nch != "\r" && $nch != "\n" ) {
+for ( $x=($i+1); isset($data{$x}) && ltrim($data{$x}, $white_spaces) == ''; $x++ ) {}
+if ( $data{$x} == $this->delimiter ) {
+$enclosed = false;
+$i = $x;
+} else {
+if ( $this->error < 1 ) {
+$this->error = 1;
+}
+$error_row = count($rows) + 1;
+$error_col = $col + 1;
+if ( !isset($this->error_info[$error_row.'-'.$error_col]) ) {
+$this->error_info[$error_row.'-'.$error_col] = array(
+'type' => 1,
+'info' =>
+'Syntax error found on row '.(count($rows) + 1).'. '.
+'A single double-quote was found within an enclosed string. '.
+'Enclosed double-quotes must be escaped with a second double-quote.',
+'row' => count($rows) + 1,
+'field' => $col + 1,
+'field_name' => (!empty($head[$col])) ? $head[$col] : null,
+);
+}
+$current .= $ch;
+$enclosed = false;
+}
+} else {
+$enclosed = false;
+}
 // end of field/row
-} elseif ( ($ch == $this->delimiter || ($ch == "\n" && $pch != "\r") || $ch == "\r") && !$enclosed ) {
-$current = trim($current);
+} elseif ( ($ch == $this->delimiter || $ch == "\n" || $ch == "\r") && !$enclosed ) {
 $key = ( !empty($head[$col]) ) ? $head[$col] : $col ;
-$row[$key] = $current;
+$row[$key] = ( $was_enclosed ) ? $current : trim($current) ;
 $current = '';
+$was_enclosed = false;
 $col++;
 // end of row
 if ( $ch == "\n" || $ch == "\r" ) {
+if ( $this->_validate_offset($row_count) && $this->_validate_row_conditions($row, $this->conditions) ) {
 if ( $this->heading && empty($head) ) {
 $head = $row;
 } elseif ( empty($this->fields) || (!empty($this->fields) && (($this->heading && $row_count > 0) || !$this->heading)) ) {
@@ -402,9 +457,14 @@ class parseCSV {
 } else $rows[$row[$this->sort_by]] = $row;
 } else $rows[] = $row;
 }
+}
 $row = array();
 $col = 0;
 $row_count++;
+if ( $this->sort_by === null && $this->limit !== null && count($rows) == $this->limit ) {
+$i = $strlen;
+}
+if ( $ch == "\r" && $nch == "\n" ) $i++;
 }
 // append character to current field
@@ -415,6 +475,9 @@ class parseCSV {
 $this->titles = $head;
 if ( !empty($this->sort_by) ) {
 ( $this->sort_reverse ) ? krsort($rows) : ksort($rows) ;
+if ( $this->offset !== null || $this->limit !== null ) {
+$rows = array_slice($rows, ($this->offset === null ? 0 : $this->offset) , $this->limit, true);
+}
 }
 return $rows;
 }
@@ -440,7 +503,7 @@ class parseCSV {
 // create heading
 if ( $this->heading && !$append ) {
 foreach( $fields as $key => $value ) {
-$entry[] = $this->enclose_value($value);
+$entry[] = $this->_enclose_value($value);
 }
 $string .= implode($delimiter, $entry).$this->linefeed;
 $entry = array();
@@ -449,7 +512,7 @@ class parseCSV {
 // create data
 foreach( $data as $key => $row ) {
 foreach( $row as $field => $value ) {
-$entry[] = $this->enclose_value($value);
+$entry[] = $this->_enclose_value($value);
 }
 $string .= implode($delimiter, $entry).$this->linefeed;
 $entry = array();
@@ -458,24 +521,154 @@ class parseCSV {
return $string; return $string;
} }
/**
* Load local file or string
* @param input local CSV file
* @return true or false
*/
function load_data ($input = null) {
$data = null;
$file = null;
if ( $input === null ) {
$file = $this->file;
} elseif ( file_exists($input) ) {
$file = $input;
} else {
$data = $input;
}
if ( !empty($data) || $data = $this->_rfile($file) ) {
if ( $this->file != $file ) $this->file = $file;
if ( preg_match('/\.php$/i', $file) && preg_match('/<\?.*?\?>(.*)/ims', $data, $strip) ) {
$data = ltrim($strip[1]);
}
if ( $this->convert_encoding ) $data = iconv($this->input_encoding, $this->output_encoding, $data);
if ( substr($data, -1) != "\n" ) $data .= "\n";
$this->file_data = &$data;
return true;
}
return false;
}
// ============================================== // ==============================================
// ----- [ Internal Functions ] ----------------- // ----- [ Internal Functions ] -----------------
// ============================================== // ==============================================
/**
* Validate a row against specified conditions
* @param row array with values from a row
* @param conditions specified conditions that the row must match
* @return true or false
*/
function _validate_row_conditions ($row = array(), $conditions = null) {
if ( !empty($row) ) {
if ( !empty($conditions) ) {
$conditions = (strpos($conditions, ' OR ') !== false) ? explode(' OR ', $conditions) : array($conditions) ;
$or = '';
foreach( $conditions as $key => $value ) {
if ( strpos($value, ' AND ') !== false ) {
$value = explode(' AND ', $value);
$and = '';
foreach( $value as $k => $v ) {
$and .= $this->_validate_row_condition($row, $v);
}
$or .= (strpos($and, '0') !== false) ? '0' : '1' ;
} else {
$or .= $this->_validate_row_condition($row, $value);
}
}
return (strpos($or, '1') !== false) ? true : false ;
}
return true;
}
return false;
}
/**
* Validate a row against a single condition
* @param row array with values from a row
* @param condition specified condition that the row must match
* @return true or false
*/
function _validate_row_condition ($row, $condition) {
$operators = array(
'=', 'equals', 'is',
'!=', 'is not',
'<', 'is less than',
'>', 'is greater than',
'<=', 'is less than or equals',
'>=', 'is greater than or equals',
'contains',
'does not contain',
);
$operators_regex = array();
foreach( $operators as $value ) {
$operators_regex[] = preg_quote($value, '/');
}
$operators_regex = implode('|', $operators_regex);
if ( preg_match('/^(.+) ('.$operators_regex.') (.+)$/i', trim($condition), $capture) ) {
$field = $capture[1];
$op = $capture[2];
$value = $capture[3];
if ( preg_match('/^([\'\"]{1})(.*)([\'\"]{1})$/i', $value, $capture) ) {
if ( $capture[1] == $capture[3] ) {
$value = $capture[2];
$value = str_replace("\\n", "\n", $value);
$value = str_replace("\\r", "\r", $value);
$value = str_replace("\\t", "\t", $value);
$value = stripslashes($value);
}
}
if ( array_key_exists($field, $row) ) {
if ( ($op == '=' || $op == 'equals' || $op == 'is') && $row[$field] == $value ) {
return '1';
} elseif ( ($op == '!=' || $op == 'is not') && $row[$field] != $value ) {
return '1';
} elseif ( ($op == '<' || $op == 'is less than' ) && $row[$field] < $value ) {
return '1';
} elseif ( ($op == '>' || $op == 'is greater than') && $row[$field] > $value ) {
return '1';
} elseif ( ($op == '<=' || $op == 'is less than or equals' ) && $row[$field] <= $value ) {
return '1';
} elseif ( ($op == '>=' || $op == 'is greater than or equals') && $row[$field] >= $value ) {
return '1';
} elseif ( $op == 'contains' && preg_match('/'.preg_quote($value, '/').'/i', $row[$field]) ) {
return '1';
} elseif ( $op == 'does not contain' && !preg_match('/'.preg_quote($value, '/').'/i', $row[$field]) ) {
return '1';
} else {
return '0';
}
}
}
return '1';
}
/**
* Validate that the row is at or beyond the offset (only enforced when sorting is disabled)
* @param current_row the current row number being processed
* @return true or false
*/
function _validate_offset ($current_row) {
if ( $this->sort_by === null && $this->offset !== null && $current_row < $this->offset ) return false;
return true;
}
/** /**
* Enclose values if needed * Enclose values if needed
* - only used by unparse() * - only used by unparse()
* @param value string to process * @param value string to process
* @return Processed value * @return Processed value
*/ */
function enclose_value ($value = null) { function _enclose_value ($value = null) {
if ( $value !== null && $value != '' ) {
$delimiter = preg_quote($this->delimiter, '/'); $delimiter = preg_quote($this->delimiter, '/');
$enclosure = preg_quote($this->enclosure, '/'); $enclosure = preg_quote($this->enclosure, '/');
if ( preg_match("/".$delimiter."|".$enclosure."|\n|\r/i", $value) ) { if ( preg_match("/".$delimiter."|".$enclosure."|\n|\r/i", $value) || ($value[0] == ' ' || substr($value, -1) == ' ') ) {
$value = str_replace($this->enclosure, $this->enclosure.$this->enclosure, $value); $value = str_replace($this->enclosure, $this->enclosure.$this->enclosure, $value);
$value = $this->enclosure.$value.$this->enclosure; $value = $this->enclosure.$value.$this->enclosure;
} }
}
return $value; return $value;
} }
@@ -484,7 +677,7 @@ class parseCSV {
* @param file local filename * @param file local filename
* @return true or false * @return true or false
*/ */
function check_data ($file = null) { function _check_data ($file = null) {
if ( empty($this->file_data) ) { if ( empty($this->file_data) ) {
if ( $file === null ) $file = $this->file; if ( $file === null ) $file = $this->file;
return $this->load_data($file); return $this->load_data($file);
@@ -498,7 +691,7 @@ class parseCSV {
* - only used by find_delimiter() * - only used by find_delimiter()
* @return special string used for delimiter selection, or false * @return special string used for delimiter selection, or false
*/ */
function check_count ($char, $array, $depth, $prefered) { function _check_count ($char, $array, $depth, $preferred) {
if ( $depth == count($array) ) { if ( $depth == count($array) ) {
$first = null; $first = null;
$equal = null; $equal = null;
@@ -517,7 +710,7 @@ class parseCSV {
} }
if ( $equal ) { if ( $equal ) {
$match = ( $almost ) ? 2 : 1 ; $match = ( $almost ) ? 2 : 1 ;
$pref = strpos($prefered, $char); $pref = strpos($preferred, $char);
$pref = ( $pref !== false ) ? str_pad($pref, 3, '0', STR_PAD_LEFT) : '999' ; $pref = ( $pref !== false ) ? str_pad($pref, 3, '0', STR_PAD_LEFT) : '999' ;
return $pref.$match.'.'.(99999 - str_pad($first, 5, '0', STR_PAD_LEFT)); return $pref.$match.'.'.(99999 - str_pad($first, 5, '0', STR_PAD_LEFT));
} else return false; } else return false;
@@ -529,7 +722,7 @@ class parseCSV {
* @param file local filename * @param file local filename
* @return Data from file, or false on failure * @return Data from file, or false on failure
*/ */
function rfile ($file = null){ function _rfile ($file = null) {
if ( is_readable($file) ) { if ( is_readable($file) ) {
if ( !($fh = fopen($file, 'r')) ) return false; if ( !($fh = fopen($file, 'r')) ) return false;
$data = fread($fh, filesize($file)); $data = fread($fh, filesize($file));
@@ -547,7 +740,7 @@ class parseCSV {
* @param lock flock() mode * @param lock flock() mode
* @return true or false * @return true or false
*/ */
function wfile($file, $string = '', $mode = 'wb', $lock = 2){ function _wfile ($file, $string = '', $mode = 'wb', $lock = 2) {
if ( $fp = fopen($file, $mode) ) { if ( $fp = fopen($file, $mode) ) {
flock($fp, $lock); flock($fp, $lock);
$re = fwrite($fp, $string); $re = fwrite($fp, $string);