Merge branch 'master' into offset-comment-and-tests

Conflicts:
	tests/methods/ParseTest.php
Fonata
2018-03-17 12:42:18 +01:00
15 changed files with 841 additions and 336 deletions


@@ -1,264 +1,282 @@
ParseCSV 1.0.0-rc.2 ParseCSV dev-master
----------------------------------- -----------------------------------
Date: unreleased Date: unreleased
- Renamed class from parseCSV to Csv and added name- - New function getTotalDataRowCount() - useful if
space "ParseCsv" for PSR compliance. $limit is set - see pull request #122.
- Added support for MS Excel's "sep=" to detect the -----------------------------------
delimiter (Issue #60).
- Added support for mb_convert_encoding() instead of ParseCSV 1.0.0
iconv() - see issue #109 -----------------------------------
Date: 3-March-2018
- A number of minor bug fixes - see GitHub issues
- Renamed class from parseCSV to Csv and added name-
----------------------------------- space "ParseCsv" for PSR compliance.
- Added support for MS Excel's "sep=" to detect the
parseCSV 0.4.3 beta delimiter (Issue #60).
-----------------------------------
Date: 1-July-2008 - Added data type detection - function getDatatypes()
guesses the type of each column.
- Issue #4. Added an option for setting sorting
type behavior when sorting data. - MIME: output() sends correct MIME type to browser
Simply set $csv->sort_type to "regular", "numeric", if the separator is a tab (Issue #79).
or "string".
- Added support for mb_convert_encoding() instead of
- Issue #6. Raw loaded file data is now cleared from iconv() - see issue #109.
file_data property when it has been successfully
parsed to keep parseCSV's memory footprint to a - A number of minor bug fixes - see GitHub issues.
minimum. Specifically handy when using multiple
instances of parseCSV to process large files. - Added many more unit tests.
----------------------------------- -----------------------------------
parseCSV 0.4.2 beta parseCSV 0.4.3 beta
----------------------------------- -----------------------------------
Date: 31-May-2008 Date: 1-July-2008
- IMPORTANT! If you're using the output(), - Issue #4. Added an option for setting sorting
method please note that the first parameter type behavior when sorting data.
has been completely removed as it was Simply set $csv->sort_type to "regular", "numeric",
technically just useless. Instead, the second or "string".
parameter (filename) doubles as its replacement.
Simply put, if filename is not set or null, the - Issue #6. Raw loaded file data is now cleared from
output() method will not output a downloadable file_data property when it has been successfully
file. Please update your existing code parsed to keep parseCSV's memory footprint to a
when using 0.4.2 and later :) minimum. Specifically handy when using multiple
instances of parseCSV to process large files.
- Small fix to the headers sent by the output()
method. -----------------------------------
- Added a download example using the output()
method to the examples folder. parseCSV 0.4.2 beta
-----------------------------------
----------------------------------- Date: 31-May-2008
- IMPORTANT! If you're using the output(),
parseCSV 0.4.1 beta method please note that the first parameter
----------------------------------- has been completely removed as it was
Date: 29-May-2008 technically just useless. Instead, the second
parameter (filename) doubles as its replacement.
- Fixed a small bug in how the output() method Simply put, if filename is not set or null, the
handles input data. output() method will not output a downloadable
file. Please update your existing code
----------------------------------- when using 0.4.2 and later :)
- Small fix to the headers sent by the output()
parseCSV 0.4 beta method.
-----------------------------------
Date: 11-Apr-2008 - Added a download example using the output()
method to the examples folder.
- Error reporting for files/data which is corrupt
or has formatting errors like using double -----------------------------------
quotes in a field without enclosing quotes. Or
not escaping double quotes with a second one.
parseCSV 0.4.1 beta
- parse() method does not require input anymore -----------------------------------
if the "$object->file" property has been set. Date: 29-May-2008
I'm calling this a beta release due to the heavy - Fixed a small bug in how the output() method
modifications to the core parsing logic required handles input data.
for error reporting to work. I have tested the
new code quite extensively, I'm fairly confident -----------------------------------
that it still parses exactly as it always has.
The second reason I'm calling it a beta release parseCSV 0.4 beta
is cause I'm sure the error reporting code will -----------------------------------
need more refinements and tweaks to detect more Date: 11-Apr-2008
types of errors, as it's only picking two types
of syntax errors right now. However, it seems - Error reporting for files/data which is corrupt
these two are the most common errors that you or has formatting errors like using double
would be likely to come across. quotes in a field without enclosing quotes. Or
not escaping double quotes with a second one.
-----------------------------------
- parse() method does not require input anymore
if the "$object->file" property has been set.
parseCSV 0.3.2
----------------------------------- I'm calling this a beta release due to the heavy
Date: 1-Apr-2008 modifications to the core parsing logic required
for error reporting to work. I have tested the
This is primarily a bug-fix release for a critical new code quite extensively, I'm fairly confident
bug which was brought to my attention. that it still parses exactly as it always has.
- Fixed a critical bug in conditions parsing which The second reason I'm calling it a beta release
would generate corrupt matching patterns causing is cause I'm sure the error reporting code will
the condition(s) to not work at all in some need more refinements and tweaks to detect more
situations. types of errors, as it's only picking two types
of syntax errors right now. However, it seems
- Fixed a small code error which would cause PHP to these two are the most common errors that you
generate an invalid offset notice when zero length would be likely to come across.
values were fed into the unparse() method to
generate CSV data from an array. -----------------------------------
Notice: If you have been using the "parsecsv-stable"
branch as an external in any of your projects, parseCSV 0.3.2
please use the "stable/parsecsv" branch from this -----------------------------------
point on as I will eventually remove the former due Date: 1-Apr-2008
to its stupid naming.
This is primarily a bug-fix release for a critical
----------------------------------- bug which was brought to my attention.
- Fixed a critical bug in conditions parsing which
parseCSV 0.3.1 would generate corrupt matching patterns causing
----------------------------------- the condition(s) to not work at all in some
Date: 1-Sep-2007 situations.
- Small change to default output settings to - Fixed a small code error which would cause PHP to
conform with RFC 4180 (http://rfc.net/rfc4180.html). generate an invalid offset notice when zero length
Only the LF (line feed) character was used values were fed into the unparse() method to
by default to separate rows, rather than generate CSV data from an array.
CRLF (carriage return & line feed).
Notice: If you have been using the "parsecsv-stable"
----------------------------------- branch as an external in any of your projects,
please use the "stable/parsecsv" branch from this
point on as I will eventually remove the former due
parseCSV 0.3.0 to its stupid naming.
-----------------------------------
Date: 9-Aug-2007 -----------------------------------
- Changed to the MIT license.
parseCSV 0.3.1
- Added offset and limit options. -----------------------------------
Date: 1-Sep-2007
- Added SQL-like conditions for quickly
filtering out entries. Documentation on the - Small change to default output settings to
condition syntax is forthcoming. conform with RFC 4180 (http://rfc.net/rfc4180.html).
Only the LF (line feed) character was used
- Small parsing modification to comply by default to separate rows, rather than
with some recent changes to the specifications CRLF (carriage return & line feed).
outlined on Wikipedia's Comma-separated values
article. -----------------------------------
- Minor changes and optimizations, and a few
spelling corrections. Oops :) parseCSV 0.3.0
-----------------------------------
- Included more complex code examples in the Date: 9-Aug-2007
parseCSV download.
- Changed to the MIT license.
-----------------------------------
- Added offset and limit options.
parseCSV 0.2.1 - Added SQL-like conditions for quickly
----------------------------------- filtering out entries. Documentation on the
Date: 8-Aug-2007 condition syntax is forthcoming.
- Fixed stupid code which caused auto function - Small parsing modification to comply
to not work in some situations. with some recent changes to the specifications
outlined on Wikipedia's Comma-separated values
----------------------------------- article.
- Minor changes and optimizations, and a few
parseCSV 0.2.0 beta spelling corrections. Oops :)
-----------------------------------
Date: 2-Jan-2007 - Included more complex code examples in the
parseCSV download.
- Added auto() function to automatically detect
delimiter character. -----------------------------------
Useful for user upload in case delimiter is
comma (,), tab, or semi-colon (;). Some
versions of MS Excel for Windows use parseCSV 0.2.1
semi-colons instead of commas when saving to -----------------------------------
CSV files. Date: 8-Aug-2007
It uses a process of elimination to eliminate
characters that can not be the delimiter, - Fixed stupid code which caused auto function
so it should work on all CSV-structured files to not work in some situations.
almost no matter what the delimiter is.
-----------------------------------
- Generally updated some of the core workings
to increase performance, and offer better
support for large (1MB and up) files. parseCSV 0.2.0 beta
-----------------------------------
- Added code examples to header comment. Date: 2-Jan-2007
----------------------------------- - Added auto() function to automatically detect
delimiter character.
Useful for user upload in case delimiter is
parseCSV 0.1.6 beta comma (,), tab, or semi-colon (;). Some
----------------------------------- versions of MS Excel for Windows use
Date: 22-Dec-2006 semi-colons instead of commas when saving to
CSV files.
- Updated output() function. It uses a process of elimination to eliminate
characters that can not be the delimiter,
----------------------------------- so it should work on all CSV-structured files
almost no matter what the delimiter is.
parseCSV 0.1.5 beta - Generally updated some of the core workings
----------------------------------- to increase performance, and offer better
Date: 22-Dec-2006 support for large (1MB and up) files.
- Added output() function for easy output to - Added code examples to header comment.
browser, for downloading features for example.
-----------------------------------
-----------------------------------
parseCSV 0.1.6 beta
parseCSV 0.1.4 beta -----------------------------------
----------------------------------- Date: 22-Dec-2006
Date: 17-Dec-2006
- Updated output() function.
- Minor changes and fixes
-----------------------------------
-----------------------------------
parseCSV 0.1.5 beta
parseCSV 0.1.3 beta -----------------------------------
----------------------------------- Date: 22-Dec-2006
Date: 17-Dec-2006
- Added output() function for easy output to
- Added GPL v2.0 license. browser, for downloading features for example.
----------------------------------- -----------------------------------
parseCSV 0.1.2 beta parseCSV 0.1.4 beta
----------------------------------- -----------------------------------
Date: 17-Dec-2006 Date: 17-Dec-2006
- Added encoding() function for easier character - Minor changes and fixes
encoding configuration.
-----------------------------------
-----------------------------------
parseCSV 0.1.3 beta
parseCSV 0.1.1 beta -----------------------------------
----------------------------------- Date: 17-Dec-2006
Date: 24-Nov-2006
- Added GPL v2.0 license.
- Added support for a PHP die command on first
line of csv files if they have a .php extension -----------------------------------
to protect secure data from being displayed
directly to the browser.
parseCSV 0.1.2 beta
----------------------------------- -----------------------------------
Date: 17-Dec-2006
parseCSV 0.1 beta - Added encoding() function for easier character
----------------------------------- encoding configuration.
Date: 23-Nov-2006
-----------------------------------
- Initial release
----------------------------------- parseCSV 0.1.1 beta
-----------------------------------
Date: 24-Nov-2006
- Added support for a PHP die command on first
line of csv files if they have a .php extension
to protect secure data from being displayed
directly to the browser.
-----------------------------------
parseCSV 0.1 beta
-----------------------------------
Date: 23-Nov-2006
- Initial release
-----------------------------------


@@ -12,6 +12,23 @@ and third-party support for handling CSV data in PHP.
[csv]: http://en.wikipedia.org/wiki/Comma-separated_values [csv]: http://en.wikipedia.org/wiki/Comma-separated_values
## Features
* ParseCsv is a complete and fully featured CSV solution for PHP
* Supports enclosed values, enclosed commas, double quotes and new lines.
* Automatic delimiter character detection.
* Sort data by specific fields/columns.
* Easy data manipulation.
* Basic SQL-like _conditions_, _offset_ and _limit_ options for filtering
data.
* Error detection for incorrectly formatted input. It attempts to be
intelligent, but cannot be trusted 100% due to the structure of CSV, and
how different programs, such as Excel, output CSV data.
* Support for character encoding conversion using PHP's
`iconv()` and `mb_convert_encoding()` functions.
* Supports PHP 5.4 and higher.
It certainly works with PHP 7.2 and all versions in between.
## Installation ## Installation
Installation is easy using Composer. Just run the following on the Installation is easy using Composer. Just run the following on the
@@ -33,23 +50,6 @@ repository or extract the
[ZIP](https://github.com/parsecsv/parsecsv-for-php/archive/master.zip). [ZIP](https://github.com/parsecsv/parsecsv-for-php/archive/master.zip).
To use ParseCSV, you then have to add a `require 'parsecsv.lib.php';` line. To use ParseCSV, you then have to add a `require 'parsecsv.lib.php';` line.
## Features
* ParseCsv is a complete and fully featured CSV solution for PHP
* Supports enclosed values, enclosed commas, double quotes and new lines.
* Automatic delimiter character detection.
* Sort data by specific fields/columns.
* Easy data manipulation.
* Basic SQL-like _conditions_, _offset_ and _limit_ options for filtering
data.
* Error detection for incorrectly formatted input. It attempts to be
intelligent, but cannot be trusted 100% due to the structure of CSV, and
how different programs, such as Excel, output CSV data.
* Support for character encoding conversion using PHP's
`iconv()` and `mb_convert_encoding()` functions.
* Supports PHP 5.4 and higher.
It certainly works with PHP 7.2 and all versions in between.
## Example Usage ## Example Usage
**General** **General**
@@ -77,6 +77,40 @@ $csv->auto('data.csv');
print_r($csv->data); print_r($csv->data);
``` ```
**Parse data with offset**
* ignoring the first X (e.g. two) rows
```php
$csv = new ParseCsv\Csv();
$csv->offset = 2;
$csv->parse('data.csv');
print_r($csv->data);
```
**Limit the number of returned data rows**
```php
$csv = new ParseCsv\Csv();
$csv->limit = 5;
$csv->parse('data.csv');
print_r($csv->data);
```
**Get the total number of data rows without parsing the whole data**
* Excluding the heading line if present (see $csv->heading property)
```php
$csv = new ParseCsv\Csv();
$csv->load_data('data.csv');
$count = $csv->getTotalDataRowCount();
print_r($count);
```
**Get most common data type for each column (Requires PHP >= 5.5)**
```php
$csv = new ParseCsv\Csv('data.csv');
$csv->getDatatypes();
print_r($csv->data_types);
```
**Modify data in a CSV file** **Modify data in a CSV file**
```php ```php


@@ -6,6 +6,10 @@
// Check if people used Composer to include this project in theirs // Check if people used Composer to include this project in theirs
if (!file_exists(__DIR__ . '/vendor/autoload.php')) { if (!file_exists(__DIR__ . '/vendor/autoload.php')) {
require __DIR__ . '/src/enums/AbstractEnum.php';
require __DIR__ . '/src/enums/DatatypeEnum.php';
require __DIR__ . '/src/enums/FileProcessingModeEnum.php';
require __DIR__ . '/src/enums/SortEnum.php';
require __DIR__ . '/src/extensions/DatatypeTrait.php'; require __DIR__ . '/src/extensions/DatatypeTrait.php';
require __DIR__ . '/src/Csv.php'; require __DIR__ . '/src/Csv.php';
} else { } else {


@@ -2,12 +2,13 @@
namespace ParseCsv; namespace ParseCsv;
use ParseCsv\enums\FileProcessingModeEnum;
use ParseCsv\enums\SortEnum;
use ParseCsv\extensions\DatatypeTrait; use ParseCsv\extensions\DatatypeTrait;
class Csv { class Csv {
/* /*
Class: ParseCSV 1.0.0-rc.2
https://github.com/parsecsv/parsecsv-for-php https://github.com/parsecsv/parsecsv-for-php
Fully conforms to the specifications outlined on Wikipedia: Fully conforms to the specifications outlined on Wikipedia:
@@ -89,7 +90,7 @@ class Csv {
* *
* @var string|null * @var string|null
*/ */
public $sort_type = null; public $sort_type = SortEnum::SORT_TYPE_REGULAR;
/** /**
* Delimiter * Delimiter
@@ -299,12 +300,34 @@ class Csv {
* Class constructor * Class constructor
* *
* @param string|null $input The CSV string or a direct filepath * @param string|null $input The CSV string or a direct filepath
* @param integer|null $offset Number of rows to ignore from the beginning of the data * @param integer|null $offset Number of rows to ignore from the beginning
* @param integer|null $limit Limits the number of returned rows to specified amount * of the data
* @param string|null $conditions Basic SQL-like conditions for row matching * @param integer|null $limit Limits the number of returned rows to
* @param null|true $keep_file_data Keep raw file data in memory after successful parsing (useful for debugging) * specified amount
* @param string|null $conditions Basic SQL-like conditions for row
* matching
* @param null|true $keep_file_data Keep raw file data in memory after
* successful parsing (useful for debugging)
*/ */
public function __construct($input = null, $offset = null, $limit = null, $conditions = null, $keep_file_data = null) { public function __construct($input = null, $offset = null, $limit = null, $conditions = null, $keep_file_data = null) {
$this->init($offset, $limit, $conditions, $keep_file_data);
if (!empty($input)) {
$this->parse($input);
}
}
/**
* @param integer|null $offset Number of rows to ignore from the beginning
* of the data
* @param integer|null $limit Limits the number of returned rows to
* specified amount
* @param string|null $conditions Basic SQL-like conditions for row
* matching
* @param null|true $keep_file_data Keep raw file data in memory after
* successful parsing (useful for debugging)
*/
public function init($offset = null, $limit = null, $conditions = null, $keep_file_data = null) {
if (!is_null($offset)) { if (!is_null($offset)) {
$this->offset = $offset; $this->offset = $offset;
} }
@@ -320,10 +343,6 @@ class Csv {
if (!is_null($keep_file_data)) { if (!is_null($keep_file_data)) {
$this->keep_file_data = $keep_file_data; $this->keep_file_data = $keep_file_data;
} }
if (!empty($input)) {
$this->parse($input);
}
} }
// ============================================== // ==============================================
@@ -346,32 +365,29 @@ class Csv {
$input = $this->file; $input = $this->file;
} }
if (!empty($input)) { if (empty($input)) {
if (!is_null($offset)) { // todo: but why true?
$this->offset = $offset; return true;
} }
if (!is_null($limit)) { $this->init($offset, $limit, $conditions);
$this->limit = $limit;
}
if (!is_null($conditions)) {
$this->conditions = $conditions;
}
if (strlen($input) <= PHP_MAXPATHLEN && is_readable($input)) { if (strlen($input) <= PHP_MAXPATHLEN && is_readable($input)) {
$this->data = $this->parse_file($input); $this->file = $input;
} else { $this->data = $this->parse_file();
$this->file_data = &$input; } else {
$this->data = $this->parse_string(); $this->file = null;
} $this->file_data = &$input;
$this->data = $this->parse_string();
}
if ($this->data === false) { if ($this->data === false) {
return false; return false;
}
} }
return true; return true;
} }
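Note on the refactored parse(): the input is treated as a file path only when it is short enough to be a path and points at something readable; anything else is handed to the string parser. The following is a minimal standalone sketch of that decision. `classifyCsvInput()` is a hypothetical helper written for illustration and is not part of ParseCsv.

```php
<?php
// Hypothetical helper (not part of ParseCsv): decide whether $input
// should be read as a file path or treated as raw CSV text.
function classifyCsvInput($input) {
    if (empty($input)) {
        return 'empty';
    }
    // A string longer than PHP_MAXPATHLEN cannot be a usable path,
    // so only shorter strings are checked against the filesystem.
    if (strlen($input) <= PHP_MAXPATHLEN && is_readable($input)) {
        return 'file';
    }
    return 'string';
}

// Create a real, readable file so the 'file' branch can be exercised.
$tmp = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($tmp, "a,b\n1,2\n");

echo classifyCsvInput($tmp), "\n";          // a readable temp file
echo classifyCsvInput("a,b\n1,2\n"), "\n";  // raw CSV text
unlink($tmp);
```

One consequence of this design is that a CSV string which happens to equal the name of a readable file is read as a file, which is presumably why the branch also resets `$this->file` in the string case.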
/** /**
@@ -381,16 +397,16 @@ class Csv {
* @param string $file File location to save to * @param string $file File location to save to
* @param array $data 2D array of data * @param array $data 2D array of data
* @param bool $append Append current data to end of target CSV, if file exists * @param bool $append Append current data to end of target CSV, if file exists
* @param array $fields Field names * @param array $fields Field names. Sets the header. If it is not set, $this->titles is used instead.
* *
* @return bool * @return bool
*/ */
public function save($file = '', $data = array(), $append = false, $fields = array()) { public function save($file = '', $data = array(), $append = FileProcessingModeEnum::MODE_FILE_OVERWRITE, $fields = array()) {
if (empty($file)) { if (empty($file)) {
$file = &$this->file; $file = &$this->file;
} }
$mode = $append ? 'ab' : 'wb'; $mode = FileProcessingModeEnum::getAppendMode($append);
$is_php = preg_match('/\.php$/i', $file) ? true : false; $is_php = preg_match('/\.php$/i', $file) ? true : false;
return $this->_wfile($file, $this->unparse($data, $fields, $append, $is_php), $mode); return $this->_wfile($file, $this->unparse($data, $fields, $append, $is_php), $mode);
@@ -510,6 +526,44 @@ class Csv {
return $this->delimiter; return $this->delimiter;
} }
/**
* Get the total number of data rows (excluding the heading line, if present)
* in the CSV without parsing the whole data.
*
* @return bool|int
*/
public function getTotalDataRowCount() {
if (empty($this->file_data)) {
return false;
}
$data = $this->file_data;
$this->_detect_and_remove_sep_row_from_data($data);
$pattern = sprintf('/(%1$s[^%1$s]*%1$s)/i', $this->enclosure);
preg_match_all($pattern, $data, $matches);
foreach ($matches[0] as $match) {
if (empty($match) || (strpos($match, $this->enclosure) === false)) {
continue;
}
$replace = str_replace(["\r", "\n"], '', $match);
$data = str_replace($match, $replace, $data);
}
$headingRow = $this->heading ? 1 : 0;
$count = substr_count($data, "\r")
+ substr_count($data, "\n")
- substr_count($data, "\r\n")
- $headingRow;
return $count;
}
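Note on getTotalDataRowCount(): the method counts line breaks after first removing those that sit inside enclosed fields, so a multi-line cell still counts as one row. Below is a minimal standalone sketch of that idea. `countCsvRows()` is a hypothetical function; unlike the method above it escapes the enclosure with preg_quote() and appends a final line break itself, both of which are assumptions made so the sketch runs without the library.

```php
<?php
// Hypothetical helper (not part of ParseCsv): count logical CSV rows.
function countCsvRows($data, $enclosure = '"', $hasHeading = false) {
    // Terminate the last row so that it is counted as well.
    if ($data !== '' && substr($data, -1) !== "\n" && substr($data, -1) !== "\r") {
        $data .= "\n";
    }
    $quoted = preg_quote($enclosure, '/');
    // Strip line breaks that appear inside an enclosed field.
    $data = preg_replace_callback(
        '/' . $quoted . '[^' . $quoted . ']*' . $quoted . '/',
        function ($m) {
            return str_replace(array("\r", "\n"), '', $m[0]);
        },
        $data
    );
    // A CRLF pair is counted once as CR and once as LF, so subtract it once.
    return substr_count($data, "\r")
        + substr_count($data, "\n")
        - substr_count($data, "\r\n")
        - ($hasHeading ? 1 : 0);
}

$input = "86545235689,a\r\n34365587654,b\r\n13469874576,\"c\r\nd\"";
echo countCsvRows($input), "\n";
```

The subtraction of CRLF pairs is what lets the same arithmetic handle Unix, classic Mac, and Windows line endings without normalising the data first.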
// ============================================== // ==============================================
// ----- [ Core Functions ] --------------------- // ----- [ Core Functions ] ---------------------
// ============================================== // ==============================================
@@ -522,7 +576,7 @@ class Csv {
* *
* @return array|bool * @return array|bool
*/ */
public function parse_file($file = null) { protected function parse_file($file = null) {
if (is_null($file)) { if (is_null($file)) {
$file = $this->file; $file = $this->file;
} }
@@ -545,7 +599,7 @@ class Csv {
* *
* @return array|false - 2D array with CSV data, or false on failure * @return array|false - 2D array with CSV data, or false on failure
*/ */
public function parse_string($data = null) { protected function parse_string($data = null) {
if (empty($data)) { if (empty($data)) {
if ($this->_check_data()) { if ($this->_check_data()) {
$data = &$this->file_data; $data = &$this->file_data;
@@ -696,13 +750,7 @@ class Csv {
$this->titles = $head; $this->titles = $head;
if (!empty($this->sort_by)) { if (!empty($this->sort_by)) {
$sort_type = SORT_REGULAR; $sort_type = SortEnum::getSorting($this->sort_type);
if ($this->sort_type == 'numeric') {
$sort_type = SORT_NUMERIC;
} elseif ($this->sort_type == 'string') {
$sort_type = SORT_STRING;
}
$this->sort_reverse ? krsort($rows, $sort_type) : ksort($rows, $sort_type); $this->sort_reverse ? krsort($rows, $sort_type) : ksort($rows, $sort_type);
if ($this->offset !== null || $this->limit !== null) { if ($this->offset !== null || $this->limit !== null) {
@@ -730,7 +778,7 @@ class Csv {
* *
* @return string CSV data * @return string CSV data
*/ */
public function unparse($data = array(), $fields = array(), $append = false, $is_php = false, $delimiter = null) { public function unparse($data = array(), $fields = array(), $append = FileProcessingModeEnum::MODE_FILE_OVERWRITE, $is_php = false, $delimiter = null) {
if (!is_array($data) || empty($data)) { if (!is_array($data) || empty($data)) {
$data = &$this->data; $data = &$this->data;
} }
@@ -747,8 +795,15 @@ class Csv {
$entry = array(); $entry = array();
// create heading // create heading
$fieldOrder = $this->_validate_fields_for_unparse($fields);
if (!$fieldOrder && !empty($data)) {
$column_count = count($data[0]);
$columns = range(0, $column_count - 1, 1);
$fieldOrder = array_combine($columns, $columns);
}
if ($this->heading && !$append && !empty($fields)) { if ($this->heading && !$append && !empty($fields)) {
foreach ($fields as $key => $column_name) { foreach ($fieldOrder as $column_name) {
$entry[] = $this->_enclose_value($column_name, $delimiter); $entry[] = $this->_enclose_value($column_name, $delimiter);
} }
@@ -758,7 +813,8 @@ class Csv {
// create data // create data
foreach ($data as $key => $row) { foreach ($data as $key => $row) {
foreach ($row as $cell_value) { foreach (array_keys($fieldOrder) as $index){
$cell_value = $row[$index];
$entry[] = $this->_enclose_value($cell_value, $delimiter); $entry[] = $this->_enclose_value($cell_value, $delimiter);
} }
@@ -773,6 +829,42 @@ class Csv {
return $string; return $string;
} }
private function _validate_fields_for_unparse($fields){
// This is needed because the titles property is sometimes overwritten instead of using the fields parameter.
$titlesOnParse = !empty($this->data) ? array_keys($this->data[0]) : array();
if (empty($fields)){
$fields = $this->titles;
}
if (empty($fields)){
return array();
}
// both are identical, also in ordering
if (array_values($fields) === array_values($titlesOnParse)){
return array_combine($fields, $fields);
}
// if renaming given by: $oldName => $newName (maybe with reorder and / or subset):
// todo: this will only work if titles are unique
$fieldOrder = array_intersect(array_flip($fields), $titlesOnParse);
if (!empty($fieldOrder)) {
return array_flip($fieldOrder);
}
$fieldOrder = array_intersect($fields, $titlesOnParse);
if (!empty($fieldOrder)) {
return array_combine($fieldOrder, $fieldOrder);
}
// original titles are not given in fields. that is okay if count is okay.
if (count($fields) != count($titlesOnParse)) {
throw new \UnexpectedValueException('The specified fields do not match any titles and do not match column count.');
}
return array_combine($titlesOnParse, $fields);
}
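Note on _validate_fields_for_unparse(): the rename branch relies on flipping an "old title => new title" map, intersecting it with the parsed titles, and flipping it back. The following standalone sketch shows only that branch on plain arrays. `resolveFieldOrder()` and the sample data are hypothetical and are not taken from the library.

```php
<?php
// Hypothetical helper (not part of ParseCsv): resolve an
// "old title => new title" map against the titles found while parsing.
function resolveFieldOrder(array $fields, array $titles) {
    // array_flip() turns the map into "new => old"; array_intersect()
    // then keeps only the entries whose old title really exists.
    $known = array_intersect(array_flip($fields), $titles);
    // Flip back to "old => new", preserving the caller's order.
    return array_flip($known);
}

$titles = array('id', 'name', 'price');
$rows = array(
    array('id' => 1, 'name' => 'apple', 'price' => '0.50'),
    array('id' => 2, 'name' => 'pear', 'price' => '0.75'),
);

// Reorder, rename, and drop the "id" column in one go.
$order = resolveFieldOrder(array('price' => 'Cost', 'name' => 'Product'), $titles);

$out = array(array_values($order)); // heading row with the new names
foreach ($rows as $row) {
    $line = array();
    foreach (array_keys($order) as $oldTitle) {
        $line[] = $row[$oldTitle];
    }
    $out[] = $line;
}
print_r($out);
```

Because array_flip() keeps only one entry per value, duplicate new titles collapse into one, which matches the "titles must be unique" caveat in the todo comment above.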
/** /**
* Load local file or string * Load local file or string
* *
@@ -885,12 +977,19 @@ class Csv {
*/ */
protected function _validate_row_condition($row, $condition) { protected function _validate_row_condition($row, $condition) {
$operators = array( $operators = array(
'=', 'equals', 'is', '=',
'!=', 'is not', 'equals',
'<', 'is less than', 'is',
'>', 'is greater than', '!=',
'<=', 'is less than or equals', 'is not',
'>=', 'is greater than or equals', '<',
'is less than',
'>',
'is greater than',
'<=',
'is less than or equals',
'>=',
'is greater than or equals',
'contains', 'contains',
'does not contain', 'does not contain',
); );


@@ -0,0 +1,40 @@
<?php
namespace ParseCsv\enums;
use ReflectionClass;
abstract class AbstractEnum {
/**
* Creates a new value of some type
*
* @param mixed $value
*
* @throws \UnexpectedValueException if incompatible type is given.
*/
public function __construct($value)
{
if (!$this->isValid($value)) {
throw new \UnexpectedValueException("Value '$value' is not part of the enum " . get_called_class());
}
$this->value = $value;
}
public static function getConstants(){
$class = get_called_class();
$reflection = new \ReflectionClass($class);
return $reflection->getConstants();
}
/**
* Check if enum value is valid
*
* @param $value
*
* @return bool
*/
public static function isValid($value)
{
return in_array($value, static::getConstants(), true);
}
}
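Note on AbstractEnum: validation works by reflecting over the constants of whichever subclass the static call was made on. Here is a minimal standalone sketch of the same pattern. `SketchEnum` and `ColorEnum` are hypothetical classes used only for illustration.

```php
<?php
// Hypothetical classes for illustration; they mirror the pattern above
// but are not the library's own code.
abstract class SketchEnum {
    // Collect the constants of whichever subclass the call was made on.
    public static function getConstants() {
        $reflection = new ReflectionClass(get_called_class());
        return $reflection->getConstants();
    }

    // Strict comparison: the value must match a constant in type and value.
    public static function isValid($value) {
        return in_array($value, static::getConstants(), true);
    }
}

final class ColorEnum extends SketchEnum {
    const RED = 'red';
    const GREEN = 'green';
}

var_dump(ColorEnum::isValid('red'));
var_dump(ColorEnum::isValid('blue'));
```

Late static binding (`static::` and get_called_class()) is what lets one parent method serve every enum subclass without each of them repeating the lookup.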


@@ -9,7 +9,7 @@ namespace ParseCsv\enums;
* *
* todo: needs a basic parent enum class for error handling. * todo: needs a basic parent enum class for error handling.
*/ */
class DatatypeEnum { class DatatypeEnum extends AbstractEnum {
const __DEFAULT = self::TYPE_STRING; const __DEFAULT = self::TYPE_STRING;


@@ -0,0 +1,28 @@
<?php
namespace ParseCsv\enums;
/**
* Class FileProcessingModeEnum
*
* @package ParseCsv\enums
*
* todo extends a basic enum class after merging #121
*/
class FileProcessingModeEnum {
const __default = self::MODE_FILE_OVERWRITE;
const MODE_FILE_APPEND = true;
const MODE_FILE_OVERWRITE = false;
public static function getAppendMode($mode) {
if ($mode == self::MODE_FILE_APPEND){
return 'ab';
}
return 'wb';
}
}

28
src/enums/SortEnum.php Normal file

@@ -0,0 +1,28 @@
<?php
namespace ParseCsv\enums;
class SortEnum extends AbstractEnum {
const __DEFAULT = self::SORT_TYPE_REGULAR;
const SORT_TYPE_REGULAR = 'regular';
const SORT_TYPE_NUMERIC = 'numeric';
const SORT_TYPE_STRING = 'string';
private static $sorting = array(
self::SORT_TYPE_REGULAR => SORT_REGULAR,
self::SORT_TYPE_STRING => SORT_STRING,
self::SORT_TYPE_NUMERIC => SORT_NUMERIC
);
public static function getSorting($type){
if (array_key_exists($type, self::$sorting)){
return self::$sorting[$type];
}
return self::$sorting[self::__DEFAULT];
}
}
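Note on SortEnum: the class maps the user-facing strings "regular", "numeric" and "string" to PHP's sort flags and falls back to the regular flag for anything else. The standalone sketch below shows how that flag changes the result of ksort(). `sortFlagFor()` and the sample keys are hypothetical.

```php
<?php
// Hypothetical lookup mirroring the idea of SortEnum::getSorting().
function sortFlagFor($type) {
    $flags = array(
        'regular' => SORT_REGULAR,
        'numeric' => SORT_NUMERIC,
        'string'  => SORT_STRING,
    );
    // Unknown types fall back to the regular comparison.
    return array_key_exists($type, $flags) ? $flags[$type] : SORT_REGULAR;
}

// Rows keyed by the value of the column being sorted on.
$rows = array('10' => 'ten', '9' => 'nine', '100' => 'hundred');

$numeric = $rows;
ksort($numeric, sortFlagFor('numeric')); // compares keys as numbers

$string = $rows;
ksort($string, sortFlagFor('string'));   // compares keys as strings

print_r(array_keys($numeric));
print_r(array_keys($string));
```

Keeping the mapping in one place means a misspelled sort type degrades to the default instead of silently selecting the wrong flag.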


@@ -2,6 +2,8 @@
namespace ParseCsv\extensions; namespace ParseCsv\extensions;
use ParseCsv\enums\DatatypeEnum;
trait DatatypeTrait { trait DatatypeTrait {
/** /**
@@ -47,7 +49,7 @@ trait DatatypeTrait {
* *
* @access public * @access public
* *
* @uses getDatatypeFromString * @uses DatatypeEnum::getValidTypeFromSample
* *
* @return array|bool * @return array|bool
*/ */
@@ -62,7 +64,7 @@ trait DatatypeTrait {
$result = []; $result = [];
foreach ($this->titles as $cName) { foreach ($this->titles as $cName) {
$column = array_column($this->data, $cName); $column = array_column($this->data, $cName);
$cDatatypes = array_map('ParseCsv\enums\DatatypeEnum::getValidTypeFromSample', $column); $cDatatypes = array_map(DatatypeEnum::class . '::getValidTypeFromSample', $column);
$result[$cName] = $this->getMostFrequentDatatypeForColumn($cDatatypes); $result[$cName] = $this->getMostFrequentDatatypeForColumn($cDatatypes);
} }
@@ -71,4 +73,41 @@ trait DatatypeTrait {
return !empty($this->data_types) ? $this->data_types : []; return !empty($this->data_types) ? $this->data_types : [];
} }
/**
* Check data type of titles / first row for auto detecting if this could be
* a heading line.
*
* Requires PHP >= 5.5
*
* @access public
*
* @uses DatatypeEnum::getValidTypeFromSample
*
* @return bool
*/
public function autoDetectFileHasHeading(){
if (empty($this->data)){
throw new \UnexpectedValueException('No data set yet.');
}
if ($this->heading){
$firstRow = $this->titles;
} else {
$firstRow = $this->data[0];
}
$firstRow = array_filter($firstRow);
if (empty($firstRow)){
return false;
}
$firstRowDatatype = array_map(DatatypeEnum::class . '::getValidTypeFromSample', $firstRow);
if ($this->getMostFrequentDatatypeForColumn($firstRowDatatype) !== DatatypeEnum::TYPE_STRING){
return false;
}
return true;
}
} }
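Note on autoDetectFileHasHeading(): the method guesses that the first row is a heading when the most frequent data type among its non-empty cells is string. The sketch below is a simplified standalone version of that heuristic. `looksLikeHeading()` is hypothetical; it uses is_numeric() as a stand-in for DatatypeEnum::getValidTypeFromSample (which is not shown in this diff), and it keeps "0" as a numeric cell where the array_filter() call above would discard it.

```php
<?php
// Hypothetical heuristic (not the library's implementation): a first row
// is treated as a heading when most of its non-empty cells are non-numeric.
function looksLikeHeading(array $firstRow) {
    // Ignore empty or whitespace-only cells.
    $cells = array();
    foreach ($firstRow as $cell) {
        if (trim((string) $cell) !== '') {
            $cells[] = $cell;
        }
    }
    if (empty($cells)) {
        return false;
    }
    // Count the cells that do not look like numbers.
    $textCells = 0;
    foreach ($cells as $cell) {
        if (!is_numeric($cell)) {
            $textCells++;
        }
    }
    // Require a strict majority of text cells.
    return $textCells > count($cells) / 2;
}

var_dump(looksLikeHeading(array('id', 'name', 'price')));
var_dump(looksLikeHeading(array('1', '2.5', 'x')));
```

A heuristic like this cannot tell a heading from a data row that consists only of text, so it is best treated as a hint rather than a guarantee.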


@@ -0,0 +1,76 @@
<?php
namespace ParseCsv\tests\methods;
use ParseCsv\Csv;
use PHPUnit\Framework\TestCase;
class DataRowCountTest extends TestCase {
/**
* CSV
* The CSV object
*
* @access protected
* @var Csv
*/
protected $csv;
/**
* Setup
* Setup our test environment objects
*
* @access public
*/
public function setUp() {
$this->csv = new Csv();
}
public function countRowsProvider() {
return [
'auto-double-enclosure' => [
'auto-double-enclosure.csv',
2,
],
'auto-single-enclosure' => [
'auto-single-enclosure.csv',
2,
],
'datatype' => [
'datatype.csv',
3,
],
];
}
/**
* @dataProvider countRowsProvider
*
* @param string $file
* @param int $expectedRows
*/
public function testGetTotalRowCountFromFile($file, $expectedRows) {
$this->csv->heading = true;
$this->csv->load_data(__DIR__ . '/fixtures/' . $file);
$this->assertEquals($expectedRows, $this->csv->getTotalDataRowCount());
}
public function testGetTotalRowCountMissingEndingLineBreak() {
$this->csv->heading = false;
$this->csv->enclosure = '"';
$sInput = "86545235689,a\r\n34365587654,b\r\n13469874576,\"c\r\nd\"";
$this->csv->load_data($sInput);
$this->assertEquals(3, $this->csv->getTotalDataRowCount());
}
public function testGetTotalRowCountSingleEnclosure() {
$this->csv->heading = false;
$this->csv->enclosure = "'";
$sInput = "86545235689,a\r\n34365587654,b\r\n13469874576,'c\r\nd'";
$this->csv->load_data($sInput);
$this->assertEquals(3, $this->csv->getTotalDataRowCount());
}
}
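
What these tests pin down is that a line break inside an enclosed field must not start a new row, and that a missing final line break must not drop the last row. The sketch below illustrates that counting rule only; `countCsvRecords()` is a hypothetical helper, not how `getTotalDataRowCount()` is implemented, and it deliberately skips blank lines and ignores the `heading` setting.

```php
<?php
// Hypothetical helper, not part of ParseCsv: counts records in a CSV
// string, ignoring line breaks that occur inside an enclosed field.
function countCsvRecords($input, $enclosure = '"') {
    $rows = 0;
    $inEnclosure = false;
    $rowHasContent = false;
    $length = strlen($input);
    for ($i = 0; $i < $length; $i++) {
        $char = $input[$i];
        if ($char === $enclosure) {
            // A doubled (escaped) enclosure toggles twice, so the state
            // after it is unchanged.
            $inEnclosure = !$inEnclosure;
            $rowHasContent = true;
        } elseif (!$inEnclosure && ($char === "\r" || $char === "\n")) {
            // End of a record; the "\n" of a "\r\n" pair sees no content
            // and is therefore not counted a second time.
            if ($rowHasContent) {
                $rows++;
            }
            $rowHasContent = false;
        } else {
            $rowHasContent = true;
        }
    }
    // Count the final record when the input lacks a trailing line break.
    return $rowHasContent ? $rows + 1 : $rows;
}
```

Passing `"'"` as the second argument covers the single-enclosure case in the same way.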

@@ -85,7 +85,8 @@ class ParseTest extends TestCase {
$this->csv->enclosure = '"'; $this->csv->enclosure = '"';
$sInput = "86545235689,a\r\n34365587654,b\r\n13469874576,\"c\r\nd\""; $sInput = "86545235689,a\r\n34365587654,b\r\n13469874576,\"c\r\nd\"";
$expected_data = [86545235689, 34365587654, 13469874576]; $expected_data = [86545235689, 34365587654, 13469874576];
$actual_data = $this->csv->parse_string($sInput);
$actual_data = $this->invokeMethod($this->csv, 'parse_string', array($sInput));
$actual_column = array_map('reset', $actual_data); $actual_column = array_map('reset', $actual_data);
$this->assertEquals($expected_data, $actual_column); $this->assertEquals($expected_data, $actual_column);
$this->assertEquals([ $this->assertEquals([
@@ -153,6 +154,34 @@ class ParseTest extends TestCase {
$this->assertEquals($expected, $this->csv->data_types); $this->assertEquals($expected, $this->csv->data_types);
} }
/**
* @depends testSepRowAutoDetection
*/
public function testAutoDetectFileHasHeading(){
if (!function_exists('array_column')) {
// getDatatypes requires array_column, but that
// function is only available in PHP >= 5.5
$this->markTestSkipped('array_column() requires PHP >= 5.5');
}
$this->csv->auto(__DIR__ . '/fixtures/datatype.csv');
$this->assertTrue($this->csv->autoDetectFileHasHeading());
$this->csv->heading = false;
$this->csv->auto(__DIR__ . '/fixtures/datatype.csv');
$this->assertTrue($this->csv->autoDetectFileHasHeading());
$this->csv->heading = false;
$sInput = "86545235689\r\n34365587654\r\n13469874576";
$this->csv->auto($sInput);
$this->assertFalse($this->csv->autoDetectFileHasHeading());
$this->csv->heading = true;
$sInput = "86545235689\r\n34365587654\r\n13469874576";
$this->csv->auto($sInput);
$this->assertFalse($this->csv->autoDetectFileHasHeading());
}
protected function _get_magazines_data() { protected function _get_magazines_data() {
return [ return [
[ [
@@ -194,4 +223,22 @@ class ParseTest extends TestCase {
$this->assertArrayHasKey('column1', $csv->data[0], 'Data parsed incorrectly with enclosure ' . $enclosure); $this->assertArrayHasKey('column1', $csv->data[0], 'Data parsed incorrectly with enclosure ' . $enclosure);
$this->assertEquals('value1', $csv->data[0]['column1'], 'Data parsed incorrectly with enclosure ' . $enclosure); $this->assertEquals('value1', $csv->data[0]['column1'], 'Data parsed incorrectly with enclosure ' . $enclosure);
} }
/**
* Call protected/private method of a class.
*
* @param object &$object Instantiated object that we will run method on.
* @param string $methodName Method name to call
* @param array $parameters Array of parameters to pass into method.
*
* @return mixed Method return.
*/
private function invokeMethod(&$object, $methodName, array $parameters = array())
{
$reflection = new \ReflectionClass(get_class($object));
$method = $reflection->getMethod($methodName);
$method->setAccessible(true);
return $method->invokeArgs($object, $parameters);
}
} }
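
The reflection technique behind `invokeMethod()` works outside PHPUnit as well. A minimal, self-contained sketch follows; the `Greeter` class and the `invokeHidden()` function are hypothetical names used only for illustration.

```php
<?php
// Hypothetical class with a protected method, for illustration only.
class Greeter {
    protected function greet($name) {
        return 'Hello ' . $name;
    }
}

// Calls a protected or private method on $object via reflection.
function invokeHidden($object, $methodName, array $parameters = array()) {
    $method = new ReflectionMethod($object, $methodName);
    // setAccessible() is only needed before PHP 8.1; from 8.1 on,
    // reflection can invoke non-public methods without it.
    if (PHP_VERSION_ID < 80100) {
        $method->setAccessible(true);
    }
    return $method->invokeArgs($object, $parameters);
}
```

Calling `invokeHidden(new Greeter(), 'greet', array('CSV'))` returns the string `Hello CSV`. Tests that reach into protected methods this way are coupled to internals, so it is usually reserved for cases like this one, where a formerly public method became protected.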

@@ -49,6 +49,13 @@ class SaveTest extends TestCase
$this->saveAndCompare($expected); $this->saveAndCompare($expected);
} }
public function testSaveWithNewHeader() {
$this->csv->linefeed = "\n";
$this->csv->titles = array("NewTitle");
$expected = "NewTitle\n0444\n5555\n";
$this->saveAndCompare($expected);
}
public function testSaveWithoutHeader() { public function testSaveWithoutHeader() {
$this->csv->linefeed = "\n"; $this->csv->linefeed = "\n";
$this->csv->heading = false; $this->csv->heading = false;

@@ -0,0 +1,62 @@
<?php
namespace ParseCsv\tests\methods;
use ParseCsv\Csv;
use PHPUnit\Framework\TestCase;
class UnparseTest extends TestCase {
/** @var Csv */
private $csv;
/**
* Setup our test environment objects; will be called before each test.
*/
public function setUp() {
$this->csv = new Csv();
$this->csv->auto(__DIR__ . '/fixtures/auto-double-enclosure.csv');
}
public function testUnparseDefault() {
$expected = "column1,column2\rvalue1,value2\rvalue3,value4\r";
$this->unparseAndCompare($expected);
}
public function testUnparseDefaultWithoutHeading(){
$this->csv->heading = false;
$this->csv->auto(__DIR__ . '/fixtures/auto-double-enclosure.csv');
$expected = "column1,column2\rvalue1,value2\rvalue3,value4\r";
$this->unparseAndCompare($expected);
}
public function testUnparseRenameFields() {
$expected = "C1,C2\rvalue1,value2\rvalue3,value4\r";
$this->unparseAndCompare($expected, array("C1", "C2"));
}
public function testReorderFields() {
$expected = "column2,column1\rvalue2,value1\rvalue4,value3\r";
$this->unparseAndCompare($expected, array("column2", "column1"));
}
public function testSubsetFields() {
$expected = "column1\rvalue1\rvalue3\r";
$this->unparseAndCompare($expected, array("column1"));
}
public function testReorderAndRenameFields() {
$fields = array(
'column2' => 'C2',
'column1' => 'C1',
);
$expected = "C2,C1\rvalue2,value1\rvalue4,value3\r";
$this->unparseAndCompare($expected, $fields);
}
private function unparseAndCompare($expected, $fields = array()) {
$str = $this->csv->unparse($this->csv->data, $fields);
$this->assertEquals($expected, $str);
}
}
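
The reorder-and-rename case in `testReorderAndRenameFields` uses an associative `$fields` array: the keys select and order the source columns, the values become the new titles. The sketch below reproduces just that mapping. `unparseWithMap()` is a hypothetical function, not the library's `unparse()`, and it assumes no quoting or escaping is needed and uses `"\r"` as line ending to match the expected strings in the tests.

```php
<?php
// Hypothetical helper: serialises $rows using an associative field map
// of the form sourceColumn => outputTitle. No enclosure handling.
function unparseWithMap(array $rows, array $fields, $delimiter = ',', $linefeed = "\r") {
    // Header line: the new titles, in the order given by the map.
    $lines = array(implode($delimiter, array_values($fields)));
    foreach ($rows as $row) {
        $cells = array();
        foreach (array_keys($fields) as $source) {
            // Missing source columns become empty cells.
            $cells[] = isset($row[$source]) ? $row[$source] : '';
        }
        $lines[] = implode($delimiter, $cells);
    }
    return implode($linefeed, $lines) . $linefeed;
}
```

Because column order comes from the map rather than from the data, the same function also covers the subset case: leaving a column out of `$fields` drops it from the output.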

@@ -57,7 +57,7 @@ class DefaultValuesPropertiesTest extends TestCase {
} }
public function test_sort_type_default() { public function test_sort_type_default() {
$this->assertNull($this->csv->sort_type); $this->assertEquals('regular', $this->csv->sort_type);
} }
public function test_delimiter_default() { public function test_delimiter_default() {

@@ -3,6 +3,7 @@
namespace ParseCsv\tests\properties; namespace ParseCsv\tests\properties;
use ParseCsv\Csv; use ParseCsv\Csv;
use ParseCsv\enums\SortEnum;
use PHPUnit\Framework\TestCase; use PHPUnit\Framework\TestCase;
class PublicPropertiesTest extends TestCase { class PublicPropertiesTest extends TestCase {
@@ -145,4 +146,26 @@ class PublicPropertiesTest extends TestCase {
$this->assertCount($counter, $this->properties); $this->assertCount($counter, $this->properties);
} }
public function testDefaultSortTypeIsRegular(){
$this->assertEquals(SortEnum::SORT_TYPE_REGULAR, $this->csv->sort_type);
}
public function testSetSortType(){
$this->csv->sort_type = 'numeric';
$this->assertEquals(SortEnum::SORT_TYPE_NUMERIC, $this->csv->sort_type);
$this->csv->sort_type = 'string';
$this->assertEquals(SortEnum::SORT_TYPE_STRING, $this->csv->sort_type);
}
public function testGetSorting(){
$this->csv->sort_type = 'numeric';
$sorting = SortEnum::getSorting($this->csv->sort_type);
$this->assertEquals(SORT_NUMERIC, $sorting);
$this->csv->sort_type = 'string';
$sorting = SortEnum::getSorting($this->csv->sort_type);
$this->assertEquals(SORT_STRING, $sorting);
}
} }