the whole shebang

2014-11-25 16:42:40 +01:00
parent 7f74c0613e
commit ab1334c0cf
3686 changed files with 496409 additions and 1 deletions

vendor/patchwork/utf8/README.md

@@ -0,0 +1,125 @@
Patchwork UTF-8
===============
Patchwork UTF-8 provides both:
- a portability layer for Unicode handling in PHP, and
- a class that mirrors the quasi-complete set of native string functions,
  enhanced with UTF-8 [grapheme cluster](http://unicode.org/reports/tr29/)
  awareness.
It can also serve as a documentation source referencing the practical problems
that arise when handling UTF-8 in PHP: Unicode concepts, related algorithms,
bugs in PHP core, workarounds, etc.
Portability
-----------
Unicode handling in PHP is best performed using a combo of `mbstring`, `iconv`,
`intl` and `pcre` with the `u` flag enabled. But when an application is expected
to run on many servers, you should be aware that these 4 extensions are not
always enabled.
Patchwork UTF-8 provides pure PHP implementations for 3 of those 4 extensions.
Here is the set of portability-fallbacks that are currently implemented:
- *utf8_encode, utf8_decode*,
- `mbstring`: *mb_convert_encoding, mb_decode_mimeheader, mb_encode_mimeheader,
mb_convert_case, mb_internal_encoding, mb_list_encodings, mb_strlen,
mb_strpos, mb_strrpos, mb_strtolower, mb_strtoupper, mb_substitute_character,
mb_substr, mb_stripos, mb_stristr, mb_strrchr, mb_strrichr, mb_strripos,
mb_strstr*,
- `iconv`: *iconv, iconv_mime_decode, iconv_mime_decode_headers,
iconv_get_encoding, iconv_set_encoding, iconv_mime_encode, ob_iconv_handler,
iconv_strlen, iconv_strpos, iconv_strrpos, iconv_substr*,
- `intl`: *Normalizer, grapheme_extract, grapheme_stripos, grapheme_stristr,
grapheme_strlen, grapheme_strpos, grapheme_strripos, grapheme_strrpos,
grapheme_strstr, grapheme_substr*.
`pcre` compiled with Unicode support is required.
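
To see which of these extensions a given server actually provides, a small check along the following lines may help. This is only a sketch: `unicode_extension_report()` is a hypothetical helper written for this example and is not part of Patchwork UTF-8. It relies solely on PHP's built-in `extension_loaded()` and `preg_match()`.

```php
<?php
// Hypothetical helper (not part of Patchwork UTF-8): reports which of the
// extensions named above are loaded in the current PHP runtime.
function unicode_extension_report()
{
    $report = array();

    foreach (array('mbstring', 'iconv', 'intl', 'pcre') as $ext) {
        $report[$ext] = extension_loaded($ext);
    }

    // PCRE may be loaded but built without Unicode support. In that case a
    // pattern using the `u` flag fails to compile and preg_match() does not
    // return 1 for this two-byte, single-character subject.
    $report['pcre_unicode'] = 1 === @preg_match('/^.$/u', "\xC3\xA9");

    return $report;
}

foreach (unicode_extension_report() as $name => $ok) {
    echo str_pad($name, 14), $ok ? 'available' : 'missing', "\n";
}
```

If `pcre_unicode` is reported as missing, the fallbacks listed above cannot be relied upon, since they depend on the `u` flag.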
Patchwork\Utf8
--------------
[Grapheme clusters](http://unicode.org/reports/tr29/) should always be
considered when working with generic Unicode strings. The `Patchwork\Utf8`
class implements the quasi-complete set of native string functions that need
UTF-8 grapheme cluster awareness. Function names, arguments and behavior
carefully replicate native PHP string functions so that usage is very easy.
Some more functions are also provided to help handle UTF-8 strings:
- *isUtf8()*: checks if a string contains well-formed UTF-8 data,
- *toAscii()*: generic UTF-8 to ASCII transliteration,
- *strtocasefold()*: Unicode transformation for caseless matching,
- *strtonatfold()*: generic case-sensitive transformation for collation matching.
Mirrored string functions are:
*strlen, substr, strpos, stripos, strrpos, strripos, strstr, stristr, strrchr,
strrichr, strtolower, strtoupper, wordwrap, chr, count_chars, ltrim, ord, rtrim,
trim, str_ireplace, str_pad, str_shuffle, str_split, str_word_count, strcmp,
strnatcmp, strcasecmp, strnatcasecmp, strncasecmp, strncmp, strcspn, strpbrk,
strrev, strspn, strtr, substr_compare, substr_count, substr_replace, ucfirst,
lcfirst, ucwords, number_format, utf8_encode, utf8_decode*.
Missing are *printf*-family functions.
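
To illustrate why grapheme clusters matter, the following sketch counts the same string in three ways using only native PHP functions. It does not use the `Patchwork\Utf8` class itself. The sample string is an assumption chosen for the example: the letter "e" followed by U+0301 COMBINING ACUTE ACCENT, which readers perceive as the single character "é".

```php
<?php
// "e" followed by U+0301 COMBINING ACUTE ACCENT, written as raw UTF-8 bytes.
$s = "e\xCC\x81";

// strlen() counts bytes.
$bytes = strlen($s);

// With the `u` flag, "." matches one code point ("s" lets it match newlines too).
$codePoints = preg_match_all('/./us', $s, $m);

// \X matches one extended grapheme cluster.
$graphemes = preg_match_all('/\X/u', $s, $m);

printf("bytes=%d code points=%d graphemes=%d\n", $bytes, $codePoints, $graphemes);
// prints "bytes=3 code points=2 graphemes=1"
```

A byte-oriented or code-point-oriented `substr()` could split this string between the letter and its accent, which is the kind of problem grapheme-aware functions are meant to avoid.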
Usage
-----
The recommended way to install Patchwork UTF-8 is [through
composer](http://getcomposer.org). Just create a `composer.json` file and run
the `php composer.phar install` command to install it:
    {
        "require": {
            "patchwork/utf8": "1.1.*"
        }
    }
Then, early in your bootstrap sequence, you have to configure your environment:
```php
\Patchwork\Utf8\Bootup::initAll(); // Enables the portability layer and configures PHP for UTF-8
\Patchwork\Utf8\Bootup::filterRequestUri(); // Redirects to a UTF-8 encoded URL if it's not already the case
\Patchwork\Utf8\Bootup::filterRequestInputs(); // Sanitizes HTTP inputs to UTF-8 NFC
```
Run `phpunit` in the `tests/` directory to see the code in action.
Make sure that you are confident about using UTF-8 by reading
[Character Sets / Character Encoding Issues](http://www.phpwact.org/php/i18n/charsets)
and [Handling UTF-8 with PHP](http://www.phpwact.org/php/i18n/utf-8),
or [PHP et UTF-8](http://julp.lescigales.org/articles/3-php-et-utf-8.html) for French readers.
You should also get familiar with the concept of
[Unicode Normalization](http://en.wikipedia.org/wiki/Unicode_equivalence) and
[Grapheme Clusters](http://unicode.org/reports/tr29/).
Do not blindly replace all use of PHP's string functions. Most of the time you
will not need to, and you will be introducing a significant performance overhead
to your application.
Screen your input on the *outer perimeter* so that only well-formed UTF-8 passes
through. When dealing with badly formed UTF-8, you should not try to fix it.
Instead, consider it as ISO-8859-1 and use `utf8_encode()` to get a UTF-8
string. Also, don't forget to choose one Unicode normalization form and stick to
it. NFC is the most widely used today.
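
The screening advice above could be sketched as follows. `screen_input()` is a hypothetical helper written for this example, not a function of the library (the library's own entry point for this is `Bootup::filterRequestInputs()`). As an assumption for portability, the sketch prefers `mb_convert_encoding()` when available, because `utf8_encode()` is deprecated in recent PHP versions, and it normalizes only when a `Normalizer` class is available.

```php
<?php
// Hypothetical helper (not part of Patchwork UTF-8): returns a UTF-8 string,
// normalized to NFC when a Normalizer class is available.
function screen_input($s)
{
    $s = (string) $s;

    // An empty pattern with the `u` flag matches only well-formed UTF-8.
    if (1 !== preg_match('//u', $s)) {
        // Badly formed UTF-8 is not repaired: the bytes are read as ISO-8859-1.
        $s = function_exists('mb_convert_encoding')
            ? mb_convert_encoding($s, 'UTF-8', 'ISO-8859-1')
            : utf8_encode($s);
    }

    // Stick to one normalization form; NFC is used here.
    if (class_exists('Normalizer')) {
        $nfc = Normalizer::normalize($s, Normalizer::FORM_C);
        if (false !== $nfc) {
            $s = $nfc;
        }
    }

    return $s;
}

// "caf" followed by the single ISO-8859-1 byte for "é".
echo bin2hex(screen_input("caf\xE9")), "\n";
```

Applying such a filter once, at the point where data enters the application, means inner code can assume well-formed UTF-8 and does not need to re-validate.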
This library is incompatible with `mbstring.func_overload` and will not work if
that php.ini setting is enabled.
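
A bootstrap script could guard against this setting with a check such as the one below. This is a sketch only: `mbstring_overload_enabled()` is a hypothetical helper, not part of the library. It uses the built-in `ini_get()`, which returns `false` when the setting does not exist (for example when `mbstring` is not loaded, or on PHP versions where the setting has been removed).

```php
<?php
// Hypothetical guard (not part of Patchwork UTF-8): tells whether
// mbstring.func_overload replaces any native string function.
function mbstring_overload_enabled()
{
    // A missing setting yields false, which casts to 0 (overloading off).
    return 0 !== (int) ini_get('mbstring.func_overload');
}

if (mbstring_overload_enabled()) {
    echo "mbstring.func_overload is enabled; disable it in php.ini first.\n";
}
```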
Licensing
---------
Patchwork\Utf8 is free software; you can redistribute it and/or modify it under
the terms of the (at your option):
- [Apache License v2.0](http://apache.org/licenses/LICENSE-2.0.txt), or
- [GNU General Public License v2.0](http://gnu.org/licenses/gpl-2.0.txt).
Unicode handling requires tedious work to be implemented and maintained in the
long run. As such, contributions such as unit tests, bug reports, comments or
patches licensed under both licenses are very welcome.
I hope many projects will adopt this code and together help solve the Unicode
subject for PHP.


@@ -0,0 +1,17 @@
<?php // vi: set fenc=utf-8 ts=4 sw=4 et:
/*
* Copyright (C) 2013 Nicolas Grekas - p@tchwork.com
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the (at your option):
* Apache License v2.0 (http://apache.org/licenses/LICENSE-2.0.txt), or
* GNU General Public License v2.0 (http://gnu.org/licenses/gpl-2.0.txt).
*/
/**
* Normalizer plugs Patchwork\PHP\Shim\Normalizer as a PHP implementation
* of intl's Normalizer when the intl extension is not enabled.
*/
class Normalizer extends Patchwork\PHP\Shim\Normalizer
{
}


@@ -0,0 +1,645 @@
<?php // vi: set fenc=utf-8 ts=4 sw=4 et:
/*
* Copyright (C) 2013 Nicolas Grekas - p@tchwork.com
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the (at your option):
* Apache License v2.0 (http://apache.org/licenses/LICENSE-2.0.txt), or
* GNU General Public License v2.0 (http://gnu.org/licenses/gpl-2.0.txt).
*/
namespace Patchwork\PHP\Shim;
/**
* iconv implementation in pure PHP, UTF-8 centric.
*
* Implemented:
* - iconv - Convert string to requested character encoding
* - iconv_mime_decode - Decodes a MIME header field
* - iconv_mime_decode_headers - Decodes multiple MIME header fields at once
* - iconv_get_encoding - Retrieve internal configuration variables of iconv extension
* - iconv_set_encoding - Set current setting for character encoding conversion
* - iconv_mime_encode - Composes a MIME header field
* - ob_iconv_handler - Convert character encoding as output buffer handler
* - iconv_strlen - Returns the character count of string
* - iconv_strpos - Finds position of first occurrence of a needle within a haystack
* - iconv_strrpos - Finds the last occurrence of a needle within a haystack
* - iconv_substr - Cut out part of a string
*
* Charsets available for conversion are defined by files
* in the charset/ directory and by Iconv::$alias below.
* You're welcome to send back any addition you make.
*/
class Iconv
{
const
ERROR_ILLEGAL_CHARACTER = 'iconv(): Detected an illegal character in input string',
ERROR_WRONG_CHARSET = 'iconv(): Wrong charset, conversion from `%s\' to `%s\' is not allowed';
protected static
$input_encoding = 'utf-8',
$output_encoding = 'utf-8',
$internal_encoding = 'utf-8',
$alias = array(
'utf8' => 'utf-8',
'ascii' => 'us-ascii',
'tis-620' => 'iso-8859-11',
'cp1250' => 'windows-1250',
'cp1251' => 'windows-1251',
'cp1252' => 'windows-1252',
'cp1253' => 'windows-1253',
'cp1254' => 'windows-1254',
'cp1255' => 'windows-1255',
'cp1256' => 'windows-1256',
'cp1257' => 'windows-1257',
'cp1258' => 'windows-1258',
'shift-jis' => 'cp932',
'shift_jis' => 'cp932',
'latin1' => 'iso-8859-1',
'latin2' => 'iso-8859-2',
'latin3' => 'iso-8859-3',
'latin4' => 'iso-8859-4',
'latin5' => 'iso-8859-9',
'latin6' => 'iso-8859-10',
'latin7' => 'iso-8859-13',
'latin8' => 'iso-8859-14',
'latin9' => 'iso-8859-15',
'latin10' => 'iso-8859-16',
'iso8859-1' => 'iso-8859-1',
'iso8859-2' => 'iso-8859-2',
'iso8859-3' => 'iso-8859-3',
'iso8859-4' => 'iso-8859-4',
'iso8859-5' => 'iso-8859-5',
'iso8859-6' => 'iso-8859-6',
'iso8859-7' => 'iso-8859-7',
'iso8859-8' => 'iso-8859-8',
'iso8859-9' => 'iso-8859-9',
'iso8859-10' => 'iso-8859-10',
'iso8859-11' => 'iso-8859-11',
'iso8859-12' => 'iso-8859-12',
'iso8859-13' => 'iso-8859-13',
'iso8859-14' => 'iso-8859-14',
'iso8859-15' => 'iso-8859-15',
'iso8859-16' => 'iso-8859-16',
'iso_8859-1' => 'iso-8859-1',
'iso_8859-2' => 'iso-8859-2',
'iso_8859-3' => 'iso-8859-3',
'iso_8859-4' => 'iso-8859-4',
'iso_8859-5' => 'iso-8859-5',
'iso_8859-6' => 'iso-8859-6',
'iso_8859-7' => 'iso-8859-7',
'iso_8859-8' => 'iso-8859-8',
'iso_8859-9' => 'iso-8859-9',
'iso_8859-10' => 'iso-8859-10',
'iso_8859-11' => 'iso-8859-11',
'iso_8859-12' => 'iso-8859-12',
'iso_8859-13' => 'iso-8859-13',
'iso_8859-14' => 'iso-8859-14',
'iso_8859-15' => 'iso-8859-15',
'iso_8859-16' => 'iso-8859-16',
'iso88591' => 'iso-8859-1',
'iso88592' => 'iso-8859-2',
'iso88593' => 'iso-8859-3',
'iso88594' => 'iso-8859-4',
'iso88595' => 'iso-8859-5',
'iso88596' => 'iso-8859-6',
'iso88597' => 'iso-8859-7',
'iso88598' => 'iso-8859-8',
'iso88599' => 'iso-8859-9',
'iso885910' => 'iso-8859-10',
'iso885911' => 'iso-8859-11',
'iso885912' => 'iso-8859-12',
'iso885913' => 'iso-8859-13',
'iso885914' => 'iso-8859-14',
'iso885915' => 'iso-8859-15',
'iso885916' => 'iso-8859-16',
),
$translit_map = array(),
$convert_map = array(),
$error_handler,
$last_error,
$ulen_mask = array("\xC0" => 2, "\xD0" => 2, "\xE0" => 3, "\xF0" => 4),
$is_valid_utf8;
static function iconv($in_charset, $out_charset, $str)
{
if ('' === (string) $str) return '';
// Prepare for //IGNORE and //TRANSLIT
$TRANSLIT = $IGNORE = '';
$out_charset = strtolower($out_charset);
$in_charset = strtolower($in_charset );
'' === $out_charset && $out_charset = 'iso-8859-1';
'' === $in_charset && $in_charset = 'iso-8859-1';
if ('//translit' === substr($out_charset, -10))
{
$TRANSLIT = '//TRANSLIT';
$out_charset = substr($out_charset, 0, -10);
}
if ('//ignore' === substr($out_charset, -8))
{
$IGNORE = '//IGNORE';
$out_charset = substr($out_charset, 0, -8);
}
'//translit' === substr($in_charset, -10) && $in_charset = substr($in_charset, 0, -10);
'//ignore' === substr($in_charset, -8) && $in_charset = substr($in_charset, 0, -8);
isset(self::$alias[ $in_charset]) && $in_charset = self::$alias[ $in_charset];
isset(self::$alias[$out_charset]) && $out_charset = self::$alias[$out_charset];
// Load charset maps
if ( ('utf-8' !== $in_charset && !self::loadMap('from.', $in_charset, $in_map))
|| ('utf-8' !== $out_charset && !self::loadMap( 'to.', $out_charset, $out_map)) )
{
user_error(sprintf(self::ERROR_WRONG_CHARSET, $in_charset, $out_charset));
return false;
}
if ('utf-8' !== $in_charset)
{
// Convert input to UTF-8
$result = '';
if (self::map_to_utf8($result, $in_map, $str, $IGNORE)) $str = $result;
else $str = false;
self::$is_valid_utf8 = true;
}
else
{
self::$is_valid_utf8 = preg_match('//u', $str);
if (!self::$is_valid_utf8 && !$IGNORE)
{
user_error(self::ERROR_ILLEGAL_CHARACTER);
return false;
}
if ('utf-8' === $out_charset)
{
// UTF-8 validation
$str = self::utf8_to_utf8($str, $IGNORE);
}
}
if ('utf-8' !== $out_charset && false !== $str)
{
// Convert from UTF-8 to the output charset
$result = '';
if (self::map_from_utf8($result, $out_map, $str, $IGNORE, $TRANSLIT)) return $result;
else return false;
}
else return $str;
}
static function iconv_mime_decode_headers($str, $mode = 0, $charset = INF)
{
INF === $charset && $charset = self::$internal_encoding;
false !== strpos($str, "\r") && $str = strtr(str_replace("\r\n", "\n", $str), "\r", "\n");
$str = explode("\n\n", $str, 2);
$headers = array();
$str = preg_split('/\n(?![ \t])/', $str[0]);
foreach ($str as $str)
{
$str = self::iconv_mime_decode($str, $mode, $charset);
if (false === $str) return false;
$str = explode(':', $str, 2);
if (2 === count($str))
{
if (isset($headers[$str[0]]))
{
is_array($headers[$str[0]]) || $headers[$str[0]] = array($headers[$str[0]]);
$headers[$str[0]][] = ltrim($str[1]);
}
else $headers[$str[0]] = ltrim($str[1]);
}
}
return $headers;
}
static function iconv_mime_decode($str, $mode = 0, $charset = INF)
{
INF === $charset && $charset = self::$internal_encoding;
if (ICONV_MIME_DECODE_CONTINUE_ON_ERROR & $mode) $charset .= '//IGNORE';
false !== strpos($str, "\r") && $str = strtr(str_replace("\r\n", "\n", $str), "\r", "\n");
$str = preg_split('/\n(?![ \t])/', rtrim($str), 2);
$str = preg_replace('/[ \t]*\n[ \t]+/', ' ', rtrim($str[0]));
$str = preg_split('/=\?([^?]+)\?([bqBQ])\?(.*?)\?=/', $str, -1, PREG_SPLIT_DELIM_CAPTURE);
$result = self::iconv('utf-8', $charset, $str[0]);
if (false === $result) return false;
$i = 1;
$len = count($str);
while ($i < $len)
{
$c = strtolower($str[$i]);
if ( (ICONV_MIME_DECODE_CONTINUE_ON_ERROR & $mode)
&& 'utf-8' !== $c
&& !isset(self::$alias[$c])
&& !self::loadMap('from.', $c, $d) ) $d = false;
else if ('B' === strtoupper($str[$i+1])) $d = base64_decode($str[$i+2]);
else $d = rawurldecode(strtr(str_replace('%', '%25', $str[$i+2]), '=_', '% '));
if (false !== $d)
{
$result .= self::iconv($c, $charset, $d);
$d = self::iconv('utf-8' , $charset, $str[$i+3]);
if ('' !== trim($d)) $result .= $d;
}
else if (ICONV_MIME_DECODE_CONTINUE_ON_ERROR & $mode)
{
$result .= "=?{$str[$i]}?{$str[$i+1]}?{$str[$i+2]}?={$str[$i+3]}";
}
else
{
$result = false;
break;
}
$i += 4;
}
return $result;
}
static function iconv_get_encoding($type = 'all')
{
switch ($type)
{
case 'input_encoding' : return self::$input_encoding;
case 'output_encoding' : return self::$output_encoding;
case 'internal_encoding': return self::$internal_encoding;
}
return array(
'input_encoding' => self::$input_encoding,
'output_encoding' => self::$output_encoding,
'internal_encoding' => self::$internal_encoding
);
}
static function iconv_set_encoding($type, $charset)
{
switch ($type)
{
case 'input_encoding' : self::$input_encoding = $charset; break;
case 'output_encoding' : self::$output_encoding = $charset; break;
case 'internal_encoding': self::$internal_encoding = $charset; break;
default: return false;
}
return true;
}
static function iconv_mime_encode($field_name, $field_value, $pref = INF)
{
is_array($pref) || $pref = array();
$pref += array(
'scheme' => 'B',
'input-charset' => self::$internal_encoding,
'output-charset' => self::$internal_encoding,
'line-length' => 76,
'line-break-chars' => "\r\n"
);
preg_match('/[\x80-\xFF]/', $field_name) && $field_name = '';
$scheme = strtoupper(substr($pref['scheme'], 0, 1));
$in = strtolower($pref['input-charset']);
$out = strtolower($pref['output-charset']);
if ('utf-8' !== $in && false === $field_value = self::iconv($in, 'utf-8', $field_value)) return false;
preg_match_all('/./us', $field_value, $chars);
$chars = isset($chars[0]) ? $chars[0] : array();
$line_break = (int) $pref['line-length'];
$line_start = "=?{$pref['output-charset']}?{$scheme}?";
$line_length = strlen($field_name) + 2 + strlen($line_start) + 2;
$line_offset = strlen($line_start) + 3;
$line_data = '';
$field_value = array();
$Q = 'Q' === $scheme;
foreach ($chars as $c)
{
if ('utf-8' !== $out && false === $c = self::iconv('utf-8', $out, $c)) return false;
$o = $Q
? $c = preg_replace_callback(
'/[=_\?\x00-\x1F\x80-\xFF]/',
array(__CLASS__, 'qp_byte_callback'),
$c
)
: base64_encode($line_data . $c);
if (isset($o[$line_break - $line_length]))
{
$Q || $line_data = base64_encode($line_data);
$field_value[] = $line_start . $line_data . '?=';
$line_length = $line_offset;
$line_data = '';
}
$line_data .= $c;
$Q && $line_length += strlen($c);
}
if ('' !== $line_data)
{
$Q || $line_data = base64_encode($line_data);
$field_value[] = $line_start . $line_data . '?=';
}
return $field_name . ': ' . implode($pref['line-break-chars'] . ' ', $field_value);
}
static function ob_iconv_handler($buffer, $mode)
{
return self::iconv(self::$internal_encoding, self::$output_encoding, $buffer);
}
static function iconv_strlen($s, $encoding = INF)
{
/**/ if (extension_loaded('xml'))
return self::strlen1($s, $encoding);
/**/ else
return self::strlen2($s, $encoding);
}
static function strlen1($s, $encoding = INF)
{
INF === $encoding && $encoding = self::$internal_encoding;
if (0 !== strncasecmp($encoding, 'utf-8', 5) && false === $s = self::iconv($encoding, 'utf-8', $s)) return false;
return strlen(utf8_decode($s));
}
static function strlen2($s, $encoding = INF)
{
INF === $encoding && $encoding = self::$internal_encoding;
if (0 !== strncasecmp($encoding, 'utf-8', 5) && false === $s = self::iconv($encoding, 'utf-8', $s)) return false;
$ulen_mask = self::$ulen_mask;
$i = 0; $j = 0;
$len = strlen($s);
while ($i < $len)
{
$u = $s[$i] & "\xF0";
$i += isset($ulen_mask[$u]) ? $ulen_mask[$u] : 1;
++$j;
}
return $j;
}
static function iconv_strpos($haystack, $needle, $offset = 0, $encoding = INF)
{
INF === $encoding && $encoding = self::$internal_encoding;
if (0 !== strncasecmp($encoding, 'utf-8', 5))
{
if (false === $haystack = self::iconv($encoding, 'utf-8', $haystack)) return false;
if (false === $needle = self::iconv($encoding, 'utf-8', $needle)) return false;
}
if ($offset = (int) $offset) $haystack = self::iconv_substr($haystack, $offset, 2147483647, 'utf-8');
$pos = strpos($haystack, $needle);
return false === $pos ? false : ($offset + ($pos ? self::iconv_strlen(substr($haystack, 0, $pos), 'utf-8') : 0));
}
static function iconv_strrpos($haystack, $needle, $encoding = INF)
{
INF === $encoding && $encoding = self::$internal_encoding;
if (0 !== strncasecmp($encoding, 'utf-8', 5))
{
if (false === $haystack = self::iconv($encoding, 'utf-8', $haystack)) return false;
if (false === $needle = self::iconv($encoding, 'utf-8', $needle)) return false;
}
$pos = isset($needle[0]) ? strrpos($haystack, $needle) : false;
return false === $pos ? false : self::iconv_strlen($pos ? substr($haystack, 0, $pos) : $haystack, 'utf-8');
}
static function iconv_substr($s, $start, $length = 2147483647, $encoding = INF)
{
INF === $encoding && $encoding = self::$internal_encoding;
if (0 === strncasecmp($encoding, 'utf-8', 5)) $encoding = INF;
else if (false === $s = self::iconv($encoding, 'utf-8', $s)) return false;
$slen = self::iconv_strlen($s, 'utf-8');
$start = (int) $start;
if (0 > $start) $start += $slen;
if (0 > $start) return false;
if ($start >= $slen) return false;
$rx = $slen - $start;
if (0 > $length) $length += $rx;
if (0 === $length) return '';
if (0 > $length) return false;
if ($length > $rx) $length = $rx;
$rx = '/^' . ($start ? self::preg_offset($start) : '') . '(' . self::preg_offset($length) . ')/u';
$s = preg_match($rx, $s, $s) ? $s[1] : '';
if (INF === $encoding) return $s;
else return self::iconv('utf-8', $encoding, $s);
}
protected static function loadMap($type, $charset, &$map)
{
if (!isset(self::$convert_map[$type . $charset]))
{
if (false === $map = self::getData($type . $charset))
{
if ('to.' === $type && self::loadMap('from.', $charset, $map)) $map = array_flip($map);
else return false;
}
self::$convert_map[$type . $charset] = $map;
}
else $map = self::$convert_map[$type . $charset];
return true;
}
protected static function utf8_to_utf8($str, $IGNORE)
{
$ulen_mask = self::$ulen_mask;
$valid = self::$is_valid_utf8;
$u = $str;
$i = $j = 0;
$len = strlen($str);
while ($i < $len)
{
if ($str[$i] < "\x80") $u[$j++] = $str[$i++];
else
{
$ulen = $str[$i] & "\xF0";
$ulen = isset($ulen_mask[$ulen]) ? $ulen_mask[$ulen] : 1;
$uchr = substr($str, $i, $ulen);
if (1 === $ulen || !($valid || preg_match('/^.$/us', $uchr)))
{
if ($IGNORE)
{
++$i;
continue;
}
user_error(self::ERROR_ILLEGAL_CHARACTER);
return false;
}
else $i += $ulen;
$u[$j++] = $uchr[0];
isset($uchr[1]) && 0 !== ($u[$j++] = $uchr[1])
&& isset($uchr[2]) && 0 !== ($u[$j++] = $uchr[2])
&& isset($uchr[3]) && 0 !== ($u[$j++] = $uchr[3]);
}
}
return substr($u, 0, $j);
}
protected static function map_to_utf8(&$result, $map, $str, $IGNORE)
{
$len = strlen($str);
for ($i = 0; $i < $len; ++$i)
{
if (isset($str[$i+1], $map[$str[$i] . $str[$i+1]])) $result .= $map[$str[$i] . $str[++$i]];
else if (isset($map[$str[$i]])) $result .= $map[$str[$i]];
else if (!$IGNORE)
{
user_error(self::ERROR_ILLEGAL_CHARACTER);
return false;
}
}
return true;
}
protected static function map_from_utf8(&$result, $map, $str, $IGNORE, $TRANSLIT)
{
$ulen_mask = self::$ulen_mask;
$valid = self::$is_valid_utf8;
if ($TRANSLIT) self::$translit_map or self::$translit_map = self::getData('translit');
$i = 0;
$len = strlen($str);
while ($i < $len)
{
if ($str[$i] < "\x80") $uchr = $str[$i++];
else
{
$ulen = $str[$i] & "\xF0";
$ulen = isset($ulen_mask[$ulen]) ? $ulen_mask[$ulen] : 1;
$uchr = substr($str, $i, $ulen);
if ($IGNORE && (1 === $ulen || !($valid || preg_match('/^.$/us', $uchr))))
{
++$i;
continue;
}
else $i += $ulen;
}
if (isset($map[$uchr]))
{
$result .= $map[$uchr];
}
else if ($TRANSLIT)
{
if (isset(self::$translit_map[$uchr]))
{
$uchr = self::$translit_map[$uchr];
}
else if ($uchr >= "\xC3\x80")
{
$uchr = \Normalizer::normalize($uchr, \Normalizer::NFD);
$uchr = preg_split('/(.)/', $uchr, 2, PREG_SPLIT_DELIM_CAPTURE);
if (isset($uchr[2][0])) $uchr = $uchr[1];
else if ($IGNORE) continue;
else return false;
}
$str = $uchr . substr($str, $i);
$len = strlen($str);
$i = 0;
}
else if (!$IGNORE)
{
return false;
}
}
return true;
}
protected static function qp_byte_callback($m)
{
return '=' . strtoupper(dechex(ord($m[0])));
}
protected static function preg_offset($offset)
{
$rx = array();
$offset = (int) $offset;
while ($offset > 65535)
{
$rx[] = '.{65535}';
$offset -= 65535;
}
return implode('', $rx) . '.{' . $offset . '}';
}
protected static function getData($file)
{
$file = __DIR__ . '/charset/' . $file . '.ser';
if (file_exists($file)) return unserialize(file_get_contents($file));
else return false;
}
}


@@ -0,0 +1,139 @@
<?php // vi: set fenc=utf-8 ts=4 sw=4 et:
/*
* Copyright (C) 2013 Nicolas Grekas - p@tchwork.com
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the (at your option):
* Apache License v2.0 (http://apache.org/licenses/LICENSE-2.0.txt), or
* GNU General Public License v2.0 (http://gnu.org/licenses/gpl-2.0.txt).
*/
namespace Patchwork\PHP\Shim;
/**
* Partial intl implementation in pure PHP.
*
* Implemented:
* - grapheme_extract - Extract a sequence of grapheme clusters from a text buffer, which must be encoded in UTF-8
* - grapheme_stripos - Find position (in grapheme units) of first occurrence of a case-insensitive string
* - grapheme_stristr - Returns part of haystack string from the first occurrence of case-insensitive needle to the end of haystack
* - grapheme_strlen - Get string length in grapheme units
* - grapheme_strpos - Find position (in grapheme units) of first occurrence of a string
* - grapheme_strripos - Find position (in grapheme units) of last occurrence of a case-insensitive string
* - grapheme_strrpos - Find position (in grapheme units) of last occurrence of a string
* - grapheme_strstr - Returns part of haystack string from the first occurrence of needle to the end of haystack
* - grapheme_substr - Return part of a string
*/
class Intl
{
static function grapheme_extract($s, $size, $type = GRAPHEME_EXTR_COUNT, $start = 0, &$next = 0)
{
$s = (string) substr($s, $start);
$size = (int) $size;
$type = (int) $type;
$start = (int) $start;
if ('' === $s || 0 > $size || 0 > $start || 0 > $type || 2 < $type) return false;
if (0 === $size) return '';
$next = $start;
$s = preg_split('/(' . GRAPHEME_CLUSTER_RX . ')/u', "\r\n" . $s, $size + 1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
if (!isset($s[1])) return false;
$i = 1;
$ret = '';
do
{
if (GRAPHEME_EXTR_COUNT === $type) --$size;
else if (GRAPHEME_EXTR_MAXBYTES === $type) $size -= strlen($s[$i]);
else $size -= iconv_strlen($s[$i], 'UTF-8//IGNORE');
if ($size >= 0) $ret .= $s[$i];
}
while (isset($s[++$i]) && $size > 0);
$next += strlen($ret);
return $ret;
}
static function grapheme_strlen($s)
{
$s = (string) $s;
preg_replace('/' . GRAPHEME_CLUSTER_RX . '/u', '', $s, -1, $len);
return 0 === $len && '' !== $s ? null : $len;
}
static function grapheme_substr($s, $start, $len = 2147483647)
{
preg_match_all('/' . GRAPHEME_CLUSTER_RX . '/u', $s, $s);
$slen = count($s[0]);
$start = (int) $start;
if (0 > $start) $start += $slen;
if (0 > $start) return false;
if ($start >= $slen) return false;
$rem = $slen - $start;
if (0 > $len) $len += $rem;
if (0 === $len) return '';
if (0 > $len) return false;
if ($len > $rem) $len = $rem;
return implode('', array_slice($s[0], $start, $len));
}
static function grapheme_substr_workaround62759($s, $start, $len)
{
// Intl based http://bugs.php.net/62759 and 55562 workaround
if (2147483647 == $len) return grapheme_substr($s, $start);
$slen = grapheme_strlen($s);
$start = (int) $start;
if (0 > $start) $start += $slen;
if (0 > $start) return false;
if ($start >= $slen) return false;
$rem = $slen - $start;
if (0 > $len) $len += $rem;
if (0 === $len) return '';
if (0 > $len) return false;
if ($len > $rem) $len = $rem;
return grapheme_substr($s, $start, $len);
}
static function grapheme_strpos ($s, $needle, $offset = 0) {return self::grapheme_position($s, $needle, $offset, 0);}
static function grapheme_stripos ($s, $needle, $offset = 0) {return self::grapheme_position($s, $needle, $offset, 1);}
static function grapheme_strrpos ($s, $needle, $offset = 0) {return self::grapheme_position($s, $needle, $offset, 2);}
static function grapheme_strripos($s, $needle, $offset = 0) {return self::grapheme_position($s, $needle, $offset, 3);}
static function grapheme_stristr ($s, $needle, $before_needle = false) {return mb_stristr($s, $needle, $before_needle, 'UTF-8');}
static function grapheme_strstr ($s, $needle, $before_needle = false) {return mb_strstr ($s, $needle, $before_needle, 'UTF-8');}
protected static function grapheme_position($s, $needle, $offset, $mode)
{
if ($offset > 0) $s = (string) self::grapheme_substr($s, $offset);
else if ($offset < 0) $offset = 0;
if ('' === (string) $needle) return false;
if ('' === (string) $s) return false;
switch ($mode)
{
case 0: $needle = iconv_strpos ($s, $needle, 0, 'UTF-8'); break;
case 1: $needle = mb_stripos ($s, $needle, 0, 'UTF-8'); break;
case 2: $needle = iconv_strrpos($s, $needle, 'UTF-8'); break;
default: $needle = mb_strripos ($s, $needle, 0, 'UTF-8'); break;
}
return $needle ? self::grapheme_strlen(iconv_substr($s, 0, $needle, 'UTF-8')) + $offset : $needle;
}
}


@@ -0,0 +1,340 @@
<?php // vi: set fenc=utf-8 ts=4 sw=4 et:
/*
* Copyright (C) 2013 Nicolas Grekas - p@tchwork.com
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the (at your option):
* Apache License v2.0 (http://apache.org/licenses/LICENSE-2.0.txt), or
* GNU General Public License v2.0 (http://gnu.org/licenses/gpl-2.0.txt).
*/
namespace Patchwork\PHP\Shim;
/**
* Partial mbstring implementation in PHP, iconv based, UTF-8 centric.
*
* Implemented:
* - mb_convert_encoding - Convert character encoding
* - mb_decode_mimeheader - Decode string in MIME header field
* - mb_encode_mimeheader - Encode string for MIME header XXX NATIVE IMPLEMENTATION IS REALLY BUGGED
* - mb_convert_case - Perform case folding on a string
* - mb_internal_encoding - Set/Get internal character encoding
* - mb_list_encodings - Returns an array of all supported encodings
* - mb_strlen - Get string length
* - mb_strpos - Find position of first occurrence of string in a string
* - mb_strrpos - Find position of last occurrence of a string in a string
* - mb_strtolower - Make a string lowercase
* - mb_strtoupper - Make a string uppercase
* - mb_substitute_character - Set/Get substitution character
* - mb_substr - Get part of string
* - mb_stripos - Finds position of first occurrence of a string within another, case insensitive
* - mb_stristr - Finds first occurrence of a string within another, case insensitive
* - mb_strrchr - Finds the last occurrence of a character in a string within another
* - mb_strrichr - Finds the last occurrence of a character in a string within another, case insensitive
* - mb_strripos - Finds position of last occurrence of a string within another, case insensitive
* - mb_strstr - Finds first occurrence of a string within another
*
* Not implemented:
* - mb_check_encoding - Check if the string is valid for the specified encoding
* - mb_convert_kana - Convert "kana" one from another ("zen-kaku", "han-kaku" and more)
* - mb_convert_variables - Convert character code in variable(s)
* - mb_decode_numericentity - Decode HTML numeric string reference to character
* - mb_detect_encoding - Detect character encoding
* - mb_detect_order - Set/Get character encoding detection order
* - mb_encode_numericentity - Encode character to HTML numeric string reference
* - mb_ereg* - Regular expression with multibyte support
* - mb_get_info - Get internal settings of mbstring
* - mb_http_input - Detect HTTP input character encoding
* - mb_http_output - Set/Get HTTP output character encoding
* - mb_language - Set/Get current language
* - mb_list_encodings_alias_names - Returns an array of all supported alias encodings
* - mb_list_mime_names - Returns an array or string of all supported mime names
* - mb_output_handler - Callback function converts character encoding in output buffer
* - mb_parse_str - Parse GET/POST/COOKIE data and set global variable
* - mb_preferred_mime_name - Get MIME charset string
* - mb_regex_encoding - Returns current encoding for multibyte regex as string
* - mb_regex_set_options - Set/Get the default options for mbregex functions
* - mb_send_mail - Send encoded mail
* - mb_split - Split multibyte string using regular expression
* - mb_strcut - Get part of string
* - mb_strimwidth - Get truncated string with specified width
* - mb_strwidth - Return width of string
* - mb_substr_count - Count the number of substring occurrences
*/
class Mbstring
{
const MB_CASE_FOLD = PHP_INT_MAX;
protected static
$internal_encoding = 'UTF-8',
$caseFold = array(
array('µ','ſ',"\xCD\x85",'ς',"\xCF\x90","\xCF\x91","\xCF\x95","\xCF\x96","\xCF\xB0","\xCF\xB1","\xCF\xB5","\xE1\xBA\x9B","\xE1\xBE\xBE"),
array('μ','s','ι', 'σ','β', 'θ', 'φ', 'π', 'κ', 'ρ', 'ε', "\xE1\xB9\xA1",'ι' )
);
static function mb_convert_encoding($s, $to_encoding, $from_encoding = INF)
{
INF === $from_encoding && $from_encoding = self::$internal_encoding;
$from_encoding = strtolower($from_encoding);
$to_encoding = strtolower($to_encoding);
if ('base64' === $from_encoding)
{
$s = base64_decode($s);
$from_encoding = $to_encoding;
}
if ('base64' === $to_encoding) return base64_encode($s);
if ('html-entities' === $to_encoding)
{
'html-entities' === $from_encoding && $from_encoding = 'Windows-1252';
'utf-8' === $from_encoding || $s = iconv($from_encoding, 'UTF-8//IGNORE', $s);
return preg_replace_callback('/[\x80-\xFF]+/', array(__CLASS__, 'html_encoding_callback'), $s);
}
if ('html-entities' === $from_encoding)
{
$s = html_entity_decode($s, ENT_COMPAT, 'UTF-8');
$from_encoding = 'UTF-8';
}
return iconv($from_encoding, $to_encoding . '//IGNORE', $s);
}
static function mb_decode_mimeheader($s)
{
return iconv_mime_decode($s, 2, self::$internal_encoding . '//IGNORE');
}
static function mb_encode_mimeheader($s, $charset = INF, $transfer_encoding = INF, $linefeed = INF, $indent = INF)
{
user_error('mb_encode_mimeheader() is bugged. Please use iconv_mime_encode() instead', E_USER_WARNING);
}
static function mb_convert_case($s, $mode, $encoding = INF)
{
if ('' === $s) return '';
INF === $encoding && $encoding = self::$internal_encoding;
if ('UTF-8' === strtoupper($encoding)) $encoding = INF;
else $s = iconv($encoding, 'UTF-8//IGNORE', $s);
if (MB_CASE_UPPER == $mode)
{
static $upper;
isset($upper) || $upper = self::getData('upperCase');
$map = $upper;
}
else
{
if (self::MB_CASE_FOLD === $mode) $s = str_replace(self::$caseFold[0], self::$caseFold[1], $s);
static $lower;
isset($lower) || $lower = self::getData('lowerCase');
$map = $lower;
}
// Maps the high nibble of a UTF-8 lead byte to the byte length of the sequence it starts
static $ulen_mask = array("\xC0" => 2, "\xD0" => 2, "\xE0" => 3, "\xF0" => 4);
$i = 0;
$len = strlen($s);
while ($i < $len)
{
$ulen = $s[$i] < "\x80" ? 1 : $ulen_mask[$s[$i] & "\xF0"];
$uchr = substr($s, $i, $ulen);
$i += $ulen;
if (isset($map[$uchr]))
{
$uchr = $map[$uchr];
$nlen = strlen($uchr);
if ($nlen == $ulen)
{
$nlen = $i;
do $s[--$nlen] = $uchr[--$ulen];
while ($ulen);
}
else
{
$s = substr_replace($s, $uchr, $i - $ulen, $ulen);
$len += $nlen - $ulen;
$i += $nlen - $ulen;
}
}
}
if (MB_CASE_TITLE == $mode)
{
$s = preg_replace_callback('/\b\p{Ll}/u', array(__CLASS__, 'title_case_callback'), $s);
}
if (INF === $encoding) return $s;
else return iconv('UTF-8', $encoding, $s);
}
static function mb_internal_encoding($encoding = INF)
{
if (INF === $encoding) return self::$internal_encoding;
if ('UTF-8' === strtoupper($encoding) || false !== @iconv($encoding, $encoding, ' '))
{
self::$internal_encoding = $encoding;
return true;
}
return false;
}
static function mb_list_encodings()
{
return array('UTF-8');
}
static function mb_strlen($s, $encoding = INF)
{
INF === $encoding && $encoding = self::$internal_encoding;
return iconv_strlen($s, $encoding . '//IGNORE');
}
static function mb_strpos ($haystack, $needle, $offset = 0, $encoding = INF)
{
INF === $encoding && $encoding = self::$internal_encoding;
if ('' === (string) $needle)
{
user_error(__METHOD__ . ': Empty delimiter', E_USER_WARNING);
return false;
}
else return iconv_strpos($haystack, $needle, $offset, $encoding . '//IGNORE');
}
static function mb_strrpos($haystack, $needle, $offset = 0, $encoding = INF)
{
INF === $encoding && $encoding = self::$internal_encoding;
if ($offset != (int) $offset)
{
$offset = 0;
}
else if ($offset = (int) $offset)
{
$haystack = self::mb_substr($haystack, $offset, 2147483647, $encoding);
}
$pos = iconv_strrpos($haystack, $needle, $encoding . '//IGNORE');
return false !== $pos ? $offset + $pos : false;
}
static function mb_strtolower($s, $encoding = INF)
{
return self::mb_convert_case($s, MB_CASE_LOWER, $encoding);
}
static function mb_strtoupper($s, $encoding = INF)
{
return self::mb_convert_case($s, MB_CASE_UPPER, $encoding);
}
static function mb_substitute_character($c = INF)
{
return INF !== $c ? false : 'none';
}
static function mb_substr($s, $start, $length = 2147483647, $encoding = INF)
{
INF === $encoding && $encoding = self::$internal_encoding;
if ($start < 0)
{
$start = iconv_strlen($s, $encoding . '//IGNORE') + $start;
if ($start < 0) $start = 0;
}
if ($length < 0)
{
$length = iconv_strlen($s, $encoding . '//IGNORE') + $length - $start;
if ($length < 0) return '';
}
return (string) iconv_substr($s, $start, $length, $encoding . '//IGNORE');
}
static function mb_stripos($haystack, $needle, $offset = 0, $encoding = INF)
{
INF === $encoding && $encoding = self::$internal_encoding;
$haystack = self::mb_convert_case($haystack, self::MB_CASE_FOLD, $encoding);
$needle = self::mb_convert_case($needle, self::MB_CASE_FOLD, $encoding);
return self::mb_strpos($haystack, $needle, $offset, $encoding);
}
static function mb_stristr($haystack, $needle, $part = false, $encoding = INF)
{
$pos = self::mb_stripos($haystack, $needle, 0, $encoding);
return self::getSubpart($pos, $part, $haystack, $encoding);
}
static function mb_strrchr($haystack, $needle, $part = false, $encoding = INF)
{
INF === $encoding && $encoding = self::$internal_encoding;
$needle = self::mb_substr($needle, 0, 1, $encoding);
$pos = iconv_strrpos($haystack, $needle, $encoding);
return self::getSubpart($pos, $part, $haystack, $encoding);
}
static function mb_strrichr($haystack, $needle, $part = false, $encoding = INF)
{
$needle = self::mb_substr($needle, 0, 1, $encoding);
// mb_strripos() takes $offset as its third argument, so pass 0 explicitly before $encoding
$pos = self::mb_strripos($haystack, $needle, 0, $encoding);
return self::getSubpart($pos, $part, $haystack, $encoding);
}
static function mb_strripos($haystack, $needle, $offset = 0, $encoding = INF)
{
INF === $encoding && $encoding = self::$internal_encoding;
$haystack = self::mb_convert_case($haystack, self::MB_CASE_FOLD, $encoding);
$needle = self::mb_convert_case($needle, self::MB_CASE_FOLD, $encoding);
return self::mb_strrpos($haystack, $needle, $offset, $encoding);
}
static function mb_strstr($haystack, $needle, $part = false, $encoding = INF)
{
$pos = strpos($haystack, $needle);
if (false === $pos) return false;
if ($part) return substr($haystack, 0, $pos);
else return substr($haystack, $pos);
}
protected static function getSubpart($pos, $part, $haystack, $encoding)
{
INF === $encoding && $encoding = self::$internal_encoding;
if (false === $pos) return false;
if ($part) return self::mb_substr($haystack, 0, $pos, $encoding);
else return self::mb_substr($haystack, $pos, 2147483647, $encoding);
}
protected static function html_encoding_callback($m)
{
return htmlentities($m[0], ENT_COMPAT, 'UTF-8');
}
protected static function title_case_callback($s)
{
return self::mb_convert_case($s[0], MB_CASE_UPPER, 'UTF-8');
}
protected static function getData($file)
{
$file = __DIR__ . '/unidata/' . $file . '.ser';
if (file_exists($file)) return unserialize(file_get_contents($file));
else return false;
}
}
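
The case-mapping loop in `mb_convert_case` walks the string one UTF-8 character at a time, reading the length of each sequence from its lead byte. A minimal standalone sketch of that iteration follows. `utf8_split_chars` is a hypothetical helper, not part of Patchwork, and it assumes the input is already valid UTF-8.

```php
<?php
// Hypothetical helper (not in Patchwork): split a valid UTF-8 string into characters.
function utf8_split_chars($s)
{
    // High nibble of the lead byte => length of the sequence in bytes
    static $ulen_mask = array("\xC0" => 2, "\xD0" => 2, "\xE0" => 3, "\xF0" => 4);

    $chars = array();
    $i = 0;
    $len = strlen($s);

    while ($i < $len) {
        // Bytes below 0x80 are ASCII; otherwise mask the lead byte to find the length
        $ulen = $s[$i] < "\x80" ? 1 : $ulen_mask[$s[$i] & "\xF0"];
        $chars[] = substr($s, $i, $ulen);
        $i += $ulen;
    }

    return $chars;
}

// "a", U+00E9 (two bytes) and U+20AC (three bytes)
var_dump(utf8_split_chars("a\xC3\xA9\xE2\x82\xAC"));
```

The lookup relies on the bitwise `&` of two one-byte strings, which yields a one-byte string usable as an array key. Invalid input, such as a stray continuation byte, is not handled here and would need validation first, for example with `preg_match('//u', $s)`.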

@@ -0,0 +1,295 @@
<?php // vi: set fenc=utf-8 ts=4 sw=4 et:
/*
* Copyright (C) 2013 Nicolas Grekas - p@tchwork.com
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the (at your option):
* Apache License v2.0 (http://apache.org/licenses/LICENSE-2.0.txt), or
* GNU General Public License v2.0 (http://gnu.org/licenses/gpl-2.0.txt).
*/
namespace Patchwork\PHP\Shim;
/**
* Normalizer is a PHP fallback implementation of the Normalizer class provided by the intl extension.
*
* It has been validated with Unicode 6.1 Normalization Conformance Test.
* See http://www.unicode.org/reports/tr15/ for detailed info about Unicode normalizations.
*/
class Normalizer
{
const
NONE = 1,
FORM_D = 2, NFD = 2,
FORM_KD = 3, NFKD = 3,
FORM_C = 4, NFC = 4,
FORM_KC = 5, NFKC = 5;
protected static
$C, $D, $KD, $cC,
$ulen_mask = array("\xC0" => 2, "\xD0" => 2, "\xE0" => 3, "\xF0" => 4),
$ASCII = "\x20\x65\x69\x61\x73\x6E\x74\x72\x6F\x6C\x75\x64\x5D\x5B\x63\x6D\x70\x27\x0A\x67\x7C\x68\x76\x2E\x66\x62\x2C\x3A\x3D\x2D\x71\x31\x30\x43\x32\x2A\x79\x78\x29\x28\x4C\x39\x41\x53\x2F\x50\x22\x45\x6A\x4D\x49\x6B\x33\x3E\x35\x54\x3C\x44\x34\x7D\x42\x7B\x38\x46\x77\x52\x36\x37\x55\x47\x4E\x3B\x4A\x7A\x56\x23\x48\x4F\x57\x5F\x26\x21\x4B\x3F\x58\x51\x25\x59\x5C\x09\x5A\x2B\x7E\x5E\x24\x40\x60\x7F\x00\x01\x02\x03\x04\x05\x06\x07\x08\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F";
static function isNormalized($s, $form = self::NFC)
{
if (strspn($s, self::$ASCII) === strlen($s)) return true;
if (self::NFC === $form && preg_match('//u', $s) && !preg_match('/[^\x00-\x{2FF}]/u', $s)) return true;
return false; // Pretend false, as quick checks implemented in PHP won't be so quick
}
static function normalize($s, $form = self::NFC)
{
if (!preg_match('//u', $s)) return false;
switch ($form)
{
case self::NONE: return $s;
case self::NFC: $C = true; $K = false; break;
case self::NFD: $C = false; $K = false; break;
case self::NFKC: $C = true; $K = true; break;
case self::NFKD: $C = false; $K = true; break;
default: return false;
}
if (!strlen($s)) return '';
if ($K && empty(self::$KD)) self::$KD = self::getData('compatibilityDecomposition');
if (empty(self::$D))
{
self::$D = self::getData('canonicalDecomposition');
self::$cC = self::getData('combiningClass');
}
if ($C)
{
if (empty(self::$C)) self::$C = self::getData('canonicalComposition');
return self::recompose(self::decompose($s, $K));
}
else return self::decompose($s, $K);
}
protected static function recompose($s)
{
$ASCII = self::$ASCII;
$compMap = self::$C;
$combClass = self::$cC;
$ulen_mask = self::$ulen_mask;
$result = $tail = '';
$i = $s[0] < "\x80" ? 1 : $ulen_mask[$s[0] & "\xF0"];
$len = strlen($s);
$last_uchr = substr($s, 0, $i);
$last_ucls = isset($combClass[$last_uchr]) ? 256 : 0;
while ($i < $len)
{
if ($s[$i] < "\x80")
{
// ASCII chars
if ($tail)
{
$last_uchr .= $tail;
$tail = '';
}
if ($j = strspn($s, $ASCII, $i+1))
{
$last_uchr .= substr($s, $i, $j);
$i += $j;
}
$result .= $last_uchr;
$last_uchr = $s[$i];
++$i;
}
else
{
$ulen = $ulen_mask[$s[$i] & "\xF0"];
$uchr = substr($s, $i, $ulen);
if ($last_uchr < "\xE1\x84\x80" || "\xE1\x84\x92" < $last_uchr
|| $uchr < "\xE1\x85\xA1" || "\xE1\x85\xB5" < $uchr
|| $last_ucls)
{
// Table lookup and combining chars composition
$ucls = isset($combClass[$uchr]) ? $combClass[$uchr] : 0;
if (isset($compMap[$last_uchr . $uchr]) && (!$last_ucls || $last_ucls < $ucls))
{
$last_uchr = $compMap[$last_uchr . $uchr];
}
else if ($last_ucls = $ucls) $tail .= $uchr;
else
{
if ($tail)
{
$last_uchr .= $tail;
$tail = '';
}
$result .= $last_uchr;
$last_uchr = $uchr;
}
}
else
{
// Hangul chars
$L = ord($last_uchr[2]) - 0x80;
$V = ord($uchr[2]) - 0xA1;
$T = 0;
$uchr = substr($s, $i + $ulen, 3);
if ("\xE1\x86\xA7" <= $uchr && $uchr <= "\xE1\x87\x82")
{
$T = ord($uchr[2]) - 0xA7;
0 > $T && $T += 0x40;
$ulen += 3;
}
$L = 0xAC00 + ($L * 21 + $V) * 28 + $T;
$last_uchr = chr(0xE0 | $L>>12) . chr(0x80 | $L>>6 & 0x3F) . chr(0x80 | $L & 0x3F);
}
$i += $ulen;
}
}
return $result . $last_uchr . $tail;
}
protected static function decompose($s, $c)
{
$result = '';
$ASCII = self::$ASCII;
$decompMap = self::$D;
$combClass = self::$cC;
$ulen_mask = self::$ulen_mask;
if ($c) $compatMap = self::$KD;
$c = array();
$i = 0;
$len = strlen($s);
while ($i < $len)
{
if ($s[$i] < "\x80")
{
// ASCII chars
if ($c)
{
ksort($c);
$result .= implode('', $c);
$c = array();
}
$j = 1 + strspn($s, $ASCII, $i+1);
$result .= substr($s, $i, $j);
$i += $j;
}
else
{
$ulen = $ulen_mask[$s[$i] & "\xF0"];
$uchr = substr($s, $i, $ulen);
$i += $ulen;
if (isset($combClass[$uchr]))
{
// Combining chars, for sorting
isset($c[$combClass[$uchr]]) || $c[$combClass[$uchr]] = '';
$c[$combClass[$uchr]] .= isset($compatMap[$uchr]) ? $compatMap[$uchr] : (isset($decompMap[$uchr]) ? $decompMap[$uchr] : $uchr);
}
else
{
if ($c)
{
ksort($c);
$result .= implode('', $c);
$c = array();
}
if ($uchr < "\xEA\xB0\x80" || "\xED\x9E\xA3" < $uchr)
{
// Table lookup
$j = isset($compatMap[$uchr]) ? $compatMap[$uchr] : (isset($decompMap[$uchr]) ? $decompMap[$uchr] : $uchr);
if ($uchr != $j)
{
$uchr = $j;
$j = strlen($uchr);
$ulen = $uchr[0] < "\x80" ? 1 : $ulen_mask[$uchr[0] & "\xF0"];
if ($ulen != $j)
{
// Put trailing chars in $s
$j -= $ulen;
$i -= $j;
if (0 > $i)
{
$s = str_repeat(' ', -$i) . $s;
$len -= $i;
$i = 0;
}
while ($j--) $s[$i+$j] = $uchr[$ulen+$j];
$uchr = substr($uchr, 0, $ulen);
}
}
}
else
{
// Hangul chars
$uchr = unpack('C*', $uchr);
$j = (($uchr[1]-224) << 12) + (($uchr[2]-128) << 6) + $uchr[3] - 0xAC80;
$uchr = "\xE1\x84" . chr(0x80 + (int) ($j / 588))
. "\xE1\x85" . chr(0xA1 + (int) (($j % 588) / 28));
if ($j %= 28)
{
$uchr .= $j < 25
? ("\xE1\x86" . chr(0xA7 + $j))
: ("\xE1\x87" . chr(0x67 + $j));
}
}
$result .= $uchr;
}
}
}
if ($c)
{
ksort($c);
$result .= implode('', $c);
}
return $result;
}
protected static function getData($file)
{
$file = __DIR__ . '/unidata/' . $file . '.ser';
if (file_exists($file)) return unserialize(file_get_contents($file));
else return false;
}
}
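
The Hangul branches of `decompose` and `recompose` operate on raw UTF-8 bytes, which hides the underlying arithmetic. The sketch below restates it on code points, using the constants of the Unicode Standard's Hangul syllable algorithm (SBase 0xAC00, LBase 0x1100, VBase 0x1161, TBase 0x11A7, 21 vowels, 28 trailing slots). `hangul_decompose` and `hangul_compose` are hypothetical names introduced for illustration only.

```php
<?php
// Hypothetical helper: decompose a precomposed Hangul syllable code point into jamo code points.
function hangul_decompose($cp)
{
    $sIndex = $cp - 0xAC00;

    // 11172 = 19 leading consonants * 21 vowels * 28 trailing slots
    if ($sIndex < 0 || $sIndex >= 11172) return array($cp);

    $jamo = array(
        0x1100 + (int) ($sIndex / 588),        // leading consonant, 588 = 21 * 28
        0x1161 + (int) (($sIndex % 588) / 28), // vowel
    );

    // A trailing index of 0 means the syllable has no trailing consonant
    if ($t = $sIndex % 28) $jamo[] = 0x11A7 + $t;

    return $jamo;
}

// Hypothetical helper: the inverse operation; $t defaults to TBase, meaning "no trailing consonant".
function hangul_compose($l, $v, $t = 0x11A7)
{
    return 0xAC00 + (($l - 0x1100) * 21 + ($v - 0x1161)) * 28 + ($t - 0x11A7);
}

// U+D55C decomposes to U+1112, U+1161, U+11AB
var_dump(hangul_decompose(0xD55C));
```

Working on code points keeps the formula readable. The class above instead folds the UTF-8 byte offsets into its constants, which is why values such as `0xAC80` and `0x67` appear there.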

@@ -0,0 +1,60 @@
<?php // vi: set fenc=utf-8 ts=4 sw=4 et:
/*
* Copyright (C) 2013 Nicolas Grekas - p@tchwork.com
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the (at your option):
* Apache License v2.0 (http://apache.org/licenses/LICENSE-2.0.txt), or
* GNU General Public License v2.0 (http://gnu.org/licenses/gpl-2.0.txt).
*/
namespace Patchwork\PHP\Shim;
/**
* utf8_encode/decode
*/
class Xml
{
static function utf8_encode($s)
{
$len = strlen($s);
// Pre-size the output buffer for the worst case, where every input byte becomes two
$e = $s . $s;
for ($i = 0, $j = 0; $i < $len; ++$i, ++$j) switch (true)
{
case $s[$i] < "\x80": $e[$j] = $s[$i]; break;
case $s[$i] < "\xC0": $e[$j] = "\xC2"; $e[++$j] = $s[$i]; break;
default: $e[$j] = "\xC3"; $e[++$j] = chr(ord($s[$i]) - 64); break;
}
return substr($e, 0, $j);
}
static function utf8_decode($s)
{
$len = strlen($s);
for ($i = 0, $j = 0; $i < $len; ++$i, ++$j)
{
switch ($s[$i] & "\xF0")
{
case "\xC0":
case "\xD0":
$c = (ord($s[$i] & "\x1F") << 6) | ord($s[++$i] & "\x3F");
$s[$j] = $c < 256 ? chr($c) : '?';
break;
case "\xF0": ++$i;
case "\xE0":
$s[$j] = '?';
$i += 2;
break;
default:
$s[$j] = $s[$i];
}
}
return substr($s, 0, $j);
}
}
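
`utf8_encode` treats its input as ISO-8859-1, so every byte at or above 0x80 becomes a two-byte UTF-8 sequence. The following sketch spells that mapping out with explicit bit operations instead of the in-place buffer used above. `latin1_to_utf8` is a hypothetical name, and this is an illustration rather than the shim's implementation.

```php
<?php
// Hypothetical helper: convert an ISO-8859-1 string to UTF-8 byte by byte.
function latin1_to_utf8($s)
{
    $out = '';

    for ($i = 0, $len = strlen($s); $i < $len; ++$i) {
        $b = ord($s[$i]);

        if ($b < 0x80) {
            // ASCII bytes are identical in both encodings
            $out .= $s[$i];
        } else {
            // Two-byte form: 110000xx 10xxxxxx, carrying the 8 bits of the code point
            $out .= chr(0xC0 | ($b >> 6)) . chr(0x80 | ($b & 0x3F));
        }
    }

    return $out;
}

// 0xE9 is "e with acute" in ISO-8859-1 and becomes the bytes C3 A9
var_dump(bin2hex(latin1_to_utf8("caf\xE9")));
```

For bytes from 0x80 to 0xBF the lead byte is 0xC2 and the second byte equals the input, and for bytes from 0xC0 to 0xFF the lead byte is 0xC3 and the second byte is the input minus 64. Those are the two cases handled by the `switch` in `utf8_encode`.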

File diff suppressed because one or more lines are too long

@@ -0,0 +1 @@
(serialized PHP array, 149 entries: single-byte character codes mapped to UTF-8 strings; binary payload omitted)

@@ -0,0 +1 @@
(serialized PHP array, 189 entries: single-byte character codes mapped to UTF-8 strings; binary payload omitted)

@@ -0,0 +1 @@
(serialized PHP array, 202 entries: single-byte character codes mapped to UTF-8 strings; binary payload omitted)

@@ -0,0 +1 @@
(serialized PHP array, 154 entries: UTF-8 strings mapped to single-byte character codes; binary payload omitted)

@@ -0,0 +1 @@
(serialized PHP array, 194 entries: UTF-8 strings mapped to single-byte character codes; binary payload omitted)

@@ -0,0 +1 @@
(serialized PHP array, 203 entries: UTF-8 strings mapped to single-byte character codes; binary payload omitted)

File diff suppressed because one or more lines are too long


@@ -0,0 +1,498 @@
<?php // vi: set fenc=utf-8 ts=4 sw=4 et:
/*
* Copyright (C) 2013 Nicolas Grekas - p@tchwork.com
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the (at your option):
* Apache License v2.0 (http://apache.org/licenses/LICENSE-2.0.txt), or
* GNU General Public License v2.0 (http://gnu.org/licenses/gpl-2.0.txt).
*/
namespace Patchwork;
use Normalizer as n;
/**
* UTF-8 grapheme-cluster-aware string manipulation, implementing a nearly complete
* set of the native PHP string functions that need UTF-8 awareness, and more.
* The printf-family functions are not implemented.
*/
class Utf8
{
protected static
$commonCaseFold = array(
array('µ','ſ',"\xCD\x85",'ς',"\xCF\x90","\xCF\x91","\xCF\x95","\xCF\x96","\xCF\xB0","\xCF\xB1","\xCF\xB5","\xE1\xBA\x9B","\xE1\xBE\xBE"),
array('μ','s','ι', 'σ','β', 'θ', 'φ', 'π', 'κ', 'ρ', 'ε', "\xE1\xB9\xA1",'ι' )
),
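// $cp1252: the C1 code points U+0080-U+009F as UTF-8 bytes, i.e. what utf8_encode() yields for Windows-1252 input
// $utf8: the characters those Windows-1252 bytes actually stand for
$cp1252 = array("\xC2\x80","\xC2\x82","\xC2\x83","\xC2\x84","\xC2\x85","\xC2\x86","\xC2\x87","\xC2\x88","\xC2\x89","\xC2\x8A","\xC2\x8B","\xC2\x8C","\xC2\x8E","\xC2\x91","\xC2\x92","\xC2\x93","\xC2\x94","\xC2\x95","\xC2\x96","\xC2\x97","\xC2\x98","\xC2\x99","\xC2\x9A","\xC2\x9B","\xC2\x9C","\xC2\x9E","\xC2\x9F"),
$utf8 = array('€','‚','ƒ','„','…','†','‡','ˆ','‰','Š','‹','Œ','Ž','‘','’','“','”','•','–','—','˜','™','š','›','œ','ž','Ÿ');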
static function isUtf8($s)
{
return (bool) preg_match('//u', $s); // Since PHP 5.2.5, this also rejects invalid five- and six-byte sequences
}
// Generic UTF-8 to ASCII transliteration
static function toAscii($s)
{
if (preg_match("/[\x80-\xFF]/", $s))
{
static $translitExtra = false;
$translitExtra or $translitExtra = self::getData('translit_extra');
$s = n::normalize($s, n::NFKD);
$s = preg_replace('/\p{Mn}+/u', '', $s);
$s = str_replace($translitExtra[0], $translitExtra[1], $s);
$s = iconv('UTF-8', 'ASCII' . ('glibc' !== ICONV_IMPL ? '//IGNORE' : '') . '//TRANSLIT', $s);
}
return $s;
}
// Unicode transformation for caseless matching
// see http://unicode.org/reports/tr21/tr21-5.html
static function strtocasefold($s, $full = true, $turkish = false)
{
$s = str_replace(self::$commonCaseFold[0], self::$commonCaseFold[1], $s);
if ($turkish)
{
false !== strpos($s, 'I') && $s = str_replace('I', 'ı', $s);
$full && false !== strpos($s, 'İ') && $s = str_replace('İ', 'i', $s);
}
if ($full)
{
static $fullCaseFold = false;
$fullCaseFold || $fullCaseFold = self::getData('caseFolding_full');
$s = str_replace($fullCaseFold[0], $fullCaseFold[1], $s);
}
return self::strtolower($s);
}
// Generic case sensitive collation support for self::strnatcmp()
static function strtonatfold($s)
{
$s = n::normalize($s, n::NFD);
return preg_replace('/\p{Mn}+/u', '', $s);
}
// PHP string functions that need UTF-8 awareness
static function substr($s, $start, $len = 2147483647)
{
/**/ static $bug62759;
/**/ isset($bug62759) or $bug62759 = extension_loaded('intl') && 'à' === grapheme_substr('éà', 1, -2);
/**/ if ($bug62759)
/**/ {
return PHP\Shim\Intl::grapheme_substr_workaround62759($s, $start, $len);
/**/ }
/**/ else
/**/ {
return grapheme_substr($s, $start, $len);
/**/ }
}
static function strlen($s) {return grapheme_strlen($s);}
static function strpos ($s, $needle, $offset = 0) {return grapheme_strpos ($s, $needle, $offset);}
static function strrpos($s, $needle, $offset = 0) {return grapheme_strrpos($s, $needle, $offset);}
static function stripos($s, $needle, $offset = 0)
{
/**/ if (50418 > PHP_VERSION_ID || 50500 == PHP_VERSION_ID)
/**/ {
// Don't use grapheme_stripos because of https://bugs.php.net/61860
if ($offset < 0) $offset = 0;
if (!$needle = mb_stripos($s, $needle, $offset, 'UTF-8')) return $needle;
return grapheme_strlen(iconv_substr($s, 0, $needle, 'UTF-8'));
/**/ }
/**/ else
/**/ {
return grapheme_stripos($s, $needle, $offset);
/**/ }
}
static function strripos($s, $needle, $offset = 0)
{
/**/ if (50418 > PHP_VERSION_ID || 50500 == PHP_VERSION_ID)
/**/ {
// Don't use grapheme_strripos because of https://bugs.php.net/61860
if ($offset < 0) $offset = 0;
if (!$needle = mb_strripos($s, $needle, $offset, 'UTF-8')) return $needle;
return grapheme_strlen(iconv_substr($s, 0, $needle, 'UTF-8'));
/**/ }
/**/ else
/**/ {
return grapheme_strripos($s, $needle, $offset);
/**/ }
}
static function stristr($s, $needle, $before_needle = false)
{
if ('' === (string) $needle) return false;
return mb_stristr($s, $needle, $before_needle, 'UTF-8');
}
static function strstr ($s, $needle, $before_needle = false) {return grapheme_strstr($s, $needle, $before_needle);}
static function strrchr ($s, $needle, $before_needle = false) {return mb_strrchr ($s, $needle, $before_needle, 'UTF-8');}
static function strrichr($s, $needle, $before_needle = false) {return mb_strrichr($s, $needle, $before_needle, 'UTF-8');}
static function strtolower($s, $form = n::NFC) {if (n::isNormalized($s = mb_strtolower($s, 'UTF-8'), $form)) return $s; return n::normalize($s, $form);}
static function strtoupper($s, $form = n::NFC) {if (n::isNormalized($s = mb_strtoupper($s, 'UTF-8'), $form)) return $s; return n::normalize($s, $form);}
static function wordwrap($s, $width = 75, $break = "\n", $cut = false)
{
// This implementation could be extended to handle unicode word boundaries,
// but that's enough work for today (see http://www.unicode.org/reports/tr29/)
$width = (int) $width;
$s = explode($break, $s);
$iLen = count($s);
$result = array();
$line = '';
$lineLen = 0;
for ($i = 0; $i < $iLen; ++$i)
{
$words = explode(' ', $s[$i]);
$line && $result[] = $line;
$lineLen = grapheme_strlen($line);
$jLen = count($words);
for ($j = 0; $j < $jLen; ++$j)
{
$w = $words[$j];
$wLen = grapheme_strlen($w);
if ($lineLen + $wLen < $width)
{
if ($j) $line .= ' ';
$line .= $w;
$lineLen += $wLen + 1;
}
else
{
if ($j || $i) $result[] = $line;
$line = '';
$lineLen = 0;
if ($cut && $wLen > $width)
{
$w = self::str_split($w);
do
{
$result[] = implode('', array_slice($w, 0, $width));
$line = implode('', $w = array_slice($w, $width));
$lineLen = $wLen -= $width;
}
while ($wLen > $width);
$w = implode('', $w);
}
$line = $w;
$lineLen = $wLen;
}
}
}
$line && $result[] = $line;
return implode($break, $result);
}
static function chr($c)
{
if (0x80 > $c %= 0x200000) return chr($c);
if (0x800 > $c) return chr(0xC0 | $c>>6) . chr(0x80 | $c & 0x3F);
if (0x10000 > $c) return chr(0xE0 | $c>>12) . chr(0x80 | $c>>6 & 0x3F) . chr(0x80 | $c & 0x3F);
return chr(0xF0 | $c>>18) . chr(0x80 | $c>>12 & 0x3F) . chr(0x80 | $c>>6 & 0x3F) . chr(0x80 | $c & 0x3F);
}
static function count_chars($s, $mode = 0)
{
if (1 != $mode) user_error(__METHOD__ . '(): the only allowed $mode is 1', E_USER_WARNING);
$s = self::str_split($s);
return array_count_values($s);
}
static function ltrim($s, $charlist = INF)
{
$charlist = INF === $charlist ? '\s' : self::rxClass($charlist);
return preg_replace("/^{$charlist}+/u", '', $s);
}
static function ord($s)
{
$a = ($s = unpack('C*', substr($s, 0, 4))) ? $s[1] : 0;
if (0xF0 <= $a) return (($a - 0xF0)<<18) + (($s[2] - 0x80)<<12) + (($s[3] - 0x80)<<6) + $s[4] - 0x80;
if (0xE0 <= $a) return (($a - 0xE0)<<12) + (($s[2] - 0x80)<<6) + $s[3] - 0x80;
if (0xC0 <= $a) return (($a - 0xC0)<<6) + $s[2] - 0x80;
return $a;
}
static function rtrim($s, $charlist = INF)
{
$charlist = INF === $charlist ? '\s' : self::rxClass($charlist);
return preg_replace("/{$charlist}+$/u", '', $s);
}
static function trim($s, $charlist = INF) {return self::rtrim(self::ltrim($s, $charlist), $charlist);}
static function str_ireplace($search, $replace, $subject, &$count = null)
{
$search = (array) $search;
foreach ($search as &$s) $s = '' !== (string) $s ? '/' . preg_quote($s, '/') . '/ui' : '/^(?<=.)$/';
$subject = preg_replace($search, $replace, $subject, -1, $replace);
$count = $replace;
return $subject;
}
static function str_pad($s, $len, $pad = ' ', $type = STR_PAD_RIGHT)
{
$slen = grapheme_strlen($s);
if ($len <= $slen) return $s;
$padlen = grapheme_strlen($pad);
$freelen = $len - $slen;
$len = $freelen % $padlen;
if (STR_PAD_RIGHT == $type) return $s . str_repeat($pad, $freelen / $padlen) . ($len ? grapheme_substr($pad, 0, $len) : '');
if (STR_PAD_LEFT == $type) return str_repeat($pad, $freelen / $padlen) . ($len ? grapheme_substr($pad, 0, $len) : '') . $s;
if (STR_PAD_BOTH == $type)
{
$freelen /= 2;
$type = ceil($freelen);
$len = $type % $padlen;
$s .= str_repeat($pad, $type / $padlen) . ($len ? grapheme_substr($pad, 0, $len) : '');
$type = floor($freelen);
$len = $type % $padlen;
return str_repeat($pad, $type / $padlen) . ($len ? grapheme_substr($pad, 0, $len) : '') . $s;
}
user_error(__METHOD__ . '(): Padding type has to be STR_PAD_LEFT, STR_PAD_RIGHT, or STR_PAD_BOTH', E_USER_WARNING);
}
static function str_shuffle($s)
{
$s = self::str_split($s);
shuffle($s);
return implode('', $s);
}
static function str_split($s, $len = 1)
{
if (1 > $len = (int) $len)
{
$len = func_get_arg(1);
return str_split($s, $len);
}
/**/ if (extension_loaded('intl'))
/**/ {
$a = array();
$p = 0;
$l = strlen($s);
while ($p < $l) $a[] = grapheme_extract($s, 1, GRAPHEME_EXTR_COUNT, $p, $p);
/**/ }
/**/ else
/**/ {
preg_match_all('/' . GRAPHEME_CLUSTER_RX . '/u', $s, $a);
$a = $a[0];
/**/ }
if (1 == $len) return $a;
$s = array();
$p = -1;
foreach ($a as $l => $a)
{
if ($l % $len) $s[$p] .= $a;
else $s[++$p] = $a;
}
return $s;
}
static function str_word_count($s, $format = 0, $charlist = '')
{
$charlist = self::rxClass($charlist, '\pL');
$s = preg_split("/({$charlist}+(?:[\p{Pd}']{$charlist}+)*)/u", $s, -1, PREG_SPLIT_DELIM_CAPTURE);
$charlist = array();
$len = count($s);
if (1 == $format) for ($i = 1; $i < $len; $i+=2) $charlist[] = $s[$i];
else if (2 == $format)
{
$offset = grapheme_strlen($s[0]);
for ($i = 1; $i < $len; $i+=2)
{
$charlist[$offset] = $s[$i];
$offset += grapheme_strlen($s[$i]) + grapheme_strlen($s[$i+1]);
}
}
else $charlist = ($len - 1) / 2;
return $charlist;
}
static function strcmp ($a, $b) {return (string) $a === (string) $b ? 0 : strcmp(n::normalize($a, n::NFD), n::normalize($b, n::NFD));}
static function strnatcmp ($a, $b) {return (string) $a === (string) $b ? 0 : strnatcmp(self::strtonatfold($a), self::strtonatfold($b));}
static function strcasecmp ($a, $b) {return self::strcmp (self::strtocasefold($a), self::strtocasefold($b));}
static function strnatcasecmp($a, $b) {return self::strnatcmp(self::strtocasefold($a), self::strtocasefold($b));}
static function strncasecmp ($a, $b, $len) {return self::strncmp(self::strtocasefold($a), self::strtocasefold($b), $len);}
static function strncmp ($a, $b, $len) {return self::strcmp(self::substr($a, 0, $len), self::substr($b, 0, $len));}
static function strcspn($s, $charlist, $start = 0, $len = 2147483647)
{
if ('' === (string) $charlist) return null;
if ($start || 2147483647 != $len) $s = self::substr($s, $start, $len);
return preg_match('/^(.*?)' . self::rxClass($charlist) . '/us', $s, $len) ? grapheme_strlen($len[1]) : grapheme_strlen($s);
}
static function strpbrk($s, $charlist)
{
if (preg_match('/' . self::rxClass($charlist) . '/us', $s, $m)) return substr($s, strpos($s, $m[0]));
else return false;
}
static function strrev($s)
{
$s = self::str_split($s);
return implode('', array_reverse($s));
}
static function strspn($s, $mask, $start = 0, $len = 2147483647)
{
if ($start || 2147483647 != $len) $s = self::substr($s, $start, $len);
return preg_match('/^' . self::rxClass($mask) . '+/u', $s, $s) ? grapheme_strlen($s[0]) : 0;
}
static function strtr($s, $from, $to = INF)
{
if (INF !== $to)
{
$from = self::str_split($from);
$to = self::str_split($to);
$a = count($from);
$b = count($to);
if ($a > $b) $from = array_slice($from, 0, $b);
else if ($a < $b) $to = array_slice($to , 0, $a);
$from = array_combine($from, $to);
}
return strtr($s, $from);
}
static function substr_compare($a, $b, $offset, $len = 2147483647, $i = 0)
{
$a = self::substr($a, $offset, $len);
return $i ? self::strcasecmp($a, $b) : self::strcmp($a, $b);
}
static function substr_count($s, $needle, $offset = 0, $len = 2147483647)
{
return substr_count(self::substr($s, $offset, $len), $needle);
}
static function substr_replace($s, $replace, $start, $len = 2147483647)
{
$s = self::str_split($s);
$replace = self::str_split($replace);
array_splice($s, $start, $len, $replace);
return implode('', $s);
}
static function ucfirst($s)
{
$c = iconv_substr($s, 0, 1, 'UTF-8');
return self::ucwords($c) . substr($s, strlen($c));
}
static function lcfirst($s)
{
$c = iconv_substr($s, 0, 1, 'UTF-8');
return mb_strtolower($c, 'UTF-8') . substr($s, strlen($c));
}
static function ucwords($s)
{
return mb_convert_case($s, MB_CASE_TITLE, 'UTF-8');
}
static function number_format($number, $decimals = 0, $dec_point = '.', $thousands_sep = ',')
{
/**/ if (PHP_VERSION_ID < 50400)
/**/ {
if (isset($thousands_sep[1]) || isset($dec_point[1]))
{
return str_replace(
array('.', ','),
array($dec_point, $thousands_sep),
number_format($number, $decimals, '.', ',')
);
}
/**/ }
return number_format($number, $decimals, $dec_point, $thousands_sep);
}
static function utf8_encode($s)
{
$s = utf8_encode($s);
if (false === strpos($s, "\xC2")) return $s;
else return str_replace(self::$cp1252, self::$utf8, $s);
}
static function utf8_decode($s)
{
$s = str_replace(self::$utf8, self::$cp1252, $s);
return utf8_decode($s);
}
protected static function rxClass($s, $class = '')
{
$class = array($class);
foreach (self::str_split($s) as $s)
{
if ('-' === $s) $class[0] = '-' . $class[0];
else if (!isset($s[2])) $class[0] .= preg_quote($s, '/');
else if (1 === iconv_strlen($s, 'UTF-8')) $class[0] .= $s;
else $class[] = $s;
}
$class[0] = '[' . $class[0] . ']';
if (1 === count($class)) return $class[0];
else return '(?:' . implode('|', $class) . ')';
}
protected static function getData($file)
{
$file = __DIR__ . '/Utf8/data/' . $file . '.ser';
if (file_exists($file)) return unserialize(file_get_contents($file));
else return false;
}
}
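
The `chr()` and `ord()` methods above convert between Unicode code points and UTF-8 byte sequences. The following is a minimal standalone sketch of that bit layout, not the library's code: the names `sketch_chr` and `sketch_ord` are hypothetical, and the decoder assumes well-formed input.

```php
<?php
// Hypothetical helpers illustrating the UTF-8 bit layout (1 to 4 bytes per code point).

// Encode a Unicode code point (integer) as a UTF-8 byte string.
function sketch_chr($c)
{
    if ($c < 0x80) {
        return chr($c);                                  // 0xxxxxxx
    }
    if ($c < 0x800) {
        return chr(0xC0 | ($c >> 6))                     // 110xxxxx
             . chr(0x80 | ($c & 0x3F));                  // 10xxxxxx
    }
    if ($c < 0x10000) {
        return chr(0xE0 | ($c >> 12))                    // 1110xxxx
             . chr(0x80 | (($c >> 6) & 0x3F))
             . chr(0x80 | ($c & 0x3F));
    }
    return chr(0xF0 | ($c >> 18))                        // 11110xxx
         . chr(0x80 | (($c >> 12) & 0x3F))
         . chr(0x80 | (($c >> 6) & 0x3F))
         . chr(0x80 | ($c & 0x3F));
}

// Decode the first code point of a UTF-8 string (assumes well-formed input).
function sketch_ord($s)
{
    if ('' === (string) $s) {
        return 0;
    }
    // unpack('C*') yields a 1-indexed array; array_values() makes it 0-indexed.
    $b = array_values(unpack('C*', substr($s, 0, 4)));
    $a = $b[0];
    if ($a >= 0xF0) {
        return (($a & 0x07) << 18) | (($b[1] & 0x3F) << 12) | (($b[2] & 0x3F) << 6) | ($b[3] & 0x3F);
    }
    if ($a >= 0xE0) {
        return (($a & 0x0F) << 12) | (($b[1] & 0x3F) << 6) | ($b[2] & 0x3F);
    }
    if ($a >= 0xC0) {
        return (($a & 0x1F) << 6) | ($b[1] & 0x3F);
    }
    return $a;
}
```

The lead byte announces the sequence length, and every continuation byte carries six payload bits, which is why the decoder can branch on the first byte alone.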


@@ -0,0 +1,223 @@
<?php // vi: set fenc=utf-8 ts=4 sw=4 et:
/*
* Copyright (C) 2013 Nicolas Grekas - p@tchwork.com
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the (at your option):
* Apache License v2.0 (http://apache.org/licenses/LICENSE-2.0.txt), or
* GNU General Public License v2.0 (http://gnu.org/licenses/gpl-2.0.txt).
*/
namespace Patchwork\Utf8;
use Normalizer as n;
use Patchwork\Utf8 as u;
class Bootup
{
static function initAll()
{
self::initUtf8Encode();
self::initMbstring();
self::initIconv();
self::initExif();
self::initIntl();
self::initLocale();
}
static function initUtf8Encode()
{
function_exists('utf8_encode') or require __DIR__ . '/Bootup/utf8_encode.php';
}
static function initMbstring()
{
if (extension_loaded('mbstring'))
{
if ( ((int) ini_get('mbstring.encoding_translation') || in_array(strtolower(ini_get('mbstring.encoding_translation')), array('on', 'yes', 'true')))
&& !in_array(strtolower(ini_get('mbstring.http_input')), array('pass', '8bit', 'utf-8')) )
{
user_error('php.ini settings: Please disable mbstring.encoding_translation or set mbstring.http_input to "pass"', E_USER_WARNING);
}
if (MB_OVERLOAD_STRING & (int) ini_get('mbstring.func_overload'))
{
user_error('php.ini settings: Please disable mbstring.func_overload', E_USER_WARNING);
}
mb_regex_encoding('UTF-8');
ini_set('mbstring.script_encoding', 'pass');
if ('utf-8' !== strtolower(mb_internal_encoding()))
{
mb_internal_encoding('UTF-8');
ini_set('mbstring.internal_encoding', 'UTF-8');
}
if ('none' !== strtolower(mb_substitute_character()))
{
mb_substitute_character('none');
ini_set('mbstring.substitute_character', 'none');
}
if (!in_array(strtolower(mb_http_output()), array('pass', '8bit')))
{
mb_http_output('pass');
ini_set('mbstring.http_output', 'pass');
}
if (!in_array(strtolower(mb_language()), array('uni', 'neutral')))
{
mb_language('uni');
ini_set('mbstring.language', 'uni');
}
}
else if (!defined('MB_OVERLOAD_MAIL'))
{
require __DIR__ . '/Bootup/mbstring.php';
}
}
static function initIconv()
{
if (extension_loaded('iconv'))
{
if ('UTF-8' !== iconv_get_encoding('input_encoding'))
{
iconv_set_encoding('input_encoding', 'UTF-8');
ini_set('iconv.input_encoding', 'UTF-8');
}
if ('UTF-8' !== iconv_get_encoding('internal_encoding'))
{
iconv_set_encoding('internal_encoding', 'UTF-8');
ini_set('iconv.internal_encoding', 'UTF-8');
}
if ('UTF-8' !== iconv_get_encoding('output_encoding'))
{
iconv_set_encoding('output_encoding' , 'UTF-8');
ini_set('iconv.output_encoding', 'UTF-8');
}
}
else if (!defined('ICONV_IMPL'))
{
require __DIR__ . '/Bootup/iconv.php';
}
}
static function initExif()
{
if (extension_loaded('exif'))
{
if (ini_get('exif.encode_unicode') && 'UTF-8' !== strtoupper(ini_get('exif.encode_unicode')))
{
ini_set('exif.encode_unicode', 'UTF-8');
}
if (ini_get('exif.encode_jis') && 'UTF-8' !== strtoupper(ini_get('exif.encode_jis')))
{
ini_set('exif.encode_jis', 'UTF-8');
}
}
}
static function initIntl()
{
if (defined('GRAPHEME_CLUSTER_RX')) return;
preg_match('/^.$/u', '§') or user_error('PCRE is compiled without UTF-8 support', E_USER_WARNING);
extension_loaded('intl') or require __DIR__ . '/Bootup/intl.php';
if (PCRE_VERSION < '8.32')
{
// (CRLF|([ZWNJ-ZWJ]|T+|L*(LV?V+|LV|LVT)T*|L+|[^Control])[Extend]*|[Control])
// This regular expression is not up to date with the latest unicode grapheme cluster definition.
// However, until http://bugs.exim.org/show_bug.cgi?id=1279 is fixed, it's still better than \X
define('GRAPHEME_CLUSTER_RX', '(?:\r\n|(?:[ -~\x{200C}\x{200D}]|[ᆨ-ᇹ]+|[ᄀ-]*(?:[가개갸걔거게겨계고과괘괴교구궈궤귀규그긔기까깨꺄꺠꺼께껴꼐꼬꽈꽤꾀꾜꾸꿔꿰뀌뀨끄끠끼나내냐냬너네녀녜노놔놰뇌뇨누눠눼뉘뉴느늬니다대댜댸더데뎌뎨도돠돼되됴두둬뒈뒤듀드듸디따때땨떄떠떼뗘뗴또똬뙈뙤뚀뚜뚸뛔뛰뜌뜨띄띠라래랴럐러레려례로롸뢔뢰료루뤄뤠뤼류르릐리마매먀먜머메며몌모뫄뫠뫼묘무뭐뭬뮈뮤므믜미바배뱌뱨버베벼볘보봐봬뵈뵤부붜붸뷔뷰브븨비빠빼뺘뺴뻐뻬뼈뼤뽀뽜뽸뾔뾰뿌뿨쀄쀠쀼쁘쁴삐사새샤섀서세셔셰소솨쇄쇠쇼수숴쉐쉬슈스싀시싸쌔쌰썌써쎄쎠쎼쏘쏴쐐쐬쑈쑤쒀쒜쒸쓔쓰씌씨아애야얘어에여예오와왜외요우워웨위유으의이자재쟈쟤저제져졔조좌좨죄죠주줘줴쥐쥬즈즤지짜째쨔쨰쩌쩨쪄쪠쪼쫘쫴쬐쬬쭈쭤쮀쮜쮸쯔쯰찌차채챠챼처체쳐쳬초촤쵀최쵸추춰췌취츄츠츼치카캐캬컈커케켜켸코콰쾌쾨쿄쿠쿼퀘퀴큐크킈키타태탸턔터테텨톄토톼퇘퇴툐투퉈퉤튀튜트틔티파패퍄퍠퍼페펴폐포퐈퐤푀표푸풔풰퓌퓨프픠피하해햐햬허헤혀혜호화홰회효후훠훼휘휴흐희히]?[-ᆢ]+|[가-힣])[ᆨ-ᇹ]*|[ᄀ-]+|[^\p{Cc}\p{Cf}\p{Zl}\p{Zp}])[\p{Mn}\p{Me}\x{09BE}\x{09D7}\x{0B3E}\x{0B57}\x{0BBE}\x{0BD7}\x{0CC2}\x{0CD5}\x{0CD6}\x{0D3E}\x{0D57}\x{0DCF}\x{0DDF}\x{200C}\x{200D}\x{1D165}\x{1D16E}-\x{1D172}]*|[\p{Cc}\p{Cf}\p{Zl}\p{Zp}])');
}
else
{
define('GRAPHEME_CLUSTER_RX', '\X');
}
}
static function initLocale()
{
// With a non-UTF-8 locale, basename() misbehaves on multibyte names.
// Be aware that setlocale() can be slow.
// It is better to set your LANG environment variable to a UTF-8 locale.
if ('' === basename('§'))
{
setlocale(LC_ALL, 'C.UTF-8', 'C');
setlocale(LC_CTYPE, 'en_US.UTF-8', 'fr_FR.UTF-8', 'es_ES.UTF-8', 'de_DE.UTF-8', 'ru_RU.UTF-8', 'pt_BR.UTF-8', 'it_IT.UTF-8', 'ja_JP.UTF-8', 'zh_CN.UTF-8', 0);
}
}
static function filterRequestUri()
{
// Ensures the URL is well formed UTF-8
// When not, assumes Windows-1252 and redirects to the corresponding UTF-8 encoded URL
if (isset($_SERVER['REQUEST_URI']) && !preg_match('//u', urldecode($a = $_SERVER['REQUEST_URI'])))
{
if ($a === u::utf8_decode($a))
{
$a = preg_replace_callback(
'/(?:%[89A-F][0-9A-F])+/i',
function($m) {return urlencode(u::utf8_encode(urldecode($m[0])));},
$a
);
}
else $a = '/';
header('HTTP/1.1 301 Moved Permanently');
header('Location: ' . $a);
exit;
}
}
static function filterRequestInputs($normalization_form = /* n::NFC = */ 4, $pre_lead_comb = '◌')
{
// Ensures inputs are well formed UTF-8
// When not, assumes Windows-1252 and converts to UTF-8
// Tests only values, not keys
$a = array(&$_GET, &$_POST, &$_COOKIE, &$_REQUEST, &$_ENV);
foreach ($_FILES as &$v) $a[] = array(&$v['name'], &$v['type']);
$len = count($a);
for ($i = 0; $i < $len; ++$i)
{
foreach ($a[$i] as &$v)
{
if (is_array($v)) $a[$len++] =& $v;
else if (preg_match('/[\x80-\xFF]/', $v))
{
if (n::isNormalized($v, $normalization_form)) $w = '';
else
{
$w = n::normalize($v, $normalization_form);
if (false === $w) $v = u::utf8_encode($v);
else $v = $w;
}
if ($v[0] >= "\x80" && false !== $w && isset($pre_lead_comb[0]) && preg_match('/^\p{Mn}/u', $v))
{
// Prevent leading combining chars
// for NFC-safe concatenations.
$v = $pre_lead_comb . $v;
}
}
}
reset($a[$i]);
unset($a[$i]);
}
}
}
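
`filterRequestInputs()` above validates each input value and falls back to a legacy-encoding conversion when the value is not well-formed UTF-8. The sketch below shows the same idea in isolation. It is a simplification under stated assumptions: `sketch_to_utf8` is a hypothetical name, invalid input is treated as ISO-8859-1 rather than Windows-1252, and no normalization is applied.

```php
<?php
// Hypothetical helper: return $v unchanged if it is valid UTF-8,
// otherwise reinterpret it as ISO-8859-1 (an assumption) and convert it.
function sketch_to_utf8($v)
{
    // With the "u" flag, preg_match() fails on malformed UTF-8 subjects,
    // so matching the empty pattern doubles as a validity check.
    if (1 === preg_match('//u', $v)) {
        return $v;
    }

    // Every ISO-8859-1 byte >= 0x80 maps to a two-byte UTF-8 sequence.
    return preg_replace_callback('/[\x80-\xFF]/', function ($m) {
        $b = ord($m[0]);
        return chr(0xC0 | ($b >> 6)) . chr(0x80 | ($b & 0x3F));
    }, $v);
}

// Usage on a nested array shaped like request input.
$input = array('q' => "caf\xE9", 'nested' => array("na\xC3\xAFve"));
array_walk_recursive($input, function (&$v) {
    if (is_string($v)) {
        $v = sketch_to_utf8($v);
    }
});
```

Only values are visited here, as in the original method; array keys are left untouched.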


@@ -0,0 +1,48 @@
<?php // vi: set fenc=utf-8 ts=4 sw=4 et:
/*
* Copyright (C) 2013 Nicolas Grekas - p@tchwork.com
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the (at your option):
* Apache License v2.0 (http://apache.org/licenses/LICENSE-2.0.txt), or
* GNU General Public License v2.0 (http://gnu.org/licenses/gpl-2.0.txt).
*/
use Patchwork\PHP\Shim as s;
const ICONV_IMPL = 'Patchwork';
const ICONV_VERSION = '1.0';
const ICONV_MIME_DECODE_STRICT = 1;
const ICONV_MIME_DECODE_CONTINUE_ON_ERROR = 2;
function iconv($from, $to, $s) {return s\Iconv::iconv($from, $to, $s);};
function iconv_get_encoding($type = 'all') {return s\Iconv::iconv_get_encoding($type);};
function iconv_set_encoding($type, $charset) {return s\Iconv::iconv_set_encoding($type, $charset);};
function iconv_mime_encode($name, $value, $pref = INF) {return s\Iconv::iconv_mime_encode($name, $value, $pref);};
function ob_iconv_handler($buffer, $mode) {return s\Iconv::ob_iconv_handler($buffer, $mode);};
function iconv_mime_decode_headers($encoded_headers, $mode = 0, $charset = INF) {return s\Iconv::iconv_mime_decode_headers($encoded_headers, $mode, $charset);};
if (extension_loaded('mbstring'))
{
function iconv_strlen($s, $enc = INF) {return mb_strlen($s, $enc);};
function iconv_strpos($s, $needle, $offset = 0, $enc = INF) {return mb_strpos($s, $needle, $offset, $enc);};
function iconv_strrpos($s, $needle, $enc = INF) {return mb_strrpos($s, $needle, $enc);};
function iconv_substr($s, $start, $length = 2147483647, $enc = INF) {return mb_substr($s, $start, $length, $enc);};
function iconv_mime_decode($encoded_headers, $mode = 0, $charset = INF) {return mb_decode_mimeheader($encoded_headers, $mode, $charset);};
}
else
{
if (extension_loaded('xml'))
{
function iconv_strlen($s, $enc = INF) {return s\Iconv::strlen1($s, $enc);};
}
else
{
function iconv_strlen($s, $enc = INF) {return s\Iconv::strlen2($s, $enc);};
}
function iconv_strpos($s, $needle, $offset = 0, $enc = INF) {return s\Mbstring::mb_strpos($s, $needle, $offset, $enc);};
function iconv_strrpos($s, $needle, $enc = INF) {return s\Mbstring::mb_strrpos($s, $needle, $enc);};
function iconv_substr($s, $start, $length = 2147483647, $enc = INF) {return s\Mbstring::mb_substr($s, $start, $length, $enc);};
function iconv_mime_decode($encoded_headers, $mode = 0, $charset = INF) {return s\Iconv::iconv_mime_decode($encoded_headers, $mode, $charset);};
}
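
The file above picks an implementation for each `iconv_*` function once, at load time, depending on which extensions are available. A minimal sketch of that pattern follows. The function name `sketch_strlen` and the byte-counting fallback are illustrative assumptions, not the shim's actual `strlen1`/`strlen2` implementations.

```php
<?php
// Choose an implementation when the file is loaded, not on every call.
if (extension_loaded('mbstring')) {
    // Native path: delegate to the extension.
    function sketch_strlen($s)
    {
        return mb_strlen($s, 'UTF-8');
    }
} else {
    // Fallback path (assumes well-formed UTF-8): every code point has exactly
    // one byte that is not a continuation byte (0x80-0xBF), so dropping the
    // continuation bytes and counting what remains gives the code point count.
    function sketch_strlen($s)
    {
        return strlen(preg_replace('/[\x80-\xBF]/', '', $s));
    }
}
```

Both branches define the same function name, so callers never need to know which one was selected.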


@@ -0,0 +1,28 @@
<?php // vi: set fenc=utf-8 ts=4 sw=4 et:
/*
* Copyright (C) 2013 Nicolas Grekas - p@tchwork.com
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the (at your option):
* Apache License v2.0 (http://apache.org/licenses/LICENSE-2.0.txt), or
* GNU General Public License v2.0 (http://gnu.org/licenses/gpl-2.0.txt).
*/
use Patchwork\PHP\Shim as s;
const GRAPHEME_EXTR_COUNT = 0;
const GRAPHEME_EXTR_MAXBYTES = 1;
const GRAPHEME_EXTR_MAXCHARS = 2;
function normalizer_is_normalized($s, $form = s\Normalizer::NFC) {return s\Normalizer::isNormalized($s, $form);};
function normalizer_normalize($s, $form = s\Normalizer::NFC) {return s\Normalizer::normalize($s, $form);};
function grapheme_extract($s, $size, $type = 0, $start = 0, &$next = 0) {return s\Intl::grapheme_extract($s, $size, $type, $start, $next);};
function grapheme_stripos($s, $needle, $offset = 0) {return s\Intl::grapheme_stripos($s, $needle, $offset);};
function grapheme_stristr($s, $needle, $before_needle = false) {return s\Intl::grapheme_stristr($s, $needle, $before_needle);};
function grapheme_strlen($s) {return s\Intl::grapheme_strlen($s);};
function grapheme_strpos($s, $needle, $offset = 0) {return s\Intl::grapheme_strpos($s, $needle, $offset);};
function grapheme_strripos($s, $needle, $offset = 0) {return s\Intl::grapheme_strripos($s, $needle, $offset);};
function grapheme_strrpos($s, $needle, $offset = 0) {return s\Intl::grapheme_strrpos($s, $needle, $offset);};
function grapheme_strstr($s, $needle, $before_needle = false) {return s\Intl::grapheme_strstr($s, $needle, $before_needle);};
function grapheme_substr($s, $start, $len = 2147483647) {return s\Intl::grapheme_substr($s, $start, $len);};
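
The `grapheme_*` functions declared above operate on grapheme clusters rather than bytes or code points. As a rough illustration of the difference, the sketch below splits a string with PCRE's `\X` escape, which is what `GRAPHEME_CLUSTER_RX` resolves to on PCRE 8.32 or later. `sketch_graphemes` is a hypothetical name, and a reasonably recent PCRE is assumed.

```php
<?php
// Hypothetical helper: split a UTF-8 string into grapheme clusters using \X.
function sketch_graphemes($s)
{
    preg_match_all('/\X/u', $s, $m);
    return $m[0];
}

// "n", then "e" followed by U+0301 COMBINING ACUTE ACCENT, then "e".
$s = "ne\xCC\x81e";

$bytes    = strlen($s);                      // counts bytes
$clusters = count(sketch_graphemes($s));     // counts user-perceived characters
```

The base letter and its combining accent stay together in one cluster, which is what keeps operations such as reversing or truncating a string from separating them.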


@@ -0,0 +1,40 @@
<?php // vi: set fenc=utf-8 ts=4 sw=4 et:
/*
* Copyright (C) 2013 Nicolas Grekas - p@tchwork.com
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the (at your option):
* Apache License v2.0 (http://apache.org/licenses/LICENSE-2.0.txt), or
* GNU General Public License v2.0 (http://gnu.org/licenses/gpl-2.0.txt).
*/
use Patchwork\PHP\Shim as s;
const MB_OVERLOAD_MAIL = 1;
const MB_OVERLOAD_STRING = 2;
const MB_OVERLOAD_REGEX = 4;
const MB_CASE_UPPER = 0;
const MB_CASE_LOWER = 1;
const MB_CASE_TITLE = 2;
function mb_convert_encoding($s, $to, $from = INF) {return s\Mbstring::mb_convert_encoding($s, $to, $from);};
function mb_decode_mimeheader($s) {return s\Mbstring::mb_decode_mimeheader($s);};
function mb_encode_mimeheader($s, $charset = INF, $transfer_enc = INF, $lf = INF, $indent = INF) {return s\Mbstring::mb_encode_mimeheader($s, $charset, $transfer_enc, $lf, $indent);};
function mb_convert_case($s, $mode, $enc = INF) {return s\Mbstring::mb_convert_case($s, $mode, $enc);};
function mb_internal_encoding($enc = INF) {return s\Mbstring::mb_internal_encoding($enc);};
function mb_list_encodings() {return s\Mbstring::mb_list_encodings();};
function mb_parse_str($s, &$result = array()) {return parse_str($s, $result);};
function mb_strlen($s, $enc = INF) {return s\Mbstring::mb_strlen($s, $enc);};
function mb_strpos($s, $needle, $offset = 0, $enc = INF) {return s\Mbstring::mb_strpos($s, $needle, $offset, $enc);};
function mb_strtolower($s, $enc = INF) {return s\Mbstring::mb_strtolower($s, $enc);};
function mb_strtoupper($s, $enc = INF) {return s\Mbstring::mb_strtoupper($s, $enc);};
function mb_substitute_character($char = INF) {return s\Mbstring::mb_substitute_character($char);};
function mb_substr_count($s, $needle) {return substr_count($s, $needle);};
function mb_substr($s, $start, $length = 2147483647, $enc = INF) {return s\Mbstring::mb_substr($s, $start, $length, $enc);};
function mb_stripos($s, $needle, $offset = 0, $enc = INF) {return s\Mbstring::mb_stripos($s, $needle, $offset, $enc);};
function mb_stristr($s, $needle, $part = false, $enc = INF) {return s\Mbstring::mb_stristr($s, $needle, $part, $enc);};
function mb_strrchr($s, $needle, $part = false, $enc = INF) {return s\Mbstring::mb_strrchr($s, $needle, $part, $enc);};
function mb_strrichr($s, $needle, $part = false, $enc = INF) {return s\Mbstring::mb_strrichr($s, $needle, $part, $enc);};
function mb_strripos($s, $needle, $offset = 0, $enc = INF) {return s\Mbstring::mb_strripos($s, $needle, $offset, $enc);};
function mb_strrpos($s, $needle, $offset = 0, $enc = INF) {return s\Mbstring::mb_strrpos($s, $needle, $offset, $enc);};
function mb_strstr($s, $needle, $part = false, $enc = INF) {return s\Mbstring::mb_strstr($s, $needle, $part, $enc);};


@@ -0,0 +1,14 @@
<?php // vi: set fenc=utf-8 ts=4 sw=4 et:
/*
* Copyright (C) 2013 Nicolas Grekas - p@tchwork.com
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the (at your option):
* Apache License v2.0 (http://apache.org/licenses/LICENSE-2.0.txt), or
* GNU General Public License v2.0 (http://gnu.org/licenses/gpl-2.0.txt).
*/
use Patchwork\PHP\Shim as s;
function utf8_encode($s) {return s\Xml::utf8_encode($s);};
function utf8_decode($s) {return s\Xml::utf8_decode($s);};


@@ -0,0 +1 @@
a:2:{i:0;a:104:{i:0;s:2:"ß";i:1;s:2:"İ";i:2;s:2:"ʼn";i:3;s:2:"ǰ";i:4;s:2:"ΐ";i:5;s:2:"ΰ";i:6;s:2:"և";i:7;s:3:"ẖ";i:8;s:3:"ẗ";i:9;s:3:"ẘ";i:10;s:3:"ẙ";i:11;s:3:"ẚ";i:12;s:3:"ẞ";i:13;s:3:"ὐ";i:14;s:3:"ὒ";i:15;s:3:"ὔ";i:16;s:3:"ὖ";i:17;s:3:"ᾀ";i:18;s:3:"ᾁ";i:19;s:3:"ᾂ";i:20;s:3:"ᾃ";i:21;s:3:"ᾄ";i:22;s:3:"ᾅ";i:23;s:3:"ᾆ";i:24;s:3:"ᾇ";i:25;s:3:"ᾈ";i:26;s:3:"ᾉ";i:27;s:3:"ᾊ";i:28;s:3:"ᾋ";i:29;s:3:"ᾌ";i:30;s:3:"ᾍ";i:31;s:3:"ᾎ";i:32;s:3:"ᾏ";i:33;s:3:"ᾐ";i:34;s:3:"ᾑ";i:35;s:3:"ᾒ";i:36;s:3:"ᾓ";i:37;s:3:"ᾔ";i:38;s:3:"ᾕ";i:39;s:3:"ᾖ";i:40;s:3:"ᾗ";i:41;s:3:"ᾘ";i:42;s:3:"ᾙ";i:43;s:3:"ᾚ";i:44;s:3:"ᾛ";i:45;s:3:"ᾜ";i:46;s:3:"ᾝ";i:47;s:3:"ᾞ";i:48;s:3:"ᾟ";i:49;s:3:"ᾠ";i:50;s:3:"ᾡ";i:51;s:3:"ᾢ";i:52;s:3:"ᾣ";i:53;s:3:"ᾤ";i:54;s:3:"ᾥ";i:55;s:3:"ᾦ";i:56;s:3:"ᾧ";i:57;s:3:"ᾨ";i:58;s:3:"ᾩ";i:59;s:3:"ᾪ";i:60;s:3:"ᾫ";i:61;s:3:"ᾬ";i:62;s:3:"ᾭ";i:63;s:3:"ᾮ";i:64;s:3:"ᾯ";i:65;s:3:"ᾲ";i:66;s:3:"ᾳ";i:67;s:3:"ᾴ";i:68;s:3:"ᾶ";i:69;s:3:"ᾷ";i:70;s:3:"ᾼ";i:71;s:3:"ῂ";i:72;s:3:"ῃ";i:73;s:3:"ῄ";i:74;s:3:"ῆ";i:75;s:3:"ῇ";i:76;s:3:"ῌ";i:77;s:3:"ῒ";i:78;s:3:"ΐ";i:79;s:3:"ῖ";i:80;s:3:"ῗ";i:81;s:3:"ῢ";i:82;s:3:"ΰ";i:83;s:3:"ῤ";i:84;s:3:"ῦ";i:85;s:3:"ῧ";i:86;s:3:"ῲ";i:87;s:3:"ῳ";i:88;s:3:"ῴ";i:89;s:3:"ῶ";i:90;s:3:"ῷ";i:91;s:3:"ῼ";i:92;s:3:"ff";i:93;s:3:"fi";i:94;s:3:"fl";i:95;s:3:"ffi";i:96;s:3:"ffl";i:97;s:3:"ſt";i:98;s:3:"st";i:99;s:3:"ﬓ";i:100;s:3:"ﬔ";i:101;s:3:"ﬕ";i:102;s:3:"ﬖ";i:103;s:3:"ﬗ";}i:1;a:104:{i:0;s:2:"ss";i:1;s:3:"i̇";i:2;s:3:"ʼn";i:3;s:3:"ǰ";i:4;s:6:"ΐ";i:5;s:6:"ΰ";i:6;s:4:"եւ";i:7;s:3:"ẖ";i:8;s:3:"ẗ";i:9;s:3:"ẘ";i:10;s:3:"ẙ";i:11;s:3:"aʾ";i:12;s:2:"ss";i:13;s:4:"ὐ";i:14;s:6:"ὒ";i:15;s:6:"ὔ";i:16;s:6:"ὖ";i:17;s:5:"ἀι";i:18;s:5:"ἁι";i:19;s:5:"ἂι";i:20;s:5:"ἃι";i:21;s:5:"ἄι";i:22;s:5:"ἅι";i:23;s:5:"ἆι";i:24;s:5:"ἇι";i:25;s:5:"ἀι";i:26;s:5:"ἁι";i:27;s:5:"ἂι";i:28;s:5:"ἃι";i:29;s:5:"ἄι";i:30;s:5:"ἅι";i:31;s:5:"ἆι";i:32;s:5:"ἇι";i:33;s:5:"ἠι";i:34;s:5:"ἡι";i:35;s:5:"ἢι";i:36;s:5:"ἣι";i:37;s:5:"ἤι";i:38;s:5:"ἥι";i:39;s:5:"ἦι";i:40;s:5:"ἧι";i:41;s:5:"ἠι";i:42;s:5:"ἡι";i:43;s:5:"ἢι";i:44;s:5:"ἣι";i:45;s:
5:"ἤι";i:46;s:5:"ἥι";i:47;s:5:"ἦι";i:48;s:5:"ἧι";i:49;s:5:"ὠι";i:50;s:5:"ὡι";i:51;s:5:"ὢι";i:52;s:5:"ὣι";i:53;s:5:"ὤι";i:54;s:5:"ὥι";i:55;s:5:"ὦι";i:56;s:5:"ὧι";i:57;s:5:"ὠι";i:58;s:5:"ὡι";i:59;s:5:"ὢι";i:60;s:5:"ὣι";i:61;s:5:"ὤι";i:62;s:5:"ὥι";i:63;s:5:"ὦι";i:64;s:5:"ὧι";i:65;s:5:"ὰι";i:66;s:4:"αι";i:67;s:4:"άι";i:68;s:4:"ᾶ";i:69;s:6:"ᾶι";i:70;s:4:"αι";i:71;s:5:"ὴι";i:72;s:4:"ηι";i:73;s:4:"ήι";i:74;s:4:"ῆ";i:75;s:6:"ῆι";i:76;s:4:"ηι";i:77;s:6:"ῒ";i:78;s:6:"ΐ";i:79;s:4:"ῖ";i:80;s:6:"ῗ";i:81;s:6:"ῢ";i:82;s:6:"ΰ";i:83;s:4:"ῤ";i:84;s:4:"ῦ";i:85;s:6:"ῧ";i:86;s:5:"ὼι";i:87;s:4:"ωι";i:88;s:4:"ώι";i:89;s:4:"ῶ";i:90;s:6:"ῶι";i:91;s:4:"ωι";i:92;s:2:"ff";i:93;s:2:"fi";i:94;s:2:"fl";i:95;s:3:"ffi";i:96;s:3:"ffl";i:97;s:2:"st";i:98;s:2:"st";i:99;s:4:"մն";i:100;s:4:"մե";i:101;s:4:"մի";i:102;s:4:"վն";i:103;s:4:"մխ";}}

File diff suppressed because one or more lines are too long

24
vendor/patchwork/utf8/composer.json vendored Normal file

@@ -0,0 +1,24 @@
{
"name": "patchwork/utf8",
"type": "library",
"description": "UTF-8 strings handling for PHP 5.3: portable, performant and extended",
"keywords": ["utf8","utf-8","unicode","i18n"],
"homepage": "https://github.com/nicolas-grekas/Patchwork-UTF8",
"license": "(Apache-2.0 or GPL-2.0)",
"authors": [
{
"name": "Nicolas Grekas",
"email": "p@tchwork.com",
"role": "Developer"
}
],
"require": {
"php": ">=5.3.0"
},
"autoload": {
"psr-0": {
"Patchwork": "class/",
"Normalizer": "class/"
}
}
}