(PHP 4, PHP 5)
recode_string — Recode a string according to a recode request
The string parameter is converted according to the recode request given in request.
Returns the recoded string, or FALSE if the recode request could not be performed.
Example #1 Basic recode_string() example:
<?php
echo recode_string("us..flat", "The following character has a diacritical mark: á");
?>
A simple recode request could be "lat1..iso646-de".
Seems to require that librecode be installed.
Try iconv() instead.
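A minimal sketch of the iconv() route, assuming the iconv extension is enabled and the input is ISO-8859-1. The helper name latin1_to_utf8 is made up for this illustration and is not part of PHP.

```php
<?php
// Hypothetical helper (not a PHP built-in): converts ISO-8859-1 text to UTF-8
// with iconv() and reports a failed conversion as null instead of false.
function latin1_to_utf8($text)
{
    $converted = iconv('ISO-8859-1', 'UTF-8', $text);
    return $converted === false ? null : $converted;
}

// "\xE1" is the single byte for "á" in ISO-8859-1.
echo bin2hex(latin1_to_utf8("\xE1")), "\n"; // prints "c3a1"
```

Like recode_string(), iconv() returns false on failure, so the result should be compared with === before it is used.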
Here's how to convert romaji to katakana/hiragana with PHP (transliterating Japanese text).
The function Romaji2Kana($s) returns an array whose keys 'hira' and 'kata' hold the hiragana and katakana versions of the given string, encoded in UTF-8.
<?php
// eucjp: 2421; unicode: 3041
define('HIRATABLE', 'a A i I u U e E o O KAGAKIGIKUGUKEGEKOGOSAZASIZISUZUSEZESOZO'.
'TADATIDItuTUDUTEDETODONANINUNENOHABAPAHIBIPIHUBUPUHEBEPEHOBOPO'.
'MAMIMUMEMOyaYAyuYUyoYORARIRUREROwaWAWIWEWOn ');
// eucjp: 2521; unicode: 30A1
define('KATATABLE', 'a A i I u U e E o O KAGAKIGIKUGUKEGEKOGOSAZASIZISUZUSEZESOZO'.
'TADATIDItuTUDUTEDETODONANINUNENOHABAPAHIBIPIHUBUPUHEBEPEHOBOPO'.
'MAMIMUMEMOyaYAyuYUyoYORARIRUREROwaWAWIWEWOn VUkake');
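// Map one table entry (e.g. "KA", "tu", "n") to its 2-byte EUC-JP hiragana code.
function HiraTrans($s)
{
  $pos = strpos(HIRATABLE, $s);
  if($pos===false) return 0xA1BC; // not in the table (e.g. '^'): long vowel mark
  return 0xA4A1 + $pos/2;         // two table characters per kana
}
// Same lookup for katakana.
function KataTrans($s)
{
  $pos = strpos(KATATABLE, $s);
  if($pos===false) return 0xA1BC; // not in the table (e.g. '^'): long vowel mark
  return 0xA5A1 + $pos/2;         // two table characters per kana
}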
function Romaji2Kana($s)
{
  $s = strtoupper(str_replace(
     Array('shi', 'sh', 'fu', 'chi', 'ch', 'tsu', 'dz', 'l', '-',
           'â', 'î', 'û', 'ê', 'ô', 'ā', 'ī', 'ū', 'ē', 'ō'),
     Array('si',  'sy', 'hu', 'ti',  'ty', 'tu',  'j',  'r', '^',
           'a^', 'i^', 'u^', 'e^', 'o^', 'a^', 'i^', 'u^', 'e^', 'o^'),
     $s));
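  // Callbacks are used instead of the deprecated /e modifier (closures need PHP 5.3+).
  // FO -> FUxo
  $s = preg_replace_callback('@F([AIOE])@',
         function ($m) { return 'HU' . strtolower($m[1]); }, $s);
  // VO -> VUxo
  $s = preg_replace_callback('@V([AIUEO])@',
         function ($m) { return 'VU' . strtolower($m[1]); }, $s);
  // KYA -> KIya
  $s = preg_replace_callback('@([KSTNHMRGZBPD])Y([AUO])@',
         function ($m) { return $m[1] . 'Iy' . strtolower($m[2]); }, $s);
  // XTU -> tu (make them actually small)
  $s = preg_replace_callback('@X(TU|Y[AUO]|[AIUEO]|KA|KE)@',
         function ($m) { return strtolower($m[1]); }, $s);
  // KKO -> tuKO
  $s = preg_replace_callback('@([KSTHMRYWGZBPDV]{2,})@',
         function ($m) { return str_pad('', 2 * strlen($m[1]) - 2, 'tu') . substr($m[1], 0, 1); }, $s);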
  // N -> n (but not NO -> nO)
  // At this point, N' will work correctly
  $s = preg_replace('@N(?![AIUEO])@', 'n', $s);
  // Strip unrecognized characters (preg_replace with /i replaces the deprecated eregi_replace)
  $s = preg_replace('@[^^VAIUEOKSTNHMYRWGZBPD]@i', '', $s);
  
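  $pat = '@([AIUEOnaiueo^]|..)@';
  $rec = 'EUCJP..UTF8';
  
  // Each matched syllable becomes a 2-byte big-endian EUC-JP code.
  $hira = preg_replace_callback($pat, function ($m) { return pack('n', HiraTrans($m[1])); }, $s);
  $kata = preg_replace_callback($pat, function ($m) { return pack('n', KataTrans($m[1])); }, $s);
  return
    Array('hira' => recode_string($rec, $hira),
          'kata' => recode_string($rec, $kata));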
}
print_r( Romaji2Kana('konnichiha') );
?>
Note: Due to technical limitations in the manual pages, there are two errors in this code:
- Some characters in the first str_replace may appear wrong on some php.net mirrors. The search array is supposed to contain a, i, u, e, o with circumflex and a, i, u, e, o with macron.
- The strings in the define() calls should be single constants, not concatenation expressions. They are split only because of the line length limitation.
-Joel Yliluoma
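The table lookup in the note above can also be sketched without librecode. This is only an illustration, not the author's method: it assumes the Unicode hiragana block (starting at U+3041, as the code comment says) follows the same order as the table, and the helper names hira_utf8 and utf8_three_byte are invented here.

```php
<?php
// Same syllable order as the HIRATABLE constant in the note above.
$hiraTable = 'a A i I u U e E o O KAGAKIGIKUGUKEGEKOGOSAZASIZISUZUSEZESOZO'
           . 'TADATIDItuTUDUTEDETODONANINUNENOHABAPAHIBIPIHUBUPUHEBEPEHOBOPO'
           . 'MAMIMUMEMOyaYAyuYUyoYORARIRUREROwaWAWIWEWOn ';

// Hypothetical helper: encode a code point in U+0800..U+FFFF as 3-byte UTF-8.
function utf8_three_byte($cp)
{
    return chr(0xE0 | ($cp >> 12))
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

// Hypothetical helper: look up one table entry ("KA", "tu", "n", "A", ...).
// Returns the kana as UTF-8, or null if the entry is not in the table.
function hira_utf8($table, $syllable)
{
    $key = str_pad($syllable, 2);           // single letters are stored with a trailing space
    $pos = strpos($table, $key);
    if ($pos === false || $pos % 2 !== 0) { // reject misses and misaligned matches
        return null;
    }
    return utf8_three_byte(0x3041 + (int) ($pos / 2)); // two table characters per kana
}

$word = '';
foreach (array('KO', 'n', 'NI', 'TI', 'HA') as $syllable) {
    $word .= hira_utf8($hiraTable, $syllable);
}
echo $word, "\n"; // prints "こんにちは"
```

The syllable list is what Romaji2Kana() derives from 'konnichiha' before the lookup. The romaji normalisation itself is not repeated here.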
I came across a bug (and workaround) when using recode_string. When converting from utf-8 to iso-2022-jp, it would always return an empty string (although it would work fine for conversions from html to utf8). Converting with recode on the command line worked fine, which was odd. I noticed that if I specified "-v" on the command line, recode stated that it was using libiconv to do the conversion.
Using "iconv" instead of recode got the right results.
i.e.
Works:
$str = recode_string("html..utf-8", "&#26085;&#26412;&#35486;"); // character references for 日本語 ("Japanese")
Doesn't work:
$str = recode_string("utf-8..iso-2022-jp", $mystring);
Works:
$str = iconv("utf-8", "iso-2022-jp", $mystring);
Don't ask me why. Hope this saves someone some frustrating hours debugging.
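One way to package this workaround, as a rough sketch only: try recode_string() when the extension is present, and fall back to iconv() when it returns false or an empty string. The function name convert_charset is made up, and it assumes both libraries accept the same charset names, which is not true for every alias.

```php
<?php
// Hypothetical wrapper, not part of PHP or recode.
// Assumption: $from and $to are names that both recode and iconv understand.
function convert_charset($from, $to, $text)
{
    if ($text === '') {
        return '';
    }
    if (function_exists('recode_string')) {
        $result = recode_string($from . '..' . $to, $text);
        // An empty result for non-empty input is treated as a failure,
        // matching the behaviour described in the note above.
        if ($result !== false && $result !== '') {
            return $result;
        }
    }
    $result = iconv($from, $to, $text);
    return $result === false ? null : $result;
}

$latin1 = convert_charset('UTF-8', 'ISO-8859-1', "\xC3\xA1"); // "á" in UTF-8
echo bin2hex($latin1), "\n"; // prints "e1"
```

The example uses ISO-8859-1 rather than ISO-2022-JP because not every iconv implementation ships the Japanese encodings.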