PHP utf8 problem


I'm having trouble comparing a UTF-8 character against an array of Norwegian characters.

All characters except the special Norwegian characters (æ, ø, å) work fine.

function isNorwegianChar($Char)
{
    // Characters that may appear in the Norwegian part of a line.
    $aNorwegianChars = array('a', 'A', 'b', 'B', 'c', 'C', 'd', 'D', 'e', 'E', 'f', 'F', 'g', 'G', 'h', 'H', 'i', 'I', 'j', 'J', 'k', 'K', 'l', 'L', 'm', 'M', 'n', 'N', 'o', 'O', 'p', 'P', 'q', 'Q', 'r', 'R', 's', 'S', 't', 'T', 'u', 'U', 'v', 'V', 'w', 'W', 'x', 'X', 'y', 'Y', 'z', 'Z', 'æ', 'Æ', 'ø', 'Ø', 'å', 'Å', '=', '(', ')', ' ', '-');
    $iArrayLength = count($aNorwegianChars);

    for($iCount = 0; $iCount < $iArrayLength; $iCount++)
    {
        if($aNorwegianChars[$iCount] == $Char)
        {
            return true;
        }
    }

    return false;
}


If anyone has any idea about what I can do, please let me know.


The reason I need this is that I'm trying to parse a text file containing lines of Norwegian and Chinese words, like a dictionary. I want to split each line into two strings, one containing the Norwegian word and one containing the Chinese. These will later be inserted into a database. Example lines:

impulsiv 形 冲动的

imøtegå 动 反对,反驳

imøtekomme 动 符合

alkoholmisbruk(er) 名 滥用酒精 (名 滥用酒精的人)

alkoholpåvirket 形 受酒精影响的

alkotest 名 呼吸性酒精测试

alkymi(st) 名 炼金术 (名 炼金术士)

all, alt, alle, 形 全部, 所有

As you can see, there may be spaces within the words, so I cannot use something simple like explode to split the Norwegian and Chinese words. Instead, I use isNorwegianChar and loop through the line until I find a character that is not in the array.
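
For comparison, here is a minimal sketch of the same splitting step done with a different technique: a Unicode-aware regular expression instead of a character whitelist. It assumes the input is valid UTF-8 and that the Chinese part always begins with a Han character. splitDictionaryLine is a hypothetical helper name, not something from my code.

```php
<?php
// Hypothetical helper: splits one dictionary line at the first Han character.
// Assumes $sLine is valid UTF-8 (required by the /u modifier).
function splitDictionaryLine($sLine)
{
    // Lazily capture everything before the first Han character,
    // then capture the rest of the line starting at that character.
    if (preg_match('/^(.*?)(\p{Han}.*)$/us', $sLine, $aMatches))
    {
        return array(trim($aMatches[1]), trim($aMatches[2]));
    }

    // No Han character found: treat the whole line as Norwegian.
    return array(trim($sLine), '');
}

list($sNorwegianWord, $sChineseWord) = splitDictionaryLine("alkymi(st) 名 炼金术 (名 炼金术士)");
echo $sNorwegianWord, "\n"; // alkymi(st)
echo $sChineseWord, "\n";   // 名 炼金术 (名 炼金术士)
```

This avoids listing every allowed Norwegian character, at the cost of relying on the Chinese part starting with a Han character on every line.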

The problem is that æ, ø and å are not recognised as Norwegian characters, so the code thinks the Chinese word has started.
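
To illustrate what may be going on, here is a minimal sketch, assuming the script file and the input text are both UTF-8 encoded. It shows how byte-oriented and character-oriented string functions see the same word differently, which is one possible reason a multi-byte letter fails a comparison. The variable names are made up for this illustration.

```php
<?php
// Assumption: this file is saved as UTF-8, so the literal below is UTF-8 too.
$sWord = 'imøtegå';

// strlen() counts bytes; ø and å each occupy two bytes in UTF-8.
$iBytes = strlen($sWord);                    // 9

// mb_strlen() with an explicit encoding counts characters.
$iChars = mb_strlen($sWord, 'UTF-8');        // 7

// With an explicit encoding, mb_substr() returns the complete character.
$sWhole = mb_substr($sWord, 2, 1, 'UTF-8');  // 'ø'

// The byte-oriented substr() returns only the first byte of 'ø'.
$sHalf = substr($sWord, 2, 1);

var_dump($sWhole === 'ø'); // bool(true)
var_dump($sHalf === 'ø');  // bool(false)
```

If the multibyte functions fall back to a single-byte internal encoding, mb_substr may behave like the last case, so passing the encoding explicitly (or checking mb_internal_encoding()) seems worth trying.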

Here is the code:

// Open file.
$rFile = fopen("norsk-kinesisk.txt", "r");

// Loop through the file.
$Count = 0;
while(!feof($rFile))
{
    // Stop after the first 40 lines.
    if(40 == $Count)
    {
        break;
    }

    $sLine = fgets($rFile);

    // Strip the first three characters from the first line of the file.
    if(0 == $Count)
    {
        $sLine = mb_substr($sLine, 3);
    }

    $iLineLength        = strlen($sLine);
    $bChineseHasStarted = false;
    $sNorwegianWord     = '';
    $sChineseWord       = '';
    for($iCount2 = 0; $iCount2 < $iLineLength; $iCount2++)
    {
        $char = mb_substr($sLine, $iCount2, 1);

        // The first character outside the Norwegian set marks the start of the Chinese part.
        if(($bChineseHasStarted === false) && (false == isNorwegianChar($char)))
        {
            $bChineseHasStarted = true;
        }

        if(false === $bChineseHasStarted)
        {
            $sNorwegianWord .= $char;
        }
        else
        {
            $sChineseWord .= $char;
        }

        //echo $char;
    }

    $sNorwegianWord = trim($sNorwegianWord);
    $sChineseWord = trim($sChineseWord);

    $Count++;
}