Validating Cyrillic (UTF-8) alphanumeric input with PHP preg_match and regular expressions

Recently I needed a validation rule to check that an input field contains only Cyrillic alphanumeric text. The answers I found suggested listing every character of the alphabet one by one. Luckily, I found a cleverer solution.

It can be done with preg_match() with the following pattern:


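$str = "ABC abc 1234 АБВ абв";

// Anchored character class: Latin letters, Cyrillic letters, digits, whitespace and dash.
// Single quotes keep the backslashes and the $ exactly as written.
$pattern = '/^[a-zA-Z\p{Cyrillic}0-9\s\-]+$/u';

$result = (bool) preg_match($pattern, $str);
if ($result) {
    echo "$str is composed of Cyrillic and alphanumeric characters\n";
}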
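
One caveat with the (bool) cast: with the u modifier, preg_match() returns false (not 0) when the subject string is not valid UTF-8, and the cast hides that difference. Below is a minimal sketch that separates the three outcomes. The function name check_input and the three return strings are made up for this example.

```php
<?php
// Hypothetical helper: reports 'valid', 'invalid' or 'error'.
function check_input(string $str): string
{
    $pattern = '/^[a-zA-Z\p{Cyrillic}0-9\s\-]+$/u';
    $result = preg_match($pattern, $str);

    if ($result === false) {
        // With the u modifier this happens when $str is not valid UTF-8.
        return 'error';
    }
    return $result === 1 ? 'valid' : 'invalid';
}

echo check_input("АБВ абв 123"), "\n";   // valid
echo check_input("abc!"), "\n";          // invalid: "!" is not in the class
echo check_input("\xD0"), "\n";          // error: truncated UTF-8 sequence
```

If you only care about accept or reject, comparing the result with === 1 is enough, since both 0 and false then count as a rejection.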



Here is a decomposition of the pattern:

a-zA-Z is for the Latin characters. You can omit this if you want only Cyrillic input.

\p{Cyrillic} is for the Cyrillic characters. You can also use \p{Arabic}, \p{Greek} or other alphabets. See this site for a full list of Unicode scripts.

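0-9 is for the digits.

\s is for whitespace. Note that it matches any whitespace character (space, tab, newline), not only the space.

\- is for the dash.

The u modifier at the end makes PCRE treat both the pattern and the subject as UTF-8. Without it the string is matched byte by byte and the Cyrillic letters will not match.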
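
The pieces above can be recombined for other cases. As a rough sketch (the function name is_script_alnum and its parameters are my own invention, not part of PHP), here is a helper that builds the character class for a given script and optionally leaves out the Latin letters:

```php
<?php
// Hypothetical helper: $script should be a script name PCRE knows,
// e.g. 'Cyrillic', 'Greek' or 'Arabic'.
function is_script_alnum(string $str, string $script = 'Cyrillic', bool $allowLatin = true): bool
{
    // Only let plain letters and underscores through into the pattern.
    if (preg_match('/^[A-Za-z_]+$/', $script) !== 1) {
        return false;
    }
    $latin = $allowLatin ? 'a-zA-Z' : '';
    $pattern = '/^[' . $latin . '\p{' . $script . '}0-9\s\-]+$/u';

    // Anything other than 1 (no match, or an error) counts as a rejection.
    return preg_match($pattern, $str) === 1;
}

var_dump(is_script_alnum("АБВ абв 123"));                  // bool(true)
var_dump(is_script_alnum("ABC абв", 'Cyrillic', false));   // bool(false): Latin not allowed
var_dump(is_script_alnum("αβγ 42", 'Greek', false));       // bool(true)
```

Passing a script name that PCRE does not recognise makes the pattern fail to compile, so in this sketch such input is simply rejected (PHP will also emit a warning).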