
My web page uses charset UTF-8 to allow Chinese character input in a textarea form field. I want to test whether the input contains a certain character. I've written a test script to see how Perl handles the Chinese input, but it does not find a match even when I submit a character that should match.

Here is my test form:

<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
  </head>
  <body>

    <form method="post" action="http://www.my_domain.com/cgi-bin/my_test_script.pl">
          <textarea name="user_input" rows="" cols=""></textarea>
          <input type="submit" name="submit" value="submit">
    </form>

  </body>
</html>

Here is my code:

#!/usr/bin/perl -T
use strict;
use warnings;
use CGI;
use CGI::Carp qw(warningsToBrowser fatalsToBrowser); 
use utf8;

print "Content-type: text/html; charset=UTF-8\n\n";
print "<meta http-equiv='content-type' content='text/html;charset=UTF-8'>";

my $query = new CGI;
my $msg = $query->param('user_input');
chomp $msg;
my $msg_code = ord($msg);

print "<p> Message was: ".$msg."\n";
print "<p> Message Code is: ".$msg_code."\n";

my $char_from_code_point = "\N{U+89C6}";
my $char_from_code_point_reverse_code = ord($char_from_code_point);

print "<p> char_from_code_point= ".$char_from_code_point."\n";
print "<p> char_from_code_point_reverse_code =     ".$char_from_code_point_reverse_code."\n";

if ($msg =~ m/$char_from_code_point/) {
  print "<p>Matched!\n";
  }
else {
  print "<p> NOT matched\n";
}

And here is the output from submitting the correct character:

Message was: 视

Message Code is: 232

char_from_code_point= 视

char_from_code_point_reverse_code = 35270

NOT matched 

Could someone please point out what I'm doing wrong?

Thank you.

Jim_1234
  • Decode the inputs. (Use `use CGI qw( -utf8 );`.) You also need to encode your outputs. (Add `binmode STDOUT, ':encoding(UTF-8)'; binmode STDERR, ':encoding(UTF-8)';`.) – ikegami Mar 16 '18 at 22:58
  • https://stackoverflow.com/questions/49309468/perl-validate-chinese-character-input-from-web-page-form/49309829#49309829 – Gilles Quénot Mar 17 '18 at 01:36
  • @ikegami's answer did it for me: "Decode the inputs. (Use use CGI qw( -utf8 );)" Simply changing the "use CGI;" to "use CGI qw( -utf8 );" made everything work. Thank you. [Now how do I mark a comment as the answer?] – Jim_1234 Mar 17 '18 at 14:12
  • @Jim_1234 you can't (mark a comment as an answer). You can ask ikegami to write a real answer and then you can validate it. – Patrick Mevzek Mar 17 '18 at 16:00
  • Possible duplicate of [Perl - Validate Chinese character input from web page form?](https://stackoverflow.com/questions/49309468/perl-validate-chinese-character-input-from-web-page-form) – Tom Blodget Mar 17 '18 at 20:05
  • @ikegami - would you care to post your comment as an answer so I can mark it as the answer? – Jim_1234 Mar 18 '18 at 14:05
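
As a rough illustration of why the decoding advice in the first comment works (a sketch only, not the asker's script): the reported "Message Code is: 232" suggests that `param` returned the raw UTF-8 bytes, since 232 is 0xE8, the first of the three bytes E8 A7 86 that encode U+89C6. The regex then compares three byte-sized characters against one wide character and cannot match. The snippet below uses the core `Encode` module as a stand-in for what `use CGI qw( -utf8 );` does on input; the variable names and the hard-coded byte string are assumptions made for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# Assumption: stand-in for what $query->param('user_input') returns
# when the input is NOT decoded, i.e. the raw UTF-8 bytes of U+89C6.
my $raw_bytes = "\xE8\xA7\x86";

# The character to look for: one character, code point 35270.
my $target = "\N{U+89C6}";

# Undecoded: ord() reports the first byte, and the match fails because
# the string holds three separate characters (0xE8, 0xA7, 0x86).
my $raw_code    = ord($raw_bytes);
my $raw_matches = ($raw_bytes =~ /\Q$target\E/) ? 1 : 0;

# Decoded: the three bytes become a single character, so both ord()
# and the regex now see U+89C6.
my $text         = decode('UTF-8', $raw_bytes);
my $text_code    = ord($text);
my $text_matches = ($text =~ /\Q$target\E/) ? 1 : 0;

# Encode on output so that wide characters are written as UTF-8 bytes.
binmode STDOUT, ':encoding(UTF-8)';
print "raw:     code=$raw_code matched=$raw_matches\n";
print "decoded: code=$text_code matched=$text_matches\n";
```

Run from a shell, this should print `code=232 matched=0` for the raw string and `code=35270 matched=1` for the decoded one, which mirrors the output in the question before and after the change to `use CGI qw( -utf8 );`.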

0 Answers