I've created a generic search form in PHP to query a couchdb-lucene full-text index. Sample code:

<?php
if($_POST['formSubmit'] == "Submit") {
    $varKeyword = $_POST['formKeyword'];

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, "http://00.000.00.158:5984/_fti/local/elod_empty/_design/sellerAll/In_All_Json?q=".$varKeyword);
    echo($varKeyword);
    curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HTTPHEADER, array(
        'Content-type: application/json; charset=utf-8',
        'Accept: */*'
    ));
    $response = curl_exec($ch);
    curl_close($ch);
    echo($response);
}
?>
<html>
<head>
 <title>search Form</title>
</head>
<body>
<form action="search2.php" method="post">
  <p>
   enter search keyword<br>
   <input type="text" name="formKeyword" maxlength="50" value="<?=$varKeyword;?>" />
   <input type="submit" name="formSubmit" value="Submit">
   <input type="submit" name="formClear" value="clear">
  </p>
</form>
</body>
</html>

The problem is that I cannot search for non-Latin (Greek) characters. The result is:

{"reason":"Bad query syntax: Cannot parse '': Encountered \"\" at line 1, column 0.\nWas expecting one of:\n ...\n \"+\" ...\n \"-\" ...\n ...\n \"(\" ...\n \"*\" ...\n ...\n ...\n ...\n ...\n ...\n \"[\" ...\n \"{\" ...\n ...\n ...\n \"*\" ...\n ","code":400}

whereas: 1) I get results when my keyword is Latin, and 2) I get non-Latin results when I query CouchDB with a non-Latin keyword from the command prompt (Ubuntu 14.04) or from any browser, e.g.

http://00.000.00.158:5984/_fti/local/elod_sellers/_design/sellerAll/In_All_Json?q='non latin'

or

curl http://00.00.00.158:5984/_fti/local/elod_sellers/_design/sellerAll/In_All_Json?q='non latin'

The CouchDB version is 1.5.0 and couchdb-lucene is 1.0.2. The couchdb-lucene log says: "org.eclipse.jetty.util.Utf8Appendable$NotUtf8Exception: Not valid UTF8"

Any suggestion would be really helpful!

dimneg
  • You may have done your curl request/response with UTF-8, but you also have to issue a UTF-8 charset header for your PHP->client response. Since you haven't, the browser is free to use any charset it wants and will just pick its default. – Marc B Dec 08 '14 at 15:35
  • Thank you, Marc, I'll certainly give that a try! – dimneg Dec 08 '14 at 15:41
  • You can set the accept-charset in the form. See http://stackoverflow.com/questions/4902062/php-form-submit-utf8 – Kim Stebel Dec 08 '14 at 15:56
  • Thank you both, beginner's mistakes! – dimneg Dec 08 '14 at 16:08
  • You can both post them as answers so I can close the question. – dimneg Dec 08 '14 at 17:21

1 Answer

The fix was to make the browser submit the form as UTF-8. In the form,

<form action="search2.php" method="post">

is replaced by:

<form action="search2.php" method="post" accept-charset="UTF-8">
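
As a complementary sketch (not part of the original fix), it may also help to check that the submitted keyword is valid UTF-8 and to percent-encode it before appending it to the couchdb-lucene URL, since the script concatenates the raw POST value into the query string. `buildSearchUrl` is a hypothetical helper name, and the base URL passed to it is whatever endpoint your script already uses.

```php
<?php
// Hypothetical helper (an assumption, not in the original script):
// builds the couchdb-lucene query URL from a user-supplied keyword.
function buildSearchUrl(string $baseUrl, string $keyword): string
{
    // preg_match with the /u modifier returns false (not 1) when the
    // subject is not valid UTF-8, so this works as an encoding check
    // without requiring the mbstring extension.
    if (preg_match('//u', $keyword) !== 1) {
        throw new InvalidArgumentException('Keyword is not valid UTF-8');
    }

    // Percent-encode the UTF-8 bytes so that non-Latin characters and
    // spaces survive as part of the query string.
    return $baseUrl . '?q=' . rawurlencode($keyword);
}
```

In the question's script this would stand in for the string concatenation on the `CURLOPT_URL` line, for example `curl_setopt($ch, CURLOPT_URL, buildSearchUrl($base, $_POST['formKeyword']));` where `$base` is the existing `_fti` URL without the `?q=` part. Following Marc B's comment, sending `header('Content-Type: text/html; charset=utf-8');` before any output should also keep the browser from guessing the page's charset.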
dimneg