How can I write a Perl script that connects to the various Stanford NLP applications?
I have invoked both the Stanford part-of-speech and named-entity applications as servers, and when I send them requests from the command line, I get the sorts of responses I expect. Here is an example command-line invocation:
cat file.txt | nc localhost 8081
I now want to write both a Perl-based command-line script and a Perl-based CGI script to do the same work, but I am having problems getting back the full response. Here are the most salient lines in my script(s):
# initialize
my $text = '';
my $response = '';
# get the text to process and normalize it for xml
$text = &slurp( $file );
$text =~ s/\&/\&amp;/g;
$text =~ s/</\&lt;/g;
$text =~ s/>/\&gt;/g;
$text =~ s/\W+/ /g;
# open a connection, send the data, and get the response
my $socket = IO::Socket::INET->new( PeerHost => HOST, PeerPort => PORT, Proto => PROTOCOL );
if ( ! $socket ) { die "Cannot connect to the server $!\n" }
$socket->send( "$text\n" );
$socket->recv( $response, 10240000 );
$socket->close();
This works fine for smaller files, but often fails for larger ones, no matter how much I increase the buffer size (10240000). Moreover, the amount of data returned by the server (or, more specifically, received by the client) is never the same: the response is sometimes larger and sometimes smaller from one run to the next.
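My current understanding, which may be incomplete, is that a single recv call on a TCP socket returns whatever bytes have arrived so far, up to the requested length, rather than waiting for the whole response. If that is the cause, I would need to read in a loop. Below is a minimal sketch of what I think that would look like. The read_all helper is a name I made up for illustration, and the sketch assumes the server closes the connection once it has finished responding, which I have not verified for the Stanford servers.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Hypothetical helper: keep reading until the peer closes the
# connection (sysread returns 0 at end-of-file).
sub read_all {
    my ($socket) = @_;
    my $response = '';
    while (1) {
        # sysread returns the number of bytes read, 0 at EOF, undef on error
        my $bytes = sysread( $socket, my $chunk, 65536 );
        die "Read failed: $!\n" unless defined $bytes;
        last if $bytes == 0;
        $response .= $chunk;    # append this partial read
    }
    return $response;
}

# Intended use in place of the single recv call above
# (HOST, PORT, and PROTOCOL are the constants from my script):
#
#   my $socket = IO::Socket::INET->new( PeerHost => HOST, PeerPort => PORT, Proto => PROTOCOL )
#       or die "Cannot connect to the server $!\n";
#   $socket->send( "$text\n" );
#   my $response = read_all( $socket );
#   $socket->close();
```

If the server instead keeps the connection open after replying, this loop would block forever, and I would need some other way to detect the end of the response.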
When does recv know to stop... receiving?
What am I doing wrong?