
I'm developing a module for my web application that pushes a content feed to our Google Search Appliance (GSA), using PHP cURL to POST the data to the appliance on port 19900. Based on everything I've read in the documentation for creating and submitting a feed to the GSA, this should work without issue, yet the server returns the following (incredibly vague and largely useless) error:

400. That's an error.

Your client has issued a malformed or illegal request. That's all we know.

I've been troubleshooting this with the architect who helped instrument the GSA on our sister site, and we have been unable to determine what is causing our issue. According to our IT department, all of the ports are opened for this communication to occur (and we wouldn't be getting the error message if they were closed), and we have verified that the sending server's IP address is listed as 'allowed' in the GSA. Needless to say, we're stumped.

Here is the code that transmits the XML feed:

<?php
$target_url = 'http://gsadomain.com:19900/xmlfeed';

$header = array('Content-Type: multipart/form-data');

$fields = array(
    'feedtype'=>'incremental',
    'datasource'=>'datasourcename',
    'data'=>'@'.realpath('gsa_feed.xml')
);

$ch = curl_init();

curl_setopt($ch, CURLOPT_USERPWD, "gsaadmin:gsaadminpassword");
curl_setopt($ch, CURLOPT_HTTPHEADER,$header);
curl_setopt($ch, CURLOPT_TIMEOUT,120);
curl_setopt($ch, CURLOPT_URL,$target_url);
curl_setopt($ch, CURLOPT_POST,1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($fields));

$return = curl_exec($ch);

if (curl_errno($ch)) {
    $msg = curl_error($ch);
}

curl_close ($ch);

echo $return;
?>

And here is the XML that we're trying to submit:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE gsafeed PUBLIC "-//Google//DTD GSA Feeds//EN" "http://this.is.the.ip/gsafeed.dtd">
<gsafeed>
    <header>
        <datasource>datasource</datasource>
        <feedtype>incremental</feedtype>
    </header>
    <group>
        <record url="http://website.com/mod/view.php?id=15903" mimetype="text/html" last-modified="Thu, 14 Aug 2014 18:53:00 GMT">
            <acl inheritance-type="and-both-permit">
                <principal scope="group" access="permit" namespace="Default" case-sensitivity-type="everything-case-insensitive">customers</principal>
                <principal scope="group" access="permit" namespace="Default" case-sensitivity-type="everything-case-insensitive">partners</principal>
                <principal scope="group" access="permit" namespace="Default" case-sensitivity-type="everything-case-insensitive">employees</principal>
            </acl>
            <metadata>
                <meta name="type" content="module" />
                <meta name="id" content="1" />
                <meta name="name" content="Module for Everyone" />
                <meta name="course_id" content="655" />
            </metadata>
            <content>
                This is the description of the Module for Everyone.
            </content>
        </record>
        <record url="http://website.com/mod/view.php?id=15904" mimetype="text/html" last-modified="Thu, 14 Aug 2014 18:53:00 GMT">
            <acl inheritance-type="and-both-permit">
                <principal scope="group" access="permit" namespace="Default" case-sensitivity-type="everything-case-insensitive">employees</principal>
                <principal scope="group" access="permit" namespace="Default" case-sensitivity-type="everything-case-insensitive">partners</principal>
            </acl>
            <metadata>
                <meta name="type" content="module" />
                <meta name="id" content="2" />
                <meta name="name" content="Module for Partners" />
                <meta name="course_id" content="655" />
            </metadata>
            <content>
                This is the description of the Module for Partners.
            </content>
        </record>
        <record url="http://website.com/mod/view.php?id=15905" mimetype="text/html" last-modified="Thu, 14 Aug 2014 18:53:00 GMT">
            <acl inheritance-type="and-both-permit">
                <principal scope="group" access="permit" namespace="Default" case-sensitivity-type="everything-case-insensitive">employees</principal>
            </acl>
            <metadata>
                <meta name="type" content="module" />
                <meta name="id" content="3" />
                <meta name="name" content="Module for Employees" />
                <meta name="course_id" content="655" />
            </metadata>
            <content>
                This is the description of the Module for Employees.
            </content>
        </record>
        <record url="http://website.com/course/view.php?id=655#section-1" mimetype="text/html" last-modified="Thu, 14 Aug 2014 18:53:00 GMT">
            <acl inheritance-type="and-both-permit">
                <principal scope="group" access="permit" namespace="Default" case-sensitivity-type="everything-case-insensitive">customers</principal>
                <principal scope="group" access="permit" namespace="Default" case-sensitivity-type="everything-case-insensitive">partners</principal>
                <principal scope="group" access="permit" namespace="Default" case-sensitivity-type="everything-case-insensitive">employees</principal>
            </acl>
            <metadata>
                <meta name="type" content="topic" />
                <meta name="id" content="1" />
                <meta name="name" content="Course Topic for Everyone" />
                <meta name="course_id" content="655" />
            </metadata>
            <content>
                This is the description of the Course Topic for All Audiences.
            </content>
        </record>
        <record url="http://website.com/course/view.php?id=655#section-2" mimetype="text/html" last-modified="Thu, 14 Aug 2014 18:53:00 GMT">
            <acl inheritance-type="and-both-permit">
                <principal scope="group" access="permit" namespace="Default" case-sensitivity-type="everything-case-insensitive">employees</principal>
                <principal scope="group" access="permit" namespace="Default" case-sensitivity-type="everything-case-insensitive">partners</principal>
            </acl>
            <metadata>
                <meta name="type" content="topic" />
                <meta name="id" content="2" />
                <meta name="name" content="Course Topic for Partners" />
                <meta name="course_id" content="655" />
            </metadata>
            <content>
                This is the description of the Course Topic for Partners.
            </content>
        </record>
        <record url="http://website.com/course/view.php?id=655#section-3" mimetype="text/html" last-modified="Thu, 14 Aug 2014 18:53:00 GMT">
            <acl inheritance-type="and-both-permit">
                <principal scope="group" access="permit" namespace="Default" case-sensitivity-type="everything-case-insensitive">employees</principal>
            </acl>
            <metadata>
                <meta name="type" content="topic" />
                <meta name="id" content="3" />
                <meta name="name" content="Course Topic for Employees" />
                <meta name="course_id" content="655" />
            </metadata>
            <content>
                This is the description of the Course Topic for Employees.
            </content>
        </record>
        <record url="http://website.com/course/view.php?id=655" mimetype="text/html" last-modified="Thu, 14 Aug 2014 18:53:00 GMT">
            <acl inheritance-type="and-both-permit">
                <principal scope="group" access="permit" namespace="Default" case-sensitivity-type="everything-case-insensitive">customers</principal>
                <principal scope="group" access="permit" namespace="Default" case-sensitivity-type="everything-case-insensitive">partners</principal>
                <principal scope="group" access="permit" namespace="Default" case-sensitivity-type="everything-case-insensitive">employees</principal>
            </acl>
            <metadata>
                <meta name="type" content="course" />
                <meta name="id" content="655" />
                <meta name="name" content="Course for Everyone" />
                <meta name="course_id" content="655" />
            </metadata>
            <content>
                This is the description of the Course for Everyone.
            </content>
        </record>
        <record url="http://website.com/course/view.php?id=656" mimetype="text/html" last-modified="Thu, 14 Aug 2014 18:53:00 GMT">
            <acl inheritance-type="and-both-permit">
                <principal scope="group" access="permit" namespace="Default" case-sensitivity-type="everything-case-insensitive">employees</principal>
                <principal scope="group" access="permit" namespace="Default" case-sensitivity-type="everything-case-insensitive">partners</principal>
            </acl>
            <metadata>
                <meta name="type" content="course" />
                <meta name="id" content="656" />
                <meta name="name" content="Course for Partners" />
                <meta name="course_id" content="656" />
            </metadata>
            <content>
                This is the description of the Course for Partners.
            </content>
        </record>
        <record url="http://website.com/course/view.php?id=657" mimetype="text/html" last-modified="Thu, 14 Aug 2014 18:53:00 GMT">
            <acl inheritance-type="and-both-permit">
                <principal scope="group" access="permit" namespace="Default" case-sensitivity-type="everything-case-insensitive">employees</principal>
            </acl>
            <metadata>
                <meta name="type" content="course" />
                <meta name="id" content="657" />
                <meta name="name" content="Course for Employees" />
                <meta name="course_id" content="657" />
            </metadata>
            <content>
                This is the description of the Course for Employees.
            </content>
        </record>
    </group>
</gsafeed>

Based on everything we're seeing, this should work, and yet we are running into a brick wall. Does anyone have any ideas?

As an added note, due to how the page we're trying to index is set up, having the appliance crawl the page will not work (there are too many interactive elements, and everything I've read suggests that the GSA can't index those properly).

EDIT 1: As Mike suggested in the comments, here is the link to the GSA Feed Developer's Guide: http://www.google.com/support/enterprise/static/gsa/docs/admin/72/gsa_doc_set/feedsguide/feedsguide.html

EDIT 2: Success! See my answer below. The key is to let cURL handle encoding of the $fields array, and to pass the file contents, instead of just the file path.

Josh C

2 Answers


Long story short, I got the feed submitting properly, and it all came down to how cURL was handling the data. Neither I nor the engineer I was working with had much experience with PHP's cURL extension, or with how to get the GSA to accept the input fields. Thanks in part to this question by Ken, the answer by ThiefMaster, the next answer from Czechnology, and Mike's help on this question, I came up with the following code:

<?php
// GSA feed endpoint on port 19900.
$target_url = 'http://gsadomain.com:19900/xmlfeed';

$header = array('Content-Type: multipart/form-data');

// 'data' carries the XML itself, not a path to the file.
$fields = array(
    'feedtype'   => 'incremental',
    'datasource' => 'datasourcename',
    'data'       => file_get_contents(realpath('gsa_feed.xml'))
);

$ch = curl_init();

curl_setopt($ch, CURLOPT_USERPWD, "gsaadmin:gsaadminpassword");
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
curl_setopt($ch, CURLOPT_TIMEOUT, 120);
curl_setopt($ch, CURLOPT_URL, $target_url);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
// Pass the array itself (no http_build_query) so cURL encodes the fields.
curl_setopt($ch, CURLOPT_POSTFIELDS, $fields);

$return = curl_exec($ch);

// Capture the transport-level error message, if there was one.
if (curl_errno($ch)) {
    $msg = curl_error($ch);
}

curl_close($ch);

echo $return;
?>

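The difficulty was the http_build_query() call, which was trying to do cURL's work for it. It flattens $fields into a single URL-encoded string, so cURL sent that string as-is while the header still claimed multipart/form-data, and the POST data never got proper multipart boundaries. Passing the $fields array directly to CURLOPT_POSTFIELDS lets cURL build the multipart body and its boundaries itself.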
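To see why the http_build_query() version failed, it can help to print what that function actually produces. The snippet below is only a sketch that runs offline: the tiny XML string and the /path/to/gsa_feed.xml path are made-up placeholders, not the real feed.

```php
<?php
// Placeholder field values (assumptions for illustration only).
$fields = array(
    'feedtype'   => 'incremental',
    'datasource' => 'datasourcename',
    'data'       => '<gsafeed><header/></gsafeed>',
);

// http_build_query() collapses the array into ONE URL-encoded string.
// A string given to CURLOPT_POSTFIELDS is sent verbatim as the request
// body, so no multipart parts or boundaries are ever generated.
$encoded = http_build_query($fields);
echo $encoded, "\n";

// The '@' file prefix from the question is not special here either:
// it is simply percent-encoded along with the rest of the path.
$with_path = http_build_query(array('data' => '@/path/to/gsa_feed.xml'));
echo $with_path, "\n"; // data=%40%2Fpath%2Fto%2Fgsa_feed.xml
```

So in the original script the appliance received the literal text of the path, URL-encoded, rather than any XML.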

We also ran into some difficulty later with some of the fields in the XML, but that was mostly because we forgot to escape ampersands, single quotes, and double quotes (we used str_replace for this). Once those were taken care of, the XML parsed properly, and we got everything running.
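As an alternative to chaining str_replace calls, PHP's built-in htmlspecialchars() can do the escaping in one pass. This is a minimal sketch, not the code we actually ran: gsa_xml_escape() is a hypothetical helper name, and the sample module name is invented.

```php
<?php
// Hypothetical helper (not part of the original script): escapes the
// characters that are special in XML so a value can be placed safely
// inside an attribute or element.
function gsa_xml_escape($value)
{
    // ENT_QUOTES escapes both quote styles; ENT_XML1 selects XML entities.
    return htmlspecialchars($value, ENT_QUOTES | ENT_XML1, 'UTF-8');
}

// Invented sample value containing every character that caused trouble.
$name = 'Tom & Jerry\'s "Intro" <Module>';

echo '<meta name="name" content="' . gsa_xml_escape($name) . '" />', "\n";
```

Escaping each value as it is written into the feed avoids having to post-process the whole XML string afterwards.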

Josh C

I am not familiar with the GSA API, but it doesn't seem that you are actually sending any XML data. The string value you would be sending for the data parameter would just be something like @/path/to/gsa_feed.xml. I would imagine you actually need to POST the XML itself, right?

Perhaps something more like:

$fields = array(
    'feedtype'=>'incremental',
    'datasource'=>'datasourcename',
    'data'=> file_get_contents(realpath('gsa_feed.xml'))
);
Mike Brant
  • I just gave this a shot, and didn't have any luck. The PHP script in question builds the XML file, so initially I was just including the string I was writing to file, as in '$fields = array( 'fieldtype'=>'incremental','datasource'=>'datasourcename','data'=>$string);'. When that didn't work, I switched to the @path/to/gsa_feed.xml method (though what you're suggesting is probably what I want, given what I'm seeing in the documentation). – Josh C Sep 24 '14 at 20:06
  • @JoshC You might link the API documentation in your question. The PHP you have shown doesn't build an XML file anywhere. What you show is just a simple cURL POST with a few fields of data, one of which is the string in question. There is nothing here to suggest any functionality beyond that. – Mike Brant Sep 24 '14 at 21:33
  • @Mike Brant I just edited the original post to contain the link to the developer's guide from Google. Unfortunately, it doesn't give much guidance on setting up your own feed script. Here's the link again: http://www.google.com/support/enterprise/static/gsa/docs/admin/72/gsa_doc_set/feedsguide/feedsguide.html – Josh C Sep 25 '14 at 12:59