
I've registered at http://www.developers.elsevier.com/action/devprojects. I created a project and got my scopus key:


Now, using this generated key, I would like to find an author by firstname, lastname and subjectarea. I make requests from my university network, which has access to Scopus (I have full manual access to Scopus search and use it from Firefox with no problem). However, I want to automate my Scopus mining with a simple script that finds an author's publications given his/her firstname, lastname and subjectarea.

Here's my code:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import requests
import json
from scopus import SCOPUS_API_KEY


scopus_author_search_url = 'http://api.elsevier.com/content/search/author?'
headers = {'Accept':'application/json', 'X-ELS-APIKey': SCOPUS_API_KEY}
search_query = 'query=AUTHFIRST(%s) AND AUTHLASTNAME(%s) AND SUBJAREA(%s)' % ('John', 'Kitchin', 'COMP')

# api_resource = "http://api.elsevier.com/content/search/author?apiKey=%s&" % (SCOPUS_API_KEY)

# request with first searching page
page_request = requests.get(scopus_author_search_url + search_query, headers=headers)
print page_request.url

# response to json
page = json.loads(page_request.content.decode("utf-8"))
print page

Where SCOPUS_API_KEY looks just like this: SCOPUS_API_KEY="xxxxxxxx".

Although I have full access to Scopus from my university network, I get this response:

{u'service-error': {u'status': {u'statusText': u'Requestor configuration settings insufficient for access to this resource.', u'statusCode': u'AUTHENTICATION_ERROR'}}}

The generated link looks like this: http://api.elsevier.com/content/search/author?query=AUTHFIRST(John)%20AND%20AUTHLASTNAME(Kitchin)%20AND%20SUBJAREA(COMP) and when I click it, it shows an XML file:

<service-error><status>
  <statusCode>AUTHORIZATION_ERROR</statusCode>
  <statusText>No APIKey provided for request</statusText>
</status></service-error>

When I instead change scopus_author_search_url to "http://api.elsevier.com/content/search/author?apiKey=%s&" % (SCOPUS_API_KEY), I get:

{u'service-error': {u'status': {u'statusText': u'Requestor configuration settings insufficient for access to this resource.', u'statusCode': u'AUTHENTICATION_ERROR'}}}

and the link shows this XML file:

<service-error>
<status>
<statusCode>AUTHENTICATION_ERROR</statusCode>
<statusText>Requestor configuration settings insufficient for access to this resource.</statusText>
</status>
</service-error>

What can be the cause of this problem and how can I fix it?

Brian Brown
  • I think you are missing the authentication; maybe your browser is sending it without you seeing it. Do you by any chance have credentials you could be using? – Carlos H Romano Aug 16 '15 at 23:38
  • Were you asked to register a website when you signed up, and are your URI requests coming from that website? – rask004 Aug 17 '15 at 01:48
  • Just because you're allowed to use the web interface from your university network doesn't mean you're allowed to use the API without additional credentials. Maybe that API key isn't giving you the appropriate authorizations. – Cyphase Aug 17 '15 at 09:28

2 Answers


I have just registered for an API key and tested it first with this URL:

http://api.elsevier.com/content/search/author?apikey=4xxxxxxxxxxxxxxxxxxxxxxxxxxxxx43&query=AUTHFIRST%28John%29+AND+AUTHLASTNAME%28Kitchin%29+AND+SUBJAREA%28COMP%29

This works fine from my university network. I also tested a second API key, so I have verified one key registered with a website on my university domain and one registered with the website http://apitest.example.com. This rules out the domain name used at registration as the source of your problem.

I tested this

  1. in the browser,
  2. using your Python code with the API key in the headers. The only change I made to your code is removing

    from scopus import SCOPUS_API_KEY
    

    and adding

    SCOPUS_API_KEY ='4xxxxxxxxxxxxxxxxxxxxxxxxxxxxx43'
    
  3. using your python code adapted to put the apikey in the URL instead of the headers.

In all cases, the query returns two authors, one at Carnegie Mellon and one at Palo Alto.
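The two ways of passing the key tested above (header versus URL parameter) can be sketched as follows. This is a minimal sketch, not the code I ran: it uses only the standard library instead of requests, the function names are made up for illustration, and it only builds the requests without sending them.

```python
from urllib.parse import urlencode
from urllib.request import Request

AUTHOR_SEARCH_URL = "http://api.elsevier.com/content/search/author"


def build_query(first, last, subject_area):
    """Return the author-search query string used in the question."""
    return "AUTHFIRST(%s) AND AUTHLASTNAME(%s) AND SUBJAREA(%s)" % (
        first, last, subject_area)


def request_with_header_key(api_key, query):
    """Key travels in the X-ELS-APIKey header (hypothetical helper)."""
    url = AUTHOR_SEARCH_URL + "?" + urlencode({"query": query})
    headers = {"Accept": "application/json", "X-ELS-APIKey": api_key}
    return Request(url, headers=headers)


def request_with_url_key(api_key, query):
    """Key travels in the URL as the apikey parameter (hypothetical helper)."""
    params = urlencode({"apikey": api_key, "query": query})
    return Request(AUTHOR_SEARCH_URL + "?" + params,
                   headers={"Accept": "application/json"})


if __name__ == "__main__":
    q = build_query("John", "Kitchin", "COMP")
    for req in (request_with_header_key("YOUR_KEY", q),
                request_with_url_key("YOUR_KEY", q)):
        # Nothing is sent here; urllib.request.urlopen(req) would send it.
        print(req.full_url)
```

urlencode percent-encodes the parentheses and turns spaces into +, which gives the same form as the URL at the top of this answer.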

I can't replicate your error message. If I try to use the API key from an IP address that is not registered with Elsevier (e.g. my home computer), I see a different error:

<service-error>
  <status>
    <statusCode>AUTHENTICATION_ERROR</statusCode>
    <statusText>Client IP Address: xxx.yyy.aaa.bbb does not resolve to an account</statusText>
   </status>
</service-error>

If I use a random (wrong) API key from the university network, I see

<service-error>
    <status>
        <statusCode>AUTHORIZATION_ERROR</statusCode>
        <statusText>APIKey <mad3upa1phanum3r1ck3y> with IP address <my.uni.IP.add> is unrecognized or has insufficient privileges for access to this resource</statusText>
    </status>
</service-error>
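Since the statusText is what distinguishes these cases, it may help to pull it out of the response programmatically. The following is a rough sketch that handles the JSON and XML error shapes quoted in this thread; parse_service_error is a hypothetical helper name, and it assumes the error XML carries no namespaces, as in the examples shown here.

```python
import json
import xml.etree.ElementTree as ET


def parse_service_error(body):
    """Return (statusCode, statusText) from a service-error body, else None.

    Accepts the JSON form or the XML form quoted in this thread.
    """
    text = body.strip()
    if text.startswith("{"):
        # JSON form: {"service-error": {"status": {...}}}
        status = json.loads(text).get("service-error", {}).get("status")
        if status is None:
            return None
        return status.get("statusCode"), status.get("statusText")
    # XML form: <service-error><status>...</status></service-error>
    root = ET.fromstring(text)
    if root.tag != "service-error":
        return None
    return (root.findtext("status/statusCode"),
            root.findtext("status/statusText"))


if __name__ == "__main__":
    sample = ("<service-error><status>"
              "<statusCode>AUTHORIZATION_ERROR</statusCode>"
              "<statusText>No APIKey provided for request</statusText>"
              "</status></service-error>")
    print(parse_service_error(sample))
```

Logging the returned pair for every failed request makes it easy to tell a missing key, an unregistered IP address and an unrecognized key apart.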

Debug steps

As I can't replicate your problem, here are some diagnostic steps you can use to narrow it down:

  1. Use your browser at uni to actually submit the api query with your key in the URL (i.e. copy the URL above, paste it into the address bar, substitute your key and see whether you get the XML back)

  2. If 1 returns the XML you expect, move onto submitting the request via Python - first, copy the exact URL straight into Python (no variable substitution via %s, no apikey in the header) and simply do a .get() on it.

  3. If 2 returns correctly, ensure that your SCOPUS_API_KEY holds the exact key value, no more, no less, i.e. print SCOPUS_API_KEY should return your apikey: 4xxxxxxxxxxxxxxxxxxxxxxxxxxxxx43

  4. If 1 returns the error, it looks like your uni (for whatever reason) has not got access to the authors query API. This doesn't make much sense given that you can perform manual search, but that is all I can conclude
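The check in step 3 can be made a little more systematic. Here is a minimal sketch of a helper (describe_key_problems is a made-up name) that flags the usual ways a key string picks up extra characters. It assumes keys are purely alphanumeric, which matches the keys I was issued but is not something I have seen documented.

```python
def describe_key_problems(key):
    """Return reasons the stored string may differ from the issued key."""
    stripped = key.strip()
    if not stripped:
        return ["key is empty"]
    problems = []
    if stripped != key:
        # e.g. a trailing newline left over from reading the key from a file
        problems.append("leading or trailing whitespace")
    if stripped.startswith(('"', "'")) or stripped.endswith(('"', "'")):
        problems.append("quote characters are part of the value")
    elif not stripped.isalnum():
        # Assumption: issued keys contain only letters and digits.
        problems.append("non-alphanumeric characters")
    return problems


if __name__ == "__main__":
    key = "4xxxxxxxxxxxxxxxxxxxxxxxxxxxxx43\n"
    # repr() makes invisible characters such as \n visible.
    print(repr(key), describe_key_problems(key))
```

Printing repr(key) rather than the key itself is the important part, because a stray newline or space is otherwise invisible.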

Docs

For reference the authentication algorithm documentation is here, but it is not very simple to follow. You are following authentication option 1 and your method should just work.

N.B. The API is limited to 5000 author retrievals per week. If you have run a lot of queries in a loop, even if they have failed, it is possible that you have exceeded that...
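If you do run queries in a loop, a local counter can stop the script before it burns through the weekly allowance. The following is only a sketch under assumptions: WeeklyBudget is a hypothetical class, it treats the limit as a rolling seven-day window, and it counts every request whether it succeeds or not. How Elsevier actually counts and resets the quota is not something I have verified.

```python
import time
from collections import deque


class WeeklyBudget:
    """Client-side request counter over a rolling window (assumed 7 days)."""

    def __init__(self, limit=5000, window_seconds=7 * 24 * 3600,
                 clock=time.time):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self.stamps = deque()  # timestamps of counted requests, oldest first

    def _expire(self):
        # Drop timestamps that have fallen out of the window.
        cutoff = self.clock() - self.window
        while self.stamps and self.stamps[0] <= cutoff:
            self.stamps.popleft()

    def remaining(self):
        self._expire()
        return self.limit - len(self.stamps)

    def record(self):
        """Count one request; return False if the budget is already spent."""
        self._expire()
        if len(self.stamps) >= self.limit:
            return False
        self.stamps.append(self.clock())
        return True


if __name__ == "__main__":
    budget = WeeklyBudget()
    if budget.record():
        print("ok to send, %d left" % budget.remaining())
```

Call record() before each request and skip the request when it returns False. The counter lives in memory only, so it would need to be persisted to survive between script runs.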

J Richard Snape
  • Thank you. I think I solved the problem. I registered an example page (not an application) and generated the API key. Now it seems to work: I can download an author's profile and publication pages from my university network. I don't know what exactly the problem was; I assume it was something with the application API / webpage API (it didn't work with the app API key but worked with the webpage API key). However, I found some very useful additional information in your answer that might help me in future with my project. Again, thank you so much for the help, cheers! :-) – Brian Brown Aug 19 '15 at 19:04
1

For future reference: the OP was using the package scopus, which has long since been renamed to pybliometrics.

Nowadays you can do

from pybliometrics.scopus import AuthorSearch

q = "AUTHFIRST(John) AND AUTHLASTNAME(Kitchin) AND SUBJAREA(COMP)"
s = AuthorSearch(q)  # handles access, retrieval, parsing and even caches results
print(s)
results = s.authors  # Holds all the information as a list of namedtuples
print(results)  # You can put this into a pandas DataFrame as well
MERose