3

I am writing an app to integrate with Khan Academy, and I was wondering if anyone has figured out how to get the challenges that a learner has done?

For example, I have logged in and completed a couple of the challenges in the programming playlist below.

https://www.khanacademy.org/computing/computer-programming/programming

When I look at the page itself, it shows some of the challenges marked as completed; however, the Chrome developer console doesn't show any XHR API calls that pull that information down.

So has anyone found out which internal API returns the challenges a learner has completed?


Per Ben Kraft's suggestion, I tried: '/api/v1/user/progress_summary?kind=Exercise' and got: {"started":[],"complete":["ex8e7aac0b"]}

Using: '/api/internal/user/kaid_688515334519823186196256/progress?dt_start=2017-08-15T00:00:00.000Z&dt_end=2018-08-25T00:00:00Z'

I got a lot of data, but I don't know which other parameters I can use to narrow it down to the information I want (challenges completed for the Intro to JS course).

Mark Ellul
  • 1
    Not sure why this was voted down. All I can say is that Khan Academy themselves say to look at internal APIs for access to data that's not available on the public API. – Mark Ellul May 20 '16 at 07:03

3 Answers

1

I think /api/v1/user/progress_summary is your best bet. I'm not sure why it's not listed on the API explorer, but here's the internal documentation:

Return progress for a content type with started and completed lists.
Takes a comma-separated `kind` param, like:
    /api/v1/user/progress_summary?kind=Video,Article
and returns a dictionary that looks like:
    {"complete": ["a1314267931"], "started": []}

(You'll also need to pass a user identifier like kaid, similar to other /api/v1/user routes.) Those IDs should match up with what you can get from the topic tree API, if you want more data on the individual content items. As far as I can tell that's exactly the same data we use on the topic pages.

Ben Kraft
  • In the end the kaid is not needed because it's an authenticated call to the API. The endpoint I will be using is "/api/v1/user/progress_summary?kind=Article,Scratchpad,Video,Exercise" – Mark Ellul Jun 06 '16 at 08:14
  • PS: the result doesn't match the topic tree 100%. The first letter of each result must be removed to match the topic tree, i.e. ["a1314267931"] should be ["1314267931"] – Mark Ellul Jun 06 '16 at 09:28
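
The prefix stripping described in the comment above could look roughly like the following. This is only a sketch in Python 3: the function names are made up for illustration, and the check that the first character is a letter is my own assumption rather than anything the API documents.

```python
def strip_kind_prefix(content_id):
    """Drop the leading kind letter, e.g. 'a1314267931' -> '1314267931'.

    Assumption: every ID starts with exactly one letter naming its kind.
    IDs that do not start with a letter are returned unchanged.
    """
    if content_id and content_id[0].isalpha():
        return content_id[1:]
    return content_id


def normalize_summary(summary):
    """Return a copy of a progress_summary response with prefixes removed."""
    return {
        status: [strip_kind_prefix(cid) for cid in summary.get(status, [])]
        for status in ("started", "complete")
    }


# Sample response taken from the documentation quoted in the answer.
sample = {"complete": ["a1314267931"], "started": []}
print(normalize_summary(sample))
```

The normalized IDs can then be compared against the IDs from the topic tree API.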
1

Here is a heavily modified version of one of the Khan API examples that does exactly what you are looking for (I needed the same information).

import cgi
import rauth
import SimpleHTTPServer
import SocketServer
import time
import webbrowser
import requests

students = ['student1@email.com','student2@email.com']
courses = ['programming','html-css','html-css-js','programming-games-visualizations']

# You can get a CONSUMER_KEY and CONSUMER_SECRET for your app here:
# http://www.khanacademy.org/api-apps/register
CONSUMER_KEY = 'abcdefghijklmnop'
CONSUMER_SECRET = 'qrstuvwxyz123456'

CALLBACK_BASE = '127.0.0.1'
SERVER_URL = 'http://www.khanacademy.org'
VERIFIER = None


# Create the callback server that's used to set the oauth verifier after the
# request token is authorized.
def create_callback_server():
    class CallbackHandler(SimpleHTTPServer.SimpleHTTPRequestHandler):
        def do_GET(self):
            global VERIFIER

            params = cgi.parse_qs(self.path.split('?', 1)[1],
                keep_blank_values=False)
            VERIFIER = params['oauth_verifier'][0]

            self.send_response(200)
            self.send_header('Content-Type', 'text/plain')
            self.end_headers()
            self.wfile.write('OAuth request token fetched and authorized;' +
                ' you can close this window.')

        def log_request(self, code='-', size='-'):
            pass

    server = SocketServer.TCPServer((CALLBACK_BASE, 0), CallbackHandler)
    return server


# Make an authenticated API call using the given rauth session.
def get_api_resource(session):
    start = time.time()
    allProgress = []

    for student in students:
        print "Getting key for",student
        url = SERVER_URL + '/api/v1/user?email=' + student
        split_url = url.split('?', 1)
        params = {}

        # Separate out the URL's parameters, if applicable.
        if len(split_url) == 2:
            url = split_url[0]
            params = cgi.parse_qs(split_url[1], keep_blank_values=False)

        response = session.get(url, params=params)
        studentKhanData = response.json()

        try:
            if student != studentKhanData['student_summary']['email']:
                print "Mismatch. Khan probably returned my data instead."
                print "This student probably needs to add me as a coach."
                print "Skipping",student
                continue
            key = studentKhanData['student_summary']['key']
        except (TypeError, KeyError) as e:
            print "Error:",e
            print "Does this student have a Khan account?"
            print "Skipping",student
            continue

        individualProgress = []
        for course in courses:
            print "Getting",course,"progress for",student
            ts = int(time.time()*1000)
            url = SERVER_URL + '/api/internal/user/topic-progress/' + course + '?casing=camel&userKey=' + key + '&lang=en&_=' + str(ts)
            print url
            split_url = url.split('?', 1)
            params = {}

            # Separate out the URL's parameters, if applicable.
            if len(split_url) == 2:
                url = split_url[0]
                params = cgi.parse_qs(split_url[1], keep_blank_values=False)

            response = session.get(url, params=params)
            progressData = response.json()
            progressArray = progressData['topicProgress']

            challengeCount = 0
            for activity in progressArray:
                if activity['status'] == 'complete' and activity['type'] == 'challenge':
                    challengeCount += 1

            individualProgress.append(challengeCount)

        allProgress.append([student,individualProgress])

    for x in allProgress:
        print x

    print "\n"
    end = time.time()
    print "\nTime: %ss\n" % (end - start)

def run_tests():
    # Create an OAuth1Service using rauth.
    service = rauth.OAuth1Service(
           name='autoGrade',
           consumer_key=CONSUMER_KEY,
           consumer_secret=CONSUMER_SECRET,
           request_token_url=SERVER_URL + '/api/auth2/request_token',
           access_token_url=SERVER_URL + '/api/auth2/access_token',
           authorize_url=SERVER_URL + '/api/auth2/authorize',
           base_url=SERVER_URL + '/api/auth2')

    callback_server = create_callback_server()

    # 1. Get a request token.
    request_token, secret_request_token = service.get_request_token(
        params={'oauth_callback': 'http://%s:%d/' %
            (CALLBACK_BASE, callback_server.server_address[1])})

    # 2. Authorize your request token.
    print "Get authorize URL"
    authorize_url = service.get_authorize_url(request_token)
    print authorize_url
    webbrowser.open(authorize_url)
    # It is possible to automate this step with Selenium, but that appears to
    # be against Khan Academy's Terms of Service.

    callback_server.handle_request()
    callback_server.server_close()

    # 3. Get an access token.
    session = service.get_auth_session(request_token, secret_request_token,
        params={'oauth_verifier': VERIFIER})

    # Fetch and print challenge progress for every student in `students`.
    print
    get_api_resource(session)


def main():
    run_tests()

if __name__ == "__main__":
    main()
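
The counting step in the script above can also be pulled out into a small function that is easy to test without any network access. Here is a rough Python 3 sketch; the field names `topicProgress`, `status` and `type` come from the script, while the sample data (including the `talkthrough` type) is invented purely for illustration.

```python
def count_completed_challenges(progress_data):
    """Count entries that are challenges and marked complete.

    `progress_data` is assumed to be the decoded JSON returned by the
    /api/internal/user/topic-progress/<course> call used in the script.
    """
    return sum(
        1
        for activity in progress_data.get("topicProgress", [])
        if activity.get("status") == "complete"
        and activity.get("type") == "challenge"
    )


# Made-up response: one completed challenge, one started challenge,
# and one completed item of a different (hypothetical) type.
sample = {
    "topicProgress": [
        {"type": "challenge", "status": "complete"},
        {"type": "challenge", "status": "started"},
        {"type": "talkthrough", "status": "complete"},
    ]
}
print(count_completed_challenges(sample))
```

Using `.get()` means a response without `topicProgress` counts as zero instead of raising, which may or may not be what you want for students who have not started a course.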
mbbackus
0

After some investigation I found the internal API. The path is below. The user's KAID can be found via the public /api/v1/users call. The dt_start and dt_end parameters bound the time range for which you want the progress.

/api/internal/user/[USER KAID]/progress?dt_start=2016-05-13T22:00:00.000Z&dt_end=2016-05-21T00:00:00Z&tz_offset=120&lang=en&_=1463730370107
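
If you need to build that URL for arbitrary date ranges, something like this Python 3 sketch should work. The function name and defaults are my own, I am assuming the server accepts timestamps without milliseconds (as in the dt_end value above), and I have left out the trailing `_` parameter, which appears to be just a cache-busting timestamp.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

BASE_URL = "https://www.khanacademy.org"


def progress_url(kaid, start, end, tz_offset=0, lang="en"):
    """Build the internal progress URL for a user and a time range.

    `start` and `end` must be timezone-aware datetimes; they are
    converted to UTC and formatted like 2016-05-21T00:00:00Z.
    """
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    params = {
        "dt_start": start.astimezone(timezone.utc).strftime(fmt),
        "dt_end": end.astimezone(timezone.utc).strftime(fmt),
        "tz_offset": tz_offset,  # presumably minutes, as in the example URL
        "lang": lang,
    }
    return f"{BASE_URL}/api/internal/user/{kaid}/progress?{urlencode(params)}"


# "kaid_123" is a placeholder, not a real user ID.
url = progress_url(
    "kaid_123",
    datetime(2016, 5, 13, 22, 0, tzinfo=timezone.utc),
    datetime(2016, 5, 21, tzinfo=timezone.utc),
    tz_offset=120,
)
print(url)
```

Note that `urlencode` percent-encodes the colons in the timestamps, which is equivalent as far as the query string is concerned.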

I hope this helps anyone else in the future.

Mark Ellul
  • What are you trying to do with the data? I'm not sure if /progress is exactly the same as what we show on the topic page, but if what you actually want is what the user has done recently, it's probably better! If you want everything they've completed, there may be another call that's better. – Ben Kraft Jun 01 '16 at 22:11
  • We are using the public API for the user's exercises and videos. We want to find out which challenges the user has done and which articles they have read. Unfortunately, that internal progress call only covers the challenges. If you know of an API method that gets everything they have completed, would you mind putting it in an answer? – Mark Ellul Jun 02 '16 at 09:25