We have an intranet, built in .Net (C#). We would like our CMS to be able to extract the HTML content from a Google Doc to integrate with other content.

Specifically, we want an editor to be able to create and maintain a Google Doc, and embed a reference to this doc (via its ID) into a page on our intranet. When rendering the page, the CMS will contact Google Docs, get the HTML content of the document, and render it as part of the page (yes, there will be caching involved).

I have gone 'round and 'round the GData API. It's harder than I thought.

Authentication is via OAuth, so what we'll do is create another Google Apps user for our CMS, so I can get authenticated.

But once I retrieve a document, there is no HTML in it. There are a variety of properties (including one infuriatingly called "Content." which isn't), but nothing I can see has the actual HTML content of the document. It seems to have all sorts of information about the document, except for the content itself.

Hours of Googling and research tell me that I will likely have to form the export URL, then download it as an HTML file via HTTP. I can do this in a browser -- just paste in the correct URL, and there it is.

But I can't do that from code. I can make an HTTP request, certainly, but it doesn't carry the authentication that I've already been through to get the document object itself.

So, two questions:

  1. Is there an easier way to do this? I have a nagging suspicion that I'm going about this all wrong.
  2. How can I make an HTTP request to a Google Docs URL in the context of an authenticated user?
Deane

1 Answer

A couple of pointers ...

  1. As an alternative to GData, you'll probably find it easier to use the newer Drive API and SDK. See https://developers.google.com/drive/v2/reference/files/get for the API call that retrieves a file object; its exportLinks property maps each export format (text/html among them) to a download URL. You can choose whether you want to engage with Drive using the REST API directly, or using the Google-provided C# libraries.
  2. Deal with OAuth (more specifically OAuth 2) as a separate problem. Once you have OAuth 2 working and have obtained an access token, then (and only then) move on to using that access token for Drive. As with Drive, you have a choice of driving OAuth directly using its URLs, or using the Google-supplied libraries. If you prefer the DIY approach, then everything you need to know is here: https://developers.google.com/oauthplayground/ and https://developers.google.com/accounts/docs/OAuth2
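
To make the first pointer concrete, here is a minimal sketch of the two requests involved, assuming you already hold a valid access token. It is written in Python with only the standard library to keep it short; the same two GETs map directly onto HttpClient in C#. The function names are made up for illustration, and the code assumes the Drive v2 files.get response carries a text/html entry under exportLinks for a Google Doc.

```python
import json
import urllib.parse
import urllib.request

# Drive v2 files.get endpoint (the API call linked in the answer).
DRIVE_FILES_URL = "https://www.googleapis.com/drive/v2/files/"


def build_metadata_request(file_id, access_token):
    """Build the authenticated files.get request for one document."""
    return urllib.request.Request(
        DRIVE_FILES_URL + urllib.parse.quote(file_id, safe=""),
        # The access token travels in a standard OAuth 2 bearer header.
        headers={"Authorization": "Bearer " + access_token},
    )


def pick_html_export_link(file_metadata):
    """Return the text/html export URL from a files.get response, or None."""
    return file_metadata.get("exportLinks", {}).get("text/html")


def fetch_html(file_id, access_token):
    """Two round trips: file metadata first, then the HTML export itself."""
    with urllib.request.urlopen(build_metadata_request(file_id, access_token)) as resp:
        metadata = json.load(resp)
    export_url = pick_html_export_link(metadata)
    if export_url is None:
        raise ValueError("no text/html export link for this file")
    # The export URL needs the same bearer header as the metadata call.
    export_request = urllib.request.Request(
        export_url, headers={"Authorization": "Bearer " + access_token}
    )
    with urllib.request.urlopen(export_request) as resp:
        return resp.read().decode("utf-8")
```

The bearer header on the second request is what a plain HTTP request to the export URL is missing: without it the request does not carry the authorisation already granted.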

You ask about the user interaction. To that ...

The good news is that you can do what you are looking for. The specifics depend largely on who owns the documents. Remember that OAuth is about authorisation (with authentication as a kinda by-product).

So you have a Google Docs document "Doc" owned by "User". Application "App" wants to read Doc. So the first step is for User to authorise App to access Doc. That is what the user-centric stuff is all about. If App requests "offline" access, then OAuth will provide it with a refresh token, which App will store and can use at any time to generate an access token and read Doc. Generating an access token from a refresh token can be done without any user interaction; it's simply a POST to a Google URL.

So in this scenario, the user-interaction only happens once.
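
That unattended POST can be sketched as follows, again in Python with only the standard library. Treat the details as assumptions rather than a definitive implementation: the token endpoint URL is Google's current one and is not named in this answer, the function names are invented for illustration, and the client ID, client secret and stored refresh token come from your own OAuth 2 setup.

```python
import json
import urllib.parse
import urllib.request

# Assumed token endpoint; check the OAuth 2 documentation linked above.
TOKEN_URL = "https://oauth2.googleapis.com/token"


def build_refresh_request(client_id, client_secret, refresh_token):
    """Build the form-encoded POST that trades a refresh token for an access token."""
    body = urllib.parse.urlencode({
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
        "grant_type": "refresh_token",
    }).encode("ascii")
    # Supplying a request body makes urllib send this as a POST.
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def refresh_access_token(client_id, client_secret, refresh_token):
    """Send the request and return the new short-lived access token."""
    request = build_refresh_request(client_id, client_secret, refresh_token)
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)["access_token"]
```

A 3 a.m. job would call refresh_access_token with the stored refresh token, then use the returned access token as the bearer token for its Drive requests. No browser or redirect URL is involved at that point.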

The other approach you can take is to have Doc owned by App, and shared to User. In that case, App would (probably) be a Service Account (https://developers.google.com/accounts/docs/OAuth2ServiceAccount). Since App creates and owns Doc, there is no authorisation required by User.

pinoyyid
  • Been trying to get this to work. I still see user-centric things like "redirect URLs," and their sample opens a browser window, etc. To be clear -- I want to pull this information server-side, UNATTENDED. This process will occur with no human interaction, so there is no way for a human to authorize credentials or anything. Put another way, as an example, I want to be able to code this in a job that runs at 3 a.m. or something. Is there any way to do this? – Deane Nov 03 '13 at 23:05
  • I've updated the answer with some additional information. The key point is that if the human owns the document, then the human needs to authorise the web-app to access it, but this only needs to be done once. – pinoyyid Nov 04 '13 at 04:14
  • I think I've figured out how to word this: I don't want to "authorize" the App to do anything. Instead, I want the app to "Act As" a particular user. My idea is that I will create a Google account for the App, then invite it into a document, so it has read permission. Why can't I just impersonate that Google account from code? I essentially want the code to act as that user, and therefore avoid all concepts of "authorization" entirely. Not possible? – Deane Nov 04 '13 at 10:59
  • Creating an account for your app, and then having users share their docs to that account, will certainly work. However, the account will still need to authorise the app to permit it to access the account's documents. You will only need to do this one time in order to obtain a refresh token, which you will store. Thereafter the refresh token can be used to obtain an access token without further authorisation. You can (probably - I'll check for you) use the OAuth playground to get the refresh token. – pinoyyid Nov 04 '13 at 11:05
  • It looks like you CAN use Oauth2 playground https://developers.google.com/oauthplayground/ to get your refresh token. This saves you having to write any of the user-centric Oauth2 code and callbacks. I'll update the answer with the steps. – pinoyyid Nov 04 '13 at 11:09
  • I've created a separate SO question and answer at http://stackoverflow.com/questions/19766912/how-do-i-authorise-a-background-web-app-without-user-intervention-canonical/19766913#19766913 with the procedure – pinoyyid Nov 04 '13 at 11:38