1

Would it be possible to load an external page inside a container and replace text elements?

We work with ad campaigns and earn a percentage whenever a user signs up.

Can a script replace certain words? For instance “User” to “Usuario” or “Password” to “Contraseña” without affecting the original website or its functions.

Note: These links always pass through a redirection.

Example:

http://a2g-secure.com/?E=/0yTeQmWHoKOlN6zUciCXQwUzfnVGPGN&s1=

Note 2: Using an iframe is out of the question due to “Same-origin policy”.

Gabriel Meono
  • 1
    Possibly related: http://stackoverflow.com/questions/8146648/jquery-find-text-and-replace – jchook Mar 19 '16 at 19:34
  • Do you have any way to "craft" the HTML/CSS that is finally loaded, or is that absolutely out of your control? – miguel-svq Mar 21 '16 at 21:24
  • If an "iframe is out of the question due to 'Same-origin policy'", then so are Ajax and the other tools you would use for this. – dandavis Mar 22 '16 at 18:24
  • @jchook, continuing from other answers, since this probably should be done on the server side: a server-equivalent to JQuery could be Cheeriojs on Node.js. – noderman Mar 22 '16 at 21:20

5 Answers

2

You would need to load the external page server-side; after that you can do whatever you want with it. You can replace strings on the server, or do it later in JavaScript.

But remember that as soon as you add a whole web page into, for example, a div on your own page, the CSS from your page will affect it. You would also need to rewrite all the links in the document to absolute URLs. If the page depends on Ajax, there is pretty much no way to accomplish what you want.

If, on the other hand, the pages you load are static HTML, it is possible, though there is a lot to take care of before you can present the page to the user, such as adjusting links and stylesheet URLs.
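As a rough sketch of the two server-side steps mentioned above (making URLs absolute and replacing words), something like the following could work on static HTML that has already been fetched. The helper names `absolutizeUrls` and `replaceWords` are hypothetical, and the regular expressions assume simple, well-formed markup with double-quoted attributes; a real HTML parser would be more robust.

```javascript
// Hypothetical helper: resolve href/src attribute values against the URL
// the page was originally fetched from.
function absolutizeUrls(html, baseUrl) {
  return html.replace(/\b(href|src)="([^"]*)"/g, function (match, attr, value) {
    try {
      return attr + '="' + new URL(value, baseUrl).href + '"';
    } catch (e) {
      return match; // leave values that cannot be resolved untouched
    }
  });
}

// Hypothetical helper: replace dictionary entries only in the text between
// tags, so attributes such as name="User" keep their original values.
function replaceWords(html, dictionary) {
  return html.replace(/>([^<]+)</g, function (match, text) {
    var replaced = text;
    Object.keys(dictionary).forEach(function (word) {
      replaced = replaced.split(word).join(dictionary[word]);
    });
    return ">" + replaced + "<";
  });
}
```

Note that this simple version would also touch text inside inline script or style elements, which is one more reason it only suits static pages.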

lshas
  • 1
    It is also a security risk. The site you are loading could contain malicious JavaScript that then has full control over your page, without any SOP restrictions. – felixfbecker Mar 19 '16 at 19:47
  • That is also a very important thing to remember. And there is basically no way to get around that problem, while still having the site you present to the user functional. – lshas Mar 19 '16 at 19:49
  • You would have to sanitize the HTML by stripping all script tags and event handlers. – felixfbecker Mar 19 '16 at 20:02
2

I'm not sure if this answers your question, but you might find it useful.

(Perhaps you might give a step-by-step example of what you're trying to accomplish?)

Suppose a browser requests page P from a proxy, and the proxy first retrieves P from its actual home and then transforms the content before returning it to the browser. What you're describing is a reverse HTTP proxy, a very well-known page-serving technique.

Rather than performing complex transformations at the server (which require specialized knowledge of the page layout), this technique is usually used to inject a single line into the retrieved source that calls a JavaScript file to actually perform the required transformation at the browser.

So in essence:

  1. Browser requests Page P from Proxy 1.
  2. Proxy 1 retrieves the actual Page P from its real home, Server 2.
  3. Proxy 1 adds the line <script src="//proxy1.com/transform.js"></script> to the source of Page P.
  4. Proxy 1 then returns the modified source of Page P to Browser.
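Step 3 could be sketched roughly as follows, assuming the proxy already holds the page source as a string. The function name `injectScript` is hypothetical; it places the tag just before the closing body tag, or at the end if no such tag is found.

```javascript
// Hypothetical helper for step 3: add a script tag to the fetched source.
function injectScript(html, scriptSrc) {
  var tag = '<script src="' + scriptSrc + '"></script>';
  var closingBody = /<\/body\s*>/i;
  if (!closingBody.test(html)) {
    return html + tag; // no closing body tag found: append at the end
  }
  // Insert the tag immediately before the first closing body tag.
  return html.replace(closingBody, function (match) {
    return tag + match;
  });
}
```

For example, `injectScript(pageSource, "//proxy1.com/transform.js")` would return the source with the extra script tag in place.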

Once the Browser has received the page content, the JavaScript file is also retrieved, which can then modify the page contents in any way required.
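A minimal sketch of what such a transform script might contain, assuming a fixed word list: it walks the DOM and rewrites only text nodes, so element attributes, form field names, and event handlers stay untouched. The names `DICTIONARY` and `translateTextNodes` are made up for illustration.

```javascript
// Assumed word list; extend as needed.
var DICTIONARY = { User: "Usuario", Password: "Contraseña" };

var TEXT_NODE = 3; // same value as Node.TEXT_NODE in browsers

// Recursively replace dictionary entries in text nodes below `node`.
function translateTextNodes(node, dictionary) {
  if (node.nodeType === TEXT_NODE) {
    Object.keys(dictionary).forEach(function (word) {
      node.nodeValue = node.nodeValue.split(word).join(dictionary[word]);
    });
    return;
  }
  // Skip script and style content so the page's own code is not altered.
  if (node.nodeName === "SCRIPT" || node.nodeName === "STYLE") {
    return;
  }
  var children = node.childNodes || [];
  for (var i = 0; i < children.length; i++) {
    translateTextNodes(children[i], dictionary);
  }
}

// In a browser, run once the DOM has been parsed.
if (typeof document !== "undefined") {
  document.addEventListener("DOMContentLoaded", function () {
    translateTextNodes(document.body, DICTIONARY);
  });
}
```

Content that the page adds later (for example through Ajax) would not be covered by this single pass.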

This technique can also solve your "same-origin policy" issue: load the iframe from a URL on the same server that served the parent page, and have that server act as the proxy, like:

http://example.com/?proxy_target=//server2.com/pageP.html

Thus, the browser only "sees" content from a single server.
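As an illustration only, the `proxy_target` parameter from the URL above could be written and read with the standard `URL` API. The helper names `buildProxyUrl` and `readProxyTarget` are hypothetical; encoding the target keeps any query string it carries from being confused with the proxy's own parameters.

```javascript
// Hypothetical helper: build the URL the iframe would point at.
function buildProxyUrl(proxyBase, targetUrl) {
  var url = new URL(proxyBase);
  url.searchParams.set("proxy_target", targetUrl); // value is percent-encoded
  return url.href;
}

// Hypothetical helper: recover the target on the proxy side.
function readProxyTarget(requestUrl) {
  return new URL(requestUrl).searchParams.get("proxy_target");
}
```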

Rob Raisch
  • The issue with a proxy is that it breaks the monetization: the URL tracks the IP location and monetizes accordingly. If all the registrations come from a single IP, none will be counted as legitimate. – Gabriel Meono Mar 20 '16 at 16:06
  • Which is why tracking the origin is important. If you have control over the target of the proxy (the server actually serving the request), you'd need to honor the Origin HTTP Request Header rather than the IP address of the request. If you do not have control, you'd need to use a 'transparent' proxy, like [filternet](https://github.com/axiak/filternet). See http://www.catonmat.net/http-proxy-in-nodejs/ for writing your own simple proxy. – Rob Raisch Mar 21 '16 at 15:57
  • Again, if you provide a step-by-step example of what you're trying to accomplish, I'd be better able to help you. – Rob Raisch Mar 21 '16 at 16:00
  • If a proxy is a problem, then my answer does not apply either, because the "proxy translation" service would create a copy of the content and serve it from somewhere else. @RobRaisch's suggestion of a local JS file that does the transformation is closer to what could work. But without a proxy, what you could do is "map" the position of the ad elements and place an overlay on top of them with your translations. The ad's signup button must remain clickable, so leave it untranslated (or use something that disappears on mouse over). That way you serve the ad normally, but place your "mask" on top of it. – noderman Mar 24 '16 at 19:14
1

It seems you are trying to localize a website on the fly, using your server as a proxy for that content. Is that right? If so, depending on the size of your operation, there are several proxy translation services out there (I'll name them if needed).

Basically, they scrape a website and give you a way to translate and host the translated content. Of course, this depends on your relationship with the content providers. You should take that into consideration, since modifying content, even for translation, can be a copyright problem.

All things considered, if you trust the provider's JavaScript, the solution involves scraping the content, as mentioned in other answers, and serving the modified content. You really need to trust the origin...

update per request

http://www.easyling.com

http://www.smartling.com

http://www.motionpoint.com

http://www.lionbridge.com/solutions/translation-proxy/

http://www.sajan.com/translation-proxy-technology-and-traditional-website-translation-understanding-your-options/

They are all aimed at enterprise-grade projects, but I would say Easyling is the most accessible.

Hope this helps.

noderman
0

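Using the .load() callback function, this will replace the text:

$(function () {
    // .load() is an Ajax call, so the URL must be same-origin or allow CORS.
    var replacement = "Usuario"; // value of the "user" parameter
    $("#Content").load("http://example.com?user=Usuario", function () {
        // Replace every occurrence of "user" in the loaded markup.
        $(this).html($(this).html().replace(/user/g, replacement));
    });
});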

For redirection, you can use:

// similar behavior as an HTTP redirect
window.location.replace("url");


// similar behavior as clicking on a link
window.location.href = "url";
channasmcs
0

The answer is NO, not without using a server-side proxy. For a really good overview of how to use a proxy, see this YUI page: https://developer.yahoo.com/javascript/howto-proxy.html (Be patient, as it will take time to load, but the illustrations are worth it!)

When I try this in jsfiddle to see what data the three callback parameters contain, the error below appears:

$(function() {
    $(this).load('https://stackoverflow.com/questions/36003367/load-external-page-and-replace-text', function(responseText, textStatus, jqXHR) {
        debugger; // pause here to inspect the three callback arguments
    });
});

ERROR:

XMLHttpRequest cannot load https://stackoverflow.com/questions/36003367/load-external-page-and-replace-text.

No 'Access-Control-Allow-Origin' header is present on the requested resource. Origin 'https://fiddle.jshell.net' is therefore not allowed access.

Clomp