When using Silverlight or Flash to fetch data from other domains, one often runs into cross-domain access restrictions. For security reasons, the remote server has to explicitly allow access to its data from a different domain by defining a crossdomain.xml file (for Silverlight, a clientaccesspolicy.xml file works as well). If neither file can be found on the remote domain, the request is not executed.
This can be frustrating when querying RSS feeds or JSON/XML web APIs that don’t define either of these files. The workaround is to use some sort of proxying service. In this article I’ll show how to use Google App Engine to create a simple proxy that forwards these requests for free, within a reasonable daily load.
Google App Engine Overview
The reason I’ve chosen to implement the proxy using Google App Engine is that it has a free daily quota and getting started is really simple: all you need is a Google account and an installed copy of the Google App Engine SDK.
Google App Engine supports developing in both Java and Python. In my example I’ll be using Python. In order to use and deploy the code yourself as well, follow these steps:
- If you don’t yet have a Google account, register for one.
- Download and install Python from python.org
- Download and install the Google App Engine Python SDK
- Register an application on Google App Engine (click on Create Application). When done, run the Google App Engine Launcher and create an application with the same name you’ve just registered.
Creating the Proxy In Python
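Creating a simple proxy is pretty straightforward using the urlfetch library:

from google.appengine.api import urlfetch
from google.appengine.runtime import apiproxy_errors

try:
    # fetch the remote resource on behalf of the client
    response = urlfetch.fetch(url)
except (urlfetch.Error, apiproxy_errors.Error):
    # an error occurred
    response = None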
In order to make the proxy a bit smarter, I’ll implement some caching using App Engine’s memcache:

proxiedContent = memcache.get(memcacheKey)
proxiedContentInMemcache = True
if proxiedContent is None:
    proxiedContentInMemcache = False
    # not in memcache: execute request
    # ...
    # add the result content to memcache for CACHE_TIME minutes
    # (memcache expects the expiration time in seconds)
    memcache.add(memcacheKey, proxiedContent, CACHE_TIME * 60)
else:
    # use the content from memcache
    pass
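The caching pattern above can be tried outside App Engine with a small stand-in. This is only a sketch: TTLCache and fetch_with_cache are hypothetical names, not part of the App Engine SDK, and the cache is a plain in-memory dictionary that imitates the two memcache calls used here (get returns None on a miss, add stores a value only if the key is not already set, with an expiry given in seconds). It is written for Python 3 so it runs on a current interpreter.

```python
import time


class TTLCache:
    """Tiny in-memory stand-in for memcache with per-entry expiry (hypothetical)."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._entries = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # entry is stale: drop it and report a miss
            del self._entries[key]
            return None
        return value

    def add(self, key, value, ttl_seconds):
        # like memcache.add, store the value only if the key is not already set
        if self.get(key) is None:
            self._entries[key] = (value, self._clock() + ttl_seconds)
            return True
        return False


def fetch_with_cache(cache, key, fetch, ttl_seconds):
    """Return the cached value for key, calling fetch() only on a miss."""
    content = cache.get(key)
    if content is None:
        content = fetch()
        if content is not None:
            cache.add(key, content, ttl_seconds)
    return content
```

Passing the clock in as a parameter makes the expiry behaviour easy to check without waiting: a test can advance a fake clock past the TTL and confirm that fetch() is called again.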
Using these snippets, here’s what the main file of the proxy webapp looks like:

import hashlib
import urllib
import wsgiref.handlers

from google.appengine.api import memcache
from google.appengine.api import urlfetch
from google.appengine.ext import webapp
from google.appengine.runtime import apiproxy_errors

CACHE_TIME = 1  # number of minutes to cache content for
URL_PREFIXES = ["http://www.google.com/finance"]  # only allow URLs to be queried from certain domain(s)


def getMemcacheKey(url):
    url_hash = hashlib.sha256()
    url_hash.update(url)
    return "hash_" + url_hash.hexdigest()


class ProxyHandler(webapp.RequestHandler):
    def get(self):
        url = self.request.get('url')
        url = urllib.unquote(url)
        # only allow urls that start with prefixes defined in URL_PREFIXES to be used
        if not self.isUrlAllowed(url):
            self.response.out.write("The URL passed can not be proxied due to security reasons.")
            return
        memcacheKey = getMemcacheKey(url)

        # Use memcache to store the request for CACHE_TIME
        proxiedContent = memcache.get(memcacheKey)
        proxiedContentInMemcache = True
        if proxiedContent is None:
            proxiedContentInMemcache = False
            try:
                response = urlfetch.fetch(url)
            except (urlfetch.Error, apiproxy_errors.Error):
                return self.error(404)
            proxiedContent = response.content
        if proxiedContent is None:
            return self.error(404)

        # Add the fetched content to memcache
        # (memcache expects the expiration time in seconds)
        if not proxiedContentInMemcache:
            memcache.add(memcacheKey, proxiedContent, CACHE_TIME * 60)
        self.response.out.write(proxiedContent)

    def isUrlAllowed(self, url):
        for urlPrefix in URL_PREFIXES:
            if url.startswith(urlPrefix):
                return True
        return False


app = webapp.WSGIApplication([
    ("/proxy", ProxyHandler),
], debug=True)


def main():
    wsgiref.handlers.CGIHandler().run(app)


if __name__ == "__main__":
    main()
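The proxy can now be called by passing the URL to be proxied via the url parameter (URL-encoded if it contains a query string of its own). So to proxy e.g. http://google.com, the following request should be made: http://myapplication.appspot.com/proxy?url=http://google.com. Note that the proxy only forwards URLs that start with one of the prefixes listed in URL_PREFIXES, so the list has to include the sites you want to query.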
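When the target URL has its own query string, it needs to be encoded before being placed in the url parameter; otherwise its & and = characters would be read as parameters of the proxy request itself. Here is a minimal sketch of building such a request URL on the client side, assuming Python 3’s urllib.parse. The helper name build_proxy_url and the q/output query parameters are made up for illustration; the endpoint is the example address used above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# example endpoint from the text; replace with your own application's address
PROXY_ENDPOINT = "http://myapplication.appspot.com/proxy"


def build_proxy_url(target_url, endpoint=PROXY_ENDPOINT):
    """Return the proxy request URL with target_url safely encoded."""
    return endpoint + "?" + urlencode({"url": target_url})


# hypothetical target URL with its own query string
target = "http://www.google.com/finance?q=GOOG&output=json"
proxied = build_proxy_url(target)

# decoding the proxy URL's query string gives back the original target URL
recovered = parse_qs(urlsplit(proxied).query)["url"][0]
print(proxied)
print(recovered == target)
```

The round trip at the end is a quick way to confirm that nothing in the target URL leaks into the proxy’s own query string.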
Adding crossdomain.xml to the Application
Now that the proxy class is ready, all we need to do is wire it into the web application and include a static crossdomain.xml file that allows requests from all hosts (it is advisable to restrict this to the domain the requests are actually made from). Based on this, here is what crossdomain.xml looks like:

<!DOCTYPE cross-domain-policy SYSTEM "http://www.macromedia.com/xml/dtds/cross-domain-policy.dtd">
<cross-domain-policy>
  <allow-access-from domain="*" />
</cross-domain-policy>
And here is what the modified app.yaml descriptor looks like:

application: yourappname
version: proxyv1
runtime: python
api_version: 1

handlers:
- url: /crossdomain.xml
  static_files: crossdomain.xml
  upload: crossdomain.xml

- url: /.*
  script: proxy.py
  secure: optional
Download the Code
The code for the application can be downloaded here: Proxy Using Google App Engine.zip. After unzipping, be sure to change the “yourappname” application name in app.yaml so that you can deploy it on Google App Engine.
Update: Security Issues
Andrew noted in the comments that the original solution raised security concerns: since no authentication or authorization is done, it would have been easy for someone to hijack this proxy and use it for their own purposes. I’ve added a small fix so that the proxy only forwards to URLs that start with one of a list of allowed prefixes. This is probably a sufficient solution for common cases; however, more sophisticated authorization methods may be needed in other cases.
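One possible tightening of the prefix check, offered only as a sketch and not as the fix used in the downloadable code, is to parse the URL and compare its scheme, host and path separately. A plain string prefix is only as strict as the prefix itself: a shorter prefix such as http://www.google.com would also match http://www.google.com.evil.example/. The names ALLOWED_ORIGINS, ALLOWED_PATH_PREFIX and is_url_allowed are hypothetical, and the code assumes Python 3’s urllib.parse (on the Python 2 runtime the same function lives in the urlparse module).

```python
from urllib.parse import urlsplit

# (scheme, host) pairs the proxy may forward to; mirrors URL_PREFIXES above
ALLOWED_ORIGINS = {("http", "www.google.com")}
ALLOWED_PATH_PREFIX = "/finance"


def is_url_allowed(url):
    """Allow only URLs whose scheme, host and path match the allow-list."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # malformed URL: refuse rather than guess
        return False
    if (parts.scheme, parts.hostname) not in ALLOWED_ORIGINS:
        return False
    path = parts.path
    # accept the prefix itself or anything below it, but not "/financefoo"
    return path == ALLOWED_PATH_PREFIX or path.startswith(ALLOWED_PATH_PREFIX + "/")
```

For example, is_url_allowed("http://www.google.com/finance/historical?q=GOOG") is accepted, while a look-alike host or a sibling path such as /financefoo is rejected.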