Python-Twisted: Reverse Proxy To HTTPS API: Could Not Connect

I am trying to build a reverse proxy that talks to certain APIs (like Twitter, GitHub, Instagram), so that any (client) application I want can call those APIs through my reverse proxy (think

Solution 1:

If you read the API documentation for ReverseProxyResource, you will see that the signature of __init__ is:

def __init__(self, host, port, path, reactor=reactor):

and "host" is documented as "the host of the web server to proxy".

So you are passing a URI where Twisted expects a host.
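To make the distinction concrete, here is a minimal sketch of splitting a full URL into the separate host, port, and path values that the constructor takes. The helper name split_for_proxy is hypothetical and not part of Twisted; it uses only the standard library, and it assumes an ASCII-only path.

```python
from urllib.parse import urlsplit


def split_for_proxy(url):
    """Split a URL into (host, port, path) pieces.

    Hypothetical helper, not part of Twisted. The path is returned as
    bytes, matching the bytes path used elsewhere in this answer.
    """
    parts = urlsplit(url)
    # Fall back to the scheme's default port when the URL names none.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    # Assumes an ASCII-only path; a non-ASCII path would raise here.
    path = parts.path.encode("ascii")
    return parts.hostname, port, path


host, port, path = split_for_proxy("https://api.github.com/users/defunkt")
print(host, port, path)
```

Note that the scheme is only used to pick a default port; the three values by themselves carry no information about whether TLS should be used.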

Worse yet, ReverseProxyResource is designed for local use on a web server, and doesn't quite support https:// URLs out of the box.

It does have a (very limited) extensibility hook though - proxyClientFactoryClass - and to apologize for ReverseProxyResource not having what you need out of the box, I will show you how to use that to extend ReverseProxyResource to add https:// support so you can use the GitHub API :).

from twisted.web import proxy, server
from twisted.logger import globalLogBeginner, textFileLogObserver
from twisted.protocols.tls import TLSMemoryBIOFactory
from twisted.internet import ssl, defer, task, endpoints
from sys import stdout
globalLogBeginner.beginLoggingTo([textFileLogObserver(stdout)])

class HTTPSReverseProxyResource(proxy.ReverseProxyResource, object):
    def proxyClientFactoryClass(self, *args, **kwargs):
        """
        Make all connections using HTTPS.
        """
        return TLSMemoryBIOFactory(
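            # On Python 3, self.host is already text (str), which is what
            # optionsForClientTLS expects, so no decode() call is needed.
            ssl.optionsForClientTLS(self.host), True,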
            super(HTTPSReverseProxyResource, self)
            .proxyClientFactoryClass(*args, **kwargs))
    def getChild(self, path, request):
        """
        Ensure that implementation of C{proxyClientFactoryClass} is honored
        down the resource chain.
        """
        child = super(HTTPSReverseProxyResource, self).getChild(path, request)
        return HTTPSReverseProxyResource(child.host, child.port, child.path,
                                         child.reactor)

@task.react
def main(reactor):
    import sys
    forever = defer.Deferred()
    myProxy = HTTPSReverseProxyResource('api.github.com', 443,
                                        b'/users/defunkt')
    # Child path segments must be bytes.
    myProxy.putChild(b"", myProxy)
    site = server.Site(myProxy)
    endpoint = endpoints.serverFromString(
        reactor,
        dict(enumerate(sys.argv)).get(1, "tcp:8080:interface=127.0.0.1")
    )
    endpoint.listen(site)
    return forever

If you run this, curl http://localhost:8080/ should do what you expect.

I've taken the liberty of modernizing your Twisted code somewhat; endpoints instead of listenTCP, logger instead of twisted.python.log, and react instead of starting the reactor yourself.

The weird little putChild piece at the end there is because when we pass b"/users/defunkt" as the path, that means a request for / will result in the client requesting /users/defunkt/ (note the trailing slash), which is a 404 in GitHub's API. If we explicitly proxy the empty-child-segment path as if it did not have the trailing segment, I believe it will do what you expect.
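The trailing slash comes from how a child resource's path is built: as I understand it, the parent path, then a slash, then the URL-quoted segment. Here is a rough standalone sketch of that joining rule. The function name child_path is hypothetical, and the code imitates the behavior rather than calling Twisted's getChild.

```python
from urllib.parse import quote


def child_path(parent_path, segment):
    """Join a parent path and one URL segment (both bytes).

    Hypothetical stand-in for the path handling described above.
    """
    return parent_path + b"/" + quote(segment, safe="").encode("utf-8")


# A request for "/" has a single empty segment, so only a slash is added.
print(child_path(b"/users/defunkt", b""))
# A request for "/repos" adds a real segment after the slash.
print(child_path(b"/users/defunkt", b"repos"))
```

Registering the proxy itself as the empty-segment child bypasses this join for requests to /, so the upstream path stays exactly as it was passed in.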

PLEASE NOTE: proxying from plain-text HTTP to encrypted HTTPS can be extremely dangerous, so I've added a default listening interface here of localhost-only. If your bytes transit over an actual network, you should ensure that they are properly encrypted with TLS.
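Because the script reads its listening endpoint from the first command-line argument, one way to listen over TLS is to pass an ssl: endpoint description instead of the tcp: default. The sketch below only builds such a string. The helper name and the file names server.key and server.crt are assumptions for illustration, and it assumes none of the values contain a colon (colons would need backslash escaping in an endpoint description).

```python
def tls_endpoint_description(port, private_key, cert, interface=None):
    """Build an "ssl:" server endpoint description string.

    Hypothetical helper; assumes no argument contains a colon.
    """
    parts = ["ssl", str(port), "privateKey=" + private_key, "certKey=" + cert]
    if interface is not None:
        parts.append("interface=" + interface)
    return ":".join(parts)


# server.key and server.crt are placeholder file names.
print(tls_endpoint_description(8443, "server.key", "server.crt"))
```

The resulting string can be given to the script as its first argument in place of the localhost-only tcp: default.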

