Use a socks5 proxy to beat HTTP error 451

This documents my recipe for working around HTTP error 451 when viewing some websites from Europe.

The Problem

I want to screenshot pages, but that is difficult to do from Europe because some sites block European visitors rather than comply with European regulations.

If you live in Europe, or more precisely, if you visit a website from an IP address that appears to be in Europe, you might have seen this error page:

Many Europe-based visitors see errors like this on US-based sites

The expected outcome was the actual homepage of Austin's paper:

Viewed through socks5 proxy via New York

Actual screenshot of statesman.com through the socks5 proxy

Learning through side-projects

I have a number of on-going experiments. They are run from Scaleway, a budget-friendly cloud-hosting provider in France.

One of these experiments is NewShots, a service that screenshots about 1000 different news sites around the world every day. For the most part it works well but, as with any software project, I ran into show-stopping bugs.

One such bug was this error 451, which basically means, "Sorry, I can't show you this page for legal reasons." The 451 is a reference to Fahrenheit 451.

The legal reason is that they cannot or will not comply with European regulations on user data. GDPR, etc.

Honestly I don't care. I just wanted to take a screenshot of the page.
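For a screenshot pipeline it can help to spot the 451 before capturing anything, so that blocked sites can be retried through a proxy. Below is a minimal sketch using only the Python standard library; the function names `is_legal_block` and `fetch_status` are made up for illustration and are not part of NewShots.

```python
from http import HTTPStatus
import urllib.error
import urllib.request


def is_legal_block(status_code: int) -> bool:
    """Return True when the status is 451 Unavailable For Legal Reasons."""
    return status_code == HTTPStatus.UNAVAILABLE_FOR_LEGAL_REASONS


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for url, including error codes such as 451."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the code is carried on the exception.
        return err.code
```

A caller could then run `is_legal_block(fetch_status(url))` and only route the sites that return True through the proxy. Whether a given site answers with 451 or some other status is up to that site, so treat this as a heuristic.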


Proposed Solution

Let's set up a cheap and easy private proxy server accessible only to myself and my project servers. (Hint: ZeroTier is amazing for building VPN/mesh networks.)

Logical layout of the SOCKS5 / error 451 workaround

The ingredients in this solution

  • FoxyProxy plugin (I use Firefox)
  • Virtual Machine in US (DigitalOcean)
  • SOCKS5 Proxy Server (I run my own, but many are open and freely available)

Let's cook!

There are two parts we want to fix:

  1. Fix the errors for my side projects running in the cloud (based in France)
  2. Fix the errors on my PC (usually based in Sweden)

In both cases the answer is the same: use a SOCKS5 proxy based in the US. To do that we will set up a simple SOCKS5 service on a VM in New York.


Step 1: Set up the SOCKS5 server

That is in a separate post here. With only Docker and 10 lines of YAML you can get your own SOCKS5 service up and running.

Set up my browser to use SOCKS5 on demand

I use a browser extension (FoxyProxy) to quickly switch the browser over to SOCKS5. There is a separate post I made about doing this.

Set up my cloud-hosted services to use SOCKS5

My cloud-hosted side-project uses Python and Selenium to drive a headless web browser. All I had to do was add the following lines to my code, which tell the browser to route requests through the remote proxy:

    # Fragment from inside a Django management command; `profile` is the
    # Firefox profile object created earlier in the full file.
    if settings.SOCKS5_PROXY_ENABLED:
        self.stdout.write(self.style.SUCCESS(f'Proxy enabled: {settings.SOCKS5_PROXY_HOSTNAME}:{settings.SOCKS5_PROXY_PORT}'))
        # 1 = manual proxy configuration
        profile.set_preference('network.proxy.type', 1)
        profile.set_preference('network.proxy.socks_version', 5)
        profile.set_preference('network.proxy.socks', settings.SOCKS5_PROXY_HOSTNAME)
        # explicit casting to int because otherwise it is ignored and fails silently.
        profile.set_preference('network.proxy.socks_port', int(settings.SOCKS5_PROXY_PORT))
        # resolve hostnames through the proxy instead of locally
        profile.set_preference('network.proxy.socks_remote_dns', True)

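See the full file here on GitHub

So what is happening above? Simply put, if settings.SOCKS5_PROXY_ENABLED is set, we add the proxy to the browser configuration by calling set_preference() once per Firefox preference: the proxy type, the SOCKS version, the hostname, the port, and whether DNS lookups should also go through the proxy.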
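The same idea can be pulled out into a small helper that is easy to test without launching a browser. This is only a sketch: `socks5_preferences` and `apply_preferences` are names invented here, and the host and port in the usage note are placeholders rather than my real proxy.

```python
from typing import Dict, Union

PrefValue = Union[str, int, bool]


def socks5_preferences(host: str, port: Union[str, int]) -> Dict[str, PrefValue]:
    """Build the Firefox preferences that route traffic through a SOCKS5 proxy."""
    return {
        "network.proxy.type": 1,  # 1 = manual proxy configuration
        "network.proxy.socks": host,
        "network.proxy.socks_port": int(port),  # cast: settings often hold strings
        "network.proxy.socks_version": 5,
        "network.proxy.socks_remote_dns": True,  # resolve hostnames via the proxy
    }


def apply_preferences(target, prefs: Dict[str, PrefValue]) -> None:
    """Apply prefs to any object exposing set_preference(name, value)."""
    for name, value in prefs.items():
        target.set_preference(name, value)
```

With Selenium, `target` would be the Firefox profile used above, or (as far as I know) a `webdriver.FirefoxOptions()` object, which exposes the same `set_preference()` method. For example: `apply_preferences(profile, socks5_preferences("proxy.example", "1080"))`.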


Final Thoughts

I like SOCKS5 because it is supported by most HTTP client software, whether it is a browser (Firefox), a command-line tool (curl natively, wget through a wrapper such as proxychains), or buried within code, as with Python and Selenium. I use it for all three cases when I need my client to appear as though it is in a different geographic region.
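To illustrate the command-line and in-code cases, here is a rough sketch that builds a proxy URL once and reuses it. The helper names are hypothetical, and `proxy.example` with port 1080 is a placeholder. The `socks5h` scheme asks the client to resolve hostnames through the proxy, which mirrors the `socks_remote_dns` preference used for Firefox.

```python
from typing import Dict, List, Union


def socks5_proxy_url(host: str, port: Union[str, int], remote_dns: bool = True) -> str:
    """Return a proxy URL; socks5h means DNS lookups also go through the proxy."""
    scheme = "socks5h" if remote_dns else "socks5"
    return f"{scheme}://{host}:{int(port)}"


def curl_command(url: str, proxy_url: str) -> List[str]:
    """Build an argument list for curl that fetches url through the proxy."""
    return ["curl", "--silent", "--proxy", proxy_url, url]


def requests_proxies(proxy_url: str) -> Dict[str, str]:
    """Build a proxies mapping in the shape the requests library expects."""
    return {"http": proxy_url, "https": proxy_url}


if __name__ == "__main__":
    proxy = socks5_proxy_url("proxy.example", 1080)
    print(" ".join(curl_command("https://www.statesman.com", proxy)))
```

The list from `curl_command()` can be passed to `subprocess.run()`, and the mapping from `requests_proxies()` can be passed as `proxies=` to `requests.get()`, assuming requests was installed with its SOCKS extra (`requests[socks]`).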
