Consider the following purely hypothetical scenario: You do a lot of your work inside a web browser. Stuff on your workplace network (like the bug tracking system, the company Wiki and the source control system) you have to access directly. Stuff outside you have to access via a proxy (because the direct access is filtered in a haphazard way that blocks websites of your vendors and customers, as well as resources you need to do your job).
You can go manually change the network settings in your browser every time you go back and forth. You can set up two different browsers (one proxied and another not). You can use a browser add-on that lets you manually toggle proxy settings with one click.
Better still if you didn’t have to do any of that stuff and your browser would just automatically do the Right Thing. With Firefox, you can achieve this result reasonably easily using a “Proxy Auto-Configuration File” (or PAC file, for short). Read on for details.
PAC File Basics
A PAC file is a little chunk of JavaScript code that can look at a URL and make a decision about what proxy (if any) the browser should use to access that URL. The basic structure is something like:
function FindProxyForURL(url, host) {
    if(some_condition)
        return "method1";
    ...
    return "default_method";
}
The name of the function and its arguments must be exactly as shown above. The “url” argument is the full URL being accessed. The “host” is just the hostname part extracted from the URL for your convenience. The method strings returned are things like the literal string “DIRECT” if the URL should be accessed directly; to use a proxy, return something like:
SOCKS5 myhost:8000
(The above example would use a SOCKS5 proxy running on host “myhost” listening on port 8000.) You can make the rules as simple or as complicated as they need to be for your use case. There are a handful of predefined functions you can call from FindProxyForURL(); these are all documented on the relevant Mozilla Developer Network page.
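To make the skeleton concrete, here is a minimal sketch that fills in the condition using nothing but string comparison. The hostnames “wiki” and “bugs” are made up for illustration, and “myhost:8000” is the same placeholder proxy as above; substitute your own.

```javascript
// Minimal sketch of a PAC function. The host names "wiki" and "bugs"
// are hypothetical; "myhost:8000" is a placeholder proxy address.
function FindProxyForURL(url, host) {
    // Hosts we want to reach without a proxy.
    if (host === "wiki" || host === "bugs")
        return "DIRECT";
    // Everything else goes through the SOCKS5 proxy.
    return "SOCKS5 myhost:8000";
}
```

With this file loaded, a request for http://wiki/ would go direct, and a request for any other host would go through the proxy.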
I’m not trying to duplicate the Mozilla documentation here, but I suspect an actual example (with the identifying details filed off) might be a useful addition to that documentation.
Example Use Case
The following is a fictional example. It is also something that will require customization for your specific needs, and the prior assent of your network administrator if applicable. This is not a case where you can cut and paste and expect things to work.
Say you’re running Firefox on a machine inside the network of your employer, example.com. On that same machine, you’ve got an SSH client connected to your home network (which has unfettered Internet access). The SSH client provides a SOCKS5 proxy on port 8000 of localhost (for clients on the same machine only), the exit point of which is on the aforementioned home network.
Your goal is to be able to access the following:
- Any URL with a fully-qualified hostname under the example.com domain (directly).
- Any URL with a numeric IP in the 10.0.0.0/8 block (directly).
- URLs with unqualified hostnames that resolve using your local (example.com) DNS (directly).
- Any URL with a numeric IP not in the 10.0.0.0/8 block (via proxy).
- URLs with unqualified hostnames that don’t resolve at example.com but do resolve on your home DNS (via proxy).
- Any URL with a fully-qualified hostname not under example.com (via proxy).
(A simpler way to say the above would be to just have the first three bullet points and say “everything else via proxy”.) Let’s also stipulate that you never want to use the local DNS at example.com to resolve (or attempt to resolve) anything except for unqualified hostnames.
Putting Together the PAC File
We need to write a JavaScript FindProxyForURL() function that will return one of two fixed strings, depending on whether we want to use the proxy or not. If we want to access a URL directly (not using the proxy), we must return:
DIRECT
To use the proxy, we must return:
SOCKS5 localhost:8000
Our inputs are the URL, and the hostname part of the URL (host). We have a (small) library of utility functions provided by our caller, and the facilities of the JavaScript language. Using those resources, we need to return DIRECT if any of the following conditions are true:
- host ends with .example.com
- host is a numeric IP inside the 10.0.0.0/8 network
- host is unqualified (lacks a domain part) and resolves using local (example.com) DNS
Otherwise, if none of those are true, we must return the string that indicates we should use the proxy. (Note that the three tests are ordered in a very deliberate way, with the goal of avoiding using the slow and unreliable local DNS unless absolutely necessary. If the first condition is true, we can return right away without executing the latter two. If the second is true, we can skip the third. If the host is fully qualified, we know the third condition is false without having to look at the DNS part. Everything but the DNS test can be done by just examining the host string without touching the network.)
Test for Specific Domain
The first test (checking to see if the host is under the example.com domain) is simply a matter of calling one of the provided library functions, thus:
if(dnsDomainIs(host, ".example.com")) ...
This condition will evaluate to true (and the code following the if will be executed) for hosts like “foo.example.com” or “server.subdomain.region.example.com”, but not for hosts like “example.org” or “foo.example.com.cn”.
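For intuition, the check behaves roughly like a plain string-suffix comparison. The sketch below is an approximation written for illustration (the helper name hostEndsWith is made up, not part of the PAC environment); the real dnsDomainIs() is supplied by the browser and may differ in details.

```javascript
// Rough stand-in for dnsDomainIs(), assuming it is a simple suffix test.
// hostEndsWith is a hypothetical name used only for this illustration.
function hostEndsWith(host, domain) {
    return host.length >= domain.length &&
           host.substring(host.length - domain.length) === domain;
}

console.log(hostEndsWith("foo.example.com", ".example.com"));    // true
console.log(hostEndsWith("foo.example.com.cn", ".example.com")); // false
console.log(hostEndsWith("example.com", ".example.com"));        // false
```

Under this approximation, the bare name “example.com” does not match the pattern “.example.com” because of the leading dot. If you need the bare domain to go direct as well, test for it separately.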
Test for Subnet
This test is somewhat more complicated. To check if host is on the 10.0.0.0/8 subnet, we could just do this:
if(isInNet(host, "10.0.0.0", "255.0.0.0")) ...
The function isInNet() is provided as part of the calling environment. The first argument is the hostname, the second argument is the address and the third is the netmask (where a 1 bit indicates the network part).
The problem with this is that isInNet() resolves (or attempts to resolve) host using local DNS first, then checks the subnet of the result. We want to avoid using local DNS unless host is already a numeric IP (dotted quad) in the URL. To accomplish this feat, we provide our own JavaScript function to test if host is a dotted quad up front:
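function isDottedQuad(host) {
    // Purely a string test; never touches the network.
    var octets = host.split(".");
    if(octets.length != 4)
        return false;
    for(var i = 0; i < octets.length; i++) {
        // Require 1-3 decimal digits, so "" and "0x1f" are rejected.
        if(!/^\d{1,3}$/.test(octets[i]) || +octets[i] > 255)
            return false;
    }
    return true;
}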
Using the above, we can write the test like this:
if(isDottedQuad(host) && isInNet(host, "10.0.0.0", "255.0.0.0")) ...
That’s a lot better, since isDottedQuad() just looks at the string without hitting the network, and isInNet() is only called when isDottedQuad() comes back true. (JavaScript, like C, skips evaluating the later parts of compound conditional expressions once the answer is known. In other words, in code like if(a && b) ..., if a evaluates as false, then we already know the whole condition is false and don't need to evaluate b.)
It’s still not ideal because it hits the local resolver for all numeric IPs (not just those in 10.0.0.0/8), but it’s close enough for our example. Writing a fully general, purely local replacement for isInNet() that understands CIDR block notation is left as an exercise for the reader.
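As a starting point for that exercise, here is one possible sketch of a purely string-based check for IPv4 CIDR blocks. The names dottedQuadToInt and isInCidr are made up for this illustration; neither is part of the PAC environment, and IPv6 is not handled.

```javascript
// Convert a dotted quad such as "10.1.2.3" to a non-negative integer,
// or return null if the string is not a valid dotted quad.
// (Hypothetical helper, not a PAC built-in.)
function dottedQuadToInt(s) {
    var parts = s.split(".");
    if (parts.length !== 4) return null;
    var n = 0;
    for (var i = 0; i < 4; i++) {
        if (!/^\d{1,3}$/.test(parts[i])) return null;
        var octet = parseInt(parts[i], 10);
        if (octet > 255) return null;
        n = n * 256 + octet;
    }
    return n;
}

// Return true if host is a dotted quad inside the block given in CIDR
// notation, e.g. isInCidr("10.1.2.3", "10.0.0.0/8"). Never uses DNS.
// (Hypothetical helper, not a PAC built-in.)
function isInCidr(host, cidr) {
    var pieces = cidr.split("/");
    if (pieces.length !== 2 || !/^\d{1,2}$/.test(pieces[1])) return false;
    var bits = parseInt(pieces[1], 10);
    if (bits > 32) return false;
    var addr = dottedQuadToInt(host);
    var net = dottedQuadToInt(pieces[0]);
    if (addr === null || net === null) return false;
    // Two addresses share a block when they agree after dropping the
    // host bits; division avoids JavaScript's signed 32-bit bit operators.
    var blockSize = Math.pow(2, 32 - bits);
    return Math.floor(addr / blockSize) === Math.floor(net / blockSize);
}
```

With helpers like these defined in the PAC file, the subnet test could be written as if(isInCidr(host, "10.0.0.0/8")) ... with no call to isInNet() at all.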
Test for Unqualified Local Names
Fortunately, for the third test, all we have to do is use two utility functions already part of the environment:
if(isPlainHostName(host) && isResolvable(host)) ...
If the hostname lacks a domain part, we check if it can be resolved using the local DNS. (If the hostname has a domain part, we never try to resolve it locally.)
Complete PAC File Example
Putting together all the parts described above gives us a file like:
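// Return true if host looks like a numeric IPv4 address (dotted quad).
// Purely a string test; never touches the network.
function isDottedQuad(host) {
    var octets = host.split(".");
    if(octets.length != 4)
        return false;
    for(var i = 0; i < octets.length; i++) {
        // Require 1-3 decimal digits, so "" and "0x1f" are rejected.
        if(!/^\d{1,3}$/.test(octets[i]) || +octets[i] > 255)
            return false;
    }
    return true;
}

function FindProxyForURL(url, host) {
    // 1. Anything under example.com: direct.
    if(dnsDomainIs(host, ".example.com"))
        return "DIRECT";
    // 2. Numeric IPs in 10.0.0.0/8: direct.
    if(isDottedQuad(host) && isInNet(host, "10.0.0.0", "255.0.0.0"))
        return "DIRECT";
    // 3. Unqualified names that resolve in local DNS: direct.
    if(isPlainHostName(host) && isResolvable(host))
        return "DIRECT";
    // Everything else: via the SOCKS5 proxy.
    return "SOCKS5 localhost:8000";
}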
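Before loading the file into a browser, it can help to exercise the logic from the command line (for example with Node.js). The sketch below is one way to do that; it is not part of the PAC file itself. The four helper functions are simplified stand-ins I wrote for testing, not the browser's real implementations, and the localNames table and the dnsLookups counter are made up for illustration. The two PAC functions are a copy of the ones above with variables declared locally.

```javascript
// --- Test stand-ins for the browser-provided helpers (assumptions) ---
var dnsLookups = 0;                       // counts simulated DNS queries
var localNames = { "wiki": "10.1.2.3" };  // hypothetical local DNS contents

function dnsDomainIs(host, domain) {
    return host.length >= domain.length &&
           host.substring(host.length - domain.length) === domain;
}
function isPlainHostName(host) {
    return host.indexOf(".") === -1;
}
function isResolvable(host) {
    dnsLookups++;
    return Object.prototype.hasOwnProperty.call(localNames, host);
}
function isInNet(host, pattern, mask) {
    // Simplified: assumes host is already a dotted quad, and counts the
    // call as one resolver hit (as the article describes).
    dnsLookups++;
    var h = host.split("."), p = pattern.split("."), m = mask.split(".");
    for (var i = 0; i < 4; i++) {
        if ((Number(h[i]) & Number(m[i])) !== (Number(p[i]) & Number(m[i])))
            return false;
    }
    return true;
}

// --- The PAC logic under test ---
function isDottedQuad(host) {
    var octets = host.split(".");
    if (octets.length != 4) return false;
    for (var i = 0; i < octets.length; i++) {
        if (!/^\d{1,3}$/.test(octets[i]) || +octets[i] > 255) return false;
    }
    return true;
}
function FindProxyForURL(url, host) {
    if (dnsDomainIs(host, ".example.com")) return "DIRECT";
    if (isDottedQuad(host) && isInNet(host, "10.0.0.0", "255.0.0.0")) return "DIRECT";
    if (isPlainHostName(host) && isResolvable(host)) return "DIRECT";
    return "SOCKS5 localhost:8000";
}

// --- Try a few hosts and report how many lookups each one cost ---
["foo.example.com", "10.20.30.40", "wiki", "www.example.org"].forEach(function (host) {
    dnsLookups = 0;
    var result = FindProxyForURL("http://" + host + "/", host);
    console.log(host + " -> " + result + " (" + dnsLookups + " lookups)");
});
```

With these stand-ins, fully-qualified names such as foo.example.com and www.example.org are decided with zero simulated lookups, which is the property the ordering of the tests was designed to provide.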
Using the PAC File
To use the PAC file you’ve created, you need to put it somewhere Firefox can get to it, then change your browser configuration to use it. There are two general approaches I’ll describe here: a lazy method, with the PAC file on the local filesystem of the machine where you’re running the browser, and the official way where you put the PAC file on a web server.
Local PAC File
Note: This method is not explicitly documented as being supported, but it empirically works using Firefox 50.0.2. If you have trouble, try the “real” way where you put the PAC file on a web server (described in the next section).
Place the PAC file in your local filesystem, somewhere you can easily find it, with a name ending with .pac. Users on UNIX-y things probably want to put it somewhere under their home directory. On Windows, a place like C:\temp\proxy.pac might be a good choice.
On UNIX-like things, configure Firefox as follows:
- launch Firefox
- go to Edit->Preferences
- click the Advanced tab in the sidebar
- click the Network tab along the top
- click the Settings… button (next to Connection/Configure how Firefox connects to the Internet)
- select the radio button next to Automatic proxy configuration URL:
- in the URL box, supply a file:// URL giving the file location
- check the checkbox next to Proxy DNS when using SOCKS v5
If the path to the file is /home/someuser/proxy.pac then the full URL would be:
file:///home/someuser/proxy.pac
For Windows systems, follow the instructions above, but instead of Edit->Preferences use Tools->Options. If you put the file in C:\temp\proxy.pac then the full URL would be:
file:///c:/temp/proxy.pac
Server-Hosted PAC File
Place the PAC file on a web server, make sure you can reach it from your browser (and that people who shouldn’t see it, can’t), and ensure that it is served with MIME type application/x-ns-proxy-autoconfig. A full discussion of web hosting is (way) beyond the scope of this document, but in a typical Apache setup, putting the following line in your .htaccess file will take care of the MIME type at least:
AddType application/x-ns-proxy-autoconfig .pac
Set up the browser as above, but use the appropriate http:// (or https://) URL instead of a file:// URL. If I call the PAC file proxy.pac and put it in ~/public_html/ on a web server with Apache set up for userdir, then the URL might be something like:
http://somehost/~mylogin/proxy.pac
SSH Client as SOCKS5 Proxy
If you don’t already have a SOCKS5 proxy, but you can SSH to somewhere with good network connectivity, your SSH client may provide the easiest route to a working proxy. Most SSH clients provide application-level port forwarding with SOCKS5 support. For OpenSSH (the default SSH client on many desktop UNIX-like systems), the recipe is:
ssh -D localhost:8000 mylogin@myremotehost
Once you successfully log in to the remote host, this will create a SOCKS5 proxy with a local endpoint listening on port 8000 (on the host where you’re running the SSH client, and accepting connections only from that same host), and exiting on the remote host. Note that it is important to specify localhost:8000 rather than just :8000. The latter will appear to work, but will allow connections to the local tunnel endpoint from anywhere, not just your machine.
On Windows systems, PuTTY is an excellent (and free) SSH client which also supports tunnels. To set this up, look in the configuration under Connection->SSH->Tunnels. Select the “Dynamic” and “IPv4” radio buttons and put 8000 in the “Source Port” box. (Leave Destination blank.) Click Add. You should see text like 4D8000 appear under Forwarded Ports. Remember to save your config.
The accompanying figure shows an example of the PuTTY settings.
Limitations
Be aware that there are some edge cases this simple example doesn’t handle:
- Your home network and work network address spaces overlap, and there are hosts on your home network you need to access via numeric IP. (In our example, the work network was 10.0.0.0/8. Imagine your home network is 10.200.0.0/16. You ask for the host 10.200.1.2 — which one is it?)
- Your home network and work network contain one or more hosts with identical hostnames, and you need to access such a host on your home network using an unqualified name. (If home and work both have a host named “webserver”, using the unqualified name will give you the work one.)
- Anything to do with numeric IPv6 addresses.
Given the specifics, it would be possible to add code to deal with any or all of the above.
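As one illustration of what such code might look like, here is a sketch covering the first two cases, using the overlapping 10.200.0.0/16 home network and the duplicate host “webserver” from the bullets above. The function name wantsHomeNetwork and the homeOnlyNames table are made up for this example, and the approach (an explicit exception list checked before the three DIRECT tests) is only one of several possible designs.

```javascript
// Hypothetical list of unqualified names that should always be treated
// as home-network hosts, even if the work DNS also knows them.
var homeOnlyNames = { "webserver": true };

// Return true if host should be forced through the proxy because it
// belongs to the home network. Pure string tests; no DNS involved.
// (Hypothetical helper, not a PAC built-in.)
function wantsHomeNetwork(host) {
    // Numeric addresses in the assumed home block 10.200.0.0/16.
    if (/^10\.200\.\d{1,3}\.\d{1,3}$/.test(host))
        return true;
    // Unqualified names explicitly claimed for the home network.
    return Object.prototype.hasOwnProperty.call(homeOnlyNames, host);
}
```

In FindProxyForURL(), a line such as if(wantsHomeNetwork(host)) return "SOCKS5 localhost:8000"; would go ahead of the three DIRECT tests, so the exceptions win over the general rules.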
Troubleshooting
If you’re having trouble, make sure all the local (work network) hosts work as expected with “No proxy” set in the browser options. Then, make sure that all the remote and home-network stuff works using “Manual Proxy Configuration”, “SOCKS host: 127.0.0.1”, “Port: 8000”, “SOCKS v5”.
If you get “proxy server is refusing connection” pages on the latter test, your SOCKS server isn’t working. Check that your tunnel is open. (If using SSH, your client must be running and connected to the remote host. Remember that the sysadmin on the remote host can disable forwarding in the SSH server config.)
If the manual proxy config works but the PAC file doesn’t, try a simplified version that just returns a fixed string (first DIRECT, then if that works, the SOCKS string). If you don’t get the expected results, your browser isn’t finding the PAC file, or isn’t understanding it. The browser error log (Tools->Web Developer->Browser Console) and command-line JavaScript syntax checkers can be useful here. Remember that when you change the PAC file, you must tell the browser to re-load it (using the button on the network connection settings page where you configured the PAC file URL) before your changes will “take”.
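The simplified version mentioned above can be as small as the following sketch. The proxy address is the one from this article's example; adjust it to match your setup.

```javascript
// Troubleshooting PAC file: ignores its arguments and always returns
// the same answer. Start with "DIRECT"; once that works, swap in the
// commented-out SOCKS5 line (and reload the PAC file in the browser).
function FindProxyForURL(url, host) {
    return "DIRECT";
    // return "SOCKS5 localhost:8000";
}
```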
If the simplified PAC file works, add tests back in one at a time and check at each step.