Using Stem and PySocks to access the network over Tor
I previously wrote about using the standard SOCKS proxy provided by the Tor Project on your system from Python, in particular with the requests module.
But what if you want to start a Tor SOCKS proxy only for your project (while the code is running), and use some existing code/module to make network calls (using sockets) over it?
Stem module to control the tor process
We can use the Stem module to control the tor process on the system. In the example below, we will use the PySocks module along with it so that we can use the urllib module to fetch some data. The code is based on one of the tutorials from the Stem project. You will also notice that we ask Tor to use only exit nodes in Russia when creating the tor process.
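The exit node restriction is written in Tor's configuration as a two-letter country code wrapped in curly braces, for example {ru}. As a minimal sketch, the configuration dictionary could be built like this. Note that build_tor_config is a hypothetical helper introduced here for illustration and is not part of Stem.

```python
def build_tor_config(socks_port, exit_country):
    """Build a config dict with a SOCKS port and an exit-country restriction.

    build_tor_config is a hypothetical helper, not a Stem API. The result
    has the same shape as the dict passed to launch_tor_with_config.
    """
    code = exit_country.strip().lower()
    if len(code) != 2 or not (code.isascii() and code.isalpha()):
        raise ValueError("expected a two-letter country code, got %r" % exit_country)
    return {
        # Tor config values are passed as strings.
        "SocksPort": str(socks_port),
        # Country codes are wrapped in curly braces, e.g. {ru}.
        "ExitNodes": "{%s}" % code,
    }


print(build_tor_config(7000, "RU"))
# {'SocksPort': '7000', 'ExitNodes': '{ru}'}
```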
Starting the tor process with a given SOCKS proxy port is super simple. We then replace socket.socket with socks.socksocket (after the right configuration), and socket.getaddrinfo with our own implementation to make sure that we don't leak DNS information.
import socket
import urllib.request

import socks
import stem.process
from stem.util import term

SOCKS_PORT = 7000


def query(url):
    """
    Uses urllib to fetch a site using the proxy on the SOCKS_PORT.
    """
    return urllib.request.urlopen(url).read()


def print_bootstrap_lines(line):
    if "Bootstrapped " in line:
        print(term.format(line, term.Color.BLUE))


def getaddrinfo(*args):
    "Let us do the actual DNS resolution in the SOCKS5 proxy"
    return [(socket.AF_INET, socket.SOCK_STREAM, 6, "", (args[0], args[1]))]


def main():
    # Start an instance of Tor configured to only exit through Russia. This prints
    # Tor's bootstrap information as it starts. Note that this likely will not
    # work if you have another Tor instance running.
    print(term.format("Starting Tor:\n", term.Attr.BOLD))

    tor_process = stem.process.launch_tor_with_config(
        config={
            "SocksPort": str(SOCKS_PORT),
            "ExitNodes": "{ru}",
        },
        init_msg_handler=print_bootstrap_lines,
    )

    print(term.format("\nChecking our endpoint:\n", term.Attr.BOLD))

    # Route every new socket through the local Tor SOCKS5 proxy. The fourth
    # argument (rdns=True) asks the proxy to resolve host names.
    socks.set_default_proxy(
        socks.PROXY_TYPE_SOCKS5, "localhost", SOCKS_PORT, True, None, None
    )
    socket.socket = socks.socksocket
    socket.getaddrinfo = getaddrinfo

    try:
        print(term.format(query("https://icanhazip.com"), term.Color.BLUE))
    finally:
        tor_process.kill()  # stops tor


if __name__ == "__main__":
    main()
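Replacing socket.socket and socket.getaddrinfo changes them for the whole Python process, and the script above never puts the originals back. If you only want some calls to go through the proxy, one option is to wrap the replacement in a context manager. The following is a minimal sketch under that assumption. Note that patched_socket is a hypothetical helper name, and the socket class and resolver are passed in as arguments so the sketch itself needs only the standard library.

```python
import contextlib
import socket


@contextlib.contextmanager
def patched_socket(socket_cls, resolver):
    """Temporarily replace socket.socket and socket.getaddrinfo.

    patched_socket is a hypothetical helper. With the script above,
    socket_cls would be socks.socksocket and resolver would be the
    custom getaddrinfo function.
    """
    original_socket = socket.socket
    original_getaddrinfo = socket.getaddrinfo
    socket.socket = socket_cls
    socket.getaddrinfo = resolver
    try:
        yield
    finally:
        # Always restore the originals, even if the body raised.
        socket.socket = original_socket
        socket.getaddrinfo = original_getaddrinfo
```

After calling socks.set_default_proxy as before, the fetch would then look like `with patched_socket(socks.socksocket, getaddrinfo): print(query("https://icanhazip.com"))`. Code that imported the socket class directly (from socket import socket) before the patch keeps its own reference and is not affected.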
If you are surprised by the getaddrinfo function above, it simply returns the same domain name it received (instead of an IP address). The actual DNS resolution happens over Tor at the proxy level.
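To see this without starting Tor, the same function can be called on its own. This is a small sketch and example.com is only a placeholder host. The returned entry has the usual getaddrinfo shape (family, type, protocol, canonical name, socket address), but the address still holds the host name.

```python
import socket


def getaddrinfo(*args):
    """Return the host name unresolved so the SOCKS5 proxy looks it up."""
    # 6 is the protocol number for TCP (socket.IPPROTO_TCP).
    return [(socket.AF_INET, socket.SOCK_STREAM, 6, "", (args[0], args[1]))]


family, socktype, proto, canonname, sockaddr = getaddrinfo("example.com", 443)[0]
print(sockaddr)  # ('example.com', 443): still a name, not an IP address
```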