Kushal Das

FOSS and life. Kushal Das talks here.

kushal76uaid62oup5774umh654scnu5dwzh4u2534qxhcbi4wbab3ad.onion

first few weeks in stockholm

Completed 2 weeks in the rented apartment. The day we got it, the owner called at 10pm to tell us that he is selling the place. He decided to skip this important information till we get in. So that he can earn the rent for whatever time he has before selling. So, now we have to find a new place and every other plan is out of order. Anwesha even unpacked most of the bags on that evening, now we are slowly packing again. The worst thing is that Py will have to go to another school (most probably based on where we can find a place).

Very slowly finding out all the things at work, as the brain is being used exclusively to find a new place to live. But, excited to see so much of Python usage within Sunet.

Continuing the journey at SUNET

SUNET logo

From this week I started working for SUNET as a public interest technologist. We are under Vetenskapsrådet, this also means from now on I am a central government employee in Sweden.

I will be helping out various in open source projects and services provided SUNET, focusing on privacy and security. I will also continue working on all of the upstream projects I maintain, including SecureDrop.

Rust, reproducibility and shadow-rs

Generally all of our Rust code are reproducible. If you build it in a fixed path, and also use SOURCE_DATE_EPOCH environment variable, the final library or executables will be producible. This is really helpful, for example while building cryptography python wheel, I can keep building it in a reproducible way even with the Rust dependencies.

A few days ago I saw shadow-rs, which can provide a lot of build time information. For example, in khata now I have a way to tell if I am using any custom build and also identify which one. I was a bit worried as shadow allows to store the build time too, but later found that the community already put in the patches so that it follows SOURCE_DATE_EPOCH.

So, I can still have reproducibility if I use the same build toolchain and environment while depending on shadow. Here is some example code for you to play around.

Default values, documentation and Ansible

While testing my qubes_ansible project on the upcoming Qubes OS 4.1 project, I noticed something really strange. But, before getting into that, this Ansible module and the connection plugin are for Qubes OS only, and based on the excellent Python modules provided by the Qubes team.

The error goes like this during the fact gathering steps (reformatted for the blog):

fatal: [debian-10]: UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to create temporary directory.In some cases, you may have
    been able to authenticate and did not have permissions on the target
    directory. Consider changing the remote tmp path in ansible.cfg to a path
    rooted in \"/tmp\", for more error information use -vvv. Failed command
    was: ( umask 77 && mkdir -p \"` echo ~The *user* is the default user in
    Qubes./.ansible/tmp `\"&& mkdir \"` echo ~The *user* is the default user in
    Qubes./.ansible/tmp/ansible-tmp-1630548982.9355698-7707-90110802425258 `\"
    && echo ansible-tmp-1630548982.9355698-7707-90110802425258=\"` echo ~The
    *user* is the default user in
    Qubes./.ansible/tmp/ansible-tmp-1630548982.9355698-7707-90110802425258 `\"
    ), exited with result 1, stderr output: mkdir: cannot create directory
    ‘~The ~The *user* account as default in Qubes OS. ~The ~The *user* account
    as default in Qubes OS. account as default in Qubes OS. ~The ~The *user*
    account as default in Qubes OS. ~The ~The *user* account as default in
    Qubes OS. account as default in Qubes OS. account as default in Qubes OS.
    is the default user in Qubes.’: File name too long\n",
    "unreachable": true
}

Most important part is the default user's home directory part, echo ~The user is the default user in Qubes./.ansible/tmp. For a moment I totally freaked out, as this looks like documentation. After reading the code more, I can see it is coming from the DOCUMENTATION variable in my plugin. After playing around a bit more and trying out different values I can see that the default value mentioned in the documentation is becoming the default value in the Python code.

After searching more I can see that the Ansible developers want the documentation string to be the gold standard and the code is parsing it find the default values. In my mind this is more confusing. I would expect the default value to be declared inside of the code.

Parsing the DOCUMENTATION and then finding the default values there in a Python code still does not fit in my brain. Fixed the issue for now, let me see what other surprises are waiting in the future.

Trouble of zoom and participant name

Last night I was in a panel along with Juan Andrés Guerrero-Saade organized by Aveek Sen, the topic was "Tips on how journalists can avoid getting snooped". You can watch the recording at Youtube.

But this post is not about that. It is about Zoom. Just before logging into the call, I made sure that the name is changed while joining the call, generally my daughter uses the Zoom and her name was mentioned before. I personally have almost zero zoom usage (except 2-3 times in last 1 year). But, after logging into the call, zoom again went back to the older name, and did not allow me to change it during the session. I kept trying during the session without any luck. I don't know why did they do this or why I could not find a way to change my name, but I feel this is really stupid.

A few bytes of curl

curl is most probably the highest used software in the world. I generally use it daily (directly) in the various scripts at the SecureDrop, starting from inside of Dockerfiles, to Ansible roles or in CI. I never read much about various options available other than a few very basic ones. So I decided to look more into the available options. Here are a few interesting points from that reading:

Protocols supported

This is a very long list, I had no clue!!!!

  • dict
  • file
  • ftp
  • ftps
  • gopher
  • gophers
  • http
  • https
  • imap
  • imaps
  • ldap
  • ldaps
  • mqtt
  • pop3
  • pop3s
  • rtsp
  • scp
  • sftp
  • smb
  • smbs
  • smtp
  • smtps
  • telnet
  • tftp

The --version flag will tell you this details and many things more.

$ curl --version
curl 7.76.1 (x86_64-redhat-linux-gnu) libcurl/7.76.1 OpenSSL/1.1.1k-fips zlib/1.2.11 brotli/1.0.9 libidn2/2.3.1 libpsl/0.21.1 (+libidn2/2.3.0) libssh/0.9.5/openssl/zlib nghttp2/1.43.0
Release-Date: 2021-04-14
Protocols: dict file ftp ftps gopher gophers http https imap imaps ldap ldaps mqtt pop3 pop3s rtsp scp sftp smb smbs smtp smtps telnet tftp 
Features: alt-svc AsynchDNS brotli GSS-API HTTP2 HTTPS-proxy IDN IPv6 Kerberos Largefile libz Metalink NTLM NTLM_WB PSL SPNEGO SSL TLS-SRP UnixSockets

User agent string

We can use -A to provide any given string as User Agent.

$ curl -A "browser/kd 1.0.0" http://httpbin.org/get
{
  "args": {}, 
  "headers": {
    "Accept": "*/*", 
    "Host": "httpbin.org", 
    "User-Agent": "browser/kd 1.0.0", 
    "X-Amzn-Trace-Id": "Root=1-60ec8985-05acbf60769f18855db925f1"
  }, 
  "origin": "xxx.xxx.xxx.xxx", 
  "url": "http://httpbin.org/get"
}

Download via sequence

We can specify the range in our command. The first example downloads different index files from my blog.

curl -O "https://kushaldas.in/index-[1-10].html"

A few more examples:

curl -O "http://example.com/[1-100].png"

curl -O "http://example.com/[001-100].png"

curl -O "http://example.com/[0-100:2].png"

curl -O "http://example.com/section[a-z].html"

curl -O "http://example.com/{one,two,three,alpha,beta}.html"

Parallel downloads

You can pass -Z to the above command to download in parallel. Try it out yourself.

Configuration file

We can create a configuration file to hold all the command line parameters and pass it to the curl command via -K flag.

curl -K configs.txt https://kushaldas.in

One flag each line, comments via #.

# Change user agent
--user-agent "ACAB/1.0"

Curl checks for a default configuration file at ~/.curlrc unless called with -q.

To view only the headers

-I to check the headers returned by the server.

$ curl -I https://kushaldas.in
HTTP/2 200 
server: nginx/1.14.2
date: Sun, 18 Jul 2021 06:40:37 GMT
content-type: text/html; charset=utf-8
content-length: 25977
last-modified: Mon, 05 Jul 2021 15:24:32 GMT
etag: "60e32430-6579"
strict-transport-security: max-age=31536000
onion-location: https://kushal76uaid62oup5774umh654scnu5dwzh4u2534qxhcbi4wbab3ad.onion
permissions-policy: interest-cohort=()
x-frame-options: DENY
x-content-type-options: nosniff
referrer-policy: strict-origin
accept-ranges: bytes

DNS over HTTPS

We can force curl to use a DoH server for the DNS query, pass --doh-url flag to the command along with the server address.

curl -I --doh-url https://adblock.doh.mullvad.net/dns-query https://securedrop.org

Reproducible wheel buidling failure on CircleCI container

At SecureDrop project we have Python wheels built for Python 3.7 on Buster in a reproducible way. We use the same wheels inside of the Debian packages. The whole process has checks to verify the sha256sums based on gpg signatures, and at the end we point pip to a local directory to find all the dependencies.

Now, I am working to update the scripts so that we can build wheels for Ubuntu Focal, and in future we can use those wheels in the SecureDrop server side packages. While working on this, in the CI I suddenly noticed that all wheels started failing on the reproducibility test on Buster. I did not make any change other than the final directory path, so was wondering what is going on. Diffoscope helped to find out the difference:

│ -6 files, 34110 bytes uncompressed, 9395 bytes compressed:  72.5%
│ +6 files, 34132 bytes uncompressed, 9404 bytes compressed:  72.4%
├── six-1.11.0.dist-info/METADATA
│ @@ -9,14 +9,15 @@
│  Platform: UNKNOWN
│  Classifier: Programming Language :: Python :: 2
│  Classifier: Programming Language :: Python :: 3
│  Classifier: Intended Audience :: Developers
│  Classifier: License :: OSI Approved :: MIT License
│  Classifier: Topic :: Software Development :: Libraries
│  Classifier: Topic :: Utilities
│ +License-File: LICENSE
│  
│  .. image:: http://img.shields.io/pypi/v/six.svg
│     :target: https://pypi.python.org/pypi/six
│  
│  .. image:: https://travis-ci.org/benjaminp/six.svg?branch=master
│      :target: https://travis-ci.org/benjaminp/six
├── six-1.11.0.dist-info/RECORD
│ @@ -1,6 +1,6 @@
│  six.py,sha256=A08MPb-Gi9FfInI3IW7HimXFmEH2T2IPzHgDvdhZPRA,30888
│  six-1.11.0.dist-info/LICENSE,sha256=Y0eGguhOjJj0xGMImV8fUhpohpduJUIYJ9KivgNYEyg,1066
│ -six-1.11.0.dist-info/METADATA,sha256=vfvF0GW2vCjz99oMyLbw15XSkmo1IxC-G_339_ED4h8,1607
│ +six-1.11.0.dist-info/METADATA,sha256=Beq9GTDD6nYVwLYrN3oOcts0HSPHotfRWQ_Zn8_9a7g,1629
│  six-1.11.0.dist-info/WHEEL,sha256=Z-nyYpwrcSqxfdux5Mbn_DQ525iP7J2DG3JgGvOYyTQ,110
│  six-1.11.0.dist-info/top_level.txt,sha256=_iVH_iYEtEXnD8nYGQYpYFUvkUW9sEO1GYbkeKSAais,4
│  six-1.11.0.dist-info/RECORD,,

It seems somehow the packages were picking up metadata about the LICENSE file. After looking into the environment, I found virtualenv is pulling latest setuptools into the virtualenv. Thus breaking the reproducibility. It was a quick fix. Yes, we do pin setuptools and pip in our wheels repository.

Next, the extension based wheels started failing, and I was going totally crazy to find out why there is some ABI change. I still don't know the correct reason, but noticed that the circleci/python:3.7-buster container image (and the upstream Python container) is using different flags to build the extensions than Debian Buster on vm/bare metal. For example, below on the top we have the command line from the container and then inside of the normal Debian Buster.

creating build/temp.linux-x86_64-3.7/lib/sqlalchemy/cextension
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -I/home/circleci/project/.venv/include -I/usr/local/include/python3.7m -c lib/sqlalchemy/cextension/processors.c -o build/temp.linux-x86_64-3.7/lib/sqlalchemy/cextension/processors.o
gcc -pthread -shared build/temp.linux-x86_64-3.7/lib/sqlalchemy/cextension/processors.o -L/usr/local/lib -lpython3.7m -o build/lib.linux-x86_64-3.7/sqlalchemy/cprocessors.cpython-37m-x86_64-linux-gnu.so
building 'sqlalchemy.cresultproxy' extension
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -I/home/circleci/project/.venv/include -I/usr/local/include/python3.7m -c lib/sqlalchemy/cextension/resultproxy.c -o build/temp.linux-x86_64-3.7/lib/sqlalchemy/cextension/resultproxy.o
gcc -pthread -shared build/temp.linux-x86_64-3.7/lib/sqlalchemy/cextension/resultproxy.o -L/usr/local/lib -lpython3.7m -o build/lib.linux-x86_64-3.7/sqlalchemy/cresultproxy.cpython-37m-x86_64-linux-gnu.so
building 'sqlalchemy.cutils' extension
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -I/home/circleci/project/.venv/include -I/usr/local/include/python3.7m -c lib/sqlalchemy/cextension/utils.c -o build/temp.linux-x86_64-3.7/lib/sqlalchemy/cextension/utils.o
gcc -pthread -shared build/temp.linux-x86_64-3.7/lib/sqlalchemy/cextension/utils.o -L/usr/local/lib -lpython3.7m -o build/lib.linux-x86_64-3.7/sqlalchemy/cutils.cpython-37m-x86_64-linux-gnu.so



creating build/temp.linux-x86_64-3.7/lib/sqlalchemy/cextension
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/kdas/code/securedrop-debian-packaging/.venv/include -I/usr/include/python3.7m -c lib/sqlalchemy/cextension/processors.c -o build/temp.linux-x86_64-3.7/lib/sqlalchemy/cextension/processors.o
x86_64-linux-gnu-gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-z,relro -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 build/temp.linux-x86_64-3.7/lib/sqlalchemy/cextension/processors.o -o build/lib.linux-x86_64-3.7/sqlalchemy/cprocessors.cpython-37m-x86_64-linux-gnu.so
building 'sqlalchemy.cresultproxy' extension
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/kdas/code/securedrop-debian-packaging/.venv/include -I/usr/include/python3.7m -c lib/sqlalchemy/cextension/resultproxy.c -o build/temp.linux-x86_64-3.7/lib/sqlalchemy/cextension/resultproxy.o
x86_64-linux-gnu-gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-z,relro -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 build/temp.linux-x86_64-3.7/lib/sqlalchemy/cextension/resultproxy.o -o build/lib.linux-x86_64-3.7/sqlalchemy/cresultproxy.cpython-37m-x86_64-linux-gnu.so
building 'sqlalchemy.cutils' extension
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/kdas/code/securedrop-debian-packaging/.venv/include -I/usr/include/python3.7m -c lib/sqlalchemy/cextension/utils.c -o build/temp.linux-x86_64-3.7/lib/sqlalchemy/cextension/utils.o
x86_64-linux-gnu-gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-z,relro -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 build/temp.linux-x86_64-3.7/lib/sqlalchemy/cextension/utils.o -o build/lib.linux-x86_64-3.7/sqlalchemy/cutils.cpython-37m-x86_64-linux-gnu.so
installing to build/bdist.linux-x86_64/wheel

If you want to see the difference in the flags, you can try out the following command:

python3 -m sysconfig

Look for PY_CFLAGS and related flags in the command output. Reproducibility depends on the environment, and we should not expect the latest container image (with the latest Python 3.7.x release) creating the same thing like in Debian Buster, but the failure started recently, we did not notice this a few months. Also, this means in future we should enable reproducibility tests on nightlies in CI.

Adding dunder methods to a Python class written in Rust

Last week I did two rounds of my Creating Python modules in Rust workshop. During the second session on Sunday, someone asked if we can create standard dunder methods, say __str__ or __repr__. I never did that before, and during the session I tried to read the docs and implement it. And I failed :)

Later I realized that I should have read the docs carefully. To add those methods, we will have to implement PyObjectProtocol for the Rust structure.

#[pyproto]
impl PyObjectProtocol for Ros {
    fn __repr__(&self) -> PyResult<String> {
        let cpus = self.sys.get_processors().len();
        let repr = format!("<Ros(CPUS: {})>", cpus);
        Ok(repr)
    }

    fn __str__(&self) -> PyResult<String> {
        let cpus = self.sys.get_processors().len();
        let repr = format!("<Ros(CPUS: {})>", cpus);
        Ok(repr)
    }
}
>>> from randomos import Ros
>>> r = Ros()
>>> r
<Ros (CPUS: 8)>
>>> str(r)
'<Ros (CPUS: 8)>'

This code example is in the ros-more branch of the code.

Workshop on writing Python modules in Rust April 2020

I am conducting 2 repeat sessions for a workshop on "Writing Python modules in Rust".

The first session is on 16th April, 1500 UTC onwards, and the repeat session will be on 18th April 0900 UTC. More details can be found in this issue.

You don't have to have any prior Rust knowledge. I will be providing working code, and we will go very slowly to have a working Python module with useful functions in it.

If you are planning to attend or know anyone else who may want to join, then please point them to the issue link.