Kushal Das

FOSS and life. Kushal Das talks here.

kushal76uaid62oup5774umh654scnu5dwzh4u2534qxhcbi4wbab3ad.onion

Story of debugging exit 0

For more than a month, my primary task at SecureDrop land is to make the project ready for a distribution update. The current system runs on Ubuntu Xenial, and the goal is to upgrade to Ubuntu Focal. The deadline is around February 2021, and we will also disable Onion service v2 in the same release.

Tracking issue

There is a tracking issue related to all the changes for Focal. After the initial Python language-related updates, I had to fix the Debian packages for the same. Then comes the Ansible roles for the whole system, setting up the rebase + full integration tests for the system in CI. A lot of new issues were found when we managed to get Focal based staging instances in the CI.

OSSEC test failure

This particular test is failing on Focal staging. It checks for the port 1514 listening on UDP on the monitor server via OSSEC. The first thing we noticed that the output of the command is different in Xenial and on Focal. While looking at the details, we also noticed that the OSSEC service is failing on Focal. Now, this starts via a sysv script in the /etc/init.d/ directory. My initial try was to follow the solution mentioned here. But, the maild service was still failing most of the time. Later, I decided to move to a more straightforward forking based service file. Works well for the OSSEC monitoring service. So, I decided to have the same systemd service file for the agent in the app server.

But, the installation step via Ansible failed before any configuration. When we install the .deb packages, our configuration provider package tries to restart the service via the post-installation script. And that fails with a complaint that /var/ossec/etc/client.keys is not found. If I try to start the service manually, I get the same error with an exit code 1.

At this moment, I was trying to identify how was it working before, we are using the same codebase + the sysv on Xenial, but package installation works. The systemd generates the following service file:

# Automatically generated by systemd-sysv-generator

[Unit]
Documentation=man:systemd-sysv-generator(8)
SourcePath=/etc/init.d/ossec
Description=LSB: Start daemon at boot time
After=remote-fs.target

[Service]
Type=forking
Restart=no
TimeoutSec=5min
IgnoreSIGPIPE=no
KillMode=process
GuessMainPID=no
RemainAfterExit=yes
SuccessExitStatus=5 6
ExecStart=/etc/init.d/ossec start
ExecStop=/etc/init.d/ossec stop

After digging for hours, I noticed exit 0 at the end of the old sysv script. This file was last modified by late James Dolan around seven years ago, and we never had to touch the same.

Now I am planning to have 1 in the SuccessExitStaus= line of the service file for the agent. This will keep the behavior the same as the old sysv file.

[Unit]
Description=OSSEC service

[Service]
Type=forking
ExecStart=/var/ossec/bin/ossec-control start
ExecStop=/var/ossec/bin/ossec-control stop
RemainAfterExit=True
SuccessExitStatus=1

[Install]
WantedBy=multi-user.target

Using rkt and systemd

Few days back, I wrote about my usage of rkt containers. As rkt does not have any daemon running, the simplest way to have a container running is to start it inside some screen or tmux session. I started following the same path, I used a tmux session.

But then I wanted to have better control over the containers, to start or stop them as required. Systemd is the solution for all the other services in the system, that makes it an ideal candidate for this case too.

Example of a service file

[Unit]
Description=ircbot
Documentation=https://github.com/kushaldas/ircbot
Requires=network-online.target

[Service]
Slice=machine.slice
MemoryLimit=500M
ExecStart=/usr/bin/rkt --insecure-options=image --debug run --dns=8.8.8.8 --volume mnt,kind=host,source=/some/path,readOnly=false  /mnt/ircbot-latest-linux-amd64.aci
ExecStopPost=/usr/bin/rkt gc --mark-only
KillMode=mixed
Restart=always

The path of the service file is /etc/systemd/system/ircbot.service. In the [Unit] section, I mentioned a super short Description, and link to the documentation of the project. I also mentioned that this service requires network-online.target to be available first.

The [Service] is the part where we define all the required configurations. The first value we mention is the Slice.

Slices, a way to do resource control

Systemd uses slices to group a number of services, and slices in a hierarchical tree. This is built on top of the Linux Kernel Control Group feature. In a system by default, there are four different slices.

  • -.slice : The root slice.
  • system.slice : All system services are in this slice.
  • machine.slice : All vms and containers are in this slice.
  • user.slice : All user sessions are in this slice.

We can see the whole hierarchy using the systemd-cgls command. For example:

Control group /:
-.slice
├─machine.slice
│ ├─ircbot.service
│ │ ├─11272 /usr/bin/systemd-nspawn --boot --register=true -Zsystem_u:system_r:container_t:s0:c447,c607 -Lsystem_u:object_r:container_file_t:s0:c447,
│ │ ├─init.scope
│ │ │ └─11693 /usr/lib/systemd/systemd --default-standard-output=tty
│ │ └─system.slice
│ │   ├─ircbot.service
│ │   │ └─11701 /usr/bin/ircbot
│ │   └─systemd-journald.service
│ │     └─11695 /usr/lib/systemd/systemd-journald
├─user.slice
│ └─user-1000.slice
│   ├─session-31.scope
│   │ ├─16228 sshd: kdas [priv]
│   │ ├─16231 sshd: kdas@pts/0
│   │ ├─16232 -bash
│   │ ├─16255 sudo su -
│   │ ├─16261 su -
│   │ └─16262 -bash

You can manage various resources using cgroups. Here, in our example service file, I mentioned that memory limit for the service is 500MB. You can read more here on resource management.

There is also systemd-cgtop tool, which will give you a top like view for the various resources consumed by the slices.

# systemd-cgtop -M rkt-250d0c2b-0130-403b-a9a6-3bb3bde4e934

Control Group                                                           Tasks   %CPU   Memory  Input/s Output/s
/machine.slice/ircbot.service                                             9      -   234.0M        -        -
/machine.slice/ircbot.service/system.slice                                -      -     5.0M        -        -
/machine.slice/ircbot.service/system.slice/ircbot.service                 -      -     5.0M        -        -

The actual command which we used to run the container is mentioned in ExecStart.

Using the service

I can now use the standard systemctl commands for this new ircbot service. For example:

# systemctl start ircbot
# systemctl enable ircbot
# systemctl stop ircbot
# systemctl status ircbot

You can also view the log of the application using journalctl command.

# journalctl -u ircbot

The documentation from rkt has more details on systemd and rkt.