Posts tagged "linux":
You want sudo -i or su -
You want to use sudo -i or su - to log into root. sudo su anything is superfluous, because you probably should be using sudo -i or sudo -s, which are roughly equivalent, depending on whether you want to simulate a login (su - or sudo -i) or not (su or sudo -s).[1]
When to use su -?
You want to log into root using the root password. Typically you must be in the wheel group (check your PAM configuration). In Debian, you simply need to know the password, as there is no wheel group restriction enabled by default.[2]
When to use sudo -i?
You want to log into root using your sudo configuration, which will typically prompt for your login password or allow login without a password. sudo also logs all invocations by default. It is more flexible, but it is also prone to security concerns, such as the recent local user escalation vulnerability. It's not crazy to consider using just su - or maybe some other tool like doas, since sudo is a bit hard to pin down when assessing its security risks.
Why do I want to "simulate a login"?
The next few sections show some diffs of env(1) output. The diff is generated using diff -U0 | grep -v '^@'. This shows a unified diff with no context and suppresses the hunk (line number) markers. First I'll summarize what the output suggests, then show the diffs.
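A minimal sketch of how such a diff could be produced. The helper name env_diff is made up for illustration, and the capture commands in the comments assume that env is run through each kind of root shell; the post does not say exactly how the env output was captured.

```shell
#!/bin/sh
# Sketch: compare the environments of two root shells.
# env_diff is a hypothetical helper name, not part of sudo, su, or diffutils.

# Print a unified diff with zero context lines and drop the "@@" hunk markers.
env_diff() {
    diff -U0 "$1" "$2" | grep -v '^@'
}

# Assumed capture step (needs sudo rights, so it is left commented out):
#   sudo -s env > sudo_-s
#   sudo -i env > sudo_-i
#   env_diff sudo_-s sudo_-i
```

Lines starting with + exist only in the second file (the simulated login), lines starting with - only in the first.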
TL;DR
In my case, if I don't use sudo -i or su - to simulate[3] a login, the following things might not work correctly:
- Not exactly sure what the missing settings mean for nix, though I'm going to guess root won't be able to use nix without a login.
- the pager settings won't be configured correctly
- SBCL, Java, Dotnet, VBox, Fltk, OpenCL, Distcc might not work
- Plan9port might not work
- sudo will allow XAUTHORITY on through, which permits you to run graphical programs as another user, using the current user's desktop session. Using su may not pass this on through by default. (Maybe su -m could do this as well?)
Simulating login also sets your PWD to root's HOME (/root). This might seem inconvenient at first, but I wonder why one would want to touch their user files as root. The use-cases might be (1) writing a thumb drive, (2) grabbing some system configuration from your user's homedir, or (3) storing system stuff in your home directory. Maybe logging into root's homedir is a saner default; you can then specify an absolute path to be extra clear about what you wanted to do. This also means if you do something dumb, it will not damage your user homedir, only root's, provided you didn't cd somewhere else.
Most of these settings are pulled in from my /etc/profile. Hence you probably want to simulate a user login.
The env diffs
sudo -s vs sudo -i
--- sudo_-s 2021-02-14 17:26:26.912620999 -0600
+++ sudo_-i 2021-02-14 17:26:38.259214818 -0600
+PLAN9=/opt/plan9
+XDG_CONFIG_DIRS=/etc/xdg
+LESS=-R -M --shift 5
+JDK_HOME=/etc/java-config-2/current-system-vm
+CONFIG_PROTECT_MASK=/etc/sandbox.d /etc/fonts/fonts.conf /etc/gentoo-release /etc/gconf /etc/terminfo /etc/dconf /etc/ca-certificates.conf /etc/texmf/web2c /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/revdep-rebuild
+DISTCC_VERBOSE=0
+JAVA_HOME=/etc/java-config-2/current-system-vm
+DOTNET_ROOT=/opt/dotnet_core
+ANT_HOME=/usr/share/ant
-PWD=/home/winston
+EDITOR=/usr/bin/vi
+PWD=/root
+NIX_PROFILES=/nix/var/nix/profiles/default /root/.nix-profile
+CONFIG_PROTECT=/etc/stunnel/stunnel.conf /usr/share/maven-bin-3.6/conf /usr/share/gnupg/qualified.txt /usr/share/easy-rsa /usr/share/config /usr/lib64/libreoffice/program/sofficerc
+QT_QPA_PLATFORMTHEME=qt5ct
+DISTCC_TCP_CORK=
+MANPATH=/root/.nix-profile/share/man:/etc/java-config-2/current-system-vm/man:/usr/share/gcc-data/x86_64-pc-linux-gnu/9.3.0/man:/usr/share/binutils-data/x86_64-pc-linux-gnu/2.35.1/man:/etc/java-config-2/current-system-vm/man/:/usr/local/share/man:/usr/share/man:/usr/lib/rust/man:/usr/lib/llvm/11/share/man:/opt/plan9/man
+NIX_PATH=nixpkgs=/nix/var/nix/profiles/per-user/root/channels/nixpkgs:/nix/var/nix/profiles/per-user/root/channels:/root/.nix-defexpr/channels
+OPENCL_PROFILE=nvidia
+UNCACHED_ERR_FD=
+FLTK_DOCDIR=/usr/share/doc/fltk-1.3.5-r4/html
+NIX_SSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crt
+OPENGL_PROFILE=xorg-x11
+DISTCC_FALLBACK=1
+DCC_EMAILLOG_WHOM_TO_BLAME=
+INFOPATH=/usr/share/gcc-data/x86_64-pc-linux-gnu/9.3.0/info:/usr/share/binutils-data/x86_64-pc-linux-gnu/2.35.1/info:/usr/share/info:/usr/share/info/emacs-26
+JAVAC=/etc/java-config-2/current-system-vm/bin/javac
+LESSOPEN=|lesspipe %s
+MANPAGER=manpager
-PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin:/opt/bin:/usr/lib/llvm/11/bin:/opt/plan9/bin
+DISTCC_SAVE_TEMPS=0
+PAGER=/usr/bin/less
+DISTCC_SSH=
+SBCL_HOME=/usr/lib64/sbcl
+GCC_SPECS=
+GSETTINGS_BACKEND=dconf
+DISTCC_ENABLE_DISCREPANCY_EMAIL=
+XDG_DATA_DIRS=/usr/local/share:/usr/share
+PATH=/root/.nix-profile/bin:/root/.nix-profile/bin:/nix/var/nix/profiles/default/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/bin:/usr/lib/llvm/11/bin:/opt/plan9/bin:/usr/games/bin
+VBOX_APP_HOME=/usr/lib64/virtualbox
+LV2_PATH=/usr/lib64/lv2
-_=/bin/env
+SBCL_SOURCE_ROOT=/usr/lib64/sbcl/src
+LADSPA_PATH=/usr/lib64/ladspa
+_=/usr/bin/env
sudo su vs sudo su -
I don't have a root password set on my systems, so I will use sudo with su for example's sake.
--- sudo_su 2021-02-14 17:25:53.932832743 -0600
+++ sudo_su_- 2021-02-14 17:26:10.259394587 -0600
+PLAN9=/opt/plan9
-SUDO_GID=1000
-SUDO_COMMAND=/bin/su
-SUDO_USER=winston
-PWD=/home/winston
+XDG_CONFIG_DIRS=/etc/xdg
+LESS=-R -M --shift 5
+JDK_HOME=/etc/java-config-2/current-system-vm
+CONFIG_PROTECT_MASK=/etc/sandbox.d /etc/fonts/fonts.conf /etc/gentoo-release /etc/gconf /etc/terminfo /etc/dconf /etc/ca-certificates.conf /etc/texmf/web2c /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/revdep-rebuild
+DISTCC_VERBOSE=0
+JAVA_HOME=/etc/java-config-2/current-system-vm
+DOTNET_ROOT=/opt/dotnet_core
+ANT_HOME=/usr/share/ant
+EDITOR=/usr/bin/vi
+PWD=/root
+NIX_PROFILES=/nix/var/nix/profiles/default /root/.nix-profile
+CONFIG_PROTECT=/etc/stunnel/stunnel.conf /usr/share/maven-bin-3.6/conf /usr/share/gnupg/qualified.txt /usr/share/easy-rsa /usr/share/config /usr/lib64/libreoffice/program/sofficerc
-XAUTHORITY=/root/.xauthgcUQue
+QT_QPA_PLATFORMTHEME=qt5ct
+DISTCC_TCP_CORK=
+MANPATH=/root/.nix-profile/share/man:/etc/java-config-2/current-system-vm/man:/usr/share/gcc-data/x86_64-pc-linux-gnu/9.3.0/man:/usr/share/binutils-data/x86_64-pc-linux-gnu/2.35.1/man:/etc/java-config-2/current-system-vm/man/:/usr/local/share/man:/usr/share/man:/usr/lib/rust/man:/usr/lib/llvm/11/share/man:/opt/plan9/man
+NIX_PATH=nixpkgs=/nix/var/nix/profiles/per-user/root/channels/nixpkgs:/nix/var/nix/profiles/per-user/root/channels:/root/.nix-defexpr/channels
+XAUTHORITY=/root/.xauthq2CYTL
+OPENCL_PROFILE=nvidia
+UNCACHED_ERR_FD=
+FLTK_DOCDIR=/usr/share/doc/fltk-1.3.5-r4/html
+NIX_SSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crt
+OPENGL_PROFILE=xorg-x11
+DISTCC_FALLBACK=1
+DCC_EMAILLOG_WHOM_TO_BLAME=
+INFOPATH=/usr/share/gcc-data/x86_64-pc-linux-gnu/9.3.0/info:/usr/share/binutils-data/x86_64-pc-linux-gnu/2.35.1/info:/usr/share/info:/usr/share/info/emacs-26
+JAVAC=/etc/java-config-2/current-system-vm/bin/javac
+LESSOPEN=|lesspipe %s
+MANPAGER=manpager
-PATH=/sbin:/bin:/usr/sbin:/usr/bin
-SUDO_UID=1000
-MAIL=/var/mail/root
-_=/bin/env
+DISTCC_SAVE_TEMPS=0
+PAGER=/usr/bin/less
+DISTCC_SSH=
+SBCL_HOME=/usr/lib64/sbcl
+GCC_SPECS=
+GSETTINGS_BACKEND=dconf
+DISTCC_ENABLE_DISCREPANCY_EMAIL=
+XDG_DATA_DIRS=/usr/local/share:/usr/share
+PATH=/root/.nix-profile/bin:/root/.nix-profile/bin:/nix/var/nix/profiles/default/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/bin:/usr/lib/llvm/11/bin:/opt/plan9/bin:/usr/games/bin
+VBOX_APP_HOME=/usr/lib64/virtualbox
+LV2_PATH=/usr/lib64/lv2
+SBCL_SOURCE_ROOT=/usr/lib64/sbcl/src
+LADSPA_PATH=/usr/lib64/ladspa
+_=/usr/bin/env
Conclusion
Use either sudo -i or su -. Don't mix sudo and su. Maybe don't use sudo (always good advice, though I don't follow it… yet).
Footnotes:
While writing this post, I found that sudo su - will erase SUDO_* environment variables. Maybe this is beneficial to a workflow, but in most cases I suggest fixing your software to not check for these vars. Looking at you, beep.
When NAT Bites — Use a Reverse VPN
Sometimes I find myself setting up servers on networks with less than ideal network configuration. Most home internet connections use dynamic IP addresses, which require extra work to ensure I know the IP address to use when logging into the network from the internet. Another concern is how unreliable home networking gear can be, especially with users tweaking settings without fully appreciating what they're doing. As a result, I've devised an alternate solution to ensure I can always log into boxes hosted on home internet connections.
How to ssh into a home server box — the standard way
Let's say I am to set up a box on a home network. I want to ensure I can log in via SSH from the outside world, that is, anywhere online. In IPv6 and IPv4 land it is equally painful and possibly unreliable.
Allowing ssh logins over IPv6
After setting up the box on the network, I will need to configure the router's firewall to allow SSH traffic on through. In the case of IPv6 this is as simple as telling the router to allow traffic inbound on port 22 to my box's IPv6 address, which is globally routable and unique to my box. One more thing: the ISP gives each IPv6-enabled subscriber an IPv6 prefix, which is a bunch of IPv6 addresses that the customer's equipment utilizes via SLAAC. Let's work through an example:
- The ISP gives your router IPv6 prefix 2001:db8:a:b::/64[1] via DHCPv6
- Every device on the network assigns itself an IPv6 address via SLAAC
- Each device on the network is globally routed via a unique IPv6 address. For example the server box is 2001:db8:a:b::1, the router is 2001:db8:a:b::5:1, and my phone is 2001:db8:a:b::c:d.
- I want to log into the server box. I ssh in to the server box's address directly.
- The router permits SSH traffic to my server box through its firewall
- Success! I am logged in.
It seems pretty straightforward, but there is a catch: when your DHCPv6 lease expires (e.g. 1 week), there is a possibility your network is assigned a new IPv6 prefix. Hence the IPv6 address can be thought of as dynamic.
In order to get around this limitation, one can use Dynamic DNS and update an AAAA DNS record for every host on the LAN that needs to be accessed via the internet. This means you'll either have to run a Dynamic DNS client on each host, or use some sort of host discovery to set Dynamic DNS AAAA records on another host's behalf. Seems a bit confusing; in short, when using IPv6 on a home network, special care is needed to ensure SSH is accessible.
Allowing ssh logins over IPv4
Looking at IPv4, some things differ. As in IPv6's case, one needs to allow SSH traffic to the target box on the home network. What differs is one does not directly address the internal host; instead one has to address the router (which is usually plugged directly into the modem/internet). The router then passes this SSH traffic to the box on the home network. In effect the router is translating between different network segments. The feature of your router passing traffic on a port to another host on the home network is known as port forwarding.
Now that we understand the main difference, we should also discuss how IPv4 addresses are given to home ISP users. Typically the mechanism of how your home router gets an IPv4 address is the same as how your personal devices get addresses from the router. In both cases the device requesting an address uses DHCP to solicit an IPv4 address and related configuration (such as DNS servers and default gateway). Again the same process is applied to each device on the home network. They simply get IPv4 addresses from the router, but in this case the router does not give out globally routable addresses; it gives out private IPv4 addresses instead. These private IPv4 addresses are not globally routable, and instead depend on the router to figure out where to send internet traffic. Also, as with DHCPv6, there is a finite lifespan for DHCP leases. As such, after a while your router's IPv4 address may change.
Let's work through an IPv4 example:
- Your router solicits an IPv4 address from the ISP, such as 203.0.113.1.[2] This is the only internet-routable address used in the setup.
- Each device on the network solicits an IPv4 address from the router.[3] Each device has its own private IPv4 address.[4] In this example the server box has 192.168.1.20, the router has 192.168.1.1, and my phone has 192.168.1.101.
- Internet traffic is directed towards the router (acting as the default gateway), which in turn translates the network traffic. This is done by replacing the private IPv4 address and source port with the public IPv4 address and a "session port", which is used to uniquely map between traffic belonging on the internet side to traffic belonging on the home network side. This is network address translation.
- When I want to log into the server box, I ssh into the router's public IPv4 address.
- The router translates the ssh traffic to the private IPv4 address space and sends it off to the server box. The server box then handles the ssh session sent from the router, and the router relays traffic back and forth between the server box on the private LAN and my internet computer.
Seems a bit convoluted too. But it does work. When combined with Dynamic DNS to update an A DNS record for the router's public IPv4 address, it can work pretty flawlessly.
What can go wrong?
When using any of the above approaches to set up SSH login, a few things can go wrong.
- In both IPv4 and IPv6, the router depends on the ISP to assign internet addresses. As such in both cases, Dynamic DNS or a similar mechanism is needed to ensure I can log in via a DNS record, or at least know what address works when using SSH. If I don't know the latest internet address, I cannot log in.
- The router has to be specially configured to allow traffic in both IPv4 and IPv6 situations. Using IPv6, the router must permit traffic to the server box. Using IPv4, the router has to modify network traffic it receives on the SSH port and pass it on to the server box. And then it does the same to send traffic back to my internet device. If the router configuration breaks for whatever reason, I cannot log in.
- Some internet connections cannot receive SSH traffic. Most mobile internet service providers (such as AT&T) use Carrier Grade NAT to allow multiple internet subscribers to connect to the internet using the same public IPv4 address. Port forwarding won't work with Carrier Grade NAT. Additionally some ISPs do nefarious things like blocking ports for you. After all it's pretty normal for ISPs to block SMTP (even if I think it's bonkers).
In short the network operator must cooperate with you to make this happen. In some cases the hardware isn't capable either.
A few options
Before continuing I'd like to point out that, like most technology, the exact same tools every single user uses daily (in one form or fashion) can be used to attack the same systems. As such the next example is for demonstration purposes. Data is sent in plain text and no authentication/verification of host or user is performed. Computer networks are only as secure as their users ;).
Punt the socket using socat
One can use socat to (1) run a socat client on the home server box that connects to an internet server and relays all data to the home server box's SSH port, and (2) run a socat server on the internet host which relays traffic it receives on one port to the socat client connected on another port.
Let's try this out: given host public.example.com and your LAN host not-reachable.lan, run socat TCP-LISTEN:9991,fork,forever TCP-LISTEN:9992 on public.example.com, then run socat TCP-CONNECT:public.example.com:9991,fork,forever TCP-CONNECT:localhost:22 on not-reachable.lan. Now run ssh -p 9992 public.example.com. You should see the familiar messages of OpenSSH.
Sounds a bit confusing, because it is. Additionally there is not an easy mechanism to ensure multiple ssh sessions can be used simultaneously, and to ensure the authenticity of the socat client that connects to the internet server. So I wouldn't use this in production ;). Note: socat can use OpenSSL, which can address the authenticity problem, but I still don't think this is a very intuitive way to solve the problem.

Better: Using OpenSSH
During a recent episode of Linux Unplugged, there was discussion of using OpenSSH to dial in to an internet-reachable host. The secret sauce is to use the RemoteForward option (ssh -R). This can be achieved like ssh -R 2222:localhost:22 my-server.example. Then from the server, one can run ssh -p 2222 localhost to log into the firewalled host's SSH server. Alternately, OpenSSH also supports forwarding a SOCKS5 proxy, which can be used in conjunction with a web browser to browse web configuration UIs with little effort, or with other applications via the tool proxychains.
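A minimal sketch of the reverse tunnel wrapped in a small function. The function name open_reverse_tunnel is hypothetical, and the keepalive and failure options are additions assumed here to make a long-running tunnel less fragile; they were not part of the command quoted above.

```shell
#!/bin/sh
# Sketch: open a reverse tunnel from the firewalled host.
# open_reverse_tunnel is a hypothetical helper name.
#   $1 = internet-reachable host (e.g. my-server.example)
#   $2 = port to open on that host (e.g. 2222)
open_reverse_tunnel() {
    # -N: do not run a remote command, only forward ports
    # -R: listen on port $2 on the remote host and forward to the local sshd
    # ExitOnForwardFailure: exit if the remote port cannot be bound
    # ServerAliveInterval: send keepalives so a dead link is noticed
    ssh -N \
        -o ExitOnForwardFailure=yes \
        -o ServerAliveInterval=30 \
        -R "$2:localhost:22" \
        "$1"
}

# On the firewalled host:      open_reverse_tunnel my-server.example 2222
# Then, on my-server.example:  ssh -p 2222 localhost
```

The function blocks for as long as the tunnel is up, so in practice it would be run under a supervisor or in a terminal multiplexer.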

My Solution: Use OpenVPN
I wanted the firewalled host to be accessible at a unique network address, without need for SOCKS5 or other steps. The solution I came up with is to stand up an OpenVPN server process on the internet host, then run an OpenVPN client on the remote host. The configuration is deceptively simple, despite OpenVPN's featureful footprint.
Some OpenVPN operators allow users to log in via a username and password associated with their account, but this appears a bit complicated to set up. Instead, the OpenVPN folks recommend setting up a self-signed certificate authority to dole out TLS certificates. This is achieved using the very handy script EasyRSA, which streamlines the process of creating a Certificate Authority and issuing keypairs into a handful of very short commands.

After setting up OpenVPN like I will outline below, one can simply run ssh 10.100.0.10, if the home server's VPN "virtual" IP is 10.100.0.10. Other services hosted on the home server are also accessible by that IP.
Steps to set up (based off my playbooks)
The following steps are based off of the official OpenVPN tutorial for setting up multiple clients with their own certificates. This means a compromise of one client's private key will not compromise the integrity of other client private keys.
Set up the certificate authority & certificates
# Download EasyRSA
# (-L follows GitHub's redirect, -o names the output file)
VERSION=3.0.7
curl -L -o easy-rsa-${VERSION}.tar.gz \
  https://github.com/OpenVPN/easy-rsa/archive/v${VERSION}.tar.gz
tar -xzvf easy-rsa-${VERSION}.tar.gz
cd easy-rsa-${VERSION}/easyrsa3

# Get script usage (it does not understand --help)
./easyrsa help

# One-time CA and DH params initialization
./easyrsa init-pki
echo 'My Cool CA Name' | ./easyrsa build-ca nopass
./easyrsa gen-dh

# Do this for each OpenVPN Server. Each server name
# (e.g. "my_server") must be unique.
./easyrsa build-server-full my_server nopass

# Do this for each OpenVPN Client. Each client name
# (e.g. "my_client") must be unique.
./easyrsa build-client-full my_client nopass
The generated files used later in the install are:
pki/ca.crt - The certificate authority public certificate. Install alongside any other *.crt file. (Or maybe chain them?)
pki/private/ca.key - The certificate authority private key.
pki/dh.pem - Diffie-Hellman parameters, used by the server for the TLS key exchange. Install on the server.
pki/issued/my_server.crt - The server public certificate. Install on the server.
pki/private/my_server.key - The server private key. Install on the server.
pki/issued/my_client.crt - The client public certificate. Install on the client.
pki/private/my_client.key - The client private key. Install on the client.
I currently do not have the CA management scripted by Ansible. I am a little uncomfortable with the idea of Ansible entirely managing the CA creation followed by certificate creation. More experience with Ansible should help put my concerns at ease. Chiefly, I don't want Ansible to write out certificates from the "master CA tree". Sometimes a little manual operation is good.
Set up the OpenVPN Server (Internet-side)
See the corresponding Ansible role's tasks.
The server's openvpn.conf
should look like this:
tls-server
port 12345
proto udp
dev tun0
ca /etc/openvpn/secrets/ca.crt
cert /etc/openvpn/secrets/{{inventory_hostname}}.crt
key /etc/openvpn/secrets/{{inventory_hostname}}.key
dh /etc/openvpn/secrets/dh.pem
server 10.100.0.0 255.255.255.0
persist-key
persist-tun
ifconfig-pool-persist ipp.txt
push "route 10.100.0.0 255.255.255.0"
keepalive 10 120
comp-lzo
user openvpn
group openvpn
status openvpn-status.log
log /var/log/openvpn/openvpn.log
verb 4
This is probably the easiest part.
- Install OpenVPN: apk add openvpn
- Install the CA keypair.[5]
- Install the VPN server's keypair.
- Install the Diffie Hellman parameters file.
- Install the server's openvpn.conf
- Run the server: openvpn --config /etc/openvpn/openvpn.conf
Set up the OpenVPN Client (Your Firewalled Host)
See the corresponding Ansible role's tasks.
The client's openvpn.conf
should look like this:
client
proto udp
dev tun0
remote public.example.com 12345
nobind
resolv-retry 30
script-security 2
ca /etc/openvpn/secrets/ca.crt
cert /etc/openvpn/secrets/{{inventory_hostname}}.crt
key /etc/openvpn/secrets/{{inventory_hostname}}.key
persist-key
persist-tun
keepalive 10 120
comp-lzo
log /var/log/openvpn/openvpn.log
verb 4
- Install OpenVPN: apk add openvpn
- Install the CA public key.
- Install the VPN Client's keypair.
- Install the client's openvpn.conf
- Run the client using openvpn --config /etc/openvpn/openvpn.conf
Set up another OpenVPN client on your PC
Follow the same instructions as setting up your firewalled host, but be sure to generate a unique SSL keypair to identify your PC. If the OpenVPN server, the firewalled host OpenVPN client, and your PC's OpenVPN client are all set up correctly, you should be able to directly connect to any of the hosts participating in the VPN via the VPN's private IPv4 network.
Wishlist
This setup works swimmingly, but there are a few nits in the amount of effort involved in discovering hosts connected to the VPN. I also realized instead of only routing directly to other VPN clients, one could also join the various LANs, so I gave that some thought as well.
VPN Host Discovery
At present I scan for remote connected hosts via:
winston@snowcrash ~ $ sudo nmap -sP -PE 10.100.0.0/24
Password:
Starting Nmap 7.80 ( https://nmap.org ) at 2020-08-15 01:31 CDT
Nmap scan report for 10.100.0.1
Host is up (0.026s latency).
Nmap scan report for 10.100.0.6
Host is up (0.053s latency).
Nmap scan report for 10.100.0.14
Host is up (0.064s latency).
Nmap scan report for 10.100.0.10
Host is up.
Nmap done: 256 IP addresses (4 hosts up) scanned in 2.61 seconds
In the above output, 10.100.0.1 is the OpenVPN server address, and according to ip addr show dev tun0, my current OpenVPN client is 10.100.0.10.
Pretty icky. Alternatively, one can take a look at the server's openvpn.log:
Fri Aug 14 02:02:48 2020 us=357198 cyberdemon/127.0.0.1:42995 MULTI: primary virtual IP for cyberdemon/127.0.0.1:42995: 10.100.0.18
In both cases it's a bit tedious to figure out which IP belongs to which host participating on the VPN.
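Until something nicer is in place, the log lookup can at least be scripted. The following is a rough sketch that pulls "virtual IP, client name" pairs out of log lines shaped like the one above. The function name vpn_hosts is made up, and the pattern assumes the exact "MULTI: primary virtual IP for NAME/ADDRESS: IP" wording shown in the sample line.

```shell
#!/bin/sh
# Sketch: list "virtual-ip client-name" pairs found in an OpenVPN server log.
# vpn_hosts is a hypothetical helper name. It only understands lines of the
# form "... MULTI: primary virtual IP for NAME/ADDR:PORT: VIRTUAL_IP".
vpn_hosts() {
    sed -n 's|.*MULTI: primary virtual IP for \([^/]*\)/[^ ]*: \([0-9.]*\)$|\2 \1|p' "$@" |
        sort -u
}

# Usage, with the log path taken from the server config above:
#   vpn_hosts /var/log/openvpn/openvpn.log
```

A client that was handed different addresses over time shows up once per address, so the newest entry in the log is the one to trust.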
DNS-SD (via Avahi) might be suitable for this. Instead of scanning for available hosts, one can simply query for a DNS-SD (DNS Service Discovery) type, quite possibly either _http._tcp or _ssh._tcp. Here is one such guide.
Domain Names for VPN Hosts
A related wish (which overlaps in some ways) is to configure OpenVPN to register VPN clients' information, including their assigned IP address, with a DNS server such as dnsmasq or tinydns or even busybox's dns server. I did some skimming about this, and came to the conclusion that it's possible, but is not well understood or documented, so I decided to go with host discovery via network scanning for the time being.
Configure the VPN Clients' LAN segments to be routable from other Clients
This is a killer feature I have yet to figure out. It appears possible, but it requires a bit of further testing and is relatively tricky to set up correctly: if a VPN Client ends up using the same IP network in two different contexts, things will silently stop working without explanation. That's as intended; IP is funny like that.
Footnotes:
[2] 203.0.113.1 is part of 203.0.113.0/24 (TEST-NET-3), which is reserved for documentation in RFC 5737.
[3] Devices could be configured on the IPv4 LAN with so-called static IPv4 addresses, but usually you don't want this. You can achieve the same by assigning hosts the same IPv4 every time (which are identified by the network interface's MAC address or even the optional hostname field).
[5] The CA private key is likely not necessary and probably a serious smell. If you know anything about this, please write me explaining this.
Linux dmesg --follow (-w) not working?
For a couple months now, I have noticed that running dmesg -w
on my
workstation does not appear to print out new kernel messages. In other
words dmesg --follow
"hangs". Additionally when running tail -f
/var/log/kern.log
to monitor new dmesg messages picked up by
sysklogd
(part of syslogng), the latest messages do not come through
until sysklogd periodically "reopens" the /dev/kmsg
kernel message
buffer.
Why is this a problem?
This is a problem because I use the dmesg
log to monitor important
hardware related messages such as the kernel recognizing a USB device
or diagnosing bluetooth/wifi issues. When I plug in a USB drive, the
first thing I do is check dmesg for the following messages:
[10701.359834] usb 2-4.4: new high-speed USB device number 8 using ehci-pci
[10701.394801] usb 2-4.4: New USB device found, idVendor=12f7, idProduct=0313, bcdDevice= 1.10
[10701.394807] usb 2-4.4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[10701.394810] usb 2-4.4: Product: MerryGoRound
[10701.394813] usb 2-4.4: Manufacturer: Memorex
[10701.394816] usb 2-4.4: SerialNumber: AAAAAAAAAAAA
[10701.395182] usb-storage 2-4.4:1.0: USB Mass Storage device detected
[10701.398885] scsi host7: usb-storage 2-4.4:1.0
[10702.401161] scsi 7:0:0:0: Direct-Access Memorex MerryGoRound PMAP PQ: 0 ANSI: 0 CCS
[10702.401710] sd 7:0:0:0: Attached scsi generic sg6 type 0
[10702.651720] sd 7:0:0:0: [sde] 15654912 512-byte logical blocks: (8.02 GB/7.46 GiB)
[10702.652341] sd 7:0:0:0: [sde] Write Protect is off
[10702.652346] sd 7:0:0:0: [sde] Mode Sense: 23 00 00 00
[10702.652961] sd 7:0:0:0: [sde] No Caching mode page found
[10702.652965] sd 7:0:0:0: [sde] Assuming drive cache: write through
[10702.681473] sde: sde1 sde2
[10702.684869] sd 7:0:0:0: [sde] Attached SCSI removable disk
This output reports that a USB device was detected, where it is plugged in, what the vendor/product information is, what USB speed it is using, the size of the storage, the device name (/dev/sde), and its partitions (/dev/sde1, /dev/sde2). There are a lot of other messages written out to dmesg, such as the kernel detecting a bad USB cable, segmentation faults, and so on.
Given the importance of the above log output, I have developed a habit of running dmesg -w to monitor such kernel events. The -w tells dmesg to monitor for new messages. The long option is --follow.
In addition to dmesg -w
not working as intended, syslogng log
entries written to /var/log/kern.log
are not written as they occur;
instead the log is written in "bursts", which suggests sysklogd
occasionally reopens /dev/kmsg
, thereby reading in new log messages,
but all the timestamps are the same time for each "burst" read.
Which systems were affected?
I have two systems with a virtually identical OS installation; one is a workstation named snowcrash with an AMD FX-8350 on an ASRock M5A97 R2.0 motherboard; the other is an HP Elitebook 820 G4 named cyberdemon with an Intel Core i5-7300U. Curiously enough, the strange dmesg -w hang does not occur on cyberdemon, but does occur on snowcrash. Both hosts run Linux mainline, with both machines on 5.6.4. Looking through my /var/log/kern.log files, this behavior was apparent on a 5.4.25 kernel. As we will see later, this coincides with the affected versions that others have reported.
Additionally, I asked my friend tyil who happens to also use an AMD FX-8350 with Gentoo to check for the bug; he also had the problem on 5.6.0.
Pinpointing the bug
First thing I did was find a way to reproduce the issue. I recorded an asciinema recording. You can watch it here. I then shared the recording on IRC, hoping somebody would know of a solution. I got some helpful and encouraging feedback, but nobody knew of this particular bug.
The next step was to figure out if there was something wrong with /bin/dmesg. Running strace -o dmesg-strace.log dmesg -w shows the following pertinent lines:
openat(AT_FDCWD, "/dev/kmsg", O_RDONLY) = 3
lseek(3, 0, SEEK_DATA) = 0
read(3, "6,242,717857,-;futex hash table "..., 8191) = 79
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0x16), ...}) = 0
openat(AT_FDCWD, "/usr/lib64/gconv/gconv-modules.cache", O_RDONLY) = 4
fstat(4, {st_mode=S_IFREG|0644, st_size=26388, ...}) = 0
mmap(NULL, 26388, PROT_READ, MAP_SHARED, 4, 0) = 0x7fed92688000
close(4) = 0
futex(0x7fed925f9a14, FUTEX_WAKE_PRIVATE, 2147483647) = 0
write(1, "\33[32m[ 0.717857] \33[0m\33[33mfut"..., 97) = 97
… SNIP …
read(3, "6,1853,137347289701,-;input: Mic"..., 8191) = 128
write(1, "\33[32m[137347.289701] \33[0m\33[33min"..., 140) = 140
read(3,
The last line indicates a pending read() that never completes. Note that file descriptor 3 refers to the /dev/kmsg device. It appears nothing out of the ordinary occurs, except for the fact that the read() simply hangs.
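To make the read() and write() pairs in the trace easier to follow: each /dev/kmsg record starts with a comma-separated header (a priority prefix, a sequence number, a timestamp in microseconds, and flags), followed by a semicolon and the message text. The following is a small sketch of that translation, assuming this record layout. decode_kmsg is a made-up name and this is not how dmesg itself is implemented.

```shell
#!/bin/sh
# Sketch: turn /dev/kmsg style records into dmesg-like lines.
# decode_kmsg is a hypothetical helper name.
# Assumed record layout: "<prefix>,<sequence>,<usec timestamp>,<flags>;<message>"
decode_kmsg() {
    awk '
    /^ / { next }                    # skip continuation lines, which start with a space
    {
        sep = index($0, ";")         # the header ends at the first semicolon
        if (sep == 0) next
        n = split(substr($0, 1, sep - 1), hdr, ",")
        if (n < 3) next
        usec = hdr[3]                # third header field: microseconds since boot
        printf "[%5d.%06d] %s\n", int(usec / 1000000), usec % 1000000, substr($0, sep + 1)
    }'
}

# Example with a record taken from the strace output above:
#   printf '%s\n' '6,242,717857,-;futex hash table' | decode_kmsg
#   prints: [    0.717857] futex hash table
```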
I was a bit at a loss to explain the hung read(). I was really lost, honestly. So I went on and inspected the changes to /bin/dmesg shipped by util-linux, and did not find any sign of significant changes. I did run dmesg from master just to be sure. See the commit log of dmesg.c here. Additionally I also searched the util-linux bug tracker and found nothing relevant.
Given I had no solution yet, I decided to resort to Googling things, hoping somebody had discussed this bug before. Keywords I tried are:
- dmesg follow no longer working
- dmesg kmsg no more messages
- linux kmsg read hang
- "/dev/kmsg" hang
- "dmesg -w" hangs
None of these came up with anything useful. I was using DuckDuckGo mainly, with some Google queries sprinkled on top.
I then visited the torvalds/linux GitHub repository, searched for "kmsg", and did not find a commit that looked like a fix. I picked up from reading commits that /dev/kmsg is written to via the printk functions, so on a whim I decided to look at changes made to kernel/printk/printk.c. Reading through the commit logs of printk.c, I realized the last commit is likely the fix:
commit ab6f762f0f53162d41497708b33c9a3236d3609e
Author: Sergey Senozhatsky <protected@email>
Date: Tue Mar 3 20:30:02 2020 +0900

    printk: queue wake_up_klogd irq_work only if per-CPU areas are ready

    printk_deferred(), similarly to printk_safe/printk_nmi, does not immediately attempt to print a new message on the consoles, avoiding calls into non-reentrant kernel paths, e.g. scheduler or timekeeping, which potentially can deadlock the system.

    Those printk() flavors, instead, rely on per-CPU flush irq_work to print messages from safer contexts. For same reasons (recursive scheduler or timekeeping calls) printk() uses per-CPU irq_work in order to wake up user space syslog/kmsg readers.

    However, only printk_safe/printk_nmi do make sure that per-CPU areas have been initialised and that it's safe to modify per-CPU irq_work. This means that, for instance, should printk_deferred() be invoked "too early", that is before per-CPU areas are initialised, printk_deferred() will perform illegal per-CPU access.

    Lech Perczak [0] reports that after commit 1b710b1b10ef ("char/random: silence a lockdep splat with printk()") user-space syslog/kmsg readers are not able to read new kernel messages.

    The reason is printk_deferred() being called too early (as was pointed out by Petr and John).

    Fix printk_deferred() and do not queue per-CPU irq_work before per-CPU areas are initialized.

    Link: https://lore.kernel.org/lkml/aa0732c6-5c4e-8a8b-a1c1-75ebe3dca05b@camlintechnologies.com/
    Reported-by: Lech Perczak <protected@email>
    Signed-off-by: Sergey Senozhatsky <protected@email>
    Tested-by: Jann Horn <protected@email>
    Reviewed-by: Petr Mladek <protected@email>
    Cc: Greg Kroah-Hartman <protected@email>
    Cc: Theodore Ts'o <protected@email>
    Cc: John Ogness <protected@email>
    Signed-off-by: Linus Torvalds <protected@email>
Unfortunately, my understanding of the Linux kernel architecture is not comprehensive, let alone competent. As far as I can tell, the commit message describes the following:
- syslog/kmsg readers, which include dmesg and syslog-ng,
- certain printk functions that don't immediately attempt to print a new message to the console,
- and syslog/kmsg readers that might not wake up.
It's a bit hard for me to wrap my minimal kernel understanding around this. However, reading the linked mailing list thread clears things up significantly:
After upgrading kernel on our boards from v4.19.105 to v4.19.106 we found out that syslog fails to read the messages after ones read initially after opening /proc/kmsg just after booting. I also found out, that output of 'dmesg --follow' also doesn't react on new printks appearing for whatever reason - to read new messages, reopening /proc/kmsg or /dev/kmsg was needed. I bisected this down to commit 15341b1dd409749fa5625e4b632013b6ba81609b ("char/random: silence a lockdep splat with printk()"), and reverting it on top of v4.19.106 restored correct behaviour.
— Lech Perczak
Now that sounds like the issue I'm having! The thread also notes that the bug appears in 4.19.106 (fixed in 4.19.107; see this commit) and affects users of 5.5.9, 5.5.15, and 5.6.3 (see the PATCHv2 thread).
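For the 4.19.y series, where the thread gives 4.19.107 as the fixed release, a quick version comparison can tell whether a given kernel is new enough. This is only a sketch: the function names are made up for illustration, it assumes three-component version numbers and a `sort` that supports `-V` (a GNU coreutils extension), and it deliberately says nothing about other series, since I don't know which 5.x point releases carry the fix.

```shell
#!/bin/sh
# Strip a local suffix such as "-gentoo" from a kernel release string.
kernel_base_version() {
    printf '%s\n' "${1%%-*}"
}

# Succeed if version $1 sorts at or after version $2 (uses `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Succeed only if $1 is in the same major.minor series as $2
# and is at or after it. Assumes versions look like X.Y.Z.
same_series_ge() {
    [ "${1%.*}" = "${2%.*}" ] && version_ge "$1" "$2"
}

# Example: is the running kernel a 4.19.y release at or after 4.19.107?
running=$(kernel_base_version "$(uname -r)")
if same_series_ge "$running" 4.19.107; then
    echo "$running: 4.19.y release at or after 4.19.107"
else
    echo "$running: older 4.19.y release, or a different series (check separately)"
fi
```

For any other series, the reliable check is still whether the fix commit is present in the source tree that was built.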
Further reading related to the above commit
- The commit that broke things
- The commit that fixed things
- Regression in v4.19.106 breaking waking up of readers of /proc/kmsg and /dev/kmsg
- {PATCH} printk: queue wake_up_klogd irq_work only if per-CPU areas are ready
- {PATCHv2} printk: queue wake_up_klogd irq_work only if per-CPU areas are ready
- linux-4.19.y: Revert "char/random: silence a lockdep splat with printk()"
Applying the patch
Next step is to apply the patch in order to test and verify this fixes the issue.
Since Fall last year, I have used sys-kernel/vanilla-kernel
to
compile, install, and create an initramfs for my two machines. This is
a great ebuild because it uses a kernel .config
based off of
Archlinux's, so it is compatible with most machines. It is also
streamlined in that it does all the work for you — no more manually
configuring and remembering which make invocations are necessary to
update the kernel. It's not hard to get right, but it's not
particularly interesting in my use-case. Additionally, using
sys-kernel/vanilla-kernel
, the kernel & its modules are now
packaged, and can be distributed to my other machine as a binpkg. This
streamlines deployment significantly.
In order to add the patch to this ebuild, I simply have to drop the
patch file into /etc/portage/patches/sys-kernel/vanilla-kernel
. In
my case I chose to drop it in
/etc/portage/patches/sys-kernel/vanilla-kernel:5.6.4
because I
would rather the patch be applied only to the kernel I currently have
installed than to all versions of sys-kernel/vanilla-kernel
. This
ensures that when I upgrade to the upcoming 5.7 release (which has the
fix included), the patch won't be applied and emerge won't fail due to
the patch not applying cleanly.
The commands (commit to my /etc/portage
):
mkdir -p /etc/portage/patches/sys-kernel/vanilla-kernel:5.6.4
curl -o /etc/portage/patches/sys-kernel/vanilla-kernel:5.6.4/fix-dmesg--follow.patch \
https://github.com/torvalds/linux/commit/ab6f762f0f53162d41497708b33c9a3236d3609e.patch
emerge -1av sys-kernel/vanilla-kernel:5.6.4
An hour later and the kernel is installed. After the reboot, indeed
dmesg -w
works once again! And the log messages in
/var/log/kern.log
have timestamps that correctly reflect the kernel
time!
Conclusion
Even kernels have regressions. As discussed on IRC, I was reminded that the kernel project is not responsible for the userland, so it's possible such testcases might not be on the radar of most kernel developers. Perhaps it's the distros' responsibilities to execute integrated system testing to catch bugs like this. In any case it is still a surprise to see such a regression occur. We like to think of the kernel as this infallible magical machine that doesn't break except when you do something patently wrong, but this isn't really the case. We're all human.
I want to thank Tyil, Sergey (the patch author), Lech (the bug reporter) and some folks from the #linux IRC channel for helping me pinpoint this issue. The reader may think this is a lot of effort to go through to fix such a simple bug, but it's really important for the kernel to work: if the kernel misbehaves, anything is up for grabs. It's not like a bug-laden browser, where we accept that there will be crashing bugs. If the kernel crashes or misbehaves, the ramifications are almost as bad as failing hardware: you'll lose your application data, productivity, and trust in the operating system itself.
It is important to mention that LTS (long-term support) kernels exist. Given the amount of trouble I went through to address this issue, and the fact that I would rather not have things breaking, I don't think I should be running a mainline kernel at the moment. Perhaps I can install both side by side, then pick and choose the kernel du jour.
I am very interested to hear readers' suggestions for kernel maintenance and version selection strategies. You can find my contact details at https://winny.tech/ . Thank you for reading.
How to fix early framebuffer problems, or "Can I type my disk password yet??"
Most of my workstations & laptops require a passphrase typed in to open the encrypted root filesystem. So my steps to booting are as follows:
- Power on machine
- Wait for FDE passphrase prompt
- Type in FDE passphrase
- Wait for boot to complete and automatic XFCE session to start
Since I need to know when the computer is ready to accept the
passphrase, it is important the framebuffer is usable during the early
part of the boot. In the case of the HP Elitebook 820 G4, the EFI
framebuffer does not appear to work, and I would rather not boot in BIOS
mode to get a functional VESA framebuffer. Making things more awkward,
firmware is needed when the i915 driver is loaded, or the
framebuffer will not work either. (It’s not always clear if firmware
is needed, so one should run dmesg | grep -F firmware
and check if
firmware is being loaded.)
With this information, the problem can be summarized as: “How do I ensure i915 is available at boot with the appropriate firmware?” This question generalizes easily to any framebuffer driver, as the steps are more or less the same.
Zeroth step: Do you need only a driver, or a driver with firmware?
It is a good idea to verify whether your kernel is missing a driver at boot, is missing firmware, or both. Boot up a Live USB with good hardware compatibility, such as GRML1 or Ubuntu’s, and let’s see what framebuffer driver our host is trying to use2:
$ dmesg | grep -i 'frame.*buffer'
[ 4.790570] efifb: framebuffer at 0xe0000000, using 8128k, total 8128k
[ 4.790611] fb0: EFI VGA frame buffer device
[ 4.820637] Console: switching to colour frame buffer device 240x67
[ 6.643895] i915 0000:00:02.0: fb1: i915drmfb frame buffer device
So we can see that efifb is initially used for a couple of seconds, then
i915 is used for the rest of the computer’s uptime. Now let’s look
at whether firmware is necessary, first checking if modinfo(8)
knows of
any firmware:
$ modinfo i915 -F firmware
i915/bxt_dmc_ver1_07.bin
i915/skl_dmc_ver1_27.bin
i915/kbl_dmc_ver1_04.bin
... SNIP ...
i915/kbl_guc_33.0.0.bin
i915/icl_huc_ver8_4_3238.bin
i915/icl_guc_33.0.0.bin
This indicates this driver will load firmware when available, and if necessary for the particular mode of operation or hardware.
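To see which of those listed files are actually installed, the list can be checked against the firmware directory. A minimal sketch: `missing_firmware` is a name I made up, `/lib/firmware` is assumed as the default location, and the `.xz`/`.zst` suffixes are an assumption to cover distros that ship compressed firmware.

```shell
#!/bin/sh
# Read relative firmware paths (one per line) on stdin and print those
# that are not present under the firmware directory given as $1.
# Assumption: compressed copies may exist with a .xz or .zst suffix.
missing_firmware() {
    fw_dir=${1:-/lib/firmware}
    while IFS= read -r fw; do
        [ -n "$fw" ] || continue
        if [ ! -e "$fw_dir/$fw" ] && [ ! -e "$fw_dir/$fw.xz" ] && [ ! -e "$fw_dir/$fw.zst" ]; then
            printf '%s\n' "$fw"
        fi
    done
}

# Example (requires the i915 module to be known to modinfo):
# modinfo -F firmware i915 | missing_firmware /lib/firmware
```

A missing file is not necessarily a problem, since the driver only asks for the files that match the hardware it finds.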
Now let’s look at dmesg to see if any firmware is loaded:
[ 0.222906] Spectre V2 : Enabling Restricted Speculation for firmware calls
[ 5.511731] [drm] Finished loading DMC firmware i915/kbl_dmc_ver1_04.bin (v1.4)
[ 25.579703] iwlwifi 0000:02:00.0: loaded firmware version 36.77d01142.0 op_mode iwlmvm
[ 25.612759] Bluetooth: hci0: Minimum firmware build 1 week 10 2014
[ 25.620251] Bluetooth: hci0: Found device firmware: intel/ibt-12-16.sfi
[ 25.712793] iwlwifi 0000:02:00.0: Allocated 0x00400000 bytes for firmware monitor.
[ 27.042080] Bluetooth: hci0: Waiting for firmware download to complete
Aha! So it appears we need i915/kbl_dmc_ver1_04.bin
for i915. In
the case that one doesn’t need firmware, dmesg won’t show anything
related to drm
or a line with your driver name in it.
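On a machine with a lot of hardware, the firmware lines for other devices (wifi, Bluetooth) can drown out the one of interest. A small filter narrows things down; this is just a sketch, and `firmware_messages_for` is a hypothetical helper name, not an existing tool.

```shell
#!/bin/sh
# Filter kernel log lines on stdin down to firmware messages that also
# mention the given driver name (fixed-string, case-insensitive match).
firmware_messages_for() {
    grep -i -F -- 'firmware' | grep -i -F -- "$1"
}

# Example: dmesg | firmware_messages_for i915
```

No output (and a non-zero exit status) suggests the driver did not log any firmware activity, which matches the "no firmware needed" case described above.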
By the way, it is a good idea to check dmesg for hints about missing firmware or alternative drivers. For example, my trackpad is supported by both i2c and synaptics based trackpad drivers, and the kernel was kind enough to tell me.
First step: Obtain the firmware
On Gentoo install
sys-kernel/linux-firmware
. You will have to agree to some non-free
licenses; nothing too inane, but worth mentioning. Now just
run emerge -av sys-kernel/linux-firmware
. (On other distros it
might be this easy, or more difficult. For example, in my experience
Debian does not ship every single firmware like Gentoo does, so
YMMV.)
Second step, Option A: Compile firmware into your kernel
Since most of my systems run Gentoo, it is business as usual to deploy a kernel with most excess drivers disabled except for common hot-swappable components such as USB network interfaces, audio devices, and so on. For example, this laptop’s config was originally derived from genkernel’s stock amd64 config with most extra drivers disabled, then augmented with support for an Acer ES1-111M-C7DE, and finally with support for this Elitebook.
I had compiled the kernel with i915 support built into the image, as
opposed to an additional kernel module. Unfortunately this meant the
kernel was unable to load firmware from the filesystem, because it
appears only kernel modules can load firmware from the filesystem. To
work around this without resorting to making i915 a kernel module,
we can include the firmware within the kernel image (vmlinuz
).
Including firmware and drivers both in the vmlinuz has a couple of
benefits. First, it will always be available. There is no need to
figure out how to load the driver and firmware from initrd, let
alone to get the initrd generator one is using to cooperate. A
downside is that it makes the kernel very specific to the machine,
because perhaps a different Intel machine needs a different firmware
file compiled in.
To achieve including the firmware in kernel, I set the following
values in my kernel config (.config
in your kernel source tree).
CONFIG_EXTRA_FIRMWARE="i915/kbl_dmc_ver1_04.bin"
CONFIG_EXTRA_FIRMWARE_DIR="/lib/firmware"
Note, if you’re using menuconfig, you can type /EXTRA_FIRMWARE
(slash for search, then the text) followed by keyboard return to
find where these settings exist in the menu system.
Then I verified i915 is indeed not a kernel module, but built into
the kernel image (it would be m
if it’s a module):
CONFIG_DRM_I915=y
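Before kicking off a long build, it may be worth confirming that the files named in the config really exist, since a wrong path only shows up as a build failure later. A rough sketch follows; `kconfig_value` and `check_extra_firmware` are made-up helper names, and falling back to `/lib/firmware` when `CONFIG_EXTRA_FIRMWARE_DIR` is absent is my assumption.

```shell
#!/bin/sh
# Print the value of one option from a kernel .config file.
# Fails (non-zero exit) if the option is unset or commented out.
# $1 = path to .config, $2 = option name such as CONFIG_DRM_I915
kconfig_value() {
    sed -n "s/^$2=//p" "$1" | grep .
}

# Succeed if every file listed in CONFIG_EXTRA_FIRMWARE exists under
# CONFIG_EXTRA_FIRMWARE_DIR. Assumption: fall back to /lib/firmware
# when the directory option is absent from the file.
check_extra_firmware() {
    fw_list=$(kconfig_value "$1" CONFIG_EXTRA_FIRMWARE | tr -d '"')
    fw_dir=$(kconfig_value "$1" CONFIG_EXTRA_FIRMWARE_DIR | tr -d '"')
    [ -n "$fw_list" ] || { echo "CONFIG_EXTRA_FIRMWARE is empty" >&2; return 1; }
    # The option holds a space-separated list, so word splitting is intended.
    for fw in $fw_list; do
        [ -e "${fw_dir:-/lib/firmware}/$fw" ] || {
            echo "missing: ${fw_dir:-/lib/firmware}/$fw" >&2
            return 1
        }
    done
}

# Example, run from the kernel source tree:
# kconfig_value .config CONFIG_DRM_I915    # prints y when built in
# check_extra_firmware .config && echo "firmware files found"
```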
After compiling & installing the kernel (and generating a dracut initrd for cryptsetup/lvm), I was able to reboot and get an early pre-mounted-root framebuffer on this device.
Second step, Option B: A portable kernel approach (using sys-kernel/vanilla-kernel
)
I discovered the Gentoo devs have begun shipping an ebuild that builds and installs a kernel with a portable, livecd friendly config. In addition, this package will optionally generate an initrd with dracut as a pkg_postinst step, making it very suitable as a replacement for users who just want a working kernel and don’t mind excessive compatibility (at a cost to size and build time).
This presents a different challenge, because while this package does
allow the user to drop in their own .config, it is not very
multiple-machine-deployment friendly to hard-code each individual
firmware into the kernel. Instead we tell dracut to include our
framebuffer driver. As mentioned above I found this computer uses
the i915
kernel driver for framebuffer. Let’s tell dracut to
include the driver:
cat > /etc/dracut.conf.d/i915.conf <<EOF
add_drivers+=" i915 "
EOF
Dracut is smart enough to pick up the firmware the kernel module
needs, provided it is installed. To get an idea what firmware
dracut will include, run modinfo i915 -F firmware
which will print
out a bunch of firmware relative paths.
After applying this fix, just regenerate your initrd using dracut; in
my case I let portage do the work:
emerge -1av sys-kernel/vanilla-kernel
. Finally reboot.
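Before rebooting, it can be reassuring to confirm that the new initrd really contains the driver and its firmware. dracut ships `lsinitrd`, which lists an image's contents; the filter below is only a sketch that reads such a listing on stdin. The function name is made up, and it assumes the firmware lives in a directory named after the driver (true for i915, not in general).

```shell
#!/bin/sh
# Read an initramfs file listing on stdin and succeed if it contains both
# the named kernel module (optionally compressed, e.g. .ko.xz) and at
# least one file under firmware/<name>/.
# Assumption: the firmware directory is named after the driver.
initrd_has_driver_and_firmware() {
    listing=$(cat)
    printf '%s\n' "$listing" | grep -q -E "/$1\.ko(\.[a-z0-9]+)?\$" || return 1
    printf '%s\n' "$listing" | grep -q -F "firmware/$1/" || return 1
}

# Example (may need root to read the image):
# lsinitrd | initrd_has_driver_and_firmware i915 && echo "i915 and firmware present"
```

Run without arguments, `lsinitrd` inspects the default image for the running kernel, so pass the image path explicitly when checking a freshly built kernel that is not yet booted.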
Conclusion
Check dmesg. Always check dmesg. We found two ways to deploy firmware, in-kernel and in-initrd. The in-kernel technique is best for a device-specific kernel, the in-initrd is best for a portable kernel. I am a big fan of the second technique because it scales well to many machines.
I did not touch on the political side of using binary blobs. It would be nice to not use any non-free software, but I would rather have a working system with a couple of small non-free components than a non-working system. Which is more valuable: your freedom, or the full capacity of your tools?
Footnotes:
GRML is my favorite live media. It is simple and to the point, and has lots of little scripts to streamline tasks such as setting up a wireless AP, an iPXE netboot environment, or a router, installing Debian, and so on. Remastering is relatively straightforward. It also has a sane GUI suitable for any machine (fluxbox).