Scale Cockpit troubleshooting to your computing fleet

You might know Cockpit as a troubleshooting tool for individual machines. But once you discover and test a solution, wouldn’t it be nice to apply it to all the other machines in your data center?

Of course, not every problem works this way. You wouldn’t extend LVM with a new hard disk on a hundred machines at the same time. But there are situations where applying the same task across a multitude of computers makes sense.

In this example, we will adjust the SELinux policy to our needs and apply it elsewhere.

Cockpit 210 introduced a new approach that we want to explore: asking the computer to “show me what I have done”. It’s now easy to see how a machine’s SELinux policy differs from the defaults, either in human-readable form, as a shell script, or as an Ansible role. The shell script and Ansible role are both suitable for automation across multiple machines.

[Image: SELinux automation script, Ansible]
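To make the "automation across multiple machines" idea concrete, here is a minimal sketch of pushing an exported shell script to several hosts over SSH. It is an illustration under stated assumptions, not part of Cockpit: the helper names `build_ssh_command` and `apply_script`, the host names, and the one-line `semanage` script are all hypothetical stand-ins for whatever Cockpit exports on your machine. It also assumes key-based SSH access with enough privileges to change SELinux policy on each host.

```python
import shlex
import subprocess
from typing import Dict, List, Optional, Sequence


def build_ssh_command(host: str, remote_shell: str = "bash -s") -> List[str]:
    """Return the argv that runs a script read from stdin on one host."""
    # BatchMode=yes makes ssh fail instead of prompting for a password,
    # which matters when looping over many machines unattended.
    return ["ssh", "-o", "BatchMode=yes", host, remote_shell]


def apply_script(
    hosts: Sequence[str], script_text: str, dry_run: bool = True
) -> Dict[str, Optional[int]]:
    """Feed script_text to each host in turn.

    Returns a mapping of host to exit code, or to None on a dry run.
    """
    results: Dict[str, Optional[int]] = {}
    for host in hosts:
        argv = build_ssh_command(host)
        if dry_run:
            # Only show what would be executed; nothing is contacted.
            print("would run:", shlex.join(argv))
            results[host] = None
            continue
        # The script is sent on stdin and executed by the remote shell.
        proc = subprocess.run(argv, input=script_text, text=True)
        results[host] = proc.returncode
    return results


# Hypothetical stand-in for a script exported from Cockpit.
EXAMPLE_SCRIPT = "semanage boolean --modify --on httpd_can_network_connect\n"

# Hypothetical host names; dry_run=True keeps this safe to execute anywhere.
apply_script(["web1.example.com", "web2.example.com"], EXAMPLE_SCRIPT, dry_run=True)
```

The dry run is the default so that the list of target machines can be reviewed first. For anything beyond a handful of hosts, the exported Ansible role is likely the better fit, since Ansible already handles inventory, parallelism, and error reporting.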

To see SELinux debugging, fixing, and deploying in action, watch this short three-and-a-half-minute demo covering the whole workflow. It walks through the steps from “there’s a problem I need to debug” to “I’ll figure out a solution with Cockpit” and finally to “I’ll apply this tested solution to my entire fleet of computers”.

While this approach works for SELinux, not every part of Cockpit can be used in a similar manner. First and foremost, Cockpit always shows the server’s current state, not its configuration. At a glance, the differences between static configuration, dynamic state (like the IP address of a DHCP network card), and hardware properties (like the number, capacity, and serial numbers of hard disks) are not necessarily obvious.

But there are certainly more configuration-related places and actions in Cockpit. For example:

  • Enabling or disabling Simultaneous Multi-Threading to mitigate CPU vulnerabilities
  • Enabling or disabling PCP, firewall, kdump, or general systemd units
  • Setting up automatic package updates

Would it be useful to you to copy these settings and apply them across other machines? Do you have other ideas and uses for something like this? If so, please give us your feedback!