- How do I pronounce Medik8s?
- Does Medik8s require OpenShift?
- Does Medik8s require Machine API?
- Does Medik8s require special hardware?
- Does Medik8s work on bare metal only?
- Do all nodes need to be treated the same?
- Can I create my own definition of what counts as a healthy node?
- Can I create my own mechanism for recovering a node?
- How can I get involved?
- What is the relationship to sig-cluster?
- What is the relationship to Cluster/Machine API?
- What is the connection to Machine Healthcheck Controller?
- What is the relationship to External Remediation API?
- Is a company behind this?
Medik8s is intended to be a playful misspelling of the English word “medicates” and is pronounced the same way.
No. Medik8s can run on any Kubernetes cluster.
No. Medik8s puts Nodes at the center of failure detection and recovery and can run on any Kubernetes cluster.
No. While Medik8s can take advantage of hardware watchdogs and/or BMCs, it also has options for shared-nothing recovery.
No. Medik8s operators can work on any platform, unless specified otherwise by a specific remediator.
No. The Node Healthcheck configuration includes a node selector, so you can treat the control plane differently to workers, and have pools of workers with different conditions and thresholds to provide a variety of SLAs.
Yes. Node Healthcheck determines node health based on NodeConditions. Kubernetes has a set of basic conditions built in, but additional conditions can be defined and then referenced by Node Healthcheck. Node Problem Detector is a common tool for creating and updating NodeConditions based on log scraping.
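As a rough illustration of the idea only, the following Python sketch shows how a health check could combine a node selector with condition rules. It is not the Node Healthcheck implementation or its API: the names `NodeCondition`, `UnhealthyRule`, `matches_selector` and `is_unhealthy`, and the "a condition has held a status for at least some duration" rule, are assumptions made for this example. `KernelDeadlock` stands in for a custom condition of the kind Node Problem Detector can report.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class NodeCondition:
    type: str                  # e.g. "Ready", or a custom type such as "KernelDeadlock"
    status: str                # "True", "False" or "Unknown"
    last_transition: datetime  # when the condition last changed status


@dataclass
class UnhealthyRule:
    # Hypothetical rule: a node is unhealthy once `type` has reported
    # `status` continuously for at least `duration`.
    type: str
    status: str
    duration: timedelta


def matches_selector(labels: dict, selector: dict) -> bool:
    """Return True if every key/value pair in the selector is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def is_unhealthy(conditions, rules, now: datetime) -> bool:
    """Return True if any rule matches a condition that has persisted long enough."""
    for rule in rules:
        for cond in conditions:
            if (cond.type == rule.type
                    and cond.status == rule.status
                    and now - cond.last_transition >= rule.duration):
                return True
    return False


# Example: workers are unhealthy after 5 minutes of NotReady,
# or as soon as the custom KernelDeadlock condition has been true for 1 minute.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
worker_selector = {"node-role.kubernetes.io/worker": ""}
worker_rules = [
    UnhealthyRule("Ready", "False", timedelta(minutes=5)),
    UnhealthyRule("KernelDeadlock", "True", timedelta(minutes=1)),
]

node_labels = {"node-role.kubernetes.io/worker": "", "zone": "a"}
node_conditions = [
    NodeCondition("Ready", "True", now - timedelta(hours=3)),
    NodeCondition("KernelDeadlock", "True", now - timedelta(minutes=2)),
]

selected = matches_selector(node_labels, worker_selector)
unhealthy = selected and is_unhealthy(node_conditions, worker_rules, now)
print(selected, unhealthy)
```

A second selector and rule set with different durations could be evaluated the same way for control plane nodes, which is the kind of per-pool treatment the selector makes possible.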
Yes. Node Healthcheck uses the sig-cluster’s External Remediation API to uniquely associate a node failure with a specific recovery mechanism of your choosing.
The Medik8s team has worked with the sig-cluster community for many years. While we have many things in common, they are naturally focussed on furthering the Machine/Cluster APIs. Basing our solution on those APIs would limit the types of clusters we can provide a solution for.
The original implementation put Machines at the center of failure detection and exclusively used the Machine API for recovery. Node Healthcheck Controller can use the Machine API if it is available, but also supports other mechanisms.
The primary difference between the two implementations is putting Nodes at the center of failure detection to avoid a dependency on Machine objects, which are not common to all Kubernetes installations.
The original MHC implementation assumed that using the Machine API to destroy the bad node and replace it with a new one was the only necessary recovery mechanism.
The Medik8s team partnered with Ericsson to convince the sig-cluster community that other mechanisms were needed (particularly on bare metal). Together we created the External Remediation API that is used by both the Machine and Node Healthcheck Controllers.
The Medik8s team is employed at Red Hat, where we leverage 20 years of personal experience creating HA architectures to create a Kubernetes-native HA experience for workloads such as StatefulSets and RWO Volumes.