DevOps'ish 231: Kubernetes 1.22 release team livestream, problems in Perl, glibc, eBPF, Pod Security Admission, secure supply chains, tools galore, and more

My military service and tech worlds collided this week. I can’t say much about it yet, but I’ve been insanely busy with an array of things I never thought I’d need to do. More to come later. Join the DevOps’ish subreddit and talk about how bad the intro was. Or how dope the notes page is for this issue.

People

Cloud Tech Tuesdays: Kubernetes 1.22
Josh Berkus, Amy Marrich, and I sat down for a livestream with Savitha Raghunathan, James Laverack, Jesse Butler, and Guinevere Saenger to discuss all things Kubernetes and the Kubernetes 1.22 release.

Free eBook: Docker Security Essentials by HackerSploit
Docker is a popular platform to quickly create, deploy, and host web applications, databases, and other business-critical solutions. Learn how to audit and secure Docker in this comprehensive guide and 9-part video series. Download instantly – no registration required. SPONSORED

Samsung’s leader is out of jail, allowing US factory plans to move forward
“Samsung heir served 18 months in prison for capital flight and perjury.” Can someone who has a great understanding of Korean business and politics please reply to this email? I have no idea how this works, and I want to understand it before I label it anything. ...

August 22, 2021 · 7 min · Chris Short

DevOps'ish 230: Complex Systems == No Single Root Cause, WFHers juggling two jobs, Service Reliability Math, eBPF Foundation, Dashboards, Tools from Black Hat and more

Another week, another bout of bad weather. Systems here in our home have gotten a bit more robust since our multi-day total blackout. I took a meeting this week in a house with no power. The meeting was short, but it demonstrated that if everything goes to hell in a handbasket, my systems are redundant enough to enable me to pass whatever batons when needed. But, lately, it’s felt like a lot. You can feel the cost of communication when a cacophony of UPSes suddenly fills the house. Luckily, power was restored before we went to bed that night. But what came later was something of a surprise. In 36 hours, Michigan received almost a quarter of its annual total of lightning strikes (a lot of them cloud to ground). While this didn’t seem to affect services we consume, I can only imagine the hell it created for first responders of all stripes.

One of the worst incidents I was part of was a lightning strike that hit a datacenter’s generator transfer switch. It kicked off a chaotic series of events that caused HVAC systems to go offline. The storm that night was hellacious too. A datacenter generating enough heat to make network switches act up is a miserable series of events. There was no single root cause. Multiple systems failed or malfunctioned in ways nobody had planned for or thought of. The fact that we weren’t up and running once temperatures started to cool down unlocked a new mystery that ultimately led us to restart our core switches because the heat had thrown the ASICs out of whack.

But there was never a single root cause. You could say the lightning strike was the root cause. But that hit systems outside the datacenter, and those were related to power. Our systems went down because core switching had overheated. Cooling units inside the datacenter reset but didn’t start using refrigerant until they were reset again in a particular order (the cooling system was never supposed to respond the way it did).

There’s never a single root cause for a large-scale outage (John Allspaw argues the point further below). ...

August 15, 2021 · 8 min · Chris Short

DevOps'ish 229: Kubernetes 1.22, KubeCon schedule announced, security fails abound, Zoom's paltry fine, finally death to 996, NSA Kubernetes Hardening Guidance, and much more

Kubernetes 1.22 shipped this week. I suggest you, at a minimum, read the release blog post or take a gander at the CHANGELOG, and definitely read the “No, really, you MUST read this before you upgrade” section. Some of the bigger changes:

- Audit log files are created with mode 0600 (owner read/write only)
- Rootless mode containers moving to alpha: In my opinion, if you use Podman, you’re used to this. If you’re not, you should be using rootless containers intentionally for security reasons (more on that later).
- Cgroupsv2 moving to alpha
- Pod Security Policy replacement (aka Pod Security Admission Controller): Yes, PSPs are deprecated and being replaced. There are a lot of reasons why.
- LoadBalancer moving to beta
- Enable seccomp by default
- and a whole bunch more

KubeCon NA 2021 acceptances went out this week and the schedule is live. I’m excited to say I’m teaming up with Kaslin Fields, Bart Farrell, Matthew Broberg, and Kunal Kushwaha for a panel talk about what we’ve been doing in the Kubernetes Upstream Marketing Team (which includes the @K8sContributors Twitter handle and so much more). ...

August 8, 2021 · 6 min · Chris Short

DevOps'ish 228: Natural disasters, GitOps with Codefresh, NSO Group, MeteorExpress, Linkerd, Kubernetes 1.22, TSMC’s 2nm chips, cloud outposts, and more

At 8:13 PM last Saturday, the family and I were gathered in our basement, evading a tornado warning that came through the area. The storm spawned three tornadoes. Luckily, we weren’t hit directly. But we lost power, internet, and cell service. After getting the all-clear and assessing the situation, it was clear that we would be without power for quite a few hours. Making a newsletter last week wasn’t happening. It was technically impossible, and to be honest, I had a big ole stack of higher priorities come in. Then a few hours turned into a few days without these services. Luckily, we have a gas stove and water heater. I spent Monday morning frantically trying to find a place with the trifecta of power, internet, and cell service. It didn’t exist within a twenty-minute radius of our house. We spent over 44 hours without power. We were lucky we didn’t have to wait much longer than that. The roof that I thought was damaged wasn’t (the shingles in our yard weren’t ours 😬😬😬). Cell service came back up in the morning on Tuesday. ...

August 1, 2021 · 7 min · Chris Short

DevOps'ish 227: So hot right now, Sunk Cost Fallacy, Right to Repair, future of tech events, HelloKitty ransomware now targets VMware ESXi, GitHub Copilot, and more.

I was struck with a very mild case of heat exhaustion a couple of weeks ago after standing over a hot grill hosting our family’s 4th of July party. So when the article “How hot is too hot for the human body?” came across my desk this week, I was uniquely interested in it. I’ve run several miles in the Middle East, the high plains of Colorado, Florida, the jungles of Honduras, and many points in between. “This shouldn’t impact me like it is,” I thought. Why is heat such a deadly factor in cooler climates? Why did I get slammed by this one hot day? I discovered, “While most researchers agree that a wet-bulb temperature of 95 °F is unlivable for most humans, the reality is that less extreme conditions can be deadly too. We’ve only hit those wet-bulb temperatures on Earth a few times, but heat kills people around the world every year.” Oh… “Residents of cooler places are also just less acclimatized to the heat, so wet-bulb temperatures below 95 °F can be deadly.” ...

July 18, 2021 · 8 min · Chris Short