A trying week capped off by trigger point injections. Long story short, I’ve been trying to get a family out of Afghanistan for the past two weeks to no avail. I won’t bore you with info or divulge identifying details. But, the possibility for their safe passage to the US has pretty much gone to 0. It’s hard telling a 16-year-old kid that you’ve exhausted all your resources. You can only offer tidbits of info. HUGE shoutout to the team behind Ehtesab for enabling me to get SOME intel from folks on the ground. The situation itself is a failure.

A failure on multiple levels. But it’s a stark reminder that you have to experiment and sometimes try every possible way to get a solution into production. Can you deploy this feature behind a feature flag, or do you need a canary or blue/green deployment? At what layer are you going to manage THAT? Your global load balancer? Maybe inside your application stack on a keepalived instance? Perhaps it’s better to handle this in your Kubernetes cluster by managing replica sets or ingresses. Once you get past that decision, there are many more along the way. Then it’s “go time.” Your solution is ready to handle some production traffic.
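
For the feature flag option, here is a minimal sketch of a percentage-based rollout handled inside the application. The function name `in_rollout`, the flag name, and the user IDs are hypothetical, and real feature flag services layer targeting rules and kill switches on top of this idea. Hashing the flag and user together keeps each user in the same bucket across requests, so raising the percentage only ever adds users.

```python
import hashlib


def in_rollout(user_id: str, flag_name: str, percent: int) -> bool:
    """Return True if this user falls inside the rollout percentage.

    The flag name and user ID are hashed together so that each flag
    gets its own stable assignment of users to buckets 0..99.
    """
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < percent


if __name__ == "__main__":
    # Hypothetical flag and users, for illustration only.
    users = [f"user-{n}" for n in range(1000)]
    enabled = sum(in_rollout(u, "new-checkout", 10) for u in users)
    print(f"{enabled} of {len(users)} users see the new code path")
```

Setting the percentage to 0 turns the feature off for everyone without a redeploy, which is the main appeal over a canary at the load balancer or ingress layer.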

You start seeing increased errors, so you roll back. This release isn’t a failure, though. Maybe it’s a bug somewhere else in the system. Maybe your environments aren’t as consistent as you thought. Perhaps you’re lacking a critical piece of information that somehow didn’t make it to you (a recent change in the system, like admission controllers being enabled in prod). This is why having a team that you can actively collaborate with is vital. You can do this in person or via Slack/Teams/etc. Regardless, your peers might have seen behaviors like this before, and they know the answer. Better yet, the team rallies around the error and figures out your ingress isn’t configured quite right (or you forgot a crucial new config). Remember the concept of above the line and below the line. You might have an excellent understanding of the abstractions, but maybe not the 20% use case people bump into.
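
The "increased errors, so you roll back" call at the start of this scenario can be sketched as a simple comparison of error rates. This is an illustration under assumptions, not a recommendation: the function name `should_roll_back`, the one percentage point tolerance, and the minimum request count are all hypothetical values you would tune for your own traffic.

```python
def should_roll_back(
    baseline_errors: int,
    baseline_requests: int,
    canary_errors: int,
    canary_requests: int,
    tolerance: float = 0.01,   # hypothetical: allow 1 point of extra error rate
    min_requests: int = 100,   # hypothetical: ignore tiny samples
) -> bool:
    """Return True when the new release errors noticeably more than the old one."""
    if canary_requests < min_requests:
        # Not enough traffic yet to judge the new release either way.
        return False
    baseline_rate = baseline_errors / baseline_requests if baseline_requests else 0.0
    canary_rate = canary_errors / canary_requests
    return canary_rate > baseline_rate + tolerance


if __name__ == "__main__":
    # Made-up counts: 1% errors on the old release, 5% on the new one.
    print(should_roll_back(10, 1000, 50, 1000))
```

A check like this only tells you to roll back. Working out why the errors rose is still the part where your team comes in.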

The point is, test, validate, get peer reviews, and use your people network (internal or external) to the best of your ability. People are your greatest asset, no matter what part of the stack you’re working on. People hold knowledge and capabilities. Never stop growing those networks. They will pay off at some point. Yes, soft skills matter A LOT more than your coding, configuration, and tests.

People

Burning out and quitting
A friend who thought I needed to read this sent it to me. Maya Kaczorowski made burnout clearer for me. The jury is still out on whether I’m burned out or just tired of my injuries affecting my work performance.

Easy steps on how to secure your Kubernetes cluster by installing Teleport, an open-source, identity-aware access proxy.
Teleport allows engineers and security professionals to unify access for SSH servers, Kubernetes clusters, web applications, and databases across all environments. Learn more at https://goteleport.com/docs/ SPONSORED

A Third Of Stitch Fix’s Workers Have Quit En Masse
This seems like a calculated decision by at least one of the parties involved. The cause and effect is clear. If you take away flexibility, you’re going to lose talent.

Programmers Don’t Understand Hash Functions
“I don’t blame software developers for their lack of understanding on anything I’m going to discuss. The term “hash function” can accurately represent any of the following [five] disparate topics in computer science…”

It’s Time for Police to Stop Using ShotSpotter
Bad AI leads to really bad outcomes.

GitOpsCon Schedule
GitOpsCon is a KubeCon day zero event and the schedule of speakers is certified dope.

The secret bias hidden in mortgage-approval algorithms
“An investigation by The Markup has found that lenders in 2019 were more likely to deny home loans to people of color than to white people with similar financial characteristics — even when we controlled for newly available financial factors that the mortgage industry has in the past said would explain racial disparities in lending.”

Process

Google Cloud Status
Google had an unusual reason for downtime in australia-southeast2. “The issue was transient voltage at the feeder to the network equipment, causing the equipment to reboot. In order to mitigate the issue, traffic within the australia-southeast2 region was redirected temporarily.”

Infrastructure as Code Automation for Terraform and GitOps workflows
Code, No Manual Processes. Automate Terraform tasks, reduce errors and drifts, improve security and auditability of your infrastructure. env0 automates and simplifies the provisioning of cloud deployments for Terraform, Terragrunt and GitOps workflows. SPONSORED

Misaligned factory robot may have sparked Chevy Bolt battery fires
Read about how one robot at LG Chem had a drastic impact downstream. Only the future will tell if the Chevy Bolt can get past this Pinto-esque moment.

30 years of Linux: OS was successful because of how it was licensed, says Red Hat
“On the 30th anniversary of the announcement of Linux by Linus Torvalds, Red Hat has said that it all worked out because of the way the OS was licensed.”

‘Lifelong learning will be normalised and seen as essential’
I’ve been saying this for a long time. Resting on your laurels is only going to hurt you these days.

SIG Docs needs your help!
“TL;DR: SIG Docs needs help from you (or your company) to continue providing quality docs. Are you interested in helping contribute, or taking on a leadership role, in the open source community? Reach out to us on slack or reply here!”

After Razer, SteelSeries Software Also Hit by Zero-Day Vulnerability, SteelSeries Responds (Update)
This might be the quickest way to break into Windows systems these days.

Tools

Conditionally setting your gitconfig
I did not realize you could do conditions in gitconfigs. This is kind of a game changer.

Manage incidents directly from Slack 🧑‍🚒
Rootly helps automate the tedious manual work like creating incident channels, searching for runbooks, documenting the postmortem timeline, and more. Teams sized 20 to 2000 manage hundreds of incidents daily and save thousands of engineering hours a year within Rootly. Get started in <5min or book a demo to learn more and get Starbucks ☕ on us! SPONSORED

GitOps Guide to the Galaxy (Ep 21): RBAC Revisited
“Failure is not an option! We revisit the subject of Role-based access control (RBAC) with three, yes count them THREE, different patterns. We will be taking a look at SSO, Authentication, and Git workflows approach to RBAC. So join us to learn more about the dynamic duo of GitOps and RBAC. If you missed the previous episode about RBAC, watch it here: https://youtu.be/XsiPPjnKFGw”

A New Tool Wants to Save Open Source From Supply Chain Attacks
If you didn’t read this article or want to get up to speed on Sigstore, join us on Tuesday at 10 AM ET/1400 UTC as we sit down with the one and only, Luke Hinds, creator of Sigstore. Check out the Red Hat Livestreaming calendar for details.

A Checklist of Cloud Security Orienteering
“How to orienteer in a cloud environment, dig in to identify the risks that matter, and put together actionable plans that address short, medium, and long term goals.”

Enable seccomp for all workloads with a new v1.22 alpha feature
“Kubernetes v1.22.0 introduces a new kubelet feature gate SeccompDefault, which has been added in alpha state as every other new feature. This means that it is disabled by default and can be enabled manually for every single Kubernetes node.”

Managing Kubernetes seccomp profiles with security profiles operator
“In my last blog, we learned about setting the RuntimeDefault flag in Kubernetes which configures all Pods to use a specific seccomp profile. While this is a great addition to improve your Kubernetes security posture, the runtime default seccomp profile might expose more syscalls than your application needs. In addition, the creation and management of seccomp profiles is cumbersome and error prone for cluster administrators. This is problem space is what the security profile operator sets out to solve. In this blog we’ll cover the features of the security profile operator and how to use it.”

metacontroller/metacontroller
“Writing kubernetes controllers can be simple”

cloud-native-toolkit/multi-tenancy-gitops
“This repository shows the reference architecture for gitops directory structure”

sigstore/community
“General sigstore community repo”

nailgun
DNS benchmarking tool

DevOps’ish Tweet of the Week

Jaana Dogan ヤナ ドガン (@rakyll) on Twitter: “Hard skills are hard, soft skills are harder.”

Want more? Be sure to check out the notes from this week’s issue to see what didn’t make it into the newsletter but is still worth your time.