About Me
Hello there! I'm Stephan, a DevOps Engineer who's passionate about building and optimising resilient, efficient systems. With over six years of hands-on experience at Allianz, I've had the privilege of working across the full spectrum of DevOps practices.
From architecting and implementing robust CI/CD pipelines to defining and deploying Infrastructure as Code, I thrive on automating processes and establishing comprehensive monitoring solutions. Now moving from corporate to freelance work, I bring experience collaborating with teams across regions to ensure seamless, reliable technology delivery.
Services
Expert solutions in CI/CD, IaC, Kubernetes and monitoring for seamless deployment and operations.
Cloud Infrastructure Design & Management (AWS)
Designing, building, and managing scalable, secure, and cost-effective AWS environments.
Infrastructure as Code Implementation
Automating infrastructure provisioning and management using Terraform for consistency and reliability.
CI/CD Pipeline Strategy & Automation
Designing, building, and optimising continuous integration and deployment pipelines tailored to your needs using Jenkins or GitHub Actions.
Kubernetes Enablement & Operations
Helping teams adopt or manage Kubernetes, including cluster setup, application deployment strategies, monitoring, and operational best practices.
Portfolio
Showcasing some of my previous projects and expertise.
From chaos to clarity: Building a Single Source of Truth for the Cloud
One of the core challenges in any growing cloud environment is keeping track of everything. To tackle this head-on, I designed and implemented a centralised system leveraging GitHub Actions, Python, and DynamoDB: a single source of truth (SSoT) and the go-to system for any information about the environment. To ensure reliability and consistency, the underlying infrastructure of this SSoT, including DynamoDB tables and API Gateway resources, was declaratively managed as Infrastructure as Code (IaC) with Terraform. Infrastructure deployments were automated via GitHub Actions, triggered on every merge to the main branch.
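As a rough illustration of that deployment flow, a GitHub Actions workflow along these lines could apply the Terraform configuration on merge to main. This is a minimal sketch, not the project's actual pipeline: the workflow name, IAM role ARN, and region are placeholders.

```yaml
# Illustrative sketch only -- role ARN and region are placeholders.
name: deploy-ssot-infrastructure
on:
  push:
    branches: [main]

permissions:
  id-token: write   # allows OIDC authentication to AWS
  contents: read

jobs:
  terraform:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/ssot-deployer  # placeholder
          aws-region: eu-central-1
      - run: terraform init
      - run: terraform apply -auto-approve
```

Gating the apply step on the main branch keeps the deployed infrastructure in lockstep with the reviewed Terraform code.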
Specifically, this system included:
- AWS accounts: Any environment consisting of multiple AWS accounts can quickly get difficult to manage, making it crucial to have an overview of all available accounts with their ID, customer, and region information. This provided customers and internal teams with a clear, consistent overview of environments via tools like Confluence or internal dashboards.
- EKS clusters: Storing vital data about every EKS cluster (AWS account, customer, node info, EKS version, start/stop schedules, ...) enables better management and an easy overview for customers.
- EKS release calendar: To complement the cluster version data, the system automatically scraped and stored the vendor's EKS end-of-life dates, which were then used to identify potentially outdated clusters.
- Users: Centralised team member data, providing a foundation for automating access management, especially for systems not integrated with Active Directory.
- GitHub: Catalogued GitHub organisations and repositories associated with projects, linking them to team members.
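To give a feel for the data model, the sketch below shows how one such record might be shaped and queried. The key layout and field names are assumptions for illustration, not the actual schema; in the real system the items live in DynamoDB (written via boto3's `put_item`), whereas here plain dictionaries stand in for a scan result.

```python
# Illustrative sketch: the table layout and field names are assumptions,
# not the project's actual schema.

def eks_cluster_item(account_id, customer, cluster_name, version, schedule):
    """Build a DynamoDB-style item describing one EKS cluster."""
    return {
        "pk": f"ACCOUNT#{account_id}",
        "sk": f"CLUSTER#{cluster_name}",
        "customer": customer,
        "eks_version": version,
        "stop_schedule": schedule,
    }

def clusters_for_customer(items, customer):
    """Filter a scan result down to one customer's clusters."""
    return [i for i in items if i["customer"] == customer]

items = [
    eks_cluster_item("111111111111", "acme", "prod-eu", "1.29", "weekdays-only"),
    eks_cluster_item("222222222222", "globex", "dev-us", "1.27", "always-on"),
]
print([c["sk"] for c in clusters_for_customer(items, "acme")])  # ['CLUSTER#prod-eu']
```

Keeping every consumer (Confluence pages, dashboards, automations) on this one shape is what makes the store a single source of truth.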
Automation of repetitive tasks
Repetitive manual tasks often consume significant engineering time and introduce potential for errors. Therefore, I focus on automating wherever possible. Having a Single Source of Truth in place, the logical next step was to automate tasks relying on its data.
Some automations I built include:
- EKS patching tickets: With a growing number of environments and clusters, even simple copy-paste tasks like creating tickets to patch those clusters become increasingly time-consuming. I therefore fully automated the creation of patching tickets in Jira. The required information about clusters and end-of-life dates was already stored in DynamoDB tables, making it easy to identify clusters that require patching.
- GitHub access: Repositories come and go, and team members join or leave. By combining one DynamoDB table storing user information with another storing GitHub organisation and repository information, granting and revoking access can be automated. Depending on the use case, new repositories and users are added to or removed from a GitHub team, or users are granted direct repository access. With this automation in place, only the DynamoDB tables have to be maintained, reducing manual effort.
- Jira PAT renewal: With a maximum validity of 90 days, a Jira personal access token (PAT) has to be renewed regularly. While doing this every three months is not a major effort, it is easily forgotten, and an expired token causes dependent tasks to fail. To mitigate this, I developed a small but effective workflow that automatically renews the token every month, using credentials stored securely in AWS Secrets Manager, where the renewed PAT is stored as well. Every automation using this token then always retrieves a valid one, ensuring uninterrupted operation.
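The core of the patching-ticket automation is the selection step: which clusters are close enough to end of life to warrant a ticket? A hedged sketch of that logic, with made-up field names, versions, and dates (the real data comes from the DynamoDB tables described above):

```python
# Hedged sketch: identify clusters whose EKS version reaches end of life
# within a warning window. Versions, dates, and field names are examples.
from datetime import date, timedelta

EKS_EOL = {  # version -> vendor end-of-life date (example values)
    "1.27": date(2024, 7, 24),
    "1.29": date(2025, 3, 23),
}

def clusters_needing_patch(clusters, today, warn_days=60):
    """Return clusters whose version reaches EoL within warn_days of today."""
    cutoff = today + timedelta(days=warn_days)
    return [
        c for c in clusters
        if EKS_EOL.get(c["eks_version"], date.max) <= cutoff
    ]

clusters = [
    {"name": "prod-eu", "eks_version": "1.27"},
    {"name": "dev-us", "eks_version": "1.29"},
]
due = clusters_needing_patch(clusters, today=date(2024, 6, 1))
# each entry in `due` would become one Jira ticket via the REST API
print([c["name"] for c in due])  # ['prod-eu'] -- 1.27 falls inside the window
```

Running this on a schedule (e.g. a nightly GitHub Actions job) turns a recurring copy-paste chore into a no-touch process.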
Scalable Jenkins CI/CD for Multi-Customer Deployments
Managing application deployments across numerous customer environments, often involving the same applications with minor variations, required a scalable and maintainable CI/CD solution to avoid inefficient and inconsistent manual configurations.
- Reusable pipeline templates: Leveraged the Jenkins Job DSL plugin and shared libraries to define highly parameterised pipeline templates as code. This allowed us to programmatically generate consistent pipelines for multiple applications and customers, significantly reducing code duplication and maintenance effort.
- Infrastructure & Deployment Automation: Jenkins served as the central orchestrator. Pipelines executed deployment tasks leveraging tools like Terraform (for infrastructure provisioning) and Helm (for application packaging/deployment), providing engineering flexibility within an automated workflow.
- Reproducible Jenkins Configuration: The Jenkins Configuration as Code (JCasC) plugin was used to manage the Jenkins controller configuration itself via version-controlled YAML, ensuring a consistent, reliable, and easily reproducible Jenkins environment.
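The actual templates were written in Jenkins Job DSL (Groovy) with shared libraries; the Python sketch below only illustrates the underlying idea of generating many consistent jobs from one parameterised template. Customer names, the job-naming scheme, and the template text are all made up for illustration.

```python
# Illustrative sketch of template-driven job generation -- the real
# implementation used the Jenkins Job DSL plugin, not Python.

PIPELINE_TEMPLATE = """\
pipeline {{
    agent any
    stages {{
        stage('Deploy') {{
            steps {{
                sh 'helm upgrade --install {app} charts/{app} -f values/{customer}.yaml'
            }}
        }}
    }}
}}
"""

def generate_jobs(customers, apps):
    """Render one pipeline definition per (customer, app) pair."""
    return {
        f"{customer}-{app}-deploy": PIPELINE_TEMPLATE.format(customer=customer, app=app)
        for customer in customers
        for app in apps
    }

jobs = generate_jobs(["acme", "globex"], ["webshop"])
print(sorted(jobs))  # ['acme-webshop-deploy', 'globex-webshop-deploy']
```

Because every job comes from the same template, a fix or improvement lands in one place and propagates to all customers on the next generation run.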