Microsoft Azure DevOps & GitHub Actions — Complete Guide
CI/CD · Pipelines · Workflows · IaC · Bicep · GitOps · Deployment Strategies · Security · Scenarios · Cheat Sheet
Table of Contents
- Core Concepts — Basics
- Azure DevOps — Deep Dive
- GitHub Actions — Deep Dive
- CI/CD Patterns & Deployment Strategies
- Infrastructure as Code with Bicep
- Security & Governance
- Scenario-Based Questions
- Cheat Sheet — Quick Reference
1. Core Concepts — Basics
What is DevOps and what are its core principles?
DevOps is a culture and set of practices combining development (Dev) and IT operations (Ops) to shorten the development lifecycle and deliver high-quality software continuously. It is a methodology, not a tool.
Core principles (CALMS):
- Culture: shared responsibility between dev and ops — no "throw it over the wall"
- Automation: automate everything repeatable — builds, tests, deployments, infrastructure
- Lean: minimise waste, reduce batch sizes, deliver value continuously
- Measurement: data-driven decisions — track DORA metrics
- Sharing: knowledge sharing, transparency, collaboration
The four DORA metrics (essential knowledge):
- Deployment Frequency: how often code is deployed to production
- Lead Time for Changes: time from commit to production
- Change Failure Rate: % of deployments causing failures
- Mean Time to Recovery (MTTR): time to restore service after failure
Tip: DORA metrics come up in every DevOps maturity question. Memorise them — they're the universal benchmarks.
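To make the four metrics concrete, here is a minimal Python sketch that computes them from a list of deployment records. The Deployment record, its field names, and the sample data are assumptions made for illustration. In practice the timestamps would come from your pipeline and incident tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Deployment:
    """Hypothetical record of one production deployment."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did this deployment cause a failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def dora_metrics(deployments: list[Deployment], period_days: int) -> dict:
    """Compute the four DORA metrics over a reporting period."""
    if not deployments:
        raise ValueError("need at least one deployment")
    failures = [d for d in deployments if d.failed]
    lead_times = [d.deployed_at - d.committed_at for d in deployments]
    recoveries = [d.restored_at - d.deployed_at for d in failures if d.restored_at]
    return {
        "deployment_frequency_per_day": len(deployments) / period_days,
        "lead_time_for_changes": sum(lead_times, timedelta()) / len(lead_times),
        "change_failure_rate": len(failures) / len(deployments),
        "mean_time_to_recovery": (
            sum(recoveries, timedelta()) / len(recoveries) if recoveries else None
        ),
    }


# Illustrative data: four deployments in one week, one of which failed.
base = datetime(2025, 1, 6, 9, 0)
history = [
    Deployment(base, base + timedelta(hours=2)),
    Deployment(base + timedelta(days=1), base + timedelta(days=1, hours=4)),
    Deployment(
        base + timedelta(days=2),
        base + timedelta(days=2, hours=6),
        failed=True,
        restored_at=base + timedelta(days=2, hours=7),
    ),
    Deployment(base + timedelta(days=3), base + timedelta(days=3, hours=4)),
]
metrics = dora_metrics(history, period_days=7)
print(metrics)
```

With this sample data the change failure rate is 25%, the average lead time is 4 hours, and the mean time to recovery is 1 hour.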
What is the difference between Azure DevOps and GitHub Actions?
| | Azure DevOps | GitHub Actions |
|---|---|---|
| Type | Enterprise DevOps suite (Boards, Repos, Pipelines, Test Plans, Artifacts) | Code hosting + CI/CD platform |
| Pipelines | YAML or Classic (UI-based, deprecating) | YAML only — event-driven workflows |
| Audience | Enterprise IT, regulated industries | Modern dev teams, open-source, GitHub-centric |
| Marketplace | Azure DevOps extensions | GitHub Marketplace (massive ecosystem) |
| Strengths | Mature ALM, work tracking, compliance | Modern UX, marketplace, OSS ecosystem |
| Future | Maintained for existing customers | Microsoft's strategic direction |
Tip: For new projects in 2025, GitHub Actions is the modern recommended choice. Azure DevOps remains right for enterprises with existing investment, complex compliance needs, or deep Microsoft tooling integration.
What is Continuous Integration (CI) vs Continuous Delivery (CD) vs Continuous Deployment?
Continuous Integration (CI):
→ Developers merge code into main multiple times per day
→ Each merge triggers automated build and test
→ Detects integration issues early
Continuous Delivery (CD):
→ Every change passing CI is automatically prepared for release
→ Deployment to production is a manual decision (approval gate)
→ Always production-ready
Continuous Deployment:
→ Every change passing CI/CD is automatically deployed to production
→ No manual intervention
→ Highest automation maturity — requires extensive automated testing + feature flags
Maturity progression:
Manual deploys → CI → CI/CD (Delivery) → CI/CD (Deployment)
Most enterprises stop at CD (Delivery) due to compliance and risk tolerance.
What is Infrastructure as Code (IaC)?
IaC manages and provisions infrastructure through code rather than manual processes. Infrastructure definitions are stored in source control alongside application code.
Why IaC matters:
- Reproducibility: spin up identical environments anywhere from the same code
- Version control: changes tracked in Git — see who, what, when, why
- Code review: infrastructure changes go through PR review
- Drift detection: detect manual changes and remediate
- Disaster recovery: rebuild environments in minutes, not days
| Tool | Best For |
|---|---|
| Bicep | Azure-native, modern syntax, recommended for Azure |
| ARM templates | Original Azure IaC, JSON-based, complex syntax |
| Terraform | Multi-cloud (Azure, AWS, GCP), industry standard |
| Pulumi | IaC in real programming languages (TS, Python, C#) |
| Azure CLI/PowerShell | Imperative scripts, one-off automation |
Tip: For Microsoft-only stacks, Bicep is the modern recommendation — no state file management, native Azure integration. For multi-cloud, Terraform.
2. Azure DevOps — Deep Dive
What are the five core services of Azure DevOps?
| Service | Purpose |
|---|---|
| Azure Boards | Work tracking — backlogs, sprints, kanban, user stories, tasks, bugs |
| Azure Repos | Source control — Git (or legacy TFVC), branch policies, pull requests |
| Azure Pipelines | CI/CD — YAML or Classic pipelines, any language, any platform |
| Azure Test Plans | Manual and exploratory testing, test case management |
| Azure Artifacts | Package management — NuGet, npm, Maven, Python, Universal |
Tip: Azure DevOps services are modular. Use Pipelines with GitHub-hosted code. Or Azure Boards with GitHub Repos. The integration is built-in.
What is the structure of a YAML pipeline in Azure DevOps?
# azure-pipelines.yml
trigger:                        # When to run
  branches:
    include: [main, develop]
  paths:
    include: [src/**]
    exclude: [docs/**]
pr:                             # PR validation
  branches:
    include: [main]
variables:
  buildConfiguration: 'Release'
  vmImage: 'ubuntu-latest'
stages:                         # Top-level workflow
- stage: Build
  jobs:
  - job: BuildJob
    pool:
      vmImage: $(vmImage)
    steps:
    - task: UseDotNet@2
      inputs:
        version: '8.0.x'
    - script: dotnet build --configuration $(buildConfiguration)
    - task: DotNetCoreCLI@2
      inputs:
        command: 'test'
        projects: '**/*.Tests.csproj'
    - task: PublishBuildArtifacts@1
      inputs:
        PathtoPublish: '$(Build.ArtifactStagingDirectory)'
- stage: DeployToTest
  dependsOn: Build
  condition: succeeded()
  jobs:
  - deployment: DeployToTest
    environment: 'Test'         # Triggers approvals/checks
    strategy:
      runOnce:
        deploy:
          steps:
          - script: echo "Deploy to Test environment"
Hierarchy: Pipeline → Stages → Jobs → Steps (Tasks)
What are agents and agent pools?
Agent: a service running pipeline jobs. Each agent runs one job at a time. Agent pool: a collection of agents.
Microsoft-hosted agents:
→ Microsoft maintains the VMs
→ Free tier: 1,800 minutes/month per organisation
→ Pre-installed software (Node, .NET, Python, Docker)
→ Disposable: fresh VM per job
→ vmImage: ubuntu-latest, windows-latest, macos-latest
→ Best for: standard builds with no infrastructure dependencies
Self-hosted agents:
→ Run on YOUR infrastructure (Windows, Linux, macOS, Docker, K8s)
→ Connect outbound to Azure DevOps
→ Persistent — software cache persists between jobs
→ Access to internal resources (private networks, on-prem databases)
→ No build minute limits
→ Best for: regulated industries, on-prem integrations, large monorepos
Configure pool:
pool:
  vmImage: 'ubuntu-latest'   # Microsoft-hosted
# OR:
pool: 'MyOnPremPool'         # Self-hosted pool name
What are Service Connections and why are they critical?
A Service Connection is a configured authentication credential used by pipelines to connect to external systems (Azure subscriptions, GitHub, Docker registries, Kubernetes, NuGet feeds).
Authentication methods (Azure ARM):
- Service Principal (manual): manually create SP, paste secret
- Service Principal (automatic): Azure DevOps creates the SP
- Managed Identity: the pipeline authenticates with a managed identity assigned to a self-hosted agent running in Azure, so no secret is stored in Azure DevOps
- Workload Identity Federation (OIDC): NO secrets — uses short-lived tokens. Modern best practice.
# OIDC-based service connection (no secrets stored):
- task: AzureCLI@2
  inputs:
    azureSubscription: 'MyAzureSub-OIDC'
    scriptType: 'bash'
    scriptLocation: 'inlineScript'
    inlineScript: |
      az group list
# Token issued at runtime via federation; no client secret
Warning: Workload Identity Federation (OIDC) is the modern, secure best practice. Avoid storing service principal secrets — they require rotation and are credential theft targets.
What are Environments in Azure Pipelines and how do approvals work?
Environments represent deployment targets (Dev, Test, Prod) with approval gates and resource references.
- stage: Production
  jobs:
  - deployment: DeployProd
    environment: 'Production'   # Approval check fires here
    strategy:
      runOnce:
        deploy:
          steps:
          - script: ./deploy.sh
Approval and check types:
- Approvals: configure approvers (any one or all required)
- Branch control: only allow deployments from specific branches
- Business hours: restrict deployments to time windows
- Required template: deployment must use specific YAML template
- Exclusive lock: only one deployment to environment at a time
- Invoke REST API / Azure Function check: external validation gates
3. GitHub Actions — Deep Dive
What is the GitHub Actions workflow structure?
Workflows are YAML files in .github/workflows/, triggered by GitHub events.
# .github/workflows/ci.yml
name: CI Pipeline
on:                               # Trigger events
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]
  schedule:
    - cron: '0 2 * * *'           # Nightly at 2 AM
  workflow_dispatch:              # Manual trigger button
env:
  NODE_VERSION: '20'
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Node
        uses: actions/setup-node@v4
        with:
          node-version: ${{ env.NODE_VERSION }}
          cache: 'npm'
      - name: Install dependencies
        run: npm ci
      - name: Run tests
        run: npm test
      - name: Upload artifact
        uses: actions/upload-artifact@v4
        with:
          name: build-output
          path: dist/
  deploy:
    needs: build
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    environment: production       # Triggers approval if configured
    steps:
      - uses: actions/checkout@v4
      - name: Deploy
        run: ./scripts/deploy.sh
Hierarchy: Workflow → Jobs → Steps
What are GitHub Actions, the Marketplace, and how do you use them?
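An "Action" is a reusable unit of code that a workflow step references with the uses keyword. Many actions are published on the GitHub Marketplace, but an action can also live in any repository, including your own.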
# Common marketplace actions:
actions/checkout@v4 # Clone the repo
actions/setup-node@v4 # Install Node.js
actions/cache@v4 # Cache dependencies
actions/upload-artifact@v4 # Persist build outputs
azure/login@v2 # Authenticate to Azure
azure/webapps-deploy@v3 # Deploy to App Service
azure/arm-deploy@v2 # Deploy ARM/Bicep
docker/build-push-action@v5 # Build and push Docker images
docker/login-action@v3 # Login to Docker registries
github/codeql-action@v3 # Run CodeQL security scan
microsoft/playwright-github-action@v1 # Run Playwright tests
# Three action types:
# 1. JavaScript actions: Node.js code on the runner
# 2. Docker container actions: run inside a container
# 3. Composite actions: multiple steps as a reusable action
Critical: Always pin actions to a specific version tag (@v4) or SHA hash. Using @main or @latest is a security risk: the action's behaviour can change without warning.
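The pinning rule can be checked mechanically. The Python sketch below scans workflow text for uses: references and classifies each one. The function names, the classification labels, and the sample workflow are assumptions for illustration, and the 40-character SHA in the sample is a placeholder, not a real commit.

```python
import re

# A full commit SHA is 40 hexadecimal characters.
SHA_RE = re.compile(r"^[0-9a-f]{40}$")
# Version tags such as v4, v4.1, v4.1.2 or 1.2.3.
TAG_RE = re.compile(r"^v?\d+(\.\d+){0,2}$")
# Matches lines such as "- uses: actions/checkout@v4" and captures the reference.
USES_RE = re.compile(r"^\s*-?\s*uses:\s*['\"]?([^'\"\s#]+)")


def classify_ref(uses: str) -> str:
    """Return 'local', 'sha', 'tag', or 'floating' for a uses: value."""
    if uses.startswith("./"):
        return "local"          # action or workflow in the same repository
    if "@" not in uses:
        return "floating"
    ref = uses.rsplit("@", 1)[1]
    if SHA_RE.match(ref):
        return "sha"            # immutable: the strongest form of pinning
    if TAG_RE.match(ref):
        return "tag"            # readable, but a tag can be moved by its owner
    return "floating"           # branch names such as main


def audit(workflow_text: str) -> list[tuple[str, str]]:
    """List every uses: reference in the text with its classification."""
    results = []
    for line in workflow_text.splitlines():
        match = USES_RE.match(line)
        if match:
            results.append((match.group(1), classify_ref(match.group(1))))
    return results


sample = """\
steps:
  - uses: actions/checkout@v4
  - uses: actions/setup-node@0123456789abcdef0123456789abcdef01234567  # placeholder SHA
  - uses: some-org/some-action@main
  - uses: ./.github/actions/local-build
"""
for reference, kind in audit(sample):
    print(f"{kind:9} {reference}")
```

A check like this could run as a CI step that fails the build when any reference is classified as floating.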
What are GitHub-hosted vs self-hosted runners?
| | GitHub-hosted | Self-hosted |
|---|---|---|
| Infrastructure | GitHub-managed VMs | Your infrastructure |
| Free tier | 2,000 min/month private repos (free public) | No build minute limits |
| Pre-installed software | Yes (Node, .NET, Python, Docker) | What you install |
| Persistence | Disposable (fresh VM per job) | Persistent (cache survives) |
| Network access | Public internet only | Internal/private networks |
| Best for | Standard builds | Regulated industries, internal resources |
runs-on: ubuntu-latest # Hosted
runs-on: [self-hosted, linux, prod] # Self-hosted with labels
runs-on: ubuntu-latest-16-cores # Larger hosted runner
Warning: Self-hosted runners on PUBLIC repos are dangerous — anyone creating a PR can run code on your runner. Never expose self-hosted runners to public repos without strict controls.
What are GitHub Actions Environments?
Environments represent deployment targets with protection rules.
deploy-prod:
  needs: build
  runs-on: ubuntu-latest
  environment:
    name: production
    url: https://app.contoso.com   # Display URL
  steps:
    - name: Deploy to production
      run: ./deploy.sh
      env:
        API_KEY: ${{ secrets.PROD_API_KEY }}   # Environment secret
Environment protection rules:
- Required reviewers: up to 6 users or teams; approval from any one of them lets the job proceed
- Wait timer: enforce delay before deployments proceed
- Deployment branches: restrict which branches can deploy
- Environment secrets: scoped to this environment only
- Environment variables: non-sensitive config per environment
What are reusable workflows vs composite actions?
Reusable workflows: complete workflows callable from other workflows. Used to share entire CI/CD logic across repos.
Composite actions: a series of steps bundled as a reusable action. Smaller scope — encapsulates a few related steps.
# .github/workflows/reusable-deploy.yml
name: Deploy
on:
  workflow_call:
    inputs:
      environment:
        required: true
        type: string
    secrets:
      AZURE_CREDENTIALS:
        required: true
jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: ${{ inputs.environment }}
    steps:
      - uses: actions/checkout@v4
      - uses: azure/login@v2
        with:
          creds: ${{ secrets.AZURE_CREDENTIALS }}
      - run: ./deploy.sh ${{ inputs.environment }}
# Calling reusable workflow:
# .github/workflows/main.yml
jobs:
  call-deploy:
    uses: ./.github/workflows/reusable-deploy.yml
    with:
      environment: production
    secrets:
      AZURE_CREDENTIALS: ${{ secrets.AZURE_CREDENTIALS }}
4. CI/CD Patterns & Deployment Strategies
What deployment strategies are available?
| Strategy | Description | Best For |
|---|---|---|
| Recreate | Stop old → deploy new (downtime) | Non-critical apps |
| Rolling | Update instances in batches | Stateless web apps |
| Blue-Green | Two identical environments, swap traffic | Production with instant rollback |
| Canary | Deploy to small subset (5%→10%→50%→100%), monitor, scale | High-traffic production |
| Feature flags | Deploy code with features disabled, enable gradually | Decouple deployment from release |
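# Azure Pipelines deployment strategies (a deployment job takes exactly one):
strategy:
  runOnce:
    deploy:
      steps: [...]
# OR:
strategy:
  rolling:
    maxParallel: 25%
    deploy:
      steps: [...]
# OR:
strategy:
  canary:
    increments: [10, 20]
    deploy:
      steps: [...]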
Tip: For high-traffic production services, combine Canary + Feature Flags. The canary catches infrastructure/integration issues; flags let you control feature exposure independently of deployment.
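The canary progression from the table (5% → 10% → 50% → 100%, monitoring at each step) can be sketched as a small control loop in Python. This is a simplified illustration, not how Azure Pipelines implements it: canary_rollout, the error_rate_at callback, and the 1% threshold are assumptions, and the callback stands in for a real metrics query.

```python
from typing import Callable


def canary_rollout(
    steps: list[int],
    error_rate_at: Callable[[int], float],
    max_error_rate: float = 0.01,
) -> tuple[bool, int, list[int]]:
    """Shift traffic through `steps`; roll back to 0% at the first unhealthy step.

    Returns (succeeded, final_traffic_percent, steps_tried).
    """
    tried = []
    current = 0
    for percent in steps:
        tried.append(percent)
        if error_rate_at(percent) > max_error_rate:
            return False, 0, tried   # roll back: all traffic returns to the old version
        current = percent
    return True, current, tried


# Hypothetical monitors: one healthy release, one that degrades under load.
def healthy(percent: int) -> float:
    return 0.002


def degrades_under_load(percent: int) -> float:
    return 0.002 if percent < 50 else 0.08


print(canary_rollout([5, 10, 50, 100], healthy))
print(canary_rollout([5, 10, 50, 100], degrades_under_load))
```

The second release is stopped at the 50% step, so at most half of the traffic ever reaches the faulty version before it is rolled back.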
What is GitOps and how does it differ from traditional CI/CD?
GitOps is a declarative deployment paradigm where infrastructure and application desired state is in Git, and an automated agent continuously reconciles the live environment to match.
Traditional CI/CD (push model):
Developer → Git → CI builds → CD pushes to environment
→ Pipeline has direct access to deploy
→ Deploy actions are imperative (run scripts)
→ Drift between Git and environment possible
GitOps (pull model):
Developer → Git ← Agent in target environment pulls
→ Agent runs IN target environment (Flux, ArgoCD on Kubernetes)
→ Continuously reconciles state to match Git
→ No drift — agent enforces Git as source of truth
→ Roll back = revert Git commit → agent auto-rolls back
Key tools:
- Flux: Kubernetes-native GitOps controller (CNCF graduate)
- ArgoCD: Kubernetes-native GitOps with rich UI
- Azure GitOps Configuration: Microsoft-managed Flux on AKS
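The pull model comes down to a reconciliation loop: compare the desired state in Git with the live state and correct any difference. The Python sketch below illustrates the idea with plain dictionaries. It is a conceptual model only, not the Flux or ArgoCD implementation, and the resource names and fields are assumptions. Real controllers, for example, delete unmanaged resources only when pruning is enabled.

```python
import copy


def plan_reconcile(desired: dict[str, dict], live: dict[str, dict]) -> list[tuple[str, str]]:
    """Compare desired state (from Git) with live state and list corrective actions."""
    actions = []
    for name, spec in desired.items():
        if name not in live:
            actions.append(("create", name))
        elif live[name] != spec:
            actions.append(("update", name))    # drift: live differs from Git
    for name in live:
        if name not in desired:
            actions.append(("delete", name))    # resource is not declared in Git
    return actions


def reconcile(desired: dict[str, dict], live: dict[str, dict]):
    """One reconciliation pass: return the actions taken and the new live state."""
    actions = plan_reconcile(desired, live)
    return actions, copy.deepcopy(desired)


# Desired state as declared in Git (hypothetical resources).
desired = {
    "web": {"image": "web:1.4.0", "replicas": 3},
    "worker": {"image": "worker:2.1.0", "replicas": 1},
}
# Live state after someone scaled "web" by hand and left a debug pod behind.
live = {
    "web": {"image": "web:1.4.0", "replicas": 5},
    "debug-pod": {"image": "busybox", "replicas": 1},
}

actions, live = reconcile(desired, live)
print(actions)
print(reconcile(desired, live)[0])   # second pass: nothing left to do
```

Rolling back works the same way: reverting the Git commit changes the desired state, and the next pass brings the environment back to it.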
What are pipeline templates and how do they enforce standards?
Templates allow centrally maintained YAML defining standard CI/CD logic referenced by other pipelines.
# templates/build-dotnet.yml (in central pipeline-templates repo)
parameters:
- name: configuration
  type: string
  default: 'Release'
- name: runTests
  type: boolean
  default: true
steps:
- task: UseDotNet@2
  inputs:
    version: '8.0.x'
- script: dotnet restore
- script: dotnet build --configuration ${{ parameters.configuration }}
- ${{ if eq(parameters.runTests, true) }}:
  - script: dotnet test --collect:"XPlat Code Coverage"
  - task: PublishCodeCoverageResults@2   # codeCoverageTool was a v1 input; v2 does not take it
    inputs:
      summaryFileLocation: '**/coverage.cobertura.xml'
# Consumer pipeline:
resources:
  repositories:
  - repository: templates
    type: git
    name: PlatformTeam/pipeline-templates
steps:
- template: templates/build-dotnet.yml@templates
  parameters:
    configuration: 'Release'
    runTests: true
Tip: Templates enforce: consistent build steps, mandatory security scans, code coverage thresholds, branding/notifications. The platform team maintains gold-standard pipelines.
5. Infrastructure as Code with Bicep
How do you implement Bicep in CI/CD?
The recommended pattern: Validate → What-If → Deploy.
# GitHub Actions example:
name: Deploy Azure Infrastructure
on:
  push:
    branches: [main]
    paths: ['infra/**']
permissions:
  id-token: write                 # Required for the OIDC login below
  contents: read
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
      - name: Validate Bicep
        uses: azure/arm-deploy@v2
        with:
          subscriptionId: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
          resourceGroupName: rg-myapp-prod
          template: ./infra/main.bicep
          parameters: ./infra/main.bicepparam
          deploymentMode: Validate   # Validate only
  what-if:
    needs: validate
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
      - name: Run What-If
        run: |
          az deployment group what-if \
            --resource-group rg-myapp-prod \
            --template-file ./infra/main.bicep \
            --parameters ./infra/main.bicepparam
  deploy:
    needs: [validate, what-if]
    runs-on: ubuntu-latest
    environment: production       # Manual approval here
    steps:
      - uses: actions/checkout@v4
      - uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
      - name: Deploy Bicep
        uses: azure/arm-deploy@v2
        with:
          subscriptionId: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
          resourceGroupName: rg-myapp-prod
          template: ./infra/main.bicep
          parameters: ./infra/main.bicepparam
Tip: The Validate → What-If → Deploy pattern is the IaC best practice. Validate confirms syntax. What-If shows what will change. Deploy applies. Always show What-If output before requiring approval.
6. Security & Governance
How do you securely manage secrets?
- Variable groups (Azure DevOps) / Secrets (GitHub): encrypted at rest, masked in logs
- Azure Key Vault integration: best practice — store in Key Vault, reference from pipeline
- Workload Identity Federation (OIDC): no stored secrets — short-lived OIDC tokens
- Environment-scoped secrets: production secrets only accessible from production jobs
# GitHub OIDC to Azure (no secrets stored):
- name: Azure Login
  uses: azure/login@v2
  with:
    client-id: ${{ secrets.AZURE_CLIENT_ID }}             # App registration ID
    tenant-id: ${{ secrets.AZURE_TENANT_ID }}             # Tenant GUID
    subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
# No client secret! Token issued at run-time via federation
# Azure DevOps Key Vault integration:
- task: AzureKeyVault@2
  inputs:
    azureSubscription: 'MyServiceConnection'
    KeyVaultName: 'myKeyVault'
    SecretsFilter: 'database-password,api-key'
Critical: Never hardcode secrets in YAML files or commit to Git — even private repos. Secret scanners catch credentials in commit history. Use OIDC or Key Vault references.
What is GitHub Advanced Security?
| Feature | Description |
|---|---|
| CodeQL | SAST — finds vulnerabilities in source code |
| Secret scanning | Detects committed secrets, push protection blocks commits |
| Dependabot | Identifies vulnerable dependencies, auto-creates update PRs |
| Security overview | Org-wide vulnerability dashboard |
| Custom queries | Define org-specific CodeQL security patterns |
# Enable CodeQL in workflow:
- name: Initialize CodeQL
  uses: github/codeql-action/init@v3
  with:
    languages: javascript, python
- name: Perform CodeQL Analysis
  uses: github/codeql-action/analyze@v3
Tip: On public repositories, CodeQL code scanning and secret scanning are free. Private repositories need a paid GitHub Advanced Security licence. Dependabot alerts and updates are available on all repositories at no extra cost.
What are branch protection rules?
Recommended branch protection for "main":
☐ Require pull request reviews before merging
→ Require X reviewers (typically 1-2)
→ Dismiss stale reviews when new commits pushed
→ Require review from code owners (CODEOWNERS file)
☐ Require status checks to pass
→ CI build must succeed
→ Tests must pass
→ Code coverage threshold met
→ Security scans clean
☐ Require branches to be up to date before merging
☐ Require signed commits (advanced)
☐ Require linear history (no merge commits)
☐ Restrict who can push to matching branches
☐ Do not allow bypassing the above settings (even for admins)
Critical: Always enable "Do not allow bypassing the above settings" for production branches. Without it, admins bypass all rules — defeating compliance.
What is the principle of least privilege in DevOps?
- Service Principal scopes: each pipeline SP has access ONLY to needed resources. Production SPs separate from non-prod.
- Environment-specific credentials: production secrets only accessible from production deployments
- RBAC on pipelines: who can edit vs who can run — separate roles
- Approval gates: production requires approval from someone other than committer (two-person rule)
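- Read-only by default: keep the GitHub Actions GITHUB_TOKEN read-only (already the default for newer repositories and organisations) and grant write scopes per workflow or job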
- Restricted runners: production deployments use dedicated runners with prod-only network access
# GitHub Actions GITHUB_TOKEN permissions:
permissions:
  contents: read         # Default for most jobs
  pull-requests: write   # Only if PR commenting needed
  packages: write        # Only if pushing packages
# All other permissions implicitly denied
7. Scenario-Based Questions
Scenario: Design a CI/CD pipeline for a .NET microservice deployed to Azure App Service.
- Source control: GitHub or Azure Repos with branch protection on main (1 PR review required, status checks must pass)
- CI workflow (on push/PR):
  - Restore NuGet packages (with cache)
  - Build with dotnet build --configuration Release
  - Run unit tests with code coverage (min 80%)
  - Static analysis: SonarCloud or CodeQL
  - Container scan if building Docker image
  - Publish artifacts
- CD to Test (on merge to main):
  - Deploy Bicep IaC to test resource group
  - Deploy app to Test App Service
  - Run integration tests against Test environment
  - Run smoke tests
- CD to Production (manual trigger or tag-based):
  - Required reviewer approval (production environment)
  - Deploy to staging slot of Production App Service
  - Smoke test the staging slot
  - Slot swap → traffic moves to new version (zero downtime)
  - Run production smoke tests
  - On failures → automatic slot swap back (rollback)
- Authentication: GitHub OIDC to Azure (no secrets stored)
- Monitoring: Application Insights integrated, alert on failure rate spike post-deploy
Scenario: Migrate from Azure DevOps to GitHub Actions.
- Inventory current state: pipelines, variable groups, service connections, environments, agent pools, work items, repos
- Migrate code first: GitHub Importer migrates Azure Repos to GitHub (preserves history, branches, tags)
- Use GitHub Actions Importer: a GitHub CLI extension (gh actions-importer) that converts Azure DevOps pipelines to GitHub Actions workflows automatically (roughly 80% of the conversion; the rest needs manual work)
- Manual conversion:
  - Custom tasks → find GitHub Marketplace equivalents or write composite actions
  - Variable groups → repository or environment secrets
  - Service connections → GitHub OIDC federated credentials
  - Environments → GitHub Environments with protection rules
- Migrate Azure Boards to GitHub Issues + Projects: use Azure DevOps Migration Tools or Azure Boards GitHub action for bidirectional integration during transition
- Parallel run period: 4-8 weeks running both systems, validate Actions work correctly
- Decommission: archive Azure DevOps project (don't delete — keep audit history)
Scenario: Implement a multi-stage CI/CD pipeline with security gates.
Stage 1: Build & Unit Test
→ Compile, run unit tests, fail if coverage < 80%
Stage 2: Security Scans (parallel jobs)
├─ SAST: CodeQL or SonarCloud → fail on critical/high
├─ Secret scan: GitHub secret scanning
├─ License compliance: detect non-compliant licenses
└─ Dependency vulnerabilities: Dependabot / Snyk
→ ALL must pass to proceed
Stage 3: Container/Package Build
→ Build Docker image
→ Container scan: Trivy, Defender for Containers
→ Sign image with Cosign
→ Push to ACR with proper tags
Stage 4: IaC Deploy to Test
→ Bicep what-if (review changes)
→ Apply IaC to test resource group
→ Run integration tests
Stage 5: DAST (Dynamic security testing)
→ OWASP ZAP scan on test environment
→ Penetration testing automation
→ Fail on critical findings
Stage 6: UAT (Manual approval)
→ Deploy to UAT environment
→ Business stakeholder approval gate
Stage 7: Production Deploy
→ 2-person approval (security + product owner)
→ Deploy to staging slot
→ Smoke tests
→ Blue/green slot swap
→ Post-deploy verification
Stage 8: Post-deploy
→ Send Teams notification
→ Update change management system
→ Monitor Application Insights for 30 minutes
→ Auto-rollback if error rate spikes
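The pass/fail rules in stages 1 and 2 (coverage below 80% fails, any critical or high finding fails) amount to a small gate function. The Python sketch below shows one way to express it. ScanResult, evaluate_gates, and the sample numbers are assumptions for illustration; a real pipeline would parse these values from the coverage and scanner reports.

```python
from dataclasses import dataclass, field


@dataclass
class ScanResult:
    """Hypothetical summary of one pipeline run's quality signals."""
    coverage_percent: float
    findings: dict[str, int] = field(default_factory=dict)   # severity -> count


def evaluate_gates(
    result: ScanResult,
    min_coverage: float = 80.0,
    blocking: tuple[str, ...] = ("critical", "high"),
) -> list[str]:
    """Return the reasons the gate fails; an empty list means the run may proceed."""
    reasons = []
    if result.coverage_percent < min_coverage:
        reasons.append(
            f"coverage {result.coverage_percent:.1f}% is below {min_coverage:.1f}%"
        )
    for severity in blocking:
        count = result.findings.get(severity, 0)
        if count:
            reasons.append(f"{count} {severity} finding(s)")
    return reasons


passing = ScanResult(coverage_percent=85.0, findings={"medium": 3})
failing = ScanResult(coverage_percent=72.5, findings={"critical": 1, "high": 2})

print(evaluate_gates(passing))
print(evaluate_gates(failing))
```

Returning the reasons rather than a bare boolean lets the pipeline print why a run was blocked, which makes failed gates easier to act on.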
Scenario: A pipeline run failed and rolled back production. How do you investigate and prevent recurrence?
Immediate triage:
- Check pipeline logs — which step failed and what error?
- Check Application Insights / Log Analytics — what errors are users seeing?
- Confirm rollback was successful — production reverted to previous version
Root cause analysis:
- What changed in this deploy vs the last successful one? Diff the commits.
- Was it code, infrastructure, or configuration drift?
- Was there a database migration that didn't roll back cleanly?
- Were there environment-specific differences not caught in test?
Document findings: post-mortem in Confluence/Wiki — timeline, root cause, customer impact, resolution actions
Preventive actions:
- Add a smoke test that catches this specific failure mode pre-prod
- Improve test environment to match production more closely
- Add alerting that catches the issue before users do
- Update runbooks with rollback procedure for this scenario
Tip: Post-mortems should be blameless. Focus on systems and processes, not individual blame. Teams that punish failures get less honest reporting and more hidden incidents.
8. Cheat Sheet — Quick Reference
DORA Metrics
Deployment Frequency → How often you deploy to production
Lead Time for Changes → Commit to production time
Change Failure Rate → % of deployments causing failures
MTTR → Mean Time to Recovery from failure
Performance levels (indicative; DORA's published thresholds change between annual reports):
Elite: Multiple deploys/day, < 1 hour lead time, < 15% failure, < 1 hour MTTR
High: Daily deploys, < 1 day lead time, < 30% failure, < 1 day MTTR
Medium: Weekly deploys, < 1 week lead time, < 30% failure, < 1 week MTTR
Low: Monthly deploys, > 1 month lead time, > 30% failure, > 1 week MTTR
Azure DevOps Services
Boards → Work tracking (User Stories, Tasks, Bugs, Features, Epics)
Repos → Git source control + branch policies + pull requests
Pipelines → CI/CD (YAML or Classic — Classic deprecating)
Test Plans→ Manual/exploratory testing + test case management
Artifacts → NuGet, npm, Maven, Python, Universal package management
GitHub Actions Workflow Events
# Common triggers:
on:
  push:                         # Code pushed to repo
    branches: [main, develop]
    paths: ['src/**']
  pull_request:                 # PR opened/updated
    branches: [main]
    types: [opened, synchronize, reopened]
  schedule:
    - cron: '0 2 * * *'         # Cron schedule
  workflow_dispatch:            # Manual UI trigger
    inputs:
      environment:
        type: choice
        options: [dev, staging, prod]
  workflow_call:                # Reusable workflow
  release:
    types: [published]          # Release events
  issues:
    types: [opened, labeled]    # Issue events
Deployment Strategies Comparison
| Strategy | Downtime | Rollback | Cost | Best For |
|---|---|---|---|---|
| Recreate | Yes | Slow | Low | Non-critical |
| Rolling | No | Slow | Low | Stateless apps |
| Blue-Green | No | Instant | 2× infra | Production w/ rollback |
| Canary | No | Fast | Low | High-traffic prod |
| Feature flags | No | Instant | Low | All (combine w/ above) |
OIDC Authentication Pattern
1. Create a Microsoft Entra ID (formerly Azure AD) app registration with a federated credential
2. Configure subject: repo:org/repo:environment:production
3. Grant role to Service Principal (e.g., Contributor on RG)
4. In GitHub: store client-id, tenant-id, subscription-id as secrets
5. In workflow: use azure/login@v2 with id-token: write permission
Workflow snippet:
permissions:
  id-token: write   # Required for OIDC
  contents: read
steps:
  - uses: azure/login@v2
    with:
      client-id: ${{ secrets.AZURE_CLIENT_ID }}
      tenant-id: ${{ secrets.AZURE_TENANT_ID }}
      subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
  - run: az group list
# No client secret stored — token issued at runtime
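The subject configured in step 2 has to match exactly the sub claim that GitHub puts in the token, and that claim depends on how the job runs. A job that references an environment gets the environment form; a job without one gets a branch, tag, or pull request form instead. The Python sketch below builds these subject strings. The oidc_subject helper is a hypothetical name used only for illustration.

```python
from typing import Optional


def oidc_subject(
    org: str,
    repo: str,
    *,
    environment: Optional[str] = None,
    branch: Optional[str] = None,
    tag: Optional[str] = None,
    pull_request: bool = False,
) -> str:
    """Build the subject string for a federated credential (exactly one selector)."""
    selected = [environment is not None, branch is not None, tag is not None, pull_request]
    if sum(selected) != 1:
        raise ValueError("choose exactly one of environment, branch, tag, pull_request")
    prefix = f"repo:{org}/{repo}:"
    if environment is not None:
        return prefix + f"environment:{environment}"
    if branch is not None:
        return prefix + f"ref:refs/heads/{branch}"
    if tag is not None:
        return prefix + f"ref:refs/tags/{tag}"
    return prefix + "pull_request"


print(oidc_subject("org", "repo", environment="production"))
print(oidc_subject("org", "repo", branch="main"))
print(oidc_subject("org", "repo", pull_request=True))
```

A mismatch here is a common cause of login failures: a federated credential scoped to environment:production will not match a job that runs on main without declaring that environment.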
Top 10 Tips
- OIDC over client secrets — Workload Identity Federation is the modern best practice. No stored credentials, short-lived tokens, no rotation. Mention this in any authentication question.
- Branch protection is non-negotiable — production branches must require PR reviews, status checks, and "do not allow bypassing" enabled even for admins. This is the compliance answer.
- Blameless post-mortems — focus on systems, not individuals. Teams that punish failures get hidden incidents. The cultural answer for any incident scenario.
- Validate → What-If → Deploy — the IaC pipeline pattern. Validate confirms syntax, What-If previews changes, Deploy applies. Always show What-If before approval.
- Know the DORA metrics — Deployment Frequency, Lead Time, Change Failure Rate, MTTR. Universal benchmarks, asked in every DevOps maturity discussion.
- GitOps for Kubernetes — Flux or ArgoCD with Git as source of truth. Pull-based reconciliation eliminates drift. The modern Kubernetes deployment answer.
- Pipeline templates enforce standards — central YAML maintained by platform team, consumed by all repos. Updates propagate automatically. The governance answer.
- Environment-scoped secrets — production secrets only accessible from production jobs. GitHub Environments and Azure DevOps environments enforce this. Critical for least-privilege.
- Pin actions to SHA: @v4 is the minimum, @SHA is the gold standard. Never use @main or @latest, which are a supply chain attack vector.
- GitHub Advanced Security: CodeQL (SAST), secret scanning, Dependabot (dependency vulnerabilities). The integrated security answer for GitHub-based pipelines.