Automate Multicloud Security Change Cycles Without Losing Human Control
How enterprise cloud teams can systematise security change cycles across AWS, Azure, and GCP without removing the human approval gates that regulated industries genuinely require.
Tags: Security, Multicloud, AWS, Azure, GCP, Compliance, Automation, NIS2, GDPR, Governance
Most enterprise cloud teams operate across at least two public clouds. A significant number operate across three. Each cloud has its own security scanning service, its own patch management toolchain, its own config drift detection, and its own remediation API. The result is not a security programme. It is three separate security programmes loosely connected by Slack alerts and a shared spreadsheet that everyone promises to update.
The backlog accumulates fast. A vulnerability surfaces in Amazon Inspector on a Monday. By Wednesday it has moved down the priority queue because an Azure Policy compliance report flagged forty misconfigurations and GCP's Security Command Center emailed a critical finding at the same time. By Friday, nobody knows which of the three was actually remediated, which was accepted as a risk, and which is still sitting open because the right person was out of office.
This is not a people problem. It is an architecture problem. The discipline of automating security change cycles consistently across providers, without bypassing the human judgement that regulated industries genuinely require, is one of the most operationally valuable things a cloud team can build. It is also one of the least glamorous, which is probably why so few teams have done it well.
Full automation is a myth for regulated industries
Before getting into the framework, one opinionated take that the rest of this article builds on: full automation of security remediation is not a goal you should be pursuing if you operate in a regulated sector.
This is not a conservative position. It is a practical one. GDPR Article 32 requires organisations to implement appropriate technical and organisational measures and to be able to demonstrate those measures to supervisory authorities. NIS2 Article 21 requires proportionate risk-management measures, and Article 20 places explicit accountability on management bodies. Neither regulation cares whether your automation pipeline ran the remediation. They care who approved it, on what basis, and whether the decision was recorded.
When a security automation pipeline patches a critical vulnerability in production at 2 a.m. without human approval, you have fixed the vulnerability and created a compliance gap. The fix is not wrong. The missing approval record is. Regulated industries need the fix and the paper trail. Automation gives you the first. Human gates give you the second.
The framework: detect, triage, approve, remediate, verify
A workable multicloud security change cycle has five stages. Each stage has a clear owner, a clear output, and a clear handoff. The human gates live at the approve stage; they are not scattered randomly across the pipeline.
- Detect
- Triage
- Approve
- Remediate
- Verify
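Before walking through each stage, it helps to pin the cycle down as a data structure. The sketch below is illustrative rather than prescriptive: a minimal Python model in which every stage transition is timestamped and attributed, which is the property the rest of this article keeps returning to.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class Stage(Enum):
    DETECT = "detect"
    TRIAGE = "triage"
    APPROVE = "approve"
    REMEDIATE = "remediate"
    VERIFY = "verify"


@dataclass
class StageTransition:
    stage: Stage
    actor: str            # a service principal or a named human approver
    occurred_at: datetime
    output: str           # the stage's recorded output, e.g. a risk tier or run ID


@dataclass
class ChangeCycle:
    finding_id: str
    history: list[StageTransition] = field(default_factory=list)

    def advance(self, stage: Stage, actor: str, output: str) -> None:
        # Stages must be entered in order, and every transition is
        # timestamped and attributed, so this record doubles as the
        # audit chain the verify stage closes.
        if len(self.history) >= len(Stage):
            raise ValueError("cycle already complete")
        expected = list(Stage)[len(self.history)]
        if stage is not expected:
            raise ValueError(f"expected {expected.name}, got {stage.name}")
        self.history.append(
            StageTransition(stage, actor, datetime.now(timezone.utc), output)
        )
```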
Detect
Detection means continuous, cross-cloud visibility aggregated into a single signal stream. Running three separate tools and reading three separate dashboards is not detection; it is surveillance debt.
On AWS, the detection layer is AWS Config for configuration compliance and Amazon Inspector for vulnerability scanning. Config rules evaluate resource configurations against your defined standards. Inspector scans EC2 instances, container images, and Lambda functions for known CVEs. Both feed findings into AWS Security Hub.
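As a rough illustration of what consuming that layer looks like, the snippet below pulls active critical findings from Security Hub with boto3. The filter fields follow the AWS Security Finding Format; treat the exact values as a starting point to adapt rather than a recommendation.

```python
import boto3

# Security Hub aggregates Config and Inspector findings behind one API,
# which is what makes it workable as the single AWS-side source.
securityhub = boto3.client("securityhub", region_name="eu-west-1")

paginator = securityhub.get_paginator("get_findings")
pages = paginator.paginate(
    Filters={
        "SeverityLabel": [{"Value": "CRITICAL", "Comparison": "EQUALS"}],
        "RecordState": [{"Value": "ACTIVE", "Comparison": "EQUALS"}],
    }
)
for page in pages:
    for finding in page["Findings"]:
        print(finding["Id"], finding["Title"])
```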
On Azure, Azure Policy evaluates resource configurations continuously and reports compliance state. Microsoft Defender for Cloud covers vulnerability assessment for VMs, containers, and databases. Both surface in Defender for Cloud, which rolls their findings into a single secure score and compliance view.
On GCP, Security Command Center (SCC) is the single pane. It aggregates findings from Web Security Scanner, Container Threat Detection, Event Threat Detection, and third-party integrations. SCC Premium tier includes continuous compliance monitoring against frameworks including CIS, PCI-DSS, and ISO 27001.
The output of the detect stage is a normalised finding stream. Provider-native findings need to be translated into a common schema before triaging. A finding has a severity, an affected resource, an affected cloud, a finding type, and a detection timestamp. Everything else is provider-specific noise.
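A minimal version of that schema, with a normaliser for one provider, might look like the following. The field paths in from_security_hub follow the AWS Security Finding Format (ASFF), but verify them against your own payloads; equivalent mappers for Azure and GCP findings would sit alongside it.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Finding:
    """The common schema every provider-native finding is mapped onto."""
    severity: str          # normalised to CRITICAL / HIGH / MEDIUM / LOW
    resource_id: str
    cloud: str             # "aws" | "azure" | "gcp"
    finding_type: str
    detected_at: datetime


def from_security_hub(raw: dict) -> Finding:
    # Field paths follow ASFF; confirm against real payloads
    # before relying on this mapper in a pipeline.
    return Finding(
        severity=raw["Severity"]["Label"],
        resource_id=raw["Resources"][0]["Id"],
        cloud="aws",
        finding_type=raw["Types"][0],
        detected_at=datetime.fromisoformat(raw["CreatedAt"].replace("Z", "+00:00")),
    )
```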
Triage
Triage is risk scoring applied to the normalised finding stream. Not all critical-severity findings are equally urgent. A critical vulnerability in a public-facing API service in production is not the same as a critical vulnerability in a development instance with no external access and no customer data.
Effective triage enriches each finding with context: is the affected resource in production or non-production? Is it internet-facing? Does it process personal data? Is it covered by a NIS2 or GDPR scope boundary? That context drives a risk score that determines the remediation SLA and the approval tier required.
A practical three-tier classification works well: automatic-with-notification (low risk, non-production, no data exposure), change-request-required (moderate risk or production scope), and emergency-change-required (critical severity, production, data or availability impact). Only the first tier skips human approval.
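Expressed as code, the triage decision becomes a small, auditable function. The thresholds below are illustrative; calibrate them against your own risk appetite and NIS2/GDPR scope boundaries.

```python
from enum import Enum


class ApprovalTier(Enum):
    AUTO_WITH_NOTIFICATION = 1   # low risk, non-production, no data exposure
    CHANGE_REQUEST = 2           # moderate risk or production scope
    EMERGENCY_CHANGE = 3         # critical severity, production, data/availability impact


def triage(severity: str, production: bool, internet_facing: bool,
           processes_personal_data: bool) -> ApprovalTier:
    """Map a normalised finding plus its enrichment context onto a tier."""
    if severity == "CRITICAL" and production and (
            internet_facing or processes_personal_data):
        return ApprovalTier.EMERGENCY_CHANGE
    if production or severity in ("CRITICAL", "HIGH"):
        return ApprovalTier.CHANGE_REQUEST
    return ApprovalTier.AUTO_WITH_NOTIFICATION
```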
Approve
The approval stage is where most automation projects make the mistake that breaks them in regulated environments. They treat approval as a bottleneck and try to minimise it. The right frame is the opposite: approval is the control, and the goal is to make it fast enough that it does not create more risk than it removes.
The approval gate is not the bottleneck. Bad tooling is. When an approver has to navigate to three different portals, cross-reference a finding against a risk register maintained in a different system, and then post approval in a ticket comment that may or may not be read, approval takes days. When the approval workflow presents the finding, the context, the proposed remediation, and a single approve/reject action in one place, approval takes minutes.
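Whatever tooling you choose, the output of this stage should be a single durable record. A sketch of the minimum fields, assuming append-only storage so the trail cannot be silently rewritten:

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ApprovalDecision:
    """The record an auditor asks about: who decided what, when, on what basis."""
    finding_id: str
    approver: str          # a named human, never a service account
    approved: bool
    basis: str             # rationale, e.g. "matches approved runbook RB-017"
    decided_at: datetime


def record_decision(finding_id: str, approver: str,
                    approved: bool, basis: str) -> ApprovalDecision:
    # Persist alongside the finding record; the timestamp is set here
    # so the decision cannot be backdated.
    return ApprovalDecision(finding_id, approver, approved, basis,
                            datetime.now(timezone.utc))
```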
Human approval gates
When to require human approval vs. when to automate:

Require human approval:
- Policy changes of any kind: adding, modifying, or removing security policies
- Breaking configuration changes that affect network topology or access controls
- Any remediation that touches production workloads carrying personal data
- Any exception or risk acceptance that deviates from the approved runbook

Safe to automate:
- OS patch application to pre-approved patch tiers on non-production instances
- Automated remediation of findings that match an approved runbook without deviation
- Validation and verification runs that collect evidence without making changes
- Tag enforcement and cost allocation corrections on non-sensitive resources
The boundary between automated and gated is not static. As your runbook library matures and your team builds confidence in specific remediation patterns, more actions can be safely automated. Start conservatively and expand the automated tier as evidence accumulates.
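One way to encode that boundary is a gate function with an explicit allowlist of runbooks cleared for unattended execution. The runbook names below are hypothetical; the point is that growing the allowlist is itself a gated, evidenced change.

```python
# Runbooks cleared for unattended execution. Growing this set is itself
# a gated change: each entry needs evidence of clean runs under approval.
# The names are hypothetical placeholders.
AUTO_APPROVED_RUNBOOKS: set[str] = {
    "enable-s3-versioning",
    "apply-patch-baseline-nonprod",
    "enforce-required-tags",
}


def requires_human_gate(runbook_id: str, production: bool,
                        touches_personal_data: bool,
                        deviates_from_runbook: bool) -> bool:
    """Conservative default: gate unless every automation condition holds."""
    if production or touches_personal_data or deviates_from_runbook:
        return True
    return runbook_id not in AUTO_APPROVED_RUNBOOKS
```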
Remediate
Automated remediation executes against approved findings using provider-native tooling wherever possible.
On AWS, Systems Manager (SSM) Automation runbooks handle OS patching at scale. SSM Patch Manager lets you define patch baselines, classify patches by severity, and schedule patching windows. SSM documents can remediate specific Config findings automatically: enabling S3 bucket versioning, applying S3 Block Public Access, rotating IAM keys. The SSM execution log is your audit trail.
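For illustration, triggering an SSM Automation runbook from Python takes a handful of lines. The document name and parameters here are examples; confirm them against the runbooks actually available in your account before wiring this into a pipeline.

```python
import boto3

ssm = boto3.client("ssm", region_name="eu-west-1")

# Start an approved automation runbook against the affected resource.
# Document name and parameters are placeholders for whichever runbook
# the approval tied to this finding.
response = ssm.start_automation_execution(
    DocumentName="AWS-EnableS3BucketEncryption",
    Parameters={
        "BucketName": ["example-bucket"],
        "SSEAlgorithm": ["AES256"],
    },
)

# Store the execution ID with the finding: it links the remediation
# back to the approval and forward to the verification evidence.
print(response["AutomationExecutionId"])
```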
On Azure, Azure Update Manager handles OS patching for Azure VMs and Arc-enabled hybrid machines. Azure Policy remediation tasks handle configuration drift: deploying missing diagnostic settings, enabling encryption, applying tags. Remediation tasks are logged in the Azure Activity Log with the identity that triggered them.
On GCP, OS Config handles patch management for Compute Engine VMs. Security Command Center can trigger remediation via Cloud Functions or Pub/Sub integrations. For configuration drift, Organization Policy Service enforces constraints at the organisation, folder, or project level, rejecting non-compliant changes at request time rather than remediating them after the fact.
All three providers expose remediation actions via API, which means they can be orchestrated from a single control plane. A Step Functions state machine (AWS), a Logic App (Azure), or a Cloud Workflows execution (GCP) can drive cross-cloud remediation sequences triggered by approved findings.
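If you prefer to keep the orchestration in your own code rather than a provider workflow service, the dispatch layer itself is small. A sketch, with the provider handlers left as stubs over boto3, the Azure SDK, and the Google Cloud client libraries:

```python
from typing import Callable

# Provider handlers are stubs here; in practice each wraps the native
# remediation API and returns an execution ID for the audit chain.
def remediate_aws(finding_id: str) -> str: ...
def remediate_azure(finding_id: str) -> str: ...
def remediate_gcp(finding_id: str) -> str: ...

HANDLERS: dict[str, Callable[[str], str]] = {
    "aws": remediate_aws,
    "azure": remediate_azure,
    "gcp": remediate_gcp,
}


def dispatch(cloud: str, finding_id: str) -> str:
    """Route an approved finding to its provider's remediation handler."""
    try:
        handler = HANDLERS[cloud]
    except KeyError:
        raise ValueError(f"no remediation handler registered for {cloud!r}")
    return handler(finding_id)
```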
Verify
Verification closes the loop. After remediation executes, an automated check confirms that the finding is resolved. The verification result is attached to the original finding record, creating a complete audit chain: detected at time X, approved at time Y by person Z, remediated at time W, verified at time V.
This chain is what your compliance team needs when an auditor asks for evidence. It is also what your security team needs when the same vulnerability surfaces again six months later and someone asks whether it was actually fixed the first time.
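A verification step that re-runs the original detection check and emits an evidence record might look like this sketch; the recheck callable stands in for whatever re-evaluates the finding on each provider.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable


@dataclass(frozen=True)
class VerificationResult:
    finding_id: str
    resolved: bool
    verified_at: datetime
    evidence: str          # e.g. the compliance state or re-scan output


def verify(finding_id: str, recheck: Callable[[str], bool],
           evidence: str) -> VerificationResult:
    # `recheck` re-runs the original detection: an AWS Config rule
    # re-evaluation, an Azure Policy compliance scan, or an SCC query.
    # Attaching the result to the finding closes the audit chain.
    return VerificationResult(finding_id, recheck(finding_id),
                              datetime.now(timezone.utc), evidence)
```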
How CloudBoostUP helps EU firms build this discipline
For EU-regulated organisations, the compliance angle is not optional. GDPR and NIS2 both require documented, auditable controls. Supervisory authorities increasingly expect organisations to demonstrate not just that they have controls but that those controls are consistently applied across their entire cloud footprint, including multi-provider environments.
We help cloud teams in EU-regulated sectors build the detect-triage-approve-remediate-verify cycle as a durable engineering discipline rather than a one-time project. That means designing the approval workflow to produce records that satisfy NIS2 Article 21 requirements, structuring remediation runbooks so they can be evidenced under GDPR Article 32, and building the cross-cloud visibility layer so that compliance reporting does not require manual aggregation the week before an audit.
If your team is carrying a growing multicloud security backlog and you are not sure where the process is breaking down, that is usually the right starting point. Map the current cycle end-to-end, find where findings go to die, and fix that stage first. The rest of the automation follows.
Need this for your project?
We cover this exact scenario. Strategy, delivery, or both. See the use case or get in touch.