If you run workloads in multiple Alibaba Cloud regions, as I do to serve users in both Asia and Europe, you’ve probably dealt with the pain of getting container images where they need to be, fast. I first wrote about setting up cross-region container replication with ACR Enterprise Edition back in 2020, and the platform has evolved considerably since then. This post covers how I now run a full CI/CD pipeline that builds once, replicates globally, and deploys to ACK clusters in multiple regions.
The Problem
Here’s a scenario I deal with regularly. I have applications that need to run in both eu-central-1 (Frankfurt) for European users and cn-shanghai for users in mainland China. The naive approach is:
- Build the Docker image
- Push it to ACR in Frankfurt
- Push it again to ACR in Shanghai
- Deploy to ACK in Frankfurt
- Deploy to ACK in Shanghai
This means pushing multi-hundred-megabyte images across continents twice, maintaining two registries manually, and hoping the versions stay in sync. It’s slow, error-prone, and a waste of bandwidth.
The Solution: ACR EE with Automatic Replication
Alibaba Cloud Container Registry Enterprise Edition (ACR EE) supports cross-region image replication rules. You push once to your primary region, and ACR handles replicating the image to every target region automatically. Combined with ACK’s image pull acceleration, this cuts deployment time dramatically.
graph LR
A[GitHub Push] --> B[CI Build]
B -->|docker push| C[ACR EE Frankfurt]
C -->|auto replication| D[ACR EE Shanghai]
C -->|local pull| E[ACK Frankfurt]
D -->|local pull| F[ACK Shanghai]
Step 1: Set Up ACR EE Instances
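You need an ACR EE instance in each region. I use the Advanced edition, which supports up to 50 namespaces and an image pull QPS of 1,000. With both instances created, configure public (internet) endpoint access on each one so that CI runners outside Alibaba Cloud can push to them: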
aliyun cr CreateInstanceEndpointAclPolicy \
--InstanceId cri-xxxxx \
--RegionId eu-central-1 \
--EndpointType internet \
--Enable true
aliyun cr CreateInstanceEndpointAclPolicy \
--InstanceId cri-yyyyy \
--RegionId cn-shanghai \
--EndpointType internet \
--Enable true
Step 2: Configure Replication Rules
This is where the magic happens. Create a replication rule that syncs images from Frankfurt to Shanghai whenever a new tag is pushed:
aliyun cr CreateRepoSyncRule \
--InstanceId cri-xxxxx \
--NamespaceName production \
--RepoName my-app \
--TargetRegionId cn-shanghai \
--TargetInstanceId cri-yyyyy \
--TargetNamespaceName production \
--TargetRepoName my-app \
--SyncRuleName prod-to-shanghai \
--SyncScope REPO \
--SyncTrigger PASSIVE \
--TagFilter "^v.*"
A few parameters worth noting:
- SyncScope must be either REPO (sync a specific repository) or NAMESPACE (sync all repos in a namespace).
- SyncTrigger set to PASSIVE means automatic replication on push; INITIATIVE would require manual triggering.
- TagFilter is a regex: ^v.* matches any tag starting with v, so development tags, latest, and feature-branch tags stay local. Only production releases flow across regions.
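To sanity-check a tag filter before creating the rule, the regex can be tried locally against a few sample tags. This is a minimal sketch that assumes ACR evaluates TagFilter like an ordinary extended regex (edge cases may differ); matches_filter and the sample tags are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when TAG matches FILTER.
# Assumption: the TagFilter behaves like a POSIX extended regex.
matches_filter() {
  printf '%s\n' "$2" | grep -Eq -- "$1"
}

# Try the filter from the rule above against some sample tags.
for tag in v1.2.3 v2.0.0-rc1 vnext latest dev-1234; do
  if matches_filter '^v.*' "$tag"; then
    echo "$tag: replicated"
  else
    echo "$tag: stays local"
  fi
done
```

Note that ^v.* also matches a tag such as vnext. If only semantic-version releases should replicate, a stricter pattern like ^v[0-9]+\.[0-9]+\.[0-9]+$ is one option.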
Step 3: The CI Pipeline
I use a GitHub Actions workflow that builds the image, pushes to Frankfurt, waits for replication, and then triggers deployments in both regions. Here’s the core of it:
name: Build and Deploy Multi-Region
on:
  push:
    tags: ['v*']
jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Login to ACR Frankfurt
        run: |
          docker login \
            registry.eu-central-1.aliyuncs.com \
            -u $ \
            -p $
      - name: Build and push
        run: |
          IMAGE=registry.eu-central-1.aliyuncs.com/production/my-app:$
          docker build -t $IMAGE .
          docker push $IMAGE
  wait-for-replication:
    needs: build-and-push
    runs-on: ubuntu-latest
    steps:
      - name: Wait for ACR replication to Shanghai
        run: |
          TAG=$
          for i in $(seq 1 30); do
            STATUS=$(aliyun cr GetRepoSyncTask \
              --InstanceId $ \
              --SyncTaskId $(aliyun cr ListRepoSyncTask \
                --InstanceId $ \
                --RepoName my-app \
                --Tag $TAG \
                --output cols=SyncTaskId rows=0) \
              --output cols=TaskStatus rows=0)
            if [ "$STATUS" = "SUCCESS" ]; then
              echo "Replication complete"
              exit 0
            fi
            echo "Waiting for replication... attempt $i"
            sleep 10
          done
          echo "Replication timeout"
          exit 1
  deploy-frankfurt:
    needs: wait-for-replication
    runs-on: ubuntu-latest
    steps:
      - name: Deploy to ACK Frankfurt
        run: |
          kubectl set image deployment/my-app \
            my-app=registry.eu-central-1.aliyuncs.com/production/my-app:$ \
            --context ack-frankfurt
  deploy-shanghai:
    needs: wait-for-replication
    runs-on: ubuntu-latest
    steps:
      - name: Deploy to ACK Shanghai
        run: |
          kubectl set image deployment/my-app \
            my-app=registry.cn-shanghai.aliyuncs.com/production/my-app:$ \
            --context ack-shanghai
The deploy jobs run in parallel after replication confirms success. Each ACK cluster pulls from its local ACR instance, so the image pull is fast regardless of the region.
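The wait step is easier to tune if the polling logic lives in a small function with an explicit attempt count and delay. The following is a sketch, not the workflow's actual code: wait_for_status is a hypothetical helper, and fake_sync_status stands in for the aliyun cr GetRepoSyncTask query so the example runs offline.

```shell
#!/bin/sh
# wait_for_status ATTEMPTS DELAY_SECONDS EXPECTED CMD [ARGS...]
# Runs CMD until its stdout equals EXPECTED; returns 1 after ATTEMPTS tries.
wait_for_status() {
  attempts="$1"; delay="$2"; expected="$3"
  shift 3
  i=1
  while [ "$i" -le "$attempts" ]; do
    # A failing query is treated as "not ready yet" rather than fatal.
    status="$("$@" 2>/dev/null)" || status=""
    if [ "$status" = "$expected" ]; then
      echo "Replication complete after $i attempt(s)"
      return 0
    fi
    echo "Waiting for replication... attempt $i/$attempts" >&2
    i=$((i + 1))
    if [ "$i" -le "$attempts" ]; then sleep "$delay"; fi
  done
  echo "Replication timeout" >&2
  return 1
}

# Stand-in for the real status query (assumption for illustration only):
# reports RUNNING twice, then SUCCESS. A file holds the call counter because
# the query runs inside a command substitution (a subshell).
counter_file="$(mktemp)"
echo 0 > "$counter_file"
fake_sync_status() {
  n=$(( $(cat "$counter_file") + 1 ))
  echo "$n" > "$counter_file"
  if [ "$n" -ge 3 ]; then echo SUCCESS; else echo RUNNING; fi
}

# Delay of 0 keeps the demo instant; in CI something like 60 attempts with a
# 10-second delay would give a 10-minute ceiling.
wait_for_status 60 0 SUCCESS fake_sync_status
rm -f "$counter_file"
```

In the real pipeline the last argument would be a wrapper around the aliyun CLI call that prints the task status, and the attempt count and delay set the overall timeout.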
Image Pull Acceleration on ACK
ACK supports on-demand image loading through ACR’s image acceleration feature (based on the Nydus project). Instead of downloading the entire image layer before starting the container, ACK streams only the data needed for startup. For large images, this can reduce pull times from minutes to seconds.
You enable this through the ACR EE console: go to your instance, navigate to Repositories > Image Acceleration, and enable acceleration for specific repositories. ACR will generate accelerated image versions alongside the original layers. On the ACK side, you enable the on-demand loading component (按需加载 in the Chinese-language console) in the cluster’s component management.
In my testing, a 1.2 GB Node.js application image went from 45-second cold pulls to under 8 seconds with acceleration enabled. That’s a massive improvement for autoscaling scenarios where new pods need to start ASAP (which is pretty much all of them these days when optimising for cost).
Handling China-Specific Requirements
Deploying to cn-shanghai (or any mainland China region really) comes with its own set of challenges that I’ve dealt with for years through my work with CEN and ICP:
Separate Alibaba Cloud accounts: mainland China regions require a Chinese-entity Alibaba Cloud account, separate from your international account. ACR replication works across accounts with proper RAM role configuration.
ICP filing: if your application serves web content to users in China, you need an ICP filing. This isn’t related to the container pipeline itself, but it affects whether your ACK ingress endpoints are accessible.
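Base image sources: when building images that will run in China, use Chinese mirrors for package managers. The Dockerfile can take the target region as a build argument and switch sources accordingly:
ARG REGION=international
# Debian 12+ base images keep apt sources in debian.sources, older ones in
# sources.list, so rewrite whichever file exists.
RUN if [ "$REGION" = "china" ]; then \
      for f in /etc/apt/sources.list /etc/apt/sources.list.d/debian.sources; do \
        if [ -f "$f" ]; then sed -i 's/deb.debian.org/mirrors.aliyun.com/g' "$f"; fi; \
      done; \
    fi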
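The REGION build argument still has to be set by whatever invokes docker build. One way to derive it from an Alibaba Cloud region ID is sketched below; region_build_arg is a hypothetical helper, and the mapping (cn-* regions are mainland, except cn-hongkong) is an assumption you should check against the regions you actually use.

```shell
#!/bin/sh
# Hypothetical helper: map a region ID to the REGION build argument.
# Assumption: mainland regions are the cn-* IDs other than cn-hongkong.
region_build_arg() {
  case "$1" in
    cn-hongkong) echo international ;;
    cn-*)        echo china ;;
    *)           echo international ;;
  esac
}

# Print the build command that would be run for each region.
for region in eu-central-1 cn-shanghai; do
  echo "docker build --build-arg REGION=$(region_build_arg "$region") -t my-app ."
done
```

The mirror choice only affects where packages are downloaded from during the build, so it matters for the location of the build runner rather than for where the finished image ends up.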
Security: Scanning and Signing
ACR EE includes built-in vulnerability scanning. Enable it on your namespace to automatically scan every pushed image:
aliyun cr UpdateNamespace \
--InstanceId cri-xxxxx \
--NamespaceName production \
--AutoScan true
You can also block deployments of images with critical vulnerabilities by configuring ACK’s admission controller to check scan results before allowing a pod to start. This gives you a security gate between the registry and the cluster without any third-party tools.
For image signing, ACR EE supports content trust through Notation, the Notary Project CLI that supersedes the Notary v1 tooling behind Docker Content Trust. Sign images in CI and verify signatures in ACK:
notation sign \
registry.eu-central-1.aliyuncs.com/production/my-app:v1.2.3 \
--key my-signing-key
Lessons From Running This in Production
I’ve been running variations of this pipeline for several years. Some hard-won lessons for you so you don’t make the same mistakes I did:
Don’t replicate everything. Use tag filters aggressively. In my early setup, every CI build replicated to China, burning bandwidth and money. Now only release tags replicate.
Monitor replication lag. ACR doesn’t guarantee replication time. I’ve seen it complete in 30 seconds and I’ve seen it take 5 minutes during peak hours. The polling loop in the CI pipeline (30 attempts, 10 seconds apart, so a 5-minute ceiling) handles the common case but leaves no headroom over the slowest runs, so consider a longer window and set alerts if replication consistently exceeds your SLA.
Use VPC endpoints for ACR. Within ACK, pull images through the VPC endpoint (registry-vpc.<region>.aliyuncs.com) instead of the public endpoint. It’s faster, free of bandwidth charges, and more secure.
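If manifests are written against the public hostname, the VPC form can be derived mechanically at deploy time. The sketch below follows the registry.<region> to registry-vpc.<region> naming used in this post; to_vpc_image is a hypothetical helper, and Enterprise Edition instances may expose instance-specific domains, so treat the pattern as an assumption and check the endpoint shown in your console.

```shell
#!/bin/sh
# Hypothetical helper: rewrite a public registry reference to its VPC form.
# References that do not match the public pattern are printed unchanged.
to_vpc_image() {
  printf '%s\n' "$1" |
    sed -E 's#^registry\.([a-z0-9-]+)\.aliyuncs\.com/#registry-vpc.\1.aliyuncs.com/#'
}

to_vpc_image registry.eu-central-1.aliyuncs.com/production/my-app:v1.2.3
to_vpc_image registry.cn-shanghai.aliyuncs.com/production/my-app:v1.2.3
```

The result can be passed straight to kubectl set image in the deploy jobs, so the CI push still uses the public endpoint while the clusters pull over the VPC.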
Separate namespaces for environments. I use production, staging, and development namespaces in ACR. Only production has replication rules. This keeps the pipeline clean and costs predictable.
If you’re still pushing images manually to multiple regions or running docker push twice in your CI pipeline, ACR EE’s replication rules will save you time and bandwidth. And if you’re dealing with the China mainland specifically, the cross-account replication support means you can maintain a single CI pipeline even with the separate-account requirement.
For more on how I handle container workflows on Alibaba Cloud, check my earlier posts on ACR replication, host-based routing on ACK, and dealing with large Docker images.