With the growing adoption of cloud-native technologies, containers and Kubernetes have become the backbone of modern application deployments. Microservices-based container workloads are easier to scale, more portable, and more resource efficient. With Kubernetes managing these workloads, organizations can deploy advanced AI and machine learning applications across multiple compute resources, dramatically increasing operational productivity at scale. This evolution of application architecture brings a strong need for built-in granular security controls and deep observability; however, the ephemeral nature of containers makes both challenging. This is where Azure Advanced Container Networking Services comes in.
We are excited to announce the general availability of Advanced Container Networking Services for Azure Kubernetes Service (AKS), a cloud-native, purpose-built solution to increase security and observability for Kubernetes and container environments. Advanced Container Networking Services is focused on delivering a seamless and integrated experience that enables you to maintain a robust security posture and gain deep insight into network traffic and application performance. This ensures that your containerized applications are not only secure, but also meet your performance and reliability goals, allowing you to manage and scale your infrastructure with confidence.
Let’s take a look at the container network security and observability features of this release.
Container network observability
While Kubernetes excels at orchestrating and managing containerized workloads, one fundamental challenge remains: how do we gain meaningful insight into how these services interact? Monitoring microservice network traffic, tracking performance, and understanding dependencies between components are critical to ensuring both reliability and security. Without this level of visibility, performance issues, outages, and even potential security risks may go undetected.
To really understand how well your microservices are performing, you need more than basic cluster-level metrics and virtual network logs. Comprehensive network observability requires detailed network metrics at the node, pod, and Domain Name System (DNS) levels. These metrics allow teams to identify bottlenecks, troubleshoot issues, and monitor the health of each service in the cluster.
To address these challenges, Advanced Container Networking Services provides powerful observability features tailored specifically for Kubernetes and container environments. It delivers detailed, real-time metrics at the node, pod, Transmission Control Protocol (TCP), and DNS levels, ensuring that no aspect of your network is overlooked. These metrics are critical to identifying performance bottlenecks and resolving network issues before they impact workloads.
Advanced Container Networking Services network monitoring features include:
- Node-level metrics: These metrics provide an overview of traffic volume, dropped packets, number of connections, and more, broken down per node. Metrics are stored in Prometheus format and can be viewed in Grafana.
- Hubble metrics, DNS, and pod-level metrics: Advanced Container Networking Services uses Hubble to collect metrics that include Kubernetes context, such as source and destination pod name and namespace information, allowing network-related issues to be pinpointed at a more granular level. Metrics cover traffic volume, dropped packets, TCP resets, L4/L7 packet flows, and more. There are also DNS metrics that cover DNS errors and unanswered DNS requests.
- Hubble flow logs: Flow logs provide visibility into workload communication and help you understand how microservices communicate with each other. Flow logs also help answer questions such as: Did the server receive the client’s request? What is the round-trip latency between the client request and the server response?
- Service dependency map: Traffic flows can also be visualized using the Hubble UI, which builds a service connection graph from the flow logs and displays the flow logs for the selected namespace.
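To make the idea of a service dependency map more concrete, here is a minimal Python sketch of how flow records could be aggregated into a connection graph. It is only an illustration, not how the Hubble UI is implemented: the record shape (`src`, `dst`, `verdict`) and the sample workloads are simplified assumptions, and real Hubble flow logs carry many more fields.

```python
from collections import Counter

# Hypothetical, simplified flow records. The keys and values below are
# assumptions chosen for illustration, not the real Hubble flow log schema.
FLOWS = [
    {"src": "shop/frontend", "dst": "shop/cart", "verdict": "FORWARDED"},
    {"src": "shop/frontend", "dst": "shop/cart", "verdict": "FORWARDED"},
    {"src": "shop/cart", "dst": "shop/redis", "verdict": "FORWARDED"},
    {"src": "shop/cart", "dst": "shop/payments", "verdict": "DROPPED"},
]


def build_dependency_map(flows):
    """Count flows per verdict for every (source, destination) edge."""
    edges = {}
    for flow in flows:
        key = (flow["src"], flow["dst"])
        edges.setdefault(key, Counter())[flow["verdict"]] += 1
    return edges


def edges_with_drops(edges):
    """Return edges that saw at least one dropped flow, sorted for stable output."""
    return sorted(edge for edge, counts in edges.items() if counts["DROPPED"] > 0)


if __name__ == "__main__":
    dependency_map = build_dependency_map(FLOWS)
    for (src, dst), counts in sorted(dependency_map.items()):
        print(f"{src} -> {dst}: {dict(counts)}")
    print("edges with drops:", edges_with_drops(dependency_map))
```

Each edge in the resulting map corresponds to one arrow in a service connection graph, and the per-verdict counts show where traffic is being dropped rather than forwarded.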
Container network security
One of the key challenges of container security stems from the fact that Kubernetes allows all communication between endpoints by default, which introduces significant security risks. Advanced Container Networking Services with Azure CNI Powered by Cilium enables fine-grained network policies based on Kubernetes identities, so that only permitted traffic is allowed and endpoints stay secure.
While traditional network policies rely on IP-based rules to control external traffic, external services frequently change their IP addresses. This makes it difficult to enforce consistent security for workloads communicating outside the cluster. With Fully Qualified Domain Name (FQDN) filtering and the security agent DNS proxy in Advanced Container Networking Services, network policies can be insulated from IP address changes.
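As a rough illustration of what an FQDN-based rule can look like, the Python sketch below builds a CiliumNetworkPolicy manifest as a plain dictionary and prints it as JSON. The field names follow the upstream Cilium policy schema (`toFQDNs`, `matchName`, and a `dns` rule on port 53). The policy name, namespace, `app` label, and domain are hypothetical, and the DNS selector assumes that cluster DNS pods run in `kube-system` with the `k8s-app=kube-dns` label, so treat this as a starting point to check against the product documentation rather than a definitive policy.

```python
import json


def fqdn_egress_policy(name, namespace, app_label, allowed_fqdns):
    """Build a CiliumNetworkPolicy dict that allows egress only to the given FQDNs."""
    return {
        "apiVersion": "cilium.io/v2",
        "kind": "CiliumNetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Pods selected by this (hypothetical) label are subject to the policy.
            "endpointSelector": {"matchLabels": {"app": app_label}},
            "egress": [
                {
                    # Allow DNS lookups to cluster DNS. The selector is an
                    # assumption about how DNS pods are labelled in the cluster.
                    "toEndpoints": [
                        {
                            "matchLabels": {
                                "k8s:io.kubernetes.pod.namespace": "kube-system",
                                "k8s:k8s-app": "kube-dns",
                            }
                        }
                    ],
                    "toPorts": [
                        {
                            "ports": [{"port": "53", "protocol": "ANY"}],
                            # The dns rule lets lookups be observed so that
                            # names can be mapped to IP addresses.
                            "rules": {"dns": [{"matchPattern": "*"}]},
                        }
                    ],
                },
                {
                    # Allow traffic to the listed names, regardless of their current IPs.
                    "toFQDNs": [{"matchName": fqdn} for fqdn in allowed_fqdns],
                },
            ],
        },
    }


if __name__ == "__main__":
    policy = fqdn_egress_policy(
        "allow-api-egress", "demo", "checkout", ["api.example.com"]
    )
    print(json.dumps(policy, indent=2))
```

Because the rule names a domain rather than an address, the policy itself does not need to change when the external service moves to a new IP.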
In the next section, we’ll take a deeper look at how FQDN filtering can change the way you provision Kubernetes networks.
FQDN filtering and security agent DNS proxy
The solution consists of two main components: the Cilium Agent and the security agent DNS proxy. Together, they seamlessly integrate FQDN filtering into Kubernetes clusters, enabling more efficient and manageable control over external communications.
Cilium Agent
The Cilium Agent is a critical networking component that runs as a DaemonSet in Azure CNI Powered by Cilium clusters. The agent handles networking, load balancing, and network policies for the pods in the cluster. For pods with enforced FQDN policies, the Cilium Agent redirects DNS packets to the DNS proxy for name resolution and updates the network policy using the FQDN:IP mappings obtained from the DNS proxy.
Security Agent DNS Proxy
The DNS proxy included in the security agent runs as a DaemonSet in Azure CNI Powered by Cilium clusters with Advanced Container Networking Services enabled. It handles DNS resolution for pods and, upon successful DNS resolution, updates the Cilium Agent with the FQDN-to-IP mappings.
Running the security agent DNS proxy in a separate DaemonSet (acns-security-agent) alongside the Cilium Agent ensures that pods can continue to resolve DNS even if the Cilium Agent is down or undergoing an upgrade. Thanks to the Kubernetes maxSurge upgrade setting, the DNS proxy itself remains operational during upgrades. This design ensures that network connectivity for essential customer workloads is not disrupted by DNS resolution issues.
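The interplay between the two components can be sketched with a small toy model in Python. This is not Cilium or Advanced Container Networking Services code: `PolicyEngine` and `DnsProxy` are hypothetical stand-ins, the domain and IP addresses come from documentation ranges, and the model replaces a name's IPs immediately, whereas a real implementation would typically honor DNS TTLs.

```python
class PolicyEngine:
    """Toy stand-in for the agent's FQDN policy state (hypothetical, not real Cilium code)."""

    def __init__(self, allowed_fqdns):
        self.allowed_fqdns = set(allowed_fqdns)
        self.allowed_ips = {}  # ip -> fqdn, learned from the DNS proxy

    def update_mapping(self, fqdn, ips):
        """Record FQDN:IP mappings reported by the DNS proxy for allowed names only."""
        if fqdn not in self.allowed_fqdns:
            return
        # Simplification: replace this name's previous IPs in one step.
        self.allowed_ips = {
            ip: name for ip, name in self.allowed_ips.items() if name != fqdn
        }
        for ip in ips:
            self.allowed_ips[ip] = fqdn

    def allows(self, dst_ip):
        """Egress is permitted only to IPs learned for an allowed FQDN."""
        return dst_ip in self.allowed_ips


class DnsProxy:
    """Toy stand-in for the DNS proxy: resolves names and reports mappings."""

    def __init__(self, resolver, engine):
        self.resolver = resolver  # callable: fqdn -> list of IP strings
        self.engine = engine

    def resolve(self, fqdn):
        ips = self.resolver(fqdn)
        if ips:  # only successful resolutions update the policy engine
            self.engine.update_mapping(fqdn, ips)
        return ips


if __name__ == "__main__":
    records = {"api.example.com": ["203.0.113.10"]}  # hypothetical DNS data
    engine = PolicyEngine(["api.example.com"])
    proxy = DnsProxy(lambda name: records.get(name, []), engine)

    proxy.resolve("api.example.com")
    print("first IP allowed:", engine.allows("203.0.113.10"))

    records["api.example.com"] = ["203.0.113.20"]  # the service changes its IP
    proxy.resolve("api.example.com")
    print("new IP allowed:", engine.allows("203.0.113.20"))
```

The point of the model is that the allow list is written in terms of names, while the set of permitted IP addresses is derived from DNS answers and follows them as they change.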
Customer onboarding and scenarios
Advanced Container Networking Services has been adopted by many internal and external customers, even during its preview, for the following use cases:
- Troubleshoot application degradation and DNS resolution timeouts using DNS errors and metrics.
- Applications and pods occasionally lose connections to other pods or external endpoints. Pod-level metrics give cluster administrators dropped packet counts, TCP errors, and retransmissions, helping them debug connectivity issues more quickly.
- Flow logs for debugging network connectivity issues.
- Set Cilium network policies using FQDNs instead of IP addresses to secure the cluster and make policies resilient to IP address changes, which greatly simplifies policy management.
“At H&M Group, platform engineering is a core practice supported by our cloud-based internal development platform that enables autonomous product teams to create and host microservices. Deep network observability and robust security are key to our success, and Advanced Container Networking Services features help us achieve this. Real-time flow logs speed up our ability to troubleshoot connectivity issues, while FQDN filtering ensures secure communication with trusted external domains.” — Magnus Welson, Technical Manager, Container Platform, H&M Group
“The advanced observability offered by Advanced Container Networking Services helped us tremendously when we investigated a high-impact issue in one of the Japan Tobacco International AKS clusters. Thanks to the information provided by Advanced Container Networking Services, we were able to pinpoint the problem in DNS performance and then confirm that the remediation we applied was successful.” — Andrew Wytyczak-Partyka, CodeWave CEO, and Alexandru Popovici, DevOps & Security Manager, JT International
“At Ferrovial, on our enterprise Kubernetes platform (called Kubecore), we use Advanced Container Networking Services to debug connectivity issues in our applications using real-time network flow logs, which give us full details. Additionally, DNS errors and metrics available at the workload level give us deep network visibility to more quickly troubleshoot application degradation issues.” — Victor Fernandez, Chief Cloud Architect, Ferrovial
Conclusion
As you continue your journey in the cloud-native space, the importance of integrating security and observability into every layer of your infrastructure cannot be overstated. With the right tools, you can move faster, innovate more, and do it with confidence that your workloads are visible and protected.