Network security for microservices with eBPF

Several open-source Kubernetes tools are already using eBPF. Mainly related to networking, monitoring, and security.
The intention of this post is not to provide complete coverage of all eBPF aspects, but rather tries to be a informational starting point guide, from the understanding of Linux kernel BPF concept, through the advantages and features that brings to microservices environments, to some known tools that currently make use of it, such Cilium or Weave.

Understanding eBPF

Berkely Packet Filters, short BPF, is an instruction set architecture that was first introduced by Steven McCanne and Van Jacobso in 1992, as a generic packet filtering solution for applications such as tcpdump, that was the first use case, and is long present in Linux Kernels.

BPF can be regarded as a minimalistic “virtual” machine construct. The machine it abstracts has only a few registers, stack space, an implicit program counter and a helper function concept that allows for side effects with the rest of the kernel.

This in-Kernel VM translates bpf byte-code to the underlying architecture and provides packet filtering.
Previous to this, packets needed to be copied to user space in order to be filtered, low efficiency and very expensive in resources.

A Verifier checks the program does not generate a loop and guarantees that the program comes to a halt.

First BPF use case: tcpdump

In recent years, the Linux community replaced the nowadays referred to as classic BPF (cBPF) interpreter inside the kernel, which was limited to packet filtering and monitoring, with a new instruction set architecture called eBPF.

Unofficial eBPF mascot

Extended BPF brings more flexibility and programmability aspects and new use cases along with it such as tracing, bfp system call externally usable, secure access to kernel memory or faster interpretation, Just-In-Time (JIT) compiler has been upgraded too to translate eBPF for running programs with native performance.

Also the ability to attach bpf programs to other kernel objects (cBPF could only be attached to sockets for socket filtering). Some supported attach points are: Kprobes, Tracepoints, Network schedules, XDP (eXpress Data Path). This last one, together with shared data-structures (maps) can be used to communicate user space and kernel or sharing information between different BPF programs, a relevant use case for microservices.

Creating a BPF program

One important aspect of eBPF is the possibility of implementing programs from high-level languages such C. LLVM has an eBPF back end for emitting ELF files that contain eBPF instructions, front ends like Clang can be used to craft programs.

After a backend conversion to byte-code, bpf programs are loaded with bpf() system call and validated for safety. JIT compile the code to the CPU architecture and the programs are attached to kernel objects, being executed when events happen on those objects. For example when a network interface emits a packet.

Writing eBPF program example

Policies dictated by BPF programs can be enforced even previous network stack is processed, this achieves the best possible packet processing performance since cannot get processed at an earlier point in software.

Already thinking in writing your own eBPF programs? These useful resources can get you to it:

Container security policies with IPTables

Historically containers runtimes such Docker, apply security policies and NAT rules per-container level by configuring IPTables rules in the docker hosts.

Notice that 5 hops of calls take place:

  1. connect() system call
  2. construct the packet
  3. forward packet through vETH pair
  4. apply iptables in host
  5. drop or forward
Packet sent from container with IPTables policies

With iptables, policies can be applied for instance, but only based on layer 3 and layer 4 parameters. The construction of the whole packet is needed. And also forwarding is involved in order to reach to a decision.

Container security policies with eBPF

Without the limitations of using iptables, eBPF policies can be applied on the system call, before entering the stack or construct the packet.
As eBPF attaches to the container network namespace, all the calls are intercepted and filtered on the spot.

Also, it allows to apply security policies based on application level verbs. For each microservice we can work not only wit L3 and L4 policies, also L7 such REST GET/POST/PUT/DELETE or specific paths /service1, /restricted.

Packet sent from container with eBPF policies

eBPF is replacing iptables

Learn from the hand of a linux kernel contributor why is the kernel community replacing iptables, problems face by Kubernetes kube-proxy or why the use of polices based on IP addresses and ports in the world of containers in which the IPs can change within seconds is not the right approach.

There are several open source Kubernetes tools benefiting from the high-performance and low-latency provided by eBPF. Specially in monitoring, security and networking fields.

Cilium: dynamic network control and visibility

Cilium networking project makes heavy use of eBPF to route and filter network traffic for container-based systems. It can dynamically generate and apply rules without making changes to the kernel itself.

Cillium example scenario

Example of L3/L4 policy that would enable app1 accessible only from app2 but not from app3, and only if using port 80.

endpointSelector: {matchLabels:{id:app1}},
ports:[{ports:80, protocol:tcp}]

We can also apply tighter security and use policies on the call level, for example make calls to /public path available but not to /restricted path.

This is how Cillium project works

The Agent runs on each host, translating network policy definitions to BPF programs, instead of managing iptables. These programs are loaded into the kernel and attached to the container’s virtual ethernet device. When they are executed, rules applied on each packet that is sent or received.

Cillium project

Because BPF runs inside the Linux kernel, Cilium security policies can be applied and updated without any changes to the application code or container configuration.

Read more about Cillium:

How Cillium enhances Istio

There are multiple levels of integration between Cilium and Istio that make sense for both projects.

Improving Istio datapath performance and latency

The following image from a last year post in Cillium Blog, shows latency measurements for most common high performing proxies present in microservices environments.

Common high performing proxies latency comparative

How is it possible that no proxy at all is not faster than using an in-kernel proxy? The difference is thus the cost of the TCP/IP stack in the Linux kernel.
With socket redirect, the in-kernel proxy is capable of having two pods talk to each other directly from socket to socket without creating a single TCP packet. This is very similar to having two processes talk to each other using a UNIX domain socket. The difference is that the applications can remain unchanged while using standard TCP sockets.

How else can Istio benefit from Cillium?

While the key aspect in which Istio can enrich from Cillium is the difference in datapath performance and latency, there are other aspects to consider.

The post Istio 1.0: How Cilium enhances Istio with socket-aware BPF programs, go into some of the details on how Cilium Increase Istio Security, Enable Istio for external services, and the already mentioned improve Performance.

A few more BPF use cases

Leave a Reply

Your email address will not be published. Required fields are marked *