
How Datadog Miscalculates Containers as Hosts and How to Fix It

Datadog is a powerful monitoring and analytics platform, but its billing model, which is primarily based on the number of hosts, can sometimes lead to unexpected costs. A common issue is the miscalculation of containers as billable hosts, which can significantly increase your monthly bill. This happens when the Datadog agent, or another reporting tool, sends a container’s own hostname to Datadog, which then counts that container as a unique host.

This blog post will explore three common scenarios where this miscalculation can occur and provide actionable solutions to ensure you’re only paying for what you intend to monitor.

Understanding Datadog’s Host Billing

Datadog counts any physical or virtual OS instance that reports data under a unique hostname as a billable host. In a containerized environment, the best practice is to run a single Datadog agent per node, which then collects metrics from all containers on that node. When an agent is installed directly within each container, or when a container reports its own hostname, every container is counted as a separate host, leading to inflated bills. The key to avoiding this is proper hostname resolution: the agent needs to identify the underlying host it’s running on, not the container’s own ephemeral hostname.
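
For reference, the node-level pattern looks roughly like the following. This is a minimal, illustrative sketch of deploying the agent as a Kubernetes DaemonSet (one agent pod per node); in practice Datadog’s Helm chart or Operator is the usual installation method, and the secret name below is an assumption:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: datadog-agent
spec:
  selector:
    matchLabels:
      app: datadog-agent
  template:
    metadata:
      labels:
        app: datadog-agent
    spec:
      containers:
        - name: agent
          image: gcr.io/datadoghq/agent:7
          env:
            # API key read from a secret; "datadog-secret" is illustrative.
            - name: DD_API_KEY
              valueFrom:
                secretKeyRef:
                  name: datadog-secret
                  key: api-key

Because there is exactly one agent per node, Datadog sees one host per node, no matter how many containers are scheduled on it.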


1. Vector.dev Containers and the Hostname Bug

Vector is a popular open-source observability data pipeline. However, a bug in how older versions of the Vector Helm chart determined the hostname caused each Vector container to be registered as a separate host in Datadog.

The issue stems from calling the gethostname system call inside the container, which returns the container’s own hostname (in Kubernetes, typically the pod name or container ID) rather than the Kubernetes node’s name.

The Fix

This issue has been addressed in a pull request to the Vector project. The fix involves ensuring that the correct hostname from the Kubernetes node is passed to the Vector container.

To resolve this, you need to:

  1. Upgrade to the latest Vector Helm chart: Make sure you are using an up-to-date version of the chart so that it includes this fix along with other recent bug fixes.
  2. Set the VECTOR_HOSTNAME environment variable: In your values.yaml for the Vector Helm chart, you need to explicitly set the VECTOR_HOSTNAME environment variable to use the Kubernetes node’s name. This is done using a fieldRef to the spec.nodeName.

Here is a snippet of the required configuration in your values.yaml:

env:
  # Use the Kubernetes node's name, supplied by the downward API, as
  # Vector's hostname instead of the container's own hostname.
  - name: VECTOR_HOSTNAME
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName

With this configuration, the Vector container reports the Kubernetes node’s name as its hostname, so all Vector instances running on the same node are correctly associated with a single host in Datadog. The spec.nodeName field is a reliable source for the node’s name because the Kubernetes scheduler populates it for every pod.

2. Third-Party Agents Reporting Directly to Datadog

Another common source of miscalculation comes from third-party applications or services that have their own built-in Datadog integration. If these agents are running in containers and report directly to the Datadog API without going through a central Datadog agent on the host, they can be misidentified as unique hosts.

This often happens when the third-party agent is not configured to use the host’s hostname and instead defaults to using its container ID or some other unique identifier.

The Fix

The solution here is to reconfigure the third-party agent to correctly report the hostname. The exact method will vary depending on the specific application, but the general principle is the same: find the configuration option that controls the reported hostname and set it to the Kubernetes node’s name.

Here are some general steps you can take:

  • Consult the documentation: Check the documentation for the third-party tool’s Datadog integration for a setting related to hostname.
  • Use environment variables: Many applications can have their hostname configured via an environment variable. Similar to the Vector fix, you can use the Kubernetes downward API to pass the node name to the container as an environment variable.
  • Send data to the local Datadog agent: A more robust solution is to configure the third-party agent to send its metrics to the local Datadog agent’s DogStatsD service instead of directly to the Datadog API. The host agent is designed to handle hostname tagging correctly; a sketch of this approach follows below.
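
To make the last option concrete, here is a minimal Python sketch using the official datadog package. It assumes the node’s IP has been exposed to the container as DD_AGENT_HOST via the downward API (status.hostIP) and that the agent’s DogStatsD port (8125) is reachable from the pod; the metric name is illustrative:

import os

from datadog import initialize, statsd  # pip install datadog

# Point DogStatsD traffic at the node-local Datadog agent. DD_AGENT_HOST
# is assumed to be injected via the downward API (status.hostIP).
initialize(
    statsd_host=os.environ["DD_AGENT_HOST"],
    statsd_port=8125,
)

# The node-local agent receives this metric and applies the node's
# hostname, so the container never appears as a separate host.
statsd.increment("myapp.requests.processed")

Because hostname resolution happens in the host-level agent, none of the containers sending metrics this way ever shows up as its own billable host.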

3. Customer-Owned Agents and Custom Libraries

The same problem can occur with your own custom applications or libraries that send data to Datadog. If your code, running inside a container, calls a system function like gethostname, it will receive the container’s hostname, not the node’s. Sending this to Datadog results in the container being billed as a host, as the sketch below illustrates.
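
To make the failure mode concrete, here is a minimal Python sketch of the problematic pattern, using the datadog package (the metric name and DD_API_KEY handling are illustrative):

import os
import socket
import time

from datadog import api, initialize  # pip install datadog

initialize(api_key=os.environ["DD_API_KEY"])

# BUG: inside a container, gethostname() returns the container's own
# hostname (in Kubernetes, usually the pod name), not the node's name.
hostname = socket.gethostname()

# Datadog treats each previously unseen hostname as a new billable host,
# so every pod running this code is counted separately.
api.Metric.send(
    metric="myapp.worker.heartbeat",
    points=[(time.time(), 1)],
    host=hostname,
)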

The Fix

To prevent this, you must explicitly pass the node’s hostname to your application container and use that value when reporting data to Datadog.

You can achieve this by using the Kubernetes downward API to expose the node name as an environment variable to your pod. The downward API allows a container to consume information about itself or the cluster without needing to query the Kubernetes API server directly.

Here’s an example of how you would modify your pod’s deployment manifest:

apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
    - name: my-app-container
      image: my-app-image
      env:
        # Expose the Kubernetes node's name to the application via the
        # downward API so it can report the node, not the container,
        # as the host.
        - name: DATADOG_HOSTNAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
      # ... other container fields (ports, resources, etc.)

Then, within your application code, read the DATADOG_HOSTNAME environment variable and use its value as the hostname when sending metrics or logs to Datadog. This ensures that even your custom applications are attributed to the correct host.
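
As a minimal sketch, the corrected version of the earlier snippet reads the injected variable instead of calling gethostname (same assumptions as before: the datadog package, a DD_API_KEY variable, and an illustrative metric name):

import os
import time

from datadog import api, initialize  # pip install datadog

initialize(api_key=os.environ["DD_API_KEY"])

# DATADOG_HOSTNAME carries the node's name, injected by the downward API
# (spec.nodeName in the manifest above).
hostname = os.environ["DATADOG_HOSTNAME"]

# Reporting under the node's name keeps the container's metrics attached
# to an existing host instead of creating a new billable one.
api.Metric.send(
    metric="myapp.worker.heartbeat",
    points=[(time.time(), 1)],
    host=hostname,
)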