SNMP Monitoring With Datadog

Datadog lets us quickly pull application and server performance metrics and route alerts and reports to various integrations. That has made it a valuable tool for our dev team to track application issues, and for our ops team to track server and database issues.

While browsing the various Datadog integrations, I found one for SNMP. I had planned on using Cacti to graph internal equipment, but having a single looking glass was ideal.

So I made some pretty graphs.

CPU Usage

VLAN traffic usage.

Datadog charges per host, per month. Luckily, for SNMP, we can monitor an entire fleet of SNMP devices from a single host.

To start, I created a Linux host on my internal network, and configured the Datadog agent with the defaults.

The device I chose for this example is a Cisco ASA, from which I pull basic operational stats (CPU, RAM) and per-VLAN traffic counters. The hunt for the OIDs was grueling. I ended up finding a list relevant to my device somewhere in the mammoth Cisco knowledge base. Using snmpwalk, I was able to verify that an OID exists, and poll its value to make sure it was the correct OID.

This command will output the current inbound byte counter (ifInOctets) for our OUTSIDE VLAN, which is our WAN edge interface. Your OID may vary, depending on the device, interface speed, number of interfaces, etc.

snmpwalk -c password -v 2c 172.16.0.1 "1.3.6.1.2.1.2.2.1.10.21"
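The last number in that OID (21 here) is the interface's ifIndex, which is why it differs from device to device. Here is a rough sketch of two helpers for this stage: one lists the index of every interface by walking the standard ifDescr column, the other checks a batch of OIDs in one go. It assumes the net-snmp command line tools are installed and SNMP v2c; the function names list_interfaces and check_oids are made up for illustration, and the sed pattern assumes snmpwalk prints the descriptions as "= STRING: ..." values.

```shell
#!/bin/sh
# Hypothetical helpers wrapping net-snmp's snmpwalk and snmpget (SNMP v2c).

IF_DESCR=1.3.6.1.2.1.2.2.1.2   # IF-MIB ifDescr column, indexed by ifIndex

# list_interfaces HOST COMMUNITY
# Prints "ifIndex description" for every row of the ifDescr table.
list_interfaces() {
    # -On forces numeric OIDs so the index can be cut off the end.
    snmpwalk -v 2c -c "$2" -On "$1" "$IF_DESCR" |
        sed -n 's/^\.1\.3\.6\.1\.2\.1\.2\.2\.1\.2\.\([0-9][0-9]*\) = STRING: \(.*\)$/\1 \2/p'
}

# check_oids HOST COMMUNITY OID...
# Polls each OID with snmpget and reports whether a value came back.
check_oids() {
    host=$1
    community=$2
    shift 2
    for oid in "$@"; do
        # With v2c a missing OID still exits 0 but answers "No Such Object"
        # or "No Such Instance", so the reply text is checked as well.
        if result=$(snmpget -v 2c -c "$community" "$host" "$oid" 2>&1) &&
           ! printf '%s\n' "$result" | grep -q 'No Such'; then
            printf 'OK    %s  %s\n' "$oid" "$result"
        else
            printf 'FAIL  %s  %s\n' "$oid" "$result"
        fi
    done
}

# Example, using the host and community string from the command above:
# list_interfaces 172.16.0.1 password
# check_oids 172.16.0.1 password 1.3.6.1.2.1.2.2.1.10.21 1.3.6.1.2.1.2.2.1.10.16
```

Note that check_oids uses snmpget, so it expects full instance OIDs (with the trailing index), not table prefixes.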

After I’d finished compiling a list of OIDs to be used, I began adding them to /etc/dd-agent/conf.d/snmp.yaml as listed below. You can add multiple devices by adding a new section starting with - ip_address:

Most of these OIDs should work on most ASAs, but you will want to adjust the interface names to reflect your own internal VLAN assignments.

init_config:

instances:
   # SNMP v1-v2 configuration
   - ip_address: 172.16.0.1
     port: 161
     community_string: password
     snmp_version: 2 # Only required for snmp v1, will default to 2
     timeout: 1 # second, by default
     retries: 5

     metrics:
       - OID: 1.3.6.1.2.1.2.2.1.10.21
         name: OUTSIDE_Bytes
       - OID: 1.3.6.1.2.1.2.2.1.10.16
         name: COMPUTE_Bytes
       - OID: 1.3.6.1.2.1.2.2.1.10.17
         name: GUEST_Bytes
       - OID: 1.3.6.1.2.1.2.2.1.10.20
         name: INFERNO_Bytes
       - OID: 1.3.6.1.2.1.2.2.1.10.19
       - OID: 1.3.6.1.4.1.9.9.147.1.2.2.2.1.5.40.6
         name: ActiveConnections
       - OID: 1.3.6.1.4.1.9.9.171.1.3.1.1.0
         name: IPSecTunnels
       - OID: 1.3.6.1.2.1.2.2.1.14.16
         name: OUTSIDE_Errors
       - OID: 1.3.6.1.2.1.2.2.1.10.
         name: ifInOctets
       - OID: 1.3.6.1.2.1.2.2.1.13.
         name: ifInDiscards
       - OID: 1.3.6.1.2.1.2.2.1.14.
         name: ifInErrors
       - OID: 1.3.6.1.2.1.2.2.1.16.
         name: ifOutOctets
       - OID: 1.3.6.1.2.1.2.2.1.19.
         name: ifOutDiscards
       - OID: 1.3.6.1.2.1.2.2.1.20.
         name: ifOutErrors
       - OID: 1.3.6.1.4.1.9.9.48.1.1.1.2.
         name: ciscoMemoryPoolName
       - OID: 1.3.6.1.4.1.9.9.48.1.1.1.3.
         name: ciscoMemoryPoolAlternate
       - OID: 1.3.6.1.4.1.9.9.48.1.1.1.4.
         name: ciscoMemoryPoolValid
       - OID: 1.3.6.1.4.1.9.9.48.1.1.1.5.
         name: ciscoMemoryPoolUsed
       - OID: 1.3.6.1.4.1.9.9.48.1.1.1.6.
         name: ciscoMemoryPoolFree
       - OID: 1.3.6.1.4.1.9.9.48.1.1.1.7.
         name: ciscoMemoryPoolLargestFree
       - OID: 1.3.6.1.4.1.9.9.109.1.1.1.1.7.
         name: cpmCPUTotal1minRev
       - OID: 1.3.6.1.4.1.9.9.109.1.1.1.1.8.
         name: cpmCPUTotal5minRev
       - OID: 1.3.6.1.4.1.9.9.171.1.3.1.16.
         name: cipSecGlobalOutOctets
       - OID: 1.3.6.1.4.1.9.9.392.1.3.26.
         name: crasIPSecNumSessions
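The per-VLAN byte counters in the config above differ only by interface index and name, so they can be generated instead of typed by hand. A minimal sketch, assuming you keep your own list of "ifIndex NAME" pairs; the function name emit_vlan_metrics is hypothetical, and the output uses the same indentation and _Bytes naming as the listing above.

```shell
#!/bin/sh
# emit_vlan_metrics: hypothetical helper that reads "ifIndex NAME" pairs on
# stdin and prints matching ifInOctets entries for the metrics: list.

IF_IN_OCTETS=1.3.6.1.2.1.2.2.1.10   # IF-MIB ifInOctets column

emit_vlan_metrics() {
    while read -r index name; do
        # Skip blank lines and lines starting with '#'.
        case "$index" in ''|'#'*) continue ;; esac
        printf '       - OID: %s.%s\n' "$IF_IN_OCTETS" "$index"
        printf '         name: %s_Bytes\n' "$name"
    done
}

# Example, with indexes and names taken from the config above:
# emit_vlan_metrics <<'EOF'
# 21 OUTSIDE
# 16 COMPUTE
# EOF
```

Paste the output under metrics: for the device in question, then adjust the names to match your own VLAN assignments.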

Keep in mind that you can poll any SNMP-capable device: access points, switches, servers, telephones, etc.