I have an interesting problem with Packetbeat. Packetbeat is installed on a Debian 10 system. It is the latest version of Packetbeat (installed fresh this week from the Elastic download area) and is sending data to Elasticsearch v7.7, also installed on a Debian 10 system.
I am seeing the DNS data in the Elastic logs (when viewing them in the Kibana-->Logs GUI). However, I am also seeing additional DNS packets in the log that do not appear in a packet capture (tcpdump) taken on the same system that Packetbeat is running on.
Here is the packet analyzer showing DNS calls to/from a client (10.5.52.47). The Wireshark capture filter is set to 'port 53' and the display filter is set to 'ip.addr==10.5.52.47'. It is running on the same system as Packetbeat (for the purposes of troubleshooting this issue). Wireshark screenshot:
1552 2020-06-04 20:31:34.297973 10.5.52.47 10.1.3.200 52874 53 DNS 93 Standard query 0x95f7 SRV
1553 2020-06-04 20:31:34.298242 10.1.3.200 10.5.52.47 53 52874 DNS 165 Standard query response 0x95f7 No such name SRV
1862 2020-06-04 20:32:53.002439 10.5.52.47 10.1.3.200 59308 53 DNS 90 Standard query 0xd67f SRV
1863 2020-06-04 20:32:53.002626 10.1.3.200 10.5.52.47 53 59308 DNS 162 Standard query response 0xd67f No such name SRV
1864 2020-06-04 20:32:53.004126 10.1.3.200 10.5.52.47 64594 53 DNS 84 Standard query 0xaaaa A
1867 2020-06-04 20:32:53.516716 10.1.3.200 10.5.52.47 64594 53 DNS 84 Standard query 0xaaaa A
2731 2020-06-04 20:36:34.314959 10.5.52.47 10.1.3.200 53912 53 DNS 93 Standard query 0x2631 SRV
2732 2020-06-04 20:36:34.315058 10.1.3.200 10.5.52.47 53 53912 DNS 165 Standard query response 0x2631 No such name SRV
I removed the actual DNS query info from these packets as it is not pertinent to this topic. From the Wireshark output, you can see a DNS query at 20:32:53 from 10.5.52.47 to the DNS server 10.1.3.200. The server responds to this query in the next packet. There are also two more packets from the server within that same second.
The next DNS query from client 10.5.52.47 occurs at 20:36:34, and it also gets an immediate response from the server.
This differs from what Packetbeat sends, as seen in the Kibana-->Logs output. The Kibana logs show the following: Screenshot of the Kibana log showing the actual DNS call(s) plus multiple non-existent DNS calls (highlighted in yellow)
All of the information shown in the packet capture appears, plus the following:
20:33:00.000  destination IP of 10.5.52.47, destination port of 53
The same thing appears at:
20:33:10.000
20:33:20.000
20:33:30.000
20:33:40.000
Then at 20:36:34 it shows the DNS query that the packet capture also shows.
So, these port 53 entries with timestamps ending at 00/10/20/30/40 seconds after the minute appear to be made up out of thin air. Additionally, no other fields are populated in the Elastic logs for these entries: client.ip is empty, and so are client.bytes, client.port, and ALL of the DNS fields. The DNS entries that appear in both the packet capture and Kibana have all the expected fields populated with correct data.
Does anyone have an idea why this is occurring? The example above is a small sample. This happens for multiple systems at 10-second intervals. For example, at 10, 20, 30, 40, 50, or 60 seconds after the minute, I see between 10 and 100 (guesstimate) of these log entries where all fields are blank except destination.ip, destination.bytes, and destination.port - there is no client info and no DNS info in the fields for these errant records.
The 'normal' DNS records have about 20 fields of information listed in the Kibana log, while these errant ones have only four fields (the three fields listed above plus the timestamp).
Here is an example of the log from one of these 10-second intervals:
timestamp Dest.ip Dest.bytes Dest.port
20:02:50.000 10.1.3.200 105 53
20:02:50.000 10.1.3.200 326 53
20:02:50.000 10.1.3.200 199 53
20:02:50.000 10.1.3.200 208 53
20:02:50.000 10.1.3.201 260 53
20:02:50.000 10.1.3.200 219 53
20:02:50.000 10.1.3.200 208 53
20:02:50.000 10.1.3.200 199 53
... (plus 42 more of these at the same second) ...
20:02:50.000 10.1.3.201 98 53
And here is the packetbeat.yml file (showing only the uncommented lines):
packetbeat.interfaces.device: enp0s25
packetbeat.flows:
  timeout: 30s
  period: 10s
packetbeat.protocols:
- type: dhcpv4
  ports: [67, 68]
- type: dns
  ports: [53]
  include_authorities: true
  include_additionals: true
setup.template.settings:
  index.number_of_shards: 1
setup.dashboards.enabled: true
setup.kibana:
  host: "1.1.1.1:5601"
output.elasticsearch:
  hosts: ["1.1.1.2:59200"]
setup.template.overwrite: true
setup.template.enabled: true
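For what it's worth, one test I am considering (just a sketch, on the assumption that the flows feature might be what generates the entries at the 10-second boundaries, since its period above is 10s and flow records carry byte counts but no DNS fields) is to disable flows and see whether the extra documents stop:

packetbeat.flows:
  enabled: false   # disables flow reporting; commenting out the whole flows section should have the same effect

If the entries at :00/:10/:20/:30/:40/:50 disappear with flows disabled, that would at least narrow down where they are coming from.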
Thank you for your thoughts on what might be causing this issue.
=======================================================================
Update on 6/8/20
I had to shut down Packetbeat due to this issue until I can find a resolution. A single Packetbeat system generated 100 million documents over the weekend for DNS queries alone, roughly 98% of which were somehow created by Packetbeat and were not real DNS queries.
I stopped the Packetbeat service this morning on the Linux box that is capturing these DNS queries and deleted this index. I then restarted the Packetbeat instance, let it run for about 60 seconds, and stopped the service again. During those 60 seconds, 22,119 DNS documents were added to the index. When I removed the documents Packetbeat created (that were not real DNS queries), it deleted 21,391, leaving me with 728 actual DNS queries. In this case, 97% of the documents were created by Packetbeat, and 3% were 'real' DNS queries made by our systems that Packetbeat captured.
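If these extra documents really are being generated by Packetbeat itself rather than by real DNS traffic, one stopgap I may try while troubleshooting (just a sketch; it relies on the standard Beats drop_event processor and has_fields condition, and it would also drop the dhcpv4 events, so it is only suitable for testing) is to drop any event that carries no DNS question before it is sent to Elasticsearch:

processors:
  - drop_event:                              # discard the event entirely
      when:
        not:
          has_fields: ['dns.question.name']  # keep only events that actually contain a DNS question

That would at least keep the index from filling up with empty documents while the root cause is investigated.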
Any ideas as to why this behavior is being exhibited by this system?
Thank you