I am trying to find a way to integrate Splunk and the FireSight database using the Database Access API. Currently, we use eStreamer for low-volume events and syslog alerting for high-volume events such as connection events, because eStreamer chokes on high data volumes. The syslog output does not appear to be configurable, whereas the Database Access API seems to be highly configurable. A few questions:
- Can the Database Access API be used for high-volume logs across a large deployment (think millions of lines of connection events)?
- When running the test application "RunQuery", we get sporadic, cryptic errors. How do we troubleshoot them?
For example, the query:
"SELECT first_packet_sec, last_packet_sec, INET6_NTOA(initiator_ipaddr) AS src_ip, INET6_NTOA(responder_ipaddr) AS dest_ip, INET6_NTOA(src_device_ipaddr) as dvc_ip FROM connection_log ORDER BY first_packet_sec DESC, last_packet_sec DESC LIMIT 0, 25;"
Returns the error: java.sql.SQLException: Table 'rna_flow_stats_1493575800_0' doesn't exist
But only about 50% of the time. More complex queries fail with this error every time.
Any help is appreciated!
The performance limitations you have experienced with eStreamer are very likely due to the client, the Cisco eStreamer for Splunk TA & App. I'm assuming you are using this: https://splunkbase.splunk.com/app/1629/ It is single-threaded.
The server side of the API on the FMC is capable of handling thousands of events per second, depending on the FMC hardware model you use.
If you are using Firepower version 6.x, you will be able to take advantage of a completely new, built-from-scratch Splunk TA for eStreamer. The new version is plugin-based and multi-threaded, and will provide huge performance advantages. We will post this new TA & App on Splunkbase around June 1st. It will be free, but there will also be a paid Cisco TAC support option for customers who want it.
The Database Access API is not recommended for high-volume, continuous event collection. It is very flexible, however, so for ad hoc queries of events, or especially of the Host database, it's a good way to go.
I'll ask a more technical person to look at your query. Do you still want to pursue this if the new eStreamer for Splunk solution eliminates all the throughput issues?
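In the meantime, for ad hoc use of the Database Access API, one possible stopgap for the sporadic "Table ... doesn't exist" error is to retry the query a few times. The sketch below is only an illustration, not a diagnosis: it assumes the failure is transient (the numeric suffix in the table name resembles an epoch timestamp, which might point to rotating tables, but that is a guess), and it will not help the more complex queries that fail every time. The class `RetryQuery`, the interface `SqlCall`, and the method `withRetry` are hypothetical names that are not part of RunQuery or the API. The simulated query in `main` stands in for a real JDBC call.

```java
import java.sql.SQLException;

public class RetryQuery {
    /** A query (or any JDBC operation) that may throw SQLException. Hypothetical helper type. */
    @FunctionalInterface
    public interface SqlCall<T> {
        T run() throws SQLException;
    }

    /** Assumption: the sporadic failure is recognizable by the "doesn't exist" text in its message. */
    static boolean isMissingTable(SQLException e) {
        String msg = e.getMessage();
        return msg != null && msg.contains("doesn't exist");
    }

    /**
     * Runs the call up to maxAttempts times, sleeping delayMillis between attempts.
     * Only missing-table errors are retried; any other SQLException is rethrown at once.
     */
    public static <T> T withRetry(SqlCall<T> call, int maxAttempts, long delayMillis)
            throws SQLException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        SQLException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.run();
            } catch (SQLException e) {
                if (!isMissingTable(e)) {
                    throw e; // not the error we are working around
                }
                last = e;
                System.err.println("attempt " + attempt + " failed: " + e.getMessage());
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis);
                }
            }
        }
        throw last; // every attempt hit the missing-table error
    }

    public static void main(String[] args) throws Exception {
        // Simulated query: fails twice with the error from the thread, then succeeds.
        int[] calls = {0};
        String result = withRetry(() -> {
            if (++calls[0] < 3) {
                throw new SQLException("Table 'rna_flow_stats_1493575800_0' doesn't exist");
            }
            return "25 rows";
        }, 5, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

In real use, the lambda would wrap the statement execution for the query, and the attempt count and delay would need tuning. If the error persists across every attempt, retrying is not the answer and the query itself needs a closer look.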