Splunk Modular Inputs

Splunk Enterprise 7.3 is here, and I was excited to try it out. While splunking around, I decided to revisit a feature I had never really had the chance to explore: modular inputs. To understand it better, I came up with the following use case.

Let’s create an app that fetches data from a publicly available API using an API key. The user supplies the key through our modular input. The app then uses that key to query the API, receives JSON data, and indexes it in Splunk. Simple enough!

Create an app in /opt/splunk/etc/apps. Let’s name it my_splunk_mi. The directory structure within it can look like this.

my_splunk_mi/
  bin/
    my_splunk_mi.py
  default/
    app.conf
    indexes.conf
  lib/
    splunklib/
  README/
    inputs.conf.spec
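
If you would rather not create the folders by hand, the layout above could be scaffolded with a few lines of Python. This is only a sketch: scaffold_app is a made-up helper name, the files are created empty, and splunklib itself still has to be copied into lib/ afterwards.

```python
import os
import tempfile

# Relative paths taken from the layout above; files are created empty.
APP_DIRS = ["bin", "default", os.path.join("lib", "splunklib"), "README"]
APP_FILES = [
    os.path.join("bin", "my_splunk_mi.py"),
    os.path.join("default", "app.conf"),
    os.path.join("default", "indexes.conf"),
    os.path.join("README", "inputs.conf.spec"),
]


def scaffold_app(apps_root, app_name="my_splunk_mi"):
    """Create the empty app skeleton under apps_root and return its path."""
    app_root = os.path.join(apps_root, app_name)
    for rel_dir in APP_DIRS:
        path = os.path.join(app_root, rel_dir)
        if not os.path.isdir(path):
            os.makedirs(path)
    for rel_file in APP_FILES:
        path = os.path.join(app_root, rel_file)
        if not os.path.exists(path):
            open(path, "a").close()
    return app_root


if __name__ == "__main__":
    # Try it in a throwaway directory rather than a live Splunk install.
    print(scaffold_app(tempfile.mkdtemp()))
```

Pointing apps_root at /opt/splunk/etc/apps would create the skeleton in place, but trying it in a temporary directory first is the safer choice.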

Install the Splunk SDK for Python (splunk-sdk) and copy its splunklib folder into lib/. Bundling dependencies this way is good practice: if you ever package and share the app, it carries everything it needs. The Python file my_splunk_mi.py will look like this.

# $SPLUNK_HOME/etc/apps/my_splunk_mi/bin/my_splunk_mi.py

import json
import os
import sys

# Make the bundled splunklib importable before importing from it.
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "lib"))

from requests import Session
from requests.exceptions import ConnectionError, Timeout, TooManyRedirects
from splunklib.modularinput import Argument, Event, EventWriter, Scheme, Script

class MyScript(Script):

    def get_scheme(self):
        scheme = Scheme("mysplunkmi")
        scheme.description = "Fetch data from an api."
        scheme.use_external_validation = True
        scheme.use_single_instance = False

        apikey_argument = Argument("apikey")
        apikey_argument.data_type = Argument.data_type_string
        apikey_argument.description = "The api key for the api request"
        apikey_argument.required_on_create = True
        scheme.add_argument(apikey_argument)

        return scheme

    def validate_input(self, validation_definition):
        # Reject keys of the wrong length before the input is saved.
        apikey = str(validation_definition.parameters["apikey"])
        if len(apikey) != 36:
            raise ValueError("The apikey needs to be 36 characters long!")


    def stream_events(self, inputs, ew):
        # items() works on both Python 2 and 3; iteritems() is Python 2 only.
        for input_name, input_item in inputs.inputs.items():
            load_data(input_name, input_item, ew)


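def load_data(input_name, input_item, ew):
    apikey = str(input_item["apikey"])
    url = 'put your api url here'
    # Add params and headers as needed by the api that you use. Pass apikey
    # in whichever of the two the api expects; it is not sent anywhere yet.
    parameters = {}
    headers = {}
    session = Session()
    session.headers.update(headers)

    try:
        # The timeout keeps a stalled api from hanging the input forever.
        response = session.get(url, params=parameters, timeout=30)
        data = json.loads(response.text)
        # The iteration logic might change depending on what the api
        # returns.
        for d in data["data"]:
            event = Event()
            event.stanza = input_name
            event.data = json.dumps(d)
            ew.write_event(event)
    except (ConnectionError, Timeout, TooManyRedirects) as e:
        # stdout carries the event stream, so report errors through the
        # event writer instead of print().
        ew.log(EventWriter.ERROR, "%s: request failed: %s" % (input_name, e))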

if __name__ == "__main__":
    sys.exit(MyScript().run(sys.argv))
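
Because load_data mixes the HTTP call with the event handling, it is awkward to try out without a live API. One option is to pull the JSON handling into a pure function that can be exercised with canned text. The sketch below assumes the same response shape as the script (a top-level "data" list). payload_to_events is a hypothetical helper, not part of splunklib.

```python
import json


def payload_to_events(response_text):
    """Return one JSON string per item in the response's "data" list."""
    payload = json.loads(response_text)
    # Assumption: the API wraps its records in a top-level "data" list,
    # as in the script above. Any other shape yields no events.
    records = payload.get("data", []) if isinstance(payload, dict) else []
    return [json.dumps(record, sort_keys=True) for record in records]


if __name__ == "__main__":
    sample = '{"data": [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]}'
    for line in payload_to_events(sample):
        print(line)
```

Each returned string could then be assigned to event.data inside load_data, which keeps the network code and the parsing code separately testable.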

Create an app.conf and indexes.conf in the default folder.

# app.conf
#   Version 7.3
#
# Splunk app configuration file
#

[install]
is_configured = true

[ui]
is_visible = true
label = My Splunk MI App

[launcher]
author = sportsfreak
description = My attempt to learn about modular inputs led to the creation of this app.
version = 1.0

[package]
# id should match the app's folder name
id = my_splunk_mi

# indexes.conf
[mysplunkmiidx]
homePath   = $SPLUNK_DB/mysplunkmiidx/db
coldPath   = $SPLUNK_DB/mysplunkmiidx/colddb
thawedPath = $SPLUNK_DB/mysplunkmiidx/thaweddb
frozenTimePeriodInSecs = <Retention of your choice>
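
frozenTimePeriodInSecs takes a number of seconds, so a retention period chosen in days has to be converted first. A tiny sketch of that arithmetic follows. retention_seconds is a made-up helper, and the 90 days in the example is arbitrary, not a recommendation.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def retention_seconds(days):
    """Convert a retention period in days to seconds."""
    if days <= 0:
        raise ValueError("retention must be a positive number of days")
    return int(days * SECONDS_PER_DAY)


if __name__ == "__main__":
    # 90 days is an arbitrary example value.
    print("frozenTimePeriodInSecs = %d" % retention_seconds(90))
```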

Finally, add an inputs.conf.spec file under the README folder.

# inputs.conf.spec file
*$SPLUNK_HOME/etc/apps/my_splunk_mi/README/inputs.conf.spec
[my_splunk_mi://<name>]
*Set up the my_splunk_mi scheme defaults. The scheme name in the stanza must match the script name in bin/ (my_splunk_mi.py).
apikey = <value>
index = mysplunkmiidx

That’s it. Restart Splunk and you should see your new app. Under Settings > Data inputs, the new modular input will be listed. Create a new input through the UI: enter the API key, set the interval (a cron schedule works), and select the index and sourcetype. Splunk then invokes the script with the API key you entered, and the script fetches the JSON and indexes it. The same model works for many other cases where you need to pull data from a custom source.
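
When Splunk invokes the script, it passes the input's configuration as XML on stdin, and splunklib turns that into the inputs object seen in stream_events. The sketch below is a simplified illustration of that step using only the standard library. It is not splunklib's actual code, and the sample host, stanza name, paths, and key are all made up.

```python
import xml.etree.ElementTree as ET

# Made-up sample of the XML that splunkd writes to the script's stdin.
SAMPLE_INPUT_XML = """<input>
  <server_host>myhost</server_host>
  <server_uri>https://127.0.0.1:8089</server_uri>
  <session_key>not-a-real-session-key</session_key>
  <checkpoint_dir>/tmp/checkpoints</checkpoint_dir>
  <configuration>
    <stanza name="my_splunk_mi://demo">
      <param name="apikey">00000000-0000-0000-0000-000000000000</param>
      <param name="index">mysplunkmiidx</param>
    </stanza>
  </configuration>
</input>"""


def parse_input_definition(xml_text):
    """Map each stanza name to a dict of its parameters."""
    root = ET.fromstring(xml_text)
    inputs = {}
    for stanza in root.findall("./configuration/stanza"):
        params = {p.get("name"): (p.text or "") for p in stanza.findall("param")}
        inputs[stanza.get("name")] = params
    return inputs


if __name__ == "__main__":
    for name, params in parse_input_definition(SAMPLE_INPUT_XML).items():
        print(name, len(params["apikey"]))
```

Seeing the raw shape makes it clearer why load_data can simply read input_item["apikey"]: each input created in the UI arrives as one stanza with its parameters.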

Happy Splunking!

Source Code for reference here
