Splunk Free: .htaccess Protection using Apache

Ever heard of Splunk? Splunk lets you search and navigate all your logs and IT data in real time: logs, configurations, messages, traps and alerts, scripts, and metrics. It’s an awesome tool that makes it easier to monitor and watch your log files. Unfortunately, Splunk is expensive. How expensive? Try $5000 a year for the cheapest license. Here’s the main problem: the free version of Splunk does not come with any user authentication, not even Admin authentication. This means that anyone can access the Admin area of your Splunk install, see any log files you have, and even set up new Splunks (log file watches). Let’s fix this!

I would have thought that Admin user authentication, at least, would be a standard feature of Splunk, but you only get that with the Professional version. You get 30 days of the Professional version, and after that you must purchase a license. Most individuals who just want to manage their log files remotely via the web cannot afford a Professional license and shouldn’t need to buy one, so the Free version is perfect for them. The lack of authentication kind of makes you turn your nose up at Splunk, though, as it poses a security issue. Note that when I say authentication, I mean a username and password. You can literally access all admin features, including license information, just by going to the web address (usually a domain name on the default port 8000, e.g. http://domain.com:8000). This is totally ridiculous. We can get around this by running a proxy within Apache and securing the subdomain (http://splunk.example.com/) with .htaccess-style password protection.

Just A Few Things

The environment I’m running is Apache 2.x on a CentOS server. You must have root access to the server, as you will need to install Splunk and then make changes to the Apache configuration. I also presume that you already have a domain name and want to create a subdomain called splunk (splunk.domain.tld) that has some sort of user authentication.

Installing Splunk

Installing Splunk on a system using the RPM is very easy; almost too easy. First, you will want to download the current version of Splunk (3.1.3 at the time of writing). You can use one of the other download formats if you would like, but this article covers how to install Splunk using the RPM. After you select the download you want (RPM), the site redirects you to a download page that gives you the wget URL for downloading Splunk; select and copy the full URL that it gives you. The link that I provide may be old, depending on when you read this post. Now, in your BASH prompt:

BASH

[root@server ~]# wget 'http://www.splunk.com/index.php/download_track?file=3.1.3/linux/splunk-3.1.3-28524-linux-2.6-x86_64.rpm&ac=&wget=true&name=wget&type=releases'

This will download Splunk into your current directory. When the download has completed, you can start the install. The RPM install is the easiest; you just need to run one command:

BASH

[root@server ~]# rpm -i --force --prefix=/opt/splunk3.0/splunk splunk-path-to-rpm.rpm

You should see something close to the following:

BASH Output

----------------------------------------------------------------------
The Splunk Server has been installed in:
        /opt/splunk3.0/splunk/splunk

To start the Splunk Server, run the command:
        /opt/splunk3.0/splunk/splunk/bin/splunk start

To use Splunk's web interface, point your browser at:
        http://server:8000

Complete documentation is at http://www.splunk.com/r/docs
----------------------------------------------------------------------

Disabling SELinux

When you tell Splunk to start, it creates some files and directories and then checks whether SELinux is enforcing. If SELinux is enabled, Splunk will not run correctly, and you will need to either disable SELinux or configure SELinux to allow Splunk to run (not covered in this article). You can temporarily stop SELinux from enforcing, but unfortunately Splunk also looks at the SELinux configuration file, /etc/sysconfig/selinux, and checks whether it is set to enforcing. If it is, edit the file and change SELINUX=enforcing to SELINUX=disabled. Changing the configuration file only takes effect at the next boot, so after saving it, run setenforce 0 to stop SELinux from enforcing right away (this puts the running system in permissive mode). Alternatively, reboot the system and it will pick up the new setting.
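
The two steps above could be scripted roughly as follows. This is only a sketch: set_selinux_mode is a hypothetical helper name, not something Splunk or CentOS ships, and it assumes the configuration file contains a line beginning with SELINUX=.

```shell
#!/bin/sh
# Sketch only: set_selinux_mode is a hypothetical helper, not a system command.
# It rewrites the SELINUX= line of the given file (SELINUXTYPE= is left alone).
set_selinux_mode() {
    conf="$1"    # assumed path, e.g. /etc/sysconfig/selinux
    mode="$2"    # enforcing, permissive, or disabled
    tmp=$(mktemp) || return 1
    # Write back with cat so that a symlinked config file stays a symlink.
    sed "s/^SELINUX=.*/SELINUX=${mode}/" "$conf" > "$tmp" &&
        cat "$tmp" > "$conf"
    status=$?
    rm -f "$tmp"
    return $status
}

# Typical use, as root:
#   set_selinux_mode /etc/sysconfig/selinux disabled   # applies at next boot
#   setenforce 0                                       # permissive until then
```

Keeping the file edit in a function means you can try it on a copy of the configuration file first, before touching the real one.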

Starting Splunk

As the documentation states, start the Splunk server:

BASH

[root@server ~]# /opt/splunk3.0/splunk/splunk/bin/splunk start

You will need to scroll down to the bottom of the license agreement and accept it to continue. It will run its init script and should start with no issues. After it starts, it will let you know that Splunk is running on port 8000 on the host name of the server; you can substitute the host name with the IP address of the server. In this case, the host name of the server is server, so we can access Splunk using http://server:8000. More than likely, you will actually have a domain name on a remote network/server, so you will access it by way of http://example.com:8000.

Just a few notes

Remember, if this is a new server, you might not have Apache started, and your firewall might cause issues when you try to access Splunk on your server. Make sure Apache is started by running /etc/init.d/httpd status. If it is not running, it will say httpd is stopped, and you will need to start it by running /etc/init.d/httpd start. It should start with no issues. Now try connecting to your server by opening a web browser at http://server:8000 (or whatever your hostname is; in this case we are using server). This should display your Splunk startup screen, which means that Splunk has been installed successfully and is ready to be used! Congrats.

Configuring Apache for Splunk .htaccess Protection

As stated before, the free version of Splunk doesn’t offer any user authentication, so we have to configure Apache to protect Splunk so that no one else can view your log files, which can have some very valuable information in them. Let’s work around this drawback of the free version and make it so that Apache requires a user login, .htaccess style. To get this working, we have to configure Apache as a proxy server for the IP address and the server name.

Load the Apache Proxy Modules

Before we continue, you need to make sure that you have at least the mod_proxy, mod_proxy_http, and mod_proxy_connect Apache modules installed. Normally, these are installed and loaded by default, so you shouldn’t have to worry about this. To verify this, just type in httpd -M and make sure those modules are loaded.
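
If you would rather not read through the whole module list by eye, something like the following could do the check for you. It is a sketch under assumptions: missing_proxy_modules is a hypothetical helper name, and it assumes httpd -M lists the modules as proxy_module, proxy_http_module, and proxy_connect_module.

```shell
#!/bin/sh
# Sketch only: missing_proxy_modules is a hypothetical helper.
# It reads a module listing on stdin and prints each required module
# that does not appear in it. No output means nothing is missing.
missing_proxy_modules() {
    listing=$(cat)
    for mod in proxy_module proxy_http_module proxy_connect_module; do
        printf '%s\n' "$listing" | grep -qw "$mod" || printf '%s\n' "$mod"
    done
}

# Typical use (2>&1 because some Apache versions print the list on stderr):
#   httpd -M 2>&1 | missing_proxy_modules
```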

Making the changes to httpd.conf

Now it’s time to actually set up the proxy. What we are going for here is to take any request for the IP address and server name (such as the subdomain splunk.example.com), pass it to localhost on the port that Splunk is running on, and require a login before anything is served.

Editing /etc/httpd/conf/httpd.conf

<VirtualHost x.x.x.x:80>
        ServerAdmin root@localhost
        ServerName splunk.example.com
        # Forward every request to Splunk on the loopback interface
        ProxyPass / http://127.0.0.1:8000/
        ProxyPassReverse / http://127.0.0.1:8000/
        ErrorLog logs/splunk.example.com-error_log
        CustomLog logs/splunk.example.com-access_log common
</VirtualHost>

# Require a valid login for anything proxied to Splunk
<Proxy http://127.0.0.1:8000/*>
        Order deny,allow
        Allow from all
        AuthName "splunk.example.com"
        AuthType Basic
        AuthUserFile /var/www/.htpasswd.users
        Require valid-user
</Proxy>

Where x.x.x.x is your public IP Address.

Of course, you will need to configure this for your environment. Make sure you change x.x.x.x to your public IP address and example.com to your own domain. If you would like, you can also change the splunk subdomain to whatever you want. Just make sure you create and update your DNS information as needed. If you are going to have a splunk.example.com subdomain, configure it in your DNS first (and allow time for it to propagate). Also, make sure that you restart Apache, or else the new changes will not take effect:

Restarting Apache

/etc/init.d/httpd restart
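
Once Apache has restarted, one way to confirm the proxy is behaving is to look at the HTTP status code it returns. The sketch below is only an illustration: explain_status is a hypothetical helper, and the meanings attached to each code are the usual ones for this kind of setup rather than anything Splunk documents.

```shell
#!/bin/sh
# Sketch only: explain_status is a hypothetical helper that interprets the
# HTTP status code returned by the proxied subdomain.
explain_status() {
    case "$1" in
        401) echo "protected: Apache is asking for a login" ;;
        200) echo "open: the page was served without a login" ;;
        503) echo "unavailable: Apache could not reach Splunk on port 8000" ;;
        *)   echo "unexpected status: $1" ;;
    esac
}

# Typical use (splunk.example.com is the example subdomain from this article):
#   code=$(curl -s -o /dev/null -w '%{http_code}' http://splunk.example.com/)
#   explain_status "$code"
```

A request made without credentials should come back as 401 if the protection is in place.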

Creating the .htaccess File

With the above configuration, you told Apache to use the .htpasswd.users file in the /var/www directory. You can follow my other article on how to configure .htaccess. If you plan on storing your .htaccess/.htpasswd files somewhere else, you will need to update your httpd.conf file to reflect the absolute location.
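
For reference, the password file itself can be created with Apache’s htpasswd utility. The following is a minimal sketch, assuming htpasswd is installed alongside Apache; create_password_file is a hypothetical wrapper name and drew is just an example user name.

```shell
#!/bin/sh
# Sketch only: create_password_file is a hypothetical wrapper around Apache's
# htpasswd utility. Note that -c creates the file and overwrites an existing one.
create_password_file() {
    file="$1"    # must match AuthUserFile in httpd.conf
    user="$2"
    if [ -n "${3:-}" ]; then
        # Scripted use: -b takes the password from the command line,
        # which exposes it to other users via the process list.
        htpasswd -c -b "$file" "$user" "$3"
    else
        # Interactive use: htpasswd prompts for the password twice.
        htpasswd -c "$file" "$user"
    fi
}

# Typical use, as root:
#   create_password_file /var/www/.htpasswd.users drew
```

Whatever path you pick, the user Apache runs as needs to be able to read the file.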

Closing Notes

Personally, I think the free version of Splunk should at least provide an admin user login, but that just isn’t something they are offering. Splunk is very powerful and extremely helpful for seeing all your log files from one view. I don’t have a lot of data written to my log files; however, the data that does get generated really helps to solve some issues. I guarantee that using Splunk will help you out greatly, especially if you have a lot of custom logs that you are trying to manage manually.


Discussion always soothes thy heart.

Comment 1

How dare ye not let me know of new posts? :P

heh… I’m going to try this after I reinstall Ubuntu (the network driver reinstall didn’t work too well. :P )

By: Joe

Comment 2

Thought you had my RSS feed? Anyways, I’ve been quite busy with work. Haven’t talked with you online in awhile. Hope all is well.

By: Drew

Comment 3

Well, aside from the fact that I killed ubuntu after a week of using it. :P Not much. Got a lot of projects I’m starting in the next few weeks so I’ll be busy too.

By: Joe

Comment 4

I am very interested in using this method. Free Splunk should come with a basic one-user authentication. I bet that admin is going to get hacked because of the information in Splunk and it will look bad on Splunk.

What webserver does Splunk use by default? Something embedded? I installed 3.x on my Ubuntu 7.10 server, and the web-interface started working without installing apache.

Will your setup still work?

Thanks,

Tristan

Comment 5

Agreed. Splunk uses its own AppServer. Have a look at two files:

/opt/splunk/etc/bundles/default/web.conf
/opt/splunk/etc/bundles/default/server.conf

Splunk itself starts its AppServer and runs its Python Code core. I wouldn’t know how to do this without Apache though, as you are configuring Apache as the proxy, therefore, you need Apache for the proxy. You can install Apache pretty easily with yum or apt-get (if you’re running Ubuntu, apt-get is what you will probably use). You can install a base install of Apache without configuring a huge LAMP server.

So, overall, this will not work (that I know of) without Apache, or some type of webserver that can serve as a web proxy.

Let me know how it goes.

Regards,
Drew

By: Drew

Comment 6

lots of hacking and someone really good at python

By: Joe

Comment 7

I didn’t try this yet, but was wondering, does this method stop someone from going to http://yourdomain.com:8000 and getting in? It looks like it just adds another URL where authentication would be required.

Thanks,
Don

By: donnyspi

Comment 8

Nope. You should have firewall rules running. You should (by default) block everything on your public network and only allow in what you want, such as port 80 for web traffic, port 22 for SSH (I’ve changed mine to something else), etc.

Like most people, I allow any traffic on my loopback device (device lo), so the proxy can forward all the Splunk traffic over it. When a request comes in for the domain that points to the proxy, the public IP/domain isn’t accepting connections on port 8000, but the loopback device is.
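
As a rough sketch of that idea, the rules could look something like the ones printed below. Treat this as an illustration only: splunk_firewall_rules is a hypothetical helper that prints iptables commands without running them, it assumes Splunk is on port 8000, and because -A appends, any rules you already have earlier in the INPUT chain still take precedence.

```shell
#!/bin/sh
# Sketch only: splunk_firewall_rules is a hypothetical helper. It prints the
# iptables commands instead of running them, so they can be reviewed first.
splunk_firewall_rules() {
    port="${1:-8000}"    # Splunk's web port; 8000 is the default in this article
    # Allow everything arriving on the loopback device (this is what the proxy uses).
    printf 'iptables -A INPUT -i lo -j ACCEPT\n'
    # Drop connections to the Splunk port arriving on any other interface.
    printf 'iptables -A INPUT -p tcp --dport %s -j DROP\n' "$port"
}

# Typical use:
#   splunk_firewall_rules 8000        # review the output
#   splunk_firewall_rules 8000 | sh   # then apply it, as root
```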

Any questions, or help, let me know.

Regards,
Drew

By: Drew

Comment 9

Drew, thanks for your prompt replies. Let me see if I understand this process.

1. Install splunk, and start the built-in python web-server. By default this runs on TCP port 8000.

2. Install Apache2. Configure an httpd.conf file almost identical to the one you posted. This will tell Apache to listen on port 80 and send any received requests to port 8000. The user will only see the port 80 traffic.

Does that sound correct?

If I can get this working with http, I will then try it with https.

Thanks,

Tristan

Comment 10

I guess an additional step would be to configure your firewall to allow port 80 incoming, and deny all other ports (including port 8000).

Comment 11

Hooray! I got this working on Ubuntu 7.10.

I am going to configure SSL on Apache and leave Splunk without SSL (since that communication is local to the server).

Thanks for posting this information, Drew. I might be creating a wiki page for Ubuntu soon.

Comment 12

I guess an additional step would be to configure your firewall to allow port 80 incoming, and deny all other ports (including port 8000).

Tristan,

SO sorry about the lack of replies. I never got an email stating you had replied, so I didn’t notice. I’m glad you got it working.

Good luck on your Ubuntu wiki and let me know if you need any suggestions or help.

Regards,
Drew

By: Drew

Comment 13

where do you place the .htaccess file? With Splunk, it’s very difficult to figure out where the actual “home” directory is located.

thanks in advance for your assistance!

By: Eugene

Comment 14

You can put your .htaccess file really anywhere, as long as Apache can read it (permissions). I recommend somewhere like /var/www. What distribution of Linux are you running? Let me know if you have any more questions.

Regards,
Drew

By: Drew

Comment 15

Great writeup… I was wondering, to protect access to port 8000 on the LAN.. could you use iptables and only allow the loopback access (since the apache server will tunnel you through after AUTH?)

I am considering this setup, just not sure if we have too much data…is 80mb in logs going to exceed the indexed amount? (500mb)

Thanks and sorry for the off-topic question.

Matteo

By: Matteo

Comment 16

is 80mb in logs going to exceed the indexed amount? (500mb)

This should be fine. It should not exceed the indexed amount.

You can protect access through port 8000 on the LAN, if you would like to, with IPTables. You can configure Apache to only listen to the “virtual host” of the Splunk directory on localhost, also.

By: Drew

Comment 17

Configure SELinux

If you have SELinux active on your system, you must add Splunk to the list of authenticated applications that can run in your SELinux environment.

To configure SELinux to allow Splunk to run, you need to run the
chcon command on the Splunk lib directory. Here is what you type :

chcon -c -v -R -u system_u -r object_r -t lib_t $SPLUNK_HOME/lib 2>&1 > /dev/null

You must also disable the check when Splunk starts by adding this line
to $SPLUNK_HOME/etc/splunk-launch.conf.

SPLUNK_IGNORE_SELINUX=1

Found that on Splunks website:
http://www.splunk.com/doc/3.3/installation/SplunkSELinux

By: Zach

Comment 18

I am getting:
503 Service Temporarily Unavailable

I have done exactly what this tutorial says. Also tried small variants i found online. Any help?

By: Zach

Comment 19

Drew,

I followed your instructions on a CentOS 5.2 + Splunk 3.3 box and they worked great. I ended up having to switch to Ubuntu for unrelated reasons, and i cannot get this setup to work for the life of me. Can somebody help me adapt these instructions to Ubuntu Server 8.04? Much appreciated!

By: bh5505

Comment 20

I am getting: 503 Service Temporarily Unavailable

What do your log files tell you? Try looking in /var/log/httpd/error_log

I ended up having to switch to Ubuntu for unrelated reasons, and i cannot get this setup to work for the life of me. Can somebody help me adapt these instructions to Ubuntu Server 8.04? Much appreciated!

Do you still need help with this? If so, I am willing to set up a VM at home this week and write an updated Ubuntu version of this for you. Let me know if you still need assistance.

Regards,
Drew D.

By: Drew

Comment 21

Drew,

I know this is relatively old post by now…but I’m in the process of configuring Splunk as described above and am REALLY close…

My configuration is slightly different from yours in that I want http://myhost.com/splunk/ to be interpreted instead of the root directory that you describe above.

Apache seems to be fine with this, but none of the dynamically-fetched data (aka everything useful) seems to load whenever I map anything but the root dir to my Splunk server.

Is there another address I have to declare in my httpd.conf? Right now its just the /splunk/ ProxyPass and ProxyPassReverse directives.

Thanks!!!

By: Luke

Comment 22

Apache seems to be fine with this, but none of the dynamically-fetched data (aka everything useful) seems to load whenever I map anything but the root dir to my Splunk server.

So, are you saying that you get Splunk to work or are you saying that it doesn’t work? What do you mean that none of the dynamically-fetched data (aka everything useful) seems to load.

I can help you with the configuration and such, but you need to give me more information. You can also provide screenshots and link to imageshack.us or something. I’m more than willing to get you setup and going with this. Just need more information.

Regards,
Drew D.

By: Drew

Comment 23

Drew,

Thanks for the step-by-step instruction! Worked out really good for me.

Quick question. How come when I go directly to the url (http://:8000) it does not prompt for a login?

Thanks!!

By: RAM

Comment 24

Drew,

I got it running. I made Splunk to listen locally on 127.0.0.1. Now, nobody can go directly to port 8000 using the hostname. They will get a failed to connect.

Thanks! =)

By: RAM

Comment 25

RAM,

Great to hear. I apologize for not getting back to you fast enough. Things have been quite crazy, so I am glad you figured out the issue. Hope my article was easy enough to understand.

Regards,
Drew D.

By: Drew

Comment 26

You can tell splunk to bind only to localhost by setting SPLUNK_BINDIP=127.0.0.1 before starting it.

By: Josh

Comment 27

Fantastic. Adding BIND before starting up, then adding the info as described to httpd.conf made it all work first time.

By: Richard

Comment 28

Richard,

Thanks for the kind words! Anytime I get great comments, it makes me feel that my time here writing from my experiences is well worth the effort.

Regards,
Drew

By: Drew

Comment 29

Smashing! Thank you for this solution … clearly written and understandable.

I just installed Splunk today – the first thing I did was try to find a password or authentication option. Eeek – none!

So finding your solution was ideal. Worked “out of the box” first time.

Thanks!

By: Dean

Comment 30

Smashing! Thank you for this solution … clearly written and understandable.

So finding your solution was ideal. Worked “out of the box” first time.

Dean,

Thanks! Glad it helped you out. It’s always nice hearing awesome results from my experiences. Makes me feel like I actually did something productive!

Have a great day.
Drew

By: Drew

Comment 32

@Josh: SPLUNK_BINDIP will affect splunkd (the splunk daemon) and not splunkweb (the python web interface, what you really want to proxy).
which is a good thing and what you probably want to do anyway, but not quite the same thing.
to make splunkweb bind to localhost only, you have to edit etc/system/local/web.conf

# Host values may be any IPv4 or IPv6 address, or any valid hostname.
# The string 'localhost' is a synonym for '127.0.0.1' (or '::1', if
# your hosts file prefers IPv6). The string '0.0.0.0' is a special
# IPv4 entry meaning "any active interface" (INADDR_ANY), and '::'
# is the similar IN6ADDR_ANY for IPv6. The empty string or None are
# not allowed.
#server.socket_host = 0.0.0.0
server.socket_host = 127.0.0.1

as a side note, you can also define SPLUNK_BINDIP in etc/splunk-launch.conf.

@Luke, @Drew:

you cannot reverse-proxy splunk to something different than root (as in "/something/"), because the HTML code makes lots of references to "/" (e.g. "/script.js").
You can put all your Proxy directives inside a (Name)VirtualHost block, though. Works like a charm.

Hope that helps,

-a

By: andrew
