IIS logs and ELK Stack

This is a sequel to https://ragolsec.blogspot.com/2019/12/win2012r2-iis-85-and-logging-client.html

So, now I had multiple log files with crypto settings included. What next?

A few days after I had restarted logging with the new settings, I got a message from the project manager: "I have a meeting with the customer starting in a couple of minutes. Do you have the data for me?"

I saw this message a few minutes after the meeting had already started, *sigh*, but decided to do what could be done. And I opened Excel. Yup, Excel. IIS log files are kind of CSV, although the separator is a space instead of a comma, so for really quick and dirty work Excel is quite fine. I've always said that Excel is probably the most misused program in any corporate environment.... :)

Excel is kind of like a Leatherman. There are better specialized tools for every job, but you can do so many things (poorly) with a Leatherman (or Excel) that as a package, it's just invincible. Almost.

I had about 1.2 million log lines, and Excel almost said 'Enough!'. But luckily for me, just almost. Of those 1.2 million log lines, only 82 used something other than TLS 1.2, about 0.007%. That told me nothing about how many clients were involved: these were just raw log lines, so there were multiple lines from every client, depending on how long they had browsed this particular website.

For the actual data mangling I needed something more powerful, and luckily I had one such beast available.

ELK Stack to the rescue!

Although the ELK Stack is usually used for centralized logging, you can also import individual files into it through Logstash and process them as you would process any other log data. You just need to set start_position correctly, because otherwise Logstash starts from the end of the file and waits for new data to appear. The default behaviour should look quite familiar to a Linux power user. Did I hear 'tail -f', anyone?

input {
  file {
    path => "/tmp/data.log"
    start_position => "beginning"
  }
}


Then you need to parse the log lines, because plain unstructured lines are not very usable in Elasticsearch. Parsing fills the specified fields, so searching will actually work the way you want it to.

In Logstash you can parse the log lines with either the Grok plugin or the Dissect plugin.
Dissect differs from Grok in that it does not use regular expressions and is faster. Dissect works well when the data is reliably repeated. Grok is a better choice when the structure of your text varies from line to line.
I decided to use Grok, because the line content varies quite a lot in IIS logs. Getting the Grok match patterns right can be a time-consuming step, but I used this page to help me in the process: https://grokdebug.herokuapp.com/
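
Just to show the difference, a Dissect version of the same idea would look roughly like the sketch below. The field names here are only illustrative, and the mapping has to follow the exact column order of the #Fields: header in your own IIS logs:

filter {
  dissect {
    # Illustrative field names only; they must follow the exact column
    # order of the #Fields: header in your IIS logs.
    mapping => {
      "message" => "%{log_timestamp} %{+log_timestamp} %{server_host} %{server_ip} %{method} %{request_uri} %{rest_of_line}"
    }
  }
}

The last field in a dissect mapping captures the rest of the line, so you don't have to spell out every single column if you only care about the first few.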

The actual grok filter depends on your log files, but here is the version that worked for me. I actually had multiple match sections at first, but while writing this blog post I realized how I could merge them and make the config simpler. Always remember the KISS principle, and remember that you understand a concept well enough only when you can teach it to someone.

There is also the date block, which populates the @timestamp field with the timestamp found in the actual log line. Correct timestamps are the foundation of any practical log analysis, so you'd better get them right!

And finally, this config discards all the lines that don't fit the match section. The IIS log files used in this exercise contained some header lines and the like, and those would just cause issues in Elasticsearch, so we don't want to ship them there. If a line doesn't match any of the match sections, it gets a _grokparsefailure tag added, so it's easy to drop. Luckily devs are sometimes thinking of (or listening to) their actual users! ;)


filter {

  grok {
    match => [ "message", "%{TIMESTAMP_ISO8601:log_timestamp} %{NOTSPACE:server_host} %{IP:server_ip} %{WORD:method} %{NOTSPACE:request_uri} %{NOTSPACE:request_uri_query} %{NUMBER:port} - %{IP:client_ip} %{NOTSPACE:http_version} %{NOTSPACE:user_agent} %{NOTSPACE:referer} %{NOTSPACE:host} %{NUMBER:status_code:int} %{NUMBER:substatus:int} %{NUMBER:win32_status:int} %{NUMBER:bytes:int} %{NOTSPACE:tls_version}"]
  }

  date {
    match => ["log_timestamp", "YYYY-MM-dd HH:mm:ss"]
    target => "@timestamp"
  }

  if "_grokparsefailure" in [tags] {
    drop { }
  }
}


Here's a log line that matches this pattern:

2019-12-15 14:37:57 server1 10.0.0.1 GET /api/query1 param1=1&param2=2019-12-15T00:00:12&param3=a 443 - 192.168.0.1 HTTP/1.1 Mozilla/5.0+(iPad;+U;+CPU+OS+3_2_1+like+Mac+OS+X;+en-us)+AppleWebKit/531.21.10+(KHTML,+like+Gecko)+Mobile/7B405 https://server1.domain.invalid/mobile/?origin=mobile server1.domain.invalid 200 0 0 46 400 6610 800d ae06
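
For completeness, the output side of such a pipeline is nothing special. Something along these lines does the job; the hosts value here is just the plain localhost default rather than necessarily my exact setup, and the stdout output is only there to make debugging the grok pattern easier:

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
  }
  # Print each parsed event to the console; handy while tuning the filters.
  stdout { codec => rubydebug }
}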

When I was building this setup, I scratched my head once, twice, and finally way too many times. It just didn't work correctly.

At first the data appeared just fine in Elasticsearch. Then I made some changes to the filter config, removed the data from Elasticsearch, restarted Logstash, and the data didn't appear in Elasticsearch anymore. I reverted the changes, removed the data and restarted Logstash again. Nothing. Why had it worked before but not anymore with the exact same config file? I was almost ready to blame the stupid computer, but sighed heavily and dug deeper.

I changed the input file. Something appeared, but not the correct data.

After a quite thorough debugging process (and some lost nerves) I finally noticed that Logstash keeps track of how far it has read each input file in a sincedb database. But who reads the documentation beforehand, I just ask?? This means that if the input file hasn't changed, Logstash won't send the data to Elasticsearch again, even though the filters and other things in the pipeline config file have changed.

Yet again, one of these hit me:
There are two hard problems in computer science: cache invalidation, naming things, and off-by-1 errors.
So now the process for playing with different Logstash configs finally worked correctly:
  1. systemctl stop logstash && rm /var/lib/logstash/plugins/inputs/file/.sincedb_*; curl -X DELETE 'http://localhost:9200/logstash-2020.01.06-000001';
  2. Edit /etc/logstash/conf.d/import-file.conf
  3. systemctl start logstash
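
In hindsight, step 1 could have been a bit less painful. The file input has a sincedb_path option, and pointing it to /dev/null makes Logstash forget the read position on every restart, which is exactly what you want for this kind of one-off import. A sketch of what that would look like (not what I was actually running back then):

input {
  file {
    path => "/tmp/data.log"
    start_position => "beginning"
    # Don't persist the read position anywhere, so the same file
    # is re-read in full on every Logstash restart.
    sincedb_path => "/dev/null"
  }
}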
 
What about the results? More about them later, and also more about how the results were actually extracted from the data.
