- Created on 11 January 2017
ITware is currently in the first phase of building a new solution for one of our clients, and we have prepared this case study to share with our readers.
The customer’s inquiry: The customer currently runs an ELK (Elasticsearch + Logstash + Kibana) stack for log collecting and sorting, and they would like a better solution due to the following issues:
- They want more zero-configuration behaviour (convention over configuration)
- Multi-line log processing is currently a major pain point
- They would like a standard, non-custom protocol between the log senders and aggregators, because they want to proxy the communication through Apache instances
- They have legacy applications which can only write logs to files and have no syslog/syslog-ng support
- The customer would like microservice-like components in the solution, without needing major new components in their infrastructure (e.g. a service registry)
- Every service must have a health-check/statistics interface over HTTP (a minimal sketch of such an endpoint follows this list)
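For reference, this is roughly what such an interface could look like. This is only a minimal sketch using Node.js's built-in http module; the port and the statistics fields are illustrative assumptions, not the final design.

```javascript
// Minimal health-check / statistics endpoint sketch (Node.js built-in http).
// The port and the statistics fields are illustrative, not the final design.
const http = require('http');

const stats = { startedAt: Date.now(), linesProcessed: 0 };

http.createServer((req, res) => {
  if (req.url === '/health') {
    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({ status: 'UP' }));
  } else if (req.url === '/stats') {
    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({
      uptimeSeconds: Math.floor((Date.now() - stats.startedAt) / 1000),
      linesProcessed: stats.linesProcessed
    }));
  } else {
    res.writeHead(404);
    res.end();
  }
}).listen(8080); // hypothetical port
```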
Our solution: At ITware we use top technologies and methodologies, including microservices, containerisation, cloud services and more. For this particular project we will deliver a solution made up of three components, divided as follows:
Log Collector:
- Listens for filesystem actions under a directory and starts reading newly created files
- Continuously reads the logfiles in this folder tree and passes the read lines to the log aggregator
- Source logfile paths are passed along with their contents; using this path, the aggregator can decide which processing method and pattern should be used
- Is aware of inode numbers, therefore application-level log rotation will not cause a rotated file to be read as a new logfile
- Communicates with the log aggregator over WebSocket (proxyable by Apache)
- Can handle file truncate events
- Will have the ability to save its state, so if it is restarted it will resume reading from the correct file positions (a minimal sketch of the collector follows this list)
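A minimal sketch of the collector's core loop, assuming Node.js and the 'ws' WebSocket package. The watched directory, aggregator URL, state file and polling interval are illustrative assumptions, and a production version would react to filesystem events rather than poll:

```javascript
// Minimal collector sketch: tails logfiles in one directory and ships new
// lines to the aggregator over WebSocket (using the 'ws' npm package).
const fs = require('fs');
const path = require('path');
const WebSocket = require('ws');

const LOG_DIR = '/var/log/myapp';             // hypothetical watched directory
const STATE_FILE = '/var/lib/collector.json'; // hypothetical state location
const ws = new WebSocket('ws://aggregator.example.com/logs'); // proxyable by Apache

// state[inode] = { path, offset } -- keyed by inode so that an
// application-level rename/rotate does not look like a brand-new file.
let state = {};
try { state = JSON.parse(fs.readFileSync(STATE_FILE, 'utf8')); } catch (e) {}

function poll() {
  for (const name of fs.readdirSync(LOG_DIR)) {
    const filePath = path.join(LOG_DIR, name);
    const st = fs.statSync(filePath);
    if (!st.isFile()) continue;
    const entry = state[st.ino] || (state[st.ino] = { path: filePath, offset: 0 });
    if (st.size < entry.offset) entry.offset = 0;   // file was truncated
    if (st.size > entry.offset) {
      const buf = Buffer.alloc(st.size - entry.offset);
      const fd = fs.openSync(filePath, 'r');
      fs.readSync(fd, buf, 0, buf.length, entry.offset);
      fs.closeSync(fd);
      entry.offset = st.size;
      // Ship each line together with its source path, so the aggregator
      // can pick the right processing method and pattern.
      // (A production version would also buffer a trailing partial line.)
      for (const line of buf.toString('utf8').split('\n').filter(Boolean)) {
        ws.send(JSON.stringify({ path: filePath, line }));
      }
    }
  }
  fs.writeFileSync(STATE_FILE, JSON.stringify(state)); // persist read positions
}

ws.on('open', () => setInterval(poll, 1000));
```

Keying the state by inode is what makes rotation and truncation detectable here: a renamed file keeps its inode and offset, while a size smaller than the saved offset signals a truncate.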
Log Aggregator:
- Listens on a WebSocket endpoint for collector connections
- Includes a parser database, which supports Grok and JavaScript parsers
- The processing method and pattern are chosen based on the source host and the logfile’s path
- Parses the log messages and stores them in Elasticsearch (see the sketch after the component list)
Elasticsearch:
- The common component shared with the predecessor ELK stack
- Stores the parsed log messages
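A minimal sketch of the aggregator side, again assuming Node.js and the 'ws' package. The ports, the index/type names and the trivial parser are illustrative assumptions; parser selection by path is sketched in the zero-configuration section below:

```javascript
// Minimal aggregator sketch: accepts collector connections over WebSocket
// and indexes parsed events into Elasticsearch over its REST API.
const http = require('http');
const WebSocket = require('ws');

// Placeholder: a real implementation would pick a Grok or JavaScript parser
// based on the source host and path (see the zero-configuration section).
function parse(path, line) {
  return { path, message: line, '@timestamp': new Date().toISOString() };
}

// Index one document via Elasticsearch's HTTP API (index/type are assumed).
function store(doc) {
  const req = http.request({
    host: 'localhost', port: 9200,
    path: '/logs/entry', method: 'POST',
    headers: { 'Content-Type': 'application/json' }
  });
  req.on('error', err => console.error('ES indexing failed:', err.message));
  req.end(JSON.stringify(doc));
}

const server = new WebSocket.Server({ port: 9400 }); // hypothetical port
server.on('connection', socket => {
  socket.on('message', raw => {
    // Frame format matches what the collector sketch sends.
    const { path, line } = JSON.parse(raw);
    store(parse(path, line));
  });
});
```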
About the log parsing/processing:
- Grok is supported because the predecessor system already stores its logs parsed with this pattern format
- A JavaScript engine is embedded in the aggregator, which makes it possible to write finite-state machines for log parsing; every logfile source gets its own JavaScript context. With that, very complex rules can be implemented for multi-line log parsing, as the sketch below shows.
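A sketch of such a finite-state machine. The format handled here (a Java-style stack trace), the class name and the continuation heuristic are all illustrative assumptions:

```javascript
// Multi-line parsing as a small finite-state machine. Each logfile source
// would run one instance of this in its own JavaScript context inside the
// aggregator. This example groups a Java-style stack trace with the log
// line that started it.
class StackTraceParser {
  constructor(emit) {
    this.emit = emit;      // callback that receives one complete log event
    this.current = null;   // event being assembled, or null
  }

  feed(line) {
    const isContinuation = /^\s+at\s|^\s+\.\.\.|^Caused by:/.test(line);
    if (isContinuation && this.current) {
      // Stay in the "collecting" state: attach the line to the open event.
      this.current.message += '\n' + line;
      return;
    }
    // A non-continuation line closes the previous event and opens a new one.
    if (this.current) this.emit(this.current);
    this.current = { message: line };
  }

  flush() {
    if (this.current) this.emit(this.current);
    this.current = null;
  }
}

// Usage: feed lines as they arrive; a multi-line trace comes out as one event.
const parser = new StackTraceParser(evt => console.log('event:', evt.message));
parser.feed('2017-01-11 10:00:00 ERROR Request failed');
parser.feed('    at com.example.Handler.run(Handler.java:42)');
parser.feed('    at java.lang.Thread.run(Thread.java:745)');
parser.feed('2017-01-11 10:00:01 INFO Recovered');
parser.flush();
```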
About zero configuration: The aggregator decides the log-processing parameters based on the source log path. For example, if every nginx log is placed in a /var/log/nginx folder and a rule is defined for that path, logs from any newly added source matching the rule are processed automatically, as sketched below.
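A sketch of this convention-over-configuration routing; the rule contents, pattern strings and script name are illustrative assumptions:

```javascript
// Path-based rule table: the source path alone selects the processing
// method and pattern, so new sources matching an existing rule need no
// extra configuration.
const rules = [
  // Everything under /var/log/nginx is parsed with an nginx access-log
  // Grok pattern; a newly added nginx host needs no further setup.
  { match: /^\/var\/log\/nginx\//, method: 'grok',
    pattern: '%{IPORHOST:client} .* "%{WORD:verb} %{URIPATHPARAM:request}"' },
  // Application logs under /var/log/myapp use a JavaScript FSM parser.
  { match: /^\/var\/log\/myapp\//, method: 'javascript', script: 'stacktrace.js' },
];

// First matching rule wins; unknown paths fall back to a plain parser.
function ruleFor(path) {
  return rules.find(r => r.match.test(path)) || { method: 'plain' };
}

console.log(ruleFor('/var/log/nginx/access.log').method);  // "grok"
console.log(ruleFor('/var/log/myapp/server.log').method);  // "javascript"
console.log(ruleFor('/tmp/other.log').method);             // "plain"
```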
Conclusion: This solution will not only solve all the issues the customer is facing with their current log collection stack, but will also provide them with a more efficient, easier-to-use log collector that has many added benefits for their company.