
ITware is currently in the first phase of creating a new solution for one of our clients, and we have put together this case study to share with our readers.


The customer’s inquiry: The customer currently runs an ELK (Elasticsearch + Logstash + Kibana) stack for log collection and sorting, and is looking for a better solution because of the following issues:

  • They want more zero configuration / convention over configuration
  • Multi-line log processing is currently a major problem
  • They would like a non-custom protocol between the log senders and the aggregators, because they want to proxy the communication through Apache instances
  • They have legacy applications that can only write logs into files and have no syslog/syslog-ng support
  • They would like some microservice-like components in the solution, without introducing major new components into their infrastructure (e.g. a service registry)
  • Every service must expose a health-check/statistics interface over HTTP (a minimal sketch follows this list)
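For illustration, such a health-check endpoint can be very small. The sketch below uses Node's built-in http module; the /health route, the response fields and port 9100 are illustrative choices, not part of the customer's specification.

```typescript
// Hypothetical per-service health/statistics endpoint over HTTP.
import { createServer } from "http";

createServer((req, res) => {
  if (req.url === "/health") {
    // Report liveness plus a basic statistic; real services would
    // add their own counters (lines read, messages indexed, etc.).
    res.writeHead(200, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ status: "ok", uptimeSeconds: process.uptime() }));
  } else {
    res.writeHead(404);
    res.end();
  }
}).listen(9100); // illustrative port
```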

Our solution: At ITware we use leading technologies and methodologies, including microservices, containerisation, cloud services and more. For this project we will deliver a three-component solution, divided as follows:


Log Collector:

  • Listens for filesystem events under a directory and starts reading newly created files
  • Continuously reads the logfiles in this folder tree and passes the read lines to the log aggregator
  • Source logfile paths are passed along with their contents; based on this path, the aggregator can decide which processing method and pattern to use
  • Tracks files by inode number, so application-level log rotation does not cause an already-read file to be treated as a new one
  • Communicates with the log aggregator over WebSocket (which can be proxied by Apache)
  • Can handle file-truncate events
  • Can save its state, so after a restart it resumes reading each file from the correct position (see the sketch after this list)
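Below is a minimal TypeScript sketch of the collector's core loop. It assumes Node.js with the ws WebSocket library; the watch directory, aggregator URL and message shape are illustrative, state persistence and error handling are omitted, and recursive fs.watch needs a recent Node version on Linux.

```typescript
// Collector sketch: tails files under WATCH_DIR and forwards new lines
// to the aggregator over a WebSocket, tagged with their source path.
import * as fs from "fs";
import * as path from "path";
import WebSocket from "ws";

const WATCH_DIR = "/var/log/apps";                        // illustrative
const ws = new WebSocket("ws://aggregator:8080/collect"); // illustrative

// Read offset per inode: a rename-based rotation keeps the inode, so
// the already-read portion of a rotated file is not read again.
const offsets = new Map<number, number>();

function readNewLines(file: string): void {
  const stat = fs.statSync(file);
  const last = offsets.get(stat.ino) ?? 0;
  // A file smaller than its recorded offset was truncated: restart at 0.
  const start = stat.size < last ? 0 : last;
  if (stat.size <= start) return;
  const stream = fs.createReadStream(file, { start, end: stat.size - 1 });
  let buf = "";
  stream.on("data", (chunk) => (buf += chunk.toString()));
  stream.on("end", () => {
    offsets.set(stat.ino, stat.size);
    for (const line of buf.split("\n").filter((l) => l.length > 0)) {
      // The source path travels with every line, so the aggregator can
      // pick the right processing method and pattern for it.
      ws.send(JSON.stringify({ path: file, line }));
    }
  });
}

ws.on("open", () => {
  fs.watch(WATCH_DIR, { recursive: true }, (_event, name) => {
    if (!name) return;
    const file = path.join(WATCH_DIR, name.toString());
    if (fs.existsSync(file) && fs.statSync(file).isFile()) readNewLines(file);
  });
});
```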

Log Aggregator:

  • Listens on a WebSocket endpoint for collector connections
  • Includes a parser database that supports Grok and JavaScript parsers
  • The processing method and pattern are chosen based on the source host and the log file’s path
  • Parses the log messages and stores them in Elasticsearch (sketched below)
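A minimal sketch of the aggregator side, assuming the ws server and the official @elastic/elasticsearch client; the rule table, index name and message shape are illustrative and mirror the collector sketch above (the Grok pattern is reduced to a plain regular expression here).

```typescript
// Aggregator sketch: receives { path, line } messages from collectors,
// picks a parser by the source path, and indexes the parsed document.
import { WebSocketServer } from "ws";
import { Client } from "@elastic/elasticsearch";

const es = new Client({ node: "http://elasticsearch:9200" }); // illustrative
type Parser = (line: string) => Record<string, unknown> | null;

// Parser "database": the first rule whose pattern matches the source
// path wins; real rules would also take the source host into account.
const rules: Array<{ pattern: RegExp; parse: Parser }> = [
  {
    pattern: /\/var\/log\/nginx\//,
    parse: (line) => {
      const m = line.match(/^(\S+) \S+ \S+ \[([^\]]+)\] "([^"]*)" (\d{3})/);
      return m && { client: m[1], time: m[2], request: m[3], status: +m[4] };
    },
  },
];

new WebSocketServer({ port: 8080 }).on("connection", (socket) => {
  socket.on("message", async (data) => {
    const { path, line } = JSON.parse(data.toString());
    const rule = rules.find((r) => r.pattern.test(path));
    const doc = rule?.parse(line);
    if (doc) await es.index({ index: "logs", document: { path, ...doc } });
  });
});
```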

Elasticsearch:

  • The common component with the predecessor ELK stack
  • Stores the log messages

About log parsing/processing:

  • Grok is supported because the predecessor system already uses this format for its logs
  • A JavaScript engine is embedded in the aggregator, and every logfile source gets its own JavaScript context; this makes it possible to write finite-state machines for log parsing, so very complex rules can be implemented for multi-line logs (see the sketch below)
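As a sketch of the idea, the finite-state machine below groups Java-style stack traces into single events; the continuation patterns and event shape are illustrative, and it is written in TypeScript for consistency with the other sketches (the aggregator's embedded parsers would be plain JavaScript).

```typescript
// FSM sketch for multi-line entries such as Java stack traces. Each
// logfile source would run one instance in its own JavaScript context.
type LogEvent = { message: string; stack?: string[] };

class StackTraceParser {
  private current: LogEvent | null = null;

  // Feed one raw line; returns the previous entry once the machine
  // sees the start of a new one, otherwise null (still accumulating).
  feed(line: string): LogEvent | null {
    const continuation = /^\s+at |^Caused by: |^\s+\.\.\. \d+ more/.test(line);
    if (continuation && this.current) {
      (this.current.stack ??= []).push(line.trim());
      return null;
    }
    const done = this.current; // the previous entry is now complete
    this.current = { message: line };
    return done;
  }
}
```

A real parser would also flush the last accumulated entry on end-of-stream or after an idle timeout.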

About zero configuration: The aggregator decides the log processing parameters based on the source log path. For example, if every nginx log is placed in a /var/log/nginx folder and a rule is defined for this, logs from a newly added source are processed automatically.
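A sketch of how such a convention table could look; the path prefixes and parser names are illustrative.

```typescript
// Path-based convention over configuration: a new source is matched
// against existing rules, so a freshly added nginx host needs no
// per-host setup as long as its logs land under /var/log/nginx.
const conventions: Array<{ pathPrefix: string; parser: string }> = [
  { pathPrefix: "/var/log/nginx/", parser: "grok:nginx-access" },
  { pathPrefix: "/var/log/app/", parser: "js:multiline-stacktrace" },
];

function parserFor(sourcePath: string): string | undefined {
  return conventions.find((c) => sourcePath.startsWith(c.pathPrefix))?.parser;
}

// A brand-new host shipping /var/log/nginx/access.log is parsed
// immediately, with no collector- or host-specific configuration.
console.log(parserFor("/var/log/nginx/access.log")); // "grok:nginx-access"
```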


Conclusion: This solution will not only solve all the issues the customer faces with their current log collection, it will also give them a more efficient, easier-to-use log collector with many added benefits for their company.