The value of structured app logging for programmers

in Software developmentBy Vitor Oliveira, Software Engineer

In my teams, logging is always a main concern and one of the first things to do when we bootstrap an application. The value of a meaningful and structured logging architecture is immeasurable because it helps us to solve any problem faster and more efficiently. Without logs, your team could be stuck trying to explain why your application is behaving unexpectedly, or it could take hours just to respond to a simple client query about an error. In Hexis, that doesn’t happen because we plan ahead and we take this matter very seriously.

We live in a world where clients want applications developed in an unbelievably short time. We, software engineers, struggle (almost) every time to meet deadlines because we want to deliver the best work that we can. Strict deadlines means that there’s always a risk that something is left behind: automatic tests, code quality from taking shortcuts, documentation, among other things, can turn a project into a nightmare in the long run.

We are aware of all these critical things but there is a particular piece of any application that we only remember when we really need it: the logs. Yes. The logs are all those letters that you painfully see in your terminal when something goes wrong and you have your boss asking every 5 minutes why the application is still malfunctioning.

Source: datamation.com

In my career, I learned and absorbed a set of rules to streamline the logging of my applications. Do you find this boring? Would you prefer to never have to stay awake until dawn again, trying to understand why your users are receiving errors when they want to do something trivial? Interested? Then come with me through the W’s that make part of the rules of logging.

What

Before we really start on this, you have to remember only one thing: you have to log everything! Every request. I don’t care if it’s a GET, a POST, a PUT, if it had success, if returned some kind of validation error, I really don’t care. Log everything.

You only log the errors? Sorry, you are doing it wrong! How do you know when a specific user made a request? How are you going to create that wonderful dashboards full of statistics with tools like Kibana or Grafana? And of course, another thing that is great for those dashboards is to log the log level itself. Actually, it should be mandatory to record if that log entry is an INFO, WARN, ERROR, etc.

Typically, a backend application serves an API (it could be B2B, B2C, but that doesn’t matter). What’s important is that you want to know which service the user just called, so you have to log the name of the endpoint. And even though that name is quite specific, it’s always a good idea to log also the type of HTTP method of the endpoint. What changed? Was a resource created? Was a resource retrieved? How many? Do you see what I mean? This is the data that will become invaluable at some point in your project. And typically, it’s information that you have at the moment of the response, so you don’t have any excuse not to log it!

Bonus points: thread/request id. It’s always nice to have this when you have to track a long operation that logs a lot of details: like queries or even custom information, which is being recorded at the same time that the application is still receiving other requests. The multiple lines of logging different requests can get mixed up and it can become a real mess to find anything. The identifier of a request can prevent a situation like this, or it can be easily resolved with a simple search.

2019–04–06 17:02:30.414 INFO [req-000000086] [c.b.a.controller.ApiController] [operationName=ApiController.getAll, httpMethod=GET, IPAddress=100.200.300.400, userId=55df8919, executionTime=12, size=5] [operationResult=OK]
2019–04–06 12:10:21.741 INFO [req-000000052] [c.b.a.controller.ApiController] [operationName=ApiController.get, httpMethod=GET, IPAddress=100.200.300.400, userId=55df8919, executionTime=5, resourceId=1000] [operationResult=OK]

When

Let’s face it: knowing the time of a situation in the log is the most important aspect. And don’t be a fool: if you are using a framework for your language or for logging and this isn’t a default value that appears in the log, my friend, you better start googling for alternatives.

The typical format that we are looking for is a human readable timestamp: YYYY-MM-DD HH:MM:SS.SSS. If you need, you should go down to millisecond precision. And of course, pay attention to that thing that haunts you in your sleep: you have to be careful with timezones!

We live in a world where every second counts. In 2000, it was acceptable to wait 10 seconds to open a web page but now if takes more than 1 second the user is going to close that page. So, you are already guessing what I’m going to say: execution time. Every request has to log the corresponding response time. And yes, this is a nice one for that fancy chart in your dashboards!

2019–04–06 17:02:03.375 INFO [req-000000083] [c.b.a.controller.ApiController] [operationName=ApiController.edit, httpMethod=PATCH, IPAddress=100.200.300.400, userId=55df8919, executionTime=127] [operationResult=NO_CONTENT]]

Who

Every aspect that we’ve talked about until now is important but it could become irrelevant if you don’t know who made a certain operation in your application.

Typically you have an authentication system for you application. It doesn’t matter if it’s a REST API with a bearer token or basic auth, or if it’s a web app with a Keycloak, Cierge or AWS Cognito. When the request enters your endpoint you should have access to the user identifier that made it, and at that moment you should log that identifier. If you have endpoints that allow unauthenticated requests, then you should log something like: userId=Anonymous.

If you don’t have an authentication system, you can take another approach and opt to print the IP that made a request. It’s not the most credible identifier (the user could be connected to a network that supports one hundred users but has a static IP, for example), but it’s better than having no clue about who made a certain request. And one thing is obvious: if security is your application’s main concern, independently logging the IP is always important, whether you have an authentication system or not.

2019–04–07 08:18:04.293 INFO [req-000000016] [c.b.a.controller.ApiController] [operationName=ApiController.create, httpMethod=POST, IPAddress=100.200.300.400, userId=55df8919, executionTime=57, resourceId=1002] [operationResult=CREATED]

Why

Well, this one is obvious from the beginning. Only if the logging of your application is turned off, will you not have the errors printed in your log. So, this is simple: to the client we always return a pretty message. In the log, we should print every nasty little thing that caused an error. Even that one that will give nightmares because that little bug is your fault!

Slack beautiful 404 error page

But not everything revolves around errors. In my opinion, you don’t have to print all the validation errors that you send to the user but there are some that you should log. For example, there is one that is trivial: trying to access a resource that doesn’t exist. The user is going to receive a 404 HTTP code and the log should reflect that. If you use some tool that identifies those kind of errors in the log, you could raise alerts based on the number of those events.

2019–04–07 08:24:26.138 WARN [req-000000040] [c.b.a.e.ExceptionHandlingControllerAdvice] [operationName=ApiController.get, httpMethod=GET, IPAddress=100.200.300.400, userId=55df8919, resourceId=123456789, executionTime=4] [operationResult=NOT_FOUND] Resource not found: 123456789
2019–04–07 09:47:13.263 WARN [req-000000055] [c.b.a.e.ExceptionHandlingControllerAdvice] [operationName=ApiController.create, httpMethod=POST, IPAddress=100.200.300.400, userId=55df8919, executionTime=140] [operationResult=CONFLICT] The selected dates are already reserved

Where

In this final section, I want to mention two things that could become handy in certain situations. If your application is an API that serves a frontend application, I’m sure there is valuable browser information that made the request. We all know that dealing with the panoply of browsers that exist nowadays is a nightmare. Help that poor frontend developer with this.

This next one isn’t easily implemented but I believe that some applications need to log this: the geolocation of the request. You have a lot of APIs to know the location of an IP but calling an API just to add information to the log is a little bit expensive.

Conclusion

If you read everything and your log has all the information, you are a superstar. If you have some, I honestly believe that is a good start and it’s a sign that you care about your project! I highly recommend allocating some time to improve this in your application. As we’ve just seen, it’s so important! If you don’t have anything but the default information, you have a lot of work to do, so hurry up before a user or your client raises a question that you won’t know how to answer!

I hope that this article has convinced you about the importance of correct logging within your applications. And believe me, when it comes to logging, there is always room for improvement. It’s a never ending task. So what are you waiting for?