Web server security – Part 6: GDPR-friendly logging, and server monitoring

In this part of the Web server security series, we discuss GDPR-friendly logging, and server monitoring. Both actions are essential to securely operate a web server. The whole idea can be extended to all server-side log files.

Contents

  1. Basics
  2. Requirements
  3. GDPR-friendly server-side logging
  4. Server monitoring
  5. Summary
  6. Links

Kindly note that the following article isn’t legal advice. If you need legal advice, contact a lawyer. It is forbidden in many countries to legally advise people if the advisor isn’t a lawyer. Subsequent GDPR-related topics describe the situation of InfoSec Handbook, and may not be applicable to your legal situation.

Basics

The most important rule first: Log files without (at least manual/occasional) monitoring are absolutely useless. A log file doesn’t protect your web server if it’s “just there”. Attackers can modify log files to hide their actions, read log files to access sensitive information, or delete log files, and you will never learn about this as long as you don’t monitor your log files. Hence, monitor your log files, or disable logging altogether. Incidentally, this is also true for many insecurely operated IP cameras: attackers access these cameras and change parameters, and the owners rarely notice because they don’t monitor their cameras.

Why is logging important?

As mentioned before, log files are similar to recorded camera feeds. A camera feed allows you to see events in real time (if you monitor your feeds in real time), or after events happened. This is also true for log files. They can tell you about:

  • suspicious activity (e.g. attempts to access several accounts)
  • broken links
  • missing files
  • wrong or outdated server configuration
  • features that aren’t supported client-side
  • security incidents (e.g. an unauthorized party successfully accessed your web server)

Moreover, it can be required by law to log certain events.

Logging and GDPR

Logging doesn’t always imply tracking. Logging isn’t necessarily restricted to “logging personal data”. Unfortunately, many private users seem (or at least seemed) to be confused by the new regulations introduced by the European GDPR in 2018. No, logging isn’t illegal in general, and no, you don’t always need the explicit permission of every visitor to log events. However, don’t just log everything; this is very likely illegal. You need a basic concept for logging. Carefully document:

  • Purpose of logging (e.g. monitor access attempts)
  • Personal data affected
    • type (e.g. IP address, username, full name, e-mail address, date of birth)
    • shared with (e.g. log server provider)
    • what you do with the personal data (e.g. automatically block IP addresses)
    • how long you store it (e.g. for 14 days)
    • where it is located (e.g. in “/var/log/access-attempts.log”)
    • data flows (e.g. from web server to log server)
    • legal basis for storing this information (e.g. Article 6(1)(f) GDPR)
  • content of the log file (e.g. IP address, timestamp, requested file, HTTP status code)

Did you actually include personal data? Then, you need to ask yourself if it’s absolutely necessary to log this personal data. For instance, you don’t need to log IP addresses if you only want to identify broken links.
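If existing log files already contain client addresses that you don’t need for your purpose, one option is to remove them before analysis or archiving. The following is a minimal sketch, assuming log lines in the “common” format where the client address is the first field; the function name strip_client_ip and the sample line are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: replace the first field (%h, the client address)
# of each log line with "-" and leave the rest of the line untouched.
strip_client_ip() {
    sed 's/^[^ ]*/-/'
}

# Invented sample line in the "common" format.
printf '%s\n' '203.0.113.7 - - [10/Oct/2018:13:55:36 +0200] "GET /missing.html HTTP/1.1" 404 512' \
    | strip_client_ip
```

Not logging the address in the first place (by leaving %h out of the log format) remains the cleaner solution; this sketch only helps with files that already exist.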

The next question is whether it’s legally allowed to store personal data for the defined purposes. There is no easy answer here. Again, we aren’t lawyers. However, in general, minimal logging for security purposes should be okay in most cases. Some websites always refer to Article 6(1)(f) GDPR when it comes to logging. Don’t do this. Don’t log personal data if you don’t need it. Other websites tell their visitors that they delete their log files every day. That doesn’t make sense, since it would require evaluating log events continuously. Be honest about this. Add information about logging to your privacy policy, and (again) carefully document logging of personal data.

Additionally, be aware that the GDPR understands “processing personal data” as “doing something with personal data”. “Processing” isn’t limited to “storing”. The term “processing” includes collecting, recording, using, structuring, and much more.

The Ctrl blog wrote a detailed article about legal aspects of logging in “EU GDPR and personal data in web server logs”.

Don’t log sensitive data

Sometimes, your server software allows you to log sensitive data. Do not log data like:

  • access tokens, passwords, keys, secrets
  • source code
  • payment data
  • session identifiers
  • database connection strings
  • internal network names

For example, the XMPP server ejabberd allows you to log passwords in cleartext. Never do this, and frequently check if such “features” are disabled.
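To check from time to time whether such data ends up in your log files anyway, a simple pattern search can serve as a starting point. This is only a sketch: the pattern list and the function name scan_sensitive are assumptions, and you need to adapt the patterns to what your applications could leak.

```shell
#!/bin/sh
# Hypothetical patterns; extend them to match your own applications.
SENSITIVE_PATTERN='(password|passwd|secret|token|api[_-]?key)='

# Print "file:line:text" for every matching line (case-insensitive).
# Exit status is 0 if something was found, and 1 if nothing was found.
scan_sensitive() {
    # /dev/null as an extra file makes grep always print the file name.
    grep -E -i -n -- "$SENSITIVE_PATTERN" "$@" /dev/null
}

# Demo with an invented log file.
demo_file=$(mktemp)
printf '%s\n' \
    'GET /login?user=alice&password=hunter2 HTTP/1.1' \
    'GET /index.html HTTP/1.1' > "$demo_file"
scan_sensitive "$demo_file"
rm -f "$demo_file"
```

On the server, you could call it as scan_sensitive /var/log/apache2/access.log, for instance. A match doesn’t prove a leak, and no match doesn’t prove its absence.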

Requirements

In the following, we discuss common log files on Debian 9 and Apache 2.4.25. If you use another operating system/web server, file names/content might differ. An essential part of the guide below is a GnuPG key used to encrypt log files server-side. We provide a quickstart guide for people who don’t know how to create GPG keys.

Additionally, you should ensure that your server’s clock is synchronized. Inconsistent timestamps likely render log files useless.

Thirdly, clearly define your log formats, or use common log formats like the “NCSA Common log format”.

GDPR-friendly server-side logging

The following steps describe how you tell logrotate to use your GPG public key to encrypt log files. You can apply this for all log files. However, in this guide we only change the configuration for Apache-related log files. The GDPR doesn’t explicitly dictate that one has to encrypt log files. However, minimal logging combined with encryption is in line with it.

Step 1: Upload and import your GPG public key

As mentioned above, you need a GPG key on your server to automatically encrypt log files. For security reasons, we generate the GPG key client-side. The server only needs the public GPG key. If you still need a GPG key, have a look at our quickstart guide. Advanced users can use a Nitrokey or YubiKey to securely generate and store their GPG key.

After creating your GPG key, export your GPG public key. Then, copy it to your server using scp on your client: scp your-public-gpg-key.asc [username]@[server-ip-address]:/home/[username]/. Your GPG public key is uploaded to the home directory of the remote user.

Connect to your server using ssh [username]@[server-ip-address]. Add the GPG key to the keyring of root: sudo gpg --import your-public-gpg-key.asc. Finally, you must trust this GPG key to use it later: sudo gpg --edit-key [your-gpg-key-id]. Enter trust within gpg. Set the trust level to 5 (I trust ultimately). Enter quit to leave gpg. Your GPG public key is ready now.

Step 2: Find your log files

To see log files on your server, connect to it using ssh [username]@[server-ip-address], and enter ls -l /var/log/. The “/var/log/” folder usually contains most log files. For example:

  • auth.log (usage of authorization systems such as PAM)
  • daemon.log (information about running system and application daemons)
  • kern.log (detailed log of messages from the kernel)
  • syslog (comprehensive information about events)

Besides, there are log files created by applications:

  • apache2/ (folder containing Apache-related log files)
  • fail2ban.log (IP addresses banned by fail2ban)
  • lynis.log (information about system hardening created by Lynis)
  • rkhunter.log (information about rootkits and file integrity created by rkhunter)
  • ufw.log (firewall-related information like blocked IP addresses)

Every log file can contain personal data. We recommend carefully documenting the purpose of each log file, and checking whether it contains personal data. You can modify the GPG setup below to encrypt these files too.

Step 3: Understand logging by Apache

Apache’s module mod_log_config allows you to define custom log formats. Furthermore, you can write logs to files, or pass them to other programs. For simplicity, we only look at writing logs to file in this article.

On Debian, basic parameters for Apache are located in “/etc/apache2/apache2.conf”. Look for LogFormat directives in the file. They have the form LogFormat [elements-that-will-be-logged] [nickname-of-the-rule]. For instance:

  • NCSA Common log format: LogFormat "%h %l %u %t \"%r\" %>s %O" common
    • %h: IP address of the client (remote host)
    • %l: remote log name
    • %u: remote user name
    • %t: timestamp
    • \"%r\": first line of request
    • %>s: final HTTP status code
    • %O: bytes sent (including HTTP headers)
    • common: nickname of the rule
  • at InfoSec Handbook, we use: LogFormat "%t %>s %O \"%r\" %!200,304,302{User-agent}i" ish
    • %t: timestamp
    • %>s: final HTTP status code
    • %O: bytes sent (including HTTP headers)
    • \"%r\": first line of request
    • %!200,304,302{User-agent}i: user agent, if HTTP status code doesn’t match 200, 302, or 304
    • ish: nickname of the rule

For more examples and guidance, go to the official Apache Module mod_log_config page.
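To see how a line in the “ish” format splits into whitespace-separated fields (which matters for filtering with awk later), consider the following sketch. The sample line is invented, and the function name describe_ish_line is hypothetical.

```shell
#!/bin/sh
# Invented sample line in the "ish" format.
sample='[10/Oct/2018:13:55:36 +0200] 404 512 "GET /missing.html HTTP/1.1" Mozilla/5.0 (X11; Linux x86_64)'

# %t contains a space, so the timestamp occupies fields $1 and $2.
# The user agent is not quoted, so $8 only holds its first token
# (or "-" if the status code was 200, 302, or 304).
describe_ish_line() {
    awk '{
        printf "timestamp=%s %s\n", $1, $2
        printf "status=%s\n", $3
        printf "bytes=%s\n", $4
        printf "method=%s\n", substr($5, 2)   # strip the leading quote
        printf "path=%s\n", $6
        printf "first_user_agent_token=%s\n", $8
    }'
}

printf '%s\n' "$sample" | describe_ish_line
```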

Depending on your setup, you must define the usage of your log files in another configuration file. For instance, add it to “/etc/apache2/sites-enabled/000-default-le-ssl.conf”:

[…]
  ErrorLog ${APACHE_LOG_DIR}/error.log
  CustomLog ${APACHE_LOG_DIR}/access.log [nickname-of-custom-format]
[…]

You see that Apache actually creates multiple log files. In this case, it creates:

  • error.log (diagnostic information, and any errors that Apache encounters in processing requests)
  • access.log (all client requests processed by the server)

If ModSecurity is present, there is also modsec_audit.log. This log file contains the full HTTP requests and responses, as well as the reason for blocking IP addresses.

Step 4: Configure logrotate

All log files can be managed by logrotate. logrotate is a standard command for log management on Linux. Log rotation means “keep older log files for a defined time period”. The time period (and more) is defined in “/etc/logrotate.conf”. Open the file by entering sudo nano /etc/logrotate.conf. The file may look like:

[…]
# rotate log files weekly
weekly

# keep 4 weeks worth of backlogs
rotate 4

# create new (empty) log files after rotating old ones
create

[…]

# packages drop log rotation information into this directory
include /etc/logrotate.d

[…]

By default, your server creates a fresh log file every week, and keeps the four most recently rotated log files (“rotate 4”). The important part is include /etc/logrotate.d. The folder “/etc/logrotate.d/” contains additional configuration files. There is a special configuration file for Apache, called “apache2”.

Open it by entering sudo nano /etc/logrotate.d/apache2. The file may look like:

/var/log/apache2/[…] {
        weekly
        missingok
        rotate 4
        delaycompress
        notifempty
        create 640 root adm
        sharedscripts
        […]
}

We can now tell logrotate to use our GPG public key, which we configured in step 1. Change “weekly” to “daily” and “rotate 4” to “rotate 14”, and add the directives from “shred” to “compressext” to enforce encryption for your Apache logs, and securely delete old files:

/var/log/apache2/[…] {
        daily
        missingok
        rotate 14
        delaycompress
        notifempty
        create 640 root adm
        sharedscripts

        # delete log files using shred -u
        shred

        # compress old log files using the command defined by compresscmd
        compress

        # use gpg to "compress" (= encrypt) old files
        compresscmd /usr/bin/gpg

        # tell gpg to encrypt log files for the specific recipient
        compressoptions --encrypt --recipient [your-gpg-key-id]

        # use .gpg as the file extension for encrypted files
        compressext .gpg
        […]
}

As you can see, logrotate now rotates log files every day and keeps the 14 most recently rotated files (“daily”, “rotate 14”), so about two weeks of logs are available. This setup implies that the current log file remains unencrypted for up to 1 day. Because “delaycompress” is set, the most recently rotated file also stays unencrypted until the next rotation. If necessary, you can change “daily” to “hourly”. For instance, using “hourly” and “rotate 24” means that logrotate creates a new log file every hour, and keeps the 24 most recently rotated files in addition to the current one. In this case, the current log file remains unencrypted for 1 hour. Note that you must additionally modify your cron jobs to execute logrotate every hour.
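The retention arithmetic can be sketched as follows. This is a rough upper bound, assuming that logrotate really runs once per interval, that it keeps “rotate” old files in addition to the active file, and that “delaycompress” postpones encryption of the newest rotated file by one interval. Both function names are hypothetical.

```shell
#!/bin/sh
# Upper bound (in hours) on the age of the oldest retained log entry:
# the active file plus `rotate` old files, each covering one interval.
max_log_age_hours() {
    interval_hours=$1
    rotate_count=$2
    echo $(( interval_hours * (rotate_count + 1) ))
}

# Upper bound (in hours) on how long an entry stays unencrypted:
# one interval in the active file, plus one more with delaycompress.
max_unencrypted_hours() {
    interval_hours=$1
    delaycompress=$2    # 1 if delaycompress is set, 0 otherwise
    echo $(( interval_hours * (1 + delaycompress) ))
}

max_log_age_hours 24 14       # "daily" and "rotate 14": prints 360
max_unencrypted_hours 24 1    # "daily" with delaycompress: prints 48
```

Use numbers like these when you state retention periods in your documentation and privacy policy.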

Step 5: Test logrotate

Finally, we must test if our setup works. You can tell logrotate to rotate log files immediately. Enter sudo logrotate --force /etc/logrotate.d/apache2. If you see any errors, reconfigure your setup.

Otherwise, list all Apache log files by entering: ls -l /var/log/apache2/. There should be at least one “.gpg” file now. Its content should be encrypted/unreadable.
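To spot rotated files that were not encrypted, you could list all rotated files without the “.gpg” extension. The following is a minimal sketch, assuming that rotated files are named like “access.log.1” or “access.log.2.gpg”; the function name list_unencrypted_rotated is hypothetical.

```shell
#!/bin/sh
# List rotated log files (name contains ".log.") that don't end in ".gpg".
list_unencrypted_rotated() {
    log_dir=${1:-/var/log/apache2}
    find "$log_dir" -type f -name '*.log.*' ! -name '*.gpg' | sort
}

# Demo with an invented directory layout.
demo_dir=$(mktemp -d)
touch "$demo_dir/access.log" "$demo_dir/access.log.1" "$demo_dir/access.log.2.gpg"
list_unencrypted_rotated "$demo_dir"    # lists only access.log.1
rm -rf "$demo_dir"
```

With “delaycompress”, one file ending in “.1” per log is expected in this list, because logrotate postpones its encryption to the next rotation. Reading “/var/log/apache2/” requires sufficient permissions.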

Additionally, you can check if logrotate is executed daily by your system. There should be the “/etc/cron.daily/logrotate” file on your server.

Server monitoring

Besides logging, we need to monitor our log files. There are many dedicated tools for this; some of them are proprietary software, others are open-source software. If you are interested in this, have a look at syslog, Kibana, Nagios, or Splunk. We may present these tools in an upcoming article of this series.

For simplicity, we show you manual retrieval and analysis of your log files in this article. An advantage of this method is that you don’t create a new attack surface by installing more software. The biggest disadvantage is that it is inconvenient.

To copy the log files to your client, you must add your remote user to the remote “adm” group. Debian’s adm group is used for system monitoring tasks. Log files created by logrotate can be read by members of the adm group by default. So, enter sudo usermod -aG adm [username] on your server to add your remote user to the server’s adm group. Alternatively, you can create a new user only for the purpose of retrieving log files.

After that, you can simply retrieve log files by entering scp -r [username]@[server-ip-address]:/var/log/apache2/* . on your client. This command copies all files of the remote folder to the current local folder. You must decrypt .gpg files now by entering: gpg -o [output-file] -d [input-file].gpg.
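If you retrieved many encrypted files, decrypting them one by one is tedious. A small loop around the gpg command shown above can help. This is a sketch under the assumption that your private key is available on the client; the function name decrypt_logs is hypothetical.

```shell
#!/bin/sh
# Decrypt every "*.gpg" file in a directory and write the plaintext
# next to it, using the same file name without the ".gpg" extension.
decrypt_logs() {
    target_dir=${1:-.}
    for encrypted in "$target_dir"/*.gpg; do
        [ -e "$encrypted" ] || continue    # no .gpg files in this folder
        gpg -o "${encrypted%.gpg}" -d "$encrypted" || return 1
    done
}
```

Call decrypt_logs without arguments to process the current folder, or pass a folder name. Keep in mind that the decrypted copies contain the same personal data as the originals, so delete them after your analysis.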

Finally, you can use awk to filter your local log files for events of interest (the filter is valid for our log format “ish”):

  • awk '$3 != 200 {print $0}' access.log: print all requests except requests that resulted in HTTP status code 200
  • awk '$3 != 200 && $3 != 304 {print $0}' access.log: as before, but additionally exclude HTTP status code 304
  • awk '$8 != "-" {print $0}' access.log: print all requests with a logged user agent (an exact comparison is needed here, since a pattern match on “-” would also hit user agents that contain a hyphen)
  • awk '$3 == 404 {print $6}' access.log: print requested files that aren’t on the web server
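For a quick overview before filtering, you could also count requests per HTTP status code. The following sketch again assumes our log format “ish”, where the status code is the third field; the function name count_by_status and the sample lines are hypothetical.

```shell
#!/bin/sh
# Count requests per HTTP status code ($3 in the "ish" format).
# Reads the given files, or standard input if no file is given.
count_by_status() {
    awk '{ count[$3]++ } END { for (s in count) print s, count[s] }' "$@" | sort -n
}

# Demo with invented log lines.
count_by_status <<'EOF'
[10/Oct/2018:13:55:36 +0200] 200 1024 "GET / HTTP/1.1" -
[10/Oct/2018:13:55:37 +0200] 404 512 "GET /missing.html HTTP/1.1" curl/7.52.1
[10/Oct/2018:13:55:38 +0200] 200 2048 "GET /about/ HTTP/1.1" -
EOF
```

For the demo input, this prints “200 2” and “404 1”. An unusually high number of 4xx or 5xx codes is a hint to look closer at the corresponding requests.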

Last but not least, it’s also important to monitor other aspects of your server that can’t be easily logged. For instance, your complete DNS setup could be modified by attackers. Hence, you must monitor your DNS configuration. Furthermore, there are errors that can’t be logged by your web server like CSP violations, or network errors. In upcoming parts of this series, we will discuss these aspects.

Summary

Follow five simple rules to make your logging more privacy-friendly:

  1. Carefully consider what personal data you want to log, and document why it is important for you
  2. Document all log files, and mention logging of personal data in your privacy policy
  3. Disable logging of personal data wherever it isn’t necessary
  4. Automatically encrypt all log files that contain personal data, using the setup above
  5. Monitor your log files using special tools, or at least do it by hand

Keep in mind: Log files without monitoring are absolutely useless. Moreover, monitor your DNS setup, and configuration changes in general.

See also