Bots and spiders are a common problem for many websites. They can drain server resources and even overload sites with low resource limits. This article demonstrates how to set up block lists and apply them on a per-site basis using an Nginx reverse proxy.

Before you get started, you'll need to know which bots you intend to block. This article includes the lists used by K&T Host, but feel free to use your own.

Step 1: Include your bot lists in your nginx.conf file. Edit /usr/local/etc/nginx/nginx.conf and add the following inside the http block.

# Block the Bots
include /usr/local/etc/nginx/lists/*.conf;
include /usr/local/etc/nginx/lists/search_bots;

Step 2: Add the lists to the /usr/local/etc/nginx/lists/ folder. Each list filename should end with .conf (for example, blocked-bots.conf), except the search_bots list, which has no extension.
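
The list files themselves are not reproduced here. As a rough sketch of what one might look like, assuming the lists use Nginx's map directive on the User-Agent header to set the variables tested in Step 3 (the bot names ExampleScraper and ExampleCrawler are hypothetical placeholders, not entries from the K&T Host lists):

```nginx
# /usr/local/etc/nginx/lists/blocked-bots.conf (illustrative sketch)
# map is only valid in the http context, which is why Step 1 places
# the include there. $blocked_bots becomes 1 when the User-Agent matches.
map $http_user_agent $blocked_bots {
    default              0;
    # "~*" marks a case-insensitive regular expression.
    # Both names below are hypothetical placeholders.
    ~*ExampleScraper     1;
    ~*ExampleCrawler     1;
}
```

Under this assumption, each list would define one variable in the same way, for example $scanners in a scanners.conf file.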

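Step 3: Create an "if" block for each bot list. Put each one in its own file so that you can include the files individually in each Nginx server definition. Each variable tested here must be defined by one of the lists from Step 2. The return 444 directive makes Nginx close the connection without sending a response.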

File: /usr/local/etc/nginx/security/block-blockedbots.conf

if ($blocked_bots = 1) {
        return 444;
}

File: /usr/local/etc/nginx/security/block-devtools.conf

if ($dev_tools = 1) {
        return 444;
}

File: /usr/local/etc/nginx/security/block-monitoring.conf

if ($monitoring_bots = 1) {
        return 444;
}

File: /usr/local/etc/nginx/security/block-scanners.conf

if ($scanners = 1) {
        return 444;
}

File: /usr/local/etc/nginx/security/block-data_collectors.conf

if ($data_collectors = 1) {
        return 444;
}

File: /usr/local/etc/nginx/security/block-social.conf

if ($social_bots = 1) {
        return 444;
}

File: /usr/local/etc/nginx/security/block-search

if ($search_bots = 1) {
        return 444;
}

Step 4: Add the relevant block lists to each Nginx virtual host. To do this, add the following to the server block of your virtual host definition.

# Include Security
include /usr/local/etc/nginx/security/*.conf;
# include /usr/local/etc/nginx/security/block-search;

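Uncomment the block-search line if you wish to block search engines. The wildcard include only picks up files ending in .conf, which is why block-search has no extension and must be included explicitly.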

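Alternatively, include only the block lists you want, one per line. Every file named this way must exist, or Nginx will fail its configuration check; block-referer.conf below is not one of the files created in Step 3, so create it first or drop that line. For example: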

# Bot Management
include /usr/local/etc/nginx/security/block-scanners.conf;
include /usr/local/etc/nginx/security/block-data_collectors.conf;
include /usr/local/etc/nginx/security/block-blockedbots.conf;
include /usr/local/etc/nginx/security/block-devtools.conf;
include /usr/local/etc/nginx/security/block-referer.conf;
include /usr/local/etc/nginx/security/block-social.conf;

# Search Engines
include /usr/local/etc/nginx/security/block-search;
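
To show where these includes sit in practice, here is a minimal sketch of a reverse proxy virtual host. The server name and the backend address are hypothetical placeholders, and only two of the block lists are included to keep the example short; adjust both to your own setup.

```nginx
# Illustrative virtual host. example.com and 127.0.0.1:8080 are
# hypothetical placeholders, not values from this article.
server {
    listen 80;
    server_name example.com;

    # Bot Management: these "if" blocks are evaluated at the server
    # level, before Nginx selects a location for the request.
    include /usr/local/etc/nginx/security/block-scanners.conf;
    include /usr/local/etc/nginx/security/block-blockedbots.conf;

    location / {
        # Requests that were not dropped are passed to the backend.
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
    }
}
```

Because each block list lives in its own file, a second virtual host can include a different combination without touching this one.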