Introduction

When it comes to centralized logging, my personal favorite solution is Graylog. It is incredibly robust and scales extremely well. But if you’re using the open-source version of Graylog, you’ve probably noticed a significant gap: it doesn’t offer an automated archiving feature for long-term storage of indices. Graylog Operations does have this feature, but what if upgrading isn’t in your budget?

This limitation in Graylog Open becomes increasingly problematic as more indices accumulate. Without an automated archiving solution, you’re left with only two options: closing the older indices, which still consumes storage, or deleting them, which sacrifices historical data. Neither of these options is very attractive.

To help you navigate this challenge, this blog post will guide you through a cost-effective alternative: automating the archiving of older Graylog indices using a Python script in conjunction with OpenSearch snapshots. Whether you’re on a tight budget, new to OpenSearch, or a seasoned Graylog administrator, this guide aims to alleviate one more stressor from your administrative responsibilities.

Let’s dive in!

The Problem

When it comes to managing logs, Graylog is incredibly efficient, but the platform has its limitations. One of those limitations, especially for users of the Graylog Open edition, is the absence of index archiving. This leaves administrators with the cumbersome task of manually creating snapshots and offloading them to alternative storage to keep the OpenSearch cluster running efficiently at all times. This is both time-consuming and prone to error.

As Graylog is often used in mission-critical applications where logs might be required for compliance, security monitoring, or debugging, an error in snapshot management could have serious implications. Moreover, for IT teams already stretched thin, freeing up human resources from repetitive tasks like this can make a significant difference.

So, the question arises: How can we automate this snapshot process to make life easier for Graylog administrators?

Prerequisites

Before you can execute the Python script for automated snapshot creation, there are some prerequisites to check off your list. This ensures that the environment is ready and capable of supporting the script.

  • The script is written in Python, and it is recommended to use Python 3.6 or later.

  • The script relies on the third-party packages requests and pytz (smtplib, re, and the other modules it imports ship with the Python standard library). Make sure to install them if you haven’t:

    pip install requests pytz
    
  • You will need administrative access to your Graylog and OpenSearch instances.

  • You will need to have created a snapshot repository on your OpenSearch cluster.

  • Make sure to replace the username and password variables in the script with your OpenSearch admin credentials.

  • The script includes an email notification feature, so you’ll need access to a Gmail account to send notifications. Replace gmail_user and gmail_pwd in the script with your Gmail username and password, respectively. (Note that Gmail now generally requires an app password for SMTP logins rather than your regular account password.)

  • Ensure that the machine running the script has network access to the Graylog and Opensearch servers. Check your firewall settings to confirm.

  • The script disables SSL certificate warnings by default for ease of use, which may not be suitable for all environments. It’s advisable to configure proper SSL certificates for production use.
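If you haven’t registered the snapshot repository mentioned above yet, it takes a single API call. Here’s a minimal sketch using the same host and repository name the script below assumes (opensearch01, my-fs-repository); the repo_path must already be listed under path.repo in your opensearch.yml, and the /mnt/snapshots path in the usage example is just a placeholder:

```python
import requests

def build_repo_request(host, repo_name, repo_path):
    """Build the URL and body for registering a shared-filesystem snapshot repository."""
    url = f"https://{host}:9200/_snapshot/{repo_name}"
    payload = {"type": "fs", "settings": {"location": repo_path}}
    return url, payload

def register_repository(host, repo_name, repo_path, auth, verify=False):
    """Register the repository; returns True when the cluster acknowledges it."""
    url, payload = build_repo_request(host, repo_name, repo_path)
    response = requests.put(url, auth=auth, json=payload, verify=verify)
    return response.status_code == 200
```

For example, register_repository('opensearch01', 'my-fs-repository', '/mnt/snapshots', (username, password)) returns True once the repository exists.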

The Solution: An Automated Snapshot Script

After understanding the limitations of manual snapshot management in Graylog Open, it’s clear that an automated solution is needed. That’s where our Python script comes in. Designed to bridge this functionality gap, the script automates the process of creating snapshots for Graylog indices in OpenSearch, making life easier for administrators and providing an added layer of data security.

Code Walkthrough

Importing Modules

import pytz
import re
import requests
import smtplib
from datetime import datetime
from email.mime.text import MIMEText
from requests.packages.urllib3.exceptions import InsecureRequestWarning
  • pytz: For timezone conversions.
  • re: For regular expressions, used to extract index names.
  • requests: To make API calls to OpenSearch.
  • smtplib and MIMEText: For sending email notifications.
  • datetime: To work with date and time objects.
  • InsecureRequestWarning: To suppress SSL certificate warnings.

Setting Up Variables

username = 'opensearch_username'
password = 'opensearch_password'

The username and password are the credentials used to authenticate against the OpenSearch cluster (the script talks to OpenSearch directly rather than to Graylog’s API). These are hardcoded for the purpose of this demonstration.

Sending Emails with send_email

def send_email(subject, body, to, gmail_user, gmail_pwd):
    msg = MIMEText(body)
    msg['Subject'] = subject
    msg['From'] = gmail_user
    msg['To'] = to

    try:
        server = smtplib.SMTP_SSL('smtp.gmail.com', 465)
        server.ehlo()
        server.login(gmail_user, gmail_pwd)
        server.send_message(msg)
        server.close()
        print('Email sent!')
    except Exception as e:
        print('Failed to send email:', e)

This function is responsible for sending emails in case of snapshot failures. It uses smtplib to connect to Gmail’s SMTP server and send emails.

Creating Snapshots with create_snapshot

def create_snapshot(index_name, snapshot_name, auth, verify=False):
    """Creates a snapshot for the given index"""
    url = f"https://opensearch01:9200/_snapshot/my-fs-repository/{snapshot_name}"
    headers = {'Content-Type': 'application/json'}
    payload = {
        "indices": index_name,
        "ignore_unavailable": True,
        "include_global_state": False
    }
    response = requests.put(url, auth=auth, headers=headers, json=payload, verify=verify)
    print(response.text)
    if response.status_code != 200:
        print(f"Failed to create snapshot for {index_name}. Status code: {response.status_code}")
        send_email('Snapshot creation failed', f'Failed to create snapshot for {index_name}. Status code: {response.status_code}', 'email-recipient@gmail.com', 'email-sender@gmail.com', 'sender_gmail_password') # Be sure to update the recipient, sender, and password fields!
        return False
    return True

This function calls OpenSearch’s snapshot API to create a snapshot of an older index. It uses the requests module to make a PUT request.
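One thing worth knowing: by default the PUT call returns as soon as the snapshot is initialized, not when it finishes (appending ?wait_for_completion=true to the URL makes it block instead). A small helper to check on a snapshot afterwards, assuming the same host and repository names as the script:

```python
import requests

def parse_snapshot_state(body):
    """Pull the state out of a GET _snapshot response: IN_PROGRESS, SUCCESS, PARTIAL, or FAILED."""
    return body["snapshots"][0]["state"]

def snapshot_state(snapshot_name, auth, verify=False):
    """Fetch the current state of a snapshot from the repository."""
    url = f"https://opensearch01:9200/_snapshot/my-fs-repository/{snapshot_name}"
    response = requests.get(url, auth=auth, verify=verify)
    response.raise_for_status()
    return parse_snapshot_state(response.json())
```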

Listing Indices with get_indices

def get_indices():
    url = 'https://opensearch01:9200/_cat/indices?v=true&pretty'
    response = requests.get(url, auth=(username, password), verify=False)
    if response.status_code != 200:
        print(f"Request failed with status code {response.status_code}")
        return []

    lines = response.text.split("\n")
    indices = []

    for line in lines:
        words = line.split()

        if words and words[0] == 'green':
            index = words[2]
            match = re.match(r"(.*?)_(\d+)", index)

            if match:
                base_name, rotation_number = match.groups()

                if base_name == 'firewall':
                    response = requests.get(f"https://opensearch01:9200/{index}/_settings", auth=(username, password), verify=False)
                    creation_date = response.json()[index]['settings']['index']['creation_date']
                    creation_date = datetime.fromtimestamp(int(creation_date) / 1000.0, tz=pytz.utc)
                    creation_date = creation_date.astimezone(pytz.timezone('US/Eastern'))
                    creation_date_formatted = creation_date.strftime("%Y-%m-%d_%I%M%p").lower()
                    indices.append((base_name, int(rotation_number), creation_date_formatted, index))

    indices.sort(key=lambda x: x[1])

    return indices

This function fetches all the current indices and their details using OpenSearch’s cat API. It returns a list of (base_name, rotation_number, creation_date, index) tuples, sorted by rotation number, which the main logic uses to decide which indices to snapshot.
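The regular expression in the loop is what splits a Graylog index name into its base name and rotation number. A quick illustration of how that match behaves:

```python
import re

def split_index_name(index):
    """Split an index name like 'firewall_12' into ('firewall', 12)."""
    match = re.match(r"(.*?)_(\d+)", index)
    if not match:
        return None
    base_name, rotation_number = match.groups()
    return base_name, int(rotation_number)

print(split_index_name("firewall_12"))  # ('firewall', 12)
print(split_index_name("graylog_0"))    # ('graylog', 0)
print(split_index_name("kibana"))       # None (no rotation suffix)
```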

Main Logic

indices = get_indices()

# Keep the nine most recent rotations; everything older gets snapshotted
indices_to_keep = indices[-9:]

for base_name, rotation_number, creation_date, index in indices:
    if (base_name, rotation_number, creation_date, index) not in indices_to_keep:
        if create_snapshot(index, creation_date, (username, password)):
            print(f"Created snapshot for {base_name}_{rotation_number}: {creation_date}")

The main logic of the script is straightforward:

  1. Fetch all indices using get_indices().
  2. Identify older indices to snapshot.
  3. Create snapshots of older indices using create_snapshot().
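A snapshot is only as good as your ability to restore it. As a hedged sketch of the matching restore call, here is the standard _restore endpoint with the same host and repository names assumed above (note that a restore fails if an open index with the same name already exists in the cluster):

```python
import requests

def build_restore_request(snapshot_name):
    """Build the URL and body for restoring the indices captured in a snapshot."""
    url = f"https://opensearch01:9200/_snapshot/my-fs-repository/{snapshot_name}/_restore"
    payload = {"include_global_state": False}
    return url, payload

def restore_snapshot(snapshot_name, auth, verify=False):
    """Restore a snapshot back into the cluster."""
    url, payload = build_restore_request(snapshot_name)
    response = requests.post(url, auth=auth, json=payload, verify=verify)
    return response.status_code == 200
```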

Security Considerations

Security is an indispensable aspect of any script that interacts with sensitive systems like Graylog and OpenSearch. Let’s discuss the security implications of this automated snapshot script.

Protecting Username and Password

The username and password variables are hardcoded into the script, which means anyone with access to the script holds the keys to your OpenSearch kingdom. This is risky business, as unauthorized users could potentially manipulate your data, delete indices, or even disable security features. Therefore, it’s crucial to keep these credentials secure. In a production environment, consider using environment variables, a configuration file, or a secure secret management service to handle sensitive information.
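As a concrete example of that advice, the hardcoded variables could be replaced with a small loader that reads from the environment (the variable names here are my own choice, not anything the script requires):

```python
import os

def load_credentials(env=None):
    """Read OpenSearch credentials from the environment instead of the script body."""
    env = os.environ if env is None else env
    username = env.get("OPENSEARCH_USERNAME")
    password = env.get("OPENSEARCH_PASSWORD")
    if not username or not password:
        raise RuntimeError("Set OPENSEARCH_USERNAME and OPENSEARCH_PASSWORD before running.")
    return username, password
```

Export the two variables in the shell (or the cron environment) before launching the script, then call username, password = load_credentials().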

Disabling SSL Certificate Warnings

The script includes a line to suppress SSL certificate warnings (InsecureRequestWarning). While this lets the script run without noisy warnings, it’s a practice that should be approached with caution. Disabling verification leaves you more susceptible to Man-in-the-Middle (MitM) attacks, as the script won’t verify the SSL certificate of the server to which it’s connecting. In a production environment, it’s highly recommended to configure proper certificate verification.
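In practice, fixing this means passing a real verify value instead of False on every request. One way to do that centrally is a shared session (the CA bundle path below is only an example):

```python
import requests

def make_session(ca_bundle=None):
    """Build a requests session that verifies TLS certificates on every call."""
    session = requests.Session()
    # True -> verify against the system trust store;
    # a path -> verify against a private/internal CA bundle.
    session.verify = ca_bundle if ca_bundle else True
    return session
```

With this in place, the requests.get/requests.put calls become session.get/session.put and the verify=False arguments disappear.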

By understanding these security considerations, you can better prepare your environment to minimize risks while utilizing the script’s functionality. Always prioritize security when dealing with automated scripts that interact with critical business systems.

How to Run the Script

You can run this script manually by executing python your_script_name.py in your terminal, or set it up as a secure cron job to run nightly.

To set up a secure cron job:

  1. Open the cron table by typing crontab -e in the terminal.
  2. Add the cron job line to execute it nightly at midnight.
  3. Secure the cron job by restricting access to the cron table and storing sensitive information in a secure configuration file.
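For step 2, the crontab entry might look like the following (the interpreter path, script path, and log location are placeholders to adapt to your system):

```shell
# m h dom mon dow  command
0 0 * * * /usr/bin/python3 /opt/scripts/graylog_snapshots.py >> /var/log/graylog_snapshots.log 2>&1
```

Redirecting stdout and stderr to a log file keeps a record of each nightly run, and restricting permissions (e.g. chmod 600) on any configuration file holding credentials covers step 3.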

Conclusion

You’ve now gained a comprehensive understanding of automating snapshots of Graylog indices in OpenSearch through a Python script. You’ve learned why snapshots are essential, how to run the script, and key considerations for each part of the code.

There’s room for future improvements, such as better error handling and adding more notification methods. The script provides a solid foundation for keeping your Graylog data securely backed up. Happy automating!

Complete Script

Here is the complete Python script for your reference. Feel free to take this script, modify it, improve upon it, and integrate it into your own Graylog management procedures:

import pytz
import re
import requests
import smtplib
from datetime import datetime
from email.mime.text import MIMEText
from requests.packages.urllib3.exceptions import InsecureRequestWarning


requests.packages.urllib3.disable_warnings(category=InsecureRequestWarning)

username = 'opensearch_username'
password = 'opensearch_password'

def send_email(subject, body, to, gmail_user, gmail_pwd):
    msg = MIMEText(body)
    msg['Subject'] = subject
    msg['From'] = gmail_user
    msg['To'] = to

    try:
        server = smtplib.SMTP_SSL('smtp.gmail.com', 465)
        server.ehlo()
        server.login(gmail_user, gmail_pwd)
        server.send_message(msg)
        server.close()
        print('Email sent!')
    except Exception as e:
        print('Failed to send email:', e)

def create_snapshot(index_name, snapshot_name, auth, verify=False):
    """Creates a snapshot for the given index"""
    url = f"https://opensearch01:9200/_snapshot/my-fs-repository/{snapshot_name}"
    headers = {'Content-Type': 'application/json'}
    payload = {
        "indices": index_name,
        "ignore_unavailable": True,
        "include_global_state": False
    }
    response = requests.put(url, auth=auth, headers=headers, json=payload, verify=verify)
    print(response.text)
    if response.status_code != 200:
        print(f"Failed to create snapshot for {index_name}. Status code: {response.status_code}")
        send_email('Snapshot creation failed', f'Failed to create snapshot for {index_name}. Status code: {response.status_code}', 'recipient_email@gmail.com', 'sender_email@gmail.com', 'sender_password')
        return False
    return True

def get_indices():
    url = 'https://opensearch01:9200/_cat/indices?v=true&pretty'
    response = requests.get(url, auth=(username, password), verify=False)

    if response.status_code != 200:
        print(f"Request failed with status code {response.status_code}")
        return []

    lines = response.text.split("\n")
    indices = []

    for line in lines:
        words = line.split()
        if words and words[0] == 'green':
            index = words[2]
            match = re.match(r"(.*?)_(\d+)", index)

            if match:
                base_name, rotation_number = match.groups()

                if base_name == 'firewall':
                    response = requests.get(f"https://opensearch01:9200/{index}/_settings", auth=(username, password), verify=False)
                    creation_date = response.json()[index]['settings']['index']['creation_date']
                    creation_date = datetime.fromtimestamp(int(creation_date) / 1000.0, tz=pytz.utc)
                    creation_date = creation_date.astimezone(pytz.timezone('US/Eastern'))
                    creation_date_formatted = creation_date.strftime("%Y-%m-%d_%I%M%p").lower()
                    indices.append((base_name, int(rotation_number), creation_date_formatted, index))
    indices.sort(key=lambda x: x[1])

    return indices

indices = get_indices()

indices_to_keep = indices[-9:]

for base_name, rotation_number, creation_date, index in indices:
    if (base_name, rotation_number, creation_date, index) not in indices_to_keep:
        if create_snapshot(index, creation_date, (username, password)):
            print(f"Created snapshot for {base_name}_{rotation_number}: {creation_date}")

Thank you for reading, and I hope you found this guide beneficial in your journey towards more efficient and automated Graylog management! In the future, I will address purging snapshots that are older than 180 days to keep a running rotation of snapshots.