Self-hosting searX with Filtron and Caddy

I’ve recently begun changing my online lifestyle to be a bit more privacy-inclined, which has involved a few changes:

  • I have removed Chrome from my computer and replaced it with Firefox
  • I’ve updated my pi-hole DNS block lists to block a significant number of privacy-breaching trackers/domains
  • I’ve changed my domain’s email MX servers to use ProtonMail (I have a Professional account with them, which allows me to bring my own custom domain, set up catch-all email forwarding, and use a few other nice features)
  • I’ve recently changed my search engine of choice to searX (a self-hosted instance of it as well)

This post discusses how I configured my self-hosted instance of searX, with Caddy as a reverse proxy and a filtering agent called Filtron in front of it to keep the instance from being abused by bad actors and bots.

The first step is simple enough: follow the instructions on the installation page for searX: https://asciimoo.github.io/searx/dev/install/installation.html
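
For reference, the core of that install boils down to roughly the following. This is only a sketch (the linked page remains the authoritative source for the exact steps and dependency list); the paths and virtualenv name are chosen to match the systemd unit further down:

# Rough sketch of the searX install; see the official installation page for
# the full dependency list and exact steps
sudo git clone https://github.com/asciimoo/searx.git /usr/local/searx
cd /usr/local/searx
sudo virtualenv searx-ve                        # virtualenv referenced by the systemd unit below
sudo ./searx-ve/bin/pip install -r requirements.txt
sudo useradd searx -d /usr/local/searx          # unprivileged user the service runs as
sudo chown -R searx:searx /usr/local/searx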

Once I reached the uWSGI section, I quickly found out that Filtron doesn’t like forwarding to destinations that are Unix sockets, so I had to figure out how to keep searX running across reboots/crashes without using uWSGI. Luckily, I’m not the only person who wanted to achieve this; another user had already done most of the legwork and wrote a basic systemd service file for searX, publishing it in a GitHub issue: https://github.com/asciimoo/searx/issues/985

After a few tweaks I was satisfied with the following:

[Unit]
Description=searx
After=syslog.target network.target

[Service]
Type=simple
User=searx
WorkingDirectory=/usr/local/searx
ExecStart=/usr/local/searx/searx-ve/bin/python /usr/local/searx/searx/webapp.py
TimeoutStopSec=5
Restart=always
RestartSec=60

[Install]
WantedBy=multi-user.target

And while I was here, I went ahead and created the service file for Filtron:

[Unit]
Description=filtron
After=syslog.target network.target

[Service]
Type=simple
User=root
ExecStart=/usr/local/go/bin/filtron -rules "/root/rules.json"
TimeoutStopSec=5
Restart=always
RestartSec=60

[Install]
WantedBy=multi-user.target

After this, I did a quick systemctl daemon-reload to get the service files read by systemd and moved on. I modified the settings in searx/settings.yml to my liking and set the listening address to 127.0.0.1:8888 so I could point Filtron at it later on.
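
For reference, the relevant part of searx/settings.yml ends up looking roughly like this:

server:
    port : 8888
    bind_address : "127.0.0.1" # listen on localhost only, Filtron will sit in front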

Next was to install Filtron. Filtron is a Go binary that requires at least Go 1.9 because of the math/bits package it uses. The install steps are pretty simple and covered in the GitHub repo: https://github.com/asciimoo/filtron
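
The short version, assuming a Go 1.9+ toolchain is already installed and GOPATH is set, looks something like this; I then put the binary where the service file above expects it:

# Fetch and build Filtron (requires Go >= 1.9)
go get github.com/asciimoo/filtron
# Copy the resulting binary to the path used in the filtron systemd unit above
sudo cp "$GOPATH/bin/filtron" /usr/local/go/bin/filtron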

Once it was installed, I dumped the customized searX rules.json from searX’s installation page into a file on disk and fed it to Filtron. With Filtron happily listening on 127.0.0.1:4004 and forwarding requests to searX on 127.0.0.1:8888 after filtering out abusers per the provided rules.json, it’s time to move on to Caddy.
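
Those listen and target addresses are Filtron’s defaults, which is why the systemd unit earlier only passes -rules. Spelled out explicitly (flag names per Filtron’s README), the equivalent invocation would be:

filtron -rules "/root/rules.json" -listen 127.0.0.1:4004 -target 127.0.0.1:8888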

Caddy is probably the easiest part of this setup: I simply added an extra server block and configured it to pass connections to Filtron:

searx.odin.lan {
        tls tyler@tpage.io

        log /var/log/caddy/searx.odin.lan-443.log {
                rotate_size 50  # Rotate after 50 MB
                rotate_age  90  # Keep rotated files for 90 days
                rotate_keep 20  # Keep at most 20 log files
                rotate_compress # Compress rotated log files in gzip format
        }

        proxy / 127.0.0.1:4004 {
                websocket
                transparent
        }
}

Then I added a DNS record pointing searx.odin.lan to my server’s IP address and restarted Caddy. Once I verified that the DNS resolved, I enabled and started the Filtron and searX systemd services, and I was off to the races!
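
For completeness, that enable/start step looks like this (assuming the unit files were saved as searx.service and filtron.service under /etc/systemd/system):

sudo systemctl enable searx.service filtron.service
sudo systemctl start searx.service filtron.service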


Caddy and Mail-in-a-Box

Introduction

Recently I’ve decided to move away from using Vesta CP as my web control panel for my domain, since there haven’t been any major releases for it in over a year (as of this writing). I also found out that the EC2 instance ran out of disk space, despite my having configured Vesta CP to notify me when that happens. I researched alternative open source control panels and could not find a suitable replacement, so I decided to skip a web control panel altogether and configure everything on the box myself. My setup is pretty simple (a WordPress installation, a mail server, and reverse proxy endpoints for my home VPN), so I took the plunge.

Tinc VPN

This was pretty simple: I installed tinc via my package repository and copied over the configuration files from my previous (now defunct) EC2 instance. I configured a systemd service to manage the VPN tunnel connection and keep it alive in the background, and the rest is history. Here is my systemd file for tinc:

[Unit]
Description=Tinc VPN
After=network.target

[Service]
Type=simple
WorkingDirectory=/etc/tinc/vpn
ExecStart=/usr/sbin/tincd -n vpn -D -d3
ExecReload=/usr/sbin/tincd -n vpn -kHUP
TimeoutStopSec=5
Restart=always
RestartSec=60

[Install]
WantedBy=multi-user.target


Mail Server

I have heard good things about Mail-in-a-Box, and wanted to try it. One of the first issues I experienced is that it only supports Ubuntu 14.04, and any attempt to use it on a later version causes the script to abort after notifying you that the version is not supported by MIAB. Luckily, an Ubuntu 16.04 fork exists that requires a few extra steps to get working, but after that everything ran smoothly. During the installation process, MIAB installs an nginx web server to be the frontend for managing the installation (adding new users, aliases, webmail access, etc.). It also wipes any pre-existing nginx configuration files and uninstalls apache2, so be wary if you are trying to run this on a box not solely dedicated to email. Once the setup is complete, you can access the following endpoints:

  • Webmail via Roundcube at domain.com/mail
  • Administrative portal via a custom Flask server instance at domain.com/admin
  • Nextcloud instance (for contacts/calendar) at domain.com/cloud/contacts or domain.com/cloud/calendar

It also installs a few other features (DNS resolver, small static site hosting, autodiscover for Outlook/iOS), but more importantly, the web endpoints (and nginx) would clash with my reverse proxy setup and my intent to use Caddy as the web server instead.

Caddy

This was new to me; I have been a long-time user of nginx and never had any complaints about it. However, one of Caddy’s main features is its automatic HTTPS: no more dealing with SSL certificates, keys, expiration dates, etc. It’s all handled automatically, and certificate wrangling was definitely not one of my favorite parts of maintaining a web server. The documentation seemed simple enough, so I decided to migrate from nginx to Caddy as part of my new web server stack.

The first issue I had to deal with was nginx and Caddy fighting over ports 80/443. I thought about hosting nginx on a different port and using Caddy to reverse proxy traffic to it, but ultimately I decided it would be a better learning experience to rewrite the necessary nginx server blocks as Caddy server blocks. I skipped the Nextcloud-related ones, as I would not actually be using the MIAB-provided instance of Nextcloud (I run my own instance on my home server). That leaves Roundcube and the administrative panel. After much struggling and consulting the Caddy documentation, I was able to successfully access the administrative panel and Roundcube using the Caddy server blocks below:

mail-time.tpage.io {
      tls user@tpage.io
      gzip

      basicauth / user hunter2

      root /usr/local/lib/roundcubemail/
      fastcgi / /run/php/php7.0-fpm.sock php
}

admin-panel.tpage.io {
      tls user@tpage.io
      gzip

      header / {
            X-Frame-Options "DENY"
            X-Content-Type-Options "nosniff"
            Content-Security-Policy "frame-ancestors 'none';"
            Strict-Transport-Security "max-age=31536000"
      }

      proxy / http://127.0.0.1:10222/ {
            transparent
            websocket
      }
}

Moving forward, I had to set up a server block for my WordPress installation. This time it was pretty easy:

https://www.tpage.io https://tpage.io {
      tls user@tpage.io
      root /var/www/wordpress
      gzip
      fastcgi / /run/php/php7.0-fpm.sock php

      rewrite {
            if {path} not_match ^\/wp-admin
            to {path} {path}/ /index.php?_url={uri}
      }
}

http://www.tpage.io {
      tls user@tpage.io
      redir https://www.tpage.io
}

http://tpage.io {
      tls user@tpage.io
      redir https://tpage.io
}

Note the extra directives for tpage.io and www.tpage.io over HTTP. They make Caddy provision/manage the TLS certificates for Dovecot/Postfix (which use the server’s FQDN: tpage.io) instead of Mail-in-a-Box. I chose to do it this way because Caddy handles TLS certificate provisioning effortlessly, and it puts me at ease knowing I don’t have to worry about my certificates expiring. Without the extra directives, Caddy doesn’t automatically provision TLS certificates for tpage.io and www.tpage.io; it just redirects them to their HTTPS equivalents.

After configuring the domains and subdomains for automatic TLS certificate provisioning, I then had to modify the TLS certificate paths that Dovecot/Postfix use when configured for TLS (spoiler: Mail-in-a-Box configures them this way). It all came down to editing /etc/dovecot/conf.d/10-ssl.conf and /etc/postfix/main.cf, swapping out the TLS/SSL certificate/key paths for the paths where Caddy stores the FQDN (tpage.io) certs. For me this was /etc/ssl/caddy/acme/auto-generated-folder/tpage.io/tpage.io.crt and /etc/ssl/caddy/acme/auto-generated-folder/tpage.io/tpage.io.key
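
For reference, the directives in question look like this (paths abbreviated as above; note Dovecot’s leading < which tells it to read the value from a file):

# /etc/dovecot/conf.d/10-ssl.conf
ssl_cert = </etc/ssl/caddy/acme/auto-generated-folder/tpage.io/tpage.io.crt
ssl_key = </etc/ssl/caddy/acme/auto-generated-folder/tpage.io/tpage.io.key

# /etc/postfix/main.cf
smtpd_tls_cert_file = /etc/ssl/caddy/acme/auto-generated-folder/tpage.io/tpage.io.crt
smtpd_tls_key_file = /etc/ssl/caddy/acme/auto-generated-folder/tpage.io/tpage.io.key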

Afterwards I restarted Dovecot/Postfix and everything was golden!

Now for the reverse proxy connections over the Tinc VPN. This works because the internal web servers bind to 0.0.0.0, meaning all available interfaces (including VPN tunnels!), so you can build server blocks that reverse proxy incoming public HTTP(S) connections to private web servers via the VPN tunnel. A few sample server blocks for basic services (Plex, HomeAssistant, Nextcloud, etc.) are below:

hass.tpage.io {
      tls user@tpage.io

      proxy / 15.21.5.2:80 {
            websocket
            transparent
      }
}

plex-tv.tpage.io {
      tls user@tpage.io
      gzip
      timeouts none

      proxy / 15.21.5.4:32400 {
            transparent
            websocket
      }
}

nextcloud.tpage.io {
      tls user@tpage.io
      gzip

      proxy / 15.21.5.4:9090 {
            transparent
            websocket
      }
}

guacamole.tpage.io {
      tls user@tpage.io
      gzip

      basicauth / user hunter3

      proxy / 15.21.5.4:8080/guacamole/ {
            transparent
            websocket
      }
}


WordPress

This part was the easiest: I used cURL to grab the latest release from WordPress.org and decompressed it into /var/www/. After configuring the rest of my setup, I simply restored my previous WordPress installation from a recent backup and everything just worked! The plugin I use for backups/restores is All-in-One WP Migration.
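
The fetch-and-unpack step was roughly the following (wordpress.org’s standard latest-release tarball, which unpacks into a wordpress/ directory and so lines up with the root directive above):

# Grab the latest WordPress release and unpack it into the web root
curl -O https://wordpress.org/latest.tar.gz
sudo tar -xzf latest.tar.gz -C /var/www/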

Edit (3/21/2018): I’ve updated the post to describe how to make Caddy manage/renew the TLS certificates for Dovecot and Postfix

Edit (4/11/2018): Looks like there was a 0-day for Vesta CP being actively exploited – https://www.digitalocean.com/community/questions/how-do-i-determine-the-impact-of-vestacp-vulnerability-from-april-8th-2018

Edit (10/27/2018): I’ve since migrated away from MIAB to using ProtonMail as my MX server; self-hosting email is too much of a burden for where I’m at in my life right now.

Custom Cryptocurrency Sensors in HomeAssistant

I have recently diversified my investments in crypto-coins and wanted to keep track of them all in one place. So I utilized the CryptoCompare API, built a Python dictionary of all the different crypto-coins I own, and the rest is history!

Here is the script that I use:

import requests
import json
import datetime

# Custom Cryptocurrency sensor python script

# HASS URL
base_hass_url = "http://HASS URL/api/states/"

# Define headers for HASS
hass_headers = {'Accept': 'application/json', 'Content-Type': 'application/json', 'x-ha-access': "hunter2"}

# Get yesterday timestamp for historical pricing
yesterday = datetime.datetime.today() - datetime.timedelta(days=1)
yesterday_unix = yesterday.timestamp()
original = datetime.datetime(2017, 1, 1, 0, 0, 0)  # Jan 1, 2017 (month and day must be at least 1)
original_unix = original.timestamp()

# Add base URL structure
base_url = "https://min-api.cryptocompare.com/data/price?fsym="
base_history_url = "https://min-api.cryptocompare.com/data/pricehistorical?fsym="
base_headers = {'User-Agent': 'Python Requests', 'Content-Type': 'application/json'}
base_tsym = "&tsyms=USD"
base_history_tsym = "&tsyms=USD&ts=" + str(int(yesterday_unix))
base_original_tsym = "&tsyms=USD&ts=" + str(int(original_unix))

# Cryptocoin maps
cryptocoins = { "ETH": {"state": "0", "sensor_name": "sensor.coin_eth_price", "attributes": { "friendly_name": "Ethereum Value", "unit_of_measurement": "USD", "previous_value": "0", "original_value": "0"}},
                "BTC": {"state": "0", "sensor_name": "sensor.coin_btc_price", "attributes": { "friendly_name": "Bitcoin Value", "unit_of_measurement": "USD", "previous_value": "0", "original_value": "0"}},
                "LTC": {"state": "0", "sensor_name": "sensor.coin_ltc_price", "attributes": { "friendly_name": "Litecoin Value", "unit_of_measurement": "USD", "previous_value": "0", "original_value": "0"}},
                "NMC": {"state": "0", "sensor_name": "sensor.coin_nmc_price", "attributes": { "friendly_name": "Namecoin Value", "unit_of_measurement": "USD", "previous_value": "0", "original_value": "0"}},
                "ZEC": {"state": "0", "sensor_name": "sensor.coin_zec_price", "attributes": { "friendly_name": "Zcash Value", "unit_of_measurement": "USD", "previous_value": "0", "original_value": "0"}},
                "ICN": {"state": "0", "sensor_name": "sensor.coin_icn_price", "attributes": { "friendly_name": "Iconomi Value", "unit_of_measurement": "USD", "previous_value": "0", "original_value": "0"}},
                "GNT": {"state": "0", "sensor_name": "sensor.coin_gnt_price", "attributes": { "friendly_name": "Golem Value", "unit_of_measurement": "USD", "previous_value": "0", "original_value": "0"}},
                "STRAT": {"state": "0", "sensor_name": "sensor.coin_strat_price", "attributes": { "friendly_name": "Stratis Value", "unit_of_measurement": "USD", "previous_value": "0", "original_value": "0"}},
              }

# For each coin
for coin in cryptocoins:

  # HTTP GET the API for price and price yesterday
  coin_request = requests.get(base_url + coin + base_tsym, headers=base_headers)
  coin_request_historical = requests.get(base_history_url + coin + base_history_tsym, headers=base_headers)
  coin_request_original = requests.get(base_history_url + coin + base_original_tsym, headers=base_headers)

  # Set the response as the state for the coin and the previous value as an attribute
  response = coin_request.json()
  historical_response = coin_request_historical.json()
  original_response = coin_request_original.json()
  cryptocoins[coin]["state"] = str(response.get("USD"))
  cryptocoins[coin]["attributes"]["previous_value"] = str(historical_response.get(coin).get("USD"))
  cryptocoins[coin]["attributes"]["original_value"] = str(original_response.get(coin).get("USD"))

  # Build the payload and URL for HASS
  coin_url = base_hass_url + str(cryptocoins[coin]["sensor_name"])
  coin_payload = {
                    "state": cryptocoins[coin]["state"],
                    "attributes": cryptocoins[coin]["attributes"]
                  }

  # Make the request
  coin_request = requests.post(coin_url, headers=hass_headers, data=json.dumps(coin_payload))

Download videos from put.io using their API

I wanted to automatically download new videos from put.io as they got added to my account, so I took a look at their API and built a Python script to do it. The script descends into the parent folder and any child folders, looking for video files that are above a certain size and don’t contain the word “sample” in their names. After all the videos that meet those criteria have been downloaded (using aria2c), the script deletes all the folders and cleans the History and Transfers tabs.

Here is the script:

import requests
import json
import time
import subprocess
import os
import sys
import datetime

# put.io monitoring and downloading

# Lock file to avoid overlapping runs
file = "/tmp/tv_download"
one_day = datetime.datetime.now() - datetime.timedelta(days=1)

# If the lock file exists and is less than a day old, assume another run is
# still in progress and exit; otherwise treat the lock as stale and remove it
if os.path.isfile(file):
    filetime = datetime.datetime.fromtimestamp(os.path.getctime(file))

    if filetime > one_day:
        sys.exit(2)
    else:
        os.remove(file)

# (Re)create the lock file
use_file = open(file, "w")
use_file.write("In use")
use_file.close()

# Base URL and OAUTH token
url = "https://api.put.io/v2/"
oauth = "?oauth_token=<INSERT OAUTH TOKEN>"
headers = {'Accept': 'application/json', 'Content-Type': 'application/json'}
file_urls = {}

# This function adds {file_id: download_url} of files we want to file_urls, recursively
def get_video_urls(file):

    # If the file type is a FOLDER, recursively descend into it
    if file['file_type'] == 'FOLDER':
        
        # Grab the folder id
        folder_id = str(file['id'])

        # If its not an empty folder
        if file['size'] != 0:

            # Get the list of children in the folder
            folder_list = requests.get(url + "files/list" + oauth,
                                       headers=headers,
                                       params={'parent_id': folder_id}).json()

            # Process each child
            for child in folder_list['files']:
                get_video_urls(child)
                
        # If it is empty, add it to the file_urls dict and mark it as null for deletion        
        else:
            file_urls.update({folder_id: "null"})
    
    # If it's a video file we want (big enough and not a sample)
    if file['file_type'] == "VIDEO" and "sample" not in file['name'] and file['size'] > 50000000:

        # Grab its ID
        video_id = str(file['id'])

        # Get the download URL
        video_url_request = requests.head(url + "files/" + video_id + "/download" + oauth, headers=headers)
        video_url_headers = video_url_request.headers
        video_url = str(video_url_headers['Location'])

        # Add the ID and URL to file_urls{}
        file_urls.update({video_id: video_url})
    
    # Else it's a junk file
    else:

        # Grab its ID
        junk_id = str(file['id'])

        # Mark it as null in the file_urls dict so it gets deleted later
        file_urls.update({junk_id: "null"})

# Get files/folder in the specific folder given
other_child_list = requests.get(url + "files/list" + oauth,
                                headers=headers,
                                params={'parent_id':'<FOLDER ID>'}).json()

# Kick off the descending into the "Other" folder
for child in other_child_list['files']:
    get_video_urls(child)

# For each URL we get in the file_urls dict
for file_url in file_urls:

    # If we have a video file to download
    if file_urls.get(file_url) != "null":
        
        # Download it with aria2c and store the return code
        return_code = subprocess.call(["aria2c", "-c", "-q", "-x8", "-d",
                                       "/downloads", "--log-level=error", file_urls.get(file_url)])
        time.sleep(30)

        # If aria2c completed successfully delete the file
        if return_code == 0:
            file_deletion_data = {'file_ids': str(file_url)}
            file_delete_request = requests.post(url + "files/delete" + oauth,
                                                headers=headers,
                                                data=json.dumps(file_deletion_data))
    
    # Else we have an empty folder to delete
    else:
        file_deletion_data = {'file_ids': str(file_url)}
        file_delete_request = requests.post(url + "files/delete" + oauth,
                                            headers=headers,
                                            data=json.dumps(file_deletion_data))

# Pause to let the put.io API catch up
time.sleep(5)

# Clean the history tab
requests.post(url + "events/delete" + oauth)

# Pause again to let the request kick in
time.sleep(5)

# Clean the transfers tab
requests.post(url + "transfers/clean" + oauth)

# Remove the file when we are done
os.remove(file)


Monitor running services using systemd and HomeAssistant

I was having issues with my Raspberry Pi, specifically the Bluetooth service running through systemd. So naturally I wanted to be able to track when the service was reporting as failed or offline/disabled, and that manifested in another custom sensor setup for HomeAssistant. This is a Python script that iterates through a list of system services and creates sensors in your HomeAssistant instance whose state reflects each service’s status as reported by systemctl is-active (active, inactive, or failed). I have it set up to run every few minutes through a cron job, and it’s worked flawlessly so far.

import requests
import json
import subprocess
import time
import os

# systemd-service-sensor

# URL for homeassistant instance
hass_url = "http://HA_URL/api/states/sensor."

# Headers for homeassistant instance
hass_headers = {'Accept': 'application/json',
                'Content-Type': 'application/json',
                'x-ha-access': 'HA_PASSWORD'
               }

# Services I want to track
services = ["bluetooth.service","cron.service",
            "dasher.service","nginx.service",
            "ntp.service","ssh.service","supervisor.service"]

for service in services:

  # Get status information from systemctl
  service_info = subprocess.check_output("systemctl is-active " + service + "; exit 0",
                                         stderr=subprocess.STDOUT, 
                                         stdin=open(os.devnull), 
                                         shell=True).decode('utf-8').replace("\n","")

  # Generate payload for HASS
  hass_payload = {
        "state": service_info,
        "attributes": {
             "friendly_name": service.replace("."," ")
        }
  }

  # Format the sensor name as *service*_service instead of *service*.service
  hass_sensor = service.replace(".","_")
  hass_sensor = hass_sensor.replace("-","_")

  # POST data to HASS
  hass_request = requests.post(hass_url + hass_sensor,
                               headers=hass_headers,
                               data=json.dumps(hass_payload))
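
As mentioned above, the script runs from cron every few minutes; the crontab entry looks something like this (the interpreter and script path here are just examples):

# Run the service-monitoring script every 5 minutes
*/5 * * * * /usr/bin/python3 /home/pi/systemd_service_sensor.py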


Monitor Your Car with HomeAssistant and Dash

Recently I picked up a nice new Bluetooth OBDII adapter for my car, specifically this one. I wanted the ability to automatically check and record information about my car (average trip distance, odometer reading, etc.) without having to do anything but get in the car and drive.

The first part of the process was automatically recording my drives and uploading them to Dash. I achieved this by creating a Tasker profile to automatically launch the Dash app and start tracking my drive. The app talks to the OBDII adapter and uploads the information over my cell phone’s data connection to Dash’s servers.

The next part was getting access to the data uploaded through Dash’s API. The process was fairly simple: I used Hurl.it to set up and acquire an OAuth token for use with the API. They expose a lot of information through their API endpoints, but I was most interested in the quantitative data (specifically the odometer reading, distance traveled, and fuel consumed).

I wrote a Python script that runs every 10 minutes on my Raspberry Pi to poll the data I wanted and upload it to my HomeAssistant instance for tracking. The source code is below:

import requests
import json

# Custom Dash sensor python script

# Replace with your HomeAssistant Instance's URL
hass_url = "http://my-HA.com"

# Replace with your api_password you defined in configuration.yaml
hass_password = "hunter2"

# Make a request to the dash.by API
dash_url = "https://dash.by/api/chassis/v1/"
dash_header = {'Authorization': 'Bearer Token goes here'}

dash_request_user = requests.get(dash_url + "user", headers=dash_header)
dash_request_trip = requests.get(dash_url + "trips", headers=dash_header)

json_resp = dash_request_user.json()
trip_resp = dash_request_trip.json()

driver_score = str(json_resp["overallScore"])

vehicle_odometer = str(round(int(json_resp["currentVehicle"]["odometer"]), 0))

last_trip_time = str(round(int(trip_resp["result"][0]["stats"]["timeDriven"]), 0))
last_trip_average_speed = str(trip_resp["result"][0]["stats"]["averageSpeed"])
last_trip_distance_driven = str(round(int(trip_resp["result"][0]["stats"]["distanceDriven"]), 0))

driver_sensor = {
       "sensor_name": "sensor.dash_driver_score",
       "state": driver_score,
       "attributes": {"friendly_name": "Dash Drive Score", "unit_of_measurement": "points"}
}

vehicle_sensor = {
       "sensor_name": "sensor.dash_current_vehicle_odometer",
       "state": vehicle_odometer,
       "attributes": {"friendly_name": "Current Vehicle Odometer",
                      "unit_of_measurement": "miles"}
}

trip_duration_sensor = {
       "sensor_name": "sensor.dash_last_trip_time",
       "state": last_trip_time,
       "attributes": {"friendly_name": "Last Trip Duration", "unit_of_measurement": "min"}
}

trip_average_speed_sensor = {
       "sensor_name": "sensor.dash_last_trip_average_speed",
       "state": last_trip_average_speed,
       "attributes": {"friendly_name": "Last Trip Average Speed",
                      "unit_of_measurement": "mph"}
}

trip_distance_driven_sensor = {
       "sensor_name": "sensor.dash_last_trip_distance_driven",
       "state": last_trip_distance_driven,
       "attributes": {"friendly_name": "Last Trip Distance Driven",
                      "unit_of_measurement": "miles"}
}


sensor_list = []
sensor_list.append(driver_sensor)
sensor_list.append(vehicle_sensor)
sensor_list.append(trip_duration_sensor)
sensor_list.append(trip_average_speed_sensor)
sensor_list.append(trip_distance_driven_sensor)

for sensor in sensor_list:

  # Build the URL based on variables defined above
  hass_endpoint = hass_url + "/api/states/" + sensor["sensor_name"]

  # Define headers for HASS
  hass_headers = {'Accept': 'application/json',
                  'Content-Type': 'application/json',
                  'x-ha-access': hass_password}

  # Generate payload for HASS
  hass_payload = {
        "state": sensor["state"],
        "attributes": sensor["attributes"]
  }

  # POST data to HASS
  hass_request = requests.post(hass_endpoint,
                               headers=hass_headers,
                               data=json.dumps(hass_payload))