In which a foolish developer tries DevOps: critique my VPS provisioning script!
I'm attempting to provision two mirror staging and production environments for a future SaaS application that we're close to launching as a company, and I'd like to get some feedback on the...
I'm attempting to provision two mirror staging and production environments for a future SaaS application that we're close to launching as a company, and I'd like to get some feedback on the provisioning script I've created that takes a default VPS from our hosting provider, DigitalOcean, and readies it for being a secure hosting environment for our application instance (which runs inside Docker, and persists data to an unrelated managed database).
I'm sticking with a simple infrastructure architecture at the moment: A single VPS which runs both nginx and the application instance inside a containerised docker service as mentioned earlier. There's no load balancers or server duplication at this point. @Emerald_Knight very kindly provided me in the Tildes Discord with some overall guidance about what to aim for when configuring a server (limit damage as best as possible, limit access when an attack occurs)—so I've tried to be thoughtful and integrate that paradigm where possible (disabling root login, etc).
I’m not a DevOps or sysadmin-oriented person by trade—I stick to programming most of the time—but this role falls to me as the technical person in this business; so the last few days has been a lot of reading and readying. I’ll run through the provisioning flow step by step. Oh, and for reference, Ubuntu 20.04 LTS.
First step is self-explanatory.
#!/bin/sh
# Name of the user to create and grant privileges to.
USERNAME_OF_ACCOUNT=
sudo apt-get -qq update
sudo apt install -qq --yes nginx
sudo systemctl restart nginx
Next, create my sudo user, add them to the groups needed, require a password change on first login, then copy across any provided authorised keys from the root user which you can configure to be seeded to the VPS in the DigitalOcean management console.
useradd --create-home --shell "/bin/bash" --groups sudo,www-data "${USERNAME_OF_ACCOUNT}"
passwd --delete $USERNAME_OF_ACCOUNT
chage --lastday 0 $USERNAME_OF_ACCOUNT
HOME_DIR="$(eval echo ~${USERNAME_OF_ACCOUNT})"
mkdir --parents "${HOME_DIR}/.ssh"
cp /root/.ssh/authorized_keys "${HOME_DIR}/.ssh"
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
chown --recursive "${USERNAME_OF_ACCOUNT}":"${USERNAME_OF_ACCOUNT}" "${HOME_DIR}/.ssh"
sudo chmod 775 -R /var/www
sudo chown -R $USERNAME_OF_ACCOUNT /var/www
rm -rf /var/www/html
Installation of docker, and run it as a service, ensure the created user is added to the docker group.
sudo apt-get install -qq --yes \
apt-transport-https \
ca-certificates \
curl \
gnupg-agent \
software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo apt-key fingerprint 0EBFCD88
sudo add-apt-repository --yes \
"deb [arch=amd64] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) \
stable"
sudo apt-get -qq update
sudo apt install -qq --yes docker-ce docker-ce-cli containerd.io
# Only add a group if it does not exist
sudo getent group docker || sudo groupadd docker
sudo usermod -aG docker $USERNAME_OF_ACCOUNT
# Enable docker
sudo systemctl enable docker
sudo curl -L "https://github.com/docker/compose/releases/download/1.27.4/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose
sudo ln -s /usr/local/bin/docker-compose /usr/bin/docker-compose
docker-compose --version
Disable root logins and any form of password-based authentication by altering sshd_config
.
sed -i '/^PermitRootLogin/s/yes/no/' /etc/ssh/sshd_config
sed -i '/^PasswordAuthentication/s/yes/no/' /etc/ssh/sshd_config
sed -i '/^ChallengeResponseAuthentication/s/yes/no/' /etc/ssh/sshd_config
Configure the firewall and fail2ban.
sudo ufw default deny incoming
sudo ufw default allow outgoing
sudo ufw allow ssh
sudo ufw allow http
sudo ufw allow https
sudo ufw reload
sudo ufw --force enable && sudo ufw status verbose
sudo apt-get -qq install --yes fail2ban
sudo systemctl enable fail2ban
sudo systemctl start fail2ban
Swapfiles.
sudo fallocate -l 1G /swapfile && ls -lh /swapfile
sudo chmod 0600 /swapfile && ls -lh /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile && sudo swapon --show
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
Unattended updates, and restart the ssh daemon.
sudo apt install -qq unattended-upgrades
sudo systemctl restart ssh
Some questions
You can assume these questions are cost-benefit focused, i.e. is it worth my time to investigate this, versus something else that may have better gains given my limited time.
- Obviously, any critiques of the above provisioning process are appreciated—both on the micro level of criticising particular lines, or zooming out and saying “well why don’t you do this instead…”. I can’t know what I don’t know.
- Is it worth investigating tools such as
ss
orlynis
(https://github.com/CISOfy/lynis) to perform server auditing? I don’t have to meet any compliance requirements at this point. - Do I get any meaningful increase in security by implementing 2FA on login here using google authenticator? As far as I can see, as long as I'm using best practices to actually
ssh
into our boxes, then the likeliest risk profile for unwanted access probably isn’t via the authentication mechanism I use personally to access my servers. - Am I missing anything here? Beyond the provisioning script itself, I adhere to best practices around storing and generating passwords and ssh keys.
Some notes and comments
- Eventually I'll use the hosting provider's API to spin up and spin down VPS's on the fly via a custom management application, which gives me an opportunity to programmatically execute the provisioning script above and run some over pre- and post-provisioning things, like deployment of the application and so forth.
- Usage alerts and monitoring is configured within DigitalOcean's console, and alerts are sent to our business' Slack for me to action as needed. Currently, I’m settling on the following alerts:
- Server CPU utilisation greater than 80% for 5 minutes.
- Server memory usage greater than 80% for 5 minutes.
- I’m also looking at setting up daily fail2ban status alerts if needed.