A few easy linux commands, and a real-world example on how to use them in a pinch
Below is a summary of a real-world performance investigation I recently went through. The tools I used are installed on virtually every Linux system, but I know some people don't know them and would jump straight to heavyweight log analysis services and whatnot, or to writing their own solution.
Let's say you have request log sampling in a bunch of log files that contain lines like these:
127.0.0.1 [2021-05-27 23:28:34.460] "GET /static/images/flags/2/54@3x.webp HTTP/2" 200 1806 TLSv1.3 HIT-CLUSTER SessionID:(null) Cache:max-age=31536000
127.0.0.1 [2021-05-27 23:51:22.019] "GET /pl/player/123456/changelog/ HTTP/1.1" 200 16524 TLSv1.2 MISS-CLUSTER SessionID:(null) Cache:
You might recognize Fastly logs there (IP anonymized). Now, there's a lot you might care about in this log file, but in my case, I wanted to get a breakdown of hits vs misses by URL.
So, first step, let's concatenate all the log files, so we can work off a single file:
cat *.log > all.txt
Then, let's split the file in two: hits and misses. There are a few different values for them, but the majority are covered by either HIT-CLUSTER or MISS-CLUSTER. We can do this by just grepping for them like so:
grep HIT-CLUSTER all.txt > hits.txt; grep MISS-CLUSTER all.txt > misses.txt
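A minimal sketch of a slightly stricter split, assuming the cache status is always the 10th space-separated field as in the sample lines above. The third log line and the temporary directory are made up for illustration. It first tallies every status value that occurs, which shows what the two greps would leave out, and then matches on that field only:

```shell
# Throwaway directory with a tiny sample; the third line is invented.
workdir=$(mktemp -d)
cat > "$workdir/all.txt" <<'EOF'
127.0.0.1 [2021-05-27 23:28:34.460] "GET /static/images/flags/2/54@3x.webp HTTP/2" 200 1806 TLSv1.3 HIT-CLUSTER SessionID:(null) Cache:max-age=31536000
127.0.0.1 [2021-05-27 23:51:22.019] "GET /pl/player/123456/changelog/ HTTP/1.1" 200 16524 TLSv1.2 MISS-CLUSTER SessionID:(null) Cache:
127.0.0.1 [2021-05-27 23:52:01.100] "GET /static/app/images/1.png HTTP/2" 200 512 TLSv1.3 HIT-CLUSTER SessionID:(null) Cache:max-age=31536000
EOF

# Tally every value seen in field 10 (the cache status), most frequent first
status_counts=$(awk '{ count[$10]++ } END { for (s in count) print count[s], s }' "$workdir/all.txt" | sort -nr)
echo "$status_counts"

# Compare field 10 exactly, so a URL that happens to contain
# "HIT-CLUSTER" cannot end up in the wrong file
awk '$10 == "HIT-CLUSTER"'  "$workdir/all.txt" > "$workdir/hits.txt"
awk '$10 == "MISS-CLUSTER"' "$workdir/all.txt" > "$workdir/misses.txt"
```

For a quick look, plain grep as above is usually enough; the field match only matters if the status strings can also show up elsewhere in a line.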
However, we only care about the URL and whether it's a hit or a miss. So let's clean up those hits and misses with cut. The way cut works, it takes a delimiter (-d) and cuts the input based on that; you then give it a range of "fields" (-f) that you want.
In our case, if we cut based on spaces, we end up with, for example:
127.0.0.1
[2021-05-27
23:28:34.460]
"GET
/static/images/flags/2/54@3x.webp
HTTP/2"
200
1806
TLSv1.3
HIT-CLUSTER
SessionID:(null)
Cache:max-age=31536000
We care about the 5th value only, so cut -d" " -f5 gets us that. We will also sort the result, because future operations will require us to work on a sorted list of values:
cut -d" " -f5 hits.txt | sort > hits-sorted.txt; cut -d" " -f5 misses.txt | sort > misses-sorted.txt
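To see the field selection in isolation, here is a small sketch that runs cut on the first sample line from above. The variable names are just for illustration:

```shell
line='127.0.0.1 [2021-05-27 23:28:34.460] "GET /static/images/flags/2/54@3x.webp HTTP/2" 200 1806 TLSv1.3 HIT-CLUSTER SessionID:(null) Cache:max-age=31536000'

# Field 5 on its own is the URL
url=$(printf '%s\n' "$line" | cut -d" " -f5)
echo "$url"

# A comma-separated list picks several fields; here the URL and the cache status.
# cut joins the selected fields with the same delimiter it split on.
url_and_status=$(printf '%s\n' "$line" | cut -d" " -f5,10)
echo "$url_and_status"
```

Note that cut treats every single space as a separator, so this only works because the log format never uses two spaces in a row.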
Now we can start doing some neat stuff. wc (word count) is an awesome utility; it lets you count characters, words or lines very easily. wc -l counts lines in an input. Since we're operating with one value per line, we can easily count our hits and misses already:
$ wc -l hits-sorted.txt misses-sorted.txt
132523 hits-sorted.txt
220779 misses-sorted.txt
353302 total
220779 / 132523 is a 1:1.66 ratio of hits to misses. That's not great…
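The ratio can be worked out without leaving the terminal. A small sketch using the two counts from the wc output above; with real files you could fill the variables from wc -l < hits-sorted.txt instead:

```shell
hits=132523
misses=220779

# The shell itself only does integer arithmetic, so hand the division to awk
ratio=$(awk -v h="$hits" -v m="$misses" 'BEGIN { printf "%.3f", m / h }')
hit_rate=$(awk -v h="$hits" -v m="$misses" 'BEGIN { printf "%.1f", 100 * h / (h + m) }')

echo "misses per hit: $ratio"
echo "hit rate: $hit_rate%"
```

Put as a percentage, that is a hit rate of about 37.5% of the sampled requests.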
Alright, now I'm also interested in how many unique URLs are hit versus missed. The uniq tool deduplicates immediate sequences, so the input has to be sorted in order to deduplicate our entire file. We already did that. We can now count our URLs with:
uniq < hits-sorted.txt | wc -l; uniq < misses-sorted.txt | wc -l
We get 49778 and 201178, respectively. It's to be expected that most of our cache misses would be in "rarer" URLs; this gives us roughly a 1:4 ratio of cached to uncached URLs.
Let's say we want to dig down further into which URLs are most often hitting the cache, specifically. We can add -c to uniq in order to get a duplicate count in front of our URLs. To get the top ones at the top, we can then use sort, in reverse sort mode (-r), and it also needs to be a numeric sort, not alphabetic (-n). head lets us get the top 10.
$ uniq -c < hits-sorted.txt | sort -nr | head
815 /static/app/webfonts/fa-solid-900.woff2?d720146f1999
793 /static/app/images/1.png
786 /static/app/fonts/nunito-v9-latin-ext_latin-regular.woff2?d720146f1999
760 /static/CACHE/js/output.cee5c4089626.js
758 /static/images/crest/3/light/notfound.png
757 /static/CACHE/css/output.4f2b59394c83.css
756 /static/app/webfonts/fa-regular-400.woff2?d720146f1999
754 /static/app/css/images/loading.gif?d720146f1999
750 /static/app/css/images/prev.png?d720146f1999
745 /static/app/css/images/next.png?d720146f1999
And same for misses:
$ uniq -c < misses-sorted.txt | sort -nr | head
56 /
14 /player/237678/
13 /players/
12 /teams/
11 /players/top/
<snip>
So far this tells us static files are most often hit, and for misses it also tells us… something, but we can't quite track it down yet (and we won't, not in this post). We're not adjusting for how often the page is hit as a whole, this is still just high-level analysis.
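If the "input has to be sorted" part of uniq seems abstract, this small sketch on made-up URLs shows the difference. The awk at the end only strips the padding that uniq -c puts in front of its counts:

```shell
# Made-up, unsorted sample: /a appears three times, but not all in a row
unsorted=$(printf '%s\n' /a /b /a /a /c)

# uniq alone only collapses repeats that are directly adjacent
adjacent_only=$(printf '%s\n' "$unsorted" | uniq | wc -l | tr -d ' ')

# Sorting first brings all duplicates together, so uniq sees each URL once
distinct=$(printf '%s\n' "$unsorted" | sort | uniq | wc -l | tr -d ' ')

# Same counting pipeline as above, reduced to the single most frequent URL
top=$(printf '%s\n' "$unsorted" | sort | uniq -c | sort -nr | head -n 1 | awk '{ print $1, $2 }')

echo "uniq only: $adjacent_only lines, sort + uniq: $distinct lines, top: $top"
```

Without the sort, uniq reports four lines instead of three, because the first /a is separated from the other two.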
One last thing I want to show you! Let's take everything we learned and analyze those URLs by prefix instead. We can cut our URLs again by slash with cut -d"/". Because every URL starts with a slash, the first field is empty, so if we want the first prefix, we can do -f1-2, or -f1-3 for the first two prefixes. Let's look!
$ cut -d'/' -f1-2 < hits-sorted.txt | uniq -c | sort -nr | head
100189 /static
5948 /es
3069 /player
2480 /fr
2476 /es-mx
2295 /pt-br
2094 /tr
1939 /it
1692 /ru
1626 /de
$ cut -d'/' -f1-2 < misses-sorted.txt | uniq -c | sort -nr | head
66132 /static
18578 /es
17448 /player
17064 /tr
11379 /fr
9624 /pt-br
8730 /es-mx
7993 /ru
7689 /zh-hant
7441 /it
This gives us hit-miss ratios by prefix. Neat, huh?
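The two lists still have to be compared by eye. As a rough sketch, assuming files laid out like hits-sorted.txt and misses-sorted.txt (one URL per line), awk can put hits, misses and a hit percentage side by side per prefix. The sample URLs and the temporary directory are made up:

```shell
workdir=$(mktemp -d)
printf '%s\n' /static/a.png /static/a.png /static/b.css /es/players/ > "$workdir/hits-sorted.txt"
printf '%s\n' /es/players/ /es/teams/ /es/teams/x /static/c.js > "$workdir/misses-sorted.txt"

# With -F/ the first field is empty (URLs start with a slash), so $2 is the prefix.
# FNR == NR is only true while awk reads the first file, i.e. the hits.
prefix_report=$(awk -F/ '
    FNR == NR { hits["/" $2]++; seen["/" $2] = 1; next }
              { misses["/" $2]++; seen["/" $2] = 1 }
    END {
        for (p in seen)
            printf "%s %d %d %.0f%%\n", p, hits[p], misses[p], 100 * hits[p] / (hits[p] + misses[p])
    }' "$workdir/hits-sorted.txt" "$workdir/misses-sorted.txt" | LC_ALL=C sort)

# Columns: prefix, hits, misses, hit percentage
echo "$prefix_report"
```

The FNR == NR trick assumes the hits file is not empty; for a one-off analysis that seemed like an acceptable shortcut.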
13 votes -
The bashtop resource monitor is a work of art
12 votes -
Typesetting Markdown - Part 8
5 votes -
Oil 0.8.pre4: The Biggest Shell Programs in the World
7 votes -
Converting Project Gutenberg Projects to Markdown
12 votes -
What terminal emulator do you use?
What are your experiences with your current terminal emulator or former ones? What makes you use your current terminal emulator? What shell do you use?
16 votes -
How can I make "whereis" automatically open the file on Nvim when it is the only result?
EDIT: SOLVED
It looks like it was much simpler than I thought, and someone already solved it on Reddit. I won't delete this; I'll just leave the link in case someone is interested.
Runtime Environment
- OS: MX Linux 18
- Result of Y: 4.19.0-5-amd64
- dotfiles
- i3 version: 4.13
- ~/.config/i3
- GNU Emacs: 27.0.50
- ~/.emacs.d
Issue
Sometimes I use "whereis" (aliased as "wh", but that doesn't make any difference...) to find my own scripts.
I usually copy their paths manually (using tmux) and paste to the command line resulting in something like this:
nvim /home/my_username/my_scripts_folder/my_script
Could I make that into a single command?
Thanks in advance!
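One way this could look, as a hedged sketch and not necessarily the solution from the Reddit thread: a shell function in place of the alias. It assumes whereis prints its usual "name: path path ..." line; single_path is a hypothetical helper name, and paths containing spaces are not handled.

```shell
# Read one line of whereis-style output ("name: path ...") from stdin and
# print the path only when exactly one is listed; otherwise return non-zero.
single_path() {
    local line paths
    IFS= read -r line || return 1
    paths=${line#*:}     # drop the "name:" prefix
    set -- $paths        # split the rest on whitespace
    [ "$#" -eq 1 ] && printf '%s\n' "$1"
}

# Open the only match in nvim; with zero or several matches, fall back to
# showing the normal whereis output.
wh() {
    local path
    if path=$(whereis "$1" | single_path); then
        nvim "$path"
    else
        whereis "$1"
    fi
}
```

With this in ~/.bashrc instead of the alias, wh my_script would open the script directly when whereis finds exactly one path for it.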
3 votes -
How Bash completion works
6 votes -
Humble Book Bundle: Linux & UNIX by O'Reilly
8 votes -
The features and history of GNU Readline
4 votes -
Cleaning your GitHub profile with a simple Bash script
5 votes -
Two-factor authentication for home VNC via Signal
For my particular use case I share my home PC with my spouse and since I'm the more tech-savvy of the two I'll need to occasionally remote in and help out with some random task. They know enough that the issue will usually be too complex to simply guide over the phone, so remote control it is.
I'm also trying to improve my personal efforts toward privacy and security. To that end I want to avoid closed-source services such as TeamViewer where a breach on their end could compromise my system.
The following is the current state of what I'm now using as I think others may benefit from this as well:
Setup
Web
I use a simple web form as my first authentication. It's just a username and password, but it does require a web host that supports server-side code such as PHP. In my case I just created a blank page with nothing other than the form; on a successful login the page generates a 6-digit PIN and saves it to a text file in a private folder (so no one can simply navigate to it and get the PIN).
I went the text file route because my current hosting plan only allows 1 database and I didn't want to add yet another random table just for this 1 value.
Router
To connect to my home PC I needed to forward a port from my router. I'm using VNC, as it lets me see what is currently shown on the monitor and work with someone already there, so I forward port 5900, VNC's default port. You can customize this if you want. Some routers allow you to SSH into them and make changes that way, so a more secure step would be to leave the port forward disabled and only enable it once a login from the web form succeeds. In my case I'll just leave the port forwarded all the time.
IP Address
To connect to my computer I need to know its external IP address, and for this I use FreeDNS from Afraid.org. My router has dynamic DNS support for them already included, so it was easy to plug in my details to generate a URL which will always point to my home PC (well, as long as my router properly sends them my latest IP address). If your router doesn't support the dynamic DNS service you choose, many services also offer either a download or the settings you would need to script your own updater to keep your IP address current.
Signal
Signal is an end-to-end encrypted messenger which supports text, media, phone and video calls. There's also a nifty command-line client on GitHub called Signal-cli, which I'm using to provide my second form of authentication. I just downloaded the package, moved it into a directory on my $PATH (in my case /usr/local/bin) and set it up as described in its README. In my case I have both a normal cell phone number and another number provided by Google Voice. I already use my normal cell phone number with Signal, so for this project I used Signal-cli to register a new account using my Google Voice number.
VNC
My home PC runs Ubuntu 18.04 so I'm using x11vnc as my VNC server. Since I'm leaving my port forwarded all the time I most certainly do NOT want to leave VNC also running. That's too large a security risk for me. Instead I've written a short bash script that first checks the web form using curl and https (so it's encrypted) with its own login information to see if a PIN has been saved. If a PIN is found the web server sends it back and then deletes the PIN text file. Meanwhile the bash script uses the PIN to start a VNC session with that PIN as the password and also sends my normal cell the PIN via Signal-cli so that I can log in.
I have this script set to run every minute so I'm not waiting long after web login, and I also have the x11vnc session set to time out after a minute so I can quickly connect again should I mess something up. It's also important that x11vnc is set to auto exit after closing the session so that it's not left up for an attacker to attempt to abuse.
System Flow
Once everything is set up and working, this is what it's like for me to connect to my home PC:
- Browse to my web form and log in
- Close web form and wait for Signal message
- Launch VNC client
- Connect via dynamic DNS address (saved to VNC client)
- Enter PIN code
- Close VNC when done
Code
Here are some snippets to help get you started:
PHP for Web Form Processing
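<?php
// Variables
$username = 'your_username';
$password = 'your_password_super_long_and_unique';
$filename = 'path_to_private_folder/vnc/pin.txt';

// Read the submitted fields. The bash script below posts them as u, p and a;
// using the same names for the login form is an assumption.
$user   = $_POST['u'] ?? '';
$pass   = $_POST['p'] ?? '';
$action = $_POST['a'] ?? '';

// Refuse to do anything without the right credentials
if (!hash_equals($username, $user) || !hash_equals($password, $pass)) {
    exit('Denied');
}

// Process the login form
if ($action == 'Login') {
    $file = fopen($filename, 'w');
    // random_int() draws from a cryptographically secure source, unlike rand()
    $passwd = random_int(100000, 999999);
    fwrite($file, (string) $passwd);
    fclose($file);
    exit('Success');
}

// Process the bash script
if ($action == 'bash') {
    if (file_exists($filename)) {
        $file = fopen($filename, 'r');
        $passwd = fread($file, filesize($filename));
        fclose($file);
        // One-time PIN: delete it as soon as it has been handed out
        unlink($filename);
        exit($passwd);
    } else {
        exit('No_PIN');
    }
}
?>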
Bash for x11vnc and Signal-cli
# See if x11vnc access has been requested
status=$(curl -s -d "u=your_username&p=your_password_super_long_and_unique&a=bash" https://vnc_web_form.com)

# Exit if nothing has been requested
if [ "$status" = "No_PIN" ]; then
    # No PIN so exit; log the event if you want
    exit 0
fi

# Strip non-numeric characters
num="${status//[!0-9]/}"

# See if they still match (prevent error messages from triggering stuff)
# Quoted so that an empty or multi-word response can't break the test
if [ "$status" != "$num" ]; then
    # They don't match so probably not a PIN - exit; log it if you want
    exit 1
fi

# Validate pin number
num=$((num + 0))
if [ "$num" -lt 100000 ]; then
    # PIN wasn't 6 digits so something weird is going on - exit; log it if you want
    exit 1
fi
if [ "$num" -gt 999999 ]; then
    # Same as before
    exit 1
fi

# Everything is good; start up x11vnc
# Log event if you want

# Get the current IP address - while dynamic DNS is in place this serves as a backup
ip=$(dig +short +timeout=5 myip.opendns.com @resolver1.opendns.com)

# Send IP and password via Signal
# Note that phone number includes country code
# My bash is running as root so I run the command as my local user where I had registered Signal-cli
su -c "signal-cli -u +google_voice_number send -m '$num for $ip' +normal_cell_number" s3rvant

# Status was requested and variable is now the password
# this provides a 1 minute window to connect with 1-time password to control main display
# again run as local user
su -c "x11vnc -timeout 60 -display :0 -passwd $num" s3rvant
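The strip, compare and range checks in the script can also be folded into a single pattern match. A minimal sketch, where is_pin is a hypothetical helper name and not part of the script above: it accepts exactly six digits with no leading zero, which is the same 100000 to 999999 range the PHP side generates.

```shell
# Succeed only if the argument is exactly six digits, the first one non-zero.
# Anything else (error pages, empty responses, longer numbers) is rejected.
is_pin() {
    case $1 in
        [1-9][0-9][0-9][0-9][0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage with a made-up response value
status="483920"
if is_pin "$status"; then
    echo "looks like a PIN"
else
    echo "not a PIN"
fi
```

Since case compares against the whole string, there is no need to strip characters first; a response like "No_PIN" or an HTML error page simply falls through to the reject branch.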
Final Thoughts
There are more secure ways to handle this. Some routers support VPN for the connection along with device certificates, which are much stronger than a 6-digit PIN code. Dynamically opening and closing the router port as part of the bash script would also be a nice touch. For me this is enough security and is plenty convenient enough to quickly offer tech support (or nab some bash code for articles like this) on the fly.
I'm pretty happy with how Signal-cli has worked out and plan to use it again with my next project (home automation). I'll be sure to post again once I get that ball rolling.
13 votes -
Oil: Success With the Interactive Shell
9 votes -
Bash-5.0 release available
17 votes