-
2 votes
-
A few easy linux commands, and a real-world example on how to use them in a pinch
Below is a summary of a real-world performance investigation I recently went through. The tools I used are installed on virtually all Linux systems, but I know some people don't know them and would jump straight to heavyweight log analysis services and whatnot, or to writing their own solution.
Let's say you have request log sampling in a bunch of log files that contain lines like these:
127.0.0.1 [2021-05-27 23:28:34.460] "GET /static/images/flags/2/54@3x.webp HTTP/2" 200 1806 TLSv1.3 HIT-CLUSTER SessionID:(null) Cache:max-age=31536000
127.0.0.1 [2021-05-27 23:51:22.019] "GET /pl/player/123456/changelog/ HTTP/1.1" 200 16524 TLSv1.2 MISS-CLUSTER SessionID:(null) Cache:
You might recognize Fastly logs there (IP anonymized). Now, there's a lot you might care about in this log file, but in my case, I wanted to get a breakdown of hits vs misses by URL.
So, first step, let's concatenate all the log files so we can work off a single file:
cat *.log > all.txt
Then, let's split the file in two: hits and misses. There are a few different values for them, but the majority are covered by either HIT-CLUSTER or MISS-CLUSTER. We can do this by just grepping for them like so:
grep HIT-CLUSTER all.txt > hits.txt; grep MISS-CLUSTER all.txt > misses.txt
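If you want to try the concatenate-and-split step without real logs, here is a minimal sketch. The two sample-*.log files and their contents are made up for illustration (they only mimic the format shown above), and -F is my addition so grep treats the pattern as a fixed string rather than a regular expression.

```shell
#!/bin/sh
# Hypothetical sample data: three made-up log lines in the format shown above.
workdir=$(mktemp -d)
cd "$workdir"
cat > sample-a.log <<'EOF'
127.0.0.1 [2021-05-27 23:28:34.460] "GET /static/a.png HTTP/2" 200 1806 TLSv1.3 HIT-CLUSTER SessionID:(null) Cache:max-age=31536000
127.0.0.1 [2021-05-27 23:51:22.019] "GET /player/1/ HTTP/1.1" 200 16524 TLSv1.2 MISS-CLUSTER SessionID:(null) Cache:
EOF
cat > sample-b.log <<'EOF'
127.0.0.1 [2021-05-27 23:52:00.000] "GET /static/a.png HTTP/2" 200 1806 TLSv1.3 HIT-CLUSTER SessionID:(null) Cache:max-age=31536000
EOF

# all.txt does not match *.log, so it is not fed back into itself.
cat *.log > all.txt

# Split into hits and misses; -F matches the text literally.
grep -F HIT-CLUSTER all.txt > hits.txt
grep -F MISS-CLUSTER all.txt > misses.txt

# grep -c counts matching lines without writing a file.
hit_lines=$(grep -c -F HIT-CLUSTER all.txt)
miss_lines=$(grep -c -F MISS-CLUSTER all.txt)
echo "hits=$hit_lines misses=$miss_lines"
```

Lines with any other cache status end up in neither file, which matches the "majority are covered" caveat above.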
However, we only care about the URL and whether it's a hit or a miss. So let's clean up those hits and misses with cut. The way cut works, it takes a delimiter (-d) and cuts the input based on that; you then give it a range of "fields" (-f) that you want.
In our case, if we cut based on spaces, we end up with, for example:
127.0.0.1
[2021-05-27
23:28:34.460]
"GET
/static/images/flags/2/54@3x.webp
HTTP/2"
200
1806
TLSv1.3
HIT-CLUSTER
SessionID:(null)
Cache:max-age=31536000
We care about the 5th value only, so cut -d" " -f5 gets us that. We will also sort the result, because future operations will require us to work on a sorted list of values:
cut -d" " -f5 hits.txt | sort > hits-sorted.txt; cut -d" " -f5 misses.txt | sort > misses-sorted.txt
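To see the field numbering in action, here is a small sketch that runs cut on the first sample line from above. The variable names are mine, and the second call (fields 7 to 8) is only there to illustrate that -f also accepts a range.

```shell
#!/bin/sh
# The first sample log line from the post, stored in a variable.
line='127.0.0.1 [2021-05-27 23:28:34.460] "GET /static/images/flags/2/54@3x.webp HTTP/2" 200 1806 TLSv1.3 HIT-CLUSTER SessionID:(null) Cache:max-age=31536000'

# Field 5 is the URL when splitting on single spaces.
url=$(printf '%s\n' "$line" | cut -d" " -f5)
echo "$url"

# A range keeps the delimiter between the selected fields (status and size here).
status_and_size=$(printf '%s\n' "$line" | cut -d" " -f7-8)
echo "$status_and_size"
```

This relies on every log line having the same number of spaces before the URL, which holds for the format shown here as long as URLs themselves contain no literal spaces.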
Now we can start doing some neat stuff. wc (word count) is an awesome utility: it lets you count characters, words or lines very easily. wc -l counts lines in an input; since we're operating with one value per line, we can easily count our hits and misses already:
$ wc -l hits-sorted.txt misses-sorted.txt
132523 hits-sorted.txt
220779 misses-sorted.txt
353302 total
220779 / 132523 is roughly 1.67, so that's a 1:1.67 ratio of hits to misses. That's not great…
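The shell itself only does integer arithmetic, so one way to get that ratio without leaving the terminal is awk. This is a sketch using the two counts from the wc output above; the variable names are my own, and LC_ALL=C is there only to keep the decimal point a dot.

```shell
#!/bin/sh
# Line counts taken from the wc -l output above.
hits=132523
misses=220779

# Misses per hit, rounded to two decimals.
ratio=$(LC_ALL=C awk -v h="$hits" -v m="$misses" 'BEGIN { printf "%.2f", m / h }')

# Share of all sampled requests that were hits, as a percentage.
hit_rate=$(LC_ALL=C awk -v h="$hits" -v m="$misses" 'BEGIN { printf "%.1f", 100 * h / (h + m) }')

echo "misses per hit: $ratio"
echo "hit rate: $hit_rate%"
```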
Alright, now I'm also interested in how many unique URLs are hit versus missed. The uniq tool only collapses adjacent duplicate lines, so the input has to be sorted in order to deduplicate our entire file. We already did that. We can now count our unique URLs with:
uniq < hits-sorted.txt | wc -l; uniq < misses-sorted.txt | wc -l
We get 49778 and 201178, respectively. It's to be expected that most of our cache misses would be in "rarer" URLs; this gives us a 1:4 ratio of cached to uncached URLs.
Let's say we want to dig down further into which URLs are most often hitting the cache, specifically. We can add -c to uniq in order to get a duplicate count in front of our URLs. To get the top ones at the top, we can then use sort, in reverse sort mode (-r), and it also needs to be a numeric sort, not alphabetic (-n). head lets us get the top 10.
$ uniq -c < hits-sorted.txt | sort -nr | head
815 /static/app/webfonts/fa-solid-900.woff2?d720146f1999
793 /static/app/images/1.png
786 /static/app/fonts/nunito-v9-latin-ext_latin-regular.woff2?d720146f1999
760 /static/CACHE/js/output.cee5c4089626.js
758 /static/images/crest/3/light/notfound.png
757 /static/CACHE/css/output.4f2b59394c83.css
756 /static/app/webfonts/fa-regular-400.woff2?d720146f1999
754 /static/app/css/images/loading.gif?d720146f1999
750 /static/app/css/images/prev.png?d720146f1999
745 /static/app/css/images/next.png?d720146f1999
And same for misses:
$ uniq -c < misses-sorted.txt | sort -nr | head
56 /
14 /player/237678/
13 /players/
12 /teams/
11 /players/top/
<snip>
So far this tells us static files are most often hit, and for misses it also tells us… something, but we can't quite track it down yet (and we won't, not in this post). We're not adjusting for how often the page is hit as a whole, this is still just high-level analysis.
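The counting pipeline used above is easy to try on a tiny input first. A minimal sketch, assuming a made-up list of six URLs (the file name urls-sorted.txt is hypothetical); the tr and awk calls are only there to strip the padding that wc and uniq -c put in front of their numbers.

```shell
#!/bin/sh
# Made-up input: /b appears three times, /a twice, /c once.
workdir=$(mktemp -d)
cd "$workdir"
printf '%s\n' /b /a /b /c /b /a | sort > urls-sorted.txt

# Number of distinct URLs: uniq collapses adjacent duplicates, wc -l counts what is left.
unique_count=$(uniq < urls-sorted.txt | wc -l | tr -d ' ')

# Most frequent URL: count duplicates, sort by count descending, keep the first line.
top_line=$(uniq -c < urls-sorted.txt | sort -nr | head -n 1)
top_count=$(printf '%s\n' "$top_line" | awk '{ print $1 }')
top_url=$(printf '%s\n' "$top_line" | awk '{ print $2 }')

echo "$unique_count unique URLs, most frequent: $top_url ($top_count times)"
```

head defaults to 10 lines, which is why the post gets a top 10 without passing a number; -n 1 narrows it to the single top entry.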
One last thing I want to show you! Let's take everything we learned and analyze those URLs by prefix instead. We can cut our URLs again by slash with cut -d"/". Every URL starts with a slash, so field 1 is the empty string in front of it; if we want the first prefix, we can do -f1-2, or -f1-3 for the first two prefixes. Let's look!
cut -d'/' -f1-2 < hits-sorted.txt | uniq -c | sort -nr | head
100189 /static
5948 /es
3069 /player
2480 /fr
2476 /es-mx
2295 /pt-br
2094 /tr
1939 /it
1692 /ru
1626 /de
cut -d'/' -f1-2 < misses-sorted.txt | uniq -c | sort -nr | head
66132 /static
18578 /es
17448 /player
17064 /tr
11379 /fr
9624 /pt-br
8730 /es-mx
7993 /ru
7689 /zh-hant
7441 /it
This gives us hit-miss ratios by prefix. Neat, huh?
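One caveat worth checking on your own data: cutting a sorted list of full URLs does not always leave identical prefixes next to each other, and uniq only merges adjacent lines. The sketch below uses four made-up URLs to show the effect and the usual fix of sorting again after cut. Whether this affected the numbers above, I can't tell from the output alone. The last part turns the two /static counts from above into a hit rate with awk.

```shell
#!/bin/sh
# Made-up URLs, sorted in byte order: "/es" < "/es-mx/..." < "/es/...".
workdir=$(mktemp -d)
cd "$workdir"
printf '%s\n' /es /es-mx/players/ /es/players/ /es/teams/ | LC_ALL=C sort > urls-sorted.txt

# Without a second sort, the prefix "/es" shows up as two separate groups.
groups_without_sort=$(cut -d'/' -f1-2 < urls-sorted.txt | uniq -c | wc -l | tr -d ' ')

# Re-sorting after cut makes identical prefixes adjacent again.
groups_with_sort=$(cut -d'/' -f1-2 < urls-sorted.txt | sort | uniq -c | wc -l | tr -d ' ')

echo "groups without re-sort: $groups_without_sort, with re-sort: $groups_with_sort"

# Hit rate for /static, using the hit and miss counts from the output above.
static_hit_rate=$(LC_ALL=C awk 'BEGIN { printf "%.1f", 100 * 100189 / (100189 + 66132) }')
echo "/static hit rate: $static_hit_rate%"
```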
13 votes -
Windows Package Manager 1.0 Released
15 votes -
Pwned Passwords is now open-sourced via the .NET Foundation, and will be provided compromised passwords by the FBI
13 votes -
Fortnightly Programming Q&A Thread
General Programming Q&A thread! Ask any questions about programming, answer the questions of other users, or post suggestions for future threads.
Don't forget to format your code using the triple backticks or tildes:
Here is my schema:
```sql
CREATE TABLE article_to_warehouse (
  article_id INTEGER,
  warehouse_id INTEGER
);
```
How do I add a `UNIQUE` constraint?
6 votes -
An update on Flow's direction
6 votes -
CSS container queries: use cases and migration strategies
4 votes -
How to build a quick and dirty subtitle player
On my desk I have two screens -- one off to the side for movies, TV, etc and my main in front. Sometimes I find myself wanting subtitles on my main screen. The main issue I've found, at least with macOS, is that the SRT players suck.
I figured, why not just generate a tiny black video with embedded subtitles?
ffmpeg -i subs.srt -t 3:00:00 -s 40x10 -f rawvideo -pix_fmt rgb24 -r 25 -i /dev/zero subs.mpeg
Set the ratio to be super small without being too small. This video is 40px by 10px and the video only takes a few seconds to generate. For me, this generated at ~850x speed.
From there, jack up the subtitle font size and shift it up a little bit so nothing gets cut off. This also works really well with tiling window managers.
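If you don't have a subtitle file at hand to experiment with, here is a minimal sketch that writes a two-cue SRT file and prints the command from above with a comment per option. The subtitle text is made up, and the option notes are my reading of the general ffmpeg option semantics, not something the post spells out. The script only prints the ffmpeg command rather than running it, since ffmpeg may not be installed.

```shell
#!/bin/sh
# Hypothetical two-cue subtitle file, just so there is an input to experiment with.
workdir=$(mktemp -d)
cd "$workdir"
cat > subs.srt <<'EOF'
1
00:00:01,000 --> 00:00:04,000
First made-up subtitle line.

2
00:00:05,000 --> 00:00:08,000
Second made-up subtitle line.
EOF

# Each cue has exactly one timing line containing "-->".
cue_count=$(grep -c -e '-->' subs.srt)
echo "subs.srt has $cue_count cues"

# The command from the post, printed rather than run. The options between
# "-i subs.srt" and "-i /dev/zero" apply to the /dev/zero input:
#   -t 3:00:00      limit how much of that input is read
#   -s 40x10        frame size of the raw video
#   -f rawvideo     treat the input as headerless raw frames
#   -pix_fmt rgb24  pixel layout of those frames; all-zero bytes are black
#   -r 25           frame rate of the raw video
echo 'ffmpeg -i subs.srt -t 3:00:00 -s 40x10 -f rawvideo -pix_fmt rgb24 -r 25 -i /dev/zero subs.mpeg'
```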
11 votes -
Sublime Text 4
22 votes -
The documentation system
7 votes -
[Google IO 2021] A high-level overview of how Excalidraw works and the browser APIs it uses
8 votes -
What programming/technical projects have you been working on?
This is a recurring post to discuss programming or other technical projects that we've been working on. Tell us about one of your recent projects, either at work or personal projects. What's interesting about it? Are you having trouble with anything?
9 votes -
What programming/technical projects have you been working on?
This is a recurring post to discuss programming or other technical projects that we've been working on. Tell us about one of your recent projects, either at work or personal projects. What's interesting about it? Are you having trouble with anything?
18 votes -
Text editing hates you too
13 votes -
Text rendering hates you
12 votes -
What unified login to use?
I'm setting up a server with Nextcloud, Plex, Matrix and some other things I don't know yet, for some friends and family (about 20 people if I get lucky),
and now I've heard of a thing called single sign-on/unified login (log in to different services with the same user/password and/or log in once and get access to all services). So far I've found out about Keycloak: https://en.wikipedia.org/wiki/Keycloak
Is this what I'm looking for? Does anybody have experience with this? Are there other/better/simpler solutions for this?
12 votes -
FOSS and UX (twitter thread)
@Kavaeric: Let's walk through this, shall we? Say we've decided to make a new FOSS word processor. Call it, I dunno, Libra-Office or O-Pan-Office. Just a thought. Word processors, as you might guess, are also a fairly entrenched market. Who's our target audience?
26 votes -
Fortnightly Programming Q&A Thread
General Programming Q&A thread! Ask any questions about programming, answer the questions of other users, or post suggestions for future threads.
Don't forget to format your code using the triple backticks or tildes:
Here is my schema:
```sql
CREATE TABLE article_to_warehouse (
  article_id INTEGER,
  warehouse_id INTEGER
);
```
How do I add a `UNIQUE` constraint?
10 votes -
A modern boilerplate for Vite, React 17, and TypeScript 4.3
2 votes -
Haiku RISC-V port progress
4 votes -
Cloudflare introduces Cryptographic Attestation of Personhood, an experiment intended to replace CAPTCHAs
19 votes -
Observable Plot
2 votes -
Battlestar Galactica Lessons from Ransomware to the Pandemic
4 votes -
Google Docs will now use canvas based rendering
13 votes -
New major versions released for the six core Pallets projects - Flask 2.0, Werkzeug 2.0, Jinja 3.0, Click 8.0, ItsDangerous 2.0, and MarkupSafe 2.0
7 votes -
An interview with Linus Torvalds: Linux and Git
11 votes -
KeenWrite 2.0
12 votes -
What programming/technical projects have you been working on?
This is a recurring post to discuss programming or other technical projects that we've been working on. Tell us about one of your recent projects, either at work or personal projects. What's interesting about it? Are you having trouble with anything?
13 votes -
ArchLabs 2021.05.02 Release
7 votes -
Linux bans the University of Minnesota for sending intentionally buggy patches in the name of research
58 votes -
Ventoy: Multi-ISO bootable USBs
18 votes -
Share your linux desktop/setup
I've put quite a bit of work into my i3 set up recently and I'm curious if the people here are interested in that kind of thing.
I'd be interested in looking through configs to get ideas, and sharing screenshots and such.
Here is what my desktop looks like right now. Let me know what you think.
26 votes -
PC doesn't connect properly to thunderbolt dock if it's plugged in after booting
I have a CalDigit TS3+ dock that I switch between my M1 Macbook Pro and a System76 Galago Pro that's currently running Windows 10 (I'm sorry, FOSS gods). The M1 has no problem using the TS3+ when it's connected or disconnected and reconnected. But the Galago Pro will only connect to the dock if it was connected while booting up. Disconnecting and reconnecting leaves the dock in a state where only the pass-through thunderbolt port works. So my monitor that uses the pass-through for DisplayPort can operate after reconnecting the dock, but no other devices plugged into the dock will. My mouse/keyboard for example don't even receive power in this state.
Oddly when setting up the drivers for thunderbolt I found that the only drivers available are for Intel's NUCs. IIRC someone on reddit said that this is actually the software I need to install, so that's what I did.
Any help getting the dock to work better is appreciated! Currently I need to reboot the laptop every time I switch back from the M1 to the Galago Pro.
4 votes -
What programming/technical projects have you been working on?
This is a recurring post to discuss programming or other technical projects that we've been working on. Tell us about one of your recent projects, either at work or personal projects. What's interesting about it? Are you having trouble with anything?
10 votes -
Is there some way for using Hacker News with a “mark as read” function?
Note: I posted this on Hacker News; some people here seem knowledgeable about Hacker News, so I thought I would ask here as well.
Basically, in Reddit Enhancement Suite you can filter comments to only show the ones that are "unread", and you click on a comment to mark it as read (like with email, where clicking on a message marks it as read and you can even mark it as unread again).
Is there something like that for Hacker News (a browser add-on or some custom client)?
5 votes -
Disclosure of a vulnerability in AI Dungeon that enabled accessing all users' private adventures, scenarios, and posts via its GraphQL API
16 votes -
An update on the UMN affair
10 votes -
Practical SQL for data analysis
13 votes -
CVE-2021-3156 - How sudo on Linux was hacked
14 votes -
Fortnightly Programming Q&A Thread
General Programming Q&A thread! Ask any questions about programming, answer the questions of other users, or post suggestions for future threads.
Don't forget to format your code using the triple backticks or tildes:
Here is my schema:
```sql
CREATE TABLE article_to_warehouse (
  article_id INTEGER,
  warehouse_id INTEGER
);
```
How do I add a `UNIQUE` constraint?
5 votes -
twtxt - a decentralised, minimalist microblogging service for hackers
6 votes -
What programming/technical projects have you been working on?
This is a recurring post to discuss programming or other technical projects that we've been working on. Tell us about one of your recent projects, either at work or personal projects. What's interesting about it? Are you having trouble with anything?
13 votes -
TeXMe Demo: Self-rendering Markdown + MathJax documents
6 votes -
A guide to some newly supported, modern CSS pseudo-class selectors
4 votes -
The SPACE of Developer Productivity
3 votes -
Pyodide is now an independent project - The CPython 3.8 interpreter compiled to WebAssembly which allows Python to run in the browser, originally developed at Mozilla
9 votes -
Enzyme: Automatic differentiation of LLVM IR
8 votes -
Self hosting email at home?
I recently set up Kubernetes to run on an old laptop. The goal was two-fold: 1. learn Kubernetes and 2. set up an instance of Nextcloud. I've managed to set everything up with cert renewals for my domain and enabled DynDNS in case my provider changes my IP. All well and good, and quite a nice learning experience! Now I would like to also start running my own email server and have some questions. Are there any that have a Helm chart that is easy to set up in Kubernetes? Since I am running this from home, I imagine I'm more likely to be classified as a spammer. What can I do to minimize the likelihood of that? I read somewhere about reverse DNS, but I'm not entirely sure whether that is possible given I am running it all at home via a regular ISP.
17 votes -
What programming/technical projects have you been working on?
This is a recurring post to discuss programming or other technical projects that we've been working on. Tell us about one of your recent projects, either at work or personal projects. What's interesting about it? Are you having trouble with anything?
11 votes -
disroot (a provider of open source services such as mail) has received funding to implement mailbox encryption
17 votes