em-dash's recent activity

  1. Comment on What programming/technical projects have you been working on? in ~comp

    em-dash

    Oh, absolutely, there'll need to be some sort of index somewhere. I imagine in practice the use will be something like "just use libgen if it still exists at the time, else consult the index and go pull a tape off the bookshelf". (Because of course this needs to be displayed on a bookshelf like a row of paper books.)

    If you could make a deduplicated extract of LG/OL converted into markdown and tar'd zstd? by upload Y-m-d I'd be interested in seeding that!

    The metadata, or contents? I suspect all the contents may be too big to keep on a reasonable set of hard drives even as compressed markdown (as funny as seeding torrents from tape would be, I do not think it would be a good idea), but once I have a decent chunk of them downloaded I'll convert what looks like a representative set and extrapolate and see how it looks.

    2 votes
  2. Comment on What programming/technical projects have you been working on? in ~comp

    em-dash

    To add even more complexity for you, some more ideas to explore:

    • namespaces for both leader+1 and leader+2 sequences, with the latter mapping to the few hundred most common words over three letters long. e.g. leader t becomes "the", but leader b c becomes "because".
    • tap-hold on letter keys (b c just types those two letters, but b+c together types "because")
    • foot pedals as extra layer shifts
    • circular keycaps so you can fit extra keys between your letter keys
    • adopt þ back into english, it is a cool letter and we were wrong to drop it
    • https://hackaday.com/2025/03/14/building-a-ten-hundred-key-computer-word-giving-thing/
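
    A minimal sketch of the leader-sequence idea from the first bullet, assuming a single lookup table keyed by the keys pressed after the leader. Only the two mappings given above ("t" and "b c") come from the comment; the table name and the `expand` function are hypothetical, and real firmware would also need timeouts and the tap-hold handling from the second bullet.

```python
# Hypothetical expansion table: the keys pressed after the leader key map to a word.
# Only these two entries come from the comment; a real table would hold a few hundred.
LEADER_SEQUENCES = {
    ("t",): "the",           # leader+1 namespace
    ("b", "c"): "because",   # leader+2 namespace
}


def expand(keys):
    """Return the word for the keys pressed after the leader, or None if unmapped."""
    return LEADER_SEQUENCES.get(tuple(keys))


print(expand(["t"]))        # one-key sequence
print(expand(["b", "c"]))   # two-key sequence
print(expand(["x"]))        # unmapped sequence falls through to None
```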

    (hi, I am Tildes's resident Weird Input Device Enjoyer)

    2 votes
  3. Comment on What programming/technical projects have you been working on? in ~comp

    em-dash

    Another libgen mirror went down earlier this year, and as a result I have been exploring the rabbit hole of "can I download and store all of library genesis for a reasonable cost?". This comes out to around 4TiB of fiction (which I can fit on my current hard drives), 45TiB of nonfiction (which I cannot, but conceivably could), and 81TB of science papers and magazines (which crosses the line into "maybe start looking into alternative storage formats").

    So I've landed on LTO-6 tape for now(*), and by spending entirely too much time scrolling through ebay, I have acquired a tape drive, the extra computer-bits required to use a tape drive, and 62.5TiB(**) of blank tapes, for a total of around $400. (Used-but-still-working hard drive prices are usually around $10/TiB, for scale.)

    (* At some point in the future, I want to experiment with DIYing optical tape, which is a very cool thing that the industry seems to have abandoned for unclear reasons, but that can be a separate project later.)

    (** Uncompressed size. LTO tapes have this bizarre marketing thing going on where they're advertised with sizes that assume some fixed compression ratio, which is Not How Data Compression Works At All. This is 25 tapes, each of which holds 2.5 real TiB or 6.25 imaginary marketing TiB.)
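
    The arithmetic behind those numbers, as a quick sketch. The tape count, per-tape sizes, and the roughly $400 total are from the comment above; the variable names are made up, and the cost figure includes the drive and the extra hardware, so it is not a like-for-like comparison with bare hard drives.

```python
TAPES = 25
NATIVE_TIB_PER_TAPE = 2.5       # "real" uncompressed capacity per tape
MARKETING_TIB_PER_TAPE = 6.25   # advertised capacity, assuming a fixed compression ratio
TOTAL_COST_USD = 400            # approximate; covers drive, extra hardware, and tapes

native_total = TAPES * NATIVE_TIB_PER_TAPE
marketing_total = TAPES * MARKETING_TIB_PER_TAPE
assumed_ratio = MARKETING_TIB_PER_TAPE / NATIVE_TIB_PER_TAPE
cost_per_native_tib = TOTAL_COST_USD / native_total

print(f"native capacity:    {native_total} TiB")
print(f"marketing capacity: {marketing_total} TiB")
print(f"assumed compression ratio: {assumed_ratio}:1")
print(f"cost per native TiB: ${cost_per_native_tib:.2f}")
```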

    The other side of this is figuring out which things to download. The lazy answer is "literally all of it", but this is a big and messy dataset with a lot of duplicates. If there are five copies of exactly the same book and four are potato-quality scans I see no reason to waste the extra space on them. If there are hardcover and paperback editions with different ISBNs but the same content, those can also be considered equivalent, since I'm only storing the content. I've been playing around with smushing the libgen and openlibrary database dumps together to try to deduplicate to a reasonable level.
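
    A rough sketch of the kind of deduplication described above, not the actual process. It assumes each record is a dict with hypothetical `title`, `author`, `isbn`, and `quality` fields (the real libgen and openlibrary dumps have their own schemas), groups records by normalized title and author so editions with different ISBNs collapse together, and keeps the highest-quality copy of each.

```python
from collections import defaultdict


def normalize(text):
    """Lowercase, turn punctuation into spaces, and collapse whitespace."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return " ".join(cleaned.split())


def deduplicate(records):
    """Keep one record per (title, author) group, preferring the highest quality score."""
    groups = defaultdict(list)
    for record in records:
        key = (normalize(record["title"]), normalize(record["author"]))
        groups[key].append(record)
    return [max(group, key=lambda r: r["quality"]) for group in groups.values()]


# Made-up sample data: two editions of one book plus an unrelated book.
sample = [
    {"title": "Example Book", "author": "A. Writer", "isbn": "111", "quality": 1},
    {"title": "example book!", "author": "a writer", "isbn": "222", "quality": 5},
    {"title": "Other Book", "author": "B. Writer", "isbn": "333", "quality": 2},
]
kept = deduplicate(sample)
print([record["isbn"] for record in kept])
```

    Matching on title and author alone will over-merge in practice (revised editions, translations, common titles), so a real pass would likely need more fields than this.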

    I'm glad someone else in my friend group has already taken on the role of Plex Server Person, else this would start to get silly.

    4 votes
  4. Comment on What programming/technical projects have you been working on? in ~comp

    em-dash

    I get 750hrs/mo free ADHD

    Ooh, you're running on one of those fancy instances with the weird custom task scheduler patches. I've found them to make interesting decisions sometimes about when things need to be swapped out of memory, but overall I've learned to work with it.

    3 votes
  5. Comment on Anubis works in ~tech

    em-dash

    and always seem to have a User-Agent containing "Mozilla".

    I just changed my browser's user agent in response to reading this, sighing and facepalming the whole time. I expect if Anubis catches on enough to be even slightly problematic to them, they'll do the same, and then I'll have to sigh and facepalm some more while I also find a way to evade whatever happens next.

    3 votes
  6. Comment on I'm tired of dismissive anti-AI bias in ~tech

    em-dash

    I've talked before about the AI code review bot $work uses, which I'm pretty sure is just ChatGPT in a trenchcoat resold as a SaaSaaS product.

    Probably two thirds of what it says is somewhere between completely inane and accurate-but-not-helpful. It frequently suggests adding null checks for non-nullable variables, for example. Last week I had it suggest adding some absolutely nonsensical type casts that would not even have compiled.

    The other third is actually very good. Today it caught a subtle bug, where I wrote a test case, made it pass, then refactored the test case such that it still passed but no longer actually tested what it should have.

    If someone made a tool that reliably did the second thing but not the first, I'd call it an amazing technological advance. But that's not what we have today. It's the noise mixed in that makes it frustrating. It's not quite bad enough to blanket ignore it, but it's still a lot of mental effort on my part to fact check everything it says.

    17 votes
  7. Comment on The Tiny Soapbox: a platform for small, low-stakes rants in ~talk

    em-dash

    The noun that means "how something is pronounced" should be "pronounciation". I refuse to spell or pronounce it "pronunciation".

    (I won a spelling bee once in middle school, which makes me a world renowned speling expurt, so you have to listen to me)

    7 votes
  8. Comment on The Tiny Soapbox: a platform for small, low-stakes rants in ~talk

    em-dash

    Related: kilo/hecto/deca should be K/H/D. They had a nice thing going where all the big multipliers were capital letters and the small multipliers were lowercase letters, but then they drew the dividing line between kilo and mega instead of at 1, which would have made sense.

    You may also be amused to learn that deciinches are a fairly common unit in circuit board layout. Everyone awkwardly calls them "2.54mm" or "100 mils" because they don't want to admit to using a beautifully cursed unit like deciinches.

    8 votes
  9. Comment on Is it possible to completely hide one’s activity on the Internet from one’s ISP? in ~tech

    em-dash

    The trustworthy ones are paid. Not all paid ones are trustworthy, but none of the free ones are.

    Do note that a VPN doesn't remove the need to trust someone to deliver your traffic. The only time this is helpful is if you trust the VPN more than you trust your ISP. In particular, if you're going to use one for illegal activities, you want it to be run from a place where the government you're worried about can't just order them to turn against you.

    20 votes
  10. Comment on What if we made advertising illegal? in ~tech

    em-dash

    There's a lot of "society couldn't exist without advertising" reactions in this thread, and... I don't think that's really the case, at least for reasonable definitions of "advertising". There are plenty of alternative ways we could do this sort of thing as a society, even without discarding the rest of capitalism.

    Imagine, for example, a world in which you want to buy a chair.

    You go consult the big directory of chair vendors, as one does. Anyone who can sell you a chair is listed in this directory. (The same companies might be listed as pencil vendors too, if they happen to sell both chairs and pencils.)

    The directory lets you filter by which styles of chairs they sell, price ranges, warranties, whether they're new or used chair vendors or just a person who threw a single chair on craigslist, reviews from previous buyers, whatever support-or-boycott-this-company attributes people care about at the time.

    The vendors in the list are randomly sorted by default. You can choose to sort by prices or review scores instead. (You can't sort by name, because there's no good reason to, and it encourages that thing people used to do in the phone book days where companies would list their names starting with "AAAAA" so they'd appear at the start of the list.)

    At no point in this process is your attention forcibly diverted toward chair shopping. You are here because you took the explicit step of looking for a chair.

    "But @em-dash", you say, "wouldn't that mean there are still like 40 effectively-identical chair vendors left in the list even after applying a normal set of filters? How would they differentiate themselves?"

    They would not. If they have a problem with that, perhaps they should try being less identical in ways that matter to chair buyers, instead of spending unreasonable amounts of effort and money making 30-second videos of attractive people enthusiastically sitting in chairs.

    7 votes
  11. Comment on Paged out! issue 6 in ~comp

    em-dash

    I mean, in theory you can use anything with at least three wires, but everyone looks at me weird when I slap DE-15 connectors on things that definitely should not have VGA support :)

    One fun idea I've had but not tried is using a weird custom 3-way cable between both halves and the host, in which the segment between the two halves has no differential data lines, but only +5V, ground, and the two SBU pins. This is almost certainly against some part of the spec but I don't know which part. I'm pretty sure it would work though, at the cost of not being able to use normal cables.

  12. Comment on Paged out! issue 6 in ~comp

    em-dash

    Ugh. The number of hours I have spent looking for a suitable connector for split keyboards that's smaller than RJ-9 is very non-zero. There kind of aren't any, since USB kind of took over most of the connecting-things-to-other-things space.

  13. Comment on Life altering PostgreSQL patterns in ~comp

    em-dash

    The numbers had to match numbers in the corresponding code-enum, so no, you really did need to insert them with explicit IDs.

    (Epilogue: eventually I did just drop the tables, after inventing a mildly terrifying pipeline to move all the extra metadata they had accumulated into constants in the right places in the codebase. It involved SQL that generated vim commands which were then pasted into a terminal with the right file open in vim (this was before bracketed paste was widely supported). I do not remember why "just regex-replace a CSV dump into an array of structs" wasn't an option, but I'm pretty sure I did it with vim specifically because I thought it was funny.)

    1 vote
  14. Comment on Life altering PostgreSQL patterns in ~comp

    em-dash

    helps insure data integrity and notification across teams

    Amusingly, with one particularly memorable pair of enum tables (in the same product) I had the opposite experience because of how badly they were handled: they had serial integer keys and people kept adding values to the end and using the same number as each other, and that caused Weird and Exciting Results.

    I finally "solved" this by writing the current maximum value for each in dry erase marker on the window next to my desk, and declaring that anyone wanting to add a new value must first physically walk over and increment the appropriate value and use the new value as the key. It's a mutex enforced by physics!

    3 votes
  15. Comment on Introductions | March 2025 in ~talk

    em-dash

    :D
    "Insectweight" is a collective term covering antweight (1lb/454g weight limit) and beetleweight (3lb/1360g).
    But I am totally adding "aphid-hunting robot swarm" to my project list.

    2 votes
  16. Comment on Life altering PostgreSQL patterns in ~comp

    em-dash

    I hate enum tables so much. They sound so tempting and yet they're so bad in practice.

    The thing about enum values is they usually correspond to enum values in your codebase (otherwise they wouldn't really be enums, just data). Now instead of your application just requiring a specific schema version to work, it requires a specific schema and data version. This isn't better, it's just marginally different to migrate and more tempting to do it the wrong way.

    At least this implementation uses the text representation as the key. Most people will autopilot add an integer or UUID key and now you have to make sure that mapping is consistent across environments.
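
    To illustrate the text-as-key approach, here is a small sketch using Python's built-in `sqlite3` as a stand-in for PostgreSQL. The `order_status` and `orders` tables and the `try_insert` helper are invented for the example. Because the key is the name itself, there is no integer or UUID mapping that has to stay consistent across environments.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default

# Enum table keyed by its text representation (hypothetical example schema).
conn.execute("CREATE TABLE order_status (name TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO order_status (name) VALUES (?)",
    [("pending",), ("shipped",)],
)
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " status TEXT NOT NULL REFERENCES order_status(name))"
)


def try_insert(status):
    """Insert an order with the given status; return False if the enum rejects it."""
    try:
        conn.execute("INSERT INTO orders (status) VALUES (?)", (status,))
        return True
    except sqlite3.IntegrityError:
        return False


print(try_insert("shipped"))  # a value present in the enum table
print(try_insert("lost"))     # a value missing from the enum table
```

    This does not address the complaint above: the application still depends on specific rows existing, so it still needs a particular data version as well as a schema version.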

    4 votes
  17. Comment on Introductions | March 2025 in ~talk

    em-dash

    moved to Ohio after graduating college for my wife’s PhD program

    Oh hey, that's also why I'm in Ohio, except mine is just finishing up her PhD. We're both excited to leave; this was only ever intended as temporary.

    I was talking to the friend from this story the other day about the butterfly effect, and joked that while I hold him personally responsible for my exile from Florida, I do at least have the advantage of no longer being in Florida.

    3 votes
  18. Comment on Introductions | March 2025 in ~talk

    em-dash

    What's your favorite Chinese dish that most westerners wouldn't be familiar with?

    2 votes
  19. Comment on Introductions | March 2025 in ~talk

    em-dash

    How long have you been on Tildes? How did you find out about us?

    Since June 2023. I think this was during one of the people-leaving-reddit waves, but I had already been off of reddit for a year or so before that. I don't remember how I wound up here, it kind of just happened.

    How did you choose your username?

    An em dash is a kind of punctuation, and Em is a shortened form of Emily. I use both of those names and enjoy puns and word-smashes.

    What are your interests?

    I like weird technology things, making things, and cats.

    A/S/L (age/(gender|pronouns|identifier)/location)

    33/(transfem, she/they)/Ohio, but expecting to be cured of that particular affliction later this year.

    What do you do? This could be in your spare time, for work, your passions.

    My job title is Software Engineer, though IMO that sounds a bit over-the-top and I usually go with "programmer" or "computer toucher" unless I'm trying to sound important. Most of my hobbies are extending the computer-touching out in directions that are more fun but harder to convince people to pay me for: current projects I have going on include modding buttons onto a music player, designing some custom tiny motor controllers for insectweight combat robots, a tool-assisted speedrun collaboration project with a friend, and the early planning stages of like twelve other things that may or may not ever get made.

    I also do gardening and woodworking, though haven't done much of either recently, and I picked up electric guitar again this year after not playing for 5ish years.

    Also a lot of my time/brain cycles recently have been spent pondering various aspects of moving and house-building.

    Do you want other users to PM/DM you from this thread?

    Sure, why not?

    Give us a fun fact (or a link!)! If there is anything to know about tilderinos, it's that we value knowledge sharing!

    Plants don't actually "grow toward the sun". Light makes them grow less, but the effect is localized, so if you have a bit of stem going sideways and light is only hitting the top half, the bottom half will grow faster than the top half. It's the same sort of effect as a bimetallic strip, but less metallic and more plantic.

    7 votes
  20. Comment on Google’s Taara is launching a new chip to deliver high-speed Internet with light in ~tech