GoatOnPony's recent activity

  1. Comment on Can we talk about rice cookers? in ~food

    GoatOnPony
    Link Parent

    Yeah, I'm actually really surprised by all of the other suggestions - I bought a small rice cooker from some random Asian supply store for like $20 nine years ago and the thing is trucking along just fine. I think dialing in the precise amounts of rice and water is the important part, so maybe also pick up a kitchen scale. Basic rice cookers have no moving parts and are dead simple commodity appliances. They're markedly simpler than toasters, and I wouldn't even go out of my way to buy a fancy toaster.

    Also the convenience of even a cheap rice cooker is great - so little counter space for a set-it-and-forget-it device. It takes care of one part of the carbs + veg + protein that makes up most meals, and going from juggling 3 things to 2 things while cooking, it's surprising how much that helps.

    4 votes
  2. Comment on The bot situation on the internet is actually worse than you could imagine. Here's why. in ~tech

    GoatOnPony
    Link Parent

    Caveat up front that I don't really have any reliable data to back up the following statements and I could easily turn out to be incorrect about the direction of the internet. Prognosticating is error-prone!

    Some bots are fine - search crawlers, RSS/Atom feed readers, etc. are in theory net directors of traffic, or at least wouldn't detract from it. The bots at issue on the current internet have a different purpose, though: they're LLM training-data scrapers, RAG query-answering bots, and other ingestors of data with no (or negative) interest in sending traffic to my website. Their aim is to provide an alternative that users never need to leave; they are building a generic competitor to all other websites - a competitor which is well funded and wants your traffic. Trying to make their lives difficult is a very small, probably ineffective, but maybe collectively useful way to delay them taking the content, and it gives people who come to my website directly a benefit. I view it as attempting to prevent a future where they keep everyone in their walled gardens while the rest of us can only feed their machine.

    Having said all that, I don't think excluding bots is a particularly effective approach - I'd rather try to find audiences who actually want human content instead.

    7 votes
  3. Comment on The bot situation on the internet is actually worse than you could imagine. Here's why. in ~tech

    GoatOnPony
    Link Parent

    Counterexample: as someone who is starting to put content on the internet, I do care, regardless of the resource impact. I'm not running analytics or using any non-static resources (at least not currently), but I want people to interact with what I write and produce, not bots. Call it vanity perhaps, but I'm not putting things on the internet out of pure altruism - I want some amount of validation, credit, and feedback. Most bots today don't provide that, and more often provide the opposite, in that they sit between my work and its potential audience. If the return (monetary or via ego boost) on putting things on the internet goes negative, then people (myself included) will find alternative distribution channels, likely ones less free, widespread, or available, which would be sad. So even if bots aren't directly costing me money, they are still part of a web shifting towards more intermediaries, which I'd like to avoid.

    11 votes
  4. Comment on I hope you don't use generative AI - an essay about my experience offering an open-source tool in ~tech

    GoatOnPony
    Link Parent

    I'm anti copyright law; I'm in no way arguing for its expansion. However, I don't think that many of the people arguing for more powerful copyright law, or for its enforcement against AI companies, are doing so irrationally or with malice. I empathize with people who see their work being displaced or otherwise undercut. I don't need to advocate for their position to stand with them.

    Separately, I think AI and AI training are actually likely to do as much harm to free and open access to information and to personal freedoms as copyright law. Websites are closing off access and instituting deeper technical countermeasures absent any change in the law, because they see the threat too. I'd almost rather face a legal threat unlikely to be used against me personally than technical hurdles I must interact with constantly.

  5. Comment on I hope you don't use generative AI - an essay about my experience offering an open-source tool in ~tech

    GoatOnPony
    (edited )
    Link Parent

    Absolutely, I agree with almost all that you've written! My attempted point is not that I think copyright solves anything (agree it wouldn't), but that the essay attempts to address ethical concerns by saying "I don't like copyright" which is IMO not really a response. Whether or not copyright exists shouldn't really matter to whether an individual considers AI usage ethical.

    I also don't think it's on us to determine a legal or technical framework that would work in all scenarios before we can critique AI companies or their actions. Precise lines of demarcation in the realms of morality, ethics, or law don't exist but we regulate and debate all sorts of things in that area. If you were to press me on a specific course of action I wouldn't look to copyright but to AI rules to require transparency about training datasets, monetary awards for contributors to those datasets, restrictions on requests to output styles that are not already broadly shared, and just compensation for workers who are displaced by the technology. That's assuming we operate in the confines of the current politics. My ideal answer would be that entities should be automatically nationalized and democratized in proportion to their size and influence. Then we as a society can direct the benefits of it in more specific, responsive ways.

    4 votes
  6. Comment on I hope you don't use generative AI - an essay about my experience offering an open-source tool in ~tech

    GoatOnPony
    Link Parent

    I didn't find that section of the essay particularly helpful since I dislike copyright law for the same reason I find AI unethical: they're tools larger and more powerful entities use to squash the artistic endeavors of the less powerful. Regardless of the legality, AI took valuable labor without compensation and used it to enrich already fantastically wealthy companies and undermine the uniqueness of that labor, likely forever. People may be attempting to use copyright to push back on the unethical actions but that's just the tool to put teeth into the argument not the underlying ethical argument of extraction without credit.

    6 votes
  7. Comment on The ethics of buying, playing military, war or games inspired by them? in ~games

    GoatOnPony
    Link

    I doubt that the licensing fees (likely a few cents/dollars per copy of the game sold) materially help any of the major defense contractors' bottom lines, and even if they did, a singular boycott is unlikely to matter. That's not to say you should or shouldn't do something on ethical grounds, just that the magnitude of the impact is likely extremely low. IMO the more problematic element is that games heavily reliant on the 'realism' of real weapons are usually glorifying war and/or American imperialism. See Folding Ideas' video on COD, or Jacob Geller's work. If you can avoid that cognitohazard then I wouldn't worry overly much about the direct financial arrangement.

    For the few cents your buying/not buying a game would mean to Lockheed Martin, my suggestion would be to dedicate the concern instead towards donating/volunteering/calling a congress critter/going to a protest/etc, where you'll have much more impact. My ethos is that the ideas of no ethical consumption under capitalism and the fallacies around personal footprint mean it's not that worthwhile agonizing over these kinds of issues; instead join, support, or create movements which collectively push on these fronts.

    29 votes
  8. Comment on Attention economics, software engineering, and AI in ~tech

    GoatOnPony
    Link Parent

    Yeah there's plenty of valid use cases for LLM library extraction, and what you've described certainly seems like one of them! I didn't mean to imply one should never do it, just that I don't think it's always a benefit even for shorter portions of code.

    1 vote
  9. Comment on Attention economics, software engineering, and AI in ~tech

    GoatOnPony
    Link Parent

    I said it's a balance, not to never copy/replicate library logic yourself. Absolutely use your own judgement as to when and where that makes sense. My commentary is only that unqualified statements like "It's absolutely a straight upgrade" or "it's undeniably better in every other way" rub me the wrong way. Everything has tradeoffs and your own comment even points out some of them.

    The context is also important and if unstated can lead to differences in opinion about where to tip the balance. In big team/corporate engineering copying code can quickly result in ultimately way more code to understand for everyone with subtle differences and gotchas and styles which other people still need to find and wade through if a change becomes necessary. In a small team/single person project the tight control and alignment to exactly what you need can be a huge boon.

    Ultimately all code relies on libraries and abstractions somewhere; sometimes they aid in reducing cognitive load, sometimes they don't, and there are many factors involved in that. A skilled engineer can usually intuit when to use a library or copy code, but not everyone is skilled across all domains, and everyone has blind spots and makes mistakes. In those messy realities I'd lean towards the advice of just using the library unless there are really strong reasons not to or the stakes are low.

    Re LLMs specifically, assuming you review the code and understand it similarly to hand-written code, I agree it's an equivalent shortcut. But for many who use LLMs that isn't true (no judgement intended; abstraction and not understanding the details of code is the point of libraries, after all). When it's not true, I don't really see any difference between using a library vs an LLM, except that now you're not going to get any updates in the future and you're replacing a random sampling of human expertise (the library author) with a random sampling of LLM expertise. Whether an LLM's output generally beats hand-written library code I leave up to personal opinion, but I don't think it's better on average (yet).

    Agreed that drawing inspiration and copy-pasting code is and always will be a part of software development. My concern with LLMs is that Stack Overflow posters and library authors are humans and typically want some form of recognition for their work, even if it is the minuscule breadcrumb of Stack Overflow upvotes or library download metrics. LLMs remove even that tenuous link, when ideally we'd instead move towards greater recognition (and compensation) for the work open source contributors provide.

    2 votes
  10. Comment on Attention economics, software engineering, and AI in ~tech

    GoatOnPony
    Link Parent

    I think there's a balance to be had (NPM is on a far end of the spectrum), but having an LLM statically emplace libraries for you does have downsides - enough that I wouldn't call it a straight upgrade. For one, you can't know you're actually getting the best version of that library's functionality; you're getting the LLM's interpretation of it, which may be missing edge-case handling, backwards-compatibility support, or performance optimizations. Second, you're now on the hook for upgrades and security fixes instead of the library. And third, it contributes to a general tragedy of the commons where library authors have less reason to make new libraries or improve existing ones as feedback mechanisms go away. There's probably also licensing risk, but IANAL.

    2 votes
  11. Comment on What healthy habit has made a difference for you? in ~health

    GoatOnPony
    Link Parent

    Any recommendations for science RSS feeds? I find it hard to find ones that are not just a flood of all new papers or are overly editorialized.

  12. Comment on Have you ever designed/created a spaceship for fiction, RPGs, etc? How did you do it? in ~creative

    GoatOnPony
    Link

    For TTRPGs, I recently saw this review of a book to generate spaceships which seemed interesting. There's also Stars Without Number, which includes a ship builder, and there are online versions of it. I bet if you peruse DriveThruRPG you can find a bunch of similar starship generators with tables for getting ideas.

    1 vote
  13. Comment on What makes a game, a game? in ~games

    GoatOnPony
    Link Parent

    Very much agree on accepting ambiguity and fuzzy boundaries! I also would add that categories are usually only useful in a specific context. There's rarely a universal definition, but there might be better definitions for a specific purpose, eg. laws, branding, search, etc. So much of semantic arguing seems to revolve around trying to find the one definition to rule them all.

  14. Comment on OpenAI enables shopping directly from ChatGPT in ~tech

    GoatOnPony
    Link Parent

    I suspect that simple rules will suffice for the first round of this - the 'ignore all previous instructions' style attacks. The insidious part will be that merchants can pretty trivially A/B test to figure out the tiny/not-so-tiny biases that the core model has and adapt to target those biases rather than real, individual consumer preferences. If GPT-5 prefers blue over red (even if only by a minute statistical margin), slowly everything will drift to be blue. Consumers are not going to express preferences along every possible property of a product when asking an LLM to buy something, and operators aren't going to/can't randomize the property preferences of the LLM on a per-user basis. I think that homogeneity across a population will get exploited.
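    As a toy illustration (everything here is hypothetical - the 52% 'blue' preference is an invented number, not a measured bias of any real model), a merchant needs only aggregate win rates to detect and exploit a tiny model preference, with no access to the model's internals:

```python
import random

# Hypothetical sketch: a merchant A/B-tests two listings against a shopping
# agent that has a small built-in color bias, and recovers that bias from
# win rates alone.

def agent_picks_blue(rng, bias=0.52):
    """Stand-in for the shopping model: a 52% chance of picking the blue listing."""
    return rng.random() < bias

def ab_test(trials=100_000, bias=0.52, seed=0):
    """Run many head-to-head trials and report blue's observed win rate."""
    rng = random.Random(seed)
    blue_wins = sum(agent_picks_blue(rng, bias) for _ in range(trials))
    return blue_wins / trials

rate = ab_test()
# A win rate consistently above 0.5 tells the merchant to ship blue,
# even though no human shopper ever expressed that preference.
print(f"blue win rate: {rate:.3f}")
```

    With 100,000 trials the sampling noise is around ±0.3%, so even a 2-point bias stands out clearly - which is the worrying part.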

    6 votes
  15. Comment on The video-game industry has a problem: there are too many games in ~games

    GoatOnPony
    Link

    I wish there was more heterogeneity in video game (and perhaps all media) distribution, community building, and discovery. I like Steam as a platform and launcher, but by funneling almost all games on PC through its storefront (yes, I'm aware of itch, GOG, Epic, etc, but I expect that's equivalent to saying there are other search engines than Google) it encourages a lot of snap approval-rating shopping and an overload of games. As a consumer I'll just only look at and play the top 0.01% of games because, well, why wouldn't I? Steam makes that easy, and there are many upsides for me individually to playing the current community-consensus best games in any niche. As an industry, though, I expect it means most releases are feast or famine. I've wondered if and how platforms could get players to play and build communities around games further down the tail of any given genre. In many cases players would get a game 95% as good as the market leader, and the industry might be healthier for it. If there were more storefronts, more visible communities, more independent discovery methods, etc than just getting stuff through Steam, then there might at least be different consensus on the top set of games in any given niche available to a particular consumer.

    7 votes
  16. Comment on Should C be mandatory learning for career developers? in ~comp

    GoatOnPony
    Link

    My 2c as a software engineer: for school curricula, I think it's good to experience a breadth of tools in the contexts where they are most useful. I don't think mandating any one tool is the right approach. Instead we should have students learn OS design, application development, compilers, graphics, etc, and let those choices dictate what languages get used. For many of those subjects that might be C, but it might instead be assembly or C++ or Rust or some other language. We shouldn't hold up one language or one tool as the be-all end-all; instead, teach students to pick the right tool for each task and build a well-rounded toolbox.

    2 votes
  17. Comment on Perplexity AI is using stealth, undeclared crawlers to evade website no-crawl directives in ~tech

    GoatOnPony
    Link Parent

    Well, turns out your analysis was correct! Perplexity's blog post response claims it was indeed tool use initiated in response to the user request. Left an edit on my original reply as well. I think Cloudflare (and I, for defending their analysis) have some egg on our faces :)

    I think a world where the majority of AI agents are local models (and locally trained or at least tuned) would be a significant improvement. Probably would still have some not great consequences and the point about relationship management still applies, so I'm not sure which way my personal opinions would ultimately fall on them, but I'd probably not advocate against them like I do with centrally controlled models.

    3 votes
  18. Comment on Perplexity AI is using stealth, undeclared crawlers to evade website no-crawl directives in ~tech

    GoatOnPony
    Link Parent

    I'll agree that the blog post could use more evidence that tool use is not happening, but I don't think the article is conflating? My read is that they are pretty sure the crawling is happening as part of an indexing stage to feed data into the model. Either that indexing happens regardless of whether any particular user has asked about the site, or it would be extremely obvious - they can check whether they got requests to the site from the badly behaved crawler before they issued any queries to the LLM, or whether the requests continued after the LLM request. The lack of any evidence that they checked is concerning, but I'd still be quite surprised to hear that they didn't look into that possibility.

    I too like the open web and semantic data, but I think user agents need to play a fine balancing act between respecting the autonomy of the user and the relationship that the website wants to have with that user. That relationship could be one of artistic intent, whimsy, user access controls, money (ads), etc. Most forms of user autonomy I have no problem with (accessibility, script blocking, JavaScript on/off, esoteric browsers, reader mode, etc), and in general I come down on the side of more user autonomy where the two come into conflict. However, I think AI agents occupy a very different point on the autonomy-vs-relationship spectrum. They are currently heading down the path of their own form of siloing and large corporate interest - they want to intermediate all interactions between users and the current web in a way which obliterates any relationship a website can have with its users. I don't think AI agents or the companies creating them are neutral actors like browsers (which at least have a historical status quo, inertia, and arguably lower barriers to entry), and that non-neutral power of intermediation is terrifying. I think it will be very destabilizing to the web ecosystem (particularly when it comes to monetization) and disincentivizes the creation of small-web content (I want people to read and interact with me, not an AI summary of what I made). AI agents may end up forcing more content into silos than before, so I support websites who don't want to partake in this particular experiment.

    TLDR, I like the semantic web but I think it should be opt in rather than being effectively forced on the web by big tech in ways which could ultimately remove user autonomy.

    4 votes
  19. Comment on Perplexity AI is using stealth, undeclared crawlers to evade website no-crawl directives in ~tech

    GoatOnPony
    (edited )
    Link Parent

    Edit: I just read Perplexity's response, which indicates that this is tool use! I'll leave the rest of my comment in place for reference purposes, but the first paragraph is inaccurate. Obviously I put too much trust in Cloudflare to have addressed that as an option.

    I expect that if the answer were as simple as 'Perplexity makes a request to the website as part of some tool use in the immediate response to a query about the site', then Cloudflare would have said as much - it would be both obvious to them (timing and number of requests) and easy for Perplexity to offer as a defense. Cloudflare claims that blocking the undisclosed crawler caused a reduction in the quality of Perplexity's answers about their honeypot sites, so that seems like a causal link I'd not just hand-wave away. Their estimate is that the undisclosed, badly behaved crawler is making about 1/10 as many requests as the well-behaved one (3-6 million queries per day) - that seems like enough traffic to complain about too. While Cloudflare is attempting to sell a product, I also think they've presented a reasonable theory that Perplexity is indeed running a badly behaving crawler and not just doing normal user-agent things.

    Separately, I don't personally think that all user agents are fine/should have equal access to sites. I think websites should be respected in their choice to filter and block AI-based user agents, regardless of whether the access comes from crawlers or tool use. Given that, using robots.txt as weak signposting seems reasonable even if the RFC only talks about crawlers (it does reference user agents as how crawlers declare themselves, though, FWIW). So even if this does turn out to be tool use in response to a user query, I think Perplexity should still respect robots.txt, given that it's currently the easiest way for website operators to express intent about whether AI access is acceptable. If a new specification comes along which supplants robots.txt for the purpose of informing user agents about acceptable behavior, then perhaps Perplexity can ignore robots.txt and only look at the new spec during tool use, but in the meanwhile, AFAIK, robots.txt is the best operators have.
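    For what it's worth, honoring robots.txt during tool use would be mechanically trivial. A minimal sketch using Python's standard library (the 'ExampleAIBot' token and the rules are made up for illustration, not Perplexity's actual user agent):

```python
from urllib.robotparser import RobotFileParser

def allowed(agent: str, url: str, robots_txt: str) -> bool:
    """Check whether `agent` may fetch `url` under the given robots.txt text."""
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch(agent, url)

# Hypothetical rules: block one AI bot, allow everyone else.
rules = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""

print(allowed("ExampleAIBot", "https://example.com/post", rules))  # False
print(allowed("SomeBrowser", "https://example.com/post", rules))   # True
```

    An agent doing tool use would fetch the site's /robots.txt once, run this check before each page request, and skip the fetch when disallowed.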

    8 votes
  20. Comment on I’m going to calculate π on the Moon. Literally. in ~science

    GoatOnPony
    (edited )
    Link Parent

    I don't buy that there's a meaningful marginal cost to this code on the data-transmission side. In my head, the mission was probably spec'd with some amount of required bandwidth plus some overhead, and they'd be offering to let Stand-up Maths make use of some small part of that overhead or other slack in transmissions. The data needed for sending (number of iterations and current pi value since the last upload) is tens of bytes. Even sent hundreds of times, it's probably a fraction of one image being sent back.

    Also, space missions definitely get patches? The Voyager 1 probe launched in 1977 got a code patch recently, and it has left the solar system.

    The integration testing is indeed the costly part and the bit I have little experience with, so in that respect perhaps the 150k is reasonable. I still feel like a well-designed system shouldn't need that much additional review for 'take two random numbers, compute the distance, then store it', but I recognize that's engineering hubris on my part.
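    For reference, the 'two random numbers, compute the distance, store it' loop is a plain Monte Carlo estimate of pi, and the resumable state really is just two counters. A sketch of the general technique (not the mission's actual code):

```python
import random

def monte_carlo_pi(iterations, inside=0, done=0, seed=0):
    """Resumable Monte Carlo pi estimate via random points in the unit square.

    Pass back the previous (inside, done) counters to continue a prior run;
    those two integers are the entire state that would need downlinking.
    """
    rng = random.Random(seed)
    for _ in range(iterations):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:  # point falls inside the quarter circle
            inside += 1
        done += 1
    return inside, done

inside, done = monte_carlo_pi(1_000_000)
estimate = 4 * inside / done  # area ratio of quarter circle to unit square
print(f"after {done} samples: pi is approximately {estimate:.4f}")
```

    The estimate converges slowly (error shrinks roughly with the square root of the sample count), which is fine here since the point is the novelty, not the precision.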

    1 vote