If it's important, make backups. Always. No exceptions.
And he did have copies locally, but it's still a terrible practice; you can't back up the metadata of GitHub repos anyhow.
the irony of integrating a VCS with a service you cannot restore data from...
Unrelated to my other reply, but the point of using a Git repository hosted by someone else is that if the author dies, their work can still be used. It's meant to outlive them.
Yeah, true. Still, I wouldn't use only one public location.
Shenanigans like these are why I've since moved all of my code over to Sourcehut. I take it one step further by mirroring all of my repositories on GitLab, Bitbucket, and GitHub.
Git's capacity for decentralization has long been one of its best features. Back up your stuff, people!
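For anyone wanting to set up that kind of mirroring themselves, here's a minimal sketch (the remote URLs are made-up placeholders) that adds each mirror as an extra push URL on origin, so a single git push updates every copy. It shells out to git from Python:

    import subprocess

    # Placeholder mirror URLs; swap in your own repositories.
    MIRRORS = [
        "git@github.com:example/project.git",
        "git@gitlab.com:example/project.git",
        "git@bitbucket.org:example/project.git",
    ]

    # Note: adding the first push URL replaces the default one, so the
    # primary host is included in the list above.
    for url in MIRRORS:
        subprocess.run(
            ["git", "remote", "set-url", "--add", "--push", "origin", url],
            check=True,
        )

After this, a plain `git push origin` sends to all three hosts; fetches still come from the original remote URL.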
Sourcehut hasn't been that great in my experience: the interface feels fairly dated, and the author doesn't seem to feel the need to add any accessibility features. I've had to deal with libraries written by SirCmpwn in the past, and it wasn't very pleasant to file issues with them on the other end, so I extrapolate that the experience will be similar on Sourcehut.
I would recommend GitLab, since it has the most features; most devs won't even need to switch to another platform for any of the many tasks they might need to do (CI and friends). Alternatively, you can self-host Gitea, which has been an utter joy.
I got far enough through that thread (stopped when the insults started) to understand it, and he’s against HTML patch emails? That sounds completely reasonable given how HTML email has a habit of making a mess of plain text parsers.
I'll take HTML for parsing over plain text any day. Even at its most basic, you can pull out things like links and headings by tag rather than by mile-long regex or brittle methods like line counting. Add a few sensible IDs, classes, and general bits of metadata and it's even on its way to being self-documenting.
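To make that concrete, here's a minimal sketch using Python's standard-library html.parser to pull out links and headings by tag, no regex involved (the sample HTML is invented):

    from html.parser import HTMLParser

    class LinkHeadingExtractor(HTMLParser):
        """Collect link targets and heading text by tag name."""
        def __init__(self):
            super().__init__()
            self.links, self.headings = [], []
            self._in_heading = False

        def handle_starttag(self, tag, attrs):
            if tag == "a":
                self.links.extend(value for name, value in attrs if name == "href")
            elif tag in ("h1", "h2", "h3"):
                self._in_heading = True

        def handle_endtag(self, tag):
            if tag in ("h1", "h2", "h3"):
                self._in_heading = False

        def handle_data(self, data):
            if self._in_heading and data.strip():
                self.headings.append(data.strip())

    parser = LinkHeadingExtractor()
    parser.feed('<h1>Build failed</h1><p>See <a href="https://example.com/log">the log</a>.</p>')
    print(parser.headings)  # ['Build failed']
    print(parser.links)     # ['https://example.com/log']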
Judicious use of HTML, by which I pretty much mean the subset that markdown allows, also makes it much easier for a human to pick out and remember the info they need compared to a wall of identical text (looking at you, IBM Bluemix alerts).
All that said, we fortunately don't even have to choose! Just throw HTML, plain text, and JSON all in there with the appropriate headers and keep everyone happy.
[Edit] Autocorrect failures
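And for the "throw them all in there" option above: a minimal sketch with Python's standard-library email module, building one message that carries a plain-text body, an HTML alternative, and a machine-readable JSON attachment (addresses and content are invented):

    import json
    from email.message import EmailMessage

    msg = EmailMessage()
    msg["From"] = "ci@example.com"
    msg["To"] = "dev@example.com"
    msg["Subject"] = "Build #42 failed"

    # Plain-text body first, then an HTML alternative; clients render
    # whichever part they support.
    msg.set_content("Build #42 failed.\nLog: https://example.com/log")
    msg.add_alternative(
        '<h1>Build failed</h1><p>See <a href="https://example.com/log">the log</a>.</p>',
        subtype="html",
    )

    # Machine-readable copy for tooling, attached as application/json.
    payload = json.dumps({"build": 42, "status": "failed"}).encode()
    msg.add_attachment(payload, maintype="application", subtype="json",
                       filename="build.json")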
He's against any HTML mail traffic on his service, regardless of whether it contains patches. Since the lists are intended to replace any sort of issue discussion on GitHub, that closes the door on a huge number of contributors.
FWIW I don't and won't use the software because I don't need it, but still: the interface is dated to you; other people (including me) take functional over dated. And the claim about accessibility is just false. The same thread, a couple of toots further:

and in the future I intend to expand on this for better a11y support, alongside my other contributions to free software a11y.
aerc is a new mail client by Drew himself. And then, a couple of toots below:

HTML emails are forbidden on lists.sr.ht and will continue to be, end of discussion. If that presents an issue for a11y I would sooner work on making more email clients accessible than I'd allow HTML emails on sr.ht. Automatic conversion of HTML to plaintext would break all kinds of things, important things.
Also @ols: it is only a single toot that includes insults in that thread.
"Geez Dave, why do you STILL refuse to use 'the cloud' in this day and age?! You're such a dinosaur!"
I'm no Luddite. But I would rather fill a warehouse with a million Raspberry Pis running as a distributed godzilla v-server, write the entire damn operating system, compiler, then my apps - and go broke+insane in the attempt - before I'd trust ANY soul-less corporate shitbag entity to be honest, fair, or even decent in their interactions with me. And the day I have to "trust" one of them with my life's work is the day I hang my keyboard on the wall for good.
I doubt that guy learned the lesson he should be learning from this. Nor did most of his followers.
What gets me is how easy it is to roll your own now. Hardware has continued advancing in performance per dollar. A single hardware box has vast capabilities compared to ten years ago, and putting a couple together in a homebrew cluster has never been easier or cheaper. Bandwidth (and the number of simultaneous connections your internet connection can handle) is more of a limiting factor than the hardware is, and even that isn't very limiting.
Given the nature of GitHub and the fact that they make it easy to open free accounts, it seems reasonable that they would have some kind of automated system for detecting and banning spammers. Clearly there was a flaw in their filters that was exposed here, but there doesn't seem to have been any malicious intent on their part. I wonder if GitHub treats paid accounts the same as free ones in this regard.
His account is paid, but yes, it does make sense. Still, I think a human should always verify that the machine isn't doing something stupid, at the very least with paid accounts.
I think the worst part is how they didn't warn or notify him in any way though.
I agree, returning a generic 404 error page was not the right thing to do. I get that they might not want to make it obvious for bots to detect if they have been banned or not. But a simple error page with a link to a high-priority support email address would have been so much more helpful for the user.
An email is just as useful to a bot as a 404 in my eyes.

    if website == 404 or email.title == "Your account has been suspended":
        start_over()
This only benefits the user, the bot wouldn't really care.
The point of a 404 error page is two-fold:

1. You don't want people being able to try different kinds of URLs under a given organization's control to find which ones are valid and which ones aren't. If you didn't throw a generic 404 for both missing repos and missing permissions to view repos, a user (or many users, particularly bots) could determine what projects different enterprise organizations are currently developing, even if those projects are set to private visibility. This is very much a security issue that an enterprise organization would want to avoid.

2. Spammers tend to traverse websites in an automated fashion via some web-crawling behavior: visit some main page, find relevant links (e.g. public project pages), visit those links, post spam content, then rinse and repeat. If you add a "you were banned" page, then the bot need only check on any given visit whether that ban language exists and, if so, hand off processing to a different bot in the network. To make spamming more difficult, it's better to throw up a generic 404 page so the bot won't know the difference. Granted, it would be possible for a second, non-posting account to work in tandem to find non-404 pages and let the posting account submit spam to the ones found, but there are still possible mitigation strategies to combat this, e.g. noting which accounts visit the same sets of pages. The entire point isn't to become immune to spam, but to make spam increasingly more difficult and therefore not a worthwhile venture.
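As a toy illustration of the first point (not GitHub's actual implementation; the routes and data are invented), here's how a server might collapse "doesn't exist" and "exists but is private" into the same generic 404, sketched with Flask:

    from flask import Flask, abort

    app = Flask(__name__)

    # Invented example data: repo name -> visibility.
    REPOS = {"acme/website": "public", "acme/secret-project": "private"}

    @app.route("/<org>/<name>")
    def show_repo(org, name):
        visibility = REPOS.get(f"{org}/{name}")
        # A missing repo and a private repo return the same generic 404,
        # so a probing user or bot can't tell the two cases apart.
        if visibility != "public":
            abort(404)
        return f"Repository {org}/{name}"

    if __name__ == "__main__":
        app.run()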
Regarding some other points in the article itself:
He complains about the lack of an email, but emails alert bots that they were blocked, allowing them to simply hand off their work to other bots in the network, as described in the error-page scenario above. He also complains about why they didn't "just block that one spammy message", but that approach is also known to fail horrendously: spammers use different spam messages, and variations of those messages, to find holes in the detection algorithm(s) being used.
Now, that being said, there certainly shouldn't have been an automated action applied in this scenario. Spam reports should be taken seriously, but for an older, more active account with no preexisting history of abuse, his account should've only been flagged for manual review, not automatically deactivated.
There's a general philosophy one should follow when automating a task: always have a human validate the results of the automation. Inherently trusting computers to not screw things up is a massive mistake.
Rohrer added he was "astounded by the completely unprofessional behavior" of the services he's using to run One Hour One Life.

I'm frankly astounded that someone who considers themselves a professional only keeps one backup of their "life's work", and on a cloud service at that.
Even the random YouTubers I follow keep copies of their (likely sizeable, and numerous) video project files, judging from things they've mentioned about project work. Anyone making a living off their data should literally be safeguarding that data with their life.
Not to sound like a techie prepper, but I'd want 3 copies minimum (as usual, no more than 2 onsite) to even begin to feel secure. Where finances are involved, more like 4.
What makes you think he didn't have backups? There are plenty of services out there that I would be frustrated and inconvenienced by being banned from because they provide a good & valuable service, and/or have critical mass among my friends and colleagues.
It doesn't mean I rely solely on those services, or that I couldn't restore my local backups to another service if necessary, but I'd still be grumbling loudly about it to whoever would listen.
He does have backups as noted here
You can't back up issues or unmerged pull requests though; that's just not possible.
Furthermore, the purpose of using a cloud service like GitHub is that it will outlive you. Your code will go on. That's the whole point of the operation, as noted here
You can, but not through an official GitHub tool (at least as far as I know). I've used scripts to pull down all the issues from a project in JSON format and save them in a text file, which can be backed up (a sketch of that kind of script is below). It's probably pretty unusual for anyone to set that up, though.
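A minimal sketch of that kind of script, using the requests library against GitHub's REST issues endpoint; OWNER, REPO, and TOKEN are placeholders you'd fill in yourself:

    import json
    import requests

    OWNER, REPO = "someuser", "someproject"
    TOKEN = "ghp_yourtokenhere"  # a personal access token

    def fetch_all_issues(owner, repo, token):
        issues, page = [], 1
        while True:
            resp = requests.get(
                f"https://api.github.com/repos/{owner}/{repo}/issues",
                params={"state": "all", "per_page": 100, "page": page},
                headers={"Authorization": f"token {token}"},
            )
            resp.raise_for_status()
            batch = resp.json()
            if not batch:
                break
            # The issues endpoint also returns pull requests; skip those.
            issues.extend(i for i in batch if "pull_request" not in i)
            page += 1
        return issues

    with open("issues-backup.json", "w") as f:
        json.dump(fetch_all_issues(OWNER, REPO, TOKEN), f, indent=2)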
Doing that would most likely mean you can't import it back, and even if you could import it back somehow, an issue which isn't linked to the issuer's account is mostly useless unless you've already been able to reproduce the issue yourself. It's not really a good solution :p
Seems like an honest mistake on GitHub's end. Nevertheless, it's best to keep backups on more than one host.
Microsoft's code-sharing site GitHub has caused a scare for developer Jason Rohrer after the company, without explanation or warning, blocked him from all his code repositories.

You already got flagged, but what's the point of comments like this?
If I am flagged, that's sad. The title is clickbaity, and the part I quoted is at least half of the meat of the article. And even then, this is one of those articles that is three sentences long when you skip the repetition and word salad: this guy's GitHub account was blocked because of a false positive; GitHub rather swiftly fixed the problem; this is part of a current trend where scarcely-supervised automated systems cause troubles like this.
I like it when people, especially OPs, provide tl;drs for articles like this. I put them in my posts, and I put one here to potentially save a click for those who, like me, wouldn't appreciate what they found when they clicked through.
I feel like doing that would encourage not reading the article and that's generally a bad idea.
But if the content of the article boils down to:

- a shocking title
- a mistake made by an automated system
- it has since been fixed

then some summary is probably better than the clickbait title.
Whether or not they are mistakes by an automated system, this is a real danger for those who use GitHub as a backup service for their projects. The likelihood of them getting their stuff back, compared to Jason, is a lot lower. It's not clickbait; it's a warning.
That is not disputed. What is disputed is whether we learn it from a little summary here or from reading it on ZDNet, awarding them the click they were baiting for.
I'm not picking on you BTW, thanks for posting this. I am just saying why the summary is necessary IMHO.
FWIW I'd rather have some more original reporting here and links as backing references. I think if we do that that'd add quite a bit of value to the platform.
I consider it good etiquette online to add a little description to the article I'm sharing. It lets others discern better whether they want to read something or not, based on more than just a headline, which is meant to grasp the very essence while omitting as much as possible.
Most often, the best description is a direct selective quote from the article.
I'm thinking what @cadadr shared could've been used better as a subtitle for the topic itself, so as to explain the situation a little better.