  • Showing only topics in ~tech with the tag "data".
    1. The Slack controversy has opened a whole new can of worms

      So Slack has been in the news for the last couple of days, and after wondering what dastardly thing they did this time to deserve the wrath of the Internet Gods, I decided to Google it and found that they used folks' private chat data to train their AI models.

      But when I went through the discussions, I found that some folks don't think this is a big deal at all, and some are actually defending Slack. They say, "So what? Others like Microsoft and Google and Apple do this all the time with your data and mine." Do you agree with this line of thinking?

      At some point, I think this is going to be a privacy nightmare. Imagine Facebook doing something like this with WhatsApp chat data. I think there are some regulations in the EU/US preventing Meta from combining WhatsApp data with their other services, but such regulations don't exist in all countries, and nothing prevents Meta from exploiting the data of users from those countries.

      What do you think about this? I think Slack needs to be called out more, not less. And something needs to be done to prevent this situation from happening again.

      30 votes
    2. Chrome/Firefox Plugin to locally scrape data from multiple URLs

      As the title suggests, I am looking for a free Chrome or Firefox plugin that can locally scrape data from multiple URLs. To be more precise, here is what I mean:

      • A free Chrome or Firefox plugin
      • Local scraping: it runs in the browser itself. No cloud computing or "credits" required to run
      • Scrape data: collects predefined data from certain fields within a website such as https://www.dastelefonbuch.de/Suche/Test
      • Infinite scroll: can load data that only appears once the browser scrolls down (kind of like on the page I linked above)

      I am not looking to program my own scraper using Python or anything similar. I have found plugins that "kind of" do what I am describing above, and about two weeks ago I found one that does pretty much exactly what is described ("DataGrab"), but it starts asking you to buy credits after you run it a few times.

      My own list:

      • DataGrab: Excellent, apart from asking to buy credits after a while
      • SimpleScraper: Excellent, but asks to buy credits pretty much immediately
      • Easy Scraper: Works well for single pages, but no possibility to feed in multiple URLs to crawl
      • Instant Data Scraper: Works well for single pages and infinite scroll pages, but no possibility to feed in multiple URLs to crawl
      • "Data Scraper - Easy Web Scraping" / dataminer.io: Doesn't work well
      • Scrapy.org: Too much programming, but looks quite neat and well documented

      Any suggestions are highly welcome!

      Edit: A locally run executable or cmd-line based program would be fine too, as long as it just needs to be configured (e.g., creating a list of URLs stored in a .txt or .csv file) instead of coded (e.g., coding an infinite scroll function from scratch).

      8 votes
    3. Question about GDPR

      I am in the EU.

      I asked a company I had an account with to delete my account. They told me they would do that as long as I sent them an ID and a postal address. This is to ensure that "I am the right person".

      I never gave them an ID or a postal address in the first place, so how would that verify anything? And I'm using the email address I signed up with to ask for the deletion.

      Am I wrong to believe that this should be easier? Are they misinterpreting the GDPR, or am I?

      What are my options if I do not want to send my ID and postal address?

      --

      Their arguments are:

      Article 5(1)(f) of the GDPR requires us to meet security obligations in data processing. Since data deletion is permanent, we need to ensure that the request is indeed from the person concerned.

      Furthermore, Article 12(6) of the GDPR states: "…when the data controller has reasonable doubts concerning the identity of the natural person making the request referred to in Articles 15 to 21, he may request the provision of additional information necessary to confirm the identity of the data subject."

      10 votes
    4. Please help me understand and manage external hdd sleep

      I have an external drive (3.5" HDD, SATA) in an enclosure (USB 3, purchased separately), connected to a Thunderbolt dock (OWC) that is connected alternately to an iMac and a MacBook Pro. The HDD goes to sleep and causes problems: freezes, weird internet access problems, kernel panics.

      I have done some research, and can't seem to figure out:

      How to know whether it is the drive, the enclosure, or the computer causing the sleep. Fiddling with various settings on the Mac seemed to have no effect, although it may have increased my battery usage :(

      How to adjust settings on the drive or in the enclosure.

      How to determine what the sleep behavior of prospective drives will be.

      As a workaround, I tried to write a zsh script to touch the drive every few seconds. This kinda worked, but it was a struggle to sort out the permissions and to figure out how to make it run automatically.
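
      For illustration, the keep-alive idea above can be sketched in Python instead of zsh. This is only a rough sketch under some assumptions: the function name `keep_drive_awake`, the marker file name `.keepalive`, and the mount point `/Volumes/MyDrive` are all made up for the example, and whether a small periodic write actually stops a given enclosure from spinning the drive down is not guaranteed.

```python
import os
import time
from pathlib import Path


def keep_drive_awake(mount_point, interval_seconds=30.0, iterations=None):
    """Periodically write a tiny timestamp file to the drive.

    mount_point: directory where the external drive is mounted (hypothetical).
    interval_seconds: pause between writes.
    iterations: number of writes to perform; None means loop until interrupted.
    Returns the path of the marker file.
    """
    marker = Path(mount_point) / ".keepalive"  # hypothetical file name
    count = 0
    while iterations is None or count < iterations:
        with open(marker, "w") as f:
            f.write(str(time.time()))
            f.flush()
            # Ask the OS to push the write out to the device instead of
            # leaving it in the file cache, so the drive sees real activity.
            os.fsync(f.fileno())
        count += 1
        # Skip the final sleep when a fixed number of iterations is requested.
        if iterations is None or count < iterations:
            time.sleep(interval_seconds)
    return marker


# Example usage (runs until interrupted with Ctrl-C):
# keep_drive_awake("/Volumes/MyDrive", interval_seconds=30)
```

      Writing to a file in a folder the user already owns avoids most permission trouble. On macOS, a launchd agent is the usual way to start a script like this at login, though the exact setup depends on the machine.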

      I welcome all guidance, pointers to resources, clarifications, incantations, well-wishes.

      8 votes