An email marketing company left 809 million records exposed online
8 votes -
Delete never: The digital hoarders who collect Tumblrs, medieval manuscripts, and terabytes of text files
35 votes -
My activity history on Tildes: an exercise in boredom
20 votes -
For years Facebook claimed that adding a phone number for 2FA was only for security. Now it can be searched and there's no way to disable that.
43 votes -
Revealed: Facebook’s global lobbying against data privacy laws
19 votes -
lib.reviews: An open source, open data review website for high-quality reviews on any topic
8 votes -
The route of a text message, a love story
12 votes -
Privacy vs "I have nothing to hide"
9 votes -
Factors that affect the reliability of SSDs, and how they compare to HDDs
5 votes -
2.7 million medical calls breached in Sweden due to an unsecured NAS
4 votes -
Huawei cloning Apple parts, rewarding employees for tech theft
9 votes -
Facebook charged with misleading users on health data visibility
8 votes -
Data privacy bill unites Charles Koch and Big Tech
6 votes -
Why humanitarians are worried about Palantir’s new partnership with the UN
8 votes -
Even years later, Twitter doesn’t delete your direct messages
4 votes -
Millennial life: How young adulthood today compares with prior generations
10 votes -
Telcos sold highly sensitive customer GPS data
4 votes -
Millions are on the move in China, and Big Data is watching
9 votes -
How ontologies help data science make sense of disparate data
3 votes -
Now your groceries see you, too
6 votes -
Data on discrimination
5 votes -
I tried to block Amazon from my life. It was impossible.
13 votes -
What cities are getting wrong about public transportation
7 votes -
VOIPO.com data leak
7 votes -
Pew study: 74% of Facebook users did not know Facebook was maintaining a list of their interests/traits, 51% were uncomfortable with it, and 27% felt the list was inaccurate
21 votes -
I made a program that creates the colour palette of a film
I originally saw these on Reddit: someone extracted the average colour of each frame of a film and put them together to make a colour palette for that film; the original creator has a site called The Colors of Motion. I thought it would be cool to try and create a simple PowerShell script that does the same thing.
Here are a few examples:
Finding Nemo: https://i.imgur.com/8YwOlwK.png
The Bee Movie: https://i.imgur.com/umbd3co.png
Harry Potter and the Philosopher's Stone: https://i.imgur.com/6rsbv0M.png
I've hosted my code on GitHub, so if anyone wants to use my PowerShell script or suggest some ways to improve it, feel free. You can use pretty much any video file as input, as it uses ffmpeg to extract the frames.
GitHub link: https://github.com/ArkadiusBear/FilmStrip
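For anyone who would rather prototype the same pipeline in Python, here is a rough sketch (not the author's PowerShell script; it assumes ffmpeg is on the PATH and Pillow is installed, and film.mkv, frames/, and palette.png are placeholder names):

import glob
import os
import subprocess
from PIL import Image  # Pillow

VIDEO = "film.mkv"        # placeholder input path
FRAMES_DIR = "frames"
os.makedirs(FRAMES_DIR, exist_ok=True)

# Sample one frame per second, scaled down so averaging stays cheap.
subprocess.run(
    ["ffmpeg", "-i", VIDEO, "-vf", "fps=1,scale=64:-1",
     os.path.join(FRAMES_DIR, "%06d.png")],
    check=True)

# Average each sampled frame's pixels down to a single RGB colour.
stripes = []
for path in sorted(glob.glob(os.path.join(FRAMES_DIR, "*.png"))):
    pixels = list(Image.open(path).convert("RGB").getdata())
    n = len(pixels)
    stripes.append(tuple(sum(p[i] for p in pixels) // n for i in range(3)))

# Draw one vertical stripe per sampled frame.
palette = Image.new("RGB", (len(stripes), 200))
for x, colour in enumerate(stripes):
    for y in range(200):
        palette.putpixel((x, y), colour)
palette.save("palette.png")

Sampling at one frame per second keeps the stripe count manageable; raising the fps value in the filter gives a denser palette at the cost of more frames to process.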
17 votes -
Open standards may finally give patients control of their data and care via Electronic Health Records
6 votes -
Economic Policy Institute: Top charts of 2018
6 votes -
How Google tracks your personal information
7 votes -
Steven Pinker’s ideas are fatally flawed
14 votes -
At Blind, a security lapse revealed private complaints from Silicon Valley employees
13 votes -
Amazon sends 1,700 Alexa voice recordings to a random person
17 votes -
Using data to determine if Die Hard is a Christmas movie
12 votes -
Facebook says new bug allowed apps access to private photos of up to 6.8m users
33 votes -
Remember backing up to diskettes? I’m sorry. I do, too.
11 votes -
"Mischievous responders" have been tainting the data about health disparities between LGBT youth and their peers
13 votes -
Google CEO Sundar Pichai testifies before the House Judiciary Committee on Data Collection
15 votes -
Your apps know where you were last night, and they’re not keeping it secret
23 votes -
Marriott admits hackers stole data on 500 million guests; passports and credit card info included
21 votes -
Amazon admits it exposed customer email addresses, but refuses to give details
14 votes -
Unsecured database of millions of SMS text messages exposed password resets and two-factor codes
19 votes -
DeepMind’s move to transfer health unit to Google stirs data fears
11 votes -
Period-tracking apps are not for women
28 votes -
Fallout 76 bug accidentally deletes entire 50GB beta
18 votes -
Tim Cook's keynote address at the 40th International Conference of Data Protection and Privacy Commissioners
8 votes -
What are the best practices regarding personal files and encryption?
Over the past year I have done a lot to shore up my digital privacy and security. One of the last tasks I have to tackle is locking down the many personal files I have on my computer that have potentially compromising information in them (e.g. bank statements). Right now they are simply sitting on my hard drive, unencrypted. Theft of my device or a breach in access through the network would allow a frightening level of access to many of my records.
As such, what are my options for keeping certain files behind an encryption "shield"? Also, what are the potential tradeoffs for doing so? In researching the topic online I've read plenty of horror stories about people losing archives or whole drives due to encryption-related errors/mistakes. How can I protect against this scenario? Losing the files would be almost as bad as having them compromised!
I'm running Linux, but I'm far from tech-savvy, so I would either need a solution to be straightforward or I'd have to learn a lot to make sense of a more complicated solution. I'm willing to learn mainly because it's not an option for me to continue with my current, insecure setup. I do use a cloud-based password manager that allows for uploading of files, and I trust it enough with my passwords that I would trust it with my files, though I would like to avoid that situation if possible.
With all this in mind, what's a good solution for me to protect my personal files?
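As a concrete illustration of the key-management tradeoff the question raises (not a recommendation over Linux-native options such as LUKS or gocryptfs), here is a minimal sketch of per-file symmetric encryption using Python's cryptography library; the filenames are hypothetical:

from cryptography.fernet import Fernet  # pip install cryptography

# Generate a key once and store it somewhere safe (e.g. the password
# manager mentioned above); losing the key loses the files.
key = Fernet.generate_key()
f = Fernet(key)

with open("statement.pdf", "rb") as src:            # hypothetical file
    token = f.encrypt(src.read())
with open("statement.pdf.enc", "wb") as dst:
    dst.write(token)

# Round-trip check: decryption requires the exact same key.
with open("statement.pdf", "rb") as src:
    assert Fernet(key).decrypt(token) == src.read()

The sketch makes the failure mode explicit: anyone holding the key can read the files, and a lost key is unrecoverable, so the key itself needs its own backup.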
26 votes -
XML Data Munging Problem
Here’s a problem I had to solve at work this week that I enjoyed solving. I think it’s a good programming challenge that will test if you really grok XML.
Your input is some XML such as this:
<DOC>
  <TEXT PARTNO="000">
    <TAG ID="3">This</TAG> is <TAG ID="0">some *JUNK* data</TAG> .
  </TEXT>
  <TEXT PARTNO="001">
    *FOO* Sometimes <TAG ID="1">tags in <TAG ID="0">the data</TAG> are nested</TAG> .
  </TEXT>
  <TEXT PARTNO="002">
    In addition to <TAG ID="1">nested tags</TAG> , sometimes there is also <TAG ID="2">junk</TAG> we need to ignore .
  </TEXT>
  <TEXT PARTNO="003">*BAR*-1
    <TAG ID="2">Junk</TAG> is marked by uppercase characters between asterisks and can also optionally be followed by a dash and then one or more digits . *JUNK*-123
  </TEXT>
  <TEXT PARTNO="004">
    Note that <TAG ID="4">*this*</TAG> is just emphasized . It's not <TAG ID="2">junk</TAG> !
  </TEXT>
</DOC>
The above XML has so-called in-line textual annotations, because the <TAG> elements are embedded within the document text itself.
Your goal is to convert the in-line XML annotations to so-called stand-off annotations, where the text is separated from the annotations and the annotations refer to the text by slicing into it as a character array with starting and ending character offsets. While in-line annotations are more human-readable, stand-off annotations are equally machine-readable, and stand-off annotations can be modified without changing the document content itself (the text is immutable).
The challenge, then, is to convert to a stand-off JSON format that includes the plain text of the document and the XML tag annotations grouped by their tag element IDs. In order to preserve the annotation information from the original XML, you must keep track of each <TAG>'s starting and ending character offset within the plain text of the document. The plain text is defined as the character data in the XML document, ignoring any junk. We'll define junk as one or more uppercase ASCII characters ([A-Z]+) between two asterisks (*), optionally followed by a trailing dash (-) and then one or more digits ([0-9]+).
Here is the desired JSON output for the above example to test your solution:
{ "data": "\nThis is some data .\n\n\nSometimes tags in the data are nested .\n\n\nIn addition to nested tags , sometimes there is also junk we need to ignore .\n\nJunk is marked by uppercase characters between asterisks and can also optionally be followed by a dash and then one or more digits . \n\nNote that *this* is just emphasized . It's not junk !\n\n", "entities": [ { "id": 0, "mentions": [ { "start": 9, "end": 18, "id": 0, "text": "some data" }, { "start": 41, "end": 49, "id": 0, "text": "the data" } ] }, { "id": 1, "mentions": [ { "start": 33, "end": 60, "id": 1, "text": "tags in the data are nested" }, { "start": 80, "end": 91, "id": 1, "text": "nested tags" } ] }, { "id": 2, "mentions": [ { "start": 118, "end": 122, "id": 2, "text": "junk" }, { "start": 144, "end": 148, "id": 2, "text": "Junk" }, { "start": 326, "end": 330, "id": 2, "text": "junk" } ] }, { "id": 3, "mentions": [ { "start": 1, "end": 5, "id": 3, "text": "This" } ] }, { "id": 4, "mentions": [ { "start": 289, "end": 295, "id": 4, "text": "*this*" } ] } ] }Python 3 solution here.
If you need a hint, see if you can find an event-based XML parser (or if you’re feeling really motivated, write your own).
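And if you want more than a hint, here is a minimal sketch of one possible approach (not the linked solution) using Python's built-in SAX parser: accumulate junk-stripped character data to build the plain text, and keep a stack of open <TAG> starts so nested tags get correct offsets. The junk regex also swallows one trailing space, which is an assumption about the intended whitespace handling, and input.xml is a placeholder for the document above:

import json
import re
import xml.sax

# Junk: *UPPERCASE*, optionally followed by -digits. Also eat one trailing
# space (an assumption about the intended whitespace handling). Assumes a
# junk token is never split across two characters() callbacks.
JUNK = re.compile(r"\*[A-Z]+\*(?:-[0-9]+)? ?")

class StandoffHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.text = []       # accumulated plain-text pieces
        self.offset = 0      # current character offset into the plain text
        self.stack = []      # open <TAG>s as (id, start); handles nesting
        self.entities = {}   # tag id -> list of mention dicts

    def characters(self, data):
        clean = JUNK.sub("", data)   # strip junk before counting offsets
        self.text.append(clean)
        self.offset += len(clean)

    def startElement(self, name, attrs):
        if name == "TAG":
            self.stack.append((int(attrs.getValue("ID")), self.offset))

    def endElement(self, name):
        if name == "TAG":
            tag_id, start = self.stack.pop()
            self.entities.setdefault(tag_id, []).append(
                {"start": start, "end": self.offset, "id": tag_id,
                 "text": "".join(self.text)[start:self.offset]})

handler = StandoffHandler()
with open("input.xml", "rb") as fh:   # placeholder for the document above
    xml.sax.parseString(fh.read(), handler)
print(json.dumps({
    "data": "".join(handler.text),
    "entities": [{"id": i, "mentions": m}
                 for i, m in sorted(handler.entities.items())],
}, indent=2))

The exact offsets you get depend on how junk stripping treats the surrounding whitespace, so expect to tweak the regex to match the sample output above.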
4 votes -
Twitter makes datasets available containing accounts, tweets, and media from accounts associated with influence campaigns from the IRA and Iran
8 votes -
UK Biobank data on 500,000 people paves way to precision medicine
8 votes