This is, by the way, the dirty secret of the machine learning movement: almost everything produced by ML could have been produced, more cheaply, using a very dumb heuristic you coded up by hand, because mostly the ML is trained by feeding it examples of what humans did while following a very dumb heuristic.
What a great post. While reading it, I realized that I've never thought about this aspect of data collection. Sure, I know that countless companies are out to get my data and I safeguard myself against that as much as I can, but I never realized why they are doing it.
Obviously, to target those ads better at me, but now I realize that it's almost impossible to do this in a good way for a huge mass of people. Unless you handcraft a majority of your targeting algorithms, you'll have a problem effectively targeting your ads, and the examples the author brings up make a lot of sense. YouTube is another great example of this. Their current start page is pure garbage: I rarely get actually new videos on there, and what's recommended to me is often more than six months old. And don't you dare make the mistake of looking at one political video, because then that's all you'll see. I've gotten into the habit of opening YouTube in a private browser window when I watch something I don't want recommended to me at a later date.
There's clearly good money to be made selling ads, but what are they actually being used for? What do you do with a database full of millions of people and their preferences? How do you analyze and use that knowledge without going through it manually?
Unfortunately, it seems like a lot of these tech companies have not learned from the past and are forging ahead with the same model: collect the data first and hope you will invent a useful application for it along the way. "Fake it 'til you make it", but keep selling advertisers, investors, and the world on the notion that soon™ you will have the machine to solve all problems.
To play devil's advocate, you're kinda forced to do that. I would assume most companies that collect data have some sort of plan for what product they want to deliver. So you already have a connection between what you collect (data) and what you want to end up delivering (for instance selling products or services). For most, I would assume that indeed, you will not figure out how to use the data. But with no data you certainly won't be able to figure it out. And what if somebody does come up with a better way of analyzing that data? Then you don't want to start collecting it afterwards. That can often make old data wasted.
Maintaining a company ain't cheap either so you are kinda forced to find yourself patrons like advertisers and investors. And in lieu of profits, they like hearing promises of potential profits (or things leading to it, like data profiteering). It's a bad cycle of co-dependency, but what can you do. At the moment it's the best model to try to punch your way into markets, and until that changes, this behaviour still gives you the best chances of survival.
You can say it's not learning from the past, but is there a better functioning model at the moment?
You can argue you do, but it's often an extremely tenuous link. Sure we could measure the shoe size of every person who buys our oatmeal, and it'd be "linked" to our product by virtue of it pertaining to a consumer who bought it, but that's really just a semantics argument with no practical value. More data does not mean a more qualified analysis by default, and a lot of data hoarding usually only results in some half-assed statistical inferences being performed on a noisy data blob, yielding only faint correlations.
To play devil's advocate for that is essentially to support something on the premise that you can't disprove its utility until all actions have been exhausted. It's just not a solid argument.
This is the same fallacious thinking as that of "missed sales" due to piracy: it considers that the possibility of gain alone renders both obtained data valuable and non-obtained data a "missed opportunity", again by virtue of "you can't disprove it".
In theory? In capitalist theory the better model should be the one which provides a better service at a price consumers are willing to pay at large enough scale. In practice smaller, more specialized companies are sometimes able to keep afloat, but even the large giants are buckling under the weight of their shareholders' expectations. They have no long-term survivability, only a continuous cycle of short-term injections of life, where the idea seems to be to find the better model somewhere along the way. What does it say about the quality of a model if it invariably leads the adopter to look for a way out?
I would assume you have some relationship between what you gather and what you are offering. Otherwise it's obviously going to be a mess.
Many other fields, like forensics, medicine, science, etc., store as much data as possible in the hope that later technological breakthroughs will find things that are not currently possible. You might have years of relevant data on hand the moment it becomes useful. Yes, there's a good chance that won't happen, but if you are wrong you have only wasted storage.
I don't see the connection to piracy.
Capitalist theory is too idealistic to work. In theory, of course the person with the better product gets ahead. Problem is, the consumers are not ideal rational beings of perfect insight, like the theories assume. If you want to apply capitalist theory, then you need to adjust for how markets actually work.
You need to consider each aspect of the lifecycle of the product/service you are selling. That needs to be part of your equation. This is no different from somebody taking out a loan to keep selling a product at a loss. If you cannot maintain your loans (or VC repayment) and services at a surplus without constant money injections, then you are not selling it at the proper price. If you cannot sell it at the proper price where you turn a profit, then you are not selling the better product.
Of course people want out of that model. Everyone wants to have their cake and eat it too.
That may be so but FB advertising is pretty darn effective.
Oh no, I mean objectively they are very effective from the marketer's perspective. You essentially name a dollar amount, target your audience, and get X amount of sales. It's almost mechanistic.
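To make the "almost mechanistic" part concrete, here is a back-of-the-envelope sketch of how a marketer might turn a budget into an expected number of sales. The function name, the cost-per-click figure and the conversion rate are all made-up assumptions for illustration, not numbers from Facebook or from anyone in this thread.

```python
def expected_sales(budget: float, cost_per_click: float, conversion_rate: float) -> float:
    """Rough estimate: how many sales a given ad budget might buy.

    All three inputs are hypothetical planning figures supplied by the marketer.
    """
    if cost_per_click <= 0:
        raise ValueError("cost_per_click must be positive")
    clicks = budget / cost_per_click   # how many clicks the budget pays for
    return clicks * conversion_rate    # fraction of clicks that end in a sale


# Illustrative numbers only: $1,000 budget, $0.50 per click, 2% of clicks convert.
estimate = expected_sales(budget=1000.0, cost_per_click=0.50, conversion_rate=0.02)
print(f"Expected sales: about {estimate:.0f}")
```

The point is only that, from the buyer's side, the whole thing reduces to a couple of multiplications, whatever is going on inside the targeting system.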
Pew did a survey a few weeks ago related to this: http://www.pewinternet.org/2019/01/16/facebook-algorithms-and-personal-data/
74% of users didn't know Facebook was maintaining the list of interests and traits at all, and when shown it (whether they knew about it or not):
(and 3% refused to answer + 11% didn't have categories assigned)
All good. I could have provided much better context. :)
I think there's a disconnect here between what targeted ads are and what people think they are. The author in the OP makes the same mistake, oddly enough. The targeting is not aimed directly at you, individual #3190279, but rather at cohorts the system believes you might belong to. So if the system believes that you are X gender, Y age and Z income bracket, it'll try to sell you ads targeted at those demographics. Sure, a lot of times those ads won't resonate with you, but advertising is cheap and you miss 100% of the shots you don't take.
Sometimes Google will go the extra creepy mile and, based on recent searches & volume, pitch you more tailored results (like cars: I like looking at cars and Google knows this... it usually serves me ads for cars I'm not interested in, but they're still cars, and now I know who's desperate to increase sales), but more often than not it'll default to your cohort. If you're in the 25yo/relatively healthy cohort you're not likely to get ads for, say, mechanical ventilators. But if you fall into the "motherhood" cohort... Target will be up on your shit...
YouTube's a different beast, though, where its video promotion algorithm is literally cancer. I'm not sure I've quite figured out yet how it works other than it's got to be really cheap to get your videos promoted (or trolls and their backers have very deep pockets).
I'm fairly certain the author of the blog post knows that you aren't being targeted personally, but rather as a combination of the various demographics you fit in. I'm aware of this as well and I think my point still stands here, because stereotyping people based on their age, sex, etc. and trying to sell you shit based on that isn't very effective. Google (unfortunately) knows a lot about me, and yet they haven't been able to sell me anything. Not a single ad served by them has made me go "Neat, I'd like that". Not to speak of the fact that ads today completely ruin the browsing experience, driving many people to ad-blockers. If advertisement annoys you, you're less likely to buy whatever it's advertising, and that includes stuff you might otherwise actually be interested in.
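For what it's worth, the cohort idea described above can be sketched in a few lines. This is a toy illustration under my own assumptions, not how Google or anyone else actually does it: the `Profile` fields, the age bands and the campaign table are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """What the ad system *believes* about a user (it may well be wrong)."""
    gender: str
    age: int
    income_bracket: str


# Hypothetical campaigns, keyed by (gender, age band, income bracket).
CAMPAIGNS = {
    ("f", "25-34", "mid"): ["stroller", "meal kit"],
    ("m", "25-34", "mid"): ["sedan", "running shoes"],
}
DEFAULT_ADS = ["generic streaming service"]


def age_band(age: int) -> str:
    """Collapse an exact age into a coarse band (bands are made up)."""
    if age < 25:
        return "18-24"
    if age < 35:
        return "25-34"
    return "35+"


def cohort_of(profile: Profile) -> tuple:
    """Reduce an individual to the cohort key the campaigns are sold against."""
    return (profile.gender, age_band(profile.age), profile.income_bracket)


def pick_ads(profile: Profile) -> list:
    """Serve whatever was bought for the user's cohort, else a generic fallback."""
    return CAMPAIGNS.get(cohort_of(profile), DEFAULT_ADS)


# Two different people, same cohort, same ads.
print(pick_ads(Profile("m", 27, "mid")))
print(pick_ads(Profile("m", 33, "mid")))
```

Note that nothing in `pick_ads` looks at the individual beyond the cohort key, which is why the ads so often miss.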
Do you block ads and trackers?
I've been told many times over now that Machine Learning/AI has become the replacement for Big Data/Data Science in the entrepreneur sphere: a hyped-up idea promising to deliver gold when in reality most companies who employ it either don't know what they're doing or are too lazy/scared to invest in properly building something complex and useful.
It's doing some fantastic things, but it's also doing some really dumb things because people don't understand the hype cycle.
Most of the exciting things it's doing are in astro/physics, with the majority of the future uses still deep in the research stages.
ML and AI are extremely effective tools at what they are designed to do, but the problem being tackled is much more expansive than what they offer.
If data science is tearing down an old building and constructing a new one, ML might be dynamite and AI might be concrete, but you still need to figure out how to blow up the building safely and you need someone to direct where to pour the concrete. Even after all that, there's a lot more that happens and you're going to need construction workers and rebar and waste management and food and a million other things.
The author seems to make the point that harvesting data isn't so bad, since it's really hard to use it for anything useful or specific. But I would still argue that protecting your data from harvesting is important, for many reasons. Our usage data could be sold or merged with other data, potentially deanonymized, and used for many purposes beyond commercial ones, for instance targeted political propaganda. I think assuming that data analytics suck, and that data harvesting therefore isn't so bad, is a really bad idea.
Not really. They are saying that it is much ado about nothing because it doesn't work in commercial settings, and that they are willing to share their data for a useful service in return, but also that most often neither such a service nor what we all usually get requires the use of such data. (Indeed, I observe that as the data gets bigger, recommendations and targeting tend toward a line fitted through a scatterplot that doesn't actually pass through even a tiny fraction of the points. YouTube recommendations, for example, have become utterly useless for me, except when I am using them for the first couple of clicks in a fresh browser profile.)
I agree with the rest of your comment tho. As we saw with Cambridge Analytica and Myanmar, all this private-interest surveillance may be "useful" in utterly destructive ways, and that is an existential threat to a free and peaceful society, which relies on honesty, reason and conscience, the first things to fail when a person is overwhelmed by FUD.
This confirms my belief that ads almost always benefit the ad companies and trackers, not even the publishers, who are doing nothing but buying what amounts to snake oil.
I think the best strategy against the state of things with data terror is to show those who shop from and invest in these companies and the recommendations business that they are not getting their money's worth. It looks like a fraudulent commercial scheme.
I wonder when some smart publisher is going to go back to doing it the old fashioned way: give away a free edition for my taking a fun, interesting survey.