Based on the estimate of 20 hours of work to transcribe one hour of content, the worker received $72 for at least 100 hours of work.
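To spell out the arithmetic, here is a quick sketch using only the figures quoted above. The variable names are mine, and since the 100 hours is a lower bound, the hourly rate that falls out is an upper bound.

```python
# Figures quoted in the comment above; variable names are illustrative.
payment_usd = 72.0
hours_per_content_hour = 20   # estimated work per hour of audio
min_hours_worked = 100        # "at least 100 hours"

# How much audio the 100 hours corresponds to at the 20:1 estimate.
implied_content_hours = min_hours_worked / hours_per_content_hour

# More hours worked would only push this lower, so it is a ceiling.
max_hourly_rate = payment_usd / min_hours_worked

print(f"Implied content transcribed: at least {implied_content_hours:g} hours")
print(f"Effective pay: at most ${max_hourly_rate:.2f} per hour")
```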
The real problem is that these 20 hours of work are work that shouldn't need to be done by a human. ByteDance should have invested in creating a software tool/platform to organize the audio samples and feed them to the transcribers to maximize their productivity. I'm not surprised that ByteDance is inexperienced with managing such data collection, nor am I surprised that they farmed this out to a chain of middle-management. This is a recipe for failure I've seen many times before.
I sometimes manage part-time contractors who do natural language annotation, and in my experience, it's futile to try to get projects done if you don't have proper software tooling in place. Not only will you drive yourself insane trying to manage the overhead of tracking the work and the work product you receive, but you will not retain the contractors you hire if you waste their time. The company I work for pays contractors well above minimum wage (and we also establish real contracts with them), so we are strongly incentivized not to offload such wasteful busywork to our contractors. It is a much better use of my time to write some data-munging scripts to automate what is automatable than it is to task contractors with doing things manually. We want the hours they spend to actually be productive, not spent shuffling our datasets or doing other things that computers do much more efficiently than humans.
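For a sense of what I mean by data-munging scripts, here is a minimal sketch. It is not our actual tooling. The clip names, transcriber names, batch size, and the `build_manifest` / `manifest_to_csv` helpers are all made up for illustration. The idea is just that a script, not a person, decides who gets which clips and keeps a record of their status.

```python
import csv
import io
from itertools import cycle

FIELDS = ["batch", "clip_id", "assignee", "status"]

def build_manifest(clip_ids, transcribers, batch_size=2):
    """Assign clips to transcribers in fixed-size batches, round-robin."""
    if not transcribers:
        raise ValueError("need at least one transcriber")
    rows = []
    workers = cycle(transcribers)
    for start in range(0, len(clip_ids), batch_size):
        worker = next(workers)            # one worker per batch
        batch_no = start // batch_size + 1
        for clip_id in clip_ids[start:start + batch_size]:
            rows.append({"batch": batch_no, "clip_id": clip_id,
                         "assignee": worker, "status": "pending"})
    return rows

def manifest_to_csv(rows):
    """Render the manifest as CSV text that can be shared or diffed."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

# Hypothetical inputs: five audio clips, two transcribers.
clips = [f"clip_{i:03d}.wav" for i in range(1, 6)]
manifest = build_manifest(clips, ["ana", "bruno"], batch_size=2)
print(manifest_to_csv(manifest))
```

In practice you'd also want the reverse direction: a script that collects finished transcripts and flips `status`, so nobody has to track that by hand in a chat thread.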
I don't watch a ton of TikTok videos, but I've never seen one that didn't have whatever the narrator was saying printed over it. Even songs have the words. I realize that's not searchable and you can't use screen readers with it. But it makes me wonder if OCR would be a better route?
So, the goal here would be speech-to-text models that can automatically transcribe whatever is being spoken. The issue one would face here is getting speech-to-text models for the many different natural languages, and the variants/dialects of those languages, across TikTok's user base. The recruiter mentioned "pt-br" (the language tag for Brazilian Portuguese according to the IETF BCP 47 recommendation, composed of subtags registered in the IANA registry) in their WhatsApp workers-wanted message because ByteDance would be intent on collecting training and validation data in order to develop their own automated speech-to-text model for Brazilian Portuguese. One could also develop a home-grown OCR system (or buy an existing solution), but speech-to-text and OCR would be independent functionalities and would require entirely different datasets. (Though, as you say, it might be possible to collect transcription data if the transcriptions are already present in the uploaded TikTok video content, this would still likely require manual review/correction, which may end up taking just as much effort as transcribing from the audio alone.)
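As an aside on the "pt-br" tag: BCP 47 tags are case-insensitive, but by convention the language subtag is lowercase, a four-letter script subtag is title case, and a region subtag is uppercase, so "pt-br" and "pt-BR" name the same thing. Here is a rough sketch of normalizing tags before bucketing clips by language. It only handles simple language[-Script][-REGION] tags, accepting underscores is my own assumption, and `normalize_tag` / `group_by_language` are hypothetical helpers, not part of any library.

```python
from collections import defaultdict

def normalize_tag(tag):
    """Apply conventional BCP 47 casing to a simple language tag."""
    parts = tag.replace("_", "-").split("-")   # tolerate locale-style "pt_BR"
    out = [parts[0].lower()]                   # language: lowercase
    for sub in parts[1:]:
        if len(sub) == 4 and sub.isalpha():
            out.append(sub.title())            # script, e.g. Latn
        elif (len(sub) == 2 and sub.isalpha()) or (len(sub) == 3 and sub.isdigit()):
            out.append(sub.upper())            # region, e.g. BR or 419
        else:
            out.append(sub.lower())            # anything else: lowercase
    return "-".join(out)

def group_by_language(clips):
    """Bucket (clip_id, tag) pairs by normalized language tag."""
    groups = defaultdict(list)
    for clip_id, tag in clips:
        groups[normalize_tag(tag)].append(clip_id)
    return dict(groups)

# Hypothetical clips labelled inconsistently by different uploaders.
clips = [("a.wav", "pt-br"), ("b.wav", "PT-BR"), ("c.wav", "pt_pt")]
print(group_by_language(clips))
```

The point is that Brazilian and European Portuguese end up in separate buckets, which is what you'd want if each variant needs its own training and validation data.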
I feel like this is a problem that will solve itself one way or another. Presumably most people will have the same reaction as Felipe - namely, wow, that's a lot of work in reality for not a lot of pay, bye. The barriers to entry and to exit here are extremely low.
If the pay is unbearably low, then TikTok isn't going to get much transcribing done, considering everyone they try to hire will dip for better jobs. Then they'd either have to raise the pay or give up on the endeavor (or move on to hiring a different country's residents).
With practically nothing keeping you on a job that doesn't even have a formal contract, you'd only keep at it if there's nothing better to do with your time - in which case, if it didn't exist, it's not like you'd be in much better shape. Otherwise, just ignore it.
It'd become a problem if this gig became so big as to disrupt other industries, but it seems like an extremely small operation in the grand scheme of the Brazilian economy.