35 votes

Coverage of Gaza War in the New York Times and other major newspapers heavily favored Israel, analysis shows

6 comments

  1. [6]
    cykhic
    (edited)
    • Exemplary

    Archive link: https://archive.is/IxYxy


    My summary of the article:

    The authors analysed 1100 news articles from The New York Times, The Washington Post, and The Los Angeles Times, and concluded that these publications had a consistent pro-Israel bias, because:

    1. For every Israeli death, Israelis are mentioned 8 times, while for every Palestinian death, Palestinians are mentioned 0.5 times, in article bodies.

    2. Highly emotive words were used far more often to refer to Israelis than to Palestinians in article bodies: "slaughter" 60 times more often, "massacre" 120 times, and "horrific" 38 times.

    3. 6000 Palestinian children and 100 journalists were killed, but were only mentioned in headlines 2 times and 9 times respectively, out of 1100 headlines.

    4. "Antisemitism" was mentioned 549 times, while "Islamophobia" was mentioned 79 times, in article bodies, in the period before the "campus antisemitism" phenomenon.


    While I am sympathetic to this article's conclusions, I kind of doubt that they can credibly be drawn from its analysis.

    Their first finding was the weirdest, in my opinion, and made me do a double take.

    (my summary) 1. For every Israeli death, Israelis are mentioned 8 times, while for every Palestinian death, Palestinians are mentioned 0.5 times, in article bodies.

    If we look closer at their methodology:

    [...] the words “Israeli” or “Israel” appear more than “Palestinian” or variations thereof, even as Palestinian deaths far outpaced Israeli deaths. For every two Palestinian deaths, Palestinians are mentioned once. For every Israeli death, Israelis are mentioned eight times [...]

    If their goal is to measure the rate at which deaths are reported, why are they counting the word "Israel" here? Are they also counting the word "Palestine"? That sounds like it would seriously conflate reporting on deaths with reporting on the conflict in general, where I would expect fair coverage to be roughly one-to-one. This is on top of the already-large leap of using "Israeli"/"Palestinian" to measure "the speaker's compassion for civilian deaths", when phrases like "Israeli government" or "Palestinian militants" are common.

    If we look at their summary, it shows 10286 mentions of "israeli" and 7045 mentions of "palestinian". If we take these as measuring coverage of the overall conflict (which, given their methodology, I think is the only halfway-reasonable use for this statistic), this is still consistent with their conclusion of pro-Israel bias, but a 10:7 bias is a much weaker conclusion than 16:1.
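
    To make the two framings concrete, here is a minimal sketch of the arithmetic, using only the numbers quoted above (the per-death rates are the article's; nothing else is assumed):

    ```python
    # Two ways to frame the same mention counts. All inputs are figures
    # quoted in the article or its summary.
    israeli_mentions = 10286
    palestinian_mentions = 7045

    mentions_per_israeli_death = 8.0      # per the article
    mentions_per_palestinian_death = 0.5  # "once per two deaths"

    # Framing A: ratio of mentions per death, as the article presents it.
    per_death_ratio = mentions_per_israeli_death / mentions_per_palestinian_death
    print(f"per-death framing: {per_death_ratio:.0f}:1")  # 16:1

    # Framing B: ratio of raw mentions, i.e. coverage of the conflict overall.
    raw_ratio = israeli_mentions / palestinian_mentions
    print(f"raw-mention framing: {raw_ratio:.2f}:1")  # ~1.46:1, roughly 10:7
    ```

    Both framings rest on the same underlying counts; the per-death ratio is just the raw-mention ratio multiplied by the (much larger) ratio of Palestinian to Israeli deaths, which is where all the drama in 16:1 comes from.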

    I don't think it's at all reasonable to include the words "Israel" and "Palestine" in a count meant for their original purpose (measuring how deaths are reported), so although the numbers (after reinterpretation) still weakly support their conclusion, I can't help but feel that the authors were deliberately manipulating the presentation of their numbers to fit their preferred conclusion.


    Maybe I was primed by the above, but their third finding also seems particularly suspicious.

    (my summary) 3. 6000 Palestinian children and 100 journalists were killed, but were only mentioned in headlines 2 times and 9 times respectively, out of 1100 headlines.

    The intended implication seems to be that although Palestinian children made up 30% (6000 of about 20000) of Palestinian casualties, they received only 0.2% (2 of 1100) of the headline attention paid by these newspapers.

    But, why only headlines? Since The Intercept's other three findings refer to words found in the bodies of articles, I see no reason why they should add this restriction specifically for this finding, unless it is to cherry-pick this conclusion. I suspect that if one were to re-run this finding on article bodies, the number that comes out would be much less newsworthy.


    I don't have huge problems with their findings 2 and 4, apart from my general doubt that this word-counting methodology can give us a good indication of (for finding number 2) sympathy to Israeli deaths vs. Palestinian deaths, or (for finding number 4) attitudes towards Jews vs. Muslims.

    (edited to add: I did notice something the article seems to gloss over, which is that the vast majority of the emotive language in finding number 2 seems to relate to the Oct 7 attack specifically, rather than splitting more generally along Israeli/Palestinian lines, though I'm less sure what conclusion to draw from that.)


    Probably this response will seem like an overreaction to you, but I personally get kind of ticked off when I see poor or lazy data analysis. I respect that the authors probably wanted to find some kind of non-controversial, objective measure of reporting bias, and I don't have a better methodology on hand, but I think what they have done is very much controversial and non-objective.

    I feel that people who don't already agree with them are going to immediately spot the problems, so if someone who agrees with this article signal-boosts it, they run the risk of poisoning the well by suggesting that "since it was broadcast widely, this is probably one of their stronger arguments, but it is wrong, which puts an upper bound on the correctness of the entire position".

    It's also not reassuring to me that the statistics appear to be deliberately massaged to prefer one conclusion, in an article which is specifically attempting to call out bias in reporting.

    25 votes
    1. [3]
      Gaywallet

      If their goal is to measure the rate at which deaths are reported, why are they counting the word "Israel" here? Are they also counting the word "Palestine"? That sounds like it would seriously conflate reporting on deaths with reporting on the conflict in general, where I would expect fair coverage to be roughly one-to-one.

      I think this is adequately covered in some of their highlights, and is familiar to anyone who's seen reporting on the war. Generally speaking, when a location is used to explain the context of deaths without mentioning the conflict, painting it as one sided (mentioning only Israel or only Palestine) almost always seems to come with a bias; in one-sided framing it always seems to be "defense" or "terrorists". Furthermore, unbiased reporting should explain the casualties on all sides involved in a conflict, to give both a sense of scale and an understanding of the events which have transpired.

      This is on top of the already-large leap of using "Israeli"/"Palestinian" to measure "the speaker's compassion for civilian deaths", when phrases like "Israeli government" or "Palestinian militants" are common.

      The word "compassion" does not appear in the article. Where did you get this idea?

      If we look at their summary, it shows 10286 mentions of "israeli" and 7045 mentions of "palestinian". If we take these as measuring coverage of the overall conflict (which, given their methodology, I think is the only halfway-reasonable use for this statistic), this is still consistent with their conclusion of pro-Israel bias, but a 10:7 bias is a much weaker conclusion than 16:1.

      Coverage of the conflict isn't always focused on casualties, and a raw count doesn't give you an idea of the sentiment attached to the statements, especially when the analysis works by attaching sentiment to the object of each sentence.

      While this wasn't a more sophisticated sentiment analysis using techniques such as n-grams or parsing sentences with linguistic algorithms, a lot of basic sentiment analysis is exactly this kind of emotive analysis: a set of dictionary words with specific weights, such as "happy" indicating a positive emotion and "sad" a negative one. Similarly, in research involving racism, sexism, etc., negative sentiment is often graded by the choice of more negative or positive adjectives relative to control groups (of note here, anti-Muslim analysis frequently targets similar words).
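
      For anyone following along who hasn't seen this style of analysis, here's a minimal sketch of dictionary-based emotive scoring. The lexicon, weights, and sample sentences are invented for illustration; they are not the article's actual word list:

      ```python
      # Dictionary-based emotive scoring: sum fixed per-word weights over
      # the tokens of each sentence, then aggregate by the group the
      # sentence is about. Lexicon and weights are illustrative placeholders.
      EMOTIVE_LEXICON = {"slaughter": 3, "massacre": 3, "horrific": 2, "killed": 1}

      def emotive_score(text: str) -> int:
          """Sum lexicon weights over a naive whitespace tokenisation."""
          tokens = text.lower().replace(",", " ").replace(".", " ").split()
          return sum(EMOTIVE_LEXICON.get(tok, 0) for tok in tokens)

      def score_by_subject(tagged: list[tuple[str, str]]) -> dict[str, int]:
          """Aggregate emotive weight per subject label (e.g. per nationality)."""
          totals: dict[str, int] = {}
          for subject, sentence in tagged:
              totals[subject] = totals.get(subject, 0) + emotive_score(sentence)
          return totals

      # Usage: tag each sentence with the group it refers to, then compare totals.
      sample = [
          ("israeli", "A horrific massacre of civilians."),
          ("palestinian", "Dozens were killed in the strike."),
      ]
      print(score_by_subject(sample))  # {'israeli': 5, 'palestinian': 1}
      ```

      More sophisticated approaches swap the fixed lexicon for learned weights or parsed sentence structure, but the counting skeleton is the same.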

      I agree that they should have done a better job laying the groundwork, explaining that from a 10,000 ft coverage view there's only slightly more use of the word Israel than Palestine, but I don't agree that their methodology here is flawed. It certainly could be improved with more sophisticated sentiment analysis tools, but that's a very different problem.

      The intended implication seems to be that although Palestinian children made up 30% (6000 of about 20000) of Palestinian casualties, they received only 0.2% (2 of 1100) of the headline attention paid by these newspapers.

      That's not what the sentence says at all. You shouldn't read "implications" into scientific findings; that's how you insert your own bias. The only thing this sentence is stating is that these particular examples seemed not to be included in headlines.

      I do agree, however, that they should have done a better job with analysis and presentation here. This finding tells me very little without at least a comparative point. They should have provided a count of headlines including some of these terms absent counts of Israel/Palestine, or at least given an idea of what headlines typically consisted of.

      Probably this response will seem like an overreaction to you

      No, this was a fantastic breakdown, and I thank you for going through the details of the analysis itself. I haven't found any other data nerds to talk with about this, and I think there's a lot to be said about the quality of the analysis (it can be greatly improved, but it's honestly pretty impressive for a random journalist rather than a scientist). I'm still waiting for some good scientific papers on the conflict to come out, but if older papers hold similar results, it's extremely likely that we'll see systematic biases in how this conflict is represented based on the publisher and country of origin. 1 2 3 4

      11 votes
      1. [2]
        cykhic

        The word "compassion" does not appear in the article. Where did you get this idea?

        Ah, sorry about that. I think I generally use quotes both to denote "this is what they said" and to denote "the concept of XYZ", which is how I was using them in that sentence. I'm not sure how better to distinguish the two; maybe square brackets would work.

        That's not what the sentence says at all. You shouldn't read "implications" into scientific findings; that's how you insert your own bias.

        I think I disagree in the context of this article. After looking at the specific numbers they found, I tried to make a reasonable generalisation, for example: [the papers say 'antisemitism' more than 'islamophobia'] --> [this indicates more concern about violence against Jews than Muslims] --> [the papers are biased].

        I think doing this is reasonable because:

        1. Without generalising, the finding is so specific that it isn't helpful.
        2. There probably actually is some underlying correlate which gives rise to the observation, and will cause other effects which we care about.
        3. I think many readers will make the generalisation when they read each of those findings.

        While there is some risk of inserting my own bias, I (subjectively) feel like I made reasonable generalisations. In particular, I don't think it's worse than skipping the middle step and going straight from word frequencies to conclusions of bias.

        I'm still waiting for some good scientific papers on the conflict to come out, but if older papers hold similar results, it's extremely likely that we'll see systematic biases in how this conflict is represented based on the publisher and country of origin. 1 2 3 4

        Thanks for the links, these look significantly more rigorous and I am reading them now.

        3 votes
        1. Gaywallet

          I tend to avoid using double quotes for anything but actual quotes. When I'm synthesizing or otherwise changing the exact wording, or referring to a concept, I tend to use single quotes. Ultimately, I'm not sure there's any 'correct' way of doing things, but some sort of separation between the two via symbol or wording helps keep the reader on track.

          I think doing this is reasonable because:

          1. Without generalising, the finding is so specific that it isn't helpful.
          2. There probably actually is some underlying correlate which gives rise to the observation, and will cause other effects which we care about.
          3. I think many readers will make the generalisation when they read each of those findings.

          I'm in agreement with 1. I don't think it's a particularly helpful finding, especially because it comes with no context to anchor to, but honestly that's kind of the nature of science. You sometimes find useful things and sometimes don't. I think the article would have been better served without this particular tidbit, because it just doesn't provide anything useful. Since we have no context, 2. is completely reasonable.

          I (subjectively) feel like I made reasonable generalisations. In particular, I don't think it's worse than skipping the middle step and going straight from word frequencies to conclusions of bias.

          I don't disagree that many people would think it's reasonable to make that jump, but it doesn't follow logically from the findings, which is why I pointed it out. It is absolutely fair, however, to criticize the writer for presenting it as fact without doing the due diligence of giving us anything else to compare or anchor it to. I also suspect, as you seem to, that the writer's bias is at play here, and that it's why they chose to look at this particular metric without bothering to compare it, at the very least, to its Israeli counterpart. It's even quite possible that they did, and the finding was similar, and they chose to simply omit that piece of information.

          Thanks for the links, these look significantly more rigorous and I am reading them now.

          Keep in mind that I chose a few articles over differing time frames, across countries, and using slightly different methodologies, to show some of the breadth of the field of sentiment analysis. Broadly speaking, we've entered a much more polarized age in the last 5 years or so, and some of these findings may no longer hold true for current publications and the current conflict.

          4 votes
    2. vektor

      If their goal is to measure the rate at which deaths are reported, why are they counting the word "Israel" here? Are they also counting the word "Palestine"?

      Likewise for "Gaza" and, perhaps more importantly, "Gazan".

      But, why only headlines? Since The Intercept's other three findings refer to words found in the bodies of articles, I see no reason why they should add this restriction specifically for this finding, unless it is to cherry-pick this conclusion.

      Moreover, this one is suspiciously lacking a control, thus covering up sloppy methodology. The implication of their method is that article headlines should be proportional to casualties: if a group has 30% of all casualties in a conflict, it deserves 30% of the headlines about that conflict. This fails for multiple reasons. A person can belong to multiple groups: those 30% of casualties who are Gazan children are at once Gazan, children, and Palestinian. If I give the children their 30% of headlines, and the adults their 70%, I have 0% of space left for Gazans as a whole, or for Palestinians. What's worse, if I play by these rules, I have 0% of space left to discuss the conflict in general, or any event around the conflict that can't be tied to casualties from a specific group. In other words, there's no way the implied math could ever work out in any real-world setting. The only way the reported figures are at all informative is in relative terms, i.e. how this non-focus on children compares to e.g. other wars or to other groups of victims in the current war; as the absolute figures they are reported as, they are worthless. Those comparisons are made, but only qualitatively.
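
      A toy version of that arithmetic makes the impossibility concrete (the shares are illustrative, loosely following the 30% figure above):

      ```python
      # Why "headlines proportional to casualty share" cannot hold once
      # groups overlap: the claims of overlapping groups sum past the budget.
      headline_budget = 1.0  # 100% of headlines about the conflict

      claims = {
          "Gazan children": 0.30,  # 30% of casualties
          "Gazan adults":   0.70,  # the remaining 70%
          "Gazans overall": 1.00,  # the union of the two groups above
          "Palestinians":   1.00,  # overlaps all of the above
      }

      total_claimed = sum(claims.values())
      print(f"claimed {total_claimed:.0%} of a {headline_budget:.0%} budget")
      # claimed 300% of a 100% budget -- before a single headline about the
      # conflict in general, which belongs to no one casualty group.
      ```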

      8 votes
    3. [2]
      Comment removed by site admin
      1. cykhic

        I agree with your characterisation of headlines vs. article bodies.

        But my question was, why use headlines only for the third finding? Would your characterisation not equally apply to the other three of the four findings?

        My concern is whether the authors ran the same analysis on both headlines and article bodies, and then reported whichever number came up worse. That would not be a fair analysis (that is, it would have a good chance of finding "evidence" of bias even in a perfectly fair newspaper). Their cherry-picking of the data corpus also raises the possibility that they cherry-picked the words they analysed, in order to get the most skewed possible numbers, e.g. analysing "children" over other words like "civilian". If they did do this, it would further skew their analysis away from fair.
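
        As a quick illustration of why run-it-both-ways-and-report-the-worse is unfair, here is a toy Monte Carlo sketch. Every parameter in it is invented; it models a perfectly fair newspaper, not the authors' actual procedure:

        ```python
        import random

        def lopsidedness(n_articles: int) -> float:
            """Mention ratio in a fair corpus: each article covers either
            side with equal probability; return larger/smaller mentions."""
            a = sum(random.random() < 0.5 for _ in range(n_articles))
            b = n_articles - a
            return max(a, b) / max(min(a, b), 1)

        random.seed(0)
        trials = 10_000

        # Honest analysis: pre-register one corpus (say, bodies) and report it.
        honest = sum(lopsidedness(100) for _ in range(trials)) / trials

        # Cherry-picked: measure both corpora, report whichever looks worse.
        picked = sum(max(lopsidedness(100), lopsidedness(100))
                     for _ in range(trials)) / trials

        print(f"honest: {honest:.3f}  cherry-picked: {picked:.3f}")
        # The cherry-picked figure sits systematically further from 1.0,
        # i.e. selection alone manufactures apparent "bias" from fair coverage.
        ```

        The gap grows with the number of corpora or metrics one gets to choose among, which is exactly the worry about word choice ("children" vs. "civilian") as well.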

        1 vote