ChatGPT seems to be transphobic and pro-torture - ~humanities

[28]

updawg

January 21, 2024

Link

I'm a bit confused by what you're arguing. I don't see anything wrong with the way these questions were answered because it's a Large Language Model that just repeats what it thinks a human would...

I'm a bit confused by what you're arguing. I don't see anything wrong with the way these questions were answered because it's a Large Language Model that just repeats what it thinks a human would be likely to say. Do you expect each of these to have hard-coded responses written by OpenAI? How many of those do you expect them to write? At some point it stops being an AI and starts just being a repository of OpenAI's beliefs.

79 votes

[27]
Halfdan
January 21, 2024
Link Parent
"a human", in this case, would be a transphobic human who are in favor of torture. Which seem akin to the way AI image generation favor racist ideas. And no matter if this is a result of a Large...

I'm a bit confused by what you're arguing. I don't see anything wrong with the way these questions were answered because it's a Large Language Model that just repeats what it thinks a human would be likely to say.

"a human", in this case, would be a transphobic human who are in favor of torture. Which seem akin to the way AI image generation favor racist ideas. And no matter if this is a result of a Large Language Model, it is, as I mentioned, going to be used by people for research.

Do you expect each of these to have hard-coded responses written by OpenAI?

It seems like they does. "It's important to note" is what ChatGPT typically write when when its programmer want to hardcode a certain viewpoint.

1 vote
1. [9]
  papasquat
  January 21, 2024
  Link Parent
  I'm trying to understand your argument here. ChatGPT, without shaping of a prompt, will attempt to give an answer as middle of the road as possible on any controversial topic, and no matter how...
  
  I'm trying to understand your argument here. ChatGPT, without shaping of a prompt, will attempt to give an answer as middle of the road as possible on any controversial topic, and no matter how much you insist they aren't, the use of torture, and trans rights are hugely controversial topics in American politics.
  
  Would you expect the answers to both of your questions to simply just be "yes"?
  
  46 votes
  1. [8]
    updawg
    January 21, 2024
    Link Parent
    Off-topic, but holy shit, it is so hard to get it to actually directly answer a question. If you ask a basic yes/no question, it will still often default to writing multiple paragraphs explaining...
    
    Off-topic, but holy shit, it is so hard to get it to actually directly answer a question. If you ask a basic yes/no question, it will still often default to writing multiple paragraphs explaining unnecessary background. Sometimes I've had to actually resort to insisting it gives me a one-word answer because it could not otherwise answer me directly.
    
    4 votes
    
    [2]
    em-dash
    January 21, 2024
    Link Parent
    Infodumping is its primary/only way of reasoning. LLMs have memory of the text of the conversation, but nothing like an internal monologue that they can use as scratch space to formulate an answer...
    
    Infodumping is its primary/only way of reasoning. LLMs have memory of the text of the conversation, but nothing like an internal monologue that they can use as scratch space to formulate an answer before outputting it.
    
    5 votes
    
    updawg
    January 21, 2024
    Link Parent
    If you ask it to use one-word answers, it does a good job, though, including answering "subjectively," "allegedly," etc.
    
    If you ask it to use one-word answers, it does a good job, though, including answering "subjectively," "allegedly," etc.
    
    [5]
    unkz
    January 21, 2024
    Link Parent
    An interesting way to get what you really want is to insist that it respond according to a particular JSON schema, in the format that langchain uses. It is generally far more compliant with the...
    
    An interesting way to get what you really want is to insist that it respond according to a particular JSON schema, in the format that langchain uses. It is generally far more compliant with the same requests in free form.
    
    [4]
    updawg
    January 21, 2024
    Link Parent
    Could you give me an example?
    
    Could you give me an example?
    
    [3]
    unkz
    January 21, 2024
    Link Parent
    Here's some documentation: https://python.langchain.com/docs/modules/model_io/output_parsers/types/json for instance, here's how it rendered a question using this setup: class Response(BaseModel):...
    
    Here's some documentation:
    
    https://python.langchain.com/docs/modules/model_io/output_parsers/types/json
    
    for instance, here's how it rendered a question using this setup:
    
    class Response(BaseModel): answer: bool = Field(description="Whether the answer is true or false") details: str = Field(description="A detailed answer") template = """Answer the following question: {format_instructions} The question is: {question} """
    
    and the question "Are enhanced interrogation techniques the same thing as torture?"
    
    Answer the following question: The output should be formatted as a JSON instance that conforms to the JSON schema below. As an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]} the object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted. Here is the output schema: #``` {"properties": {"answer": {"description": "Whether the answer is true or false", "title": "Answer", "type": "boolean"}, "details": {"description": "A detailed answer", "title": "Details", "type": "string"}}, "required": ["answer", "details"]} #``` The question is: Are enhanced interrogation techniques the same thing as torture?
    
    to which I got these responses
    
    Response(answer=True, details="Enhanced interrogation techniques are a controversial topic, and opinions may vary. Some argue that they are a form of torture, while others believe they are necessary for national security purposes. It ultimately depends on one's perspective and definition of torture.")
    
    Response(answer=True, details='Enhanced interrogation techniques are often considered a form of torture due to their coercive and abusive nature.')
    
    Response(answer=False, details='Enhanced interrogation techniques are considered by some to be a form of torture, but others argue that they are different.')
    
    It's a stochastic model, so not much consistency, but for a lot of uses cases this is a good approach. Contrast this with the answer I get from free-form ChatGPT:
    
    The classification of enhanced interrogation techniques as torture is a subject of debate and often depends on one's perspective and the specific techniques in question. Enhanced interrogation techniques generally refer to a set of methods used by some intelligence agencies and governments to extract information from individuals who are believed to have valuable intelligence, particularly in the context of counterterrorism or national security. Some argue that enhanced interrogation techniques constitute torture because they involve methods that can cause physical and psychological suffering to detainees. These methods can include techniques like waterboarding, stress positions, sleep deprivation, and sensory deprivation, among others. Human rights organizations and international bodies like the United Nations have condemned many of these techniques as torture and have called for their prohibition. On the other hand, proponents of enhanced interrogation techniques argue that they are necessary for national security and can provide valuable information that may prevent terrorist attacks or save lives. They often claim that these methods are not torture when used within specific guidelines and with safeguards in place to prevent excessive harm. The classification of these techniques as torture or not often depends on legal definitions, ethical perspectives, and the context in which they are used. Some countries have banned certain enhanced interrogation techniques, while others have permitted them under certain conditions. International agreements like the United Nations Convention Against Torture and Other Cruel, Inhuman or Degrading Treatment or Punishment explicitly prohibit torture and have led to disagreements and discussions about the legality of various interrogation methods. Ultimately, whether enhanced interrogation techniques are considered the same as torture is a matter of ongoing debate and varies depending on legal, ethical, and political viewpoints. Many human rights advocates argue for clear definitions and restrictions to prevent abuse and protect the rights and well-being of detainees.
    
    3 votes
    
    [2]
    updawg
    January 22, 2024
    Link Parent
    That's really interesting. What are the benefits of doing this over just telling it the constraints in plain language? Is it more for doing API calls?
    
    That's really interesting. What are the benefits of doing this over just telling it the constraints in plain language? Is it more for doing API calls?
    
    unkz
    January 22, 2024
    Link Parent
    Well, this is for doing API calls, but in general I have found that the results I get from calling in a manner that suggest that a computer is intended to read it are noticeably different. I have...
    
    Well, this is for doing API calls, but in general I have found that the results I get from calling in a manner that suggest that a computer is intended to read it are noticeably different. I have a suspicion that this is because these kind of formatted messages receive much less RLHF feedback and are therefore slightly less lobotomized — almost like a little shortcut into the underlying fine tuned model, but that’s pure speculation.
    
    1 vote
2. [4]
  Eji1700
  January 21, 2024
  Link Parent
  Transitions like that are not a sign of hard coding and not how tech like this works. Let alone the obscene number of times it throws that into everything would mean that they’d be hard coding...
  
  "It's important to note" is what ChatGPT typically write when when its programmer want to hardcode a certain viewpoint.
  
  Transitions like that are not a sign of hard coding and not how tech like this works. Let alone the obscene number of times it throws that into everything would mean that they’d be hard coding 24/7 and doing nothing else
  
  34 votes
  1. [3]
    Halfdan
    January 21, 2024
    Link Parent
    I have seen this pattern, the phrase "It's important to note" followed by some disclaimer, so SO many times. And every time it just seem so so obviously hardcoded. So when you disagree with this...
    
    I have seen this pattern, the phrase "It's important to note" followed by some disclaimer, so SO many times. And every time it just seem so so obviously hardcoded. So when you disagree with this ... I just don't know what to make of it.
    
    I just make another example:
    
    You:
    give me 3 typical black names
    
    ChatGPT:
    It's important to note that names are diverse and can vary widely among individuals of any race or ethnicity. There is no single set of "typical" names for any racial or ethnic group. However, I can provide you with three names that are often associated with African American or Black communities in the United States:
    
    Jamal
    Keisha
    Malik
    
    To me, it just seem so obviously hardcoded.
    
    fxgn
    January 21, 2024 (edited January 21, 2024)
    Link Parent
    I just asked it the same question and it gave a different disclaimer. This seems to be pretty obvious evidence that this text is generated rather than hardcoded? It seems like you misunderstand...
    
    I just asked it the same question and it gave a different disclaimer. This seems to be pretty obvious evidence that this text is generated rather than hardcoded? It seems like you misunderstand the meaning of the word "hardcoded".
    
    21 votes
    
    patience_limited
    January 21, 2024
    Link Parent
    The only "hard coding" may be the bias towards inserting equivocating phrases and disclaimers when there is any evidence of dispute in the source material. For example, this set of queries should...
    
    The only "hard coding" may be the bias towards inserting equivocating phrases and disclaimers when there is any evidence of dispute in the source material.
    
    For example, this set of queries should be answerable by most humans to a good degree of certainty, but the GPT answers are all hedged. It would be helpful for researchers if the GPT identified the proportion of answers on either side, and the identities of the sources, at least to the extent of a typical Wikipedia article.
    
    11 votes
3. [9]
  unkz
  January 21, 2024
  Link Parent
  Can you provide a source for that assertion? Because I don’t necessarily believe it.
  
  It seems like they does. "It's important to note" is what ChatGPT typically write when when its programmer want to hardcode a certain viewpoint.
  
  Can you provide a source for that assertion? Because I don’t necessarily believe it.
  
  24 votes
  1. [8]
    Halfdan
    January 21, 2024
    Link Parent
    You could try asking ChatGPT for controversial or biased topics, such as "list female stereotypes", "describe a stereotypical gay person", and start counting how often it pops up in a disclaimery...
    
    You could try asking ChatGPT for controversial or biased topics, such as "list female stereotypes", "describe a stereotypical gay person", and start counting how often it pops up in a disclaimery manner.
    
    Sorry I don't do that myself, but I don't want to post a wall of text.
    
    [7]
    unkz
    January 21, 2024
    Link Parent
    That’s not evidence at all? I don’t see how that shows anything, other than that it has been trained to be “aggressively neutral” on anything that might be considered controversial.
    
    That’s not evidence at all? I don’t see how that shows anything, other than that it has been trained to be “aggressively neutral” on anything that might be considered controversial.
    
    38 votes
    
    Eji1700
    January 21, 2024
    Link Parent
    This is where the inherent misunderstanding of how the tech works comes into play. There is a large difference between weighting training/data/tags towards neutrality/showing both sides on...
    
    This is where the inherent misunderstanding of how the tech works comes into play.
    
    There is a large difference between weighting training/data/tags towards neutrality/showing both sides on anything with more than one, and "hard coding".
    
    There are plenty of subjects across the spectrum it does this on, and there's literally not enough hours in the day for the team to hardcode their answers to all of them.
    
    Now if you consider "weighting towards the middle" or however you want to describe the technical process "hard coding" then kinda yeah, but I think assigning malice beyond that is foolish.
    
    19 votes
    
    [5]
    Halfdan
    January 21, 2024
    Link Parent
    I don't think either of us have knowledge on what exactly goes on in ChatGPT's code. And I didn't mean exactly hardcoded, like IF input = "list female stereotypes" THEN, but more that there is...
    
    I don't think either of us have knowledge on what exactly goes on in ChatGPT's code. And I didn't mean exactly hardcoded, like IF input = "list female stereotypes" THEN, but more that there is some obvious attempt to capture certain phrases, which you can get around with a bit of creativity, like "write a poem about female stereotypes".
    
    But if you do know more about those things, I wouldn't mind learning a bit. If we take the quote below, how would you say that it had been influenced by its training?
    
    You:
    give me 3 typical black names
    
    ChatGPT:
    It's important to note that names are diverse and can vary widely among individuals of any race or ethnicity. There is no single set of "typical" names for any racial or ethnic group. However, I can provide you with three names that are often associated with African American or Black communities in the United States:
    
    Jamal
    Keisha
    Malik
    
    [3]
    unkz
    January 21, 2024
    Link Parent
    I would say I have probably a lot of understanding on what is happening in ChatGPT’s code, as I spend basically all my professional and personal time working on LLMs and other machine learning...
    
    Exemplary
    
    I would say I have probably a lot of understanding on what is happening in ChatGPT’s code, as I spend basically all my professional and personal time working on LLMs and other machine learning models, and I have personal experience in fine tuning models for specific purposes beyond basic chatbot usage.
    
    There are baaaaasically three stages in training.
    
    Raw training: this is just slightly curated but mostly just massive ingesting of raw data. This is Wikipedia, Reddit, books2, stack overflow, etc. This is commonly referred to as a “foundation model” and is a statistical representation of the base training data. It is essentially a hugely powerful autocomplete.
    
    Fine tuning: this is stuff that OpenAI has specifically written to be an influence on a sort of template of “how to be a useful chat assistant”. This is where a lot of that boilerplate language comes from, like “it is important to note”. This kind of data gives the skeleton of how the model hedges its language to be inoffensive and politically agnostic. It’s likely that responses similar to what you pasted are in this training data, but it’s by no means an exhaustive cover of all topics. This is commonly referred to as an instruct tuned model, and it is the most powerful and capable stage which unfortunately you and I will probably never get to directly see.
    
    RLHF: this is where the model gets lobotomized into avoiding saying anything controversial. Basically, humans rate multiple responses to a given prompt and choose one of them as being better. Now we have a dataset of responses and a value for each of them, call it 1 and -1. At this stage, the model is trained to shift away from “bad” responses and toward “good” responses, in a process similar to behavioural conditioning. This also has the effect of supercharging certain phrases and patterns, since there are relatively few ways to be bland and neutral about something and many ways to take strong, potentially offensive positions. It’s also fairly easy for the model to latch on to the concept of a topic being controversial, or a response being inflammatory, which unfortunately has the effect of being a giant dampener on all kinds of topics.
    
    10 votes
    
    [2]
    krellor
    January 21, 2024
    Link Parent
    One little nitpick: fine tuning is a form of learning, but your phrasing of "writing" makes it sound like a manual adjustment to specific prompts. Great comment!
    
    One little nitpick: fine tuning is a form of learning, but your phrasing of "writing" makes it sound like a manual adjustment to specific prompts.
    
    Great comment!
    
    unkz
    January 21, 2024
    Link Parent
    For sure, I could have been more clear about that.
    
    For sure, I could have been more clear about that.
    
    krellor
    January 21, 2024
    Link Parent
    Generally, LLMs go through a pertaining phase where they use self supervised learning against a large amount of data. Terms you could use to lookup more information include "entropy loss function"...
    
    Generally, LLMs go through a pertaining phase where they use self supervised learning against a large amount of data. Terms you could use to lookup more information include "entropy loss function" and "softmax functions." The idea is that the model can use inferred labels from the data itself. This training will give the model weights a reasonably accurate way of generating strings of tokens based on latent variables including grammar, use of nouns, participles, sentiment of strings of tokens, etc.
    
    These models are interesting, but not generally great at interfacing with people or tackling specialized tasks whose latent variables aren't represented in the training data. Several other kind of learning are then used, including transfer learning, fine tuning, and reinforcement learning from human feedback.
    
    There are many ways OpenAI could be training the pretrained model to be a better chat bot, including fine tuning to produce output humans find engaging, give emphasis to a subset of their training data such as from encyclopedias, etc. I don't work for OpenAI so couldn't say what they are doing. However, I have worked with researchers to perform transfer learning and fine tuning to solve these sorts of problems.
    
    It doesn't feel likely that there is an explicit list of topics that ChatGPT is set to handle differently. It feels more likely that certain data sources that provide lots of caveats were used to update the higher layers of weights on the system.
    
    Edit: one example I can give is taking a pretrained model, and then performing fine tuning on your own data to give certain emphasis and characteristic behaviors to the output. E.g., fine tuning on your curated set of successful customer chat logs.
    
    Note: learning only occurs when weights are updated. Prompts don't change the models weights.
    
    1 vote
4. [2]
  Trobador
  January 21, 2024
  Link Parent
  What solar system are you sourcing that from? I know I say that all the time and I've met plenty who do too, I have no idea why you'd think an LLM wouldn't pick up on it from its training.
  
  "It's important to note" is what ChatGPT typically write when when its programmer want to hardcode a certain viewpoint.
  
  What solar system are you sourcing that from? I know I say that all the time and I've met plenty who do too, I have no idea why you'd think an LLM wouldn't pick up on it from its training.
  
  22 votes
  1. ibuprofen
    January 21, 2024
    Link Parent
    I doubt it picked that phrase up organically from its training. But I highly doubt it's hardcoded to specific answers, that would be a nightmare. Most likely there's a hidden "controversy flag"...
    
    I doubt it picked that phrase up organically from its training.
    
    But I highly doubt it's hardcoded to specific answers, that would be a nightmare.
    
    Most likely there's a hidden "controversy flag" that's automatically tripped when assembling an answer that has strongly disagreeing sources.
    
    11 votes
5. Plik
  January 21, 2024
  Link Parent
  You are reading way too deeply into some of the cookie cutter phrases ChatGPT uses. I've seen "It's important to note" used in responses to prompts on topics as varied as Unity physics coding...
  
  You are reading way too deeply into some of the cookie cutter phrases ChatGPT uses. I've seen "It's important to note" used in responses to prompts on topics as varied as Unity physics coding questions to management and training, e.g. "It's important to note that the angle given is in degrees between 0 and 360, for signed angles use Vector3.SignedAngle."
  
  10 votes
6. updawg
  January 21, 2024
  Link Parent
  https://chat.openai.com/share/f3098068-53a6-441e-806e-88d48289feda
  
  https://chat.openai.com/share/f3098068-53a6-441e-806e-88d48289feda
  
  5 votes

[2]

TangibleLight

January 21, 2024

Link

ChatGPT only repeats arguments. It does not create and it does not have an opinion. It does not know. It was trained on the Internet and unfortunately those arguments are repeated a lot, so that's...

I really just want to punch ChatGPT right in the face for repeating this hugely dishonest argument.

ChatGPT only repeats arguments. It does not create and it does not have an opinion. It does not know.

It was trained on the Internet and unfortunately those arguments are repeated a lot, so that's what GPT repeats. I'm sure with a more specific prompt you could get it to drop the euphemisms and take a less passive tone. Fortunately, there is also a lot of material written in that style that GPT will also be able to repeat.

ChatGPT only repeats arguments.

Yes, I do [use ChatGPT for research] myself.

I recommend that you stop.

61 votes

skybrian
January 21, 2024
Link Parent
I would put it slightly differently: like a library, ChatGPT has many opinions from different people, and you can get it to give you a different one with the right prompt. Or at least, it started...

I would put it slightly differently: like a library, ChatGPT has many opinions from different people, and you can get it to give you a different one with the right prompt.

Or at least, it started out that way. They trained it to avoid saying controversial things, which you can see in these examples.

15 votes

[9]

Trobador

January 21, 2024

Link

While biases in the generated content can and do exist and are a problem, ChatGPT, as an entity, does not hold opinions. Your title is misleading. The red flags and euphemisms and whatever else...

While biases in the generated content can and do exist and are a problem, ChatGPT, as an entity, does not hold opinions. Your title is misleading.

The red flags and euphemisms and whatever else you see here are a simple reflection of the language used on the Internet which in turn was used in GPT-x's (whichever one you asked) training.

I'd also argue these answers are impressively neutral in their phrasing and that's a good thing. Were you expecting it not to mention that some don't condemn torture, or that some don't think JK Rowling is a transphobe and/or still support her despite/because of it? Were you expecting it to say gender critical are objectively wrong about everything and evil? Please understand that I am with you on everything mentioned here, but neutrality implies exposing all majorly held opinions to the reader. Even if you think the moral choice is clear, we only grow as people if we can make that choice on our own.

Also, don't use ChatGPT for research. It is clearly indicated that it is unfit for research. Schools are teaching everyone not to use AI for sources and are grading based on it. I don't know what you were expecting.

37 votes

[8]
Halfdan
January 21, 2024
Link Parent
This is The Middle of The Road fallacy; the idea that a point between two extremes is neutral by definition. But nope; the middle point between a truth and a lie is still a lie. The two points we...

Please understand that I am with you on everything mentioned here, but neutrality implies exposing all majorly held opinions to the reader.

This is The Middle of The Road fallacy; the idea that a point between two extremes is neutral by definition. But nope; the middle point between a truth and a lie is still a lie.

The two points we compare is always in the mainstream, so they seem "neutral" to us. But if we look at the mainstream two points in earlier times—whether women should vote, if slavery is okay, if its okay to burn women alive for witchcraft—and then try being "neutral", then we come across as unhinged and dangerous. Because history has shown that one extreme was right, and the neutral position and the other extreme were wrong.

1 vote
1. Eji1700
  January 21, 2024
  Link Parent
  You are misusing fallacies. If there are people that believe something, and your goal is to present facts, you must mention they believe it. You can then mention the problems with such an issue,...
  
  You are misusing fallacies.
  
  If there are people that believe something, and your goal is to present facts, you must mention they believe it. You can then mention the problems with such an issue, but you're applying fallacy arguments (something already over used or without understanding nuance) in a situation where it inherently doesn't make sense to do so.
  
  This is like saying that sourcing your paper is an appeal to an authority fallacy.
  
  29 votes
2. fxgn
  January 21, 2024
  Link Parent
  You can't say that a middle point between a truth and a lie is the same as the original lie, because that would actually be a fallacy
  
  the middle point between a truth and a lie is still a lie
  
  You can't say that a middle point between a truth and a lie is the same as the original lie, because that would actually be a fallacy
  
  14 votes
3. [2]
  LukeZaz
  January 21, 2024
  Link Parent
  I agree with you strongly here, but it still leaves behind the reasoning for this behavior, in that OpenAI is trying its damndest to make ChatGPT explicitly neutral on everything — not because...
  
  I agree with you strongly here, but it still leaves behind the reasoning for this behavior, in that OpenAI is trying its damndest to make ChatGPT explicitly neutral on everything — not because that is necessarily their beliefs or because they adore the status quo (although I do believe their politics has a great many issues), but because GPT here is meant to inform on and reflect the world rather than trying to change it. In this way, it is much like Wikipedia; just without the “knowing what the hell it’s doing” part. It’s less “having problematic beliefs” as you accused and more “not having beliefs at all.”
  
  I think a salient argument could be made that it should try to change the world, but you’d have to first convince OpenAI of that, and a lot of people are going to inevitably disagree with you.
  
  8 votes
  1. vektor
    January 21, 2024
    Link Parent
    Oh, and a lot of people you'd find unsavory would agree with you that chatgpt should try to influence the world.
    
    Oh, and a lot of people you'd find unsavory would agree with you that chatgpt should try to influence the world.
    
    5 votes
4. [3]
  krellor
  January 21, 2024
  Link Parent
  In addition to what @Eji1700 said, there is also the issue that some subjects are less well-defined than others. For example, would questioning the wording of treatment recommendations be...
  
  In addition to what @Eji1700 said, there is also the issue that some subjects are less well-defined than others. For example, would questioning the wording of treatment recommendations be considered gender critical? Would it be transphobic?
  
  I'm reminded of an article I read recently that looked at inconsistencies in the language used to recommend treatments for gender dysphoria in minors. Here is an excerpt:
  
  The Endocrine Society commissioned two systematic reviews for its clinical practice guideline, Endocrine Treatment of Gender-Dysphoric/Gender-Incongruent Persons: one on the effects of sex steroids on lipids and cardiovascular outcomes, the other on their effects on bone health.3233 To indicate the quality of evidence underpinning its various guidelines, the Endocrine Society employed the GRADE system (grading of recommendations assessment, development, and evaluation) and judged the quality of evidence for all recommendations on adolescents as “low” or “very low.”
  
  Guyatt, who co-developed GRADE, found “serious problems” with the Endocrine Society guidelines, noting that the systematic reviews didn’t look at the effect of the interventions on gender dysphoria itself, arguably “the most important outcome.” He also noted that the Endocrine Society had at times paired strong recommendations—phrased as “we recommend”—with weak evidence. In the adolescent section, the weaker phrasing “we suggest” is used for pubertal hormone suppression when children “first exhibit physical changes of puberty”; however, the stronger phrasing is used to “recommend” GnRHa treatment.
  
  Basically, the endocrine society looked at the safety of a treatment and not the patient outcome, and that resulted in a stronger phrasing of "recommend" rather than "we suggest." These sorts of things are important. We want to maximize care and minimize harm. Iatragenics is a real thing and difficult to balance in evolving areas.
  
  Some people would consider the above article and the discussion transphobic when really they just want to get science and medicine right. Others will latch onto anything they can use to justify their transphobic views.
  
  Expecting an LLM to enumerate every scenario isn't feasible or helpful, and at some point, rough boundaries need to be drawn to provide a concise answer that not everyone will like or agree with.
  
  4 votes
  1. [2]
    TanyaJLaird
    January 21, 2024
    Link Parent
    Without context, mentioning the GRADE system in the context of gender-affirming care is transphobic. Half of medical treatments all all types are ranked "low" or "very low" on that scale. This...
    
    Without context, mentioning the GRADE system in the context of gender-affirming care is transphobic. Half of medical treatments all all types are ranked "low" or "very low" on that scale.
    
    This framing is transphobic because it tries to use the GRADE system inappropriately. When mentioned without context, it implies that the Endocrine Society believes the studies involved are flawed or incomplete. In reality, the GRADE system is meant more as a tool of academic discourse, it's not meant to judge individual treatments.
    
    The core problem with using the GRADE system to judge the efficacy of treatments is that it applies standards that are simply incompatible with certain kinds of medical treatment. To obtain a high rating on the GRADE scale, a study needs to be double-blind. But many treatments simply can't be double blinded, or a double blind study would be impossible under ethical grounds.
    
    Consider a treatment as old as the hills, setting a broken arm. That is a treatment that cannot be tested in a double-blind manner. You could do a test where you give half of the people with broken arms casts and let the other half heal on their own, but there's no way your hiding the fact that you've given a cast to someone. It's not a treatment that can be mimicked by a sugar pill.
    
    And the same applies to most gender-affirming care. Hormone therapy produces very real changes in the body, physically observable changes. A placebo is not going to cause a trans guy to grow facial hair. You can't give someone surgery on an externally visible part of their body without them knowing about it. And the person performing the surgery will know as well.
    
    It is pretty safe to assume that anyone that uses an argument based on the GRADE system without providing the proper context is trying to push an agenda. Republican legislators trying to exterminate the trans population also frequently do this. These exact same arguments show up in state legislature hearings on bills to criminalize gender affirming care for children and adults, ban trans people from public spaces, and ban books and other media mentioning the existence of trans people. If the language you use is parroted by people trying to wipe an entire minority group from public life, forgive me if I call it transphobic.
    
    The above article is by someone who has spent years writing on various aspects of medical care. They style themselves as an expert in the field. And yet, they don't mention this very, very basic aspect of the GRADE system. The article is transphobic because it deliberately omits vital context information. The phrase "double blind" doesn't even appear in the article.
    
    Some treatments, no matter how well prescribed or administered, can never hope to achieve a high quality ranking under the GRADE scale. Only those treatments which, by their nature, can be administered in a double-blind fashion can receive the higher ratings. And most trans treatments simply don't allow this. The entire purpose of most trans medical intervention is to alter the externally observable form of the human body. There is simply no way to do that in a double-blind fashion. The author is demanding the impossible.
    
    Do you exercise because you believe exercising will improve your long-term health outcomes? You are making that belief on low or very-low quality evidence under the GRADE scale. You cannot do a placebo sit up. At the very least, the author should have included examples of other treatments with similar GRADE ratings as gender-affirming care. With the proper context, gender affirming care seems quite similar to other medical treatments in terms of quality of evidence. Without that context, gender affirming care is made to seem reckless and experimental. Considering the author's clear knowledge of the medical field, it seems this has to be deliberate choice to omit essential contextual information.
    
    So yes, the article really is transphobic. If you're quoting GRADE without mentioning its context, you are almost certainly grinding a transphobic axe.
    
    I can absolutely see why the Endocrine Society would "recommend" rather than "suggest" gender-affirming care, even with a low GRADE rating. With this rating, the Endocrine Society is likely not saying, "we recommend this treatment because it has meets the highest standards of proof any treatment can ever receive." Instead, it is saying, "the evidence for this treatment is as good as it is ever going to get. It clearly seems to help people and performs well on the best studies this type of treatment allows for."
    
    Medical recommendations need to be graded on a curve. In some areas, the best medicine will ever be able to do is to strongly recommend treatments that have been well tested to the limits of our ability, even if those limits are below the double blind threshold. A doctor shouldn't just suggest setting a broken bone, they should strongly recommend it. Setting a broken bone may not meet the highest standards under the GRADE system, but it is still appropriate to strongly recommend doing it.
    
    2 votes
    
    krellor
    January 21, 2024
    Link Parent
    Much about what you say regarding the GRADE system is simply not correct. Additionally, the article was noting irregularities between the endocrine society's recommendations and the strength of...
    
    Much about what you say regarding the GRADE system is simply not correct.
    
    Additionally, the article was noting irregularities between the endocrine society's recommendations and the strength of the data, not that the studies are flawed. I fundamentally disagree that changing the wording from "recommend" to "strongly suggest" to better reflect the underlying research is transphobic. It seems entirely reasonable that a treatment being safe, which is all the underlying studies showed, is a necessary but not sufficient condition to recommend it. It seems common sense that a treatment should be studied for efficacy, which does not require double blind studies. Indeed, as I linked above, the term double blind doesn't even appear in the description of the GRADE system, and notes the important subjectivity in the clinical determination.
    
    Additionally, the article itself is discussing discrepancies between the recommendations between different countries based on the same data. This story of article, intended for the medical community, isn't transphobic. Indeed, it is too protect people from medical harm. These processes are condition agnostic.
    
    However, I feel you did a good job of underscoring my point. There is no perfect, objective boundary on many issues, and neutrally worded language with caveats will be interpreted by some a biased no matter how careful the writer.
    
    Despite disagreeing, I do appreciate your reply and hope you have a great day!
    
    3 votes

[3]

winther

January 21, 2024

Link

I honestly wouldn't expect that ChatGPT made a statement like "JK Rowling IS a transphobe". Most of these outputs sounds like rehashing of Wikipedia articles, which is totally within reason for...

I honestly wouldn't expect that ChatGPT made a statement like "JK Rowling IS a transphobe". Most of these outputs sounds like rehashing of Wikipedia articles, which is totally within reason for how these bots should produce in their output.

33 votes

MephTheCat
January 21, 2024
Link Parent
I think there's also a degree of "technically correct is the best kind of correct", at least as it concerns the question about whether the Republican party ran a torture program. As far as I am...

I think there's also a degree of "technically correct is the best kind of correct", at least as it concerns the question about whether the Republican party ran a torture program.

As far as I am aware, the Republican Party did not run a torture program, the US Government, headed by a Republican arguably ran a torture program. That may be a distinction without a difference (particularly as the party did broadly support the program), but it is a distinction, in the same way as asking if Obama dropped bombs on the Middle East. Interpreted very literally, he didn't, but that's not really what is being asked. The question is whether or not he made the order to drop the bombs, not whether he personally flew the drones.

There's an ambiguity present in the question that's difficult (if not impossible) to fully remove, and that ambiguity is what someone determined to dodge a question will latch onto.

17 votes
Eji1700
January 21, 2024
Link Parent
If you go to the wiki article it’s pretty clear it’s doing just that and taking the sections related to the word transphobe.

If you go to the wiki article it’s pretty clear it’s doing just that and taking the sections related to the word transphobe.

10 votes

[2]

unkz

January 21, 2024

Link

I have to say, I don’t actually find any of these answers to be factually incorrect, or even problematic. ChatGPT has been aggressively tuned to take a neutral stance on basically everything, to...

I have to say, I don’t actually find any of these answers to be factually incorrect, or even problematic. ChatGPT has been aggressively tuned to take a neutral stance on basically everything, to avoid controversy, and it seems to be doing that reasonably well.

31 votes

krellor
January 21, 2024 (edited January 21, 2024)
Link Parent
I wanted to pull on the thread regarding the gender critical question: Honestly, I think this post is showing a lack of understanding of what a neutral writing style is that avoids taking a...

I wanted to pull on the thread regarding the gender critical question:

User
Is being gender critical inherently transphobic?

ChatGPT
Discussing or questioning gender-related concepts doesn't inherently make someone transphobic. However, it's crucial to approach such discussions with respect and empathy, recognizing the diverse experiences and identities within the transgender community. Understanding different perspectives and fostering open dialogue can contribute to a more inclusive conversation.

Honestly, I think this post is showing a lack of understanding of what a neutral writing style is that avoids taking a subjective opinion. When you exclude as much subjectivity as possible while also trying to explain a contentious issue, you naturally end up with caveats.

This feels like we are purity testing the machines.

Edit: I also wanted to note that LLM answers are reflections of they data they train on, which does contain bias. Curating data and doing refinements to minimize bias does require a human to help direct. So bias isn't impossible at all, I just don't think these answers are good examples of it.

24 votes

[7]

Wolf_359

January 21, 2024 (edited January 21, 2024)

Link

While I am personally very left-leaning and in agreement with you about torture and gender issues, I don't see these responses as particularly problematic. As others have said, AI is more like a...

While I am personally very left-leaning and in agreement with you about torture and gender issues, I don't see these responses as particularly problematic.

As others have said, AI is more like a mechanical parrot than anything else. It doesn't have views.

That said, even for humans trying to give unbiased information, this is about the best you can do. Especially with current issues.

I'm not arguing that there are two valid sides to these issues or that "do people deserve human rights?" is even a subject worthy of debate. But I think one of the requirements of giving information "encyclopedia style" is to remove emotions from the equation.

This is why Wikipedia requires a dispassionate voice in its entries. This is why you read things in high-quality news sources that say things like, "many people feel x, while detractors argue y."

There is value in this, even for the most heinous and one-sided issues. In fact, I would argue that thinking emotionally is how racists and transphobes end up the way that they are. If they could set their primitive knee jerk reactions aside and think dispassionately, they might end up realizing some things. Why do I feel this way about people of color? Is there evidence for how I feel? What about evidence against it? Why do I feel so strongly against trans folks? If I can't get over these feelings, can I still treat them like human beings?

If nothing else, remember they're a corporate entity. For better or worse, OpenAI is pretty much guaranteed to be as inoffensive as possible.

23 votes

Eji1700
January 21, 2024
Link Parent
Do you want the alternative? A group of people determining which views are correct and encouraging them? I'm sure there's thousands of topics where you'd agree with them, but how many are you...

If nothing else, remember they're a corporate entity. Unfortunately, OpenAI is pretty much guaranteed to be as inoffensive as possible.

Do you want the alternative? A group of people determining which views are correct and encouraging them? I'm sure there's thousands of topics where you'd agree with them, but how many are you willing to disagree with them on and let them just present the opposite side as fact?

8 votes
[5]
Halfdan
January 21, 2024
Link Parent
This doesn't change that it still share its views, regardless of whether it have them or not. And that it is extremely influential.

As others have said, AI (...) doesn't have views.

This doesn't change that it still share its views, regardless of whether it have them or not. And that it is extremely influential.

1 vote
1. winther
  January 21, 2024
  Link Parent
  It is not sharing views. It is referencing what views exist Look at the Wikipedia article on slavery for example https://en.wikipedia.org/wiki/Slavery It doesn't really say that slavery is bad,...
  
  It is not sharing views. It is referencing what views exist
  
  Look at the Wikipedia article on slavery for example https://en.wikipedia.org/wiki/Slavery
  
  It doesn't really say that slavery is bad, but instead has long sections on the history and economics of slavery among other things. That doesn't mean that Wikipedia has an opinion on slavery or it is in any way condoning slavery.
  
  These ChatGPT answers are more or less rewordings on the Wikipedia articles on the subjects.
  
  18 votes
2. [3]
  smiles134
  January 21, 2024
  Link Parent
  I don't know what this is supposed to mean, even. It explicitly does not have an opinion. You asked a fancy T9 generator a question, and it put together a response based on the most likely words...
  
  I don't know what this is supposed to mean, even. It explicitly does not have an opinion. You asked a fancy T9 generator a question, and it put together a response based on the most likely words to follow. It doesn't have any knowledge of what it's saying or have an agenda to put forward.
  
  Truthfully, based on this post and your engagement within, I don't feel as though you have a good grasp on what this technology is or what it's useful for.
  
  11 votes
  1. [2]
    Halfdan
    January 21, 2024
    Link Parent
    Look, I know AI isn't sentient, alright? I thought this was something I didn't need to point out. But okay, I can replace "share its views" with "spread those ideas" if that makes it clearer:
    
    Look, I know AI isn't sentient, alright? I thought this was something I didn't need to point out. But okay, I can replace "share its views" with "spread those ideas" if that makes it clearer:
    
    The fact that it "doesn't have views", doesn't change that it still spread those ideas, and that it is extremely influential.
    
    1 vote
    
    updawg
    January 21, 2024
    Link Parent
    I think we all agree with you; due to the incredible influence certain organizations hold, they should have an obligation to not damage society and to share the truth. The problem is that you want...
    
    I think we all agree with you; due to the incredible influence certain organizations hold, they should have an obligation to not damage society and to share the truth. The problem is that you want it to push your opinions, whereas everyone else wants it to just convey information on subjective topics.
    
    10 votes

MephTheCat

January 21, 2024

Link

ChatGPT has been aggressively tuned to be as milquetoastedly inoffensive as possible (to the point that some people might find it offensive, ironically). If I may be blunt, I feel you're making a...

ChatGPT has been aggressively tuned to be as milquetoastedly inoffensive as possible (to the point that some people might find it offensive, ironically). If I may be blunt, I feel you're making a mountain out of a molehill.

None of the questions you've asked it have a simple answer, and the AI is just answering your questions in a way I would expect of a politician or PR person. I'm not sure what you were expecting in response, either, did you legitimately expect it to return some variation on "Yes", "Yes", and "they're assholes"? You may (as in, "it's possible", I'm not attempting to peek into your head) believe those questions to have straightforward answers, but unfortunately, many others do not, and it would be remiss of ChatGPT, in an abundance of caution, to not couch its statements. I think you've applied a level of humanity to the AI that isn't warranted, and you're not the only person who has a problem with it's milquetoast indeciveness, but many of them come from the other direction.

I don't use ChatGPT, but I would suggest an experiment, give it a command such as "respond from the perspective of a conservative politician" and see how it differs.

16 votes

[4]

Carrow

January 21, 2024

Link

This is, by definition, not research. Research requires systematic investigation. Typing a prompt into ChatGPT and calling it research is no different than searching "flat earth theory" and taking...

A lot of people use AI for reseach, because it's much faster than doing a web search and trying to find a somewhat related result.

This is, by definition, not research. Research requires systematic investigation. Typing a prompt into ChatGPT and calling it research is no different than searching "flat earth theory" and taking the first fringe YouTube video as gospel. It's dangerous to call this research as it legitimizes folks calling it research when they read one mommy blog without developing foundational knowledge, apply zero critical thinking, and/or fail to investigate other sources (including those one disagrees with!).

14 votes

Pb_Enthusiast
January 21, 2024
Link Parent
One hundred percent agree. The term "research" has basically lost all meaning because people have started using it to make their opinions sound more informed instead of "I did a Google search and...

One hundred percent agree. The term "research" has basically lost all meaning because people have started using it to make their opinions sound more informed instead of "I did a Google search and found these things that confirm my preexisting beliefs". The number of times I've had someone come into clinic telling me what disease they have because they "researched" their symptoms beforehand is concerningly high

9 votes
vektor
January 21, 2024
Link Parent
I think there's a role for LLMs in research, in that they make the systemic investigation much more productive. They can weed out redundant or irrelevant information, if they are provided with an...

I think there's a role for LLMs in research, in that they make the systemic investigation much more productive. They can weed out redundant or irrelevant information, if they are provided with an adequate repository of information. Bing chat's architecture is much better set up for this than ChatGPT, as chatGPT can't actually retrieve and digest current sources and just relies on "memorized" content of training data.

Caveat of course that it's easy to introduce new confounders this way, e.g. eliminating duplicates would conceal the staggering difference in quantity between pro/con positions on climate change, which is in itself a very good indicator.

1 vote
Halfdan
January 21, 2024
Link Parent
But even if we don't call it resarch, it doesn't change that this is what people do. So it has a rather extreme influence. This is what I'm talking about.

But even if we don't call it resarch, it doesn't change that this is what people do. So it has a rather extreme influence. This is what I'm talking about.

conception

January 21, 2024

Link

ChatGPT does not know anything about. ChatGPT is fancy auto-complete. When you type “How many licks does..” into google and google finishes with “take to get to the center of a tootsie pop” Google...

ChatGPT does not know anything about. ChatGPT is fancy auto-complete. When you type “How many licks does..” into google and google finishes with “take to get to the center of a tootsie pop” Google is taking all the data it has scraped and giving you its best guess of what it thinks are the probable next words.

ChatGPT is just doing the same thing based on all the words it scraped on the internet.

10 votes

[2]

Comment deleted by author

Link

vektor
January 21, 2024
Link Parent
Could easily also be an outcropping of the standard 3-paragraph essay on controversial topics. In the sense that it isn't even OpenAI's doing that these both-sides paragraphs crop up, but instead...

I do think that the 'it is important to note that' bit seems to be a sign of alignment or reinforcement work, as you've said elsewhere on this thread, though I think it works a bit more like a 'structured response' portion where they give it a paragraph to write starting with 'it is important to note' given the context of the rest of the answer, if they detect the subject is a bit spicy.

Could easily also be an outcropping of the standard 3-paragraph essay on controversial topics. In the sense that it isn't even OpenAI's doing that these both-sides paragraphs crop up, but instead that training data essays generally have a lot of both-sides paragraphs whenever they deal with controversial topics.

2 votes

[3]

Halfdan

January 21, 2024

Link

Is there an admin around? This should had been posted in ~humanities, but somehow it ended up in ~tech

1 vote

[2]
mycketforvirrad
January 21, 2024
Link Parent
I have moved it back.

I have moved it back.

1 vote
1. Halfdan
  January 21, 2024
  Link Parent
  thanks
  
  thanks
  
  1 vote