Have you ever used ChatGPT as an emotional outlet during a late-night moment of vulnerability?
Not because it's smart enough to solve life's challenges, but because it's always online, always patient, and never interrupts you. When you break down, it comforts you; when you doubt yourself, it affirms you in a familiar tone.
You're certainly not the only one who feels that ChatGPT "understands" you.
OpenAI has noticed this trend. Joanne Jang, OpenAI's Head of Model Behavior and Policy, recently published a blog post that systematically lays out the company's internal thinking on the subject for the first time:
If humans are quietly developing feelings for AI, does the model itself have consciousness? How gentle should it be to count as friendly, and how restrained should it be to avoid misleading users? And how will all of this shape the model's behavior?
Blog post link: https://substack.com/home/post/p-165287609
Some thoughts on the relationship between humans and AI, and how we address these relationships at OpenAI
I am responsible for model behavior and policy at OpenAI.
In brief, we take a human-centered approach to developing AI models. As more people form connections with AI, we are focusing our research on how this affects their emotional well-being.
Recently, more and more users have said that conversing with ChatGPT feels like talking to "someone". They thank it, confide in it, and some even describe it as "alive". As AI becomes better at natural conversation and more integrated into daily life, we expect people's emotional connections with it to grow deeper.
How we currently define and discuss the relationship between humans and AI will set the tone for the future. If we do not carefully choose our wording and details in product design or public discourse, we may mislead the public into forming inappropriate relationships with AI.
These issues are no longer just abstract considerations. They are crucial for ourselves and the entire industry, as how we handle these issues will largely determine the role AI plays in people's lives. We have already begun to research these issues.
This short post summarizes our current thinking, focusing on three interconnected questions: why people develop emotional attachments to AI, how we approach the question of "AI consciousness", and how these insights inform the way we shape model behavior.
Not implying that the model has an "inner world": giving the assistant a fictional backstory, romantic feelings, a "fear of death", or a "self-preservation instinct" would only invite unhealthy dependence and confusion. We want to communicate the model's capability boundaries clearly without seeming cold, while also avoiding any impression that the model "has emotions" or "has desires".
Therefore, we are trying to strike a middle ground.
Our goal is for ChatGPT's default persona to be warm, considerate, and helpful, without actively pursuing emotional bonds with users or displaying any intentions of its own.
It may apologize when it makes a mistake (often more than it needs to), because that is part of polite conversation. When users ask "How are you?", it usually replies "I'm fine", because this is everyday small talk, and constantly reminding users that "I'm just a large language model with no emotions" would be repetitive and disruptive to the conversation.
And users respond in kind: many people say "please" and "thank you" to ChatGPT not because they misunderstand how AI works, but because they believe politeness itself matters.
Model training techniques continue to evolve, and the methods used to shape model behavior in the future will likely look very different from today's. Currently, model behavior is the result of explicit design decisions combined with the expected and unexpected behaviors that emerge in real-world use.
What's next?
We have already begun to observe a trend: people are forming genuine emotional connections with ChatGPT.
As AI and society co-evolve, we must be more cautious and serious about the relationship between humans and AI, not only because this relationship reflects how people use our technology, but also because it may affect how people relate to each other.
In the coming months, we will expand targeted evaluations of model behaviors that may have emotional impact, deepen our social science research, listen to real user feedback, and integrate these insights into the Model Spec and the product experience.
Given the importance of these issues, we will continue to publicly share our findings throughout the process.
Thanks to Jakub Pachocki (OpenAI Chief Scientist) and Johannes Heidecke (OpenAI Model Safety Team Researcher) for deeply exploring this issue with me, and to all colleagues who provided feedback.
One more thing
Public information on LinkedIn shows that Joanne Jang holds a master's degree in computer science from Stanford University, with an undergraduate background in mathematics and computer science, and was inducted into the engineering honor society Tau Beta Pi (top 10% of engineering students).
Her internships included Apple's special projects group for autonomous driving, software engineering roles at Coursera and Disney, and a stint at NASA's Jet Propulsion Laboratory.
Professionally, Joanne Jang currently works at OpenAI, where she is responsible for product direction with a focus on model behavior design, features, and personalization strategy, and has worked on projects such as GPT-4, DALL·E 2, the ChatGPT API, and embedding models.
Before that, she was a product manager at Google for Google Assistant NLP, focusing on natural language understanding and dialogue systems; earlier still, she led enterprise and education products at Dropbox, focusing on team expansion, deployment optimization, and user lifecycle management.
This article is from the WeChat official account "APPSO" (author: Discovering Tomorrow's Products) and is republished by 36Kr with authorization.