ChengCheng Tan

Senior Communications Specialist

ChengCheng is a Senior Communications Specialist at FAR.AI. She loves working with artificial intelligence, is passionate about lifelong learning, and is devoted to making technology education accessible to a wider audience. She brings more than 20 years of consulting experience in UI/UX research, design, and programming. She holds an MS in Computer Science from Stanford University, where she specialized in Human-Computer Interaction, was advised by Terry Winograd, and was funded by the NSF. Prior to this, she studied computational linguistics at UCLA and graphic design at the Laguna College of Art + Design.

News & Publications

Illusory Safety: Redteaming DeepSeek R1 and the Strongest Fine-Tunable Models of OpenAI, Anthropic, and Google (February 4, 2025)
Bay Area Alignment Workshop 2024 (December 10, 2024)
GPT-4o Guardrails Gone: Data Poisoning & Jailbreak-Tuning (October 31, 2024)
Data Poisoning in LLMs: Jailbreak-Tuning and Scaling Laws
Beyond the Board: Exploring AI Robustness Through Go (December 21, 2023)
Can Go AIs be adversarially robust?
Pacing Outside the Box: RNNs Learn to Plan in Sokoban (July 24, 2024)
Planning behavior in a recurrent neural network that plays Sokoban
Vienna Alignment Workshop 2024 (September 10, 2024)
NOLA Alignment Workshop 2023 (February 7, 2024)
We Found Exploits in GPT-4’s Fine-tuning & Assistants APIs (December 21, 2023)
Exploiting Novel GPT-4 APIs
Uncovering Latent Human Wellbeing in LLM Embeddings (September 12, 2023)
Uncovering Latent Human Wellbeing in Language Model Embeddings (February 19, 2024)
VLM-RM: Specifying Rewards with Natural Language (October 19, 2023)
Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning
