Your tasks will involve writing adversarial prompts to identify weaknesses in various cutting-edge AI models, including Large Language Models (LLMs), Text-to-Image, Text-to-Video, and Multi-Modal models, AI Agents, and beyond. You'll also manage and analyze datasets to ensure high-quality outputs and actionable insights that contribute to AI safety research.
Key Responsibilities:
* Design adversarial prompts to test AI systems across multiple modalities.
* Identify, categorize, and document model weaknesses or unsafe outputs.
* Support data annotation, curation, and quality control processes.
* Summarize findings into structured reports or data templates.
Requirements:
* Proven experience with Generative AI models is essential, though direct technical experience is not a prerequisite.
* Understanding of risk taxonomies (e.g., harm categories, policy tiers).
* Command of English at a near-native level.
* Strong attention to detail and organizational skills.
* Ability to manage multiple tasks simultaneously and meet deadlines.
Additional Wants:
* Familiarity with various model types (Text-to-Text, Text-to-Image) is desirable.
* Experience with prompt injection, jailbreaks, and red-teaming techniques.
* Prior work in model evaluation, prompt engineering, or safety analysis.
* Regional expertise or cultural fluency in specific geopolitical areas.
This position is open to all candidates.







