Turing Is Hiring a Remote AI Quality Analyst
Join us as an AI Quality Analyst to evaluate and enhance personalized AI features, blending creativity with analytical expertise in a dynamic, remote environment. 🚀
Role Overview
As an AI Quality Analyst, you will assess a new personalization feature for Gemini. Your task is to evaluate how effectively the model leverages information from your past Gemini conversations, Gmail, Google Search, and YouTube activity to generate more relevant and helpful responses. This role uniquely combines creativity with analytical rigor, requiring the design of prompts from personal experiences and detailed evaluation of the model’s responses based on metrics like Grounding, Integration, and Helpfulness.
Key Responsibilities
- Design multi-turn prompts (typically 1-5 turns) that lead the AI to draw on your personal information and experiences.
- Evaluate model responses for ground-truth accuracy, relevance, and personalization quality.
- Assess Grounding: Ensure claims about you are supported by evidence and free from hallucinations or flawed inferences.
- Check Integration: Confirm personal data is woven naturally into responses without robotic over-narration.
- Compare responses side-by-side (SxS): Rank which response is more helpful, natural, and user-friendly, providing clear rationales.
- Extract and verify Debug Info to confirm proper utilization of chat summaries and data sources.
- Maintain data hygiene by securely deleting evaluation conversations to protect future chat history.
Key Qualifications
- Korean Proficiency: Ability to read and write in Korean at a high level, as Korean is the focus language.
- Personal Account Usage: Willingness to use your primary Google account with enabled personal data sources for genuine testing.
- Schedule Flexibility: Full-time availability aligned with your local time zone; capable of supporting our global, 24-hour operations.
- Exceptional Analytical Thinking: Skill in evaluating nuanced AI responses and personalization quality.
- Creative Prompt Engineering: Experience designing multi-turn prompts based on personal context to thoroughly test the model.
- Strong Evaluation Acumen: Knowledge of personalization concepts and ability to identify errors or forced connections.
- Meticulous Attention to Detail: Skilled in spotting subtle differences in naturalness and over-narration during response comparison.
- Excellent Communication: Ability to write clear, concise, and structured rationales with explicit turn number references.
- Constructive Feedback & Collaboration: Proficient in providing detailed annotations and working effectively in teams.
- Independence: Self-motivated, capable of working remotely without supervision.
- Technical Setup: Reliable desktop/laptop with a good internet connection.
Role Details
This position involves evaluating AI responses, ensuring data integrity, and providing actionable feedback to improve personalization features.
Qualifications & Engagement
- Education: BS/BA or equivalent in Policy, Law, Ethics, Linguistics, Journalism, Computer Science, or related fields.
- Experience: Prior work in data annotation, AI quality evaluation, content moderation, or related roles is highly preferred.
- Commitment: At least 4 hours per day, a minimum of 30 hours per week, with a 4-hour overlap with PST.
- Engagement Duration: 3 months as a contractor, with flexible weekly hours (30 or 40 hours).