Turing

United States

Remote

Posted May 15, 2026

English

Japanese

Turing Is Hiring a Remote AI Quality Analyst

Join Turing as an AI Quality Analyst 🌐

About Turing

Based in San Francisco, California, Turing is a leading research accelerator for frontier AI labs, trusted by global enterprises to deploy advanced AI systems and transform AI from proof of concept to impactful solutions.

We support our customers by:

Accelerating frontier research with high-quality data, cutting-edge training pipelines, and top-tier AI researchers specialized in coding, reasoning, STEM, multilinguality, multimodality, and agents.
Transforming AI into proprietary intelligence that delivers reliable performance, measurable impact, and lasting results on the P&L for enterprises.

Role Overview 🚀

As an AI Quality Analyst, you will evaluate a new personalization feature for Gemini, assessing how well the model uses your past conversations and activity on Gmail, Google Search, and YouTube to provide relevant and helpful responses. This role combines creativity with analytical rigor.

You will:

Design prompts from your personal experiences.
Use your analytical skills to evaluate the quality of personalized responses across dimensions like Grounding, Integration, and Helpfulness.

Key Qualifications 💡

Japanese Proficiency: Ability to read and write in Japanese with high competence, as Japanese is the focus language.
Personal Account Usage: Willing to use your primary personal Google account and enable personal data sources for genuine assessment.
Schedule Flexibility: Full-time availability in your local time zone to join a 24/7 global team.
Exceptional Analytical Thinking: Ability to evaluate nuanced AI responses with a focus on personalization quality.
Creative Prompt Engineering: Experience designing prompts based on personal context for thorough testing.
Strong Evaluation Skills: Ability to understand personalization, identify incorrect inferences, and detect forced connections.
Meticulous Attention to Detail: Skilled at spotting subtle differences in model responses and naturalness.
Excellent Written Communication: Ability to write clear, structured rationales referencing specific conversation turns.
Constructive Feedback: Capable of providing detailed annotations and suggestions.
Team Collaboration: Strong communication skills for effective teamwork.
Independence: Self-motivated with the ability to work remotely and autonomously.
Technical Setup: Desktop or laptop with reliable internet connection.

Responsibilities 📝

In this dynamic role, you will:

Design and execute multi-turn conversational prompts (1-5 turns) leveraging your personal information and experiences.
Evaluate model responses to ensure personalization was correctly applied based on your initial prompt.
Analyze responses for Grounding issues, checking whether claims are supported by evidence and contain no hallucinations.
Assess Integration quality, verifying that personal data is incorporated naturally, without robotic over-narration.
Compare two model responses side-by-side (SxS) to determine which is more helpful and user-friendly, and write clear rationales.
Extract and verify Debug Info to confirm proper use of chat summaries and data sources.
Maintain strict data hygiene by deleting evaluation conversations to prevent data pollution.

Education & Experience 🎓

BS/BA or equivalent experience in fields such as Policy, Law, Ethics, Linguistics, Journalism, Computer Science, or related analytical disciplines.
Experience in data annotation, AI quality evaluation, content moderation, or similar roles is highly preferred.

Additional Details ⏰

Commitment: At least 4 hours per day and 30 hours per week, including a 4-hour overlap with PST.
Time Options: 30 or 40 hours per week.
Engagement Type: Contractor
Duration: 3 months
