$15/hour · United States · Remote · Posted May 6, 2026
Languages: English, Polish
Crossing Hurdles Is Hiring an AI Quality Evaluator
We are seeking a dedicated AI evaluation specialist to assess and enhance the personalization quality of AI model responses. Join our team to help refine AI interactions through detailed analysis and collaborative improvements.
📝 Responsibilities
- Evaluate AI model responses for personalization quality, including grounding, integration, and helpfulness.
- Design and execute multi-turn prompts based on personal context to test AI capabilities.
- Analyze responses for hallucinations, incorrect personalization, and poor inferences.
- Perform side-by-side comparisons of model outputs to determine quality and effectiveness.
- Write clear and structured rationales for response evaluations and rankings.
- Extract and verify debug information to ensure proper use of data sources.
- Maintain strict data hygiene and ensure accurate documentation of evaluations.
- Collaborate with cross-functional teams to improve AI model performance.
🎯 Requirements
- Strong proficiency in Polish with excellent reading and writing skills.
- Experience in data annotation, AI evaluation, content moderation, or a related role.
- Strong analytical thinking with the ability to assess nuanced AI responses.
- Ability to design creative, multi-turn prompts based on personal context.
- Understanding of personalization concepts, including identifying incorrect or forced personalization.
- High attention to detail in evaluating subtle differences in model outputs.
- Excellent written communication and structured reasoning skills.
- Ability to work independently in a remote environment.
- Willingness to use a personal Google account for evaluation purposes.
- Full-time availability with at least 4 hours of overlap with PST.
- Bachelor’s degree or equivalent experience in a relevant analytical field.