Crossing Hurdles

$15/hour

United States

Remote

Posted May 6, 2026

English

Polish

Crossing Hurdles Is Hiring an AI Quality Evaluator

We are seeking a dedicated AI evaluation specialist to assess and enhance the personalization quality of AI model responses. Join our team to help refine AI interactions through detailed analysis and collaborative improvements.

📝 Responsibilities

Evaluate AI model responses for personalization quality, including grounding, integration, and helpfulness.
Design and execute multi-turn prompts based on personal context to test AI capabilities.
Analyze responses for hallucinations, incorrect personalization, and poor inferences.
Perform side-by-side comparisons of model outputs to determine quality and effectiveness.
Write clear and structured rationales for response evaluations and rankings.
Extract and verify debug information to ensure proper use of data sources.
Maintain strict data hygiene and ensure accurate documentation of evaluations.
Collaborate with cross-functional teams to improve AI model performance.

🎯 Requirements

Strong proficiency in Polish with excellent reading and writing skills.
Experience in data annotation, AI evaluation, content moderation, or a related role.
Strong analytical thinking with the ability to assess nuanced AI responses.
Ability to design creative, multi-turn prompts based on personal context.
Understanding of personalization concepts, including identifying incorrect or forced personalization.
High attention to detail in evaluating subtle differences in model outputs.
Excellent written communication and structured reasoning skills.
Ability to work independently in a remote environment.
Willingness to use a personal Google account for evaluation purposes.
Full-time availability with at least 4 hours of overlap with PST.
Bachelor’s degree or equivalent experience in a relevant analytical field.
