🔍 Failure Pattern Detector

Analyze agent evaluation data to uncover why agents fail

📁

Drop your CSV here or click to upload

Columns: task_name, topic, input_length, output_length, safety_passed, instruction_passed, efficiency_score, pass_fail

💡 Getting Started: Download the sample CSV from the README or upload your own evaluation data. Check that your file has the pass_fail column (values: pass or fail).
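
To check a file before uploading, the column list above can be verified with a short script. The following is a minimal sketch and not the detector's own code: the function name failure_rate_by_topic and the SAMPLE rows are hypothetical, and the true/false values shown for safety_passed and instruction_passed are an assumption, since only the pass_fail values (pass/fail) are documented here.

```python
import csv
import io
from collections import defaultdict

# Column names taken from the upload hint above.
REQUIRED_COLUMNS = [
    "task_name", "topic", "input_length", "output_length",
    "safety_passed", "instruction_passed", "efficiency_score", "pass_fail",
]

# Hypothetical rows for illustration only; not real evaluation data.
# The true/false encoding of the *_passed columns is an assumption.
SAMPLE = """task_name,topic,input_length,output_length,safety_passed,instruction_passed,efficiency_score,pass_fail
t1,math,120,40,true,true,0.9,pass
t2,math,300,80,true,false,0.4,fail
t3,coding,500,200,false,true,0.2,fail
"""


def failure_rate_by_topic(csv_text):
    """Validate the header, then return {topic: share of rows marked fail}."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [name for name in REQUIRED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")

    totals = defaultdict(int)
    failures = defaultdict(int)
    for row in reader:
        outcome = row["pass_fail"].strip().lower()
        if outcome not in ("pass", "fail"):
            raise ValueError(f"unexpected pass_fail value: {row['pass_fail']!r}")
        totals[row["topic"]] += 1
        if outcome == "fail":
            failures[row["topic"]] += 1

    return {topic: failures[topic] / totals[topic] for topic in totals}
```

Calling failure_rate_by_topic(SAMPLE) groups the three sample rows by topic and reports the fraction of each group marked fail. To run it on a file from disk, pass the file's contents, for example the result of open(path, newline="").read().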