2026-05-25 · agents · scaling · community code

When Gradients Collide: Failure Modes of Multi-Objective Prompt Optimization for LLM Judges

Parth Darshan, Abhishek Divekar

Key claim

Multi-task optimization can degrade performance in LLM customization.

This paper explores the challenges of customizing large language models for specific tasks using textual gradient methods. A key result is that combining multiple task instructions into a single prompt can significantly degrade performance, highlighting the need for careful design in multi-objective optimization.

Novelty

6.5/10

The paper introduces a new approach to optimizing prompts for multi-objective tasks, which is a meaningful extension of existing methods.

Reliability

7.0/10

The findings are supported by empirical results, although the evaluation could be more comprehensive.

Deep reliability assessment

The methodology supports the identification of failure modes in multi-objective prompt optimization for LLM judges, but the paper may overclaim how far these findings generalize beyond the specific configurations tested.

Reproducibility

Yes. The authors state that their code and diagnostics will be released under an open-source license to support reproducibility.

Discussion questions

1. What assumptions about the independence of evaluation criteria are challenged by the findings?
2. How can builders effectively mitigate the identified failure modes in practical applications?
3. What experimental conditions would need to change to invalidate the observed degradation in performance?

Key figure

Figure 1 illustrates the four-stage optimization pipeline for multi-criteria prompt optimization, detailing the roles of the task model, loss LLM, gradient LLM, and optimizer LLM.
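
The four roles named in the figure can be sketched as a simple loop. This is a minimal illustration of a generic textual-gradient pipeline, not the authors' implementation: the function name `optimize_prompt`, the `LLM` callable type, the `criteria` argument, and every prompt template below are assumptions made for the example. Each "LLM" is just any function that maps a string to a string, so the sketch runs with stubs.

```python
from typing import Callable, List

# Hypothetical type: any function that takes a prompt string and returns text.
LLM = Callable[[str], str]


def optimize_prompt(
    prompt: str,
    examples: List[str],
    criteria: List[str],
    task_model: LLM,
    loss_llm: LLM,
    gradient_llm: LLM,
    optimizer_llm: LLM,
    steps: int = 3,
) -> str:
    """Illustrative textual-gradient loop (assumed structure, not the paper's code)."""
    for _ in range(steps):
        gradients: List[str] = []
        for example in examples:
            # Stage 1: the task model answers under the current prompt.
            output = task_model(f"{prompt}\n\nInput: {example}")
            for criterion in criteria:
                # Stage 2: the loss LLM critiques the output for one criterion.
                critique = loss_llm(
                    f"Criterion: {criterion}\nInput: {example}\n"
                    f"Output: {output}\nCritique the output."
                )
                # Stage 3: the gradient LLM turns the critique into
                # feedback about the prompt itself (a "textual gradient").
                gradients.append(
                    gradient_llm(
                        f"Prompt: {prompt}\nCritique: {critique}\n"
                        "How should the prompt change?"
                    )
                )
        # Stage 4: the optimizer LLM rewrites the prompt from all feedback.
        # Feedback from different criteria is merged here, which is where
        # conflicting suggestions would meet in a single update.
        prompt = optimizer_llm(
            f"Current prompt: {prompt}\nFeedback:\n"
            + "\n".join(gradients)
            + "\nReturn the improved prompt."
        )
    return prompt
```

Passing one criterion at a time versus the full `criteria` list gives a rough way to compare single-objective and multi-objective updates under the same loop, again under the assumptions stated above.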

GitHub (1 repo)

adivekar-utexas/when-gradients-collide (Community)