Improving Probability-based Prompt Selection Through Unified Evaluation and Analysis

by Sohee Yang et al.

Large Language Models (LLMs) have demonstrated great capabilities in solving a wide range of tasks in a resource-efficient manner through prompting, which requires no task-specific training but suffers from performance fluctuation when there are multiple prompt candidates. Previous works have introduced gradient-free, probability-based prompt selection methods that aim to choose the optimal prompt among the candidates for a given task, but they fail to provide a comprehensive and fair comparison with each other. In this paper, we propose a unified framework to interpret and evaluate the existing probability-based prompt selection methods by performing extensive experiments on 13 common NLP tasks. We find that all existing methods can be unified into some variant of the method that maximizes the mutual information between the input and the corresponding model output (denoted as MI). Using this finding, we develop several variants of MI and increase the effectiveness of the best prompt selection method above its previous 87.79%, measured as the ratio of the performance of the selected prompt to that of the optimal oracle prompt. Furthermore, we propose a novel calibration method called Calibration by Marginalization (CBM) that is orthogonal to existing methods and helps increase the prompt selection effectiveness of the best method, reaching 99.44% of the oracle prompt's performance. The code and datasets used in our work will be released.
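To make the MI criterion concrete, the following is a minimal sketch (not the paper's implementation) of mutual-information-based prompt selection. It assumes that for each candidate prompt we already have the model's label distribution for every input; the prompt whose outputs have high marginal entropy but low per-input conditional entropy scores highest. The function names and the synthetic probabilities are illustrative assumptions.

```python
import numpy as np

def mutual_information(cond_probs):
    """Estimate I(X; Y) for one prompt.

    cond_probs: array of shape (n_inputs, n_labels); row i is the model's
    label distribution P(Y | x_i, prompt). Inputs are treated as uniform, so
    MI = H(mean of rows) - mean of per-row entropies H(Y | x_i).
    """
    marginal = cond_probs.mean(axis=0)
    h_marginal = -np.sum(marginal * np.log(marginal + 1e-12))
    h_conditional = -np.sum(cond_probs * np.log(cond_probs + 1e-12), axis=1).mean()
    return h_marginal - h_conditional

def select_prompt_by_mi(per_prompt_probs):
    """Return the index of the candidate prompt with the highest MI estimate."""
    scores = [mutual_information(p) for p in per_prompt_probs]
    return int(np.argmax(scores)), scores

# Toy example: prompt 0 yields near-uniform (uninformative) outputs,
# prompt 1 yields confident, balanced predictions, so MI prefers prompt 1.
uninformative = np.array([[0.5, 0.5]] * 4)
confident = np.array([[0.95, 0.05], [0.05, 0.95], [0.9, 0.1], [0.1, 0.9]])
best, scores = select_prompt_by_mi([uninformative, confident])
```

Intuitively, a prompt that collapses every input to the same label (or hedges uniformly on all of them) carries little information about the inputs, so MI serves as an unsupervised proxy for task performance without any labeled data.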


