4

What is the difference between instruction tuning and normal fine-tuning for large language models?

Also the instruction-tuning I'm referring to isn't the in-context/prompt one.

All the recent papers about fine-tuning seem to be about instruction tuning.

I have looked at a couple of papers about fine-tuning/instruction tuning (e.g. FLAN) and none really describe the difference between instruction tuning and the alternatives (whatever the alternatives are).

I understand instruction-tuning is a form of fine-tuning but with an instruction dataset. But are all datasets not instruction datasets? What other kinds are there?

Flo
  • 51
  • 1
  • 4
  • 1
    If my answer solves your question, could you mark it as solution, please? And otherwise, could you please give feedback on what is missing? – Bernhard Stadler Jun 26 '23 at 15:10

1 Answers1

5

As you said, fine-tuning and instruction tuning are not mutually exclusive, but instruction tuning is a form of (supervised) fine-tuning, so there is no distinguishing feature of fine-tuning that differentiates it from instruction tuning, but only the other way around. So the answer to your first question is "No" (I read it as "Is every dataset an instruction dataset?", not as "Do instruction datasets even exist?").

What is special about instruction tuning is that the model is fine-tuned for an instruction-following task, which involves instructing the instruction receiver to perform another task, i.e. you have a second "level" of tasks (e.g. "Split the following number into digits") that is defined only in the instructions, which are part of the model's input sequence.

In classical types of supervised fine-tuning, you have no instructions, but directly tune the model to perform a single downstream task, e.g. to split an input number into digits, without being explicitly told to do so in the model input. (However, there are also hybrid approaches that involve both fine-tuning and explicit instructions.)

So although the word "task" is often used to refer to either, it is essential to conceptually distinguish between:

  • the task the model is fine-tuned to (if at all),
  • the task the end-user wants the model to perform, and
  • the way inputs for either of these tasks are presented to the model
  • (and also the corresponding datasets and statistical distributions)

Your confusion might be connected to the fact that prompting, which is another widespread adaptation technique, can involve an abstract description of the task (e.g. in zero-shot learning), which can be formulated as an instruction.

But again, this is not necessary: Few-shot prompting does not necessarily involve an abstract description of the task, but the prompt may consist only of input-output examples of the task, plus the input for which the model should create an output.

To answer your second question: You can find many datasets/benchmarks on the Hugging Face Hub. If you randomly click at a few of them, you will see in the preview that most of them don't contain any instructions.

Bernhard Stadler
  • 1,725
  • 14
  • 24