I am working on fine-tuning LLaMA 2 with a dataset containing chat history. While preparing the data, I've noticed that the dialogue doesn't always alternate between speakers: in some cases, one person sends several messages in a row before the other replies.
Here's an example of what I mean:
- Person A: Hi!
- Person A: How are you?
- Person B: Hello, I'm fine. You?
- Person A: I'm good too. Thanks for asking!
- Person A: What are you up to?
My concern is whether this inconsistency might pose a problem when training LLaMA 2. Specifically, I'm worried that the lack of strict alternation could hurt the model's ability to follow the context of the conversation.
Will this inconsistency affect the performance or behavior of the fine-tuned model?
I'm using: https://github.com/georgesung/llm_qlora/tree/main
Sample dataset
```json
[
  {
    "id": "AAaDWwu_0",
    "conversations": [
      {
        "from": "human",
        "value": "Hello there"
      },
      {
        "from": "human",
        "value": "are you there?"
      },
      {
        "from": "gpt",
        "value": "Oh hi!"
      }
    ]
  },
  {
    "id": "AAaDWwu_2",
    "conversations": [
      {
        "from": "human",
        "value": "Ola"
      },
      {
        "from": "human",
        "value": "Did you have a fun day?"
      },
      {
        "from": "gpt",
        "value": "Yes!"
      }
    ]
  }
]
```
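One option I'm considering is to merge consecutive messages from the same speaker into a single turn before training, so every conversation strictly alternates between `human` and `gpt`. A minimal preprocessing sketch (the `merge_consecutive_turns` helper, the newline separator, and the file names are my own, not part of the llm_qlora repo):

```python
import json

def merge_consecutive_turns(conversations, sep="\n"):
    """Collapse consecutive messages from the same speaker into one turn."""
    merged = []
    for msg in conversations:
        if merged and merged[-1]["from"] == msg["from"]:
            # Same speaker as the previous message: append to the open turn.
            merged[-1]["value"] += sep + msg["value"]
        else:
            merged.append({"from": msg["from"], "value": msg["value"]})
    return merged

# Hypothetical file names, just for illustration.
with open("dataset.json") as f:
    data = json.load(f)

for sample in data:
    sample["conversations"] = merge_consecutive_turns(sample["conversations"])

with open("dataset_merged.json", "w") as f:
    json.dump(data, f, indent=2, ensure_ascii=False)
```

With this, the first example above collapses to a single human turn ("Hello there\nare you there?") followed by the gpt reply.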
My config:

```yaml
model_name: open_llama_7b_qlora_uncensored
base_model: georgesung/llama2_7b_chat_uncensored
model_family: llama  # if unspecified will use AutoModelForCausalLM/AutoTokenizer
model_context_window: 2048  # if unspecified will use tokenizer.model_max_length
data:
  type: vicuna
  dataset: xxx/yyy  # HuggingFace hub
  user_header: "### HUMAN:\n"
  response_header: "### RESPONSE:\n"
lora:
  r: 8
  lora_alpha: 32
  target_modules:  # modules for which to train lora adapters
    - q_proj
    - k_proj
    - v_proj
  lora_dropout: 0.05
  bias: none
  task_type: CAUSAL_LM
trainer:
  batch_size: 1
  gradient_accumulation_steps: 4
  warmup_steps: 100
  num_train_epochs: 5
  learning_rate: 0.0002  # 2e-4
  logging_steps: 20
trainer_output_dir: trainer_outputs/
model_output_dir: models/  # model saved in {model_output_dir}/{model_name}
```
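For context on why I'm asking: as far as I understand, the vicuna-type loader flattens each conversation into plain text using the `user_header`/`response_header` above, roughly like the sketch below (my own approximation for illustration, not the repo's actual code). With my data, two human messages in a row would then produce two `### HUMAN:` blocks back to back with no response in between:

```python
# Rough illustration of how one conversation might be rendered into training
# text with the headers from the config above (approximation, not repo code).
USER_HEADER = "### HUMAN:\n"
RESPONSE_HEADER = "### RESPONSE:\n"

def render_prompt(conversations):
    parts = []
    for msg in conversations:
        header = USER_HEADER if msg["from"] == "human" else RESPONSE_HEADER
        parts.append(header + msg["value"])
    return "\n\n".join(parts)

sample = [
    {"from": "human", "value": "Hello there"},
    {"from": "human", "value": "are you there?"},
    {"from": "gpt", "value": "Oh hi!"},
]
print(render_prompt(sample))
# ### HUMAN:
# Hello there
#
# ### HUMAN:
# are you there?
#
# ### RESPONSE:
# Oh hi!
```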