8-3 Meta AI: The Llama 3 Model

Learning Objectives

Use a Python program to download the Llama 3 model from the Hugging Face platform, ask it a simple question, and get an answer back.

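What Is Llama 3?

Llama 3 is a large language model developed by Meta. It works like an AI brain that can understand text, write articles, and answer questions. It was trained on a vast amount of text from the internet, from which it learned to converse naturally and fluently.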

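What Llama 3 Can Do

1. Essay and text writing: Enter part of a text, and the model completes the rest automatically.

2. Question answering: It can answer questions about history, science, mathematics, and everyday life.

3. Code example generation: It can provide simple examples of programming syntax.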

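Getting Started

1. Go to the Hugging Face Llama 3 page (https://huggingface.co/collections/meta-llama/meta-llama-3-66214712577ca38149ebb2b6) and select meta-llama/Meta-Llama-3-8B-Instruct.

2. Fill in the information required for the access request, then wait for approval.

3. After you submit the form, you can check the status of your request on the settings page.

4. When access is granted, the following screen appears.

5. The following sample code has Llama 3 generate an answer to the question "What is a cat? Describe in one sentence."

(1) Replace the token with the Hugging Face token you generated earlier.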

import transformers
import torch
from huggingface_hub import login

# Log in to Hugging Face
login(token="hf_XXXXXXXXXXXXXXXX")  # Replace with your own Hugging Face token

# Model name
model_id = "meta-llama/Meta-Llama-3-8B-Instruct"

# Create a pipeline for text generation
pipeline = transformers.pipeline(
    "text-generation",  # Specify the task type as text generation
    model=model_id,  # Specify the model ID
    model_kwargs={
        "torch_dtype": torch.bfloat16,  # Set the data type of the model (bfloat16)
        "cache_dir": "./model",  # Set the path to store the model (default is ~/.cache/huggingface)
    },
    device_map="auto",  # Automatically select device (CPU or GPU)
)

# Set up the conversation messages, with roles and content
messages = [
    {"role": "user", "content": "What is a cat? Describe in one sentence."},  # User asks a question
]

# Define end-of-sequence tokens to identify where generation should stop
terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]

# Use the pipeline to generate text
outputs = pipeline(
    messages,  # Input messages
    max_new_tokens=256,  # Maximum number of new tokens to generate
    eos_token_id=terminators,  # List of end-of-sequence token IDs
    do_sample=True,  # Enable random sampling for generation
    temperature=0.6,  # Controls randomness of generation
    top_p=0.9,  # Only consider tokens with cumulative probability up to 0.9
)

# Print the result
print(f'result: {outputs[0]["generated_text"][-1]["content"]}')
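The sample program passes temperature=0.6 and top_p=0.9 without showing what they do to the model's next-token choice. The following is a simplified, pure-Python sketch of temperature scaling followed by top-p (nucleus) filtering. It is an illustration only, not the code that transformers runs internally; the function names and the four-token toy_logits list are hypothetical.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their cumulative probability reaches top_p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    # Renormalize so the surviving probabilities sum to 1
    return {i: probs[i] / total for i in kept}


def sample_token(logits, temperature=0.6, top_p=0.9, rng=random):
    """Draw one token index from the filtered distribution."""
    probs = softmax_with_temperature(logits, temperature)
    candidates = top_p_filter(probs, top_p)
    ids = list(candidates)
    weights = [candidates[i] for i in ids]
    return rng.choices(ids, weights=weights, k=1)[0]


# Hypothetical scores for a four-token vocabulary (not real model output)
toy_logits = [2.0, 1.0, 0.5, -1.0]
probs = softmax_with_temperature(toy_logits, 0.6)
kept = top_p_filter(probs, 0.9)
print(sorted(kept))              # token indices that survive the top-p cut
print(sample_token(toy_logits))  # one randomly drawn surviving index
```

With these toy scores only the two most likely tokens survive the cut, so the two least likely tokens can never be sampled. Raising temperature or top_p widens the pool of candidates and makes the output more varied.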

6. When you run the program, it downloads and loads the model. Because the model is not optimized, generation is relatively slow, so expect some waiting time.
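A rough calculation helps explain why the download and load take a while. The sketch below estimates the size of the model weights alone, assuming that "8B" means roughly 8 billion parameters and that each bfloat16 value takes 2 bytes. The helper name is hypothetical, and the figures ignore activations and other runtime overhead.

```python
def weight_memory_gib(num_parameters, bytes_per_parameter):
    """Approximate size of the model weights in GiB (weights only)."""
    return num_parameters * bytes_per_parameter / 2**30


APPROX_PARAMS = 8_000_000_000  # assumption: "8B" read as 8 billion parameters
BYTES_BF16 = 2                 # bfloat16 stores each value in 16 bits
BYTES_FP32 = 4                 # float32 shown for comparison

bf16_gib = weight_memory_gib(APPROX_PARAMS, BYTES_BF16)
fp32_gib = weight_memory_gib(APPROX_PARAMS, BYTES_FP32)

print(f"bfloat16: about {bf16_gib:.1f} GiB")  # about 14.9 GiB
print(f"float32:  about {fp32_gib:.1f} GiB")  # about 29.8 GiB
```

Under these assumptions, loading in bfloat16 as the sample program does needs about half the space that float32 would.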

7. After running it, you will see a response similar to the following:

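Once it works, change the input question in the program and use Llama 3 for writing, problem solving, and programming practice!

For example, you could ask questions such as the following:

1. What are Newton's three laws of motion?

2. In what year did the Opium War take place?

3. Translate "An apple a day keeps the doctor away" into Chinese.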
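One way to try several questions in a single run is to build a separate message list for each one. The sketch below prepares those lists. The build_messages helper and the "Answer concisely." system prompt are hypothetical additions, not part of the original sample program. The commented lines show how each list could be passed to the pipeline and terminators objects from that program.

```python
def build_messages(question, system_prompt=None):
    """Build a chat message list in the role/content format used by the sample program."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": question})
    return messages


questions = [
    "What are Newton's three laws of motion?",
    "In what year did the Opium War take place?",
    "Translate 'An apple a day keeps the doctor away' into Chinese.",
]

# Hypothetical system prompt, added only to illustrate the optional argument
batches = [build_messages(q, system_prompt="Answer concisely.") for q in questions]

for messages in batches:
    print(messages)
    # With the pipeline from the sample program already loaded, each list
    # could be sent to the model like this:
    # outputs = pipeline(messages, max_new_tokens=256, eos_token_id=terminators)
    # print(outputs[0]["generated_text"][-1]["content"])
```

Creating the pipeline once and reusing it for every question avoids reloading the model, which is the slow part of the program.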

References:

meta-llama/Meta-Llama-3-8B-Instruct · Hugging Face