8-9 DeepSeek-R1 Model

Learning Objectives

Use a Python program to download a DeepSeek-R1 model (the distilled DeepSeek-R1-Distill-Qwen-1.5B version) from the Hugging Face platform, then use simple prompts to ask it questions and get answers.

What Is DeepSeek-R1?

DeepSeek-R1 is a large language model developed by DeepSeek. It is like an AI brain that can understand text, write articles, answer questions, and generate responses after thoughtful reasoning.

What Can DeepSeek-R1 Do?

1. Provide automated customer service

2. Perform summarization or translation

3. Assist with programming

4. Analyze documents, contracts, and reports

How to Get Started?

1. The following example code makes DeepSeek-R1 generate an answer to the prompt: "Please briefly explain the concept of quantum entanglement."

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(
    model_id,
    trust_remote_code=True,  # Allow custom code from the model repository to run
    cache_dir="./model",  # Model cache directory; the default is ~/.cache/huggingface
)

# Load model
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,  # Allow custom code from the model repository to run
    torch_dtype=torch.float16,  # Load the weights in half precision (float16)
    device_map="auto",  # Automatically select device (CPU or GPU); requires the accelerate package
    cache_dir="./model",  # Same cache directory as the tokenizer
)

# Create a text generation pipeline
generator = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=512,  # Maximum number of new tokens to generate (the prompt is not counted)
    do_sample=True,  # Enable sampling so that temperature and top_p take effect
    temperature=0.6,  # Controls randomness of generation
    top_p=0.95,  # Sample only from the most likely tokens whose cumulative probability reaches 0.95
    repetition_penalty=1.1,  # Penalty to reduce repeated content during generation
)

# Generate text and print it
prompt = "Please briefly explain the concept of quantum entanglement."
outputs = generator(prompt, num_return_sequences=1)
print(outputs[0]["generated_text"])
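The temperature and top_p settings used in the pipeline can be illustrated on a toy example. The sketch below is a simplified illustration in plain Python, not the code the transformers library actually runs. The function names and the four scores in toy_logits are made up for this example.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their cumulative probability reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    kept_total = sum(probs[i] for i in kept)
    return {i: probs[i] / kept_total for i in kept}


# Hypothetical scores for four candidate tokens (not real model output)
toy_logits = [2.0, 1.0, 0.5, -1.0]
probs = softmax_with_temperature(toy_logits, temperature=0.6)
candidates = top_p_filter(probs, top_p=0.95)
print(candidates)  # the least likely token (index 3) is filtered out
```

With these toy numbers, the three most likely tokens already cover more than 95% of the probability, so the fourth is never sampled. Raising the temperature flattens the distribution and makes less likely tokens more competitive.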

2. After running it, the program prints the prompt followed by the model's generated response. The exact text may differ from run to run.
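Because DeepSeek-R1 reasons before it answers, the printed text can be long. The sketch below separates the reasoning from the final answer. It assumes the reasoning ends with a `</think>` tag, which may not always appear when a raw prompt is used. The helper name split_reasoning and the sample string are made up for illustration and are not real model output.

```python
def split_reasoning(generated_text, prompt="", close_tag="</think>"):
    """Return (reasoning, answer) extracted from the generated text."""
    text = generated_text
    if prompt and text.startswith(prompt):
        text = text[len(prompt):]  # the pipeline echoes the prompt by default
    reasoning, found, answer = text.partition(close_tag)
    if not found:
        return "", text.strip()  # no closing tag: treat everything as the answer
    reasoning = reasoning.replace("<think>", "", 1).strip()
    return reasoning, answer.strip()


# Made-up sample text for illustration only
sample = "Q?<think>\nLet me recall the definition.\n</think>\nEntangled particles share one quantum state."
reasoning, answer = split_reasoning(sample, prompt="Q?")
print("Reasoning:", reasoning)
print("Answer:", answer)
```

In the program above, the same helper could be called as split_reasoning(outputs[0]["generated_text"], prompt=prompt).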

Reference:

deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B · Hugging Face