Training the DeepSeek R1 Reasoning Model on Free Resources

AIGC Frontier Technology Tracking (AIGC前沿技術追蹤)

Published 2025-2-26 14:40

DeepSeek has disrupted the AI field, challenging OpenAI's dominance with a series of new advanced reasoning models. Best of all, these models are free to use, without restrictions, and available to everyone. You can watch a video tutorial on fine-tuning DeepSeek below.

In this tutorial, we fine-tune DeepSeek-R1-Distill-Llama-8B on a medical chain-of-thought dataset from Hugging Face. This distilled DeepSeek-R1 model was created by fine-tuning the Llama 3.1 8B model on data generated with DeepSeek-R1, and it shows reasoning capabilities similar to those of the original model.

If you are new to LLMs and fine-tuning, I strongly recommend taking the Introduction to LLMs in Python course.

Introducing DeepSeek R1

The Chinese AI company DeepSeek AI has open-sourced its first-generation reasoning models, DeepSeek-R1 and DeepSeek-R1-Zero, which rival OpenAI's o1 on reasoning tasks such as math, coding, and logic. Visit DeepSeek's official website for more details.

DeepSeek-R1-Zero

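DeepSeek-R1-Zero is the first open-source model trained entirely with large-scale reinforcement learning rather than supervised fine-tuning. This approach lets the model explore chain-of-thought reasoning on its own, solve complex problems, and keep refining its output. It also has drawbacks: it can repeat reasoning steps, produce text that is hard to read, and mix languages, all of which reduce its clarity and usefulness.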

DeepSeek-R1

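DeepSeek-R1 was introduced to address the shortcomings of DeepSeek-R1-Zero. It adds some initial (cold-start) data before reinforcement learning, which gives a better foundation for both reasoning and non-reasoning tasks. This multi-stage training brings the model to a level comparable to OpenAI-o1 on math, code, and reasoning benchmarks, while also improving the readability and coherence of its output.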

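DeepSeek Distillation

Alongside the large language models that need substantial compute and memory, DeepSeek has also released a series of distilled models. These smaller, more efficient models range from 1.5B to 70B parameters and retain strong reasoning performance. Notably, DeepSeek-R1-Distill-Qwen-32B outperforms OpenAI-o1-mini on several benchmarks. The smaller models inherit the reasoning patterns of the larger model, which demonstrates how effective knowledge distillation can be.

Image source: deepseek-ai/DeepSeek-R1

Read the blog post "DeepSeek-R1: Features, o1 Comparison, Distilled Models & More" to learn about its key features, development process, distilled models, access, pricing, and comparison with OpenAI o1.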

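Resources Required for Fine-Tuning

Model	GPU	CPU	Memory	Disk	Time
DeepSeek-R1-Distill-Llama-8B	2 x T4 (15G)	4 cores	32G	200G	23 minutes

What? You think that configuration is too much? Fine, follow along and I will show you how to get it for free!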

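Fine-Tuning DeepSeek R1: A Step-by-Step Guide

To fine-tune the DeepSeek R1 model, follow the steps below.

1. Setup

For this project we use Kaggle as our cloud IDE because it offers free access to GPUs, which are often more powerful than those available in Google Colab. Start a new Kaggle notebook and add your Hugging Face token and your Weights & Biases token as secrets. See the Q&A section at the end of this article for how to obtain the tokens.

You can add secrets by opening the Add-ons tab in the Kaggle notebook interface and selecting the Secrets option.

After setting up the secrets, install the unsloth Python package. Unsloth is an open-source framework designed to make fine-tuning large language models (LLMs) 2x faster and more memory-efficient.

Read the Unsloth Guide: Optimize and Speed Up LLM Fine-Tuning to learn about Unsloth's key features, its various functions, and how to optimize your fine-tuning workflow.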

!pip install unsloth
!pip install --force-reinstall --no-cache-dir --no-deps git+https://github.com/unslothai/unsloth.git

Log in to the Hugging Face CLI using the Hugging Face API token that we securely extract from Kaggle Secrets.

from huggingface_hub import login
from kaggle_secrets import UserSecretsClient
user_secrets = UserSecretsClient()


hf_token = user_secrets.get_secret("HUGGINGFACE_TOKEN")
login(hf_token)
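
Outside Kaggle the kaggle_secrets module cannot be imported, so the snippet above only works in a Kaggle notebook. A minimal sketch of a fallback, assuming you export the token as an environment variable with the same name as the secret. The helper name get_secret_or_env is hypothetical and not part of any library; only UserSecretsClient.get_secret comes from Kaggle.

```python
import os


def get_secret_or_env(name: str) -> str:
    """Return a secret from Kaggle Secrets, else from the environment.

    Hypothetical helper for illustration; not part of kaggle_secrets.
    """
    try:
        # Only importable inside a Kaggle notebook.
        from kaggle_secrets import UserSecretsClient
    except ImportError:
        # Assumption: outside Kaggle the token lives in an environment variable.
        value = os.environ.get(name)
    else:
        value = UserSecretsClient().get_secret(name)
    if not value:
        raise RuntimeError(f"Secret {name!r} is not set")
    return value
```

With this in place, hf_token = get_secret_or_env("HUGGINGFACE_TOKEN") behaves the same on Kaggle and on a local machine.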

Log in to Weights & Biases (wandb) with your API key and create a new project to track the experiment and the fine-tuning progress.

import wandb


wb_token = user_secrets.get_secret("wandb")


wandb.login(key=wb_token)
run = wandb.init(
    project='Fine-tune-DeepSeek-R1-Distill-Llama-8B on Medical COT Dataset', 
    job_type="training", 
    anonymous="allow"
)

2. Loading the Model and Tokenizer

For this project, we load the Unsloth version of DeepSeek-R1-Distill-Llama-8B. We also load the model with 4-bit quantization to optimize memory usage and performance.

from unsloth import FastLanguageModel


max_seq_length = 2048 
dtype = None 
load_in_4bit = True




model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/DeepSeek-R1-Distill-Llama-8B",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    token = hf_token, 
)
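
To see why 4-bit loading matters on a 15G GPU, here is a rough back-of-the-envelope estimate of the memory needed for the weights alone. It assumes about 8 billion parameters and ignores quantization constants, activations, the KV cache, and optimizer state, so treat the numbers as a lower bound rather than a measurement.

```python
def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


n_params = 8e9  # assumption: roughly 8 billion parameters
fp16_gb = weight_memory_gb(n_params, 16)
int4_gb = weight_memory_gb(n_params, 4)
print(f"16-bit weights: ~{fp16_gb:.0f} GB, 4-bit weights: ~{int4_gb:.0f} GB")
```

Under these assumptions the 16-bit weights alone would not fit on a single T4, while the 4-bit weights leave room for the adapters and activations.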

3. Model Inference Before Fine-Tuning

To create a prompt style for the model, we define a system prompt with placeholders for the question and the generated response. The prompt guides the model to think step by step and give a logical, accurate answer.

prompt_style = """Below is an instruction that describes a task, paired with an input that provides further context. 
Write a response that appropriately completes the request. 
Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.


### Instruction:
You are a medical expert with advanced knowledge in clinical reasoning, diagnostics, and treatment planning. 
Please answer the following medical question. 


### Question:
{}


### Response:
<think>{}"""




The original post then redefines prompt_style with a Chinese translation of the same template (same structure and placeholders) to compare the model's output in both languages. If you use a translated template, split the output on its translated response header instead of "### Response:".

In this example, we insert a medical question into prompt_style, convert the prompt to tokens, and pass the tokens to the model to generate a response.

question = "A 61-year-old woman with a long history of involuntary urine loss during activities like coughing or sneezing but no leakage at night undergoes a gynecological exam and Q-tip test. Based on these findings, what would cystometry most likely reveal about her residual volume and detrusor contractions?"




FastLanguageModel.for_inference(model) 
inputs = tokenizer([prompt_style.format(question, "")], return_tensors="pt").to("cuda")


outputs = model.generate(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=1200,
    use_cache=True,
)
response = tokenizer.batch_decode(outputs)
print(response[0].split("### Response:")[1])




[Figure: model output for the English prompt]

[Figure: model output for the Chinese prompt]

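Even without fine-tuning, the model successfully produced a chain of thought and reasoned before giving its final answer. The reasoning is enclosed in <think></think> tags.

So why do we still need fine-tuning? The reasoning is detailed, but it is long-winded rather than concise. In addition, the final answer is presented as a bullet-point list, which departs from the structure and style of the dataset we want to fine-tune on.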

<think>
Okay, so I have this medical question to answer. Let me try to break it down. The patient is a 61-year-old woman with a history of involuntary urine loss during activities like coughing or sneezing, but she doesn't leak at night. She's had a gynecological exam and a Q-tip test. I need to figure out what cystometry would show regarding her residual volume and detrusor contractions.


First, I should recall what I know about urinary incontinence. Involuntary urine loss during activities like coughing or sneezing makes me think of stress urinary incontinence. Stress incontinence typically happens when the urethral sphincter isn't strong enough to resist increased abdominal pressure from activities like coughing, laughing, or sneezing. This usually affects women, especially after childbirth when the pelvic muscles and ligaments are weakened.


The Q-tip test is a common diagnostic tool for stress urinary incontinence. The test involves inserting a Q-tip catheter, which is a small balloon catheter, into the urethra. The catheter is connected to a pressure gauge. The patient is asked to cough, and the pressure reading is taken. If the pressure is above normal (like above 100 mmHg), it suggests that the urethral sphincter isn't closing properly, which is a sign of stress incontinence.


So, based on the history and the Q-tip test, the diagnosis is likely stress urinary incontinence. Now, moving on to what cystometry would show. Cystometry, also known as a filling cystometry, is a diagnostic procedure where a catheter is inserted into the bladder, and the bladder is filled with a liquid to measure how much it can hold (residual volume) and how it responds to being filled (like during a cough or sneeze). This helps in assessing the capacity and compliance of the bladder.


In a patient with stress incontinence, the bladder's capacity might be normal, but the sphincter's function is impaired. So, during the cystometry, the residual volume might be within normal limits because the bladder isn't overfilled. However, when the patient is asked to cough or perform a Valsalva maneuver, the detrusor muscle (the smooth muscle layer of the bladder) might not contract effectively, leading to an increase in intra-abdominal pressure, which might cause leakage.


Wait, but detrusor contractions are usually associated with voiding. In stress incontinence, the issue isn't with the detrusor contractions but with the sphincter's inability to prevent leakage. So, during cystometry, the detrusor contractions would be normal because they are part of the normal voiding process. However, the problem is that the sphincter doesn't close properly, leading to leakage.


So, putting it all together, the residual volume might be normal, but the detrusor contractions would be normal as well. The key finding would be the impaired sphincter function leading to incontinence, which is typically demonstrated during the Q-tip test and clinical history. Therefore, the cystometry would likely show normal residual volume and normal detrusor contractions, but the underlying issue is the sphincter's inability to prevent leakage.
</think>


Based on the provided information, the cystometry findings in this 61-year-old woman with stress urinary incontinence would likely demonstrate the following:


1. **Residual Volume**: The residual volume would be within normal limits. This is because the bladder's capacity is typically normal in cases of stress incontinence, where the primary issue lies with the sphincter function rather than the bladder's capacity.


2. **Detrusor Contractions**: The detrusor contractions would also be normal. These contractions are part of the normal voiding process and are not impaired in stress urinary incontinence. The issue is not with the detrusor muscle but with the sphincter's inability to prevent leakage.


In summary, the key findings of the cystometry would be normal residual volume and normal detrusor contractions, highlighting the sphincteric defect as the underlying cause of the incontinence.<｜end▁of▁sentence｜>
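
Because the reasoning is wrapped in <think></think> tags, the two parts of a response can be separated with a few lines of code, which makes it easier to compare reasoning length and answer style before and after fine-tuning. A minimal sketch; the helper name split_reasoning is hypothetical, and the sample string is a shortened stand-in for real model output.

```python
import re

# End-of-sequence marker as it appears in the decoded output above.
EOS = "<｜end▁of▁sentence｜>"


def split_reasoning(text: str) -> tuple[str, str]:
    """Split generated text into (reasoning, answer).

    Returns an empty reasoning string if no <think>...</think> block is found.
    """
    text = text.replace(EOS, "")
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()
    return match.group(1).strip(), text[match.end():].strip()


sample = "<think>\nStress incontinence, so ...\n</think>\nNormal residual volume." + EOS
reasoning, answer = split_reasoning(sample)
print(len(reasoning.split()), "words of reasoning")
print(answer)
```

In the notebook, the input would be response[0].split("### Response:")[1] from the inference code above.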

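4. Loading and Processing the Dataset

We slightly change the prompt style for processing the dataset by adding a third placeholder for the complex chain-of-thought column.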

train_prompt_style = """Below is an instruction that describes a task, paired with an input that provides further context. 
Write a response that appropriately completes the request. 
Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.


### Instruction:
You are a medical expert with advanced knowledge in clinical reasoning, diagnostics, and treatment planning. 
Please answer the following medical question. 


### Question:
{}


### Response:
<think>
{}
</think>
{}"""




The original post also gives a Chinese translation of this training template, with the same three placeholders for the question, the chain of thought, and the answer.

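Write a Python function that creates a "text" column in the dataset. Each entry consists of the training prompt style with its placeholders filled in with the question, the chain of thought, and the answer.

We load the first 500 samples of the medical chain-of-thought dataset from Hugging Face. After that, we create the text column by mapping the formatting_prompts_func function over the dataset.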
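
The function itself does not appear in this copy of the article. Below is a minimal sketch of what it could look like, assuming the dataset columns are named Question, Complex_CoT, and Response, and that the end-of-sequence token is appended to every sample. The two constants are shortened stand-ins so the sketch runs on its own.

```python
# Stand-ins so the sketch is self-contained. In the notebook, delete these two
# lines and use the train_prompt_style defined above and
# EOS_TOKEN = tokenizer.eos_token instead.
train_prompt_style = "### Question:\n{}\n\n### Response:\n<think>\n{}\n</think>\n{}"
EOS_TOKEN = "<｜end▁of▁sentence｜>"


def formatting_prompts_func(examples):
    """Build the "text" column for a batch (dict of column name -> list)."""
    texts = []
    for question, cot, response in zip(
        examples["Question"], examples["Complex_CoT"], examples["Response"]
    ):
        # Append EOS so the model learns where a response ends.
        texts.append(train_prompt_style.format(question, cot, response) + EOS_TOKEN)
    return {"text": texts}


batch = {"Question": ["Q1"], "Complex_CoT": ["step 1, step 2"], "Response": ["A1"]}
print(formatting_prompts_func(batch)["text"][0])
```

Because the function takes a dict of lists and returns a dict of lists, it can be passed to dataset.map with batched = True, as in the code that follows.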

from datasets import load_dataset
dataset = load_dataset("FreedomIntelligence/medical-o1-reasoning-SFT","en", split = "train[0:500]",trust_remote_code=True)
dataset = dataset.map(formatting_prompts_func, batched = True,)
dataset["text"][0]

[Figure: dataset sample]

As we can see, the text column contains the system prompt, the instruction, the chain of thought, and the answer.

"Below is an instruction that describes a task, paired with an input that provides further context. \n
Write a response that appropriately completes the request. \n
Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.\n\n
### Instruction:\n
You are a medical expert with advanced knowledge in clinical reasoning, diagnostics, and treatment planning. \n
Please answer the following medical question. \n\n
### Question:\n
A 61-year-old woman with a long history of involuntary urine loss during activities like coughing or sneezing but no leakage at night undergoes a gynecological exam and Q-tip test. Based on these findings, what would cystometry most likely reveal about her residual volume and detrusor contractions?\n\n
### Response:\n
<think>\n
Okay, let's think about this step by step. There's a 61-year-old woman here who's been dealing with involuntary urine leakages whenever she's doing something that ups her abdominal pressure like coughing or sneezing. This sounds a lot like stress urinary incontinence to me. Now, it's interesting that she doesn't have any issues at night; she isn't experiencing leakage while sleeping. This likely means her bladder's ability to hold urine is fine when she isn't under physical stress. Hmm, that's a clue that we're dealing with something related to pressure rather than a bladder muscle problem. \n\nThe fact that she underwent a Q-tip test is intriguing too. This test is usually done to assess urethral mobility. In stress incontinence, a Q-tip might move significantly, showing urethral hypermobility. This kind of movement often means there's a weakness in the support structures that should help keep the urethra closed during increases in abdominal pressure. So, that's aligning well with stress incontinence.\n\nNow, let's think about what would happen during cystometry. Since stress incontinence isn't usually about sudden bladder contractions, I wouldn't expect to see involuntary detrusor contractions during this test. Her bladder isn't spasming or anything; it's more about the support structure failing under stress. Plus, she likely empties her bladder completely because stress incontinence doesn't typically involve incomplete emptying. So, her residual volume should be pretty normal. \n\n
All in all, it seems like if they do a cystometry on her, it will likely show a normal residual volume and no involuntary contractions. Yup, I think that makes sense given her symptoms and the typical presentations of stress urinary incontinence.\n
</think>\n
Cystometry in this case of stress urinary incontinence would most likely reveal a normal post-void residual volume, as stress incontinence typically does not involve issues with bladder emptying. Additionally, since stress urinary incontinence is primarily related to physical exertion and not an overactive bladder, you would not expect to see any involuntary detrusor contractions during the test.
<｜end▁of▁sentence｜>"

5. Setting Up the Model

Using the target modules, we set up the model by adding low-rank adapters to it.

model = FastLanguageModel.get_peft_model(
    model,
    r=16,  
    target_modules=[
        "q_proj",
        "k_proj",
        "v_proj",
        "o_proj",
        "gate_proj",
        "up_proj",
        "down_proj",
    ],
    lora_alpha=16,
    lora_dropout=0,  
    bias="none",  
    use_gradient_checkpointing="unsloth",  # True or "unsloth" for very long context
    random_state=3407,
    use_rslora=False,  
    loftq_config=None,
)
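
To get a feel for how small the adapters are, the sketch below counts the trainable LoRA weights for r=16 over the seven target modules. The layer shapes are assumptions based on the published Llama 3.1 8B configuration (hidden size 4096, MLP size 14336, key/value projection size 1024, 32 decoder layers); they are not stated in this article.

```python
# Assumed Llama 3.1 8B shapes (not stated in the article).
HIDDEN, MLP, KV, LAYERS = 4096, 14336, 1024, 32

# (input features, output features) of each targeted linear layer
SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV),
    "v_proj": (HIDDEN, KV),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, MLP),
    "up_proj": (HIDDEN, MLP),
    "down_proj": (MLP, HIDDEN),
}


def lora_params(d_in: int, d_out: int, r: int) -> int:
    """A LoRA adapter adds A (r x d_in) and B (d_out x r): r * (d_in + d_out) weights."""
    return r * (d_in + d_out)


r = 16
per_layer = sum(lora_params(d_in, d_out, r) for d_in, d_out in SHAPES.values())
total = per_layer * LAYERS
print(f"{total:,} trainable LoRA parameters")
```

With these assumed shapes the count is 41,943,040, roughly 0.5% of an 8B-parameter model, which is why the adapters can be trained on a free GPU.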

Next, we set the training arguments and create the trainer, providing the model, the tokenizer, the dataset, and other important training parameters that shape the fine-tuning process.

from trl import SFTTrainer
from transformers import TrainingArguments
from unsloth import is_bfloat16_supported


trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    dataset_num_proc=2,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        # Use num_train_epochs = 1, warmup_ratio for full training runs!
        warmup_steps=5,
        max_steps=60,
        learning_rate=2e-4,
        fp16=not is_bfloat16_supported(),
        bf16=is_bfloat16_supported(),
        logging_steps=10,
        optim="adamw_8bit",
        weight_decay=0.01,
        lr_scheduler_type="linear",
        seed=3407,
        output_dir="outputs",
    ),
)
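
The arguments above imply a few numbers worth working out: the effective batch size, how much of the 500-sample dataset is seen in 60 steps, and how the learning rate evolves. The sketch below reproduces the usual linear-warmup, linear-decay shape; it is an illustration under the assumption of a single GPU, and the exact per-step values inside transformers may differ slightly.

```python
def linear_warmup_lr(step: int, base_lr: float = 2e-4,
                     warmup_steps: int = 5, max_steps: int = 60) -> float:
    """Learning rate after `step` optimizer steps: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * max(0.0, (max_steps - step) / (max_steps - warmup_steps))


per_device_batch, grad_accum, n_gpus = 2, 4, 1  # assumption: a single GPU is used
effective_batch = per_device_batch * grad_accum * n_gpus
samples_seen = effective_batch * 60  # max_steps = 60

print("effective batch size:", effective_batch)
print("samples seen:", samples_seen, "of 500")
for step in (0, 5, 30, 60):
    print(step, linear_warmup_lr(step))
```

So this run covers slightly less than one pass over the 500 samples, which matches the hint in the code comment to use num_train_epochs = 1 for a full training run.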

If you see the error AttributeError: _unwrapped_old_generate, update the libraries:

# Upgrade the libraries to the latest versions
!pip install --upgrade unsloth transformers


# Or roll back to specific versions
!pip install unsloth==x.y.z transformers==a.b.c

6. Training the Model

Run the following command to start training.

trainer_stats = trainer.train()

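Now wait while the model trains. For some reason only one GPU is used, possibly because parallel training is not enabled; the script could be adjusted later to try that.

Training took 23 minutes. The training loss decreased gradually, which is a good sign that the model is improving.

Log in to wandb.ai and open the project to view the model evaluation report.

If you run into problems running the code above, refer to the Fine-Tuning DeepSeek R1 (Reasoning Model) Kaggle notebook.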

7. Model Inference After Fine-Tuning

To compare results, we ask the fine-tuned model the same question as before and see what has changed.

question = "A 61-year-old woman with a long history of involuntary urine loss during activities like coughing or sneezing but no leakage at night undergoes a gynecological exam and Q-tip test. Based on these findings, what would cystometry most likely reveal about her residual volume and detrusor contractions?"




FastLanguageModel.for_inference(model)  # Unsloth has 2x faster inference!
inputs = tokenizer([prompt_style.format(question, "")], return_tensors="pt").to("cuda")


outputs = model.generate(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=1200,
    use_cache=True,
)
response = tokenizer.batch_decode(outputs)
print(response[0].split("### Response:")[1])

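This output is much closer to the style of the dataset: the chain of thought is coherent and shorter, and the answer is direct and given in a single paragraph. In that sense the fine-tuning worked. Note that the conclusion in this particular run (increased residual volume and increased detrusor contractions) does not match the dataset's reference answer shown in step 4 (normal residual volume, no involuntary contractions), so a 60-step run on 500 samples should be judged on format rather than on medical accuracy.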

<think>
Okay, so let's think about this. We have a 61-year-old woman who's been dealing with involuntary urine loss during things like coughing or sneezing, but she's not leaking at night. That suggests she might have some kind of problem with her pelvic floor muscles or maybe her bladder.


Now, she's got a gynecological exam and a Q-tip test. Let's break that down. The Q-tip test is usually used to check for urethral obstruction. If it's positive, that means there's something blocking the urethra, like a urethral stricture or something else.


Given that she's had a positive Q-tip test, it's likely there's a urethral obstruction. That would mean her urethra is narrow, maybe due to a stricture or some kind of narrowing. So, her bladder can't empty properly during activities like coughing because the urethral obstruction is making it hard.


Now, let's think about what happens when her bladder can't empty. If there's a urethral obstruction, the bladder is forced to hold more urine, increasing the residual volume. That's because her bladder doesn't empty completely. So, her residual volume is probably increased.


Also, if her bladder can't empty properly, she might have increased detrusor contractions. These contractions are usually stronger to push the urine out. So, we expect her detrusor contractions to be increased.


Putting it all together, if she has a urethral obstruction and a positive Q-tip test, we'd expect her cystometry results to show increased residual volume and increased detrusor contractions. That makes sense because of the obstruction and how her bladder is trying to compensate by contracting more.
</think>
Based on the findings of the gynecological exam and the positive Q-tip test, it is most likely that the cystometry would reveal increased residual volume and increased detrusor contractions. The positive Q-tip test indicates urethral obstruction, which would force the bladder to retain more urine, thereby increasing the residual volume. Additionally, the obstruction can lead to increased detrusor contractions as the bladder tries to compensate by contracting more to expel the urine.<｜end▁of▁sentence｜>

8. Saving the Model Locally

Now let's save the adapter, the full model, and the tokenizer locally so that we can use them in other projects.

new_model_local = "DeepSeek-R1-Medical-COT"
model.save_pretrained(new_model_local) 
tokenizer.save_pretrained(new_model_local)


model.save_pretrained_merged(new_model_local, tokenizer, save_method = "merged_16bit",)

9. Pushing the Model to the Hugging Face Hub

We can also push the adapter, the tokenizer, and the model to the Hugging Face Hub so that the AI community can integrate this model into their own systems.

new_model_online = "skyxiaowang/DeepSeek-R1-Medical-COT"
model.push_to_hub(new_model_online)
tokenizer.push_to_hub(new_model_online)


model.push_to_hub_merged(new_model_online, tokenizer, save_method = "merged_16bit")

Note: to push to your own namespace, the HF token you provide must have write permission.

Wait for the upload to finish.

Once the upload is complete, log in to HF and check that the model is there.

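The next step in your learning journey is to deploy the model to the cloud. You can follow the How to Deploy LLMs with BentoML guide, which gives a step-by-step process for deploying large language models efficiently and cost-effectively using BentoML and tools like vLLM.

Alternatively, if you prefer to use the model locally, you can convert it to GGUF format and run it on your machine. For that, see Fine-Tuning Llama 3.2 and Using It Locally: A Step-by-Step Guide, which gives detailed instructions for local use.

When fine-tuning is finished, remember to shut down the Kaggle session manually to save GPU quota.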

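Conclusion

Things are changing fast in AI. The open-source community is on the rise and is challenging the proprietary models that have dominated the field for the past three years. Open-source large language models (LLMs) are getting better, faster, and more efficient, which makes fine-tuning them on limited compute and memory easier than ever. In this tutorial, we explored the DeepSeek R1 reasoning model and learned how to fine-tune its distilled version for medical question answering. A fine-tuned reasoning model can not only perform better but can also be applied in critical fields such as medicine, emergency services, and healthcare.

In response to the launch of DeepSeek R1, OpenAI introduced two powerful tools: OpenAI's o3, a more advanced reasoning model, and OpenAI's Operator AI agent, which is powered by the new Computer-Using Agent (CUA) model and can browse websites and perform tasks autonomously. xAI released Grok 3 with deep thinking, a large model trained on 200,000 GPUs that is claimed to outperform all comparable open- and closed-source models. In my own testing it was only so-so: the free tier allows just two questions a day, and the paid plan is frighteningly expensive at 30 USD/month. I checked my wallet and went straight back to DeepSeek R1. It is free and works well, so what's not to love?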

If writing the code step by step feels too time-consuming, don't worry. I have a ready-made notebook for you:

https://www.kaggle.com/code/kingabzpro/fine-tuning-deepseek-r1-reasoning-model

Aren't I good to you?

Q&A for Beginners

1. How to get an HF token

Go to the Hugging Face website and log in to your account.

Click your avatar in the top-right corner and select "Settings".

Select "Access Tokens" in the left-hand menu.

Click "New token", give the token a name, and choose a suitable permission ("read" is usually enough; choose "write" if you plan to push the model to the Hub as in step 9). Then click "Generate a token" and copy the generated token.

2. How to get a Weights & Biases token

Go to the Weights & Biases website and log in to your account.

Click your avatar in the top-right corner and select "Settings".

In the "API Keys" section, click "Generate" and copy the generated API key.

3. Using Kaggle

Adding secrets

[Figure: adding secrets]

Enabling the free GPU

[Figure: enabling the free GPU]

[1] Introduction to LLMs in Python: https://www.datacamp.com/courses/introduction-to-llms-in-python

[2] Reinforcement Learning: An Introduction With Python Examples: https://www.datacamp.com/tutorial/reinforcement-learning-python-introduction

[3] Chain-of-thought reasoning: https://www.datacamp.com/tutorial/chain-of-thought-prompting

[4] DeepSeek-R1: https://github.com/deepseek-ai/DeepSeek-R1

[5] DeepSeek-R1: features, o1 comparison, distilled models, and more: https://www.datacamp.com/blog/deepseek-r1

[6] Weights & Biases (wandb) website: https://wandb.ai/home

[7] Kaggle: https://www.kaggle.com/

[8] Original article: https://www.datacamp.com/tutorial/fine-tuning-deepseek-r1-reasoning-model?utm_source=chatgpt.com

[9] Unsloth guide: https://www.datacamp.com/tutorial/unsloth-guide-optimize-and-speed-up-llm-fine-tuning

[10] Base model on Hugging Face: https://huggingface.co/unsloth/DeepSeek-R1-Distill-Llama-8B

[11] Kaggle usage guide: https://blog.csdn.net/weixin_42426841/article/details/143591586

[12] Medical chain-of-thought dataset: https://huggingface.co/datasets/FreedomIntelligence/medical-o1-reasoning-SFT?row=46

[13] Fine-Tuning DeepSeek R1 (Reasoning Model) Kaggle notebook: https://www.kaggle.com/code/kingabzpro/fine-tuning-deepseek-r1-reasoning-model

[14] How to Deploy LLMs with BentoML: https://www.datacamp.com/tutorial/deploy-llms-with-bentoml

[15] Fine-Tuning Llama 3.2 and Using It Locally: A Step-by-Step Guide: https://www.datacamp.com/tutorial/fine-tuning-llama-3-2

[16] Hugging Face website: https://huggingface.co/

[17] OpenAI's o3: features, o1 comparison, release date, and more: https://www.datacamp.com/blog/o3-openai

[18] OpenAI's Operator: examples, use cases, competition, and more: https://www.datacamp.com/blog/operator

[19] Ready-made notebook: https://www.kaggle.com/code/kingabzpro/fine-tuning-deepseek-r1-reasoning-model

[20] DeepSeek official website: https://www.deepseek.com/

This article is reposted from AIGC前沿技術追蹤. Author: 喜歡學習的小仙女
