One Trick a Day: How to Quickly Generate the JSON Schema for LLM Tool Calling

Author: kingname 2025-04-27 07:57:50


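When using an LLM's tool calling, we have to write a JSON Schema, such as the value of the tools field in the image below:

[Image]

Writing this schema by hand is tedious: there are so many brackets that it is hard on the eyes. Try it yourself: how many seconds does it take you to work out which field sits at the same level as type: "object"? Is there a way to generate this schema automatically?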
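Since the screenshot is not reproduced here, the following is a minimal hand-written sketch of what such a `tools` entry typically looks like in the OpenAI-style tool-calling format. The function name `parse_user_info` and its three parameters are taken from the examples in this article; the English descriptions and the choice of `required` fields are illustrative assumptions, not the contents of the original image.

```python
import json

# Hand-written tool definition in the OpenAI-style "tools" format.
# Names and descriptions are illustrative, not copied from the screenshot.
tools = [
    {
        "type": "function",
        "function": {
            "name": "parse_user_info",
            "description": "Save the user's personal information",
            "parameters": {
                "type": "object",
                "properties": {
                    "name": {"type": "string", "description": "User name"},
                    "age": {"type": "integer", "description": "User's age"},
                    "salary": {"type": "number", "description": "User's salary"},
                },
                "required": ["name"],
            },
        },
    }
]

print(json.dumps(tools, indent=2, ensure_ascii=False))
```

Even for three flat parameters the definition is four levels deep, which is why generating it from typed code is attractive.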

LangChain provides a @tool decorator that simplifies producing the JSON Schema for tool calling: decorate a function and it is ready to use. For example:

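import json
from langchain_core.tools.convert import tool


# parse_docstring=True reads parameter descriptions from a Google-style docstring
@tool(parse_docstring=True)
def parse_user_info(name: str, age: int, salary: float) -> bool:
    """
    Save the user's personal information

    Args:
        name: user name
        age: the user's age
        salary: the user's salary
    """
    return True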

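Then we can print the function's .args_schema.model_json_schema() to get something resembling the Tool Calling JSON Schema, as shown below:

[Image]

This approach has two problems:

1. In the JSON Schema that Tool Calling expects, the name is carried in a field called name, but the schema exported here uses title instead.

2. The function's docstring has to be written in Google style, which differs from the style commonly used in Python.

In Python, docstrings are usually written as :param parameter_name: explanation (the reStructuredText style), for example: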

import json
from langchain_core.tools.convert import tool


@tool
def parse_user_info(name: str, age: int, salary: float) -> bool:
    """
    Save the user's personal information

    :param name: user name
    :param age: the user's age
    :param salary: the user's salary
    :return: bool, True on success, False on failure
    """
    return True

# Export the argument schema that LangChain built for the tool
schema = parse_user_info.args_schema.model_json_schema()
print(json.dumps(schema, ensure_ascii=False, indent=2))

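However, when the function is defined this way, the @tool decorator cannot take parse_docstring=True, otherwise it raises an error. Without that argument, though, the extracted schema has no field descriptions. The result is shown below:

[Image]

Both problems have one general solution: use `Pydantic` directly. In fact, LangChain itself uses Pydantic under the hood, as shown below:

[Image]

I previously wrote an article, "One Trick a Day: How to Extract Structured Data with an LLM" (一日一技：如何使用大模型提取結構化數據), which introduced a third-party library called `instructor`. In essence, it converts a class defined with Pydantic into the JSON Schema that Tool Calling needs, then uses the LLM's Tool Calling to extract the parameters. By borrowing from it, we can achieve the goal of this article very easily.

Define the data we want to extract with Pydantic and convert it to JSON Schema: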

import json
from pydantic import BaseModel, Field

class UserInfo(BaseModel):
    """
    User's personal information
    """
    name: str = Field(..., description="the user's name")
    age: int = Field(default=None, description="the user's age")
    salary: float = Field(default=None, description="the user's salary")

schema = UserInfo.model_json_schema()
print(json.dumps(schema, indent=2, ensure_ascii=False))

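If the first argument to Field is an ellipsis (...), the field is required. To make a field optional, pass default=None to Field.

The result is shown below:

[Image]

Because each parameter's description is written directly in its field definition, a docstring format can never cause a description to go missing, whether it is Google style or Python style.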
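For readers without the screenshot, the dict below approximates the shape that `UserInfo.model_json_schema()` returns under Pydantic v2. It is written by hand as an assumption, not captured output, and key order may differ in practice. It also illustrates the rule that matters for tool calling: fields carrying a `default` key are optional, and everything else is required.

```python
# Hand-written approximation of a Pydantic v2 schema for UserInfo
# (an assumption for illustration, not captured program output).
approx_schema = {
    "title": "UserInfo",
    "description": "User's personal information",
    "type": "object",
    "properties": {
        "name": {"title": "Name", "type": "string",
                 "description": "the user's name"},
        "age": {"title": "Age", "type": "integer", "default": None,
                "description": "the user's age"},
        "salary": {"title": "Salary", "type": "number", "default": None,
                   "description": "the user's salary"},
    },
    "required": ["name"],
}

# Required fields are exactly those without a "default" key.
derived_required = sorted(
    name
    for name, prop in approx_schema["properties"].items()
    if "default" not in prop
)
print(derived_required)
```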

Next, we need to convert the format Pydantic outputs into the JSON Schema format that Tool Calling expects. Let's look at the source code of Instructor:

[Image]

Copy that code out and use it to process the JSON Schema Pydantic just generated:

from docstring_parser import parse


def generate_tool_calling_schema(cls):
    schema = cls.model_json_schema()
    docstring = parse(cls.__doc__ or "")
    # Everything except the top-level title/description becomes "parameters"
    parameters = {
        k: v for k, v in schema.items() if k not in ("title", "description")
    }
    # Fill in descriptions from the docstring where Field() did not supply one
    for param in docstring.params:
        if (name := param.arg_name) in parameters["properties"] and (
            description := param.description
        ):
            if "description" not in parameters["properties"][name]:
                parameters["properties"][name]["description"] = description

    # A field without a default value is required
    parameters["required"] = sorted(
        k for k, v in parameters["properties"].items() if "default" not in v
    )

    if "description" not in schema:
        if docstring.short_description:
            schema["description"] = docstring.short_description
        else:
            schema["description"] = (
                f"Correctly extracted `{cls.__name__}` with all "
                f"the required parameters with correct types"
            )

    return {
        "name": schema["title"],
        "description": schema["description"],
        "parameters": parameters,
    }

This depends on a third-party library called docstring_parser. How it works is very simple: it just processes the docstring with regular expressions. You could even read its source code and implement it yourself.
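As a rough illustration of that idea, and not docstring_parser's actual implementation, the sketch below pulls parameter descriptions out of a `:param name: text` docstring with one regular expression. The helper name `parse_param_descriptions` is hypothetical, and the sketch only handles single-line `:param` entries.

```python
import re

# Matches lines such as ":param name: description" (reST style).
PARAM_RE = re.compile(r"^\s*:param\s+(\w+)\s*:\s*(.*)$")


def parse_param_descriptions(docstring: str) -> dict[str, str]:
    """Return {parameter name: description} for single-line :param entries."""
    descriptions = {}
    for line in (docstring or "").splitlines():
        match = PARAM_RE.match(line)
        if match:
            descriptions[match.group(1)] = match.group(2).strip()
    return descriptions


doc = """
Save the user's personal information

:param name: user name
:param age: the user's age
:param salary: the user's salary
:return: bool, True on success
"""
print(parse_param_descriptions(doc))
```

The `:return:` line is skipped because the pattern only accepts lines that start with `:param`. Multi-line descriptions and Google-style `Args:` blocks would need extra handling, which is the kind of work the real library takes care of.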

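The result after running it is shown below.

[Image]

Note that the parameter entries still contain 'default': null and title fields. Passing these two fields to the LLM is harmless, because it ignores them. If you find them an eyesore, you can tweak the code so that the output matches the Tool Calling JSON Schema exactly: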

from docstring_parser import parse


def generate_tool_calling_schema(cls):
    schema = cls.model_json_schema()
    docstring = parse(cls.__doc__ or "")
    # Everything except the top-level title/description becomes "parameters"
    parameters = {
        k: v for k, v in schema.items() if k not in ("title", "description")
    }
    # Fill in descriptions from the docstring where Field() did not supply one
    for param in docstring.params:
        if (name := param.arg_name) in parameters["properties"] and (
            description := param.description
        ):
            if "description" not in parameters["properties"][name]:
                parameters["properties"][name]["description"] = description

    # A field without a default value is required
    parameters["required"] = sorted(
        k for k, v in parameters["properties"].items() if "default" not in v
    )

    # Drop the keys Tool Calling does not need (after "required" is computed)
    for prop_schema in parameters["properties"].values():
        prop_schema.pop("default", None)
        prop_schema.pop("title", None)

    if "description" not in schema:
        if docstring.short_description:
            schema["description"] = docstring.short_description
        else:
            schema["description"] = (
                f"Correctly extracted `{cls.__name__}` with all "
                f"the required parameters with correct types"
            )

    # Wrap the result according to the Tool Calling format
    return {
        "type": "function",
        "function": {
            "name": schema["title"],
            "description": schema["description"],
            "parameters": parameters,
        }
    }

The result is shown below:

[Image]

Finally, here is a question to think about: if the function's parameters include nested parameters, how should they be handled?
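One possible direction, offered as a hedged sketch rather than the author's answer: Pydantic v2 places nested models under a top-level `$defs` key and points to them with `$ref`. If the target API does not resolve such references, they can be inlined before the schema is sent. The helper `inline_refs` below is hypothetical, works on plain dicts, and does not handle self-referencing (recursive) models.

```python
import copy


def inline_refs(schema: dict) -> dict:
    """Replace {"$ref": "#/$defs/X"} nodes with a copy of definition X."""
    defs = schema.get("$defs", {})

    def resolve(node):
        if isinstance(node, dict):
            ref = node.get("$ref")
            if isinstance(ref, str) and ref.startswith("#/$defs/"):
                target = copy.deepcopy(defs[ref.split("/")[-1]])
                # Keep sibling keys such as "description" next to the $ref.
                target.update({k: v for k, v in node.items() if k != "$ref"})
                return resolve(target)
            # Drop the "$defs" table itself from the output.
            return {k: resolve(v) for k, v in node.items() if k != "$defs"}
        if isinstance(node, list):
            return [resolve(item) for item in node]
        return node

    return resolve(schema)


# Hand-written schema imitating a model with one nested model (an assumption).
nested = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "address": {"$ref": "#/$defs/Address", "description": "Home address"},
    },
    "required": ["name"],
    "$defs": {
        "Address": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        }
    },
}

flat = inline_refs(nested)
print(flat["properties"]["address"])
```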

Editor: Wu Xiaoyan (武曉燕). Source: 未聞Code
