A new generation of interaction, the LUI (language user interface), is arriving, and Microsoft has built a bridge to it
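Now that LLMs, with ChatGPT as the flagship, have won broad recognition for their striking natural language understanding, a wave of natural-language applications has been put on the roadmap, and conversational human-computer interaction is once again getting the industry's attention.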
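Five or six years ago, smart speakers set off a craze thanks to their conversational natural-language interface, and the major companies had high hopes for their products. The technology was not mature enough, though: comprehension and dialogue were too weak, and smart speakers never really succeeded as a mass consumer product. Today, semantic understanding and dialogue built on large models have improved qualitatively, and people have new expectations for the LUI (language user interface). Many are waiting for a new era, much as Apple's capacitive touchscreen and iOS displaced the resistive screens and Symbian of the Nokia era, transformed the phone experience, and opened the mobile internet era. As a result, the industry has taken up research on and applications of the LUI as the next shift in interaction experience.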
An introduction to LUI (slides): https://drive.google.com/file/d/1A6ZkodDUuyQzVE3HCrI8iHs3y-9wDI1t/view
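However, building LUI applications that use a large model such as ChatGPT as the brain and connect it to downstream services exposes a new tension, the reverse of earlier interaction styles: the result is friendly to users but unfriendly to service integration. The language people use to talk to a system and give it instructions is varied and flexible, whereas calls between systems need to be structured and stable. A large model can understand a user's varied instructions, but its own output becomes just as varied and unstable, which makes system integration much harder.
To resolve this tension, frameworks such as langchain carefully construct prompts and provide purpose-built "Output parsers" to handle the many possible outputs, as do the libraries mentioned previously, such as guidance and Guardrails. The prompt engineering route has drawbacks, though. First, for traditional developers it is a new and unfamiliar species with a learning curve. Second, model output is inherently uncertain: as the types involved grow more complex, the pipeline becomes more fragile, and corner cases often bring the system down. For this reason OpenAI itself built function calling to make it easier to connect large models to other systems.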
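Recently (July 20), Microsoft released a new open-source front-end library in this area, TypeChat (https://github.com/microsoft/TypeChat), which gives developers a new way to approach this class of problem. Its key innovation is replacing prompt engineering with schema engineering. It makes clever use of TypeScript: the developer only has to define types that represent the intents the LUI application supports. These can be as simple as an interface for sentiment classification or as complex as the data types for a shopping cart or a music application. For example, to add more intents to a schema, a developer can add more types to a discriminated union. To make schemas hierarchical, a developer can use a "meta-schema" that selects one or more sub-schemas based on the user input. Once the types are defined, TypeChat takes care of the rest: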
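- It builds the prompt to the LLM from the types.
- It validates that the LLM response conforms to the schema. If validation fails, it repairs the non-conforming output through further language model interaction.
- It summarizes the instance succinctly (without using an LLM) and confirms that it matches the user's intent.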
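The validate-and-repair step in the list above can be pictured with a small sketch. This is not TypeChat's actual implementation: `Model`, `Result`, `validateSentiment`, and `translateWithRepair` are hypothetical names, the model is a stand-in function supplied by the caller, and validation here is a hand-written check rather than the TypeScript compiler.

```typescript
// Hypothetical sketch of a validate-then-repair loop; not TypeChat's real code.

type Sentiment = { sentiment: "negative" | "neutral" | "positive" };

// Stand-in for any LLM call: takes a prompt, resolves to raw text.
type Model = (prompt: string) => Promise<string>;

type Result<T> =
  | { success: true; data: T }
  | { success: false; message: string };

// Hand-written validation (assumption: a real tool would derive this from the schema).
function validateSentiment(raw: string): Result<Sentiment> {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return { success: false, message: "Response is not valid JSON" };
  }
  const value = (parsed as { sentiment?: unknown } | null)?.sentiment;
  if (value === "negative" || value === "neutral" || value === "positive") {
    return { success: true, data: { sentiment: value } };
  }
  return {
    success: false,
    message: 'Property "sentiment" must be "negative", "neutral", or "positive"',
  };
}

// Ask once; while validation fails, feed the error back and ask again,
// at most maxRepairs times.
async function translateWithRepair(
  model: Model,
  request: string,
  maxRepairs = 1
): Promise<Result<Sentiment>> {
  let prompt = `Classify the sentiment of: ${request}\nRespond with JSON only.`;
  let result = validateSentiment(await model(prompt));
  for (let attempt = 0; attempt < maxRepairs; attempt++) {
    if (result.success === true) break;
    prompt += `\nYour previous answer was rejected: ${result.message}\nPlease return corrected JSON.`;
    result = validateSentiment(await model(prompt));
  }
  return result;
}
```

The caller sees either a typed value or an error message, never raw model text, which is what keeps the downstream service calls stable.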
Comparing the two approaches:
- The prompt approach
The developer writes the prompt, supplies an example, asks the LLM to reply in JSON, and then processes the output.
User: Provide 3 suggestions for specific places to go to in Seattle on a rainy day. Respond in the form of JSON. The JSON should have the following format:
[
  { "venue": "...", "description": "..." },
  { "venue": "...", "description": "..." }
]
ChatGPT:
[
  {
    "venue": "Seattle Art Museum",
    "description": "The Seattle Art Museum (SAM) is a world-class museum with a diverse collection of art from around the globe. It offers a great indoor activity for a rainy day, allowing you to explore various art exhibitions and displays, including contemporary art, African and Oceanic art, and European painting and sculpture."
  },
  {
    "venue": "Chihuly Garden and Glass",
    "description": "Chihuly Garden and Glass is an extraordinary exhibition showcasing the breathtaking glass artwork of Dale Chihuly. Located at the Seattle Center, this indoor attraction offers a mesmerizing experience as you wander through vibrant glass sculptures and installations. The beautiful interplay of light and color will captivate your senses, making it a perfect rainy day destination."
  },
  {
    "venue": "Museum of Pop Culture",
    "description": "The Museum of Pop Culture (MoPOP) is a vibrant museum dedicated to contemporary pop culture, music, science fiction, and more. It features engaging exhibits that celebrate music icons, explore science fiction themes, and delve into the world of video games. Whether you're a fan of music, movies, or pop culture in general, MoPOP offers an immersive and entertaining experience, making it an ideal choice on a rainy day."
  }
]
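With the prompt approach, the application still has to parse the reply and check its shape by hand before passing it on. A minimal sketch of that manual work follows; `Venue`, `isVenue`, and `parseVenues` are hypothetical names used for illustration and belong to no library.

```typescript
// Hypothetical helpers showing the checks the prompt approach leaves to the developer.

interface Venue {
  venue: string;
  description: string;
}

// Type guard: true only for objects with string "venue" and "description" fields.
function isVenue(value: unknown): value is Venue {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return typeof record["venue"] === "string" && typeof record["description"] === "string";
}

// Returns the venues, or throws when the reply is not what the prompt asked for.
function parseVenues(reply: string): Venue[] {
  const parsed: unknown = JSON.parse(reply); // throws on malformed JSON
  if (!Array.isArray(parsed) || !parsed.every(isVenue)) {
    throw new Error("Reply is not an array of { venue, description } objects");
  }
  return parsed;
}
```

Every new output shape needs another guard like this, which is the maintenance cost the schema approach tries to remove.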
- The schema approach
The developer works in a familiar way and simply defines types; TypeChat handles the internal details. Under the hood, TypeChat leans on TypeScript itself, using the TypeScript compiler to deal with corner cases, which smooths off some of the sharp edges of language model responses and improves robustness. Developers can keep programming in their existing style and go on to process the resulting data further, so the whole is greater than the sum of its parts.
User: Provide 3 suggestions for specific places to go to in Seattle on a rainy day. Respond strictly with JSON. The JSON should be compatible with the TypeScript type Response from the following:
interface Response {
  data: Array<{ venue: string, description: string }>;
}
ChatGPT:
{
  "data": [
    {
      "venue": "Museum of Pop Culture",
      "description": "Explore the fascinating world of popular culture through interactive exhibits, music memorabilia, and multimedia displays at the Museum of Pop Culture."
    },
    {
      "venue": "Seattle Art Museum",
      "description": "Indulge in the diverse collection of artworks from around the world at the Seattle Art Museum, featuring both contemporary and classical pieces."
    },
    {
      "venue": "Pike Place Market",
      "description": "Enjoy the vibrant atmosphere of Pike Place Market, where you can wander through covered stalls offering fresh produce, crafts, and delicious food from local vendors."
    }
  ]
}
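The prompt in this example follows a fixed pattern, so it can be assembled mechanically from the schema text. The sketch below assumes a hypothetical `buildPrompt` helper modeled on the wording shown above; the exact text TypeChat sends to the model may differ.

```typescript
// Hypothetical prompt builder modeled on the example prompt; not TypeChat's actual template.
function buildPrompt(request: string, schemaText: string, typeName: string): string {
  return [
    `${request} Respond strictly with JSON.`,
    `The JSON should be compatible with the TypeScript type ${typeName} from the following:`,
    schemaText.trim(),
  ].join("\n");
}

// The schema is passed as source text, exactly as the developer wrote it.
const responseSchema = `
interface Response {
  data: Array<{ venue: string, description: string }>;
}`;

const seattlePrompt = buildPrompt(
  "Provide 3 suggestions for specific places to go to in Seattle on a rainy day.",
  responseSchema,
  "Response"
);
```

Because the type definition is the only part that changes between applications, the developer edits the schema and never touches the prompt wording.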
An example of sentiment analysis with TypeChat:
// ./src/sentimentSchema.ts
// The following is a schema definition for determining the sentiment of some user input.
export interface SentimentResponse {
  /** The sentiment of the text. */
  sentiment: "negative" | "neutral" | "positive";
}
// ./src/main.ts
import * as fs from "fs";
import * as path from "path";
import dotenv from "dotenv";
import * as typechat from "typechat";
import { SentimentResponse } from "./sentimentSchema";
// Load environment variables.
dotenv.config({ path: path.join(__dirname, "../.env") });
// Create a language model based on the environment variables.
const model = typechat.createLanguageModel(process.env);
// Load up the contents of our "Response" schema.
const schema = fs.readFileSync(path.join(__dirname, "sentimentSchema.ts"), "utf8");
const translator = typechat.createJsonTranslator<SentimentResponse>(model, schema, "SentimentResponse");
// Process requests interactively.
typechat.processRequests("😀> ", /*inputFile*/ undefined, async (request) => {
  const response = await translator.translate(request);
  if (!response.success) {
    console.log(response.message);
    return;
  }
  console.log(`The sentiment is ${response.data.sentiment}`);
});
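The sentiment schema covers a single intent. The discriminated-union technique mentioned earlier for adding intents can be sketched as follows; the `CartAction` types and the `summarize` function are made up for illustration and do not come from the TypeChat repository.

```typescript
// Hypothetical schema: each intent is one member of a union, discriminated by "type".
interface AddItem {
  type: "addItem";
  name: string;
  quantity: number;
}

interface RemoveItem {
  type: "removeItem";
  name: string;
}

// Fallback member for requests the schema does not cover.
interface UnknownAction {
  type: "unknown";
  text: string;
}

// Supporting a new intent means adding one more interface to this union.
type CartAction = AddItem | RemoveItem | UnknownAction;

// The compiler narrows on "type", so each branch sees only its own fields.
// The summary is plain string formatting; no model call is involved.
function summarize(action: CartAction): string {
  switch (action.type) {
    case "addItem":
      return `Add ${action.quantity} x ${action.name}`;
    case "removeItem":
      return `Remove ${action.name}`;
    case "unknown":
      return `Not understood: ${action.text}`;
  }
}
```

If a fourth interface is added to `CartAction` without a matching `case`, the compiler flags `summarize` in strict mode, which is the kind of safety net a prompt-only pipeline lacks.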
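Conclusion:
TypeChat is a new attempt by Microsoft at the LUI. It connects language and services well: front-end developers can use TypeChat to retrofit their existing software with an LUI and build further features on top of it, while their development experience carries over and the migration is incremental. The library currently supports the OpenAI API and Azure OpenAI Service and does not yet support other models. Support for custom model services would be welcome, though this approach does place some demands on the capability of the model itself.
In short: Types are all you need!
Reposted from AI工程化; author: ully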
