OpenAI's Official Hands-On Tutorial: How to Build an AI Meeting Minutes Generator with GPT-4
This tutorial walks through building an automated meeting minutes generator with OpenAI's Whisper and GPT-4 models. The application transcribes meeting audio, summarizes the discussion, extracts key points and action items, and performs sentiment analysis.
Prerequisites
This tutorial assumes a basic familiarity with Python and that you already have an OpenAI API key. You can use the audio file provided with this tutorial or one of your own.
You will also need the python-docx and openai libraries installed. You can create a new Python environment and install the required packages with the following commands:
python -m venv env
source env/bin/activate
pip install openai
pip install python-docx
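The openai library also needs to know your API key before any request will succeed. A minimal sketch, assuming the key is stored in an OPENAI_API_KEY environment variable (the variable name and the explicit assignment below are one common convention, not something this tutorial prescribes):
import os
import openai

# Read the key from the environment instead of hard-coding it in the script.
openai.api_key = os.environ["OPENAI_API_KEY"]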
Transcribing audio with Whisper
The first step in transcribing a meeting is to pass the meeting's audio file to OpenAI's /v1/audio API. Whisper, the model that powers this audio API, converts spoken language into text. To start, we avoid passing a prompt or temperature (optional parameters that control the model's output) and stick with the defaults.
Next, import the required packages and define a function that reads an audio file and transcribes it with Whisper:
import openai
from docx import Document
def transcribe_audio(audio_file_path):
    with open(audio_file_path, 'rb') as audio_file:
        transcription = openai.Audio.transcribe("whisper-1", audio_file)
    return transcription['text']
In this function, audio_file_path is the path to the audio file you want to transcribe. The function opens the file and passes it to the Whisper ASR model (whisper-1) for transcription, returning the result as raw text. It is worth emphasizing that openai.Audio.transcribe expects the actual audio file, not just a path to a file on a local or remote server. If you are running this code on a server that may not already have the audio file stored, you may need a pre-processing step that first downloads the file onto that machine.
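As a rough sketch of such a pre-processing step (the requests dependency, the URL, and the helper name here are illustrative assumptions, not part of the tutorial):
import requests

def download_audio(url, local_path):
    # Fetch the remote audio file and write it to disk so that
    # transcribe_audio can open it as a regular local file.
    response = requests.get(url)
    response.raise_for_status()
    with open(local_path, 'wb') as f:
        f.write(response.content)
    return local_path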
Summarizing and analyzing the transcript with GPT-4
Once you have the transcript, pass it to GPT-4 through the Chat Completions API. GPT-4, OpenAI's most capable large language model, is used to generate the summary, extract key points and action items, and perform sentiment analysis.
This tutorial uses a separate function for each distinct task we want GPT-4 to perform. This is not the most efficient way to do it (you could fold all of these instructions into a single function), but splitting the tasks apart tends to produce higher-quality summaries.
To keep the tasks separate, define a function called meeting_minutes that serves as the application's main entry point:
def meeting_minutes(transcription):
    abstract_summary = abstract_summary_extraction(transcription)
    key_points = key_points_extraction(transcription)
    action_items = action_item_extraction(transcription)
    sentiment = sentiment_analysis(transcription)
    return {
        'abstract_summary': abstract_summary,
        'key_points': key_points,
        'action_items': action_items,
        'sentiment': sentiment
    }
In this function, transcription is the text obtained from Whisper. It is passed to four other functions, each of which performs a specific task: abstract_summary_extraction generates a summary of the meeting, key_points_extraction extracts the key points, action_item_extraction identifies action items, and sentiment_analysis performs sentiment analysis. If you want to add other capabilities, you can follow the same pattern shown above.
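For example, a hypothetical follow-up-questions extractor could plug into the same pattern (the function name and prompt below are made up for illustration, not part of the tutorial):
def follow_up_questions_extraction(transcription):
    # Same structure as the other task functions: a system message that
    # describes the task, and the transcript as the user message.
    response = openai.ChatCompletion.create(
        model="gpt-4",
        temperature=0,
        messages=[
            {
                "role": "system",
                "content": "You are an AI assistant. From the following text, list any open questions or topics that would need a follow-up discussion."
            },
            {
                "role": "user",
                "content": transcription
            }
        ]
    )
    return response['choices'][0]['message']['content']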
Here is how each of these functions works:
Summary extraction
The abstract_summary_extraction function condenses the transcript into a concise abstract paragraph, aiming to retain the most important points while avoiding unnecessary details or tangents. The main mechanism for steering this behavior is the system message shown below. Through what is known as prompt engineering, there are many different ways to arrive at similar results. If you want to learn how to do this most effectively, see the in-depth advice in OpenAI's GPT best practices guide: https://platform.openai.com/docs/guides/gpt-best-practices
def abstract_summary_extraction(transcription):
    response = openai.ChatCompletion.create(
        model="gpt-4",
        temperature=0,
        messages=[
            {
                "role": "system",
                "content": "You are a highly skilled AI trained in language comprehension and summarization. I would like you to read the following text and summarize it into a concise abstract paragraph. Aim to retain the most important points, providing a coherent and readable summary that could help a person understand the main points of the discussion without needing to read the entire text. Please avoid unnecessary details or tangential points."
            },
            {
                "role": "user",
                "content": transcription
            }
        ]
    )
    return response['choices'][0]['message']['content']
Key points extraction
The key_points_extraction function identifies and lists the key points discussed in the meeting. These should be the most important ideas, findings, or topics that are crucial to the essence of the discussion. Again, the main mechanism for steering what counts as a key point is the system message. Here you may want to add extra context about how your project or company operates, for example: "We are a company that sells race cars to consumers. This is what we do and these are our goals." That extra information can dramatically improve the model's ability to extract the information that is relevant to you; a minimal sketch of injecting such context appears after the function below.
def key_points_extraction(transcription):
    response = openai.ChatCompletion.create(
        model="gpt-4",
        temperature=0,
        messages=[
            {
                "role": "system",
                "content": "You are a proficient AI with a specialty in distilling information into key points. Based on the following text, identify and list the main points that were discussed or brought up. These should be the most important ideas, findings, or topics that are crucial to the essence of the discussion. Your goal is to provide a list that someone could read to quickly understand what was talked about."
            },
            {
                "role": "user",
                "content": transcription
            }
        ]
    )
    return response['choices'][0]['message']['content']
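One simple way to fold that kind of background into the request is to prepend it to the system message. A minimal sketch, where the company description string and the function name are made-up examples:
COMPANY_CONTEXT = "We are a company that sells race cars to consumers."

def key_points_extraction_with_context(transcription):
    # Prepend domain context so the model knows which details are relevant.
    response = openai.ChatCompletion.create(
        model="gpt-4",
        temperature=0,
        messages=[
            {
                "role": "system",
                "content": COMPANY_CONTEXT + " You are a proficient AI with a specialty in distilling information into key points. Identify and list the main points discussed in the following text."
            },
            {
                "role": "user",
                "content": transcription
            }
        ]
    )
    return response['choices'][0]['message']['content']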
Action item extraction
The action_item_extraction function identifies tasks, assignments, or actions that were agreed upon or mentioned during the meeting. These could be tasks assigned to specific individuals or actions the group collectively decided to take. Although this tutorial won't cover it in detail, the Chat Completions API also offers function calling, which you could use to automatically create these tasks in your task management software and assign them to the relevant people; a rough sketch of that flow appears after the function below.
def action_item_extraction(transcription):
    response = openai.ChatCompletion.create(
        model="gpt-4",
        temperature=0,
        messages=[
            {
                "role": "system",
                "content": "You are an AI expert in analyzing conversations and extracting action items. Please review the text and identify any tasks, assignments, or actions that were agreed upon or mentioned as needing to be done. These could be tasks assigned to specific individuals, or general actions that the group has decided to take. Please list these action items clearly and concisely."
            },
            {
                "role": "user",
                "content": transcription
            }
        ]
    )
    return response['choices'][0]['message']['content']
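As a rough sketch of what that function calling flow could look like (the create_task schema, the gpt-4-0613 model choice, and the omission of the actual task-management integration are assumptions made for illustration):
import json

def extract_tasks_as_function_calls(transcription):
    # Describe a task-creation function so the model returns structured
    # arguments instead of free text; you would then forward those
    # arguments to your task management software.
    response = openai.ChatCompletion.create(
        model="gpt-4-0613",
        temperature=0,
        messages=[
            {"role": "system", "content": "Extract the action items from this meeting transcript and create a task for each one."},
            {"role": "user", "content": transcription}
        ],
        functions=[
            {
                "name": "create_task",
                "description": "Create a task in the task management system",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "title": {"type": "string"},
                        "assignee": {"type": "string"}
                    },
                    "required": ["title"]
                }
            }
        ]
    )
    message = response['choices'][0]['message']
    if message.get('function_call'):
        # The arguments come back as a JSON string; parse them before use.
        return json.loads(message['function_call']['arguments'])
    return None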
Sentiment analysis
The sentiment_analysis function analyzes the overall sentiment of the discussion. It considers the tone, the emotion conveyed by the language used, and the context in which words and phrases appear. For tasks that are not especially complex, it can also be worth trying gpt-3.5-turbo alongside gpt-4 to see whether you can get a similar level of performance; a sketch of parameterizing the model choice appears after the function below. It may also be useful to feed the result of sentiment_analysis into the other functions to see how the sentiment of the conversation affects the other attributes.
def sentiment_analysis(transcription):
    response = openai.ChatCompletion.create(
        model="gpt-4",
        temperature=0,
        messages=[
            {
                "role": "system",
                "content": "As an AI with expertise in language and emotion analysis, your task is to analyze the sentiment of the following text. Please consider the overall tone of the discussion, the emotion conveyed by the language used, and the context in which words and phrases are used. Indicate whether the sentiment is generally positive, negative, or neutral, and provide brief explanations for your analysis where possible."
            },
            {
                "role": "user",
                "content": transcription
            }
        ]
    )
    return response['choices'][0]['message']['content']
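A minimal sketch of making the model configurable so you can compare the two (the function name, shortened prompt, and default argument below are just suggestions):
def sentiment_analysis_with_model(transcription, model="gpt-3.5-turbo"):
    # Same task as above, but with the model name as a parameter so the
    # same transcript can be run through gpt-3.5-turbo and gpt-4.
    response = openai.ChatCompletion.create(
        model=model,
        temperature=0,
        messages=[
            {
                "role": "system",
                "content": "Analyze the sentiment of the following text and indicate whether it is generally positive, negative, or neutral, with a brief explanation."
            },
            {
                "role": "user",
                "content": transcription
            }
        ]
    )
    return response['choices'][0]['message']['content']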
Exporting the meeting minutes
Once the meeting minutes have been generated, you usually want to save them in a human-readable format that is easy to distribute. One common format for such reports is Microsoft Word, and python-docx is a widely used open-source library for creating Word documents. If you were building an end-to-end meeting minutes application, you might consider removing this export step in favor of sending the summary inline as part of a follow-up email.
To handle the export, define a function save_as_docx that converts the raw text into a Word document:
def save_as_docx(minutes, filename):
    doc = Document()
    for key, value in minutes.items():
        # Replace underscores with spaces and capitalize each word for the heading
        heading = ' '.join(word.capitalize() for word in key.split('_'))
        doc.add_heading(heading, level=1)
        doc.add_paragraph(value)
        # Add a line break between sections
        doc.add_paragraph()
    doc.save(filename)
In this function, minutes is a dictionary containing the meeting's abstract summary, key points, action items, and sentiment analysis, and filename is the name of the Word document to create. The function creates a new Word document, adds a heading and the corresponding content for each section of the minutes, and then saves the document to the current working directory.
Finally, you can put everything together and generate meeting minutes from an audio file:
audio_file_path = "Earningscall.wav"
transcription = transcribe_audio(audio_file_path)
minutes = meeting_minutes(transcription)
print(minutes)
save_as_docx(minutes, 'meeting_minutes.docx')
This code first transcribes the audio file Earningscall.wav, then generates and prints the meeting minutes, and finally saves them to a Word document named meeting_minutes.docx.
Those are the basic steps for processing meeting minutes. Try improving the results with prompt engineering, or build an end-to-end system using native function calling.