MegaPortal - An easy-to-use AI model loader designed for Apple devices
The figure above shows images generated from Iron Man prompts by different Stable Diffusion models loaded with MegaPortal.
What is MegaPortal?
MegaPortal is an easy-to-use AI model loader designed for Apple devices. Here are three selling points that ChatGPT helped me come up with:
- Easy to use: MegaPortal Snippets are composed of basic, easy-to-use visual Blocks that can be configured with little effort.
- AI accessible to all: MegaPortal is intended not just for AI experts and developers, but also for non-technical users. It is a user-friendly tool for anyone who wants to quickly test, use, or share AI models.
- Local and privacy first: All AI models run locally on your device, and all input data is processed locally as well. MegaPortal does not collect any data from you or your device.
For more details, visit the MegaPortal documentation site: https://docs.getmegaportal.com/
MegaPortal is also a small (4.5 MB, gzipped) and free app.
What is it used for?
MegaPortal is a document-based application. Its executable files are called Snippets, and each Snippet is composed of several Blocks, as in the following example:
This Snippet implements a face filter. It takes an image and passes it through the AnimeGANv2 model to turn it into an anime-style image. Because AnimeGANv2 outputs 512x512 images, the SRGAN super-resolution model is then applied to upscale the result to 2048x2048.
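The two-stage pipeline described above can be sketched as a chain of Blocks, each taking the previous Block's output. This is only an illustrative model: the `Block` and `ImageSize` types and the `runSnippet` function are hypothetical names, not MegaPortal's actual API, and the Blocks track image dimensions only rather than running any model.

```typescript
// Hypothetical model of a Snippet: an ordered list of Blocks.
// None of these names come from MegaPortal itself.
interface ImageSize {
  width: number;
  height: number;
}

interface Block {
  name: string;
  run(input: ImageSize): ImageSize;
}

// AnimeGANv2 stage: per the text, the model outputs 512x512 images.
const animeGan: Block = {
  name: "AnimeGANv2",
  run: () => ({ width: 512, height: 512 }),
};

// SRGAN stage: going from 512 to 2048 implies a 4x upscale per side.
const srgan: Block = {
  name: "SRGAN",
  run: (input) => ({ width: input.width * 4, height: input.height * 4 }),
};

// Run the Blocks in order, feeding each output into the next Block.
function runSnippet(blocks: Block[], input: ImageSize): ImageSize {
  return blocks.reduce((current, block) => block.run(current), input);
}

const result = runSnippet([animeGan, srgan], { width: 1080, height: 1920 });
console.log(`${result.width}x${result.height}`); // prints "2048x2048"
```

Whatever the input size, the output is 2048x2048 because the first stage fixes the size at 512x512 before the 4x upscale.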
To further illustrate what MegaPortal can do, here are three Snippets that I configured:
- The first video demonstrates the face filter described above; how it works has already been explained.
- As a Genshin Impact player, I configured the Snippet in the second video to recommend Genshin Impact gacha pulls. It takes a picture of the character list from miHoYo's app as input and extracts the text in the image with MegaPortal's Visual Text Recognition Block[1]. The text is then processed by a Javascript Execution Block[2], which handles fuzzy matching of similar text (Genshin Impact has many rare characters, so "云堇" may be recognized as "云革" and needs to be corrected), the recommendation logic, and the form that displays the results. By the way, the Javascript Execution Block is written with a VSCode plug-in that provides type hints and compilation.
- The third video demonstrates a Snippet configured to generate images from text using a Stable Diffusion model. The first Block is a Javascript Execution Block[3], which displays a form and passes the form input to the second Block. The second Block loads an SD model, which is at least 2 GB in size, to generate images. Because SD models are large, the first run takes some time to load. Also, although the video shows the Snippet running on an iPhone, so far I have only managed to run one standard SD model there; the other models crash at runtime. This is what I ask for help with in the "Seeking Help" section.
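The post does not show how the JS Block corrects misrecognized names, so the following is only one plausible approach: pick the known name with the smallest Levenshtein edit distance to the recognized text. The function names and the roster in `knownNames` are assumptions made for illustration; only the "云革" to "云堇" example comes from the text.

```typescript
// Levenshtein edit distance, compared per UTF-16 code unit. That is
// sufficient for the Basic Multilingual Plane characters used here.
function editDistance(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical roster; a real Snippet would carry the full character list.
const knownNames = ["云堇", "夜兰", "钟离"];

// Return the closest known name, or null when nothing is within maxDistance.
function fitName(
  recognized: string,
  names: string[],
  maxDistance: number = 1
): string | null {
  let best: string | null = null;
  let bestDistance = maxDistance + 1;
  for (const name of names) {
    const distance = editDistance(recognized, name);
    if (distance < bestDistance) {
      best = name;
      bestDistance = distance;
    }
  }
  return best;
}

console.log(fitName("云革", knownNames)); // prints "云堇"
```

Capping the distance at 1 keeps a two-character name from being "corrected" into an unrelated one; a longer roster might need a threshold that scales with name length.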
In the app's Snippet Center you can discover more interesting Snippets. You can also configure your own Snippets and either share the Snippet file or publish it to the Snippet Center for other users to download.
How to use it
This section uses the configuration of a Stable Diffusion Snippet as an example to explain how to use MegaPortal.
System Requirements
To use all the features of MegaPortal (Stable Diffusion in particular), please upgrade your device to at least iOS/iPadOS 16.2 or macOS 13.1.
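As a small illustration of the requirement above, a version check needs to compare components numerically rather than as strings (so that 16.10 counts as newer than 16.2). The sketch below is not part of MegaPortal; `minimumVersions`, `compareVersions`, and `meetsRequirement` are hypothetical names, and only the minimum version numbers come from the text.

```typescript
// Minimum OS versions stated in the text for full functionality.
const minimumVersions: Record<string, string> = {
  iOS: "16.2",
  iPadOS: "16.2",
  macOS: "13.1",
};

// Compare dotted versions numerically, component by component.
// Returns -1, 0, or 1; missing components count as 0.
function compareVersions(a: string, b: string): number {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  const length = Math.max(pa.length, pb.length);
  for (let i = 0; i < length; i++) {
    const x = pa[i] || 0;
    const y = pb[i] || 0;
    if (x !== y) return x < y ? -1 : 1;
  }
  return 0;
}

// True when the given OS version is at or above the stated minimum.
function meetsRequirement(platform: string, version: string): boolean {
  const minimum = minimumVersions[platform];
  if (minimum === undefined) return false;
  return compareVersions(version, minimum) >= 0;
}

console.log(meetsRequirement("iOS", "16.1.2")); // prints false
console.log(meetsRequirement("macOS", "13.1")); // prints true
console.log(meetsRequirement("iOS", "16.10")); // prints true
```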
Download
Please visit https://www.getmegaportal.com/ to download the latest version. Both the iOS and macOS versions are currently available.
Configure an AI Image Generation Snippet
The following describes the process of configuring a Snippet, using macOS as an example.
Step 1: Open MegaPortal and click "New Document" to create an empty file. Then download a Stable Diffusion Snippet to use as a template, and double-click the file to open it:
Because the template comes with a Stable Diffusion model configured, the program automatically starts downloading that model when the file is opened, which takes some time. If your network connection is poor, close the program first and use a download tool to fetch a model from the "Stable Diffusion Model Download" section below.
After the model has been downloaded, click the AI Model Application Block to configure it, then click "Save" to save the changes. The earlier download may still be in progress, and for now the only way to interrupt it is to quit MegaPortal completely.
Reopen the file and click the "Play" button to run the Snippet:
Right-click the image to save it to the Photos app.
Clear Cache
After running for a while, the app caches many model files. You can clear the cache with the following steps:
- Click the "More" button;
- Click the "Configuration" button;
- Click the "Local Caches" button;
- Click "Delete All", or long-press / right-click a specific item to delete it.
Source of Inspiration
Looking back, I think three things led me to design this software:
- Over the past year I had the chance to take Stanford's CS193P[7] course and learn SwiftUI development. As a front-end engineer, I also learned along the way how to use WKWebView, the web container on iOS, and JavascriptCore, the JS runtime, in detail.
- I worked on an aPaaS project for some time, and what I learned there helped me better abstract the design of an efficiency tool for a specific domain.
- Most importantly, perhaps, I play Genshin Impact, where calculating a character's strength means typing the numbers from the character panel into some mini program or web page. This is quite tedious, and since laziness is the first productive force, I thought of writing a program for the phone that takes a video or an image as input and quickly calculates the character's strength. The first step is solving OCR. While researching and implementing OCR on iOS, I found that Apple's VNRecognizeTextRequest, and indeed the AI-related APIs as a whole, form a coherent system with a high degree of reuse that could be turned into a product with a little organization.
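The original image in the game.
The result obtained by the Text Recognition Block.
After obtaining the result above, a Javascript Block is still needed to calculate the character's strength.
BTW, shamelessly showing off my gacha pulls:
Hit the guaranteed pity in 20 pulls and got two five-stars. Yelan, you do love me.
Got the weapon in 20 pulls.
Current Progress and Upcoming Features
The Stable Diffusion feature has just been added. When I have time, I will add image2image support to the Stable Diffusion model application, which is the feature shown in the figure below: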
Seeking Help
The main purpose of this post is to ask for help: the Stable Diffusion feature still does not run reliably on the iPhone, and most of the models I converted crash there. MegaPortal was designed so that users can develop and debug on macOS, then sync the Snippet to an iPhone via iCloud or other means to run it there or share it with other users. It is therefore particularly important that all models can run on the iPhone, even if Stable Diffusion only runs on higher-performance iPhones.
My time and energy are limited, and fully understanding Stable Diffusion and the Pytorch ecosystem is very difficult for me, so I hope that developers with experience in iOS development and in Pytorch + Stable Diffusion development can help look into this issue.
Two sets of model files are attached to help with debugging.
Crashing models:
- https://model.getmegaportal.com/classicAnim-v1-einsum_compiled.zip
- https://huggingface.co/nitrosocke/classic-anim-diffusion/blob/main/classicAnim-v1.ckpt
Non-crashing models:
Stable Diffusion Model Download
I have converted models from open-source SD projects found online into CoreML format. Download whichever ones you need.
- To save space and bandwidth, most models do not include a Safety Checker. Please refrain from creating NSFW content.
- The models range in size from 2 GB to 4 GB, and the download speed is acceptable over the company VPN.
Ghibli Style
Links
- Download: https://model.getmegaportal.com/ghibli-diffusion-v1-einsum_compiled.zip
- Source: https://huggingface.co/nitrosocke/Ghibli-Diffusion
Anime Style
My favorite style.
Links
- Download: https://model.getmegaportal.com/8528-diffusion2_split-einsum_compiled.zip
- Source: https://huggingface.co/852wa/8528-diffusion
Elden Ring Style
Links
- Download: https://model.getmegaportal.com/eldenRing-v3-pruned-einsum_compiled.zip
- Source: https://huggingface.co/nitrosocke/elden-ring-diffusion
Classic Disney Style
Links
- Download: https://model.getmegaportal.com/classicAnim-v1-einsum_compiled.zip
- Source: https://huggingface.co/nitrosocke/classic-anim-diffusion
Redshift Style
Links
- Download: https://model.getmegaportal.com/redshift-einsum_compiled.zip
- Source: https://huggingface.co/nitrosocke/redshift-diffusion
Spider-Verse Style
Links
- Download: https://model.getmegaportal.com/spiderverse-v1-pruned-einsum_compiled.zip
- Source: https://huggingface.co/nitrosocke/spider-verse-diffusion
Archer Style
Links
- Download: https://model.getmegaportal.com/archer-v1-einsum_compiled.zip
- Source: https://huggingface.co/nitrosocke/archer-diffusion
Arcane Style
Links
- Download: https://model.getmegaportal.com/arcane-diffusion-v3-einsum_compiled.zip
- Source: https://huggingface.co/nitrosocke/Ghibli-Diffusion
Original Stable Diffusion 2-1 (tested and working on iPhone)
This is also the only model that tested successfully on my iPhone 14 Pro Max.
Links
- Download: https://model.getmegaportal.com/coreml-stable-diffusion-2-1-base_split_einsum_compiled.zip
- Source:
- https://huggingface.co/stabilityai/stable-diffusion-2-1
- https://huggingface.co/pcuenq/coreml-stable-diffusion-2-1-base
Midjourney v4
Links
- Download: https://model.getmegaportal.com/mdjrny-v4-einsum_compiled.zip
- Source: https://huggingface.co/prompthero/openjourney/tree/main
Nitro Diffusion
This is a multi-style model that supports archer style, arcane style, or modern disney style.
Links
- Download: https://model.getmegaportal.com/nitroDiffusion-v1-einsum_compiled.zip
- Source: https://huggingface.co/nitrosocke/Nitro-Diffusion
Thanks!
- Thank you, ChatGPT, for helping with the translation and answering my programming questions.
- Thank you, Copilot, for pair programming with me (even though it's a hired relationship).
- Thanks to the many open-source AI model projects in the community; I won't thank each of them individually here.
References
[1] Visual Text Recognition Block: https://docs.getmegaportal.com/docs/blocks/visual-text-recognition
[2] Javascript Execution Block: https://docs.getmegaportal.com/docs/blocks/javascript-execution
[3] Javascript Execution Block: https://docs.getmegaportal.com/docs/blocks/javascript-execution
[4] Visual Text Recognition Block: https://docs.getmegaportal.com/docs/blocks/visual-text-recognition
[5] Javascript Execution Block: https://docs.getmegaportal.com/docs/blocks/javascript-execution
[6] Javascript Execution Block: https://docs.getmegaportal.com/docs/blocks/javascript-execution
[7] CS193P: https://cs193p.sites.stanford.edu/