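Deploying YOLOv10 to LiteRT: Object Detection on Android with Google AI Edge

Author: 二旺 2024-10-07 10:12:50

This article shows how to convert and quantize the latest YOLOv10 object detection model from Ultralytics to the LiteRT (formerly TensorFlow Lite) format, run inference on the resulting LiteRT model, and deploy it on Android for real-time detection.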

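Before the rise of large language models (LLMs), edge AI was a hot topic thanks to its ability to run machine learning models directly on the device. That is not to say the topic has lost its relevance; in fact, many tech giants are now turning their attention to deploying LLMs on mobile platforms.

We won't discuss generative AI today. Instead, we revisit a classic computer vision task: object detection. This post is a step-by-step tutorial that covers converting and quantizing the YOLOv10 model to the LiteRT format, running inference on the converted model, and deploying it on Android for real-time detection.

If you have experience with object detection and on-device deployment, you may wonder why MobileNet SSD or EfficientDet Lite is not the best choice. Here is why: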

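Why choose YOLOv10 over the others?

MobileNet SSD and EfficientDet Lite perform well, but they struggle to detect smaller objects. YOLOv10, by contrast, detects smaller objects quickly and effectively.

Before we start, let's briefly look at what the YOLOv10 model and LiteRT are.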

YOLOv10

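As an advanced member of the YOLO model family, YOLOv10 is the latest go-to choice for real-time object detection. Its enhanced architecture and training techniques make it particularly well suited to edge deployment.

YOLOv10 model variants

Of all the variants, the nano version (YOLOv10-N) is the best fit for mobile deployment because it can run in resource-constrained environments.

Note: We will use the pretrained YOLOv10-N model trained on the COCO dataset.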

LiteRT

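LiteRT, formerly TensorFlow Lite, is Google's high-performance runtime for on-device AI. It lets you convert TensorFlow, PyTorch, and JAX models to the TFLite format and run them.

Now that you have an overview, let's dive into the code. This is the pipeline for our project:

Pipeline: converting YOLOv10-N to LiteRT for Android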

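Step 1: Model conversion

A few years ago, converting a YOLO model to TF Lite was quite challenging because of the intricate steps involved and the significant architectural differences. That is no longer the case, because Ultralytics now does all the heavy lifting for you.

Start by cloning this repository to get all the code: https://github.com/NSTiwari/YOLOv10-LiteRT-Android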

# Install Ultralytics.
!pip install ultralytics

from ultralytics import YOLO

# Load the YOLOv10n model.
model = YOLO("yolov10n.pt")

# Export the model to LiteRT (TF Lite) format.
model.export(format="tflite")

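The export() function accepts the following parameters:

format: the output format of the model, such as tflite, onnx, tfjs, openvino, or torchscript.
imgsz: the desired image size for the model input (height, width). Defaults to 640 x 640.
int8: enables INT8 quantization of the model for faster inference. Defaults to False.

You can adjust many other parameters to suit your use case, but the ones above are enough for now. With just two lines of code, you can fully convert the YOLO PyTorch model to the LiteRT format. Behind the scenes, the conversion goes through these stages: PyTorch → ONNX graph → TensorFlow SavedModel → LiteRT.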
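To make the int8 option a little more concrete, here is a minimal sketch of the affine quantization arithmetic that INT8 formats are generally built on. It is only an illustration: the helper names and the example scale and zero point are made up, the real values are chosen during conversion, and this is not how Ultralytics or LiteRT implement it.

```python
def quantize(x: float, scale: float, zero_point: int) -> int:
    """Map a float to an int8 code: q = round(x / scale) + zero_point."""
    q = round(x / scale) + zero_point
    return max(-128, min(127, q))  # clamp to the int8 range


def dequantize(q: int, scale: float, zero_point: int) -> float:
    """Approximate the original float from its int8 code."""
    return (q - zero_point) * scale


# Illustrative parameters for values in [0, 1] (not taken from a real model).
SCALE = 1.0 / 255.0
ZERO_POINT = -128

for value in (0.0, 0.25, 1.0):
    code = quantize(value, SCALE, ZERO_POINT)
    print(value, "->", code, "->", dequantize(code, SCALE, ZERO_POINT))
```

Each value is stored in one byte instead of four, at the cost of a rounding error of at most about half a quantization step for values inside the representable range.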

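Step 2: Interpreting the LiteRT model

Google AI Edge provides Model Explorer, a model visualization tool similar to Netron that offers detailed insight into a model's graph and architecture.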

# Install Model Explorer.
!pip install ai-edge-model-explorer

import model_explorer

LITE_RT_EXPORT_PATH = "yolov10n_saved_model/" # @param {type : 'string'}
LITE_RT_MODEL = "yolov10n_float16.tflite" # @param {type : 'string'}

LITE_RT_MODEL_PATH = LITE_RT_EXPORT_PATH + LITE_RT_MODEL

# Load the LiteRT model in Model Explorer.
model_explorer.visualize(LITE_RT_MODEL_PATH)

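yolov10n_float16.tflite visualized in Model Explorer

If you look at the output tensors, you will see a single node (Identity) with shape [1, 300, 6], unlike MobileNet SSD models, which usually have four output tensors. You can also inspect the model with the AI Edge LiteRT library.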

# Install Google AI Edge LiteRT.
!pip install ai-edge-litert

from ai_edge_litert.interpreter import Interpreter

# Load the TF Lite model.
interpreter = Interpreter(model_path=LITE_RT_MODEL_PATH)
interpreter.allocate_tensors()

# Get input and output details.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# The input shape is [1, height, width, 3]; height equals width here.
input_size = input_details[0]['shape'][1]

print(f"Model input size: {input_size}")
print(f"Output tensor shape: {output_details[0]['shape']}")

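The model input size is 640. The output tensor shape [1, 300, 6] represents the batch size (1), the maximum number of detections per image (300), and the six values stored for each detection: [xmin, ymin, xmax, ymax, score, class].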
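The layout can be pictured with a plain nested list. The snippet below is only a sketch: it uses two made-up rows instead of 300, and describe and unpack are hypothetical helper names rather than part of any library.

```python
# A tiny stand-in for the [1, 300, 6] output: 1 batch, 2 rows instead of 300.
output = [[
    [0.10, 0.20, 0.50, 0.60, 0.91, 0.0],
    [0.30, 0.30, 0.40, 0.45, 0.42, 16.0],
]]


def describe(tensor):
    """Return (batch, max_detections, values_per_detection) for a nested list."""
    return len(tensor), len(tensor[0]), len(tensor[0][0])


def unpack(row):
    """Name the six values of one detection row."""
    xmin, ymin, xmax, ymax, score, class_id = row
    return {"box": (xmin, ymin, xmax, ymax), "score": score, "class": int(class_id)}


print(describe(output))
print(unpack(output[0][1]))
```

The class index is stored as a float in the tensor, which is why it is cast to an integer before being used as a label index.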

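Step 3: Running inference with the converted LiteRT model

Now it's time for inference. Having examined the model's architecture, we can run inference in Python using OpenCV.

Note: The results of the exported LiteRT model need post-processing, which includes rescaling the normalized bounding-box coordinates to the original image size and mapping class IDs to their labels.

In the Colab notebook, I have included a few utility functions that handle all the required post-processing steps.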

import cv2
import numpy as np

# Run the interpreter on an image path or on a single video frame.
def detect(input_data, is_video_frame=False):
    input_size = input_details[0]['shape'][1]

    if is_video_frame:
        # OpenCV frames are BGR: convert to RGB, resize, and scale pixels to [0, 1].
        original_height, original_width = input_data.shape[:2]
        image = cv2.cvtColor(input_data, cv2.COLOR_BGR2RGB)
        image = cv2.resize(image, (input_size, input_size))
        image = image / 255.0
    else:
        # load_image is assumed to be one of the notebook's utility functions (not shown here).
        image, (original_height, original_width) = load_image(input_data, input_size)

    interpreter.set_tensor(input_details[0]['index'], np.expand_dims(image, axis=0).astype(np.float32))
    interpreter.invoke()

    output_data = [interpreter.get_tensor(detail['index']) for detail in output_details]
    return output_data, (original_height, original_width)



# Postprocess the output.
def postprocess_output(output_data, original_dims, labels, confidence_threshold):
    output_tensor = output_data[0]
    detections = []
    original_height, original_width = original_dims

    for i in range(output_tensor.shape[1]):
        box = output_tensor[0, i, :4]
        confidence = output_tensor[0, i, 4]
        class_id = int(output_tensor[0, i, 5])

        if confidence > confidence_threshold:
            # Scale the normalized [xmin, ymin, xmax, ymax] box to pixel coordinates.
            x_min = int(box[0] * original_width)
            y_min = int(box[1] * original_height)
            x_max = int(box[2] * original_width)
            y_max = int(box[3] * original_height)

            label_name = labels.get(str(class_id), "Unknown")

            # Note the order of the stored box: [y_min, x_min, y_max, x_max].
            detections.append({
                "box": [y_min, x_min, y_max, x_max],
                "score": confidence,
                "class": class_id,
                "label": label_name
            })

    return detections
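postprocess_output looks labels up with labels.get(str(class_id), "Unknown"), which suggests that the label map is keyed by strings. Below is a minimal sketch of how such a map might be loaded, assuming it is stored as JSON; the notebook's actual label format may differ. load_labels and label_for are hypothetical helpers, and the three entries are only a sample that follows the usual COCO ordering.

```python
import json

# Sample label file contents; a full COCO map would have 80 entries.
LABELS_JSON = '{"0": "person", "1": "bicycle", "2": "car"}'


def load_labels(text):
    """Parse a JSON object that maps class-id strings to label names."""
    labels = json.loads(text)
    if not isinstance(labels, dict):
        raise ValueError("expected a JSON object of id -> name")
    return labels


def label_for(labels, class_id):
    """Look up a label by string key, falling back to "Unknown"."""
    return labels.get(str(int(class_id)), "Unknown")


labels = load_labels(LABELS_JSON)
print(label_for(labels, 2.0), label_for(labels, 79))
```

JSON object keys are always strings, so converting the integer class ID with str() before the lookup is what makes the match work.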

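The Colab notebook supports inference on both images and videos. Here are some of the results I obtained.

Inference on images

Inference on video

Impressively, the converted LiteRT model still performs well after quantization and detects even very small objects effectively. Now we are ready to deploy the model on Android for on-device inference.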

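Step 4: Deploying the model on Android

In Step 1, we cloned the repository to run the Colab notebook; it also includes a sample Android app. The last step in the notebook lets you download the LiteRT model. After downloading it, copy it into the assets folder of the Android app. The default file name is yolov10n_float16.tflite. If you use a different file name, make sure to update line 4 of the Constants.kt file accordingly.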

// Change this with your TF Lite model name.
const val MODEL_PATH = "yolov10n_float16.tflite"

The Detector.kt file contains the logic that runs inference and extracts the bounding boxes, confidence scores, and labels of the detected objects.

// Detects the objects.
class Detector(
    private val context: Context,
    private val modelPath: String,
    private val labelPath: String?,
    private val detectorListener: DetectorListener,
    private val message: (String) -> Unit
) {
    private var interpreter: Interpreter
    private var labels = mutableListOf<String>()

    private var tensorWidth = 0
    private var tensorHeight = 0
    private var numChannel = 0
    private var numElements = 0

    private val imageProcessor = ImageProcessor.Builder()
        .add(NormalizeOp(INPUT_MEAN, INPUT_STANDARD_DEVIATION))
        .add(CastOp(INPUT_IMAGE_TYPE))
        .build()

    init {
        val options = Interpreter.Options().apply{
            this.setNumThreads(4)
        }

        val model = FileUtil.loadMappedFile(context, modelPath)
        interpreter = Interpreter(model, options)

        labels.addAll(extractNamesFromMetadata(model))
        if (labels.isEmpty()) {
            if (labelPath == null) {
                message("Model not contains metadata, provide LABELS_PATH in Constants.kt")
                labels.addAll(MetaData.TEMP_CLASSES)
            } else {
                labels.addAll(extractNamesFromLabelFile(context, labelPath))
            }
        }

        labels.forEach(::println)

        val inputShape = interpreter.getInputTensor(0)?.shape()
        val outputShape = interpreter.getOutputTensor(0)?.shape()

        if (inputShape != null) {
            // NHWC layout: [1, height, width, 3]
            tensorHeight = inputShape[1]
            tensorWidth = inputShape[2]

            // NCHW layout: [1, 3, height, width]
            if (inputShape[1] == 3) {
                tensorHeight = inputShape[2]
                tensorWidth = inputShape[3]
            }
        }

        if (outputShape != null) {
            numElements = outputShape[1]
            numChannel = outputShape[2]
        }
    }

    // Extracts bounding box, label, confidence.
    private fun bestBox(array: FloatArray) : List<BoundingBox> {
        val boundingBoxes = mutableListOf<BoundingBox>()
        for (r in 0 until numElements) {
            val cnf = array[r * numChannel + 4]
            if (cnf > CONFIDENCE_THRESHOLD) {
                val x1 = array[r * numChannel]
                val y1 = array[r * numChannel + 1]
                val x2 = array[r * numChannel + 2]
                val y2 = array[r * numChannel + 3]
                val cls = array[r * numChannel + 5].toInt()
                val clsName = labels[cls]
                boundingBoxes.add(
                    BoundingBox(
                        x1 = x1, y1 = y1, x2 = x2, y2 = y2,
                        cnf = cnf, cls = cls, clsName = clsName
                    )
                )
            }
        }
        return boundingBoxes
    }
}
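Two details of the Detector code above are easy to check outside Android. The sketch below mirrors them in Python: input_hw reads the height and width from either an NHWC or an NCHW input shape, and best_boxes walks a flat, row-major output buffer with the same r * numChannel + k indexing. Both function names are hypothetical, the sample buffer is made up, and the guarded label lookup is an addition that the Kotlin code does not have.

```python
def input_hw(shape):
    """Return (height, width) for an NHWC or NCHW input shape."""
    if len(shape) != 4:
        raise ValueError("expected a 4-D input shape")
    if shape[1] == 3:          # NCHW: [batch, 3, height, width]
        return shape[2], shape[3]
    return shape[1], shape[2]  # NHWC: [batch, height, width, 3]


def best_boxes(flat, num_elements, num_channel, labels, threshold):
    """Collect the detections whose confidence exceeds the threshold."""
    boxes = []
    for r in range(num_elements):
        base = r * num_channel  # offset of row r in the flat buffer
        cnf = flat[base + 4]
        if cnf > threshold:
            cls = int(flat[base + 5])
            # Guarded lookup: fall back instead of raising on an unknown id.
            name = labels[cls] if 0 <= cls < len(labels) else "Unknown"
            boxes.append({
                "x1": flat[base], "y1": flat[base + 1],
                "x2": flat[base + 2], "y2": flat[base + 3],
                "cnf": cnf, "cls": cls, "cls_name": name,
            })
    return boxes


# Two made-up rows of six values each; only the first passes the threshold.
flat = [0.1, 0.2, 0.5, 0.6, 0.9, 1.0,
        0.0, 0.0, 0.0, 0.0, 0.1, 0.0]
result = best_boxes(flat, num_elements=2, num_channel=6,
                    labels=["person", "bicycle"], threshold=0.3)
print(input_hw([1, 640, 640, 3]), result[0]["cls_name"], len(result))
```

For square inputs such as 640 x 640, the height and width are interchangeable, so the layout check only matters for non-square models.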

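After that, OverlayView.kt scales the normalized bounding-box coordinates to the size of the view and overlays them on the camera feed to visualize the results.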

class OverlayView(context: Context?, attrs: AttributeSet?) : View(context, attrs) {

    private var results = listOf<BoundingBox>()
    private val boxPaint = Paint()
    private val textBackgroundPaint = Paint()
    private val textPaint = Paint()

    private var bounds = Rect()
    private val colorMap = mutableMapOf<String, Int>()

    init {
        initPaints()
    }

    fun clear() {
        results = listOf()
        textPaint.reset()
        textBackgroundPaint.reset()
        boxPaint.reset()
        invalidate()
        initPaints()
    }

    private fun initPaints() {
        textBackgroundPaint.color = Color.WHITE
        textBackgroundPaint.style = Paint.Style.FILL
        textBackgroundPaint.textSize = 42f

        textPaint.color = Color.WHITE
        textPaint.style = Paint.Style.FILL
        textPaint.textSize = 42f
    }

    override fun draw(canvas: Canvas) {
        super.draw(canvas)

        results.forEach { boundingBox ->
            // Get or create a color for this label
            val color = getColorForLabel(boundingBox.clsName)
            boxPaint.color = color
            boxPaint.strokeWidth = 8F
            boxPaint.style = Paint.Style.STROKE

            val left = boundingBox.x1 * width
            val top = boundingBox.y1 * height
            val right = boundingBox.x2 * width
            val bottom = boundingBox.y2 * height

            canvas.drawRoundRect(left, top, right, bottom, 16f, 16f, boxPaint)

            val drawableText = "${boundingBox.clsName} ${Math.round(boundingBox.cnf * 100.0) / 100.0}"

            textBackgroundPaint.getTextBounds(drawableText, 0, drawableText.length, bounds)
            val textWidth = bounds.width()
            val textHeight = bounds.height()

            val textBackgroundRect = RectF(
                left,
                top,
                left + textWidth + BOUNDING_RECT_TEXT_PADDING,
                top + textHeight + BOUNDING_RECT_TEXT_PADDING
            )
            textBackgroundPaint.color = color // Set background color same as bounding box
            canvas.drawRoundRect(textBackgroundRect, 8f, 8f, textBackgroundPaint)

            canvas.drawText(drawableText, left, top + textHeight, textPaint)
        }
    }

    private fun getColorForLabel(label: String): Int {
        return colorMap.getOrPut(label) {
            // Generate a random color or you can use a predefined set of colors
            Color.rgb((0..255).random(), (0..255).random(), (0..255).random())
        }
    }

    fun setResults(boundingBoxes: List<BoundingBox>) {
        results = boundingBoxes
        invalidate()
    }

    companion object {
        private const val BOUNDING_RECT_TEXT_PADDING = 8
    }
}

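Finally, open the project in Android Studio, build it, and connect your phone to install the app. This is the final output on Android. The inference time is close to 300 milliseconds.

Real-time object detection on Android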
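An inference time of about 300 ms corresponds to roughly 3 frames per second (1000 / 300 ≈ 3.3). If you want a comparable number for the Python interpreter from Step 3, a simple timing loop is enough. The sketch below is an illustration under assumptions: average_latency_ms and fps_from_latency are hypothetical helpers, and fake_inference is a stand-in for a real call such as interpreter.invoke().

```python
import time


def average_latency_ms(fn, runs=10, warmup=2):
    """Average wall-clock time of fn() in milliseconds, after warm-up runs."""
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) * 1000.0 / runs


def fps_from_latency(latency_ms):
    """Convert a per-frame latency in milliseconds to frames per second."""
    return 1000.0 / latency_ms


def fake_inference():
    # Stand-in workload; replace it with the real inference call.
    sum(i * i for i in range(10_000))


print(f"{average_latency_ms(fake_inference):.3f} ms per call")
print(f"{fps_from_latency(300.0):.1f} FPS at 300 ms")
```

The warm-up runs are excluded from the average because the first few invocations often include one-time setup costs.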

Editor: 趙寧寧. Source: 小白玩轉Python
