How to Elegantly Hack User Code

Author: theanarkh 2022-05-24 06:07:48

This article introduces a way to hack user code at the JS level.

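Foreword: When working on infrastructure, a recurring problem is how to make the code we provide as non-intrusive and transparent to users as possible. For example, suppose I provide an SDK that collects the elapsed time of HTTP requests in a Node.js process. The simplest approach is to give users a request method and have them route every call through it, so that I can collect the data inside request. But this is often inconvenient, and then we need to hack the Node.js HTTP module or submit a PR to Node.js. At the operating system level there are many technologies for this kind of problem, such as eBPF, uprobe, and kprobe. They do not solve our problem at the application layer, though, because they target low-level functions. If I want to know how long a JS function takes, that can only be solved at the V8 level or the JS level, and V8 does not seem to offer good support here, so for now we mostly consider pure JS or the Node.js core. The rest of this article shows one way to hack user code at the JS level.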

In Node.js, the usual way to measure the time spent in JS functions is a CPU profile, but a profile only covers a fixed window of time. If I want to collect timing data continuously, a CPU profile is awkward: the most direct option is to collect profiles periodically, then parse the profile data ourselves and report it. This article introduces another approach, which is to hack the JS code. Suppose we have the following function.

function compute() {
    // do something
}
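For comparison, the periodic CPU profile approach mentioned above could be sketched roughly as follows, using only the built-in inspector module. This is a minimal sketch, not the author's code: summarizeProfile and profileOnce are hypothetical helper names, report is a hypothetical callback, and attributing timeDeltas[i] to samples[i] is only a rough approximation of self time.

```javascript
const { Session } = require('inspector');

// Aggregate approximate self time (in microseconds) per function from a
// CPU profile object as returned by Profiler.stop.
function summarizeProfile(profile) {
    const names = new Map();
    for (const node of profile.nodes) {
        const { functionName, url, lineNumber } = node.callFrame;
        // lineNumber is 0-based in the profile, so add 1 for display
        names.set(node.id, `${functionName || '(anonymous)'} ${url}:${lineNumber + 1}`);
    }
    const selfTime = new Map();
    profile.samples.forEach((nodeId, i) => {
        const key = names.get(nodeId);
        // Rough attribution: charge the i-th time delta to the i-th sample
        selfTime.set(key, (selfTime.get(key) || 0) + profile.timeDeltas[i]);
    });
    return selfTime;
}

// Profile the process for durationMs, then pass the summary to report
// (a hypothetical callback supplied by the caller).
function profileOnce(durationMs, report) {
    const session = new Session();
    session.connect();
    session.post('Profiler.enable', () => {
        session.post('Profiler.start', () => {
            setTimeout(() => {
                session.post('Profiler.stop', (err, result) => {
                    session.disconnect();
                    if (err) return report(err);
                    report(null, summarizeProfile(result.profile));
                });
            }, durationMs);
        });
    });
}
```

Calling profileOnce from a timer would give periodic summaries, at the cost of parsing every profile by hand, which is the inconvenience the rest of the article tries to avoid.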

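To measure the execution time of a function like compute, the most natural approach is to insert some code at the start and end of the function. But we do not want users to do this by hand; we want something more elegant: parse the source to get the AST, then rewrite the AST. Let's see how.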

const acorn = require('acorn');
const escodegen = require('escodegen');
const b = require('ast-types').builders;
const walk = require('acorn-walk');
const fs = require('fs');

// Parse the source code to get the AST
const ast = acorn.parse(fs.readFileSync('./test.js', 'utf-8'), {
    ecmaVersion: 'latest',
});

function inject(node) {
    // Insert code at the start and end of the function body.
    // The snippets are passed as identifier names, which relies on the
    // code generator printing an identifier's name as-is.
    const entryNode = b.variableDeclaration('const', [b.variableDeclarator(b.identifier('start'), b.callExpression(
        b.identifier('(() => { return Date.now(); })'), [],
    ))]);
    const exitNode = b.returnStatement(b.callExpression(
        b.identifier('((start) => {console.log(Date.now() - start);})'), [
            b.identifier('start')
        ],
    ));

    // Arrow functions with an expression body have no block body and are skipped
    if (node.body.body) {
        node.body.body.unshift(entryNode);
        node.body.body.push(exitNode);
    }
}

// Walk the AST and modify every function node
walk.simple(ast, {
    ArrowFunctionExpression: inject,
    FunctionDeclaration: inject,
    FunctionExpression: inject
});

// Generate new code from the modified AST
const newCode = escodegen.generate(ast);

// Overwrite the original file with the instrumented code
fs.writeFileSync('test.js', newCode);

Running the code above produces the following result.

function compute() {
    const start = (() => { return Date.now(); })();
    return ((start) => {console.log(Date.now() - start);})(start);
}
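The rewritten compute above works because the original body simply falls through to its end. A function that returns early, or that already ends in a return statement, never reaches the appended statement, so no time is recorded for it. A common alternative, not the one used in this article, is to wrap the body in try/finally. The sketch below is hand-written to show the shape such a rewrite could aim for; the durations array is a hypothetical stand-in for wherever an SDK would report to, and compute takes a parameter here only for illustration.

```javascript
// Hypothetical sink for measurements; a real SDK would report them somewhere.
const durations = [];

function compute(n) {
    const start = Date.now();
    try {
        if (n < 0) {
            return -1; // early return: the finally block still runs
        }
        return n * 2; // the original return value is preserved
    } finally {
        // Runs on every exit path, including thrown exceptions
        durations.push(Date.now() - start);
    }
}
```

With this shape the instrumentation never changes what the function returns, which matters once the rewrite is applied to arbitrary user code.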

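With this rewrite we can get timing data for every function. But this approach analyzes the source statically, and putting it into practice requires action from the user, which is not very friendly. So, building on it, we use the Debugger domain of the V8 debugging protocol to rewrite code dynamically; this approach can also rewrite Node.js's internal JS code. First, change the test code.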

function compute() {
    // do something
}

setInterval(compute, 1000)

Then look at the rewriting logic.

const { Session } = require('inspector');
const acorn = require('acorn');
const escodegen = require('escodegen');
const b = require('ast-types').builders;
const walk = require('acorn-walk');
const session = new Session();
session.connect();

require('./test_ast');
// Listen for script-parsed events to see every JS script
session.on('Debugger.scriptParsed', (message) => {
    // Only handle the test file
    if (message.params.url.indexOf('test_ast') === -1) {
        return;
    }
    // Fetch the source code
    session.post('Debugger.getScriptSource', {scriptId: message.params.scriptId}, (err, ret) => {
        if (err) {
            console.error(err);
            return;
        }
        const ast = acorn.parse(ret.scriptSource, {
            ecmaVersion: 'latest',
        });
        function inject(node) {
            const entry = b.variableDeclaration('const', [b.variableDeclarator(b.identifier('start'), b.callExpression(
                b.identifier('(() => { return Date.now(); })'), [],
            ))]);
            const exit = b.returnStatement(b.callExpression(
                b.identifier('((start) => {console.log(Date.now() - start);})'), [
                    b.identifier('start')
                ],
            ));

            // Arrow functions with an expression body are skipped
            if (node.body.body) {
                node.body.body.unshift(entry);
                node.body.body.push(exit);
            }
        }
        walk.simple(ast, {
            ArrowFunctionExpression: inject,
            FunctionDeclaration: inject,
            FunctionExpression: inject
        });
        const newCode = escodegen.generate(ast);
        // After rewriting the AST, generate the new code and replace the script source
        session.post('Debugger.setScriptSource', {
            scriptId: message.params.scriptId,
            scriptSource: newCode,
            dryRun: false
        });
    });
});

// Enable the Debugger domain so that scriptParsed events are delivered
session.post('Debugger.enable', () => {});

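Normally, the function run by setInterval prints nothing, but we find that it keeps printing 0, which is the elapsed time. It is 0 because the measurement has millisecond resolution and compute does almost no work, but that is not important here. We have now hacked the user's code without the user noticing; the only thing the user has to do is load an SDK that we provide. The hard part of this approach is the rewriting logic, and the risk is relatively high. Once that problem is solved, though, we can hack user code freely and do whatever we want with it, for legitimate purposes, of course.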
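If sub-millisecond timings matter, one option is process.hrtime.bigint(), which returns the current high-resolution time in nanoseconds as a BigInt. The sketch below is only an illustration under that assumption; timeIt is a hypothetical helper, not part of the code above, and the injected snippets would need to be adapted accordingly.

```javascript
// Measure a synchronous function with nanosecond-resolution timestamps.
function timeIt(fn) {
    const start = process.hrtime.bigint();
    const result = fn();
    const elapsedNs = process.hrtime.bigint() - start;
    // Convert the BigInt nanosecond difference to fractional milliseconds
    return { result, elapsedMs: Number(elapsedNs) / 1e6 };
}

const { result, elapsedMs } = timeIt(() => 21 * 2);
console.log(result, elapsedMs);
```

For a function as small as compute, this would print a small fractional number of milliseconds rather than a flat 0.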

Editor: 姜華 Source: 編程雜技

