Why is '\x1B'.length === 1? A closer look at \x and \u

Author: 粥里有勺糖, 2021-10-05 20:59:25


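Background

Background first, then the cause.

Most libraries use chalk to colorize the content they log to the console.

After chalk processes a string, the original content is wrapped in escape sequences that begin with '\x1B'.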

console.log(chalk.blue('green')); 
console.log([chalk.blue('green')]);

[Image: console output of the two calls above]

While developing vite-plugin-monitor[1], I needed the original log content (before coloring), so the colored string had to be restored:

\x1B[34mgreen\x1B[39m => green

While processing the content with a regular expression, I ran into a problem:

'\x1B'.replace(/\\x/,'') // result??

Checking the string with .length gives the result in the title: 1.
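A quick check makes the problem concrete. This is a small sketch rather than code from the original project: the pattern /\\x/ searches for two characters, a backslash followed by x, while the string holds a single character, so nothing is replaced.

```javascript
const esc = '\x1B';

// /\\x/ looks for a literal backslash followed by "x"; esc contains neither.
const replaced = esc.replace(/\\x/, '');

console.log(esc.length);                      // 1
console.log(replaced === esc);                // true
console.log(esc.charCodeAt(0).toString(16));  // '1b'
```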

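Cause

A backslash "\" usually introduces an escape sequence, such as \n (newline) or \t (tab).

\x introduces a hexadecimal escape and is followed by exactly two hexadecimal digits.

\u is also a hexadecimal escape, but it is followed by four hexadecimal digits.

So \x1B here is actually a single character: the ESC control character (code 27).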

'\x41' === 'A'   // true 
'A' === '\u0041' // true
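To see the difference between the escape as written in source code and the character it produces, the sketch below compares a normal string literal with String.raw, which leaves escapes unprocessed. The variable names are illustrative only.

```javascript
const escaped = '\x1B';        // escape processed by the parser: one character
const raw = String.raw`\x1B`;  // escape left untouched: backslash, x, 1, B

console.log(escaped.length);        // 1
console.log(raw.length);            // 4
console.log(escaped === '\u001B');  // true
console.log(/\\x/.test(raw));       // true
console.log(/\\x/.test(escaped));   // false
```

The regex /\\x/ only matches the raw form, which is why it cannot be used to strip the escape from a string that chalk has already produced.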

\x

\xhh denotes the character whose code is given by two hexadecimal digits (\x00-\xFF).

It is mainly used to write ASCII[2] characters.

'\x41' === 'A'                   // true
'A' === String.fromCharCode(65)  // true

'\x61' === 'a'                   // true
'a' === String.fromCharCode(97)  // true

\x must be followed by exactly two hexadecimal digits, otherwise a SyntaxError is thrown. The digits A-F are case-insensitive.

'\x1' // Uncaught SyntaxError: Invalid hexadecimal escape sequence 
'\xfg' // Uncaught SyntaxError: Invalid hexadecimal escape sequence
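Because an invalid \x escape is a syntax error, it fails when the source is parsed and cannot be caught around the literal itself. A minimal sketch for checking this at runtime, assuming an environment where the Function constructor is allowed (isValidLiteral is a hypothetical helper name):

```javascript
// Hypothetical helper: returns true when `literal` parses as JavaScript source.
function isValidLiteral(literal) {
    try {
        new Function(`return ${literal}`); // parsing happens here
        return true;
    } catch (err) {
        if (err instanceof SyntaxError) return false;
        throw err;
    }
}

console.log(isValidLiteral("'\\x1b'")); // true
console.log(isValidLiteral("'\\x1B'")); // true, A-F are case-insensitive
console.log(isValidLiteral("'\\x1'"));  // false, only one hex digit
console.log(isValidLiteral("'\\xfg'")); // false, "g" is not a hex digit
```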

\u

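\uhhhh denotes the Unicode character whose UTF-16 code unit is given by four hexadecimal digits (\u0000-\uFFFF).

In regular expressions it is commonly used to match Chinese characters: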

const r = /[\u4e00-\u9fa5]/ 
r.test('中文') // true 
r.test('English') // false
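The range \u4e00-\u9fa5 covers the commonly used CJK ideographs but not every Han character. As a hedged alternative that is not part of the original article, engines supporting Unicode property escapes (the u flag with \p{...}) can match by script instead. The sketch compares the two:

```javascript
const bmpRange = /[\u4e00-\u9fa5]/;
const hanScript = /\p{Script=Han}/u; // requires Unicode property escape support

const extA = '\u3400'; // a CJK Extension A ideograph, outside \u4e00-\u9fa5

console.log(bmpRange.test('中文'), hanScript.test('中文'));       // true true
console.log(bmpRange.test('English'), hanScript.test('English')); // false false
console.log(bmpRange.test(extA), hanScript.test(extA));           // false true
```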

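Converting between regular characters and Unicode escapes

str2Unicode

Use String.prototype.charCodeAt to get the UTF-16 code unit at a given position (as a decimal number)

Use Number.prototype.toString(16) to convert it to a hexadecimal string; the result is not zero-padded

Use String.prototype.padStart to pad it with zeros to four digits

A general-purpose helper looks like this: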

function str2Unicode(str) {
    let s = ''
    // Walk UTF-16 code units so that surrogate pairs produce two escapes
    for (let i = 0; i < str.length; i++) {
        s += `\\u${str.charCodeAt(i).toString(16).padStart(4, '0')}`
    }
    return s
}

str2Unicode('1a中文') // '\\u0031\\u0061\\u4e2d\\u6587'

unicode2Str

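Use the regex /\\u[\da-f]{4}/g to match every Unicode escape

Use Number to convert 0x${matchStr} (the four hex digits with a 0x prefix) to a decimal number
Use String.fromCodePoint to turn that number into a character
Use String.prototype.replace to convert the escapes one by one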

function unicode2Str(str) {
    // Replace each \uXXXX escape with the character it encodes
    return str.replace(/\\u([\da-f]{4})/gi, (_, hex) =>
        String.fromCodePoint(Number(`0x${hex}`))
    )
}

unicode2Str('\\u0031\\u0061\\u4e2d\\u6587') // '1a中文'
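The two directions can be checked together with a round trip. The sketch below is self-contained and uses hypothetical names (toEscapes, fromEscapes) rather than the helpers above; it works on UTF-16 code units, an assumption that lets characters outside the BMP, such as emoji, survive as two escapes.

```javascript
// Hypothetical helper: encode every UTF-16 code unit as a \uXXXX escape.
function toEscapes(str) {
    let out = '';
    for (let i = 0; i < str.length; i++) {
        out += '\\u' + str.charCodeAt(i).toString(16).padStart(4, '0');
    }
    return out;
}

// Hypothetical helper: decode every \uXXXX escape back into its code unit.
function fromEscapes(str) {
    return str.replace(/\\u([\da-f]{4})/gi, (_, hex) =>
        String.fromCharCode(parseInt(hex, 16))
    );
}

const encoded = toEscapes('1a中文');
console.log(encoded);                                // \u0031\u0061\u4e2d\u6587
console.log(fromEscapes(encoded));                   // 1a中文
console.log(fromEscapes(toEscapes('😀')) === '😀');  // true
```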

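Restoring strings processed by chalk

A regex written from scratch is bound to miss edge cases, so I looked through chalk's README and found the chalk/ansi-regex[3] library.

It matches ANSI escape codes, including the color-related ones: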

import ansiRegex from 'ansi-regex'; 
 
'\u001B[4mcake\u001B[0m'.match(ansiRegex()); 
//=> ['\u001B[4m', '\u001B[0m'] 
 
'\u001B[4mcake\u001B[0m'.match(ansiRegex({onlyFirst: true})); 
//=> ['\u001B[4m']

With it, a helper that strips the escape codes is short:

function resetChalkStr(str) { 
    return str.replace(ansiRegex(), '') 
}

Test

console.log(chalk.green('green'), chalk.greenBright('greenBright')); 
 
console.log([chalk.green('green'), chalk.greenBright('greenBright')]); 
 
console.log(resetChalkStr(`${chalk.green('green')} ${chalk.greenBright('greenBright')}`));
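The same idea can be tried without installing chalk or ansi-regex. The sketch below rests on two assumptions: the escape sequences are written by hand using the basic 16-color codes (32 and 92 for green and bright green, 39 to reset the foreground), and the pattern is a simplified one that covers only color/style (SGR) codes, far less than what ansi-regex matches. stripSgr is a hypothetical name.

```javascript
// Simplified pattern (an assumption for this sketch): ESC [ digits/semicolons m.
// It handles color/style (SGR) codes only, not every ANSI escape sequence.
const SGR_PATTERN = /\x1B\[[0-9;]*m/g;

function stripSgr(str) {
    return str.replace(SGR_PATTERN, '');
}

// Hand-written stand-ins for chalk.green('green') and chalk.greenBright('greenBright')
const green = '\x1B[32mgreen\x1B[39m';
const greenBright = '\x1B[92mgreenBright\x1B[39m';

console.log(stripSgr(`${green} ${greenBright}`)); // green greenBright
```

For real log output, ansi-regex remains the safer choice because it also covers sequences this pattern ignores.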

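Summary

I revisited \x and \u, and it also occurred to me that \u escapes could serve as a simple way to obfuscate and restore strings (something to sort out later).

It also solved a chalk-related problem: restoring the plain text behind colored terminal output.

References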

[1]vite-plugin-monitor: https://github.com/ATQQ/vite-plugin-monitor

[2]ASCII table: https://tool.oschina.net/commons?type=4

[3]chalk/ansi-regex: https://github.com/chalk/ansi-regex

Editor: 武曉燕. Source: 粥里有勺糖
