Refining Comment Removal: Can It Really Be Done with Regular Expressions?

Author: wmaker 2018-07-16 10:50:02


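Many regular expressions found online claim to strip comments from JavaScript, yet in practice they have all sorts of flaws. That is somewhat startling, and it raises the question: can this really be done with a regex? This article takes removing JS comments with regular expressions as its goal and works through it in practice, from the simple to the complex, solving problems as they come up, to see step by step whether regex alone can do it.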


1 Single-line comments

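A single-line comment either occupies a whole line or sits at the end of a line.

In the normal case this is not hard: match the comment with a regex and remove it with the replace method.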

let codes = `  
  let name = "Wmaker"; // This is name.  
  if (name) {  
    // Print name.  
    console.log("His name is:", name);  
  }  
`;  
 
 
console.log( codes.replace(/\/\/.*$/mg, '') );  
 
// Prints:
// let name = "Wmaker";   
// if (name) {  
//     
//   console.log("His name is:", name);  
// }

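The comments above were removed successfully, but a comment that occupies a whole line is not cleaned up thoroughly: it leaves a blank line behind. In fact, the whitespace in front of an end-of-line comment is kept as well. So let's raise the bar a little and clear that whitespace too. It isn't hard. The idea is roughly this: deleting a whole line really means deleting the newline at the end of that line or at the end of the previous line, and a newline is itself whitespace. So the regex only needs to match the comment together with all the whitespace in front of it, which kills two birds with one stone.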

let codes = `  
  let name = "Wmaker"; // This is name.  
  if (name) {  
    // Print name.  
    console.log("His name is:", name);  
  }  
`;   
 
console.log( codes.replace(/\s*\/\/.*$/mg, '') );  
 
// Prints:
// let name = "Wmaker";  
// if (name) {  
//   console.log("His name is:", name);  
// }

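If a complete URL appears inside a string, the regex above matches it and deletes everything from the // onward. Most solutions online exploit the shape of a URL (http://xxx), namely that the double slash is preceded by a colon, as a workaround. But that treats the symptom rather than the cause: // is free to appear in a string in any form, and we have no say in it.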
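To see why the colon heuristic only treats the symptom, here is a small sketch. The regex `colonHeuristic` and both sample strings are assumptions made up for this illustration; they are not taken from the original article.

```javascript
// Heuristic: only treat // as the start of a comment when it is not preceded by ":".
const colonHeuristic = /\s*(?<!:)\/\/.*$/mg;

// Made-up sample lines for the illustration.
const withUrl = 'let url = "http://example.com"; // home page';
const withPath = 'let path = "a//b"; // double slash inside a string';

const urlResult = withUrl.replace(colonHeuristic, '');
const pathResult = withPath.replace(colonHeuristic, '');

console.log(urlResult);  // the URL survives and the trailing comment is removed
console.log(pathResult); // the string is cut at the first //
```

The first line comes out intact, but the second is truncated to `let path = "a`, because nothing about a // inside quotes requires a colon in front of it.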

So the question becomes: how do we make the regex match only a double slash that lies outside quotes?

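Matching a quoted block that contains a double slash is fairly simple: /".*\/\/.*"/mg. The difficulty lies in the negation: once the regex has matched a double slash, how does it determine whether that slash is inside quotes? I racked my brain and searched online at length without finding anything usable. After calming down, washing my face, brushing my teeth, and rinsing my head, I concluded that the pure-regex road was a dead end and that I had to step outside that box.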

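At the last moment, a brilliant light suddenly shone above that squalid room. I hurriedly shielded my bloodshot eyes, waited for them to adjust, and looked closely. A line of text (in Chinese) had appeared there: My child, first replace the quoted strings that contain //, remove the comments, and then restore the strings. Wouldn't that do?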

let codes = `  
  let name = "Wmaker"; // This is name.  
  if (name) {  
    // Print name.  
    console.log("His name is:", name);  
    console.log("Unusual situation, characters of // in quotation marks.");  
  }  
`;   
 
// The previous approach.
console.log( codes.replace(/\s*\/\/.*$/mg, '') );
// Prints:
// let name = "Wmaker"; 
// if (name) {  
//   console.log("His name is:", name);  
//   console.log("Unusual situation, characters of  
// }   
 
// The current approach.
console.log( removeComments(codes) );
// Prints:
// let name = "Wmaker";  
// if (name) {  
//   console.log("His name is:", name);  
//   console.log("Unusual situation, characters of // in quotation marks.");  
// }  
 
function removeComments(codes) {  
  let {replacedCodes, matchedObj} = replaceQuotationMarksWithForwardSlash(codes);  
 
  replacedCodes = replacedCodes.replace(/\s*\/\/.*$/mg, '');  
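  // Restore in reverse order so that a short placeholder (...1) can never match inside a
  // longer one (...10); the function form keeps any "$" in the original string literal.
  Object.keys(matchedObj).reverse().forEach(k => {
    replacedCodes = replacedCodes.replace(k, () => matchedObj[k]);
  });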
 
  return replacedCodes;  
 
  function replaceQuotationMarksWithForwardSlash(codes) {  
    let matchedObj = {};  
    let replacedCodes = '';      
 
    let regQuotation = /".*\/\/.*"/mg;  
    let uniqueStr = 'QUOTATIONMARKS' + Math.floor(Math.random()*10000);  
 
    let index = 0;  
    replacedCodes = codes.replace(regQuotation, function(match) {  
      let s = uniqueStr + (index++);  
      matchedObj[s] = match;  
      return s;  
    });  
 
    return { replacedCodes, matchedObj };  
  }  
}

Yes, the goal was reached. Heaven was kind!

One more thing needs improving: there are three ways to delimit a string, ' " and `, and so far only double quotes are matched.

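To avoid the regex's memory (a regex object with the g flag keeps its lastIndex between calls), every test uses a fresh regex literal.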

--- Before

console.log( /".*\/\/.*"/mg.test(`'Unu//sual'`) ); // false  
console.log( /".*\/\/.*"/mg.test(`"Unu//sual"`) ); // true  
console.log( /".*\/\/.*"/mg.test(`\`Unu//sual\``) ); // false

--- After

console.log( /('|"|`).*\/\/.*\1/mg.test(`'Unu//sual'`) ); // true  
console.log( /('|"|`).*\/\/.*\1/mg.test(`"Unu//sual"`) ); // true  
console.log( /('|"|`).*\/\/.*\1/mg.test(`\`Unu//sual\``) ); // true
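The "memory" that the fresh literals avoid can be shown in isolation. This is a minimal sketch; the names `shared` and `sample` are made up for the illustration.

```javascript
// A regex object with the g flag remembers where its last match ended (lastIndex).
const shared = /('|"|`).*\/\/.*\1/mg;
const sample = '"Unu//sual"';

const first = shared.test(sample);        // matches from index 0
const indexAfterFirst = shared.lastIndex; // now points just past the match
const second = shared.test(sample);       // resumes from lastIndex, finds nothing, resets to 0

// A fresh literal is a new object each time, so it always starts from index 0.
const fresh = /('|"|`).*\/\/.*\1/mg.test(sample) && /('|"|`).*\/\/.*\1/mg.test(sample);

console.log(first, indexAfterFirst, second, fresh);
```

Reusing one g-flagged regex object across several test calls would therefore alternate between true and false on the same input, which is why the tests above are written with literals.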

Ah! The problem ends here!

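Does it really? No! I looked at the time, 02:17, took off my glasses, pulled out a tissue, and wiped away a few tears.

What follows are two problems solved one after another: greedy matching and escape characters.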

// --- STEP 1: caused by the regex's greedy matching.
{
  let codes = `
  let str = 'abc//abc'; // abc'
`;
  console.log( codes.match(/('|"|`).*\/\/.*\1/mg) ); // ["'abc//abc'; // abc'"]

  // -- Fix: make both quantifiers lazy.
  console.log( codes.match(/('|"|`).*?\/\/.*?\1/mg) ); // ["'abc//abc'"]
}

// --- STEP 2: caused by escape characters inside the string definition.
{
  let codes = `
  let str = 'http://x\\'x.com'; // 'acs
`;
  console.log( codes.match(/('|"|`).*?\/\/.*?\1/mg) ); // ["'http://x\'", "'; // '"]

  // -- Fix: ignore quotes that are preceded by a backslash.
  let reg = /(?<!\\)('|"|`).*?\/\/.*?(?<!\\)\1/mg;
  console.log( codes.match(reg) ); // ["'http://x\'x.com'"]
}
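One case the lookbehind does not cover is a string that ends in an escaped backslash, such as 'C:\\': the closing quote follows a backslash, yet it is not escaped. A commonly used alternative, sketched below, consumes escape sequences in pairs instead of looking behind. The `pairwise` pattern and the sample line are assumptions for illustration and are not part of the original article.

```javascript
// Sample source text, kept exactly as typed by String.raw:
//   let dir = 'C:\\'; // it's a path
const source = String.raw`let dir = 'C:\\'; // it's a path`;

// Lookbehind version from the step above: the real closing quote follows a
// backslash, so it is skipped and the match runs on to the apostrophe in "it's".
const lookbehind = /(?<!\\)('|"|`).*?(?<!\\)\1/smg;

// Pair-consuming version: inside the quotes accept either "backslash + any char"
// or any char that is neither a backslash nor the opening quote.
const pairwise = /('|"|`)(?:\\[\s\S]|(?!\1)[^\\])*\1/g;

const lookbehindMatches = source.match(lookbehind);
const pairwiseMatches = source.match(pairwise);

console.log(lookbehindMatches); // one match that runs into the comment
console.log(pairwiseMatches);   // one match that ends at the real closing quote
```

The pair-consuming form needs no lookbehind, but it still knows nothing about comments, so an apostrophe inside a comment can start a bogus string match in longer inputs.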

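At this point I was tired but felt some sense of achievement. After all, it worked.

But then, during testing, I stumbled on an obstacle I could not get past. It was like pouring endless effort and expense into a long pursuit, finally reaching the last step, and discovering you no longer have the strength to finish.

The regex treats a quote at any position as a place to start searching; it does not care that quotes come in pairs. Here is an example.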

let reg = /(?<!\\)('|"|`).*?\/\/.*?(?<!\\)\1/mg;  
let codes = `  
  let str = "abc"; // "  
`;  
console.log( codes.match(reg) ); // ['"abc"; // "']

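Still, the problem was finally solved at 06:37, after I had caught up on some sleep.

The idea is this: although I could not correctly match only the quoted blocks that contain // (there may be a way, but my ability is limited), simplifying the task to matching every quoted block is easy and can be done correctly, at the cost of a little more memory. In addition, a newline may appear between two quotes, so the s flag is added, which makes . match any character. Below is the final code for removing single-line comments.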

let codes = `  
  let name = "Wmaker"; // This is name.  
  let str = 'http://x\\'x.com' + " / / " + '/"/"/'; // '; // " "  
  if (name) {  
    // Print name.  
    console.log("His name is:", name);  
    console.log("Unusual situation, characters of // in quotation marks.");  
  } 
`;  
 
console.log(removeComments(codes));  
// Prints:
// let name = "Wmaker";  
// let str = 'http://x\'x.com' + " / / " + '/"/"/';  
// if (name) {  
//   console.log("His name is:", name);  
//   console.log("Unusual situation, characters of // in quotation marks.");  
// } 
 
function removeComments(codes) {  
  let {replacedCodes, matchedObj} = replaceQuotationMarksWithForwardSlash(codes);   
  replacedCodes = replacedCodes.replace(/\s*\/\/.*$/mg, '');  
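  // Restore in reverse order so that a short placeholder (...1) can never match inside a
  // longer one (...10); the function form keeps any "$" in the original string literal.
  Object.keys(matchedObj).reverse().forEach(k => {
    replacedCodes = replacedCodes.replace(k, () => matchedObj[k]);
  });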
 
  return replacedCodes;  
 
  function replaceQuotationMarksWithForwardSlash(codes) {  
    let matchedObj = {};  
    let replacedCodes = '';     
 
    let regQuotation = /(?<!\\)('|"|`).*?(?<!\\)\1/smg;  
    let uniqueStr = 'QUOTATIONMARKS' + Math.floor(Math.random()*10000);  
 
    let index = 0;  
    replacedCodes = codes.replace(regQuotation, function(match) {  
      let s = uniqueStr + (index++);  
      matchedObj[s] = match;  
      return s;  
    });  
 
    return { replacedCodes, matchedObj };  
  }  
}
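The effect of the s flag used in the quote-matching regex can be checked in isolation. The sketch below uses a made-up sample string (`text`) holding a template literal that spans two lines; it only matches once . is allowed to cross the newline.

```javascript
// Made-up sample: a template literal that spans two lines.
const text = 'let t = `line one\nline two`;';

// Without the s flag, "." stops at the newline, so the opening backtick never finds its partner.
const withoutS = text.match(/(?<!\\)('|"|`).*?(?<!\\)\1/mg);

// With the s flag, "." also matches the newline and the whole literal is captured.
const withS = text.match(/(?<!\\)('|"|`).*?(?<!\\)\1/smg);

console.log(withoutS); // null
console.log(withS);    // a single match covering both lines
```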

One last note: single- and double-quoted strings can also be written across several lines, but once parsed they are actually a single line.

let codes = "' \
  Wmaker \
'";
codes.match( /(?<!\\)('|"|`).*?(?<!\\)\1/smg ); // ["'   Wmaker '"]

2 Multi-line comments

Ah! With the hard part solved, we can now move ahead at a leisurely pace.

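Multi-line comments follow the same idea as single-line ones: just add one more alternative to the pattern used when deleting comments. The final code combining the two is as follows.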

let codes = `  
  let name = "Wmaker"; // This is name.  
  let str = 'http://x\\'x.com' + " / / " + '/"/"/'; // '; // " "  
  let str = 'http://x\\'x./*a*/com' + " / / " + '/"/"/'; // '; // "/*sad*/ "  
  if (name) {  
    // Print name.  
    /* Print name. */  
    console.log("His name is:", name);  
    console.log("Unusual situation, characters of // in quotation marks.");  
    /*  
     * Others test.  
     */  
    console.log("Unusual situation, characters of /* abc */ in quotation marks.");  
  }  
`;   
 
console.log(removeComments(codes));
// Prints:
// let name = "Wmaker";
// let str = 'http://x\'x.com' + " / / " + '/"/"/';
// if (name) {
//   console.log("His name is:", name);
//   console.log("Unusual situation, characters of // in quotation marks.");
//   console.log("Unusual situation, characters of /* abc */ in quotation marks.");
// }
// Note: the second `let str` line is missing from the output. The stray ' in the
// comment on the first `let str` line pairs with the opening ' on the next line
// (the s flag lets . cross the newline), so both lines collapse into one comment.
 
function removeComments(codes) {  
  let {replacedCodes, matchedObj} = replaceQuotationMarksWithForwardSlash(codes);  
 
  replacedCodes = replacedCodes.replace(/(\s*\/\/.*$)|(\s*\/\*[\s\S]*?\*\/)/mg, '');  
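  // Restore in reverse order so that a short placeholder (...1) can never match inside a
  // longer one (...10); the function form keeps any "$" in the original string literal.
  Object.keys(matchedObj).reverse().forEach(k => {
    replacedCodes = replacedCodes.replace(k, () => matchedObj[k]);
  });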
 
  return replacedCodes;  
  function replaceQuotationMarksWithForwardSlash(codes) {  
    let matchedObj = {};  
    let replacedCodes = '';      
 
    let regQuotation = /(?<!\\)('|"|`).*?(?<!\\)\1/smg;  
    let uniqueStr = 'QUOTATIONMARKS' + Math.floor(Math.random()*10000);  
 
    let index = 0;  
    replacedCodes = codes.replace(regQuotation, function(match) {
      let s = uniqueStr + (index++);
      matchedObj[s] = match;
      return s;
    });
    return { replacedCodes, matchedObj };  
  }  
}

3 Summary

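From the above we can conclude that regular expressions alone cannot reach the goal; they have to be combined with other operations. But does the current result really cover every case? Could there be other hidden problems, for example with multi-byte characters? As a programmer I have no shortage of confidence, but I have gradually come to understand my own limits. Judging from other material online, using UglifyJS, or removing comments during a proper parse, is the more reliable route. Still, when there is a chance to solve something yourself, there is no reason not to put in some effort and try!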
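For comparison, here is a minimal sketch of a different technique from the placeholder approach used in this article: a single regex whose alternatives match string literals and comments, scanned left to right, with a replacer that keeps strings and drops comments. Whichever token starts first wins, so a quote inside a comment or a // inside a string is consumed as part of the surrounding token. The function name `stripCommentsSinglePass` and the sample input are hypothetical, and the sketch still ignores regex literals and `${}` expressions inside template literals, so it is not a substitute for a real parser.

```javascript
// Hypothetical helper: one pass over the source, strings kept, comments dropped.
function stripCommentsSinglePass(code) {
  // Group 1: a quoted string, with escape sequences consumed in pairs.
  // Otherwise: a // comment or a /* */ comment, each with its leading whitespace.
  const token = /(("|'|`)(?:\\[\s\S]|(?!\2)[^\\])*\2)|\s*\/\/.*|\s*\/\*[\s\S]*?\*\//g;
  // When a comment alternative matched, group 1 is undefined and the match is removed.
  return code.replace(token, (match, str) => (str !== undefined ? str : ''));
}

// Made-up sample input for the illustration.
const singlePassSample = [
  'let a = "x // y"; // it\'s a comment',
  "let b = 'p /* q */'; /* block */",
  'let c = 1;',
].join('\n');

const singlePassResult = stripCommentsSinglePass(singlePassSample);
console.log(singlePassResult);
```

Because the apostrophe in "it's" sits inside a comment that has already been consumed, it can no longer pair with a quote further down, which is the failure mode the two-step placeholder approach is exposed to.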

Issue update log

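Thanks to the readers who kindly pointed out errors. I will list every issue here, whether or not it can be fixed, and will only update the code in the two examples below.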

1. Escape characters in regex literals were not taken into account.

Failing example: var reg=/a\//;.

Fix: change the comment-removal regex to: /(\s*(?<!\\)\/\/.*$)|(\s*(?<!\\)\/\*[\s\S]*?(?<!\\)\*\/)/mg.
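The difference between the two comment-removal regexes can be seen on the failing example itself. This is a small sketch; the variable names and the trailing comment in the sample line are made up for the illustration.

```javascript
// The comment-removal regex before and after the fix.
const beforeFix = /(\s*\/\/.*$)|(\s*\/\*[\s\S]*?\*\/)/mg;
const afterFix = /(\s*(?<!\\)\/\/.*$)|(\s*(?<!\\)\/\*[\s\S]*?(?<!\\)\*\/)/mg;

// String.raw keeps the backslash in the sample line exactly as typed.
const regexLine = String.raw`var reg=/a\//; // regex literal`;

const beforeResult = regexLine.replace(beforeFix, '');
const afterResult = regexLine.replace(afterFix, '');

console.log(beforeResult); // the regex literal is cut at its escaped slash
console.log(afterResult);  // the regex literal survives, only the comment is removed
```

The lookbehind only helps when the // inside the regex literal happens to follow a backslash; it does not make the regex aware of regex literals in general.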

Here is the code for use in a front-end page, together with an example.

<!DOCTYPE html>  
<html> 
 
<head>  
  <meta charset="UTF-8">  
  <title>Remove Comments</title>  
</head>  
 
<body>  
  <p>Input:</p>
  <textarea id="input" cols="100" rows="12"></textarea>  
 
  <br /><br />  
  <button onclick="transform()">Transform</button>
 
  <p>Output:</p>
  <textarea id="output" cols="100" rows="12"></textarea>    
 
  <script>  
    let input = document.querySelector('#input');  
    let output = document.querySelector('#output');  
 
    setDefaultValue();  
 
    function transform() {  
      output.value = removeComments(input.value);  
    } 
 
    function removeComments(codes) {  
      let {replacedCodes, matchedObj} = replaceQuotationMarksWithForwardSlash(codes);  
 
      replacedCodes = replacedCodes.replace(/(\s*(?<!\\)\/\/.*$)|(\s*(?<!\\)\/\*[\s\S]*?(?<!\\)\*\/)/mg, '');  
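      // Restore in reverse order so that a short placeholder (...1) can never match inside a
      // longer one (...10); the function form keeps any "$" in the original string literal.
      Object.keys(matchedObj).reverse().forEach(k => {
        replacedCodes = replacedCodes.replace(k, () => matchedObj[k]);
      });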
 
      return replacedCodes;  
 
      function replaceQuotationMarksWithForwardSlash(codes) {  
        let matchedObj = {};  
        let replacedCodes = '';          
 
        let regQuotation = /(?<!\\)('|"|`).*?(?<!\\)\1/smg;  
        let uniqueStr = 'QUOTATIONMARKS' + Math.floor(Math.random()*10000);  
 
        let index = 0;  
        replacedCodes = codes.replace(regQuotation, function(match) {  
          let s = uniqueStr + (index++);  
          matchedObj[s] = match;  
          return s;  
        });  
 
        return { replacedCodes, matchedObj };  
      }  
    }  
 
    function setDefaultValue() {  
      input.value = `let name = "Wmaker"; // This is name.  
let str = 'http://x\\'x.com' + " / / " + '/"/"/'; // '; // " "  
let str = 'http://x\\'x./*a*/com' + " / / " + '/"/"/'; // '; // "/*sad*/ "  
if (name) {  
  // Print name.  
  /* Print name. */  
  console.log("His name is:", name);  
  console.log("Unusual situation, characters of // in quotation marks.");  
  /*  
   * Others test.  
   */  
  console.log("Unusual situation, characters of /* abc */ in quotation marks."); 
 }  
`;  
    }  
  </script>  
</body>  
</html>

Here is the code for use in Node, together with an example. Run it with: node <script file> <source file> <output file>.

const fs = require('fs');  
const path = require('path');  
const process = require('process');  
 
let sourceFile = process.argv[2];  
let targetFile = process.argv[3];  
if (!sourceFile || !targetFile) {  
  throw new Error('Please set source file and target file.');  
} 
 
sourceFile = path.resolve(__dirname, sourceFile);  
targetFile = path.resolve(__dirname, targetFile);  
 
fs.readFile(sourceFile, 'utf8', (err, data) => {
  if (err) throw err;
  fs.writeFile(targetFile, removeComments(data), 'utf8', (err) => {
    if (err) throw err;
    console.log('Remove Comments Done!');
  });
});
 
function removeComments(codes) {  
  let {replacedCodes, matchedObj} = replaceQuotationMarksWithForwardSlash(codes);  
 
  replacedCodes = replacedCodes.replace(/(\s*(?<!\\)\/\/.*$)|(\s*(?<!\\)\/\*[\s\S]*?(?<!\\)\*\/)/mg, '');  
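  // Restore in reverse order so that a short placeholder (...1) can never match inside a
  // longer one (...10); the function form keeps any "$" in the original string literal.
  Object.keys(matchedObj).reverse().forEach(k => {
    replacedCodes = replacedCodes.replace(k, () => matchedObj[k]);
  });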
  
  return replacedCodes; 
 
  function replaceQuotationMarksWithForwardSlash(codes) {  
    let matchedObj = {};  
    let replacedCodes = '';       
 
    let regQuotation = /(?<!\\)('|"|`).*?(?<!\\)\1/smg;  
    let uniqueStr = 'QUOTATIONMARKS' + Math.floor(Math.random()*10000);  
 
    let index = 0;  
    replacedCodes = codes.replace(regQuotation, function(match) {  
      let s = uniqueStr + (index++);  
      matchedObj[s] = match;  
      return s;  
    });  
 
    return { replacedCodes, matchedObj };  
  }  
}

Editor: 龐桂玉  Source: segmentfault
