An Overview of Open-Source Python Development

Author: Anonymous 2010-03-09 17:23:12

Development, Backend

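If you want a better understanding of open-source Python, read the following article carefully; we hope it helps you in your future open-source Python development.

Open-source Python is well worth developing for. Only open source lets more people fill in the technical gaps, which keeps open-source Python innovating as it develops and lets you discover more techniques in open-source projects.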

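1. Writing a spider in Python to crawl web pages is remarkably simple with the urllib library. For parsing the pages there is also a corresponding library, sgmllib. However, I don't yet know whether Python's sgmllib can clean up HTML the way JTidy does, or whether another library handles that.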

One of the better-known projects:

HarvestMan: http://code.google.com/p/harvestman-crawler/

HarvestMan is a modular, extensible and flexible web crawler program cum framework written in pure Python. HarvestMan can be used to download files from websites according to a number of customized rules and constraints. It can be used to find information from websites matching keywords or regular expressions.

The final goal of the project is to develop a full-fledged semantic personal data mining platform which can be used to retrieve information from the Internet in a highly customizable manner, so that one can fetch information from the web the way he wants it, when he wants it. For this, HarvestMan project will provide support for Web 2.0 and 3.0 technologies such as RSS, RDF, OWL etc. (That is quite an ambitious goal; it would be really impressive if they can actually pull it off.)

In addition, there are some smaller projects that you can find by searching Google Code or sourceforge.net.
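
The spider described in item 1 can be sketched in a few lines. This is only a minimal sketch, not the approach used by HarvestMan or any other project named above. It assumes Python 3, where `urllib` is split into `urllib.request` and `urllib.parse` and `sgmllib` no longer exists, so the standard `html.parser` module is swapped in for link parsing. The names `LinkExtractor`, `extract_links`, and `fetch_links` are hypothetical, chosen for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkExtractor(HTMLParser):
    """Collect the absolute URL of every <a href="..."> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Parse an HTML string and return the links it contains, in order."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


def fetch_links(url, timeout=10):
    """Download one page with urllib and return its outgoing links."""
    with urlopen(url, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return extract_links(html, url)
```

A real spider would add a queue of pending URLs, a set of already-visited URLs, and politeness rules on top of `fetch_links`. Note that `html.parser` tolerates malformed markup but does not rewrite it into clean HTML the way JTidy does, so the tidy-up question raised in item 1 remains open here.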

2. For working with PDF files, C++, C#, and Java all have open-source class libraries available, such as pdflib, itext, pdfclown, and pdfbox. They can parse PDF files and convert between PDF and formats such as RTF, HTML, and XML.

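Today I found a Python library for working with PDF files: pdfminer.

I don't know whether there are other libraries. I hope more experienced readers can add to this.

3. With a PDF library in hand, it is easy to extract targeted content from PDF files.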
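
As a rough illustration of the targeted extraction mentioned in item 3, the sketch below pulls the text out of a PDF and keeps only the lines that match a pattern. It assumes the maintained `pdfminer.six` fork is installed (`pip install pdfminer.six`); its `pdfminer.high_level.extract_text` function may differ from the 2010 pdfminer API the article refers to. The helper names `pdf_to_text` and `find_lines` are hypothetical.

```python
import re


def pdf_to_text(path):
    """Return all text in the PDF at `path` (assumes pdfminer.six is installed)."""
    # Imported lazily so find_lines stays usable without pdfminer.six.
    from pdfminer.high_level import extract_text
    return extract_text(path)


def find_lines(text, pattern):
    """Return the stripped lines of `text` that match the regex `pattern`."""
    regex = re.compile(pattern)
    return [line.strip() for line in text.splitlines() if regex.search(line)]
```

For example, with a hypothetical file `report.pdf`, the call `find_lines(pdf_to_text("report.pdf"), r"Invoice\s+No")` would return every line that mentions an invoice number. Keeping the regex step separate from the PDF step means the filtering logic can be tested on plain strings.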

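There is also an open, collaborative community around open-source Python. It introduces technical knowledge about the Python language and how to use it, promotes the application, adoption, and learning of open-source Python in China, and shares Python experience, knowledge, and tips.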

Editor: Anonymous
