How to Use Python to Find a Girl You Like

Author: Python綠色通道 2018-05-13 21:34:00

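I previously wrote an article about scraping girls' profile data. It mainly used selenium to simulate browser actions, let the page load dynamically, and then extracted the data with xpath, but that approach is not very efficient. So today I'm adding a more efficient way to get the data. Nothing is being simulated and everything is under our direct control, so we can fetch the data without even opening the page!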

First, the result: no pic, no proof!

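We do need to analyze the page first. Open http://www.lovewzly.com/jiaoyou.html, press F12, and switch to the Network tab.

Once the filter conditions are set, the only part of the url that changes is page, and it goes up one page at a time. Opening this url directly in a browser returns a batch of JSON strings, so we can work on that JSON data directly and then store it!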
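
The observation that only page changes can be sketched as a small URL builder. This is a minimal illustration, assuming Python 3 (the article's own code targets Python 2); the endpoint is the one used in the article's code, while `build_search_url` and the filter values are hypothetical.

```python
from urllib.parse import urlencode

# Endpoint as it appears in the article's code.
SEARCH_URL = 'http://www.lovewzly.com/api/user/pc/list/search'


def build_search_url(page, filters):
    """Return the search URL for one page; `filters` stays fixed across pages."""
    params = {'page': page}
    params.update(filters)
    return SEARCH_URL + '?' + urlencode(params)


# Hypothetical filter values, for illustration only.
example_filters = {'gender': 2, 'marry': 1}
for page in (1, 2, 3):
    print(build_search_url(page, example_filters))
```

Each printed URL differs only in its page value, which is what makes a simple counter enough to walk through the results.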

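Code structure diagram:

Workflow:

Build the headers with a Referer (to get past hotlink protection) and a User-Agent (to mimic a browser). Writing them up front avoids problems later!
Assemble the query conditions
Remember to convert the response to JSON
Extract the fields from the JSON data
Write the extracted data to files or other storage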
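
The first few workflow steps might look like this in outline. It is a sketch assuming Python 3's `urllib.request` rather than the article's Python 2 `urllib2`; `build_request` and `extract_records` are hypothetical helper names, the User-Agent string is shortened, and the sample response is made up to match the shape the article's parser reads (`data` then `list`). No request is actually sent.

```python
import json
import urllib.request

REFERER = 'http://www.lovewzly.com/jiaoyou.html'


def build_request(url):
    """Attach the Referer and User-Agent headers before sending anything."""
    headers = {
        'Referer': REFERER,
        # Shortened browser string, for illustration only.
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64)',
    }
    return urllib.request.Request(url, headers=headers)


def extract_records(raw_json):
    """Turn the raw response body into a list of person dicts."""
    payload = json.loads(raw_json)
    return (payload.get('data') or {}).get('list') or []


# Made-up response body in the shape the article's parser expects.
sample = '{"data": {"list": [{"username": "demo", "city": "Shenzhen"}]}}'
req = build_request('http://www.lovewzly.com/api/user/pc/list/search?page=1')
print(req.get_header('Referer'))
print(extract_records(sample))
```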

Main techniques covered:

requests + urllib
Working with Excel
File operations
Strings
Exception handling
Other basics

Requesting the data:

import json
import urllib
import urllib2

def craw_data(self):
    '''Fetch the data page by page.'''
    headers = {
        # Referer gets past hotlink protection; User-Agent mimics a browser
        'Referer': 'http://www.lovewzly.com/jiaoyou.html',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.104 Safari/537.36 Core/1.53.4620.400 QQBrowser/9.7.13014.400'
    }
    page = 1
    while True:
        # Only `page` changes between requests; the filters stay fixed
        query_data = {
            'page': page,
            'gender': self.gender,
            'starage': self.stargage,
            'endage': self.endgage,
            'stratheight': self.startheight,
            'endheight': self.endheight,
            'marry': self.marry,
            'salary': self.salary,
        }
        url = 'http://www.lovewzly.com/api/user/pc/list/search?' + urllib.urlencode(query_data)
        print url
        req = urllib2.Request(url, headers=headers)
        response = urllib2.urlopen(req).read()
        # Stop once a page comes back without records, otherwise the loop never ends
        if not (json.loads(response).get('data') or {}).get('list'):
            print 'All data has been fetched'
            break
        self.parse_data(response)
        page += 1
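
A page-by-page loop like this needs a stopping condition. The sketch below shows one way to structure it, assuming Python 3; `crawl_all`, `fetch_page`, `max_pages` and `fake_fetch` are hypothetical names, and the fake fetcher stands in for the real HTTP call so the loop can be tried offline.

```python
import json


def crawl_all(fetch_page, max_pages=1000):
    """Collect records page by page until a page comes back empty.

    `fetch_page` is a hypothetical callable: page number in, raw JSON text out.
    `max_pages` is a safety cap so a misbehaving server cannot loop us forever.
    """
    records = []
    for page in range(1, max_pages + 1):
        payload = json.loads(fetch_page(page))
        persons = (payload.get('data') or {}).get('list')
        if not persons:
            break  # no more data: stop instead of requesting forever
        records.extend(persons)
    return records


def fake_fetch(page):
    """Stand-in for the real HTTP call: two pages of data, then nothing."""
    pages = {
        1: [{'username': 'a'}, {'username': 'b'}],
        2: [{'username': 'c'}],
    }
    return json.dumps({'data': {'list': pages.get(page)}})


print(len(crawl_all(fake_fetch)))
```

Passing the fetcher in as an argument keeps the loop logic separate from the network code, so the stopping behavior can be checked without touching the site.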

Field extraction:

import datetime
import json

def parse_data(self, response):
    '''Parse the JSON response and store each person's record.'''
    # `data` may be missing or null once the pages run out
    persons = (json.loads(response).get('data') or {}).get('list')
    if not persons:
        print 'All data has been fetched'
        return

    # Use the current year so the computed ages stay correct
    this_year = datetime.date.today().year
    for person in persons:
        nick = person.get('username')
        gender = person.get('gender')
        age = this_year - int(person.get('birthdayyear'))
        address = person.get('city')
        heart = person.get('monolog')
        height = person.get('height')
        img_url = person.get('avatar')
        education = person.get('education')
        print nick, age, height, address, heart, education
        self.store_info(nick, age, height, address, heart, education, img_url)
        self.store_info_execl(nick, age, height, address, heart, education, img_url)
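
The field extraction can also be written as a small pure function, which makes the age calculation easy to check. This is a sketch assuming Python 3; `parse_person` and its `this_year` parameter are hypothetical, the JSON keys are the ones the article's parser reads, and the sample record is made up.

```python
import datetime
import json


def parse_person(person, this_year=None):
    """Pick out the fields the article stores and derive the age."""
    if this_year is None:
        this_year = datetime.date.today().year
    return {
        'nick': person.get('username'),
        'age': this_year - int(person['birthdayyear']),
        'height': person.get('height'),
        'address': person.get('city'),
        'heart': person.get('monolog'),
        'education': person.get('education'),
        'img_url': person.get('avatar'),
    }


# Made-up record using the field names from the article's parser.
sample = json.loads(
    '{"username": "demo", "birthdayyear": "1994", "city": "Shenzhen", "height": "165"}'
)
print(parse_person(sample, this_year=2018))
```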

Saving to files:

import os
import urllib

def store_info(self, nick, age, height, address, heart, education, img_url):
    '''
    Save the photo and the person's monolog.
    '''
    # Bucket by age; the bucket name becomes the image sub-folder
    if age < 22:
        tag = u'under_22'
    elif age < 28:
        tag = u'22-28'
    elif age < 32:
        tag = u'28-32'
    else:
        tag = u'32_and_over'
    filename = u'age{}_height{}_edu{}_{}_{}.jpg'.format(age, height, education, address, nick)

    try:
        # Build the image directory path
        image_path = u'E:/store/pic/{}'.format(tag)
        # Create the folder if it does not exist yet
        if not os.path.exists(image_path):
            os.makedirs(image_path)
            print image_path + u' created'

        # Images must be written in binary mode
        with open(image_path + '/' + filename, 'wb') as f:
            f.write(urllib.urlopen(img_url).read())

        txt_path = u'E:/store/txt'
        txt_name = u'monolog.txt'
        # Create the folder if it does not exist yet
        if not os.path.exists(txt_path):
            os.makedirs(txt_path)
            print txt_path + u' created'

        # Append the monolog as UTF-8 bytes, one entry per line
        with open(txt_path + '/' + txt_name, 'a') as f:
            f.write((heart or u'').encode('utf-8') + '\n')
    except Exception as e:
        # Report the failure instead of silently discarding it
        print e
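
The age bucketing and the text-file step can be tried on their own. The sketch below assumes Python 3; `age_tag` and `save_monolog` are hypothetical helpers, the folder and file names are illustrative, and a temporary directory stands in for the article's E:/store folder so nothing is written to a fixed path.

```python
import os
import tempfile


def age_tag(age):
    """Map an age to the folder name used for grouping photos."""
    if age < 22:
        return 'under_22'
    if age < 28:
        return '22-28'
    if age < 32:
        return '28-32'
    return '32_and_over'


def save_monolog(base_dir, text):
    """Append one monolog per line as UTF-8, creating the folder if needed."""
    txt_dir = os.path.join(base_dir, 'txt')
    os.makedirs(txt_dir, exist_ok=True)
    path = os.path.join(txt_dir, 'monolog.txt')
    with open(path, 'a', encoding='utf-8') as f:
        f.write((text or '') + '\n')
    return path


# A temporary directory stands in for the article's E:/store folder.
base = tempfile.mkdtemp()
print(age_tag(25), save_monolog(base, 'hello'))
```

Opening the file with an explicit encoding avoids depending on the platform default, which matters when the monolog text is not ASCII.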

Excel operations:

def store_info_execl(self, nick, age, height, address, heart, education, img_url):
    # One spreadsheet row; self.count doubles as the record number
    person = [
        self.count,
        nick,
        u'female' if self.gender == 2 else u'male',
        age,
        height,
        address,
        education,
        heart,
        img_url,
    ]

    # Write each value into its column on row self.count
    for j, value in enumerate(person):
        self.sheetInfo.write(self.count, j, value)

    # NOTE: write()/save() here match the xlwt API. If self.f is an xlwt
    # Workbook, the file is in .xls format despite the .xlsx name.
    self.f.save(u'我主良緣.xlsx')
    print 'Inserted {} records'.format(self.count)
    self.count += 1
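
The row layout can be illustrated without a spreadsheet library. This sketch swaps in the standard library's `csv` module in place of the article's Excel writer, purely so it runs with no third-party packages; it assumes Python 3, and `HEADER`, `build_row`, `write_rows` and the sample record are all hypothetical.

```python
import csv
import io

# Hypothetical column names, in the same order as the article's row.
HEADER = ['no', 'nick', 'gender', 'age', 'height', 'address',
          'education', 'monolog', 'avatar']


def build_row(count, gender_code, nick, age, height, address,
              education, heart, img_url):
    """Assemble one row in the same column order as the article's Excel writer."""
    gender = 'female' if gender_code == 2 else 'male'
    return [count, nick, gender, age, height, address, education, heart, img_url]


def write_rows(rows, out):
    """Write the header once, then every row, to any file-like object."""
    writer = csv.writer(out)
    writer.writerow(HEADER)
    writer.writerows(rows)


# Made-up record; an in-memory buffer stands in for a real file.
buf = io.StringIO()
row = build_row(1, 2, 'demo', 24, 165, 'Shenzhen', 'Bachelor', 'hi',
                'http://example.com/a.jpg')
write_rows([row], buf)
print(buf.getvalue())
```

Collecting rows and writing them in one pass also avoids saving the whole file once per record.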

Finally, the results!

Source code: https://github.com/pythonchannel/python27/blob/master/test/meizhi.py

Editor: 武曉燕 Source: 養碼場
