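Multithreading and Multiprocessing: Eight Introductory Guides to Concurrent Programming in Python
As computer hardware has advanced, and multi-core processors in particular have become common, using system resources effectively has become an important problem in software development. Concurrent programming addresses it by letting a program make progress on several tasks at once, which can improve overall performance. This article covers the basic concepts of concurrency, Python's concurrency mechanisms, and how to use multithreading and multiprocessing to make programs more efficient.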
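1. What is concurrency?
Concurrency is the ability of multiple tasks or programs to appear to run at the same time. Parallelism is the stricter case in which tasks really do execute at the same instant on different cores. On multi-core processors, concurrency lets a program use system resources more efficiently.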
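2. The GIL (Global Interpreter Lock) in Python
Python (specifically CPython, the standard implementation) has a mechanism called the Global Interpreter Lock (GIL), which ensures that only one thread executes Python bytecode at any given time. This keeps the interpreter simple and costs little on a single-core processor, but on multi-core processors it can limit the performance of CPU-bound threads.
# Example: how the GIL affects thread execution
import threading
import time
def count(n):
    """Pure-Python busy loop: CPU-bound work that holds the GIL."""
    while n > 0:
        n -= 1
thread1 = threading.Thread(target=count, args=(100000000,))
thread2 = threading.Thread(target=count, args=(100000000,))
start_time = time.perf_counter()
thread1.start()
thread2.start()
thread1.join()
thread2.join()
end_time = time.perf_counter()
print(f"Time taken: {end_time - start_time:.2f} seconds")
Output: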
Time taken: 2.07 seconds
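Even though two threads are running, the GIL prevents them from executing Python bytecode in parallel, so a CPU-bound workload like this gains little or nothing from threading. The exact time depends on the machine; the comparison that matters is against running the same two calls one after the other.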
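To make that comparison concrete, the following is a minimal sketch that times the same work run sequentially and in two threads. The helper names time_sequential and time_threaded and the workload size N are assumptions made for illustration, not part of the original example. On a standard CPython build the two timings are usually similar; interpreters without a GIL may behave differently.

```python
import threading
import time


def count(n):
    """CPU-bound busy loop in pure Python."""
    while n > 0:
        n -= 1


def time_sequential(n):
    """Run count(n) twice, one call after the other; return elapsed seconds."""
    start = time.perf_counter()
    count(n)
    count(n)
    return time.perf_counter() - start


def time_threaded(n):
    """Run count(n) in two threads at once; return elapsed seconds."""
    threads = [threading.Thread(target=count, args=(n,)) for _ in range(2)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    N = 2_000_000  # assumed workload size; adjust for your machine
    print(f"Sequential: {time_sequential(N):.2f} s")
    print(f"Threaded:   {time_threaded(N):.2f} s")
```

If the threaded time is close to the sequential time, the threads were taking turns rather than running side by side.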
3. Multithreading basics
Multithreading is one way to achieve concurrency, and it is well suited to I/O-bound tasks.
# Example: a simple multithreaded application
import threading
import time
def worker(num):
    """Task executed by each thread."""
    print(f"Thread {num}: starting")
    time.sleep(2)
    print(f"Thread {num}: finishing")
threads = []
for i in range(5):
    t = threading.Thread(target=worker, args=(i,))
    threads.append(t)
    t.start()
# Wait for all threads to finish
for t in threads:
    t.join()
Output:
Thread 0: starting
Thread 1: starting
Thread 2: starting
Thread 3: starting
Thread 4: starting
Thread 0: finishing
Thread 1: finishing
Thread 2: finishing
Thread 3: finishing
Thread 4: finishing
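The five threads start one after another. Because time.sleep releases the GIL while a thread waits, all five waits overlap, and the whole run takes about 2 seconds rather than 10. This is why threads suit I/O-bound work even though the GIL stops them from running Python bytecode in parallel. The order of the "finishing" lines can vary between runs.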
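The overlap of blocking waits can be measured directly. The sketch below is an illustration under stated assumptions: run_threads is a hypothetical helper, and the short 0.2 second delay stands in for a real I/O wait such as a network request.

```python
import threading
import time


def worker(delay):
    """Simulated I/O wait; time.sleep releases the GIL while it blocks."""
    time.sleep(delay)


def run_threads(count, delay):
    """Start `count` sleeping threads and return the total elapsed seconds."""
    threads = [threading.Thread(target=worker, args=(delay,)) for _ in range(count)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = run_threads(count=5, delay=0.2)
    print(f"5 threads, each sleeping 0.2 s, took {elapsed:.2f} s in total")
```

The total should be close to a single delay rather than five times the delay, because the threads wait at the same time.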
4. Simplifying multithreading with concurrent.futures
The concurrent.futures module provides a high-level interface for executing callables asynchronously.
from concurrent.futures import ThreadPoolExecutor
import time
def task(n):
    print(f"Task {n} is running")
    time.sleep(2)
    return f"Task {n} finished"
with ThreadPoolExecutor(max_workers=5) as executor:
    futures = [executor.submit(task, i) for i in range(5)]
    for future in futures:
        print(future.result())
Output:
Task 0 is running
Task 1 is running
Task 2 is running
Task 3 is running
Task 4 is running
Task 0 finished
Task 1 finished
Task 2 finished
Task 3 finished
Task 4 finished
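This example uses ThreadPoolExecutor to simplify thread management and submits tasks with the submit method. Each call returns a Future, and future.result() blocks until that task's return value is available.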
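Iterating over the futures in submission order means a slow early task delays the results of faster later ones. As a possible alternative, the sketch below uses as_completed from the same module to handle results in the order they finish. The function run_in_completion_order and the delay values are assumptions chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import time


def task(n, delay):
    """Sleep for `delay` seconds, then return the task number."""
    time.sleep(delay)
    return n


def run_in_completion_order(delays):
    """Submit one task per delay and collect results as each one finishes."""
    finished = []
    with ThreadPoolExecutor(max_workers=len(delays)) as executor:
        futures = [executor.submit(task, i, d) for i, d in enumerate(delays)]
        for future in as_completed(futures):
            finished.append(future.result())
    return finished


if __name__ == "__main__":
    # Task 0 is the slowest, so it is expected to be reported last.
    print(run_in_completion_order([0.5, 0.1, 0.3]))
```

This pattern is useful when each result can be processed independently as soon as it arrives.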
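5. Multiprocessing basics
Multiprocessing sidesteps the GIL and makes truly parallel computation possible.
# Example: a simple multiprocess application
from multiprocessing import Process
import time
def process_task(num):
    """Task executed by each process."""
    print(f"Process {num}: starting")
    time.sleep(2)
    print(f"Process {num}: finishing")
# The __main__ guard is needed where child processes are spawned
# (the default on Windows and macOS), because they re-import this module.
if __name__ == "__main__":
    processes = []
    for i in range(5):
        p = Process(target=process_task, args=(i,))
        processes.append(p)
        p.start()
    # Wait for all processes to finish
    for p in processes:
        p.join()
Output: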
Process 0: starting
Process 1: starting
Process 2: starting
Process 3: starting
Process 4: starting
Process 0: finishing
Process 1: finishing
Process 2: finishing
Process 3: finishing
Process 4: finishing
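The five processes start at almost the same time. Each process has its own interpreter and its own GIL, so CPU-bound work in different processes can run in parallel on separate cores. The output order may vary between runs.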
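Because the example above only sleeps, it does not show the benefit for CPU-bound work. The following is a rough sketch that runs a busy loop in two processes and times it. The helper time_processes and the workload size N are assumptions for illustration, and the measured time also includes the cost of starting the processes.

```python
from multiprocessing import Process
import time


def count(n):
    """CPU-bound busy loop in pure Python."""
    while n > 0:
        n -= 1


def time_processes(n, workers=2):
    """Run count(n) in `workers` processes at once; return elapsed seconds."""
    processes = [Process(target=count, args=(n,)) for _ in range(workers)]
    start = time.perf_counter()
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    N = 2_000_000  # assumed workload size; adjust for your machine
    print(f"Two processes: {time_processes(N):.2f} s")
```

On a machine with at least two free cores, a large enough workload should finish in roughly the time of one call rather than two.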
6. Simplifying multiprocessing with multiprocessing.Pool
multiprocessing.Pool provides a simple way to run tasks in parallel across a pool of worker processes.
from multiprocessing import Pool
import time
def pool_task(n):
    print(f"Task {n} is running")
    time.sleep(2)
    return f"Task {n} finished"
if __name__ == "__main__":
    with Pool(processes=5) as pool:
        results = pool.map(pool_task, range(5))
    for result in results:
        print(result)
Output:
Task 0 is running
Task 1 is running
Task 2 is running
Task 3 is running
Task 4 is running
Task 0 finished
Task 1 finished
Task 2 finished
Task 3 finished
Task 4 finished
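This code shows how to use Pool to run tasks in parallel and collect the results. pool.map returns the results in the same order as the inputs.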
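A pool is most useful for CPU-bound functions. The sketch below applies a hypothetical CPU-bound function, sum_of_squares, to several inputs with pool.map; the function and the input values are assumptions chosen for illustration.

```python
from multiprocessing import Pool


def sum_of_squares(n):
    """CPU-bound stand-in: the sum of i*i for i in range(n)."""
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    inputs = [10, 100, 1000]
    with Pool(processes=3) as pool:
        # Results come back in the same order as the inputs.
        results = pool.map(sum_of_squares, inputs)
    for n, total in zip(inputs, results):
        print(f"sum_of_squares({n}) = {total}")
```

The worker function is defined at module level so that it can be sent to the worker processes.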
7. Inter-process communication
In multiprocess programs, processes often need to share data or coordinate their actions. Python provides several mechanisms for inter-process communication, such as pipes and queues.
(1) Communicating through a pipe
A pipe is a simple and effective way for two processes to communicate.
from multiprocessing import Process, Pipe
def send_message(conn, message):
    conn.send(message)
    conn.close()
def receive_message(conn):
    # recv() blocks until a message arrives
    print(f"Received message: {conn.recv()}")
if __name__ == "__main__":
    parent_conn, child_conn = Pipe()
    sender = Process(target=send_message, args=(child_conn, "Hello from child!"))
    receiver = Process(target=receive_message, args=(parent_conn,))
    sender.start()
    receiver.start()
    sender.join()
    receiver.join()
Output:
Received message: Hello from child!
In this example we create a pipe and use its two ends in the sender and receiver processes to send and receive a message.
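A pipe created with Pipe() is two-way by default, so one connection can carry both requests and replies. The following sketch illustrates this with a hypothetical echo_worker that answers each message in upper case; using None as the stop signal is a convention assumed for this example only.

```python
from multiprocessing import Pipe, Process


def echo_worker(conn):
    """Reply to each message in upper case until None arrives."""
    while True:
        message = conn.recv()
        if message is None:  # None is the agreed stop signal in this sketch
            break
        conn.send(message.upper())
    conn.close()


if __name__ == "__main__":
    parent_conn, child_conn = Pipe()  # duplex by default: both ends send and receive
    worker = Process(target=echo_worker, args=(child_conn,))
    worker.start()
    for text in ["hello", "pipe"]:
        parent_conn.send(text)
        print(parent_conn.recv())
    parent_conn.send(None)
    worker.join()
```

Each send is matched by one recv on the other end, which keeps the two processes in step.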
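(2) Communicating through a queue
A queue is a more general mechanism that can support multiple producers and consumers.
from multiprocessing import Process, Queue
import time
SENTINEL = None  # marker telling the consumer that no more items will arrive
def put_items(queue):
    items = ['item1', 'item2', 'item3']
    for item in items:
        queue.put(item)
        time.sleep(1)
    queue.put(SENTINEL)  # signal that production is finished
def get_items(queue):
    while True:
        item = queue.get()  # blocks until an item is available
        if item is SENTINEL:
            break
        print(f"Received: {item}")
if __name__ == "__main__":
    queue = Queue()
    producer = Process(target=put_items, args=(queue,))
    consumer = Process(target=get_items, args=(queue,))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
Output:
Received: item1
Received: item2
Received: item3
This example shows producer-consumer communication through a queue. The consumer blocks on queue.get() and stops when it receives the sentinel value. Polling queue.empty() instead is unreliable across processes: the consumer could see an empty queue and exit before the producer has put anything into it.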
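Since a queue can serve several consumers, the same pattern extends to a small group of workers. The sketch below is one way this might look: square_worker, the two-queue layout, and the use of None as a stop signal are all assumptions for illustration. Each consumer needs its own stop signal, and the results are read before the workers are joined.

```python
from multiprocessing import Process, Queue


def square_worker(tasks, results):
    """Take numbers from `tasks` and put their squares on `results` until None arrives."""
    while True:
        n = tasks.get()
        if n is None:  # stop signal assumed for this sketch
            break
        results.put(n * n)


if __name__ == "__main__":
    tasks, results = Queue(), Queue()
    workers = [Process(target=square_worker, args=(tasks, results)) for _ in range(2)]
    for w in workers:
        w.start()
    numbers = [1, 2, 3, 4]
    for n in numbers:
        tasks.put(n)
    for _ in workers:
        tasks.put(None)  # one stop signal per consumer
    # Read all results before joining so no worker is left holding unsent data.
    squares = sorted(results.get() for _ in numbers)
    for w in workers:
        w.join()
    print(squares)
```

The results are sorted because the order in which workers finish individual items is not fixed.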
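8. Practical example: downloading images in parallel
Suppose we need to download a large number of images from the web and save them to the local file system. We can use multithreading or multiprocessing to speed up the downloads.
(1) Defining the download function
First we define a function that downloads the image at a given URL and saves it locally.
import requests
import os
def download_image(url, filename):
    # A timeout keeps a stalled connection from blocking the worker forever
    response = requests.get(url, timeout=10)
    if response.status_code == 200:
        with open(filename, 'wb') as file:
            file.write(response.content)
        print(f"Downloaded {filename}")
    else:
        print(f"Failed to download {url}")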
(2) Downloading with multiple threads
Next, we use multiple threads to download these images in parallel.
import threading
def download_images_threading(urls, folder):
    os.makedirs(folder, exist_ok=True)
    def download(url):
        # Name the local file after the last segment of the URL
        filename = os.path.join(folder, url.split('/')[-1])
        download_image(url, filename)
    threads = []
    for url in urls:
        thread = threading.Thread(target=download, args=(url,))
        threads.append(thread)
        thread.start()
    for thread in threads:
        thread.join()
urls = [
    "https://example.com/image1.jpg",
    "https://example.com/image2.jpg",
    "https://example.com/image3.jpg",
    "https://example.com/image4.jpg",
    "https://example.com/image5.jpg"
]
folder = "images_threading"
download_images_threading(urls, folder)
Output:
Downloaded images_threading/image1.jpg
Downloaded images_threading/image2.jpg
Downloaded images_threading/image3.jpg
Downloaded images_threading/image4.jpg
Downloaded images_threading/image5.jpg
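This example shows how to use multiple threads to download images in parallel. Because the downloads finish in whatever order the network allows, the output lines may appear in a different order.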
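Starting one thread per URL works for five images but does not scale to a large number of them. A bounded thread pool is one possible alternative, sketched below. The names download_all and filename_for and the fetch parameter are hypothetical: fetch is expected to be a function like download_image that takes a URL and a filename, which also makes it easy to pass a stub when no network is available.

```python
from concurrent.futures import ThreadPoolExecutor
import os


def filename_for(url, folder):
    """Build the local path from the last segment of the URL."""
    return os.path.join(folder, url.split('/')[-1])


def download_all(urls, folder, fetch, max_workers=4):
    """Call fetch(url, filename) for every URL using at most max_workers threads.

    `fetch` is a hypothetical parameter: pass a function such as
    download_image, or a stub when testing without network access.
    """
    os.makedirs(folder, exist_ok=True)
    filenames = [filename_for(url, folder) for url in urls]
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # list() consumes the results so an exception raised in a worker surfaces here
        list(executor.map(fetch, urls, filenames))
    return filenames


if __name__ == "__main__":
    def show(url, filename):
        """Stub fetch function that only reports what it would download."""
        print(f"Would download {url} to {filename}")

    download_all(["https://example.com/image1.jpg"], "images_pool", show)
```

Limiting max_workers caps the number of simultaneous connections, which is gentler on both the local machine and the remote server.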
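(3) Downloading with multiple processes
Now we use multiple processes to accomplish the same task.
from multiprocessing import Process
# The worker is defined at module level: where child processes are spawned
# (the default on Windows and macOS), a nested function cannot be pickled.
def download_to_folder(url, folder):
    filename = os.path.join(folder, url.split('/')[-1])
    download_image(url, filename)
def download_images_multiprocessing(urls, folder):
    os.makedirs(folder, exist_ok=True)
    processes = []
    for url in urls:
        process = Process(target=download_to_folder, args=(url, folder))
        processes.append(process)
        process.start()
    for process in processes:
        process.join()
if __name__ == "__main__":
    folder = "images_multiprocessing"
    download_images_multiprocessing(urls, folder)
Output: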
Downloaded images_multiprocessing/image1.jpg
Downloaded images_multiprocessing/image2.jpg
Downloaded images_multiprocessing/image3.jpg
Downloaded images_multiprocessing/image4.jpg
Downloaded images_multiprocessing/image5.jpg
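This example shows how to use multiple processes to download the images in parallel. Downloading is I/O-bound, so processes offer little advantage over threads here and cost more to start.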
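Summary
This article introduced the basic concepts of concurrency and looked at Python's concurrency mechanisms, including multithreading and multiprocessing. The example code showed how the concurrent.futures and multiprocessing modules simplify concurrent programming, and the practical example showed how to download images in parallel with threads and with processes. With these techniques, developers can make better use of modern multi-core processors and improve the efficiency of their programs.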