Essential Web Crawler Knowledge: The concurrent.futures Library


from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

# Print each executor's inheritance chain (MRO) and its public attributes
print('ThreadPoolExecutor MRO:', ThreadPoolExecutor.__mro__)
print('ThreadPoolExecutor attributes:', [attr for attr in dir(ThreadPoolExecutor) if not attr.startswith('_')])
print('ProcessPoolExecutor MRO:', ProcessPoolExecutor.__mro__)
print('ProcessPoolExecutor attributes:', [attr for attr in dir(ProcessPoolExecutor) if not attr.startswith('_')])

  Both classes inherit from futures._base.Executor and expose three important methods: map, submit, and shutdown. Seen this way, the library is quite simple.

(2) Next, look at the implementation of the futures._base.Executor base class


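  The base class provides map, submit, and shutdown, and supports the with statement through __enter__ and __exit__. The usage of each is described below, starting with map.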
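  As a minimal sketch of how these pieces fit together, the snippet below submits tasks with submit, reads results from the returned Future objects, and shows that the with statement is equivalent to calling shutdown(wait=True) yourself. The square function and the worker counts are illustrative assumptions, not part of the original article.

```python
from concurrent.futures import ThreadPoolExecutor

def square(x):
    # Hypothetical task used only for illustration
    return x * x

# Leaving the with block calls executor.shutdown(wait=True)
with ThreadPoolExecutor(max_workers=2) as executor:
    futures = [executor.submit(square, n) for n in range(5)]
    # Future.result() blocks until that task has finished
    results = [f.result() for f in futures]

print(results)

# Explicit equivalent without the with statement
executor = ThreadPoolExecutor(max_workers=2)
future = executor.submit(square, 10)
value = future.result()
executor.shutdown(wait=True)  # the executor accepts no new tasks after this
```

  The with form is usually preferable because the pool is shut down even if an exception is raised inside the block.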

2. The map function

  Function prototype: def map(self, fn, *iterables, timeout=None, chunksize=1)

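  This map works much like Python's built-in map, except that the calls run asynchronously on the pool's workers, with arguments taken from the iterables. Results are yielded in the same order as the inputs. timeout sets the maximum number of seconds to wait, measured from the call to map; if a result is not ready in time, iterating over the results raises TimeoutError.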
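  The following sketch illustrates both behaviours with ThreadPoolExecutor. The slow_double function and the sleep durations are assumptions chosen only to make the effect visible.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError

def slow_double(x):
    # Hypothetical task: larger inputs finish sooner, so completion
    # order differs from input order
    time.sleep(0.05 * (3 - x))
    return x * 2

with ThreadPoolExecutor(max_workers=3) as executor:
    # map yields results in input order, not completion order
    ordered = list(executor.map(slow_double, [0, 1, 2]))

timed_out = False
with ThreadPoolExecutor(max_workers=1) as executor:
    try:
        # The task needs 0.5 s, but only 0.1 s is allowed
        list(executor.map(time.sleep, [0.5], timeout=0.1))
    except TimeoutError:
        timed_out = True

print(ordered, timed_out)
```

  Note that a timeout does not stop a task that is already running; it only stops the caller from waiting for it.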

  Understanding the chunksize parameter:

The size of the chunks the iterable will be broken into before being passed to a child process. This argument is only used by ProcessPoolExecutor; it is ignored by ThreadPoolExecutor.
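  A small sketch of chunksize with ProcessPoolExecutor follows. The cube function, the input range, and the chunk sizes are illustrative assumptions. The results are the same for any chunksize; only the number of batches sent to the worker processes changes.

```python
from concurrent.futures import ProcessPoolExecutor

def cube(x):
    # Hypothetical CPU-bound task; defined at module top level so that
    # worker processes can locate it
    return x ** 3

def run(chunksize):
    # Items are sent to the worker processes in batches of `chunksize`
    with ProcessPoolExecutor(max_workers=2) as executor:
        return list(executor.map(cube, range(10), chunksize=chunksize))

if __name__ == '__main__':
    # The guard prevents child processes from re-running this section
    # on platforms that start workers by re-importing the main module
    print(run(chunksize=1))
    print(run(chunksize=5))
```

  For long iterables of cheap tasks, a larger chunksize can reduce inter-process communication overhead; for a handful of slow tasks the default of 1 is usually fine.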

  Example:

from concurrent.futures import ThreadPoolExecutor
import time

import requests

def download(url):
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:63.0) Gecko/20100101 Firefox/63.0',
               'Connection': 'keep-alive',