Asynchronous Python with aiohttp

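aiohttp

What is aiohttp?

aiohttp is an asynchronous HTTP client/server framework built on asyncio. It can be used to write asynchronous crawlers, which run faster than synchronous crawlers built on requests.

Installation

    pip install aiohttp

aiohttp and requests

A requests crawler

Fetch http://httpbin.org 30 times in a row, synchronously, with requests: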
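    import requests
    from datetime import datetime


    def fetch(url):
        # Blocking GET: the loop below waits for each response
        # before it sends the next request
        r = requests.get(url)
        print(r.text)


    start = datetime.now()

    for i in range(30):
        fetch('http://httpbin.org/get')

    end = datetime.now()
    print("Time taken by the requests crawler:")
    print(end - start)

Sample output

    # Prints the content returned by the site
    ....
    Time taken by the requests crawler:
    0:00:43.248761

The result shows that fetching the site 30 times synchronously takes about 43 seconds, which is very slow.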
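An aiohttp crawler

Fetch the site 30 times asynchronously with aiohttp and asyncio:

    import aiohttp
    import asyncio
    from datetime import datetime


    async def fetch(client):
        async with client.get('http://httpbin.org/get') as resp:
            assert resp.status == 200
            return await resp.text()


    async def main():
        # Each call to main() opens its own ClientSession
        async with aiohttp.ClientSession() as client:
            html = await fetch(client)
            print(html)


    # new_event_loop() avoids the deprecated implicit loop creation
    # of get_event_loop()
    loop = asyncio.new_event_loop()

    tasks = []
    for i in range(30):
        task = loop.create_task(main())
        tasks.append(task)

    start = datetime.now()

    # Wait for all 30 scheduled tasks to finish
    loop.run_until_complete(asyncio.wait(tasks))

    end = datetime.now()
    loop.close()
    print("Time taken by the aiohttp crawler:")
    print(end - start)

Sample output

    # Prints the content returned by the site
    ....
    Time taken by the aiohttp crawler:
    0:00:00.539416

The timing shows that the asynchronous aiohttp crawler needed only about 0.5 seconds, roughly 80 times faster than the synchronous requests version.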
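The speed-up comes from overlapping the time spent waiting on the network. The sketch below illustrates this without any real requests: `fake_fetch`, `run_sequential`, and `run_concurrent` are hypothetical names, and `asyncio.sleep()` is only a stand-in for network latency, so the numbers it prints are not comparable to the measurements above.

```python
import asyncio
import time


async def fake_fetch(delay: float) -> float:
    # Stand-in for one HTTP request: asyncio.sleep() hands control back
    # to the event loop, much like waiting on a socket does.
    await asyncio.sleep(delay)
    return delay


async def run_sequential(n: int, delay: float) -> float:
    start = time.perf_counter()
    for _ in range(n):
        await fake_fetch(delay)  # each wait finishes before the next starts
    return time.perf_counter() - start


async def run_concurrent(n: int, delay: float) -> float:
    start = time.perf_counter()
    # gather() schedules all coroutines at once, so the waits overlap
    await asyncio.gather(*(fake_fetch(delay) for _ in range(n)))
    return time.perf_counter() - start


sequential = asyncio.run(run_sequential(10, 0.05))
concurrent = asyncio.run(run_concurrent(10, 0.05))
print(f"sequential: {sequential:.2f}s, concurrent: {concurrent:.2f}s")
```

The sequential run takes roughly the sum of all delays, while the concurrent run takes roughly one delay, which is the same effect the aiohttp crawler benefits from.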
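A single session

aiohttp.ClientSession() wraps a connection pool and supports keep-alive by default. The official recommendation is to use a single ClientSession object for the whole program, rather than creating a new ClientSession for every request as the example above does, unless the program talks to a large number of different services.

Modify the example above as follows:

    import aiohttp
    import asyncio
    from datetime import datetime


    async def fetch(client):
        print("Printing the ClientSession object")
        print(client)
        async with client.get('http://httpbin.org/get') as resp:
            assert resp.status == 200
            return await resp.text()


    async def main():
        # One ClientSession shared by all 30 requests
        async with aiohttp.ClientSession() as client:
            tasks = []
            for i in range(30):
                tasks.append(asyncio.create_task(fetch(client)))
            await asyncio.wait(tasks)


    start = datetime.now()

    # asyncio.run() creates the event loop, runs main(), and closes the loop
    asyncio.run(main())

    end = datetime.now()
    print("Time taken by the aiohttp crawler:")
    print(end - start)

Sample output

    # Repeated 30 times
    Printing the ClientSession object
    <aiohttp.client.ClientSession object at 0x1094aff98>
    Time taken by the aiohttp crawler:
    0:00:01.778045

In this run, the single shared ClientSession took about three times as long as the version with one ClientSession per request (about 1.8 seconds versus 0.5 seconds). The ClientSession object is the same one, 0x1094aff98, every time.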
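A shared session also makes it easy to collect the response bodies and to bound the connection pool. The following is a minimal sketch, not the author's code: `fetch_all` is a hypothetical helper, the limit of 10 connections is an arbitrary assumption, and `raise_for_status()` is swapped in for the `assert` on the status code.

```python
import asyncio

import aiohttp


async def fetch(client, url):
    async with client.get(url) as resp:
        resp.raise_for_status()  # raise on 4xx/5xx responses
        return await resp.text()


async def fetch_all(client, urls):
    # gather() runs the requests concurrently on the shared session and
    # returns the bodies in the same order as urls.
    return await asyncio.gather(*(fetch(client, url) for url in urls))


async def main():
    # Assumption: allow at most 10 simultaneous connections in the pool.
    connector = aiohttp.TCPConnector(limit=10)
    async with aiohttp.ClientSession(connector=connector) as client:
        bodies = await fetch_all(client, ['http://httpbin.org/get'] * 30)
        print(len(bodies))
```

Run it with `asyncio.run(main())`. Unlike `asyncio.wait()`, `asyncio.gather()` hands back the results directly, so the bodies do not have to be pulled out of the task objects afterwards.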
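Return values

JSON

The examples above use response.text() to return the fetched content. When the response is JSON, aiohttp can decode the body directly into Python objects with response.json().

    async def fetch(client):
        async with client.get('http://httpbin.org/get') as resp:
            # Decode the JSON body into a dict
            return await resp.json()

Sample output

    {'args': {}, 'headers': {'Accept': '*/*', 'Accept-Encoding': 'gzip, deflate', 'Host': 'httpbin.org', 'User-Agent': 'Python/3.7 aiohttp/3.6.2'}, 'origin': '49.80.42.33, 49.80.42.33', 'url': 'https://httpbin.org/get'}

When the returned text is not standard JSON, resp.json() accepts a custom decoder through its loads parameter, so the text can be pre-processed before it is parsed. For example, resp.json(loads=my_loads), where my_loads is a hypothetical function that replaces a with b in the text and then calls json.loads().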
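As a sketch of that pre-processing step: `resp.json()` takes a `loads` callable that receives the response text, and `content_type=None` turns off the check on the Content-Type header. The helper `lenient_loads` and its quote-replacement rule are assumptions made up for this illustration; real non-standard payloads usually need more careful cleaning.

```python
import asyncio
import json

import aiohttp


def lenient_loads(text):
    # Hypothetical cleanup: turn single quotes into double quotes,
    # then decode with the standard json module.
    return json.loads(text.replace("'", '"'))


async def fetch_json(client, url):
    async with client.get(url) as resp:
        # loads= replaces the default json.loads decoder;
        # content_type=None skips the Content-Type validation.
        return await resp.json(loads=lenient_loads, content_type=None)


async def main():
    async with aiohttp.ClientSession() as client:
        print(await fetch_json(client, 'http://httpbin.org/get'))
```

Run it with `asyncio.run(main())`. The decoder can be checked on its own: `lenient_loads("{'a': 1}")` returns `{'a': 1}` as a dict.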
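Byte streams

aiohttp uses response.read() to return the body as raw bytes. Save a file or an image with with open():

    import aiohttp


    async def fetch(client):
        async with client.get('http://httpbin.org/image/png') as resp:
            # Read the whole body as bytes
            return await resp.read()


    async def main():
        async with aiohttp.ClientSession() as client:
            image = await fetch(client)
            with open("/Users/xxx/Desktop/image.png", 'wb') as f:
                f.write(image)

response.read() takes no arguments and returns the whole body. To read only a given number of bytes, read from the underlying stream instead, for example response.content.read(3) to read up to the first 3 bytes.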
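For large files, holding the whole body in memory may be wasteful. A minimal sketch of streaming the body to disk in pieces through `resp.content.iter_chunked()` follows; the helper name `download`, the chunk size of 1024 bytes, and the output path `image.png` are assumptions chosen for the example.

```python
import asyncio

import aiohttp


async def download(client, url, path, chunk_size=1024):
    async with client.get(url) as resp:
        resp.raise_for_status()
        with open(path, 'wb') as f:
            # iter_chunked() yields the body in pieces of at most
            # chunk_size bytes as they arrive
            async for chunk in resp.content.iter_chunked(chunk_size):
                f.write(chunk)


async def main():
    async with aiohttp.ClientSession() as client:
        # Hypothetical output path in the current directory
        await download(client, 'http://httpbin.org/image/png', 'image.png')
```

Run it with `asyncio.run(main())`. The file is written as the data arrives, so memory use stays near one chunk rather than the full size of the response.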
Parameters

aiohttp can pass parameters in the URL in 3 ways.

As a list of key/value tuples:

    async def fetch(client):
        params = [('a', 1), ('b', 2)]
        async with client.get('http://httpbin.org/get', params=params) as resp:
            return await resp.text()

Example URL

    http://httpbin.org/get?a=1&b=2

As a dict:

    async def fetch(client):
        params = {"a": 1, "b": 2}
        async with client.get('http://httpbin.org/get', params=params) as resp:
            return await resp.text()
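To preview the URL that a given `params` value leads to without sending a request, the query string can be built with yarl, the URL library that aiohttp is built on. This sketch assumes that `URL.with_query()` encodes parameters the same way `params=` does for a URL with no existing query string; it is an offline illustration, not a trace of aiohttp's internals.

```python
from yarl import URL

base = URL('http://httpbin.org/get')

# A dict and a list of tuples produce the same query string
from_dict = base.with_query({'a': 1, 'b': 2})
from_pairs = base.with_query([('a', 1), ('b', 2)])
print(from_dict)   # http://httpbin.org/get?a=1&b=2
print(from_pairs)  # http://httpbin.org/get?a=1&b=2

# Only the list form can repeat a key, since dict keys are unique
repeated = base.with_query([('a', 1), ('a', 2)])
print(repeated)    # http://httpbin.org/get?a=1&a=2
```

The repeated-key case is the practical reason to choose the list of tuples over the dict.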
