Crawler Handbook 02: Using Requests

Date: 2023-05-29
Using Requests

Goal: list the commonly used features of Requests for quick reference.

I. GET Requests

1. Basic Usage

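    import requests

    r = requests.get('https://www.baidu.com/')
    print(type(r))         # the Response object
    print(r.status_code)   # HTTP status code
    print(type(r.text))    # the body decoded to str
    print(r.text[:100])    # first 100 characters of the body
    print(r.cookies)       # cookies set by the server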

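Output:

    <class 'requests.models.Response'>
    200
    <class 'str'>
    ... (the first 100 characters of the page HTML)
    <RequestsCookieJar[...]>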

2. Passing Parameters (the params argument)

    import requests

    data = {
        'name': 'germey',
        'age': 25,
    }
    r = requests.get('https://httpbin.org/get', params=data)
    print(r.text)
    print(type(r.text))
    print(r.json())
    print(type(r.json()))

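Output: the response body is a JSON string, and calling json() parses it directly into a dict.

    {
      "args": {
        "age": "25",
        "name": "germey"
      },
      "headers": {
        "Accept": "*/*",
        "Accept-Encoding": "gzip, deflate",
        "Host": "httpbin.org",
        "User-Agent": "python-requests/2.22.0",
        "X-Amzn-Trace-Id": "Root=1-620913df-3f5acf216ce4775c03354c66"
      },
      "origin": "192.168.1.1",
      "url": "https://httpbin.org/get?name=germey&age=25"
    }
    <class 'str'>
    {'args': {'age': '25', 'name': 'germey'}, 'headers': {'Accept': '*/*', 'Accept-Encoding': 'gzip, deflate', 'Host': 'httpbin.org', 'User-Agent': 'python-requests/2.22.0', 'X-Amzn-Trace-Id': 'Root=1-620913df-3f5acf216ce4775c03354c66'}, 'origin': '192.168.1.1', 'url': 'https://httpbin.org/get?name=germey&age=25'}
    <class 'dict'>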
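To see how the params dict ends up in the URL without touching the network, one option is to build the request and prepare it instead of sending it. This is a minimal sketch using requests.Request and its prepare() method; the variable names are only illustrative.

```python
import requests

# Build the request but do not send it, so no network access is needed.
params = {'name': 'germey', 'age': 25}
prepared = requests.Request('GET', 'https://httpbin.org/get', params=params).prepare()

# Requests URL-encodes the dict and appends it to the URL as the query string.
print(prepared.url)
```

The printed URL should match the "url" field that httpbin echoes back in the response above.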

3. Fetching Binary Data (images, audio, video)

    import requests

    r = requests.get('https://github.com/favicon.ico')
    # r.content holds the raw response body as bytes
    with open('favicon.ico', 'wb') as f:
        f.write(r.content)

Writing r.content to a file opened in binary mode ('wb') saves the binary data to disk.
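r.content loads the whole body into memory, which is fine for a favicon but less so for large audio or video files. A possible alternative, sketched here under the assumption that streaming is acceptable for your use case, is stream=True together with iter_content(). The helper name download and the 8192-byte chunk size are illustrative choices, not something from the original text.

```python
import requests


def download(url, path, chunk_size=8192):
    """Stream the body of url into path; return the number of bytes written."""
    written = 0
    # stream=True defers downloading the body until iter_content() is consumed.
    with requests.get(url, stream=True, timeout=10) as r:
        r.raise_for_status()  # raise HTTPError on a 4xx/5xx status
        with open(path, 'wb') as f:
            for chunk in r.iter_content(chunk_size=chunk_size):
                written += f.write(chunk)
    return written
```

A call such as download('https://github.com/favicon.ico', 'favicon.ico') would then save the same file as the example above, one chunk at a time.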

4. Adding Request Headers (User-Agent, proxy, anti-hotlinking Referer, and similar fields)

    import requests

    headers = {
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36'
    }
    r = requests.get('https://httpbin.org/get', headers=headers)
    print(r.text)

Output (truncated after the first header):

    {
      "args": {},
      "headers": {
        "Accept": "*/*",
        ...

II. POST Requests

Output of a form-encoded POST to https://www.httpbin.org/post (the beginning of the response is missing):

        ...
        "Accept-Encoding": "gzip, deflate",
        "Content-Length": "18",
        "Content-Type": "application/x-www-form-urlencoded",
        "Host": "www.httpbin.org",
        "User-Agent": "python-requests/2.22.0",
        "X-Amzn-Trace-Id": "Root=1-62091826-3152bf7a1515a77d072a49b8"
      },
      "json": null,
      "origin": "192.168.1.1",
      "url": "https://www.httpbin.org/post"
    }
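The response above ends with the headers of a form-encoded POST to https://www.httpbin.org/post: a Content-Type of application/x-www-form-urlencoded and a Content-Length of 18. The request code that produced it is not in the text, so the following is only a sketch of one request that would yield those headers. The form dict is an assumption that reuses the name and age values from the GET example. The request is prepared rather than sent, so it runs without network access.

```python
import requests

# Assumed payload (not from the original): the values from the GET example.
form = {'name': 'germey', 'age': '25'}

# Passing a dict as data= makes Requests form-encode it into the body.
prepared = requests.Request('POST', 'https://www.httpbin.org/post', data=form).prepare()

print(prepared.headers['Content-Type'])
print(prepared.headers['Content-Length'])
print(prepared.body)
```

To actually send it, requests.post('https://www.httpbin.org/post', data=form) is the usual one-liner.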

III. Responses

1. Reading Response Information

    import requests

    r = requests.get('https://httpbin.org/get')
    print(type(r.status_code), r.status_code)   # status code
    print(type(r.headers), r.headers)           # response headers
    print(type(r.cookies), r.cookies)           # cookies
    print(type(r.url), r.url)                   # final URL
    print(type(r.history), r.history)           # redirect history

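Output:

    <class 'int'> 200
    <class 'requests.structures.CaseInsensitiveDict'> {'Date': 'Sun, 13 Feb 2022 14:48:27 GMT', 'Content-Type': 'application/json', 'Content-Length': '308', 'Connection': 'keep-alive', 'Server': 'gunicorn/19.9.0', 'Access-Control-Allow-Origin': '*', 'Access-Control-Allow-Credentials': 'true'}
    <class 'requests.cookies.RequestsCookieJar'> <RequestsCookieJar[]>
    <class 'str'> https://httpbin.org/get
    <class 'list'> []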
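r.headers is a requests.structures.CaseInsensitiveDict, so header lookups ignore case. The sketch below builds such a dict by hand, with two values copied from the output above, to show the lookup behaviour without making a request.

```python
from requests.structures import CaseInsensitiveDict

# Stand-in for r.headers, filled with two values from the output above.
headers = CaseInsensitiveDict({
    'Content-Type': 'application/json',
    'Content-Length': '308',
})

# Keys match regardless of case.
print(headers['content-type'])
print(headers.get('CONTENT-LENGTH'))
# Membership tests are case-insensitive too; 'Server' was not added here.
print('server' in headers)
```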

2. Branching on the Response Status Code

    import requests

    r = requests.get('https://ssr1.scrape.center/')
    # Exit unless the status is 200 (requests.codes.ok); otherwise report success.
    exit() if r.status_code != requests.codes.ok else print('Request Successfully')

Output (the last line is a conditional expression, A if condition else B: A runs when the condition is true, B when it is false):

Request Successfully
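Comparing against requests.codes.ok is one way to branch on the status. Another, sketched here as an alternative to the approach above, is Response.raise_for_status(), which raises requests.exceptions.HTTPError for 4xx and 5xx responses. The helper name check is hypothetical, and the Response objects are built by hand so the example runs offline.

```python
import requests


def check(response):
    """Return False if raise_for_status() raises (4xx/5xx), True otherwise."""
    try:
        response.raise_for_status()
    except requests.exceptions.HTTPError as exc:
        print('Request failed:', exc)
        return False
    print('Request Successfully')
    return True


# Hand-built responses stand in for real server replies.
ok = requests.Response()
ok.status_code = requests.codes.ok            # 200

missing = requests.Response()
missing.status_code = requests.codes.not_found  # 404

check(ok)
check(missing)
```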

IV. Advanced Usage

1. File Upload

    import requests

    # Open in binary mode; the with block closes the file once the upload is done.
    with open('favicon.ico', 'rb') as f:
        files = {'file': f}
        r = requests.post('https://www.httpbin.org/post', files=files)
    print(r.text)

Output: the files field holds the uploaded file. Its binary content is abbreviated here, and the response is truncated after the first header.

    {
      "args": {},
      "data": "",
      "files": {
        "file": "data:application/octet-stream;base64,AAABAAI..."
      },
      "form": {},
      "headers": {
        "Accept": "*/*",
        ...

The remaining fields are the tail of a separate form-encoded POST response, sent with a custom User-Agent:

        ...
        "Accept-Encoding": "gzip, deflate",
        "Content-Length": "11",
        "Content-Type": "application/x-www-form-urlencoded",
        "Host": "httpbin.org",
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.116 Safari/537.36",
        "X-Amzn-Trace-Id": "Root=1-62092798-72199d8761a339441af9b39c"
      },
      "json": null,
      "origin": "192.168.1.1",
      "url": "https://httpbin.org/post"
    }
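When files= is used, Requests encodes the body as multipart/form-data rather than as a URL-encoded form. The sketch below prepares an upload without sending it so the generated header and body can be inspected. The in-memory BytesIO object and its four placeholder bytes are assumptions standing in for a real favicon.ico, and the (filename, file object, content type) tuple is the explicit form that files= accepts.

```python
import io

import requests

# Placeholder bytes standing in for a real icon file (an assumption).
fake_icon = io.BytesIO(b'\x00\x00\x01\x00')
files = {'file': ('favicon.ico', fake_icon, 'image/x-icon')}

# Prepare the upload without sending it.
prepared = requests.Request('POST', 'https://www.httpbin.org/post', files=files).prepare()

content_type = prepared.headers['Content-Type']
print(content_type.split(';')[0])                    # media type, without the boundary
print(b'filename="favicon.ico"' in prepared.body)    # the filename travels in the body
```

The boundary string in the Content-Type header is generated per request, so only the media type before the semicolon is stable.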
