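Background
I originally wanted to scrape some stock data to analyze, and that is when I ran into this pitfall. I had not touched Python for a while and had forgotten most of the syntax; fortunately, the simple things in Python are not hard.
Tutorial
Python: requests.get(…).json() fails to return data
Error log: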
Traceback (most recent call last): File "D:/E/code/python/stock/demo/demo.py", line 66, in
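I could get the JSON data in the browser just fine, so why did it fail here?
This was really frustrating. I searched Baidu for a long time, and Google as well, but still found no satisfactory solution.
So I looked at the response information to pin down the problem: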
print(response.url)
print("\n")
print(response.cookies)
print("\n")
print(response.content)
print('\n')
print(response.ok)
Output:
https://xueqiu.com/service/v5/stock/screener/quote/list ...
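This is the surface symptom behind the JSON parsing that kept failing.
That raised the suspicion that the problem was not actually with JSON at all.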
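The check described above can be sketched as a small helper that looks at the response before calling .json(). This is only an illustration, not code from the original script: parse_json_or_explain and FakeResponse are hypothetical names, and FakeResponse is a stand-in that mimics the few attributes of a requests.Response used here (ok, status_code, headers, text, json()), so the sketch runs without a network connection.

```python
def parse_json_or_explain(response):
    """Return the decoded JSON body, or raise an error that says why it is missing.

    `response` is assumed to behave like a requests.Response
    (attributes: ok, status_code, headers, text; method: json()).
    """
    if not response.ok:
        # For example 403 Forbidden: the server refused the request, so the
        # body is an error page rather than the JSON we expected.
        raise RuntimeError(f"HTTP {response.status_code}: {response.text[:200]!r}")
    content_type = response.headers.get("Content-Type", "")
    if "json" not in content_type.lower():
        raise RuntimeError(f"unexpected Content-Type: {content_type!r}")
    return response.json()


class FakeResponse:
    """Hypothetical stand-in for requests.Response, for offline demonstration."""

    def __init__(self, status_code, headers, text, payload=None):
        self.status_code = status_code
        self.ok = status_code < 400  # same rule requests uses for Response.ok
        self.headers = headers
        self.text = text
        self._payload = payload

    def json(self):
        return self._payload


blocked = FakeResponse(403, {"Content-Type": "text/html"}, "<html>Forbidden</html>")
try:
    parse_json_or_explain(blocked)
except RuntimeError as err:
    # The error names the HTTP status instead of a confusing JSON decode failure.
    print(err)
```

With the real library, the object returned by requests.get(...) could be passed to the helper in place of FakeResponse.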
Python issue: a requests crawler gets a 403
The request simply needs a User-Agent in its headers.
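A minimal sketch of that fix, under a few assumptions: fetch_quotes is a hypothetical helper name, the session argument is assumed to be a requests.Session (or any object with a compatible get method), and the User-Agent string is a placeholder to be replaced with the one your own browser sends. Whether the site checks anything besides the User-Agent is not something this sketch can guarantee.

```python
URL = "https://xueqiu.com/service/v5/stock/screener/quote/list"


def fetch_quotes(session, params, user_agent):
    """Send the GET request with an explicit User-Agent header.

    `session` is assumed to be a requests.Session, or any object whose
    get(url, params=..., headers=..., timeout=...) behaves the same way.
    """
    headers = {"User-Agent": user_agent}
    # A timeout keeps the call from hanging forever if the server never answers.
    return session.get(URL, params=params, headers=headers, timeout=10)
```

With requests installed, a call might look like fetch_quotes(requests.Session(), {'page': 1, 'size': 1}, '<User-Agent copied from your browser>'), followed by a check of response.ok on the result.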
Source code:
import requests
import csv
import json
import time
import datetime

url = 'https://xueqiu.com/service/v5/stock/screener/quote/list?'
headers = {
    'Content-Type': 'application/json; charset=utf-8',
    'User-Agent': 'xxxx',
}


def getBaidu():
    rq = requests.get('http://httpbin.org/get')
    print(rq.json())


def getStock():
    # for i in 20:
    t = time.time()
    nowTime = lambda: int(round(t * 1000))  # millisecond timestamp, via a lambda
    print(nowTime())
    params = {
        'page': 1,
        'size': 1,
        'order': 'desc',
        'order_by': 'amount',
        'exchange': 'CN',
        'market': 'CN',
        'type': 'sha',
        '_': nowTime(),  # call the lambda so the integer is sent, not the function object
    }
    response = requests.get(url=url, params=params, headers=headers)
    # Print the response details used to diagnose the failure
    print(response.url)
    print("\n")
    print(response.cookies)
    print("\n")
    print(response.content)
    print('\n')
    print(response.ok)
    html_data = response.json()
    data_list = html_data['data']['list']
    for i in data_list:
        dit = {}
        dit['Symbol'] = i['symbol']
        dit['Name'] = i['name']
        dit['Current price'] = i['current']
        dit['Change'] = i['chg']
        dit['Change/%'] = i['percent']
        dit['Year to date/%'] = i['current_year_percent']
        dit['Volume'] = i['volume']
        dit['Amount'] = i['amount']
        dit['Turnover rate/%'] = i['turnover_rate']
        dit['P/E (TTM)'] = i['pe_ttm']
        dit['Dividend yield/%'] = i['dividend_yield']
        dit['Market cap'] = i['market_capital']
        print(dit)


if __name__ == '__main__':
    # getBaidu()
    getStock()
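The '_' parameter in the listing is a millisecond timestamp. As a small aside, here is a sketch of computing it with a plain function, so that the value placed in params is an integer rather than a function object. The name now_ms is hypothetical and not part of the original script.

```python
import time


def now_ms() -> int:
    """Return the current Unix time in whole milliseconds."""
    return int(round(time.time() * 1000))


# Call the function when building the parameters, so an int is stored.
params = {"page": 1, "size": 1, "_": now_ms()}
print(params["_"])
```

Calling the function at the point where params is built also means each request gets a fresh timestamp.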
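As for the User-Agent value, you can look it up yourself. It is not hard: just copy it from the request information.
The request parameters and so on are basically all there as well. I never noticed this back when I was studying web development at school, which is a pity.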