Original link: http://www.kylin-ux.com/2017/04/19/language-python-爬虫-urllib-HTTP-POST-cookie-json

import json
import urllib.request
from http.cookiejar import CookieJar

url = 'http://www.baidu.com'
req_dict = {'k': 'v'}

# Build an opener that stores and resends cookies through a CookieJar
cj = CookieJar()
handler = urllib.request.HTTPCookieProcessor(cj)
opener = urllib.request.build_opener(handler)

# Serialize the payload to JSON and encode it to bytes; passing data makes this a POST
req_json = json.dumps(req_dict)
req_post = req_json.encode('utf-8')

headers = {}
#headers['Content-Type'] = 'application/json'
req = urllib.request.Request(url=url, data=req_post, headers=headers)

# Option 1: install the opener globally, then call urlopen()
#urllib.request.install_opener(opener)
#res = urllib.request.urlopen(req)
# Option 2: call the opener directly
res = opener.open(req)
res = res.read().decode('utf-8')
print(res)

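To send the body as JSON, uncomment the line headers['Content-Type'] = 'application/json'. Without that header, urllib sends the POST body with the default Content-Type application/x-www-form-urlencoded.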
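As a rough offline check of the JSON case, the sketch below builds the same kind of request with the Content-Type header enabled and inspects it without sending anything. The URL and payload are the placeholder values from the example above; the variable names are only illustrative.

```python
import json
import urllib.request

# Placeholder values for illustration; nothing is sent over the network here.
url = 'http://www.baidu.com'
req_dict = {'k': 'v'}

body = json.dumps(req_dict).encode('utf-8')
headers = {'Content-Type': 'application/json'}
req = urllib.request.Request(url=url, data=body, headers=headers)

# A Request that carries data is sent as POST by default.
method = req.get_method()
# Request stores header names capitalized, so look the header up as 'Content-type'.
content_type = req.get_header('Content-type')
# Decode the body again to confirm it round-trips as JSON.
decoded = json.loads(req.data.decode('utf-8'))

print(method, content_type, decoded)
```

Inspecting the Request object this way is a cheap sanity check before pointing the opener at a real server.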

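Background notes:

  1. The default urllib.request.urlopen() does not store or send cookies.
  2. urlopen() is a convenience function that delegates to a global OpenerDirector instance; its main parameters are (url, data, timeout).
  3. build_opener(*handlers) creates an OpenerDirector instance; handler classes passed in are instantiated, and all handlers are added to the OpenerDirector.
  4. urllib.request.install_opener() sets the global opener used by urlopen(); alternatively, call opener.open() directly instead of the global urlopen().
  5. If the cookie content is known in advance and never changes, it can be added directly to the request headers: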

    headers["Cookie"] = "xxxxx"
    req = urllib.request.Request(url=url, headers=headers, data=req_post)
  6. Some differences between Python 3 and Python 2.7:
    In Python 3.x, the urllib and urllib2 libraries are merged into the urllib package
    urllib2.urlopen becomes urllib.request.urlopen
    urllib2.Request becomes urllib.request.Request
    urllib.urlencode() becomes urllib.parse.urlencode()
    cookielib becomes http.cookiejar
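
Items 1 to 4 of the list above come down to what the cookie jar does around each request: HTTPCookieProcessor asks the CookieJar to extract cookies from every response and to add matching cookies to every outgoing request. The sketch below walks through those two steps by hand, offline. FakeResponse is a hypothetical stand-in for a real response object, and the URL and cookie value are made up for illustration.

```python
import email.message
import urllib.request
from http.cookiejar import CookieJar


class FakeResponse:
    """Hypothetical stand-in for an HTTP response; CookieJar only calls info()."""

    def __init__(self, set_cookie):
        self._headers = email.message.Message()
        self._headers['Set-Cookie'] = set_cookie

    def info(self):
        return self._headers


cj = CookieJar()
url = 'http://www.example.com/'  # placeholder URL, never contacted

# Step 1: after a response arrives, the processor stores its cookies in the jar.
first_req = urllib.request.Request(url)
cj.extract_cookies(FakeResponse('session=abc123; Path=/'), first_req)

# Step 2: before the next request is sent, the processor adds matching cookies.
second_req = urllib.request.Request(url)
cj.add_cookie_header(second_req)
with_jar = second_req.get_header('Cookie')

# A request that never passes through the jar carries no Cookie header.
plain_req = urllib.request.Request(url)
without_jar = plain_req.get_header('Cookie')

print(len(cj), with_jar, without_jar)
```

In real use there is no need to call these methods directly; opener.open() from an opener built with HTTPCookieProcessor(cj) runs both steps automatically on every request.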