Using the API:
>>> import httplib, urllib
>>> params = urllib.urlencode({'@number': 12524, '@type': 'issue', '@action': 'show'})
>>> headers = {"Content-type": "application/x-www-form-urlencoded", "Accept": "text/plain"}
>>> conn = httplib.HTTPConnection("bugs.python.org")
>>> conn.request("POST", "", params, headers)
>>> response = conn.getresponse()
>>> print response.status, response.reason
302 Found
>>> data = response.read()
>>> data
'Redirecting to <a href="
>>> conn.close()
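An HTTPConnection must be initialized with a server location. The intent is that one HTTPConnection represents a single server and can only send requests to that location.
The caller invokes conn.request with the method, path, body, and headers to issue a request.
Calling conn.getresponse returns the response as an HTTPResponse object.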
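For readers on Python 3, where httplib became http.client and urlencode moved to urllib.parse, the same request/getresponse flow can be sketched roughly as below. To stay self-contained it talks to a throwaway local echo server instead of bugs.python.org; EchoHandler and post_form are hypothetical names invented for this illustration, not part of the library.

```python
import http.client
import threading
import urllib.parse
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    """Hypothetical test handler: echoes the POST body back as text/plain."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def post_form(fields):
    """POST url-encoded fields to a local echo server; return (status, reason, body)."""
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)  # port 0: let the OS pick one
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        params = urllib.parse.urlencode(fields)
        headers = {"Content-type": "application/x-www-form-urlencoded",
                   "Accept": "text/plain"}
        # One connection object is bound to one server location.
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
        conn.request("POST", "/", params, headers)
        response = conn.getresponse()
        result = (response.status, response.reason, response.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()
```

Calling post_form({'@number': 12524, '@type': 'issue', '@action': 'show'}) goes through the same three calls as the session above: construct the connection, request, getresponse.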
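Now let's look at the other methods it exposes:
connect: opens the socket connection and stores it in self.sock.
putrequest: builds the start line plus the Host and Accept-Encoding headers, since these two headers depend on the HTTP version.
putheader: builds one header line.
endheaders: sends the start line, headers, and body.
close: closes the connection.
set_tunnel: configures tunneling through a proxy.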
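As this shows, the HTTPConnection interface is designed around the workflow of a request:
first establish the socket connection,
then build the start line,
build the headers,
send the request,
and finally return the HTTP response.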
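The steps above can be traced without touching the network. The sketch below uses Python 3's http.client (the renamed httplib) and replaces the real socket with a recorder; RecordingSocket, RecordingConnection, and build_get_request are hypothetical helpers written for this illustration.

```python
import http.client


class RecordingSocket:
    """Hypothetical stand-in for a socket: remembers bytes instead of sending them."""

    def __init__(self):
        self.sent = []

    def sendall(self, data):
        self.sent.append(bytes(data))

    def close(self):
        pass


class RecordingConnection(http.client.HTTPConnection):
    def connect(self):
        # Step 1: "establish" the connection (no network involved here).
        self.sock = RecordingSocket()


def build_get_request():
    """Walk through the low-level calls and return the bytes that would be sent."""
    conn = RecordingConnection("example.com")
    conn.putrequest("GET", "/index.html")   # step 2: start line, Host, Accept-Encoding
    conn.putheader("Accept", "text/plain")  # step 3: one extra header line
    conn.endheaders()                       # step 4: connect if needed, then flush
    wire = b"".join(conn.sock.sent)
    conn.close()
    return wire
```

Nothing is written to the socket until endheaders runs; putrequest and putheader only append lines to the internal buffer.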
Next, let's see how HTTPSConnection is implemented on top of HTTPConnection:
def connect(self):
    "Connect to a host on a given (SSL) port."
    sock = socket.create_connection((self.host, self.port),
                                    self.timeout, self.source_address)
    if self._tunnel_host:
        self.sock = sock
        self._tunnel()
    self.sock = ssl.wrap_socket(sock, self.key_file, self.cert_file)
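It overrides the connect method. HTTPS uses key_file and cert_file to establish the connection, but they are not passed as arguments to connect; they are passed to the class's __init__ method and stored as attributes.
This works better than passing them to connect: an interface that has to cover many features accumulates many default parameters, which also makes later extension harder. The __init__ approach still has to deal with many default parameters, though, and the effect of each parameter is less obvious than when it is passed directly.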
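A minimal sketch of this design choice, using made-up classes rather than the real httplib ones: configuration goes into __init__, so connect keeps the same zero-argument signature in the base class and the subclass. PlainConnection, SecureConnection, and open_all are hypothetical names, and connect returns a descriptive string instead of opening a socket.

```python
class PlainConnection:
    """Hypothetical base class: all settings are stored at construction time."""

    def __init__(self, host, port=80):
        self.host = host
        self.port = port

    def connect(self):
        # Stand-in for opening a TCP socket.
        return "tcp://%s:%d" % (self.host, self.port)


class SecureConnection(PlainConnection):
    """Hypothetical subclass: extra TLS settings also arrive through __init__."""

    def __init__(self, host, port=443, key_file=None, cert_file=None):
        super().__init__(host, port)
        self.key_file = key_file
        self.cert_file = cert_file

    def connect(self):
        # Same signature as the base class; the extra settings come from attributes.
        plain = super().connect()
        return "tls(%s, key=%s, cert=%s)" % (plain, self.key_file, self.cert_file)


def open_all(connections):
    """Callers can treat both kinds uniformly because connect() takes no arguments."""
    return [conn.connect() for conn in connections]
```

Because connect has one signature, code such as send (which calls self.connect() when auto_open is set) does not need to know which subclass it is working with.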
Next, let's look at how it sends data.
def _output(self, s):
    """Add a line of output to the current request buffer.
    Assumes that the line does *not* end with \\r\\n.
    """
    self._buffer.append(s)
self._buffer starts as an empty list ([]); each element is one line of the HTTP header section. The _send_output method formats them into standard HTTP form.
def _send_output(self, message_body=None):
    """Send the currently buffered request and clear the buffer.
    Appends an extra \\r\\n to the buffer.
    A message_body may be specified, to be appended to the request.
    """
    self._buffer.extend(("", ""))
    msg = "\r\n".join(self._buffer)
    del self._buffer[:]
    # If msg and message_body are sent in a single send() call,
    # it will avoid performance problems caused by the interaction
    # between delayed ack and the Nagle algorithm.
    if isinstance(message_body, str):
        msg += message_body
        message_body = None
    self.send(msg)
    if message_body is not None:
        #message_body was not a string (i.e. it is a file) and
        #we must run the risk of Nagle
        self.send(message_body)
As shown, msg is built by joining self._buffer with \r\n, which yields a standard HTTP header block. The send method is then called to send the HTTP headers and the HTTP body.
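The joining step is easy to check in isolation. The function below is a small stand-alone model of it (format_header_block is a hypothetical name, not a library function): appending two empty strings before the join is what produces the trailing \r\n plus the blank line that ends the header section.

```python
def format_header_block(lines, message_body=None):
    """Model of the join in _send_output: lines in, wire-format text out."""
    buffer = list(lines)
    # Two empty entries give "...last header\r\n" followed by an empty line "\r\n".
    buffer.extend(("", ""))
    msg = "\r\n".join(buffer)
    # A string body is appended so headers and body can go out in one send().
    if isinstance(message_body, str):
        msg += message_body
    return msg
```

For example, the lines "GET / HTTP/1.1" and "Host: example.com" become a single string in which every line ends with \r\n and the block ends with an empty line.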
def send(self, data):
    """Send `data' to the server."""
    if self.sock is None:
        if self.auto_open:
            self.connect()
        else:
            raise NotConnected()
    if self.debuglevel > 0:
        print "send:", repr(data)
    blocksize = 8192
    if hasattr(data,'read') and not isinstance(data, array):
        if self.debuglevel > 0: print "sendIng a read()able"
        datablock = data.read(blocksize)
        while datablock:
            self.sock.sendall(datablock)
            datablock = data.read(blocksize)
    else:
        self.sock.sendall(data)
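The send method is only responsible for writing data to the socket. If data has a read attribute (a file-like object), send repeatedly reads blocks from it and sends each one.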
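The block-wise loop can be reproduced outside the class. This is a sketch under the assumption that any object with a sendall method can stand in for the socket; send_in_blocks and ChunkRecorder are hypothetical names used only here.

```python
import io


class ChunkRecorder:
    """Hypothetical socket stand-in that records each sendall() call."""

    def __init__(self):
        self.chunks = []

    def sendall(self, data):
        self.chunks.append(data)


def send_in_blocks(sock, data, blocksize=8192):
    """Send file-like data block by block; send anything else in one call."""
    if hasattr(data, "read"):
        datablock = data.read(blocksize)
        while datablock:  # an empty read means end of file
            sock.sendall(datablock)
            datablock = data.read(blocksize)
    else:
        sock.sendall(data)
```

With a 20000-byte io.BytesIO as data, the recorder sees two full 8192-byte blocks and one final partial block, so a large file never has to be held in memory at once.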
def putheader(self, header, *values):
    """Send a request header line to the server.
    For example: h.putheader('Accept', 'text/html')
    """
    if self.__state != _CS_REQ_STARTED:
        raise CannotSendHeader()
    hdr = '%s: %s' % (header, '\r\n\t'.join([str(v) for v in values]))
    self._output(hdr)
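putheader is simple: it checks that a request has been started, builds one header line (joining multiple values with \r\n\t), and appends it to the buffer.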
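The formatting expression in putheader can be tried on its own. format_header below is a hypothetical stand-alone copy of that one line, without the state check.

```python
def format_header(header, *values):
    """Build one header line; extra values continue on tab-indented lines."""
    return "%s: %s" % (header, "\r\n\t".join(str(v) for v in values))
```

A single value gives the familiar "Name: value" form, non-string values are converted with str, and additional values are placed on continuation lines that start with a tab.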
def request(self, method, url, body=None, headers={}):
    """Send a complete request to the server."""
    self._send_request(method, url, body, headers)
_send_request is defined as follows:
def _send_request(self, method, url, body, headers):
    # Honor explicitly requested Host: and Accept-Encoding: headers.
    header_names = dict.fromkeys([k.lower() for k in headers])
    skips = {}
    if 'host' in header_names:
        skips['skip_host'] = 1
    if 'accept-encoding' in header_names:
        skips['skip_accept_encoding'] = 1
    self.putrequest(method, url, **skips)
    if body is not None and 'content-length' not in header_names:
        self._set_content_length(body)
    for hdr, value in headers.iteritems():
        self.putheader(hdr, value)
    self.endheaders(body)
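First it calls putrequest to build the start line; the automatic Host and Accept-Encoding headers are skipped if the caller supplied them.
Then it calls putheader to build the headers, adding Content-Length first if a body is given and the caller did not set it.
Finally it calls endheaders, which attaches the body and sends the request.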
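The ordering and the skip logic can be modeled with a plain function. This is a simplified sketch, not the library's code: build_request_lines is a hypothetical name, the Host value ignores ports and absolute URLs, and len(body) stands in for _set_content_length.

```python
def build_request_lines(host, method, url, body=None, headers=None):
    """Simplified model of _send_request: return the header lines in order."""
    headers = dict(headers or {})
    header_names = {name.lower() for name in headers}

    # putrequest: start line, then automatic headers unless the caller set them.
    lines = ["%s %s HTTP/1.1" % (method, url)]
    if "host" not in header_names:
        lines.append("Host: %s" % host)
    if "accept-encoding" not in header_names:
        lines.append("Accept-Encoding: identity")

    # Content-Length is derived from the body only when the caller did not give one.
    if body is not None and "content-length" not in header_names:
        lines.append("Content-Length: %d" % len(body))

    # putheader for each caller-supplied header.
    for name, value in headers.items():
        lines.append("%s: %s" % (name, value))
    return lines
```

Passing headers={"Host": "other.example"} shows the effect of skip_host: the result contains exactly one Host line, the caller's.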
def getresponse(self, buffering=False):
    "Get the response from the server."
    if self.__state != _CS_REQ_SENT or self.__response:
        raise ResponseNotReady()
    args = (self.sock,)
    kwds = {"strict":self.strict, "method":self._method}
    if self.debuglevel > 0:
        args += (self.debuglevel,)
    if buffering:
        #only add this keyword if non-default, for compatibility with
        #other response_classes.
        kwds["buffering"] = True;
    response = self.response_class(*args, **kwds)
    response.begin()
    assert response.will_close != _UNKNOWN
    self.__state = _CS_IDLE
    if response.will_close:
        # this effectively passes the connection to the response
        self.close()
    else:
        # remember this, so we can tell when it is complete
        self.__response = response
    return response
getresponse instantiates an HTTPResponse object from self.sock and then calls its begin method. HTTPResponse is mainly responsible for parsing the HTTP response read from the socket; it is covered later.
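Both halves of getresponse can be exercised without a server, here with Python 3's http.client (the renamed httplib). The sketch assumes that HTTPResponse only needs a makefile method from its socket argument, so CannedSocket, parse_response, and getresponse_without_request are hypothetical helpers for illustration.

```python
import http.client
import io


class CannedSocket:
    """Hypothetical socket stand-in that serves a fixed byte string."""

    def __init__(self, raw):
        self._raw = raw

    def makefile(self, *args, **kwargs):
        return io.BytesIO(self._raw)


def parse_response(raw):
    """Let HTTPResponse parse raw bytes the way getresponse would."""
    response = http.client.HTTPResponse(CannedSocket(raw))
    response.begin()  # reads the status line and headers
    return response.status, response.reason, response.read()


def getresponse_without_request():
    """Return True if getresponse refuses to run before any request was sent."""
    conn = http.client.HTTPConnection("example.com")  # constructor does no network I/O
    try:
        conn.getresponse()
    except http.client.ResponseNotReady:
        return True
    return False
```

Feeding parse_response the bytes of a "302 Found" reply yields the same status and reason fields printed in the opening session, and the second helper shows the state check at the top of getresponse.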