This module defines a standard interface to break Uniform Resource Locator (URL) strings up in components (addressing scheme, network location, path etc.)
Returns a tuple (
fragment) derived from
This corresponds to the general structure of a URL:
scheme://netloc/path?query#fragment. Each tuple item is a string, possibly empty. The components are not broken up in smaller parts (for example, the network location is a single string), and % escapes are not expanded. The delimiters as shown above are not part of the result, except for a leading slash in the
path component, which is retained if present. For example:
import urlparse o = urlparse(‘http://www.cwi.nl:80/%7Eguido/Python.html’)
# result is (‘http’, ‘www.cwi.nl:80’, ‘/%7Eguido/Python.html’, ‘’, ‘’)
Following the syntax specifications in RFC 1808, urlparse recognizes a netloc only if it is properly introduced by ‘//’. Otherwise the input is presumed to be a relative URL and thus to start with a path component.
netloc as parsed by
parse(), breaks it in its component returning a tuple (
port). Each component of the returned tuple is a string.
Return the urlencoded version of
Urlencoding transforms unsafe bytes to their %XX representation where XX is the hex value of the byte.
Safe bytes are:
lowercase letters from “a” to “z” (bytes from 0x61 to 0x7a)
uppercase letters from “A” to “Z” (bytes from 0x41 to 0x5a)
numbers from “0” to “9” (bytes from 0x30 to 0x39)
the following symbols: $-_.+!*’()
quote(), but also escapes
s is urlencoded, returns
s with every
+ substituted with a space and every
%xx substituted with the corresponding character.
Parses a query string
qs and returns a dictionary containing the association between keys and values of
qs. Values are urldecoded by
Tranforms data dictionary in a urlencoded query string and returns the query string. Each pair (key, value) is encoded by
quote_via function. By default,
quote_plus() is used to quote the values. An alternate function that can be passed as quote_via is