URL components and their names
This post explains the names of the different parts of a URL. Take the following URL as an example:
https://blog.mysite.com:8080/marketing/parts-url;foo=bar?key1=hello&key2=world#qux
Names for individual components:
- https: Scheme
- blog: Subdomain
- mysite: 2nd-level domain
- com: Top-level domain
- blog.mysite.com: Host
- 8080: Port
- /marketing/parts-url: Path
- foo=bar: Params
- key1=hello&key2=world: Query
- qux: Fragment
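As a rough illustration of the subdomain / 2nd-level domain / top-level domain breakdown, the host can be split on dots. This is only a naive sketch: it assumes the host has exactly three labels and a single-label top-level domain, which holds for the example URL but not for hosts such as www.example.co.uk, where a public suffix list would be needed.

```python
from urllib.parse import urlparse

url = "https://blog.mysite.com:8080/marketing/parts-url;foo=bar?key1=hello&key2=world#qux"

# .hostname is the host without the port
host = urlparse(url).hostname

# Naive split: assumes exactly three dot-separated labels (an assumption,
# not a general rule for every host name).
subdomain, second_level, top_level = host.split(".")

print(subdomain)     # blog
print(second_level)  # mysite
print(top_level)     # com
```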
Names for combinations of multiple components:
- blog.mysite.com:8080 (i.e. <host>:<port>): Network location/Socket
- https://blog.mysite.com:8080 (i.e. <scheme>://<host>:<port>): Origin
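The combined names can be recovered from the same parse result. The sketch below assumes the URL carries no user:password@ prefix, so that netloc is exactly <host>:<port>; the variable names are only illustrative.

```python
from urllib.parse import urlparse

parts = urlparse(
    "https://blog.mysite.com:8080/marketing/parts-url;foo=bar?key1=hello&key2=world#qux"
)

# Network location: <host>:<port>
netloc = parts.netloc

# Host and port individually; .port is returned as an int
host, port = parts.hostname, parts.port

# Origin: <scheme>://<host>:<port>
origin = f"{parts.scheme}://{parts.netloc}"

print(netloc)  # blog.mysite.com:8080
print(origin)  # https://blog.mysite.com:8080
```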
Parsing example in Python:
>>> from urllib.parse import urlparse
>>> urlparse("https://blog.example.com:8080/marketing/parts-url;foo=bar?key1=hello&key2=world#qux")
ParseResult(scheme='https', netloc='blog.example.com:8080', path='/marketing/parts-url', params='foo=bar', query='key1=hello&key2=world', fragment='qux')
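Going one step further, the query string can be split into individual keys, and the components can be joined back into the original URL. This is a small sketch using parse_qs and urlunparse from the same urllib.parse module; note that urlparse only separates params from the last path segment.

```python
from urllib.parse import urlparse, parse_qs, urlunparse

url = "https://blog.example.com:8080/marketing/parts-url;foo=bar?key1=hello&key2=world#qux"
parts = urlparse(url)

# parse_qs maps each key to a list, since a key may repeat in a query
query = parse_qs(parts.query)
print(query)  # {'key1': ['hello'], 'key2': ['world']}

# urlunparse reassembles the six components in generic-syntax order
rebuilt = urlunparse(parts)
print(rebuilt == url)  # True
```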
References:
- RFC 1808, Relative Uniform Resource Locators
- Generic-RL syntax:
<scheme>://<net_loc>/<path>;<params>?<query>#<fragment>