diff --git a/GettingStarted.md b/GettingStarted.md
index 60448327e..a3340bb13 100644
--- a/GettingStarted.md
+++ b/GettingStarted.md
@@ -1,117 +1,630 @@
-# WebURL Quickstart Guide
+# Contents
-WebURL is a new URL type for Swift which is compatible with the WHATWG's URL Living Standard.
-To get started using WebURL, first add the package as a dependency (see the README for more information).
-Next, import the `WebURL` package:
+- [Welcome to WebURL](#welcome-to-weburl)
+- [A brief overview of URLs](#a-brief-overview-of-urls)
+- [The WebURL Type](#the-weburl-type)
+ * [A Closer Look At: Parsing and Web Compatibility](#a-closer-look-at-parsing-and-web-compatibility)
+- [Reading And Writing URL Components](#reading-and-writing-url-components)
+ * [A Closer Look At: Percent Encoding](#a-closer-look-at-percent-encoding)
+- [Relative References](#relative-references)
+ * [A Closer Look At: The Foundation URL object model vs. WebURL](#a-closer-look-at-the-foundation-url-object-model-vs-weburl)
+- [Path Components](#path-components)
+- [Query Items](#query-items)
+- [File URLs](#file-urls)
+- [Wrapping up](#wrapping-up)
+
+_Estimated reading time: About 30 minutes._
+
+
+# Welcome to WebURL
+
+
+Welcome! WebURL is a new URL library for Swift, featuring:
+
+- A modern URL model **compatible with the web**,
+
+- An intuitive, expressive API **designed for Swift**, and
+
+- A **fast** and memory-efficient implementation.
+
+This guide will introduce the core WebURL API, share some insights about how working with `WebURL` is different from Foundation's `URL`, and explain how WebURL helps you write more robust, interoperable, and performant code. After reading this guide, you should feel just as comfortable using `WebURL` as you are with Foundation's familiar `URL` type.
+
+WebURL is a model-level library, meaning it handles parsing and manipulating URLs. It comes with an integration library for `swift-system` and Apple's `System.framework` for processing file URLs, and we maintain a [fork](https://github.com/karwa/async-http-client) of `async-http-client` for making http(s) requests. We hope to expand this in the near future with support for Foundation's `URLSession`.
+
+To use WebURL in a SwiftPM project, begin by adding the package as a dependency:
+
+```swift
+// swift-tools-version:5.3
+import PackageDescription
+
+let package = Package(
+ name: "MyPackage",
+ dependencies: [
+ .package(
+ url: "https://github.com/karwa/swift-url",
+ .upToNextMajor(from: "0.2.0") // or `.upToNextMinor`
+ )
+ ],
+ targets: [
+ .target(
+ name: "MyTarget",
+ dependencies: [
+ .product(name: "WebURL", package: "swift-url")
+ ]
+ )
+ ]
+)
+```
+
+
+# A brief overview of URLs
+
+
+Most of us can recognize a URL when we see one, but let's briefly go into a bit more detail about what they are and how they work.
+
+The standard describes a URL as "a universal identifier". It's a very broad definition, because URLs have a lot of diverse uses. They are used for identifying resources on networks (such as this document on the internet), for identifying files on your computer, locations within an App, books, people, places, and everything in-between.
+
+A URL may be split into various _components_:
+
+```
+ userinfo hostname port
+ ┌──┴───┐ ┌──────┴──────┐ ┌┴┐
+ https://john.doe@www.example.com:123/forum/questions/?tag=networking&order=newest#top
+ └─┬─┘ └───────────┬──────────────┘└───────┬───────┘ └───────────┬─────────────┘ └┬┘
+ scheme authority path query fragment
+```
+
+The most important of these are:
+
+- The **scheme**. Identifies the type of URL and can be used to dispatch the URL for further processing. For example, `http` URLs should be processed in requests according to the HTTP protocol, but `file` URLs map to a local filesystem. Your application might use a custom scheme with bespoke URL processing. All URLs **must** have a scheme.
+
+- The **hostname** and **port**. Typically a hostname is a network address, but is sometimes an opaque identifier in URLs where a network address is not necessary. `www.swift.org`, `192.168.0.1`, and `living-room-pc` are all examples of hostnames. Together with the userinfo components (**username** and **password**), they make up the URL's _authority_ section.
+
+- The **path** is either an opaque string or a list of zero or more strings, usually identifying a location. List-style paths begin with a forward-slash ("/"), and their internal components are delimited by forward-slashes.
+
+- The **query** and **fragment** are opaque strings, whose precise structure is not standardized. A common convention is to include a list of key-value pairs in the query string, and some protocols reserve the fragment for client-side information and ignore its value when servicing requests.
+
+To process this URL, we look at the scheme ("https") and hand it off to a request library that knows how to deal with https URLs. That request library knows that HTTPS is a network protocol, so it will read the hostname and port from the URL and attempt to establish a connection to the network address they describe. The HTTP protocol requires that requests specify a "request target", which it says is made by joining a URL's path ("/forum/questions/") with its query string ("?tag=networking&order=newest"), and discarding the fragment. This results in a request such as the following being issued over the connection:
+
+`GET /forum/questions/?tag=networking&order=newest HTTP/1.1`
+
+Note that most of the process from URL to request is scheme/protocol specific; HTTP(S) works like this, but URLs with other schemes may be processed in a very different way. This flexibility is very important; URLs are, after all, "universal", which means that for the most part they work at an abstract level. By itself, a URL doesn't know what it points to, how it will be processed, or what its components mean to whoever processes them.
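+
+The request-target rule described above can be sketched in a few lines of Swift. This is only an illustration: `requestLine(method:path:query:)` is a hypothetical helper, not part of WebURL or any request library, and it assumes the path and query have already been extracted from the URL in their percent-encoded form.
+
+```swift
+// Hypothetical helper (an assumption for illustration, not a WebURL API):
+// builds an HTTP/1.1 request line from a URL's already-parsed components.
+func requestLine(method: String, path: String, query: String?) -> String {
+  // The request target is the path, followed by "?" and the query if present.
+  // An empty path is sent as "/". The fragment is never part of the request.
+  var target = path.isEmpty ? "/" : path
+  if let query = query {
+    target += "?" + query
+  }
+  return "\(method) \(target) HTTP/1.1"
+}
+
+let line = requestLine(
+  method: "GET",
+  path: "/forum/questions/",
+  query: "tag=networking&order=newest"
+)
+// line == "GET /forum/questions/?tag=networking&order=newest HTTP/1.1"
+```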
+
+Almost all URLs identify a location _hierarchically_, meaning they have:
+
+- A hostname, and/or
+- A list-style path (a path starting with "/")
+
+For example, let's imagine a simple URL scheme for identifying places by their postal addresses. In our scheme, the address:
+
+```
+221b Baker St,
+London NW1 6XE,
+UK
+```
+
+Might become:
+
+`postal-address://UK/London/NW1 6XE/Baker St/221b`
+
+Here, the components are:
+
+- **hostname:** `UK`
+- **path:** `/London/NW1 6XE/Baker St/221b`
+
+This describes a simple hierarchical structure - with a country being at the top level, followed by the city, post-code, street, and finally building number. If we wanted to resolve a neighboring address on the same street, we could change the final component to "222" or "239". To process this URL, we would follow a similar set of steps to processing an HTTP URL: firstly, examining the scheme, then contacting the resource's authority ("UK"), and asking it to resolve the path "/London/NW1 6XE/Baker St/221b".
+
+Less commonly, a URL may have an _opaque path_. For example:
+
+- `mailto:bob@example.com`
+- `data:image/png;base64,iVBORw0KGgoAAA...`
+- `javascript:alert("hello, world!")`
+
+These URLs do not identify their locations hierarchically: they lack an authority, and their paths are simple opaque strings rather than a list of components. You can recognize them by noting that the character after the scheme delimiter (":") is not a forward-slash. Whilst the lack of hierarchy limits what we can do with the path, these URLs are processed in a familiar way: use the scheme to figure out which kind of URL it is, and interpret the other components based on their meaning for that scheme.
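+
+The "character after the scheme delimiter" rule of thumb can be written down as a small check. The function below is a hypothetical helper for illustration only, and it assumes a well-formed, already-normalized URL string; a real parser is more lenient (for example, browsers accept `http:example.com`), so prefer the parsed URL's own properties in real code.
+
+```swift
+// Hypothetical helper (an assumption for illustration, not a WebURL API).
+// Returns true if the string looks like a URL with an opaque path.
+func looksLikeOpaquePath(_ urlString: String) -> Bool {
+  // Find the scheme delimiter; without one this is not an absolute URL string.
+  guard let colon = urlString.firstIndex(of: ":") else { return false }
+  let afterScheme = urlString[urlString.index(after: colon)...]
+  // Hierarchical URLs continue with "/"; anything else is an opaque path.
+  return afterScheme.first != "/"
+}
+
+looksLikeOpaquePath("mailto:bob@example.com")       // true
+looksLikeOpaquePath("https://www.example.com/")     // false
+looksLikeOpaquePath("my-app:/profile/settings")     // false
+```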
+
+OK, so that's a _very, very_ brief look at what URLs are, how they're used, and which flavors they come in.
+
+
+# The WebURL Type
+
+
+A URL is represented by a type also named `WebURL`. You can create a `WebURL` value by parsing a string:
```swift
import WebURL
+
+let url = WebURL("https://github.com/karwa/swift-url")!
+url.scheme // "https"
+url.hostname // "github.com"
+url.path // "/karwa/swift-url"
```
-To parse a URL from a `String`, use the initializer:
+`WebURL` supports URLs with both hierarchical and opaque paths:
```swift
-let url = WebURL("https://github.com/karwa/swift-url/")!
+let url = WebURL("mailto:bob@example.com")!
+url.scheme // "mailto"
+url.hostname // nil
+url.path // "bob@example.com"
+url.hasOpaquePath // true
```
-Note that this initializer expects an _absolute_ URL string - i.e. something which begins with a scheme (`"http:"`, `"file:"`, `"myapp:"`, etc).
+`WebURL` is a value type, meaning that each variable is isolated from changes made to other variables. `WebURL` values are lightweight, inherently thread-safe, and conform to many protocols from the standard library you may be familiar with:
-`WebURL` objects conform to many protocols from the standard library you may be familiar with:
- `Equatable` and `Hashable`, so they may be used as keys in a `Dictionary` or as members of a `Set`,
- `Comparable`, so they may be sorted,
- - `Codable`, so they may be serialized/deserialized from JSON or other formats, and
- - `LosslessStringConvertible`, as `WebURL` abides by the URL Standard's requirement that
- converting a URL to/from a `String` must never change how the URL is interpreted.
+ - `Codable`, so they may be serialized/deserialized from JSON and other formats,
+ - `Sendable`, as they are thread-safe,
+ - `LosslessStringConvertible`, as `WebURL` objects can be converted to a `String` and back without losing information.
+
+Next, we're going to take a closer look at how parsing behaves, and how it differs from Foundation's `URL`.
+
+
+## A Closer Look At: Parsing and Web Compatibility
+
+
+The previous section introduced parsing a `WebURL` from a URL string.
+
+The URL Standard defines how URL strings are parsed to create an object, and how that object is later serialized to a URL string. This means that constructing a `WebURL` from a string doesn't only parse - it also _normalizes_ the URL, based on how the parser interpreted it. There are some significant benefits to this; whilst the parser is very lenient (literally, as lenient as a web browser), the result is a clean, simplified URL, free of many of the "quirks" required for web compatibility.
+
+This is a very different approach to Foundation's `URL`, which tries hard to preserve the string exactly as you provide it (even if it ends up being needlessly strict), and offers operations such as `.standardize()` to clean things up later. Let's take a look at some examples which illustrate this point:
+
+```swift
+import Foundation
+import WebURL
+
+// Foundation requires your strings to be properly percent-encoded in advance.
+// WebURL is more lenient, and adds encoding where necessary.
+
+URL(string: "http://example.com/some path/") // nil, fails to parse
+WebURL("http://example.com/some path/") // "http://example.com/some%20path/"
+
+// This can be a particular problem for developers if their strings might contain
+// Unicode characters.
+
+URL(string: "http://example.com/search?text=банан") // nil, fails to parse
+WebURL("http://example.com/search?text=банан") // "http://example.com/search?text=%D0%B1%D0%B0%D0%BD%D0%B0%D0%BD"
+
+// Common syntax error: too many slashes. Browsers are quite forgiving about this,
+// because HTTP URLs with empty hosts aren't even valid.
+// WebURL is as lenient as a browser.
+
+URL(string: "http:///example.com/foo") // "http:///example.com/foo", .host = nil
+WebURL("http:///example.com/foo") // "http://example.com/foo", .host = "example.com"
+
+// Lots of normalization:
+// - IP address rewritten in canonical form,
+// - Default port removed,
+// - Path simplified.
+// The results look nothing alike.
+
+URL(string: "http://0x7F.1:80/some_path/dir/..") // "http://0x7F.1:80/some_path/dir/.."
+WebURL("http://0x7F.1:80/some_path/dir/..") // "http://127.0.0.1/some_path/"
+```
+
+One of the issues developers sometimes discover (after hours of debugging!) is that while types like Foundation's `URL` conform to _a_ URL standard, there are actually **multiple, incompatible URL standards**(!). Whichever URL library you are using, whether Foundation's `URL`, cURL, or Python's urllib, may not match how your server, browser, or Java/Rust/C++/etc clients interpret URLs.
-## Basic Components
+"Running different parsers and assuming that they end up with the exact same result is futile and, unfortunately, naive" says Daniel Stenberg, lead developer of the cURL library. It sounds incredible, but it's absolutely true, and it should surprise nobody that these varying and ambiguous standards lead to bugs - some of which are just annoying, but others can be catastrophic, [exploitable](https://www.youtube.com/watch?v=voTHFdL9S2k) vulnerabilities.
-Once you have constructed a `WebURL` object, you can inspect its components, such as its `scheme`, `hostname` or `path`. Additionally, the entire URL string (its "serialization") is available via the `serialized` property:
+And multiple, incompatible standards are just part of the problem; the other part is that those standards fail to match reality on the web today, meaning browsers can't conform to them without _breaking the web_. The quirks shown above (e.g. being lenient about spaces and slashes) aren't limited to user input via the address bar - there are reports of servers sending URLs in HTTP redirects which include spaces, have too many slashes, or which include non-ASCII characters. Browsers are fine with those things, but then you try the same in your App and it doesn't work.
+
+All of this is why most URL libraries abandoned formal standards long ago - "Not even curl follows any published spec very closely these days, as we’re slowly digressing for the sake of 'web compatibility'" (Stenberg). To make things worse, each library incorporated different ad-hoc compatibility hacks, because there wasn't any standard describing what, precisely, "web compatibility" even meant.
+
+So URLs are in a pretty sorry state. But how do we fix them? Yet another standard? Well, admittedly, yes 😅 - **BUT** one developed for the web, as it really is, which browsers can also conform to. No more guessing or ad-hoc compatibility hacks.
+
+`WebURL` brings web-compatible URL parsing to Swift. It conforms to the new URL standard developed by major browser vendors, its author is an active participant in the standard's development, and it is validated using the shared web-platform-tests browsers use to test their own standards compliance. That means we can say with confidence that `WebURL` precisely matches how the new URL parser in Safari 15 behaves, and Chrome and Firefox [are working to catch up](https://wpt.fyi/results/url/url-constructor.any.html?label=experimental&label=master&aligned). This standard is also used by JavaScript's native `URL` class (including NodeJS), and new libraries are being developed for many other languages which also align to the new standard.
+
+By using `WebURL` in your application (especially if your request library uses `WebURL` for all URL handling, as [our `async-http-client` fork](https://github.com/karwa/async-http-client) does), you can guarantee that your application handles URLs just like a browser, with the same, high level of interoperability with legacy and "quirky" systems. The lenient parsing and normalization behavior shown above is a huge part of it; this is what "web compatibility" means.
+
+So that's a look at parsing, web compatibility, and how important they both are. Next, let's take a look at what you can do once you've successfully parsed a URL string.
+
+
+# Reading And Writing URL Components
+
+
+Once you have created a `WebURL` value, the core components you need to process it can be accessed as properties:
```swift
-url.scheme // "https"
+import WebURL
+
+let url = WebURL("https://john.doe@www.example.com:123/forum/questions/?tag=networking&order=newest#top")!
+
+url.scheme // "https"
+url.username // "john.doe"
+url.password // nil
+url.hostname // "www.example.com"
+url.port // 123
+url.path // "/forum/questions/"
+url.query // "tag=networking&order=newest"
+url.fragment // "top"
+```
+
+Furthermore, the URL's string representation is available by calling the `serialized()` function or by simply constructing a `String`:
+
+```swift
+import WebURL
+
+let url = WebURL("https://github.com/karwa/swift-url/")!
url.hostname // "github.com"
-url.path // "/karwa/swift-url/"
+url.path // "/karwa/swift-url/"
url.serialized() // "https://github.com/karwa/swift-url/"
+String(url) // "https://github.com/karwa/swift-url/"
+```
+
+As well as reading the value of a component, you may also use these properties to modify a component's value:
+
+```swift
+import WebURL
+
+var url = WebURL("http://github.com/karwa/swift-url/")!
+
+// Upgrade to https:
+url.scheme = "https"
+url // "https://github.com/karwa/swift-url/"
+
+// Change the path:
+url.path = "/apple/swift/"
+url // "https://github.com/apple/swift/"
+```
+
+`WebURL` is always in a normalized state, so any values you set are parsed and normalized in the same way that the URL string parser would handle them. Note that when you assign a value to one of these core URL properties, `WebURL` assumes the value is already percent-encoded. We'll take a closer look at what this means in the next section.
+
+```swift
+var url = WebURL("http://example.com/my_files/")!
+
+// Change the hostname using a non-canonical IPv4 address format:
+url.hostname = "0x7F.1"
+url // "http://127.0.0.1/my_files/"
+
+// Change the path:
+url.path = "/some path/dir/.."
+url // "http://127.0.0.1/some%20path/"
+
+// Set a partially percent-encoded value:
+url.path = "/swift%2Durl/some path"
+url // "http://127.0.0.1/swift%2Durl/some%20path"
+```
+
+Although the setters are very permissive, they can sometimes fail if an operation is not valid:
+
+```swift
+var url = WebURL("https://example.com/")!
+url.hostname = ""
+url // "https://example.com/" - didn't change
+```
+
+> Note: The silent failure is something we'd like to improve in the future. Ideally, Swift would support throwing property setters, but until it does, it's still worth keeping this convenient syntax because failure is generally quite rare.
+
+If you need to respond to setter failures, WebURL provides throwing setter methods as an alternative. The errors thrown by these methods provide helpful diagnostics about why the failure occurred - in this case, "https" URLs are not allowed to have an empty hostname, and this is enforced at the URL level.
+
+```swift
+var url = WebURL("https://example.com/")!
+try url.setHostname(to: "") // Throws. Error description:
+// "Attempt to set the hostname to the empty string, but the URL's scheme requires a non-empty hostname..."
+```
+
+> Note: WebURL **never** includes specific URL details in error descriptions, so you may log errors without compromising user privacy.
+
+That covers the basics of getting/setting a URL's components. But there's still an important piece we haven't touched on yet, and that is percent-encoding. Let's take a closer look at how `WebURL` handles that in the next section.
+
+
+## A Closer Look At: Percent Encoding
+
+
+Because URLs are a string format, certain characters may have special meaning depending on their position in the URL: for example, a "/" in the path is used to separate the path's components, and the first "?" in the string marks the start of the query. Additionally, some characters (like NULL bytes or newlines) could be difficult to process if used directly in a URL string. This poses an interesting question: what if I have a path component that needs to contain a _literal_ forward-slash?
+
+```
+https://example.com/music/bands/AC/DC // "AC/DC" should be a single component!
+```
+
+When these situations occur, we need to encode (or "escape") our path component's slash, so it won't be confused with the other slashes around it. The encoding URLs use is called **percent-encoding**, because it replaces the character with one or more "%XX" sequences, where "XX" is a byte value in hexadecimal. For the ASCII forward-slash character, the byte value is 2F, so the URL shown above would become:
+
+```
+https://example.com/music/bands/AC%2FDC
+```
+
+Of course, the machine processing this request will have to decode the path component to recover its intended value. The set of characters which need to be encoded depends on the component, although it is sometimes necessary to encode additional characters if you are embedding the URL within a larger document.
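+
+To make the "%XX" mechanics concrete, here is a minimal sketch of an encoder written with only the standard library. `percentEncodeStrict` is a hypothetical name, and the set of characters it leaves alone (ASCII letters, digits, and `-._~`) is an assumption chosen for simplicity; it is stricter than the encode sets defined by the URL standard and is not how WebURL is implemented.
+
+```swift
+// Hypothetical helper (an assumption for illustration, not a WebURL API).
+// Encodes every UTF-8 byte except ASCII letters, digits, and "-._~".
+func percentEncodeStrict(_ input: String) -> String {
+  let hexDigits = Array("0123456789ABCDEF")
+  var result = ""
+  for byte in input.utf8 {
+    switch byte {
+    case UInt8(ascii: "a")...UInt8(ascii: "z"),
+         UInt8(ascii: "A")...UInt8(ascii: "Z"),
+         UInt8(ascii: "0")...UInt8(ascii: "9"),
+         UInt8(ascii: "-"), UInt8(ascii: "."),
+         UInt8(ascii: "_"), UInt8(ascii: "~"):
+      // Safe character: copy it through unchanged.
+      result.append(Character(Unicode.Scalar(byte)))
+    default:
+      // Everything else becomes "%" followed by the byte value in hex.
+      result.append("%")
+      result.append(hexDigits[Int(byte >> 4)])
+      result.append(hexDigits[Int(byte & 0x0F)])
+    }
+  }
+  return result
+}
+
+percentEncodeStrict("AC/DC")       // "AC%2FDC"
+percentEncodeStrict("Blink-%182")  // "Blink-%25182"
+```
+
+Note that a non-ASCII character is encoded as several "%XX" sequences, one per UTF-8 byte.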
+
+`WebURL` returns the core URL components described in the previous section (`scheme`, `username`, `password`, `hostname`, `port`, `path`, `query`, and `fragment`) as they appear in the URL string, including percent-encoding (sometimes called the "raw" value). However, the library also includes a number of extensions to types and protocols from the Swift standard library, so you can encode and decode values as needed. For example, to decode a percent-encoded `String`, use the `.percentDecoded()` function:
+
+```swift
+import WebURL
+
+// Note: "%20" is a percent-encoded space character.
+let url = WebURL("https://github.com/karwa/swift%20url/")!
+
+url.path // "/karwa/swift%20url/"
+url.path.percentDecoded() // "/karwa/swift url/"
+```
+
+> Tip: The `scheme` is never percent-encoded (the standard forbids it), and neither is the `port` (it's just a number, not a string).
+
+Since the values returned by these properties are percent-encoded, if you set these properties to a new value, that value must also be percent-encoded. When constructing a component value using arbitrary strings at runtime, we need to consider that it might contain a forward-slash, or the substring "%2F", which should be interpreted _literally_ and not as a path separator or percent-encoding. That means we need to encode values ourselves as we build the URL component's value, which we can do by using the `.percentEncoded(using:)` function and specifying the set of characters to encode:
+
+```swift
+import WebURL
+
+// If we don't percent-encode 'bandName', it might be misinterpreted
+// once we set the URL's path.
+
+func urlForBand_bad(_ bandName: String) -> WebURL {
+ var url = WebURL("https://example.com/")!
+ url.path = "/music/bands/" + bandName
+ return url
+}
+
+urlForBand_bad("AC/DC") // "https://example.com/music/bands/AC/DC" ❌
+urlForBand_bad("Blink-%182") // "https://example.com/music/bands/Blink-%182" ❌
+
+// Percent-encoding allows us to preserve 'bandName' exactly.
+// Note: "%25" is a percent-encoded ASCII percent sign.
+
+func urlForBand_good(_ bandName: String) -> WebURL {
+ var url = WebURL("https://example.com/")!
+ url.path = "/music/bands/" + bandName.percentEncoded(using: .urlComponentSet)
+ return url
+}
+
+urlForBand_good("AC/DC") // "https://example.com/music/bands/AC%2FDC" ✅
+urlForBand_good("Blink-%182") // "https://example.com/music/bands/Blink-%25182" ✅
+```
+
+> Note: The `.pathComponents` and `.formParams` views (discussed later) handle percent-encoding for you, and are the preferred way to construct URL paths and query strings. In this example, we're building a URL component using string concatenation, so we need to encode the pieces manually as we build the string.
+
+The URL standard defines several percent-encode sets, and you can also define your own. `.urlComponentSet` is usually a good choice, since it encodes all special characters used by all URL components. Strings encoded with the component set can be spliced into any other component string without affecting the component's structure, and preserve the encoded value exactly. It is equivalent to encoding using the JavaScript function `encodeURIComponent()`.
+
+This behavior, where getting a component includes its percent-encoding, matches most other URL libraries including JavaScript's URL class, rust-url, Python's urllib, etc. However, it's important to point out that it does **not** match Foundation, which automatically decodes URL components:
+
+```swift
+import Foundation
+
+let url = URL(string: "https://example.com/music/bands/AC%2FDC")!
+
+url.path // "/music/bands/AC/DC"
+```
+
+When transitioning from Foundation to `WebURL`, this is an area where you may need to think carefully and make some adjustments to your code, so you deserve a full explanation of what's happening here, and why `WebURL` has decided not to match Foundation's behavior.
+
+As we explained at the start of this section, URLs percent-encode their components in order to maintain an unambiguous structure - but decoding is a lossy process which can erase vital structural information. Look at the returned path in the above example; it's impossible to tell that "AC/DC" should actually be a single path component. That information is lost forever, and the value returned by Foundation.URL's `.path` property is just _not the same_ as the path actually represented by the URL. It's just wrong.
+
+This doesn't just apply to the path; it applies to any component which can have internal structure (which is basically _every component_). The query and fragment are just opaque strings, and can have any internal structure you like (e.g. key-value pairs), and technically there's nothing stopping you doing the same for the username, password, hostname, or even the port number. As explained in [A brief overview of URLs](#a-brief-overview-of-urls), custom schemes have wide authority to interpret URL components as they see fit.
+
+The correct way to handle percent-encoded URL components is to: (i) keep the encoding intact, (ii) parse the component into its smallest units, and (iii) percent-decode each unit. For example, when splitting a path, split the percent-encoded path and decode each path component individually; or when splitting a query into key-value pairs, split the percent-encoded query and decode each key and value separately.
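+
+The difference between "split, then decode" and "decode, then split" is easy to see in a short sketch. `decodedComponents(ofRawPath:)` is a hypothetical helper, not a WebURL API; it uses Foundation's `removingPercentEncoding` for decoding, and it simplifies matters by dropping empty components (so a trailing slash is ignored, which a real path parser would not do).
+
+```swift
+import Foundation
+
+// Hypothetical helper (an assumption for illustration, not a WebURL API).
+// Splits a raw, still percent-encoded path, then decodes each component.
+func decodedComponents(ofRawPath rawPath: String) -> [String] {
+  rawPath
+    .split(separator: "/", omittingEmptySubsequences: true)
+    .map { component in
+      // Keep the raw text if the escapes do not decode to valid UTF-8.
+      String(component).removingPercentEncoding ?? String(component)
+    }
+}
+
+let rawPath = "/music/bands/AC%2FDC"
+
+// Correct: split the raw path first. "AC/DC" stays a single component.
+let splitThenDecode = decodedComponents(ofRawPath: rawPath)
+// ["music", "bands", "AC/DC"]
+
+// Lossy: decoding first turns "%2F" into a real separator.
+let decodeThenSplit = (rawPath.removingPercentEncoding ?? rawPath)
+  .split(separator: "/")
+  .map(String.init)
+// ["music", "bands", "AC", "DC"]
+```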
+
+If decoding is automatic, it can be really easy to forget that it's happening -- so easy, in fact, that `URL` itself sometimes forgets! The following example demonstrates a bug in the implementation of `URL.pathComponents` (macOS 11.6, Xcode 13 GM), which was discovered while writing this guide. Internally, the `URL.pathComponents` property [gets the URL's `.path`](https://github.com/apple/swift-corelibs-foundation/blob/dab95988ca12904e320975c7ed3c4f435552f14e/Sources/Foundation/NSURL.swift#L41), and splits it at each forward-slash. Unfortunately, since `URL.path` is automatically decoded, the values returned by `URL.pathComponents` do not match the components seen by other methods on `URL`, which correctly use the raw path:
+
+```swift
+// filed as: https://bugs.swift.org/browse/SR-15363
+import Foundation
+var url = URL(string: "https://example.com/music/bands/AC%2FDC")!
+
+// URL.pathComponents splits the percent-decoded path.
+
+for component in url.pathComponents {
+ component // "/", "music", "bands", "AC", "DC"
+}
+
+// URL.deleteLastPathComponent() uses the raw path instead.
+// It looks like it deleted 2 components.
+
+url.deleteLastPathComponent() // "https://example.com/music/bands/"
+```
+
+Consider also what happens if you want to write something like `urlA.path = urlB.path`. The following example demonstrates a simple file server; it accepts a request URL, simplifies the path using Foundation's `.standardize()` method (to resolve ".." components), authenticates the request, and fetches the file from an internal server:
+
+```swift
+import Foundation
+
+var requestedUrl = URL(string: "https://example.com/karwa/files/..%2F..%2Fmonica/files/")!
+
+// Normalize the path and check authentication for the subtree.
+requestedUrl.standardize()
+guard isAuthenticated(for: requestedUrl.pathComponents.dropFirst().first) else { throw InvalidAccess() } // Skip the leading "/" component.
+
+// Build the internal URL.
+var internalUrl = URLComponents(string: "https://data.internal/")!
+internalUrl.path = requestedUrl.path
+```
+
+Let's examine the result, in `internalUrl`:
+
+```swift
+// The "%2F" became a real slash when it was automatically decoded.
+
+internalUrl.url // "https://data.internal/karwa/files/../../monica/files"
+
+// Normalization (which intermediate software/caches/routers/etc are allowed to do)
+// may simplify this.
+
+internalUrl.url?.standardized // "https://data.internal/monica/files"
```
-Components are returned as they appear in the URL string, including any percent-encoding. The `WebURL` package includes a number of extensions to standard library types and protocols,
-to help you add and remove percent-encoding from strings. To remove percent-encoding, use the `percentDecoded` property, which is made available to all `String`s:
+The system authenticated for one user, but ends up requesting a file belonging to a different user. This sort of bug is not uncommon, and can be used to bypass software filters and access internal services or configuration files.
+
+`WebURL`'s model ensures that setting `urlA.path = urlB.path` does not add or remove percent-encoding. It keeps the component's structure intact, and makes these kinds of mistakes a lot more difficult.
```swift
-let url = WebURL("https://github.com/karwa/swift%2Durl/")!
-url.path // "/karwa/swift%2Durl/"
-url.path.percentDecoded // "/karwa/swift-url/"
+import WebURL
+
+let requestedUrl = WebURL("https://example.com/karwa/files/..%2F..%2Fmonica/files/")!
+
+// WebURL is already standardized =)
+// Check authentication for the subtree.
+guard isAuthenticated(for: requestedUrl.pathComponents.first) else { throw InvalidAccess() }
+
+// Build the internal URL.
+var internalUrl = WebURL("https://data.internal/")!
+internalUrl.path = requestedUrl.path
+
+// WebURL keeps the percent-encoding intact.
+internalUrl // "https://data.internal/karwa/files/..%2F..%2Fmonica/files/"
```
-## Relative URLs
+Okay, so - this was a bit of a long section, but I hope you found it useful. Percent-encoding can be difficult to get right, so don't worry if you didn't quite understand everything. If you find yourself unsure what to do while working on your application/library, refer back to this section or ask a question on the [Swift forums](https://forums.swift.org/c/related-projects/weburl/73) and we'll be happy to help.
+
+Now might be a good time to have a little break or make a cup of tea. Next stop is relative references.
-You can also create a URL by resolving a string relative to an existing, absolute URL (the "base URL").
-The result of this is another absolute URL, pointing to the same location as an HTML `<a>` tag on the base URL's page:
+
+# Relative References
+
+
+Previously, we saw that we could construct a `WebURL` value by parsing a URL string. Another way to construct a `WebURL` is by resolving a relative reference using an existing `WebURL` as its "base". You can think of this as modelling where an HTML `<a>` hyperlink on the base URL's 'page' would lead - for example, resolving the relative reference "/search?q=test" against the base URL "https://www.example.com/" produces "https://www.example.com/search?q=test".
```swift
let base = WebURL("https://github.com/karwa/swift-url/")!
-base.resolve("pulls/39")! // "https://github.com/karwa/swift-url/pulls/39"
-base.resolve("/apple/swift/")! // "https://github.com/apple/swift/"
-base.resolve("..?tab=repositories")! // "https://github.com/karwa/?tab=repositories"
-base.resolve("https://swift.org/")! // "https://swift.org"
+base.resolve("pulls/39") // "https://github.com/karwa/swift-url/pulls/39"
+base.resolve("/apple/swift/") // "https://github.com/apple/swift/"
+base.resolve("..?tab=repositories") // "https://github.com/karwa/?tab=repositories"
+base.resolve("https://swift.org/") // "https://swift.org/"
```
-This is not limited to http(s) URLs; it works for every URL, including "file" URLs:
+Relative references can be resolved against any URL, including file URLs or URLs with custom schemes:
```swift
-let appData = WebURL("file:///tmp/")!.resolve("my_app/data/")!
-// appData = "file:///tmp/my_app/data/"
-let mapFile = appData.resolve("../other_data/map.json")!
-// mapFile = "file:///tmp/my_app/other_data/map.json"
+let appData = WebURL("file:///tmp/my_app/data/")! // "file:///tmp/my_app/data/"
+appData.resolve("metrics.json") // "file:///tmp/my_app/data/metrics.json"
+appData.resolve("../other_data/map.json") // "file:///tmp/my_app/other_data/map.json"
+
+let deeplink = WebURL("my-app:/profile/settings/language")! // "my-app:/profile/settings/language"
+deeplink.resolve("payment") // "my-app:/profile/settings/payment"
```
-## Modifying URLs
+References are resolved using the algorithm specified by the URL standard, meaning it matches what a browser would do. They are a powerful feature, with a wide variety of applications ranging from navigation to resolving HTTP redirects.
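+
+To give a feel for the path-handling part of resolution, here is a minimal sketch in plain Swift. It is an illustration only, not the algorithm from the URL standard or WebURL's implementation: the `resolvePath` function is a hypothetical helper that works on path strings alone and ignores schemes, hosts, queries, fragments, percent-encoded dot segments, and Windows drive letters.
+
```swift
/// Resolves a relative path reference against an absolute base path.
/// Illustrative only: handles "." and ".." segments, nothing else.
func resolvePath(_ reference: String, against basePath: String) -> String {
  precondition(basePath.hasPrefix("/"), "the base path must be absolute")
  // An empty reference leaves the base path unchanged.
  if reference.isEmpty { return basePath }

  var input = reference.split(separator: "/", omittingEmptySubsequences: false).map(String.init)
  var stack: [String]
  if reference.hasPrefix("/") {
    // An absolute-path reference replaces the base path entirely.
    stack = []
    input.removeFirst()  // Drop the empty piece before the leading "/".
  } else {
    // Otherwise start from the base's "directory": everything except its last segment.
    stack = basePath.split(separator: "/", omittingEmptySubsequences: false).map(String.init)
    stack.removeFirst()  // Drop the empty piece before the leading "/".
    stack.removeLast()   // Drop the final segment (a file name, or "" for directory paths).
  }

  for (index, segment) in input.enumerated() {
    let isLast = index == input.count - 1
    switch segment {
    case "..":
      if !stack.isEmpty { stack.removeLast() }  // ".." never escapes the root.
      if isLast { stack.append("") }            // The result is a directory path.
    case ".":
      if isLast { stack.append("") }
    default:
      stack.append(segment)
    }
  }
  return "/" + stack.joined(separator: "/")
}

print(resolvePath("pulls/39", against: "/karwa/swift-url/"))                // /karwa/swift-url/pulls/39
print(resolvePath("/apple/swift/", against: "/karwa/swift-url/"))           // /apple/swift/
print(resolvePath("..", against: "/karwa/swift-url/"))                      // /karwa/
print(resolvePath("../other_data/map.json", against: "/tmp/my_app/data/"))  // /tmp/my_app/other_data/map.json
```
+
+The paths printed here line up with the path portions of the `resolve` examples above. In real code, prefer `resolve` itself, since it applies the complete algorithm.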
+
+One particular use-case is in server frameworks, where relative references can be used for routing. Currently, `WebURL` does not contain functionality for calculating the relative difference between 2 URLs, and is limited to references in string form. We'd like to explore adding this functionality and a richer object type in future releases.
+
+In the "closer look" section, we'll discuss some of the profound differences in how WebURL and Foundation approach relative references, and how these allow `WebURL` to be simpler, faster, and more intuitive.
+
+
+## A Closer Look At: The Foundation URL object model vs. WebURL
+
-`WebURL` does not need an intermediate type like `URLComponents`. Instead, components may be set directly.
+This topic gets to the heart of the `WebURL` object model, and some of its biggest differences from Foundation. According to the URL standard (hence `WebURL`), URLs must be absolute, and relative references themselves are not URLs. This makes intuitive sense; a set of directions, such as "second street on the right" only describes a location if you know where to start from, and a familial relative, such as "brother", only identifies that person if we know who the description is relative to ("my brother", "your brother", or "his/her brother"). Similarly, a relative URL reference, such as `"..?tab=repositories"`, is just a bag of components that doesn't point to or identify anything until it has been resolved using a base URL.
-Modifications are efficient, and occur in-place on the URL's existing storage object as capacity and value semantics allow.
+Once again, Foundation's `URL` has a significantly different model, which combines both absolute URLs and relative references in a single type. This leads to a situation where even strings which look nothing like URLs still parse successfully:
```swift
-var url = WebURL("http://github.com/karwa/swift-url/")!
+import Foundation
-// Upgrade to https:
-url.scheme = "https"
-url.serialized() // "https://github.com/karwa/swift-url/"
+URL(string: "foo") // "foo"
+```
-// Change the path:
-url.path = "/apple/swift/"
-url.serialized() // "https://github.com/apple/swift/"
+Consider what developers likely expect when they declare a variable or function argument to have the type "`URL`". They probably expect that the value contains something they can request - something which points to, or identifies something. By combining relative references with absolute URLs, Foundation's semantic guarantees are dramatically weakened, to the extent that they basically defeat the reason developers want a URL type in the first place. "foo" is not a URL.
+
+Moreover, if you initialize a Foundation `URL` by resolving a relative reference against a base URL, the resulting object _presents_ its components as being joined, but remembers that they are separate, storing separate `relativeString` and `baseURL` values which you can access later:
+
+```swift
+import Foundation
+let url = URL(string: "b/c", relativeTo: URL(string: "http://example.com/a/")!)!
+
+// The URL components appear to be joined.
+
+url.host // "example.com"
+url.path // "/a/b/c"
+
+// But underneath, the parts are stored separately.
+
+url.relativeString // "b/c"
+url.baseURL // "http://example.com/a/"
+```
+
+These decisions have serious detrimental effects to _almost every aspect_ of Foundation's `URL` API and performance:
+
+- **The baseURL is an independent object, with its own memory allocation and lifetime.**
+
+ Even if you don't use relative URLs, you still pay for them (another pointer to store, another release/destroy on deinit).
+ The base URL may even have its own side-allocations (e.g. for URLResourceKeys on Apple platforms).
+
+ And what if the baseURL has its own baseURL? Should we make a linked-list of every URL you visited on the way?
+ It turns out that Foundation caps chains of baseURLs to avoid this, but at the cost of allocating yet more objects.
+
+- **Resolving URL components and strings on-demand is slow.**
+
+ So `URL` sometimes allocates additional storage to cache the resolved string. Again, this means higher overheads and greater complexity; you effectively store the URL twice.
+
+- **The relativeString-baseURL split is fragile.**
+
+ Since the split is not really part of the URL string (it's just a stored property in Foundation's URL object), it is lost if you perform a simple task like encoding the URL to JSON.
+
+ What is even more interesting is that there is a specific hack in Foundation's `JSONEncoder/JSONDecoder` to accommodate this, but even that is not perfect - it [can fail](https://forums.swift.org/t/url-fails-to-decode-when-it-is-a-generic-argument-and-genericargument-from-decoder-is-used/36238) if you wrap the URL in a generic type, and places third-party `Encoder` and `Decoder`s in a position where they need to decide between a lossy encoding or serializing URLs as 2 strings. But that's not even the worst of it...
+
+- **URLs with a different relativeString-baseURL split are not interchangeable**.
+
+ This is huge. Two `URL`s with the same `.absoluteString` can compare as unequal (`!=`); it depends on how you created each specific URL object.
+ This has a lot of ripple effects - for example, if you place a URL in a `Dictionary` or `Set`, it might not be found again unless you test it with a URL that was created using the exact same steps.
+
+Let's take a look at some code:
+
+```swift
+import Foundation
+
+let urlA = URL(string: "http://example.com/a/b/c")!
+let urlB = URL(string: "/a/b/c", relativeTo: URL(string: "http://example.com")!)!
+let urlC = URL(string: "b/c", relativeTo: URL(string: "http://example.com/a/")!)!
+
+// All of these URLs have the same .absoluteString.
+
+urlA.absoluteString == urlB.absoluteString // true
+urlB.absoluteString == urlC.absoluteString // true
+
+// But they are not interchangeable.
+
+urlA == urlB // false (!)
+urlB == urlC // false (!)
+URL(string: urlB.absoluteString) == urlB // false (!)
+
+// Let's imagine an application using URLs as keys in a dictionary:
+
+var operations: [URL: TaskHandle] = [:]
+operations[urlA] = TaskHandle { ... }
+operations[urlA] // TaskHandle
+operations[urlB] // nil (!)
```
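+
+The `TaskHandle` type in that snippet is a placeholder. As a self-contained sketch of one possible workaround when you are stuck with Foundation's `URL`, the dictionary can be keyed by the resolved string rather than by the URL object. This is only an illustration; the `String` values here stand in for whatever your application stores.
+
```swift
import Foundation

let urlA = URL(string: "http://example.com/a/b/c")!
let urlB = URL(string: "/a/b/c", relativeTo: URL(string: "http://example.com")!)!

// Key the dictionary by the resolved string instead of the URL object,
// so the relativeString/baseURL split no longer affects lookups.
var operations: [String: String] = [:]
operations[urlA.absoluteString] = "download"

print(operations[urlB.absoluteString] ?? "not found")  // download
```
+
+This works because the resolved string is the same however the URL object was built, but it relies on every caller remembering to do the conversion.
+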
-When you modify a component, the value you set will automatically be percent-encoded if it contains any illegal characters.
-This applies to the `username`, `password`, `path`, `query`, and `fragment` fields.
-Notably, it does not apply to the `scheme` or `hostname` - attempting to set an invalid `scheme` or `hostname` will fail.
+The cost of Foundation's object model is high, indeed; it results in unintuitive semantics, requires a much heavier object with several side allocations, and makes it more complex to deliver more valuable features like in-place mutation efficiently.
+
+`WebURL` takes a completely different approach.
+
+As mentioned previously, `WebURL`s are always absolute. They are always normalized, and are _entirely_ defined by their string representation, rather than their object representation or other side-data. It does not matter how you create a `WebURL` - you can parse an absolute URL string, resolve a relative reference, build it up in pieces using property setters, or decode a URL from JSON - if two `WebURL`s have the same serialized string, they are the same, and turning a `WebURL` into a string and back is lossless, as you would expect.
+
+This simpler model enables a raft of other improvements. `WebURL`s use less memory, have more predictable lifetimes, and are cheaper to create and destroy because they just _are_ simpler. Common operations like sorting, hashing and comparing URLs can be many times faster with `WebURL`, and have more intuitive semantics:
```swift
-var url = WebURL("https://example.com/my_files/secrets.txt")!
+import WebURL
+
+let urlA = WebURL("http://example.com/a/b/c")!
+let urlB = WebURL("http://example.com")!.resolve("/a/b/c")!
+let urlC = WebURL("http://example.com/a/")!.resolve("b/c")!
-url.username = "my username"
-url.password = "🤫"
-url.serialized() // "https://my%20username:%F0%9F%A4%AB@example.com/my_files/secrets.txt"
+// All of these URLs have the same serialization.
-url.hostname = "👾" // Fails, does not modify.
-url.serialized() // (unchanged)
+urlA.serialized() == urlB.serialized() // true
+urlB.serialized() == urlC.serialized() // true
+
+// And they are interchangeable, as you would expect.
+
+urlA == urlB // true
+urlB == urlC // true
+WebURL(urlB.serialized()) == urlB // true
+
+var operations: [WebURL: TaskHandle] = [:]
+operations[urlA] = TaskHandle { ... }
+operations[urlA] // TaskHandle
+operations[urlB] // TaskHandle
```
-In general, the setters are very permissive. However, if you do wish to detect and respond to failures to modify a component,
-use the corresponding throwing setter method instead. The thrown `Error`s contain specific information about why the operation failed,
-so it's easier for you to debug logic errors in your application.
+All of this leads to more robust code. It's delightfully boring; stuff just works, and you don't need to study a bunch of edge-cases to get the behavior you expect.
+
-## Path Components
+# Path Components
-You can access a URL's path components through the `pathComponents` property.
-This returns an object which conforms to Swift's `Collection` protocol, so you can use it in `for` loops and
-lots of other code directly, yet it efficiently shares storage with the URL it came from.
-The components returned by this view are automatically percent-decoded from their representation in the URL string.
+We've covered quite a lot already, and so far it's been pretty intense - deep discussions about percent-encoding, object models, etc. Thankfully, it's a bit lighter from here.
+
+We discussed earlier that a URL's path can represent locations hierarchically using a list of strings; but properly handling percent-encoding can be deceptively tricky. To make this easier, `WebURL` provides a convenient `.pathComponents` view which efficiently shares storage with the URL object, and conforms to Swift's `Collection` and `BidirectionalCollection` protocols so you can use it in `for` loops and generic algorithms.
```swift
let url = WebURL("file:///Users/karl/My%20Files/data.txt")!
@@ -125,8 +638,7 @@ if url.pathComponents.last!.hasSuffix(".txt") {
}
```
-Additionally, this view allows you to _modify_ a URL's path components.
-Any inserted components will be automatically percent-encoded in the URL string.
+Additionally, you can modify a URL's path components through this view.
```swift
var url = WebURL("file:///swift-url/Sources/WebURL/WebURL.swift")!
@@ -138,87 +650,163 @@ url.pathComponents.append("My Folder")
// url = "file:///swift-url/Sources/WebURL/My%20Folder"
url.pathComponents.removeLast(3)
+// url = "file:///swift-url"
url.pathComponents += ["Tests", "WebURLTests", "WebURLTests.swift"]
// url = "file:///swift-url/Tests/WebURLTests/WebURLTests.swift"
```
-Paths which end in a "/" (also called "directory paths"), are represented by an empty component at the end of the path.
-However, if you append to a directory path, `WebURL` will automatically remove that empty component for you.
-If you need to create a directory path, append an empty component, or use the `ensureDirectoryPath()` method.
+In contrast to top-level URL components such as the `.path` and `.query` properties, the elements of the `.pathComponents` view are automatically percent-decoded, and any inserted path components are encoded to preserve their contents _exactly_. Previously, we mentioned how to correctly handle percent-encoding when dealing with an entire URL component:
+
+> The correct way to handle percent-encoded URL components is to: (i) keep the encoding intact, (ii) parse the component in to its smallest units, and (iii) percent-decode each unit. For example, when splitting a path, split the percent-encoded path and decode each path component individually
+
+This is what `.pathComponents` (and other views) do for you.
```swift
-var url = WebURL("https://api.example.com/v1/")!
+// Remember that %2F is an encoded forward-slash.
+var url = WebURL("https://example.com/music/bands/AC%2FDC")!
for component in url.pathComponents {
- ... // component = "v1", "".
+ ... // component = "music", "bands", "AC/DC".
}
-url.pathComponents += ["users", "karl"]
-// url = "https://api.example.com/v1/users/karl"
-// components = "v1", "users", "karl".
+// Inserted components are automatically encoded,
+// so they will be preserved exactly.
-url.pathComponents.ensureDirectoryPath()
-// url = "https://api.example.com/v1/users/karl/"
-// components = "v1", "users", "karl", "".
+url.pathComponents.removeLast()
+url.pathComponents.append("blink-%182")
+// url = "https://example.com/music/bands/blink-%25182"
+
+url.pathComponents.last // "blink-%182"
```
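+
+To see why the order of splitting and decoding matters, here is a small sketch that does both by hand, using only Foundation's `removingPercentEncoding`. It is not how `.pathComponents` is implemented; it just contrasts the two orders on the path from the example above.
+
```swift
import Foundation

// The raw, still-encoded path from the example above.
let encodedPath = "/music/bands/AC%2FDC"

// Correct: split on "/" first, then percent-decode each component.
let splitThenDecode = encodedPath
  .split(separator: "/")
  .map { String($0).removingPercentEncoding ?? String($0) }

// Incorrect: decoding first turns %2F into a real "/",
// so the band name is split into two components.
let decodeThenSplit = (encodedPath.removingPercentEncoding ?? encodedPath)
  .split(separator: "/")
  .map(String.init)

print(splitThenDecode)  // ["music", "bands", "AC/DC"]
print(decodeThenSplit)  // ["music", "bands", "AC", "DC"]
```
+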
-## Form-Encoded Query Items
+In short, the `.pathComponents` view is the best way to read or modify a URL's path at the path-component level. It's efficient, and Swift's Collection protocols give it a large number of convenient operations out-of-the-box (such as map/reduce, slicing, pattern matching).
+
-You can also access the key-value pairs in a URL's query string using the `formParams` property.
-As with `pathComponents`, this returns an object which shares storage with the URL it came from.
+# Query Items
-You can use Swift's "dynamic member" feature to access query parameters as though they were properties.
-For example, in the query string `"from=EUR&to=USD"`, accessing `url.formParams.from` will return `"EUR"`.
-For parameters whose names cannot be used in Swift identifiers, the `get` method will also return the corresponding value for a key.
-Additionally, all of the query's key-value pairs are available as a Swift `Sequence` via the `allKeyValuePairs` property.
+Just as a URL's `path` may contain a list of path components, the `query` often contains a list of key-value pairs. WebURL provides a `.formParams` view which allows you to efficiently read and modify key-value pairs in a URL's query string.
-This view assumes that the query string's contents are encoded using `application/x-www-form-urlencoded` ("form encoding"),
-and all of the keys and values returned by this view are automatically decoded from form-encoding.
+Formally the query is just an opaque string; even if a URL has a query component, there is no guarantee that interpreting it as a list of key-value pairs makes sense, or which style of encoding they might use. The key-value pair convention began with HTML forms, which used an early version of percent-encoding (`application/x-www-form-urlencoded` or "form encoding"). As tends to be the way with these things, form-encoding looks very similar to, but is incompatible with, percent-encoding as we know it today, due to substituting spaces with the "+" character. These days, the convention has spread far beyond HTML forms, although other applications sometimes use percent-encoding rather than strictly form-encoding. It's up to you to know which sort of encoding applies to your URLs. This is... just life on the web, I'm afraid 😔.
+
+As with the `.pathComponents` view, returned keys and values are automatically decoded, and any inserted keys or values will be encoded so their contents are preserved exactly. The `.formParams` view assumes form-encoding, and writes values using form-encoding. In the future, we'll likely add a variant which uses percent-encoding (probably named `.queryParams` or something like that). The current implementation matches the `.searchParams` object from JavaScript's URL class.
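+
+The difference between form-encoding and plain percent-encoding can be sketched with Foundation alone. The `formEncode` and `formDecode` functions below are hypothetical helpers written for this illustration, not part of WebURL's API, and the allowed-character set is an assumption based on the form-encoding rules (ASCII alphanumerics plus `*`, `-`, `.` and `_`).
+
```swift
import Foundation

// Characters that form-encoding leaves untouched: ASCII alphanumerics and "*-._".
let formAllowed = CharacterSet(charactersIn:
  "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789*-._")

func formEncode(_ value: String) -> String {
  // Percent-encode everything else (a literal "+" becomes "%2B"),
  // then substitute "+" for the encoded spaces.
  let percentEncoded = value.addingPercentEncoding(withAllowedCharacters: formAllowed) ?? value
  return percentEncoded.replacingOccurrences(of: "%20", with: "+")
}

func formDecode(_ value: String) -> String {
  // Reverse order: "+" means a space, then percent-decode the rest.
  let spaced = value.replacingOccurrences(of: "+", with: " ")
  return spaced.removingPercentEncoding ?? spaced
}

print(formEncode("Pound Sterling"))  // Pound+Sterling
print(formDecode("Pound+Sterling"))  // Pound Sterling
print(formEncode("1+1=2"))           // 1%2B1%3D2

// Plain percent-decoding leaves the "+" in place, which is the incompatibility described above.
print("Pound+Sterling".removingPercentEncoding ?? "")  // Pound+Sterling
```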
+
+To read the value for a key, use the `get` function or access the value as a property:
```swift
let url = WebURL("https://example.com/currency/convert?amount=20&from=EUR&to=USD")!
-url.formParams.amount // "20"
-url.formParams.from // "EUR"
+url.formParams.amount // "20"
+url.formParams.from // "EUR"
url.formParams.get("to") // "USD"
+```
+
+Additionally, you can iterate all key-value pairs using the `allKeyValuePairs` property, which conforms to Swift's `Sequence` protocol:
+```swift
for (key, value) in url.formParams.allKeyValuePairs {
... // ("amount", "20"), ("from", "EUR"), ("to", "USD").
}
```
-And again, as with `pathComponent`, you can modify a URL's query string using `formParams`.
-To set a parameter, assign a new value to its property or use the `set` method. Setting a key to `nil` will remove it from the query.
-
-Also, any modification will re-encode the entire query string so that it is consistently encoded as `application/x-www-form-urlencoded`,
-if it is not already.
+To modify a value, assign to its property or use the `set` method. Setting a key that does not already exist inserts a new key-value pair, and setting a key's value to `nil` will remove it from the query.
```swift
var url = WebURL("https://example.com/currency/convert?amount=20&from=EUR&to=USD")!
-url.formParams.amount // "20"
-url.formParams.to // "USD
+url.formParams.from = nil
+url.formParams.to = nil
+// url = "https://example.com/currency/convert?amount=20"
url.formParams.amount = "56"
-url.formParams.to = "Pound Sterling"
-// url = "https://example.com/currency/convert?amount=56&from=EUR&to=Pound+Sterling"
+url.formParams.from = "USD"
+url.formParams.to = "Pound Sterling"
+// url = "https://example.com/currency/convert?amount=56&from=USD&to=Pound+Sterling"
+
+url.formParams.set("format", to: "json")
+// url = "https://example.com/currency/convert?amount=56&from=USD&to=Pound+Sterling&format=json"
+```
+
+
+# File URLs
+
+
+Integration with the `swift-system` package (and Apple's `System.framework`) allows you to create file paths from URLs and URLs from file paths, and supports both POSIX and Windows-style paths. To enable this integration, add the `WebURLSystemExtras` product to your target's dependencies:
+
+```swift
+// swift-tools-version:5.3
+import PackageDescription
+
+let package = Package(
+ name: "MyPackage",
+ dependencies: [
+ .package(
+ url: "https://github.com/karwa/swift-url",
+ .upToNextMajor(from: "0.2.0") // or `.upToNextMinor`
+ )
+ ],
+ targets: [
+ .target(
+ name: "MyTarget",
+ dependencies: [
+ .product(name: "WebURL", package: "swift-url"),
+ .product(name: "WebURLSystemExtras", package: "swift-url"), // <---- This.
+ ]
+ )
+ ]
+)
+```
+
+Next, import the integration library. We'll be using Apple's built-in `System.framework` here, but the API is exactly the same if using `swift-system` as a package:
-url.formParams.format = "json"
-// url = "https://example.com/currency/convert?amount=56&from=EUR&to=Pound+Sterling&format=json"
+```swift
+import Foundation // Provides NSTemporaryDirectory().
+import System
+import WebURL
+import WebURLSystemExtras
+
+// - Create a WebURL from a file path:
+var fileURL = try WebURL(filePath: NSTemporaryDirectory())
+fileURL.pathComponents += ["My App", "cache.dat"]
+// fileURL = "file:///var/folders/ds/msp7q0jx0cj5mm9vjf766l080000gn/T/My%20App/cache.dat"
+
+// - Create a System.FilePath from a WebURL:
+let path = try FilePath(url: fileURL)
+// path = "/var/folders/ds/msp7q0jx0cj5mm9vjf766l080000gn/T/My App/cache.dat"
+
+let descriptor = try FileDescriptor.open(path, .readWrite)
+try descriptor.write(....)
```
-## Further Reading
+The `WebURL(filePath:)` and `FilePath(url:)` initializers throw detailed errors, which you can use to give users precise diagnostics when their file URLs are not appropriate for the platform.
+
+There are lots of misconceptions about file URLs. Using URLs instead of file paths does **not** make your application/library more portable; ultimately, the URL still needs to represent a valid path for the specific system you want to use it with. Windows paths require a drive letter and refer to remote files using servers and shares, and there are special rules for how to manipulate and normalize them (for example, ".." components should not escape the drive root or share). POSIX paths work entirely differently; they don't have drive letters and tend to mount remote filesystems as though they were local folders.
+
+URLs work using URL semantics, which has to be very generic because URLs are used for so many things on all kinds of platforms. A URL's path has some passing resemblance to a POSIX-style filesystem path, and the parser includes some (platform-independent) compatibility behavior, but it isn't a replacement for a true file path type.
+
+File URLs are still useful as a currency format when you support both local and remote resources (e.g. opening a local file in a web browser), but within your application, if you know you're dealing with a file path, you should prefer to store and manipulate it using a domain-expert such as swift-system's `FilePath`. This is different to the advice given by Foundation, but it is what the editors of the URL Standard recommend, and it makes sense. I've seen a fair few bug reports in the URL standard from people disappointed that URLs don't always normalize like file paths do (e.g. [this one, from the Node.js team](https://github.com/whatwg/url/issues/552)).
+
+So WebURL's APIs for file URLs are intentionally limited to converting to/from the `FilePath` type. No [`.resolveSymlinksInPath()`](https://developer.apple.com/documentation/foundation/url/1780208-resolvesymlinksinpath) over here. Of course, you can still use the full set of APIs with file URLs (including `.pathComponents` and even `.formParams`) but for the best, most accurate file path handling, we recommend `FilePath` (or similar type from the library of your choice).
+
+
+# Wrapping up
+
+
+And that's it for this guide! We've covered a lot here:
+
+- Creating a WebURL, by parsing a String or resolving a relative reference,
+- Reading and modifying the WebURL's top-level components,
+- How to correctly handle percent-encoding,
+- Working with path components and form parameters, and
+- File URLs and file paths
+
+Hopefully having read this, you feel confident that you understand how to use `WebURL`, and the benefits it can bring to your application:
-And that's your overview! We've covered creating, reading, and manipulating URLs using `WebURL`. Hopefully you agree that it makes
-great use of the expressivity of Swift, and are excited to `WebURL` for:
+- URLs based on the latest industry standard, aligning with modern web browsers,
+- An API that is more intuitive and helps avoid subtle mistakes,
+- An API which more closely matches libraries in other languages whilst leveraging the full expressive power of Swift,
+- A blazing fast implementation with better memory efficiency and simpler object lifetimes.
-- URLs based on the latest industry standard.
-- Better-defined behaviour, and better alignment with how modern web browsers behave.
-- Speed and memory efficiency, as well as
-- APIs designed for Swift
+And there are even more advanced topics that we didn't cover, like the `Host` enum, _lazy_ percent-encoding and decoding, `Origin`s, and more. If you'd like to continue reading about the APIs available in the `WebURL` package, see the [official documentation](https://karwa.github.io/swift-url/), or just go try it out for yourself!
-There's even more that we didn't cover, like `Host` objects, IP Addresses, _lazy_ percent encoding/decoding, `Origin`s, the `JSModel`,
-or our super-powered `UTF8View`. If you'd like to continue reading about the APIs available in the `WebURL` package,
-see the [official documentation](https://karwa.github.io/swift-url/), or just go try it out for yourself!
+If you have any questions or comments, please [file an issue on GitHub](https://github.com/karwa/swift-url/issues), or [post a thread on the Swift forums](https://forums.swift.org/c/related-projects/weburl/73), or send the author a private message (`@Karl` on the Swift forums). We're constantly looking to improve the API as we push towards a 1.0 release, so your feedback is very much appreciated!
diff --git a/README.md b/README.md
index 4069b5ca4..47d5cdd86 100644
--- a/README.md
+++ b/README.md
@@ -1,271 +1,200 @@
# WebURL
-This package contains a new URL type for Swift, written in Swift.
+A new URL type for Swift.
-- The [Getting Started](GettingStarted.md) guide contains an overview of how to use the `WebURL` type.
-- The [Full documentation](https://karwa.github.io/swift-url/) contains a detailed information about at the API.
+- **Compliant** with the [URL Living Standard](https://url.spec.whatwg.org/) for web compatibility. WebURL matches modern browsers and popular libraries in other languages.
+
+- **Fast**. Tuned for high performance and low memory use.
+
+- **Swifty**. The API makes liberal use of generics, in-place mutation, zero-cost abstractions, and other Swift features. It's a big step up from Foundation's `URL`.
-You may be interested in:
+- **Portable**. The core WebURL library has no dependencies other than the Swift standard library.
-- This prototype port of [async-http-client](https://github.com/karwa/async-http-client), which allows you to perform http(s)
- requests using `WebURL`, and shows how easy it can be to adopt in your library.
+- **Memory-safe**. WebURL uses carefully tuned bounds-checking techniques which the compiler is better able to reason about.
-This URL type is compatible with the [URL Living Standard](https://url.spec.whatwg.org/) developed by the WHATWG, meaning it more closely
-matches how modern web browsers behave. It has a fast and efficient implementation, and an API designed to meet the needs of Swift developers.
+And of course, it's written in **100% Swift**.
+
+- The [Getting Started](GettingStarted.md) guide contains an overview of how to use the `WebURL` type.
+- The [API Reference](https://karwa.github.io/swift-url/) contains more detail about specific functionality.
-## Using WebURL in your project
+# Using WebURL in your project
-To use `WebURL` in a SwiftPM project, add the following line to the dependencies in your Package.swift file:
+To use this package in a SwiftPM project, you need to set it up as a package dependency:
```swift
-.package(url: "https://github.com/karwa/swift-url", from: "0.1.0"),
+// swift-tools-version:5.3
+import PackageDescription
+
+let package = Package(
+ name: "MyPackage",
+ dependencies: [
+ .package(
+ url: "https://github.com/karwa/swift-url",
+ .upToNextMajor(from: "0.2.0") // or `.upToNextMinor`
+ )
+ ],
+ targets: [
+ .target(
+ name: "MyTarget",
+ dependencies: [
+ .product(name: "WebURL", package: "swift-url")
+ ]
+ )
+ ]
+)
```
-## Project Goals
+And with that, you're ready to start using `WebURL`:
-1. For parsing to match the URL Living Standard.
-
+```swift
+import WebURL
- The URL parser included in this project is derived from the reference parser implementation described by the standard, and should
- be fully compatible with it. The programmatic API for reading and manipulating URL components (via the `WebURL` type) may contain
- minor deviations from the JavaScript API defined in the standard in order to suit the expectations of Swift developers,
- (e.g. the `query` property does not contain the leading "?" as it does in JavaScript, setters are stricter about invalid inputs, etc).
- The full JavaScript API is available via the `JSModel` type, which is implemented entirely in terms of the Swift API.
+var url = WebURL("https://github.com/karwa/swift-url")!
+url.scheme // "https"
+url.hostname // "github.com"
+url.path // "/karwa/swift-url"
- The list of differences between the `WebURL` API and the JavaScript `URL` class are documented [here](https://karwa.github.io/swift-url/WebURL_JSModel/).
-
- Conformance to the standard is tested via the common [Web Platform Tests](https://github.com/web-platform-tests/wpt/tree/master/url)
- used by browser developers to validate their implementations. Currently this consists of close to 600 parser tests, and about 200 tests
- for setting individual properties. The project also contains additional test databases which are validated against the JSDOM reference
- implementation, with the intention to upstream them in the future.
-
- Conformance to a modern URL standard is the "killer feature" of this project, and other than the documented differences in APIs,
- any mismatch between this parser and the standard is, categorically, a bug (please report them if you see them!). Foundation's `URL` type
- conforms to RFC-1738, from 1994, and `URLComponents` conforms to a different standard, RFC-3986 from 2005. The 1994 standard contains many issues
- which subsequent standards have defined or fixed; this project allows Swift to match the behaviour of modern web browsers.
-
-
-2. To be safe, fast, and memory-efficient.
-
-
- Swift is designed to be a safe language, free of undefined behaviour and memory-safety issues. The APIs exposed by this library use
- a combination of static and runtime checks to ensure memory-safety, and use of unsafe pointers internally is kept to a minimum.
-
- Performance is also very important to this project, but communicating comparisons is tricky. The obvious comparison would be against our existing
- URL type, `Foundation.URL`; but as mentioned above, it conforms to an entirely different standard. The new standard's parser is very permissive
- and components are normalized and percent-encoded during parsing in order to cast as wide a compatibility net as possible and harmonize their representation.
- So in some sense comparing `WebURL` and `Foundation.URL` is apples-to-oranges, but it can be argued that parsing time is an important metric for developers,
- regardless.
-
- Despite the extra processing, the "AverageURLs" benchmark in this repository demonstrates performance that is slightly faster than Foundation,
- on an Intel Mac (Ivy Bridge). Depending on their size and structure, improvements for other URLs can range from 15% (IPv6 addresses)
- to 66% (very long query strings), while using less memory. Additionally, common operations such as hashing and testing for equality can be more
- than twice as fast as `Foundation.URL`. We'll also be exploring some ideas which could further increase parsing performance.
-
- On lower-end systems, such as a Raspberry Pi 4 8GB running 64-bit Ubuntu and the [swift-arm64 community toolchain (5.4)](https://github.com/futurejones/swift-arm64),
- the same benchmarks can demonstrate even greater improvements; "AverageURLs" going from about 1.85s using Foundation, to only 62.67ms with `WebURL`.
-
- As with all benchmark numbers, YMMV.
-
- Additionally:
-
- - The API supports efficient in-place mutation, so `URLComponents` is no longer needed in order to modify a component's value.
- - The API offers views of the URL's path components and query parameters which share the URL's storage, allowing fast and efficient iteration
- and inspection.
- - These views _also_ support in-place mutation, so when appending a path-component or setting a query parameter,
- the operation should be as fast, if not faster, than the equivalent string manipulation.
-
- _(Note that benchmarking and optimizing the setters is still a work-in-progress.)_
-
+url.pathComponents.removeLast(2)
+url.pathComponents += ["apple", "swift"]
+url // "https://github.com/apple/swift"
+```
-3. To leverage Swift's language features in order to provide a clean, convenient, and powerful API.
-
+Make sure to read the [Getting Started](GettingStarted.md) guide for an overview of what you can do with `WebURL`.
- This library makes extensive use of generics; almost every API which accepts a `String`, from the parser to the component setters,
- also have variants which accept user-defined `Collection`s of UTF-8 code-units. This can be valuable in performance-sensitive scenarios,
- such as when parsing large numbers of URLs from data files or network packets.
-
- It also makes extensive use of wrappers which share a URL's storage, for example to provide a `Collection` interface to a URL's path components.
- These wrappers also showcase the power of `_modify` accessors, allowing for a clean API with namespaced operations, which retain the ability to modify
- a URL in-place:
-
- ```swift
- var url = WebURL("file:///usr/foo")!
- url.pathComponents.removeLast()
- url.pathComponents += ["lib", "swift"]
- print(url) // file:///usr/lib/swift
- ```
-
- The view of a URL's form-encoded query parameters also supports `@dynamicMemberLookup` for concise get- and set- operations:
-
- ```swift
- var url = WebURL("http://example.com/currency/convert?amount=20&from=EUR&to=USD")!
- print(url.formParams.amount) // "20"
- url.formParams.to = "GBP"
- print(url) // http://example.com/currency/convert?amount=20&from=EUR&to=GBP
- ```
-
- Setters that can fail also have throwing sister methods, which provide rich error information about why a particular operation did not succeed.
- These error descriptions do not capture any part of the URL, so they do not contain any privacy-sensitive data.
-
- Take a look at the [Getting Started](GettingStarted.md) guide for a tour of this package's core API.
-
-
+## Integration with swift-system
-## Roadmap
+WebURL 0.2.0 includes a library called `WebURLSystemExtras`, which integrates with `swift-system` and Apple's `System.framework`. This allows you to create `file:` URLs from `FilePath`s, and to create `FilePath`s from `file:` URLs. It supports both POSIX and Windows paths.
-The implementation is extensively tested, but the interfaces have not had time to stabilise.
+```swift
+.target(
+ name: "MyTarget",
+ dependencies: [
+ .product(name: "WebURL", package: "swift-url"),
+ .product(name: "WebURLSystemExtras", package: "swift-url") // <--- Add this.
+ ]
+)
+```
-While the package is in its pre-1.0 state, it may be necessary to make source-breaking changes.
-I'll do my best to keep these to a minimum, and any such changes will be accompanied by clear documentation explaining how to update your code.
+```swift
+import WebURL
+import System
+import WebURLSystemExtras
+
+func openFile(at url: WebURL) throws -> FileDescriptor {
+ let path = try FilePath(url: url)
+ return try FileDescriptor.open(path, .readOnly)
+}
+```
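+
+Going the other way, here is a minimal sketch of creating a `file:` URL from a `FilePath`. It assumes the `WebURL(filePath:)` initializer from `WebURLSystemExtras`, a POSIX-style absolute path, and that the initializer throws for paths which cannot be represented (such as relative paths). The `fileURL(for:)` helper name is only illustrative:
+
+```swift
+import WebURL
+import System
+import WebURLSystemExtras
+
+// Hypothetical helper: converts an absolute file path to a file: URL.
+func fileURL(for path: FilePath) throws -> WebURL {
+  return try WebURL(filePath: path)
+}
+
+let url = try fileURL(for: "/usr/bin/swift")
+print(url.scheme) // "file"
+```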
-I'd love to see this library adopted by as many libraries and applications as possible, so if there's anything I can add to make that easier,
-please file a GitHub issue or write a post on the Swift forums.
+## Prototype port of async-http-client
-Aside from stabilising the API, the other priorities for v1.0 are:
+We have a prototype port of [async-http-client](https://github.com/karwa/async-http-client), based on version 1.7.0 (the latest release as of writing), which uses WebURL for _all_ of its URL handling. It allows you to perform http(s) requests with WebURL, including support for HTTP/2, and is a useful demonstration of how to adopt WebURL in your library.
-1. file URL <-> file path conversion
+We'll be updating the port periodically, so if you wish to use it in an application we recommend making a fork and pulling in changes as you need.
- Having a port of `async-http-client` is a good start for handling http(s) requests, but file URLs also require attention.
+```swift
+import AsyncHTTPClient
+import WebURL
- It would be great to add a way to create a file path from a file URL and vice-versa. This should be relatively straightforward;
- we can look to cross-platform browsers for a good idea of how to handle this. Windows is the trickiest case (UNC paths, etc),
- but since Microsoft Edge is now using Chromium, we can look to [their implementation](https://chromium.googlesource.com/chromium/src/net/+/master/base/filename_util.cc)
- for guidance. It's also worth checking to see if WebKit or Firefox do anything different.
+let client = HTTPClient(eventLoopGroupProvider: .createNew)
-2. Converting to/from `Foundation.URL`.
+func getTextFile(url: WebURL) throws -> EventLoopFuture<String?> {
+ let request = try HTTPClient.Request(url: url, method: .GET, headers: [:])
+ return client.execute(request: request, deadline: .none).map { response in
+ response.body.map { String(decoding: $0.readableBytesView, as: UTF8.self) }
+ }
+}
- This is a delicate area and needs careful consideration of the use-cases we need to support. Broadly speaking, there are 2 ways to approach it:
+let url = WebURL("https://github.com/karwa/swift-url/raw/main/README.md")!
+try getTextFile(url: url).wait() // "# WebURL A new URL type for Swift..."
+```
- - Re-parsing the URL string.
+# Project Status
- This is what [WebKit does](https://github.com/WebKit/WebKit/blob/99f5741f2fe785981f20fb1fee5869a2863d16d6/Source/WTF/wtf/cocoa/URLCocoa.mm#L79).
- The benefit is that it is straightforward to implement. The drawbacks are that Foundation refuses to accept a lot of URLs which the modern standards consider valid,
- so support could be limited. In at least one case that I know of, differences between the parsers have lead to exploitable security vulnerabilities
- (when conversion changes the URL's origin, which is why WebKit's conversion routine now includes a specific same-origin check).
+WebURL is a complete URL library, implementing the latest version of the URL Standard (as of writing, that is the August 2021 review draft). It is tested against the [shared `web-platform-tests`](https://github.com/web-platform-tests/wpt/) used by major browsers, and passes all constructor and setter tests other than those which rely on IDNA. The library includes a comprehensive set of APIs for working with URLs: getting/setting basic components, percent-encoding/decoding, reading and writing path components, form parameters, file paths, etc. Each has its own extensive set of tests in addition to the shared web-platform-tests.
- Something like this, with appropriate checks on the re-parsed result, may be acceptable as an MVP, but ideally we'd want something more robust with better support
- for non-http(s) URLs in non-browser contexts.
+The project is regularly benchmarked using the suite available in the `Benchmarks` directory and fuzz-tested using the fuzzers available in the `Fuzzers` directory.
- - Re-writing the URL string based on `Foundation.URL`'s _components_.
+Being a pre-1.0 package, the interfaces have not had time to stabilize. If there's anything you think could be improved, your feedback is welcome - either open a GitHub issue or post to the [Swift forums](https://forums.swift.org/c/related-projects/weburl/73).
- This should ensure that the resulting URL contains semantically equivalent values for its username, password, hostname, path, query, etc., with the conversion
- procedure adding percent-encoding as necessary to smooth over differences in allowed characters (e.g. Foundation appears to refuse "{" or "}" in hostnames or
- query strings, while newer standards allow them, so we'd need to percent-encode those).
-
- The `WebURL` parser has been designed with half an eye on this; in theory we should be able to construct a `ScannedRangesAndFlags` over Foundation's URL string,
- using the range information from Foundation's parser, and `URLWriter` will take care of percent-encoding the components, simplifying the path, and assembling the
- components in to a URL string. That said, URLs are rarely so simple, and this process will need a _very thorough_ examination and database of tests.
+Prior to 1.0, it may be necessary to make source-breaking changes.
+I'll do my best to keep these to a minimum, and any such changes will be accompanied by clear documentation explaining how to update your code.
- Even after this is done, my intuition is that it would be unwise for developers to assume seamless conversions between `Foundation.URL` and `WebURL`.
- It should be okay to do it once at an API boundary - e.g. for an HTTP library built using `WebURL` to accept requests using `Foundation.URL` -
- but such libraries should convert to one URL type as soon as possible, and use that single type to provide all information used to make the request.
-
- As an example of the issues that may arise: if the conversion process adds percent-encoding, performing multiple conversions such as `WebURL -> URL -> WebURL`,
- or `URL -> WebURL -> URL`, will result in an object with the same type, but a different URL string (including a different hash value, and comparing as `!=` to the starting URL).
- That would be a problem for developers who expect a response's `.url` property to be the same as the URL they made the request with. That's why it's better to stick to
- a single type conversion; when a developer sees that the response's `.url` property has a different type, there is more of a signal that the content may have changed slightly.
+## Roadmap
+
+Aside from stabilizing the API, the other priorities for v1.0 are:
+
+1. `Foundation` interoperability.
+
+ Foundation's `URL` type is the primary type used for URLs on Swift today, and Foundation APIs such as `URLSession` are critical for many applications, in particular because of their system integration on Apple platforms.
-3. Benchmarking and optimizing setters, including modifications via `pathComponents` and `formParams` views.
+ We will provide a compatibility library which allows these APIs to be used together with `WebURL`.
-Post-1.0:
+Looking beyond v1.0, the other features I'd like to add are:
+
+2. Better APIs for `data:` URLs.
+
+ WebURL already supports them as generic URLs, but it would be nice to add APIs for extracting the MIME type and decoding base64-encoded data.
-4. Non-form-encoded query parameters.
+3. Non-form-encoded query parameters.
- Like the `formParams` view, this would interpret the `query` string as a string of key-value pairs, but _without_ assuming that the query should be form-encoded.
- Such an API [was pitched](https://github.com/whatwg/url/issues/491) for inclusion in the URL standard, but is not included since the key-value pair format was
- only ever codified by the form-encoding standard; its use for non-form-encoded content is just a popular convention.
+ A URL's `query` component is often used as a string of key-value pairs. This usage appears to have originated with HTML forms, which WebURL supports via its `formParams` view, but popular convention these days is also to use keys and values that are not _strictly_ form-encoded. This can lead to decoding issues.
- That said, it would likely be valuable to add library-level support to make this convention easier to work with.
+ Additionally, we may want to consider making key lookup Unicode-aware. It makes a lot of sense, but as far as I know it is unprecedented in other libraries, so it may be surprising.
-5. Relative URLs.
+4. APIs for relative references.
- Have repeatedly [been pitched](https://github.com/whatwg/url/issues/531) for inclusion in the standard. Support can be emulated to some extent by
- using the `thismessage:` scheme reserved by IANA for this purpose, but it is still a little cumbersome, and is common enough outside of browser contexts to
- warrant its own type and independent test-suite. Implementation may be as simple as wrapping a `WebURL` with the `thismessage:` scheme, or as complex as the
- Saturn V rocket; it is really quite hard to tell, because URLs.
+ All `WebURL`s are absolute URLs (following the standard), and relative references are currently only supported as strings via the [`WebURL.resolve(_:)` method](https://karwa.github.io/swift-url/WebURL/#weburl.resolve(_:)).
+
+ It would be valuable to a lot of applications (e.g. server frameworks) to add a richer API for reading and manipulating relative references, instead of using only strings. We may also want to calculate the difference between 2 URLs and return the result as a relative reference.
-6. IDNA
+5. IDNA
- By far the biggest thing. See the FAQ for details.
+ This is part of the URL Standard, and its position on this list shouldn't be read as downplaying its importance. It is a high-priority item, but is currently blocked by other things.
-## Sponsorship
+ There is reason to hope this may be implementable soon. Native Unicode normalization was [recently](https://github.com/apple/swift/pull/38922) implemented in the Swift standard library for String, and there is a desire to expose this functionality to libraries such as this one. Once those APIs are available, we'll be able to use them to implement IDNA.
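+
+For context on item 4, here is a rough sketch of how relative references are handled today, as strings resolved against an absolute base URL. It assumes `WebURL.resolve(_:)` returns an optional `WebURL`; the example URLs are made up for illustration:
+
+```swift
+import WebURL
+
+// The base must be an absolute URL; the relative reference is just a string.
+let base = WebURL("https://example.com/a/b/c")!
+let resolved = base.resolve("../d?x=1")
+
+if let resolved = resolved {
+  print(resolved) // "https://example.com/a/d?x=1"
+}
+```
+
+A richer API would let the `"../d?x=1"` part be inspected and modified as a value in its own right, rather than only as a string.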
-I'm creating this library because I think that Swift is a great language, and it deserves a high-quality, modern library for handling URLs.
-It has taken a lot of time to get things to this stage, and there is an exciting roadmap ahead.
+# Sponsorship
-It demands a lot of careful study, a lot of patience, and a lot of motivation to bring something like this together. So if you
-(or the company you work for) benefit from this project, do consider donating to show your support and encourage future development.
-Maybe it saves you some time on your server instances, or saves you time chasing down weird bugs in your URL code.
+I'm creating this library because I think that Swift is a great language, and it deserves a high-quality, modern library for handling URLs. It has taken a lot of time to get things to this stage, and there is an exciting roadmap ahead. So if you
+(or the company you work for) benefit from this project, do consider donating to show your support and encourage future development. Maybe it saves you some time on your server instances, or saves you time chasing down weird bugs in your URL code.
-In any case, thank you for stopping by and checking it out.
+# FAQ
-## FAQ
+## How do I leave feedback?
-### What is the WHATWG URL Living Standard?
+Either open a GitHub issue or post to the [Swift forums](https://forums.swift.org/c/related-projects/weburl/73).
-It may be surprising to learn that there isn't a single way to interpret URLs. There have been several attempts to create such a thing,
-beginning with the IETF documents [RFC-1738](https://www.ietf.org/rfc/rfc1738.txt) in 1994, and the revised version
-[RFC-3986](https://www.ietf.org/rfc/rfc3986.txt) in 2005.
+## Are pull requests/code reviews/comments/questions welcome?
-Unfortunately, it's rare to find an application or URL library which completely abides by those specifications, and the specifications
-themselves contain ambiguitites which lead to divergent behaviour across implementations. Some of these issues were summarised
-in a draft [working document](https://tools.ietf.org/html/draft-ruby-url-problem-01) by Sam Ruby and Larry Masinter. As the web
-continued to develop, the WHATWG and W3C required a new definition of "URL" which matched how browsers _actually_ behaved.
-That effort eventually became the WHATWG's URL Living Standard.
+Most definitely!
-The WHATWG is an industry association led by the major browser developers (currently, the steering committee consists of
-representatives from Apple, Google, Mozilla, and Microsoft), and there is high-level approval for their browsers to align with the
-standards developed by that group. The standards developed by the WHATWG are "living standards":
+## Is this production-ready?
-> Despite the continuous maintenance, or maybe we should say as part of the continuing maintenance, a significant effort is placed on
-getting the standard and the implementations to converge — the parts of the standard that are mature and stable are not changed
-willy nilly. Maintenance means that the days where the standard are brought down from the mountain and remain forever locked,
-even if it turns out that all the browsers do something else, or even if it turns out that the standard left some detail out and the browsers
-all disagree on how to implement it, are gone. Instead, we now make sure to update the standard to be detailed enough that all the
-implementations (not just browsers, of course) can do the same thing. Instead of ignoring what the browsers do, we fix the standard
-to match what the browsers do. Instead of leaving the standard ambiguous, we fix the the standard to define how things work.
+Yes, it is being used in production.
-From [WHATWG.org FAQ: What does "Living Standard" mean?](https://whatwg.org/faq#living-standard)
+The implementation is extensively tested (including against the shared `web-platform-tests` used by the major browsers, which we have also made contributions to, and by fuzz-testing), so we have confidence that the behavior is reliable.
-While the WHATWG has [encountered criticism](https://daniel.haxx.se/blog/2016/05/11/my-url-isnt-your-url/) for being overly concerned with
-browsers over all other users of URLs (a criticism which, to be fair, is not _entirely_ without merit), I've personally found the process to
-be remarkably open, with development occurring in the open on GitHub, and the opportunity for anybody to file issues or submit improvements via pull-requests.
-While their immediate priority is, of course, to unify browser behaviour, it's still the industry's best starting point to fix the issues
-previous standards have faced and develop a modern interpretation of URLs. Not only that, but it seems to me that any future URL standards will have to
-consider consistency with web browsers to have any realistic chance of describing how other applications should interpret them.
+Additionally, the benchmarks package available in this repository helps ensure the performance is well-understood, and that operations maintain a consistent performance profile. Benchmarks are run on a variety of devices, from high-end modern x64 computers to the Raspberry Pi.
-### Does this library support IDNA?
+## Why the name `WebURL`?
-Not yet.
-It is important to note that IDNA is _not (just) Punycode_ (a lot of people seem to mistake the two).
+1. `WebURL` is short but still distinct enough from Foundation's `URL`.
-Actually supporting IDNA involves 2 main steps (well, actually more, but for this discussion we can pretend it's only 2):
+2. The WHATWG works on technologies for the web platform. By following the WHATWG URL Standard, `WebURL` could be considered a kind of "Web-platform URL".
-1. Unicode normalization
+## What is the WHATWG URL Living Standard?
- IDNA also requires a unique flavour of unicode normalization and case-folding, [defined by the Unicode Consortium](https://unicode.org/reports/tr46/).
- Part of this is just NFC normalization, but there are additional, domain-specific mapping tables as well (literally, specific to networking domains).
- The latest version of that mapping table can be found [here](https://www.unicode.org/Public/idna/latest/IdnaMappingTable.txt).
-
-2. Punycode
+It may be surprising to learn that there are many interpretations of URLs floating about - after all, you type a URL in to your browser, and it just works! Right? Well, sometimes...
- It is the result of this normalization procedure which is encoded to ASCII via Punycode. In this respect, Punycode acts just like percent-encoding:
- it takes a Unicode string in, and outputs an ASCII string which can be used to recover the original content.
+This [memo](https://tools.ietf.org/html/draft-ruby-url-problem-01) from the IETF network working group has a good overview of the history. In summary, URLs were first specified in 1994, and there were a lot of hopeful concepts like URIs, URNs, and scheme-specific syntax definitions. Most of those efforts didn't get the attention they would have needed and were revised by later standards such as [RFC-2396](https://datatracker.ietf.org/doc/html/rfc2396) in 1998, and [RFC-3986](https://www.ietf.org/rfc/rfc3986.txt) in 2005. Also, URLs were originally defined as ASCII, and there were fears that Unicode would break legacy systems, hence yet more standards and concepts such as IRIs, which also ended up not getting the attention they would have needed. So there are all these different standards floating around.
- Why not just use percent-encoding? Because percent-encoding is seriously inefficient; it turns every non-ASCII byte from the input in to 3 bytes from the output.
- DNS imposes limits on the maximum length of domain names (253 bytes total, 63 bytes per label), so a more space-efficient encoding was needed.
- Only the hostname uses IDNA, because it is the only part affected by this DNS restriction.
+In the meantime, browsers had been doing their own thing. The RFCs are not only ambiguous in places, but would _break the web_ if browsers adopted them. For URL libraries (e.g. cURL) and their users, web compatibility is really important, so over time they also began to diverge from the standards. These days it's rare to find any application/library which strictly follows any published standard -- and that's pretty bad! When you type your URL in to a browser or use one in your application, you expect that everybody involved understands it the same way. Because when they don't, stuff doesn't work and it may even open up [exploitable bugs](https://www.youtube.com/watch?v=voTHFdL9S2k).
-That Unicode normalization step is really crucial - unforuntately, Swift's standard library doesn't expose its Unicode algorithms or data tables at the moment,
-meaning the only viable way to implement this would be to ship our own copy of ICU or introduce a platform dependency on the system's version of ICU.
+So we're at a state where there are multiple, incompatible standards. Clearly, there was only one answer: another standard! 😅 But seriously, this time, it had to be web-compatible, so browsers could adopt it. For a URL standard, matching how browsers behave is kinda a big deal, you know?
-As it stands, this URL library doesn't contain _any_ direct dependencies on system libraries, and I would dearly like to keep it that way. At the same time, it has
-long been recognised that the Swift standard library needs to provide access to these algorithms. So once this project has settled down a bit, my plan is to turn my attention
-towards the Swift standard library - at least to implement NFC and case-folding, and possibly even to expose IDNA as a standard-library feature, if the core team are amenable to that.
-I suspect that will largely be influenced by how well we can ensure that code which doesn't use IDNA doesn't pay the cost of loading those data tables. We'll see how it goes.
+This is where the WHATWG comes in to it. The WHATWG is an industry association led by the major browser developers (currently, the steering committee consists of representatives from Apple, Google, Mozilla, and Microsoft), and there is high-level approval for their browsers to align with the standards developed by the group.
-For the time being, we detect non-ASCII and Punycode domains, and the parser fails if it encounters them.
-Of those ~600 URL constructor tests in the WPT repository, we only fail 10, and all of them are because we refused to parse an address that would have required IDNA.
+The WHATWG URL Living Standard defines how actors on the web platform should understand and manipulate URLs - how browsers process them, how code such as JavaScript processes them, etc.
-We will support it eventually, but it's just not practical at this very moment.
+By aligning to the URL Living Standard, this project aims to provide the behavior you expect, with better reliability and interoperability, sharing a standard and test-suite with your browser, and engaging with the web standards process. And by doing so, we hope to make Swift an even more attractive language for both servers and client applications.
\ No newline at end of file