Story of Deprecation and Positive Thinking in URL Encoding

There is a saying, ‘If it works, don’t touch it!’ I like it, but sometimes changes are requested by someone from the outside, and if that ‘someone’ is as big as Apple, we have to listen.

Recently, we decided to update some code related to URL encoding (frankly speaking, we should have addressed it earlier, but you know how things go…). The Apple compiler asked us to update the code with the following warning:

‘CFURLCreateStringByAddingPercentEscapes’ was deprecated in iOS 9.0: Use [NSString stringByAddingPercentEncodingWithAllowedCharacters:] instead, which always uses the recommended UTF-8 encoding, and which encodes for a specific URL component or subcomponent (since each URL component or subcomponent has different rules for what characters are valid).

The piece of code that triggered this warning was the following:

(__bridge_transfer NSString *)CFURLCreateStringByAddingPercentEscapes(NULL,
    (CFStringRef)self, NULL, (CFStringRef)@"!*'\"();:@&=+$,/?%#[]% ",
    CFStringConvertNSStringEncodingToEncoding(encoding))

(By the way, this code came from one of Google’s libraries.)

The easiest ‘fix’, which silences the deprecation warning with almost no effort, would be this:

NSString *blackList = @"!*'\"();:@&=+$,/?%#[]% ";
NSCharacterSet *allowedCharacters = [NSCharacterSet characterSetWithCharactersInString:blackList].invertedSet;
NSString *encodedURL = [url stringByAddingPercentEncodingWithAllowedCharacters:allowedCharacters];

However, this wasn’t a perfect solution in our case, because our app had to handle any valid URL. The app doesn’t compose a URL from components; it receives the whole URL as input (by decoding QR codes). Furthermore, given that internationalised domain names exist, it’s better to work with a set of allowed characters than with a blacklist. Thus, in our case, we should call (NSString *)stringByAddingPercentEncodingWithAllowedCharacters:(NSCharacterSet *)allowedCharacters with a genuine set of allowed characters rather than an inverted blacklist.

What should allowedCharacters have been? There are several standard character sets prefixed with ‘URL’ that could be used for this purpose:

URLFragmentAllowedCharacterSet
URLHostAllowedCharacterSet
URLPasswordAllowedCharacterSet
URLPathAllowedCharacterSet
URLQueryAllowedCharacterSet
URLUserAllowedCharacterSet

Should we have used a union of all of them, or is one of them already a superset of the others? We like to have things clear, so we decided to check by comparing them in a simple playground.
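The original playground code is not shown, so the following is only a minimal sketch of how such a comparison could look in Swift. The helper name supersetReport and the urlSets array are assumptions made for illustration; isSuperset(of:) and the character sets themselves come from Foundation.

```swift
import Foundation

// The six standard URL character sets, paired with the labels used in the output.
let urlSets: [(name: String, set: CharacterSet)] = [
    (name: "urlUserAllowed", set: CharacterSet.urlUserAllowed),
    (name: "urlHostAllowed", set: CharacterSet.urlHostAllowed),
    (name: "urlPathAllowed", set: CharacterSet.urlPathAllowed),
    (name: "urlFragmentAllowed", set: CharacterSet.urlFragmentAllowed),
    (name: "urlPasswordAllowed", set: CharacterSet.urlPasswordAllowed),
    (name: "urlQueryAllowed", set: CharacterSet.urlQueryAllowed),
]

// Hypothetical helper: builds one report line per standard set,
// skipping the set that has the same label as the candidate.
func supersetReport(name: String, candidate: CharacterSet) -> [String] {
    var lines: [String] = []
    for other in urlSets where other.name != name {
        let verdict = candidate.isSuperset(of: other.set) ? "superset" : "NOT superset"
        lines.append("\(name) \(verdict) for \(other.name)")
    }
    return lines
}

// Compare urlQueryAllowed against the other five sets.
for line in supersetReport(name: "urlQueryAllowed", candidate: .urlQueryAllowed) {
    print(line)
}
```

The same helper can be called again with any other candidate set, for example a union of two sets, to repeat the check.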

The result:

urlQueryAllowed superset for urlUserAllowed
urlQueryAllowed NOT superset for urlHostAllowed
urlQueryAllowed superset for urlPathAllowed
urlQueryAllowed superset for urlFragmentAllowed
urlQueryAllowed superset for urlPasswordAllowed

It looked like the only missing part was URLHostAllowedCharacterSet. We added it and checked once again:

let allowedCharacters = NSMutableCharacterSet()
allowedCharacters.formUnion(with: .urlQueryAllowed)
allowedCharacters.formUnion(with: .urlHostAllowed)

And now it’s a ‘full house’:

allowedCharacters superset for urlUserAllowed
allowedCharacters superset for urlHostAllowed
allowedCharacters superset for urlPathAllowed
allowedCharacters superset for urlFragmentAllowed
allowedCharacters superset for urlPasswordAllowed
allowedCharacters superset for urlQueryAllowed

We got our answer: the union of urlQueryAllowed and urlHostAllowed contains every character allowed by any of the standard URL character sets.
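As a rough illustration of how that union might then be applied to a whole URL string, here is a short Swift sketch. The input string is a made-up example standing in for text decoded from a QR code, and the variable names are assumptions rather than the app’s actual code.

```swift
import Foundation

// Union of the two character sets named in the text.
let allowedCharacters = CharacterSet.urlQueryAllowed.union(.urlHostAllowed)

// Hypothetical input, e.g. a string decoded from a QR code.
let rawURL = "https://example.com/some path?q=two words"

// Characters outside the allowed set (here, the spaces) are percent-encoded as UTF-8.
let encodedURL = rawURL.addingPercentEncoding(withAllowedCharacters: allowedCharacters)

print(encodedURL ?? "encoding failed")
// prints "https://example.com/some%20path?q=two%20words"
```

Note that ‘%’ belongs to neither set, so an input that is already percent-encoded would be encoded a second time with this approach.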

In general, when you build a URL/URI from its parts, you should encode each part with the character set intended for it. For example, apply urlPasswordAllowed only to the password in user credentials (if you pass them) and urlQueryAllowed only to the query part.
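To see why the choice of set matters, consider this small Swift sketch, which encodes the same made-up string once as a path segment and once as a query. The sample value and variable names are assumptions for illustration only.

```swift
import Foundation

// Hypothetical sample value containing a question mark and spaces.
let sample = "what now? notes"

// In a path, "?" would start the query, so urlPathAllowed does not contain it.
let asPath = sample.addingPercentEncoding(withAllowedCharacters: .urlPathAllowed)

// Inside a query, "?" is a legal character, so urlQueryAllowed leaves it alone.
let asQuery = sample.addingPercentEncoding(withAllowedCharacters: .urlQueryAllowed)

print(asPath ?? "encoding failed")   // prints "what%20now%3F%20notes"
print(asQuery ?? "encoding failed")  // prints "what%20now?%20notes"
```

The space is encoded in both cases, while the question mark is escaped only where it would otherwise change the structure of the URL.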

Happy URL encoding!
