Set is a data structure that allows you to store unique values. If you try to add the same value multiple times, the Set will only add it once and ignore all other requests. Also, you can check very quickly if a value exists or not. Searching by value on arrays takes O(n)
. However, searching by value on a Set takes O(1)
on average.
A Set can be implemented in different ways. One way it’s using a Hash Map, and other is using a Tree Map. JavaScript has a built-in Hash Set, so that' the one we are going to focus on.
Tip
|
We will go more in details with Tree Map after we cover the [binary-search-tree-chap]. |
An array allows you to search a value by index in constant time O(1)
; however, if you don’t know the index, searching a value would take you linear time O(n)
. A Set has doesn’t allow you to search value by index, but you can search by value in constant time. The Set.add
and Set.has
method both are O(1)
in average.
Take a look at the following examples:
const set = new Set();
set.add(1); //↪️ Set [ 1 ]
set.add(1); //↪️ Set [ 1 ]
set.add(2); //↪️ Set [ 1, 2 ]
set.add(3); //↪️ Set [ 1, 2, 3 ]
set.has(1); //↪️ true
set.delete(1); //↪️ removes 1 from the set
set.has(1); //↪️ false, 1 has been removed
set.size; //↪️ 2, we just removed one value
console.log(set); //↪️ Set(2) {2, 3}
As you can see, even if we insert the same value multiple times, it only gets added once.
Like a map, you can also insert objects, arrays, maps, and even other sets. However, be careful because anything that is not a number, string, or symbol would be matched by reference. Let’s do some examples.
const set = new Set();
// matching by value
set.add({a: 1, b: 2});
set.has({a: 1, b: 2}); // ↪️ false
set.add({a: 1, b: 2}); // not ignored
// matching by reference
const a = {a: 1, b: 2};
set.add(a);
set.has(a); // ↪️ true
set.add(a); // this requests will be ignore.
// Set has 3 arrays with the same value, but since they all have different memory address it's allowed.
console.log(set); // Set { {a: 1, b: 2}, {a: 1, b: 2}, {a: 1, b: 2} }
As you can see, you can’t find an object using a new object (e.g. {a: 1, b: 2}
); you need the reference to find it.
If you need to match by value, you would need to convert it to a string using JSON.stringify
.
const set = new Set();
set.add(JSON.stringify({a: 1, b: 2}));
set.add(JSON.stringify({a: 1, b: 2})); // ignored
set.has(JSON.stringify({a: 1, b: 2})); // ↪️ true
// Only one object, since strings are matched by value and not by reference.
console.log(set); // Set { '{"a":1,"b":2}' }
One typical case for a Set is to eliminate duplicates from an array.
const arr = [1, 2, 2, 1, 3, 2];
// convert array to set
const set = new Set(arr);
// convert set to array
const uniqueValues = Array.from(set);
// check array
console.log(uniqueValues); // [ 1, 2, 3 ]
You can also do it all in one line.
const arr = [1, 2, 2, 1, 3, 2];
console.log([...new Set(arr)]); // [ 1, 2, 3 ]
All operations on Hash Set are constant time on average: O(1)
. Like the Hash Map, there are cases when the the Set is getting full, and it would do a rehash taking O(n)
for that one insertion.
Data Structure |
Searching By |
Insert |
Delete |
Space Complexity |
|
Index/Key |
Value |
||||
Hash Set |
O(1) |
- |
O(1)* |
O(1) |
O(n) |
* = Amortized run time. E.g. rehashing might affect run time to O(n).
ST-1) Given a text and a list of banned words. Find the most common word that is not on the banned list. You might need to sanitize the text and strip out punctuation `?!,'.`
Common in interviews at: Amazon.
Examples:
mostCommonWord(
`How much wood, would a Woodchuck chuck,
if a woodchuck could chuck?`,
['a'],
); // woodchuck or chuck (both show up twice)
mostCommonWord(
`It's a blue ball and its shade... Very BLUE!`,
['and']); // blue (it show up twice, "it" and "its" once)
Starter code:
link:../../interview-questions/most-common-word.js[role=include]
Solution: [set-q-most-common-word]
ST-2) Find the length of the longest substring without repeating characters.
Common in interviews at: Amazon, Facebook, Bloomberg.
Examples:
lenLongestSubstring('aaaaa'); // 1 ('a')
lenLongestSubstring('abccdefg'); // 5 ('cdefg')
lenLongestSubstring('abc'); // 3 ('abc')
Starter code:
link:../../interview-questions/longest-substring-without-repeating-characters.js[role=include]