Beautiful 3 Set
I was on hackerrank the other day trying to solve ‘Beautiful 3 Set’. The problem goes like this:
You are given an integer n
. A set, S
, of triples (x_i, y_i, z_i)
is beautiful if and only if:
0 <= x_i, y_i, z_i
x_i + y_i + z_i = n
, For alli
such that1 <= i <= |S|
- Let X be the set of different
x_i
’s in S, Y be the set of differenty_i
’s in S, and Z be the set of differentz_i
in S. Then|X| = |Y| = |Z| = |S|
The third condition means that all x_i
’s are pairwise distinct. The same goes for y_i
and z_i
.
Problem: Given n
, find any beautiful set having maximum number of elements. Then print the cardinality of S
on a new line, followed by |S|
lines where each line contains 3 space-separated integers describing the respective values of x_i
, y_i
and z_i
.
Input Format:
A sample integer, n. (1 <= n <= 300)
Solution
The solution to this problem is not obvious. Writing out the testcases for the first few n
is difficult for verification and the questions of uniqueness for each n
comes up. Is there a quick and easy way of coming up with the solution without having to brute force? The answer to the question is yes, because otherwise hackerrank will reject the slow solution. There must be a quick solution and we shall find it (not always the case in general).
At this point I had no real intuition on the problem and decided to brute force the problem first. For the brute force approach, a basic outline of the strategy is to come up with all potential S
sets and store the largest ones of them for each n
. We don’t know if these S
sets are unique(excluding column -(x|y|z)
switching) for a given n
so we will have to store them all. Now let’s get into the specifics of our strategy:
For each n
we will store all S
that satisfy the first three conditions described in the problem description. To generate a single S
, we need some sort of global set to choose from. In other words, we need to generate all valid (x_i, y_i, z_i)
triples as our basis. Each (x|y|z)_i
is a number from 0
to n
(inclusive) and the triplets thus can be generated using a nested for loop with indexes i
and j
(once you know the first two numbers then you know the last z
number since x+y+z=n
). This gives an upper bound of O(n^2)
pairs.
Once you have generated all the possible triplets and have filtered all the invalid ones out by the second condition, then you can start generating potential S
candidates. For any given set S
, each existing triplet in the basis is either in S
or not ~ a binary choice. The binary choice is a good hint on some sort of recursive approach. By checking all potential S
our runtime for this will be O(2^{|triplet_basis_set_size})
which has an upper bound. Definitely won’t work for n=300
on hackerrank.
Thus the runtime of our brute force approach is O(2^{n^2})
(it is easy to optimize the brute force algorithm… but it would still be too slow for hackerrank). That is crazy slow! This is why we are doing this brute force approach on our machine and not hackerrank. Here is the brute force code:
I only had the patience to run up to n = 17
. I observed that there were many sets of max |S|
stored for each n
. Given the way the basis was derived and the recursive function ran, I decided to only filter for the first max |S|
printed for each n
. Here are the first for n=2
to n=17
:
This once again does not tell us much other than the fact that there exists a combination with max | S | such that the x ’s can be ordered from 0 to some sort of limit. Is it guaranteed that for all n the x ’s can be ordered? We will assume for now that the answer is yes. The pattern for the y and z are not clear either. I decided to treat y-z row values as coordinates on a y-z plane and plot them. Here is a plot for n=9 : |
- | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
---|---|---|---|---|---|---|---|---|---|---|
0 | 0 | 1 | 2 | 3 | 4 | 5 |
6 | 7 | 8 | 9 |
1 | 1 | 2 | 3 | 4 |
5 | 6 | 7 | 8 | 9 | 10 |
2 | 2 | 3 |
4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
3 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
10 | 11 | 12 |
4 | 4 | 5 | 6 | 7 | 8 |
9 | 10 | 11 | 12 | 13 |
5 | 5 | 6 | 7 |
8 | 9 | 10 | 11 | 12 | 13 | 14 |
6 | 6 |
7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
7 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 |
8 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 |
9 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 |
The rows and column labels represent different potential values for y
and z
(first col is y
values and first row is z
values), the entries on (y,z)
coordinates is the sum: y+z
. This is done because as the x
column values increase one by one then, by definition, the sum of the y
and z
entry must decrease by one. This can be observed visually on the graph as each highlighted number is unique. In addition, we know that all numbers in the x
column are unique and therefore the values in the y
and z
columns must also be unique. Graphically, this translates to no row or column having more than one circled number and the equal sum present in each (anti) diagonal corresponds to a specific x value since x+y+z=n
.
We thus have translated our problem to a different, more visual one. Our translation is somewhat similar to the Eight(N) queens puzzle: no highlighted-number/queen can appear in the same column/row/(antidiagonal) on the matrix/chessboard with no pieces below the main y+z=n
diagonal. This translation gives us more insight into the restrictions on the values in the triplets and really drives the point home that this problem is only in terms of y
and z
.
If we go back and plot from n=2
to n=18
then one notices there is no pattern amongst the different n
’s and no natural way to build up the solution. This is slightly disappointing because the printed n=
results were the first from max |S|
found in the brute force code. The hope is that by recording the first result, we would have had caught some sort of relation between the different n
.
Nonetheless, we know what the |S|
size is and have a very simple visual way of coming up with new potential solutions. The simple solution for all n
followed the n=9
pattern (after playing around a bit with the different ‘n-chessboard’ it becomes clear). There are two diagonal like lines. Bottom one starts (bottom-left) at some number (for n=9
it is (y,z)=(6,0)
) and the consecutive entries in diagonal like line are located at (y-1, z+2)
all the way up until y+z=n
. The top diagonal like line starts with y'=y-1
from previous y
when y+z=n
. The first(left-bottom most) z
for top diagonal is z=1
. Top diagonal increases in same fashion as bottom one until bounds of ‘chessboard’ are hit.
Our solution now does not depend on y
and z
but on the left most number of the bottom row. If you plot out the ‘chessboards’ for first 10 n
, then when will notice that the pattern is mod 3
and thus the O(n)
solution is formed:
Conclusion
The main idea I wanted to push in this article is that in order to solve complex problems such as the ones present in hackerrank, it is OK to take a two step approach and brute it on your machine first. It is OK to try a stupid solution first. It is all about discovery.
Sometimes it is sufficient to simply work through some examples by hand to notice a pattern when solving these problems. However, sometimes that can be difficult to do and verify correctness of and a two step solution can help crack the problem.
Unfortunately, this tip is not the silver/magic bullet to solving ‘Hard’ problems on hackerrank. Oftentimes, the input to the problem is hard to formulate to begin with (like some graph with complex properties). In our case, we were just given a number n, which is easy to generate as input and easy to generate potential output for and test for correctness. Nonetheless, this approach gave me the opportunity to solve the problem by myself and learn something new during the process. That is always welcome.