Problem Description

The problem was published on Hackerrank, you can find it here.

You are given a set of integers S and an integer k. The goal of this problem is to return the size of the biggest subset you can build such as the sum of any two numbers of S is non-divisible by k.

Example

k = 3
S = {1 7 2 4}

The biggest subset you can build here is S' = {1 7 4}. The sum of any two numbers of S' is non-divisible by 3 and adding 2 would make that statement false. Indeed:

1 + 2 = 3 = 3 * 1
4 + 2 = 6 = 3 * 2
7 + 2 = 9 = 3 * 3

Problem Solution

It took me some time to understand how to solve the example above without testing all the subset possibilities. While I was testing, I had this idea : since we want to divide the sum of two numbers by k, then we can work with division remainders. Here is the quick math behind the idea :

(a+b)%k == 0 if and only if (a%k + b%k)%k == 0

This means a and b can coexist in a non-divisable subset if and only if
a%k and b%k can coexist in a non-divisable subset

Given this statement, the example above is much easier to understand : converting x to x%k for every x in the initial set gives 1 7 2 4 => 1 1 2 1. All the ones can coexist in the same subset since (1 + 1) % 3 != 0, and we can't add the two since it will make a three when combining with any one.

Now let's build our algorithm. First we will perform our euclidean division on each number of the set and store the remainder into a list. How will we build our subset from here? The idea is to see which numbers cannot coexist. Two remainders ri and rj can coexist if and only if ri + rj != k (this statement is based on the bit of math we did before). In the case where two remainders can't coexist, which one do we keep? We should keep the one that is the most represented, since we want to build the biggest subset. Example above with remainders 1 1 2 1: ones and twos can't coexist, but we keep ones because there are more ones than twos. We can store the the occurences count in a dictionary, a very handy tool here.

Once we have that dictionary, let's see how we compute the biggest subset size. Here is what the dictionary looks like at each step:

Initial list :  [1, 3, 5, 6, 7, 8, 10, 12, 15, 26, 74, 55, 235, 467]
k value :  4
With remainders :  [1, 3, 1, 2, 3, 0, 2, 0, 3, 2, 2, 3, 3, 3]
Dictionary step by step:
1 {1: 1}
3 {1: 1, 3: 1}
1 {1: 2, 3: 1}
2 {1: 2, 2: 1, 3: 1}
3 {1: 2, 2: 1, 3: 2}
0 {0: 1, 1: 2, 2: 1, 3: 2}
2 {0: 1, 1: 2, 2: 2, 3: 2}
0 {0: 2, 1: 2, 2: 2, 3: 2}
3 {0: 2, 1: 2, 2: 2, 3: 3}
2 {0: 2, 1: 2, 2: 3, 3: 3}
2 {0: 2, 1: 2, 2: 4, 3: 3}
3 {0: 2, 1: 2, 2: 4, 3: 4}
3 {0: 2, 1: 2, 2: 4, 3: 5}
3 {0: 2, 1: 2, 2: 4, 3: 6}
Biggest subset size: 8

Looping through the remainders, here is how we find the best subset :

  • For the numbers where remainders = 0, we can only keep one of them because 0 is the only remainder that can't coexist with zero. Indeed, if we have 0 twice in the final subset, then we can just sum them and get a number divisible by k.
  • Remainder = 1. These numbers can't coexist with those where remainder = 3, because 3 + 1 = 4 = k. But there are more threes so let's keep the threes.
  • Remainder = 2. They can't coexits with themselves (2 + 2 = 4 = k), so we'll keep only one of them. It is the same logic as in remainder equal to 0.
  • Remainder = 2. They can't coexist with numbers where remainder = 1, but there are more threes than ones so we'll keep them!

In the end, we have 1 zero, 1 two, and 6 threes. That's 8 elements for the biggest subset, which can be {3, 6, 7, 8, 15, 55, 235, 467} or {3, 6, 7, 12, 15, 55, 235, 467} or {3, 7, 12, 15, 26, 55, 235, 467} depending on the number we choose for remainders zero and two. The code below shows the algorithm implemented. It passes all test cases on Hackerrank. We iterate once over the initial list and once over the dictionary keys so the complexity is O(N), awesome!

def biggest_subset(list_init, k, n):
    nb_dict = dict()

    for nb in list_init:
            if nb in nb_dict:
                nb_dict[nb] += 1
            else:
                nb_dict[nb] = 1    
                
    count = n
    count_equal = 0

    for nb in nb_dict.keys():
        if k-nb == nb or nb==0:
            count -= nb_dict[nb] - 1
        elif (k-nb) in nb_dict and nb_dict[k-nb] > nb_dict[nb]:
            count -= nb_dict[nb]
        elif (k-nb) in nb_dict and nb_dict[k-nb] == nb_dict[nb]:
            count_equal += nb_dict[k-nb]
        
    return int(count - count_equal//2)

n_test, k_test = map(int, input().strip().split(' '))
list_nb = [int(x)%k_test for x in input().strip().split(' ')]

print(biggest_subset(list_nb, k_test, n_test))

I added a case in the loop over the dictionary, when two remainders that can't coexist have the same occurences. Here, we can select whichever we want for the final set but we know that half of the numbers will be removed. That's the count_equal//2 that you'll see below.

print(biggest_subset([1, 3, 5, 6, 7, 8, 10, 12, 15, 26, 74, 55, 235, 467], 4, 14))
8

Written by Victor
Published




Browse more articles and coding problems