Removing duplicates from a sorted array is a common problem asked in many coding interviews and programming challenges. This can be approached using different methods. In this article, we'll discuss two approaches to remove duplicates from a sorted array: one using set and another using the two-pointers technique.
Solution 1: Brute Force Approach (using Set)
This method uses a set to store unique elements, ensuring that duplicates are removed.
Implementation:
// Solution-1: Brute Force Approach (using Set)
// Time Complexity: O(nlogn)
// Space Complexity: O(n)
int removeDuplicates(vector<int> &arr, int n)
{
set<int> uniqueElements;
for (int i = 0; i < n; i++)
{
uniqueElements.insert(arr[i]);
}
int k = uniqueElements.size();
// Copy unique elements from the set back into the input array
int i = 0;
for (auto x : uniqueElements)
{
arr[i] = x;
i++;
}
// Return the number of unique elements
return k;
}
Logic:
Use a Set: Insert all elements of the array into a set to remove duplicates.
Get Unique Count: The size of the set gives the number of unique elements.
Copy Back to Array: Copy elements from the set back to the array.
Time Complexity: O(n log n)
- Explanation: Inserting each element into the set takes O(log n) time, resulting in O(n log n) for n elements.
Space Complexity: O(n)
- Explanation: The set stores all unique elements, potentially up to n elements.
Example:
Input:
arr = [1, 1, 2, 2, 3, 4, 4]
,n = 7
Output:
k = 4
,arr = [1, 2, 3, 4, ...]
Explanation: The array contains 4 unique elements.
Solution 2: Optimal Approach (Two-Pointers Technique)
This method uses two pointers to overwrite duplicates in place, providing an efficient solution.
Implementation:
// Solution-2: Optimal Approach (Two-Pointers Technique)
// Time Complexity: O(n)
// Space Complexity: O(1)
int removeDuplicates(vector<int> &arr, int n)
{
int i = 0; // Pointer to track the position of the next unique element
for (int j = 1; j < n; j++)
{
if (arr[i] != arr[j])
{
i++;
arr[i] = arr[j];
}
}
// Return the number of unique elements
return i + 1;
}
Logic:
Initialize Pointers: Use
i
to track the position of the next unique element andj
to traverse the array.Compare Elements: If
arr[i]
is not equal toarr[j]
, incrementi
and updatearr[i]
witharr[j]
.Return Count: The value
i + 1
gives the number of unique elements.
Time Complexity: O(n)
- Explanation: The array is traversed once.
Space Complexity: O(1)
- Explanation: No additional space is used apart from variables.
Example:
Input:
arr = [1, 1, 2, 2, 3, 4, 4]
,n = 7
Output:
k = 4
,arr = [1, 2, 3, 4, ...]
Explanation: The array contains 4 unique elements.
Comparison
Brute Force Method:
Pros: Simple and straightforward.
Cons: Inefficient for large arrays due to O(n log n) time complexity and additional space usage.
Use Case: Useful when simplicity is more important than efficiency.
Optimal Method:
Pros: Efficient with O(n) time complexity and O(1) space complexity.
Cons: Requires in-place modification of the array.
Use Case: Ideal for large arrays and when in-place operations are feasible.
Edge Cases
Empty Array: Returns 0 as there are no elements.
Single Element Array: Returns 1 as a single element is trivially unique.
All Identical Elements: Returns 1 as all elements are the same.
Additional Notes
Efficiency: The optimal approach is significantly more efficient for large datasets.
Simplicity: Despite its efficiency, the optimal approach is simple to implement.
Practicality: The optimal method is generally preferred due to its linear time complexity and constant space complexity.
Conclusion
Removing duplicates from a sorted array can be done efficiently using an in-place approach with two pointers. While the brute force method provides a straightforward solution, the optimal approach is both efficient and easy to implement, making it suitable for large datasets.