Find that unique element from the 10^5 array size

This question already has an answer here:

  • How to find the only number in an array that doesn't occur twice [duplicate] 6 answers

What would be the best algorithm for finding a number that occurs only once in a list which has all other numbers occurring exactly twice.

So, in the list of integers (lets take it as an array) each integer repeats exactly twice, except one. To find that one, what is the best algorithm.

--------------Solutions-------------

The fastest (O(n)) and most memory efficient (O(1)) way is with the XOR operation.

In C:

int arr[] = {3, 2, 5, 2, 1, 5, 3};

int num = 0, i;

for (i=0; i < 7; i++)
num ^= arr[i];

printf("%i\n", num);

This prints "1", which is the only one that occurs once.

This works because the first time you hit a number it marks the num variable with itself, and the second time it unmarks num with itself (more or less). The only one that remains unmarked is your non-duplicate.

O(N) time, O(N) memory

HT= Hash Table

HT.clear() go over the list in order for each item you see

if(HT.Contains(item)) -> HT.Remove(item)
else
ht.add(item)

at the end, the item in the HT is the item you are looking for.

Note (credit @Jared Updike): This system will find all Odd instances of items.



comment: I don't see how can people vote up solutions that give you NLogN performance. in which universe is that "better" ? I am even more shocked you marked the accepted answer s NLogN solution...

I do agree however that if memory is required to be constant, then NLogN would be (so far) the best solution.

By the way, you can expand on this idea to very quickly find two unique numbers among a list of duplicates.

Let's call the unique numbers a and b. First take the XOR of everything, as Kyle suggested. What we get is a^b. We know a^b != 0, since a != b. Choose any 1 bit of a^b, and use that as a mask -- in more detail: choose x as a power of 2 so that x & (a^b) is nonzero.

Now split the list into two sublists -- one sublist contains all numbers y with y&x == 0, and the rest go in the other sublist. By the way we chose x, we know that a and b are in different buckets. We also know that each pair of duplicates is still in the same bucket. So we can now apply ye olde "XOR-em-all" trick to each bucket independently, and discover what a and b are completely.

Bam.

Kyle's solution would obviously not catch situations were the data set does not follow the rules. If all numbers were in pairs the algorithm would give a result of zero, the exact same value as if zero would be the only value with single occurance.

If there were multiple single occurance values or triples, the result would be errouness as well.

Testing the data set might well end up with a more costly algorithm, either in memory or time.

Csmba's solution does show some errouness data (no or more then one single occurence value), but not other (quadrouples). Regarding his solution, depending on the implementation of HT, either memory and/or time is more then O(n).

If we cannot be sure about the correctness of the input set, sorting and counting or using a hashtable counting occurances with the integer itself being the hash key would both be feasible.

I would say that using a sorting algorithm and then going through the sorted list to find the number is a good way to do it.

And now the problem is finding "the best" sorting algorithm. There are a lot of sorting algorithms, each of them with its strong and weak points, so this is quite a complicated question. The Wikipedia entry seems like a nice source of info on that.

You need to specify what you mean by "best" - to some, speed is all that matters and would qualify an answer as "best" - for others, they might forgive a few hundred milliseconds if the solution was more readable.

"Best" is subjective unless you are more specific.



That said:

Iterate through the numbers, for each number search the list for that number and when you reach the number that returns only a 1 for the number of search results, you are done.

Seems like the best you could do is to iterate through the list, for every item add it to a list of "seen" items or else remove it from the "seen" if it's already there, and at the end your list of "seen" items will include the singular element. This is O(n) in regards to time and n in regards to space (in the worst case, it will be much better if the list is sorted).

The fact that they're integers doesn't really factor in, since there's nothing special you can do with adding them up... is there?

Question

I don't understand why the selected answer is "best" by any standard. O(N*lgN) > O(N), and it changes the list (or else creates a copy of it, which is still more expensive in space and time). Am I missing something?

Depends on how large/small/diverse the numbers are though. A radix sort might be applicable which would reduce the sorting time of the O(N log N) solution by a large degree.

The sorting method and the XOR method have the same time complexity. The XOR method is only O(n) if you assume that bitwise XOR of two strings is a constant time operation. This is equivalent to saying that the size of the integers in the array is bounded by a constant. In that case you can use Radix sort to sort the array in O(n).

If the numbers are not bounded, then bitwise XOR takes time O(k) where k is the length of the bit string, and the XOR method takes O(nk). Now again Radix sort will sort the array in time O(nk).

Implementation in Ruby:

a = [1,2,3,4,123,1,2,.........]
t = a.length-1
for i in 0..t
s = a.index(a[i])+1
b = a[s..t]
w = b.include?a[i]
if w == false
puts a[i]
end
end

You could simply put the elements in the set into a hash until you find a collision. In ruby, this is a one-liner.

def find_dupe(array)
h={}
array.detect { |e| h[e]||(h[e]=true; false) }
end

So, find_dupe([1,2,3,4,5,1]) would return 1.

This is actually a common "trick" interview question though. It is normally about a list of consecutive integers with one duplicate. In this case the interviewer is often looking for you to use the Gaussian sum of n-integers trick e.g. n*(n+1)/2 subtracted from the actual sum. The textbook answer is something like this.

def find_dupe_for_consecutive_integers(array)
n=array.size-1 # subtract one from array.size because of the dupe
array.sum - n*(n+1)/2
end

Category:algorithm Time:2008-08-29 Views:1

Related post

  • How can I get the unique elements of an unordered character array in Java? 2011-05-30

    If I have an unordered character array, how can I get the unique elements? --------------Solutions------------- One-liner: Set<Character> uniqueChars = new HashSet<Character>(Arrays.asList(array)); (the array will need to be Character[] n

  • How to tell how may unique elements there are an NSArray 2010-07-08

    I've got an array (actually a mutable array) of NSStrings, and I want to find out how many unique elements the are in the array. For instance, say the array is made up of: Orange, Lemon, Lemon, Orange, Lemon Then there are just two unique elements (O

  • How do I get the unique elements from an array of hashes in Ruby? 2008-10-08

    I have an array of hashes, and I want the unique values out of it. Calling Array.uniq doesn't give me what I expect. a = [{:a => 1},{:a => 2}, {:a => 1}] a.uniq # => [{:a => 1}, {:a => 2}, {:a => 1}] Where I expected: [{:a =>

  • Selecting Unique Elements From a List in C# 2008-11-15

    How do I select the unique elements from the list {0, 1, 2, 2, 2, 3, 4, 4, 5} so that I get {0, 1, 3, 5}, effectively removing all instances of the repeated elements {2, 4}? --------------Solutions------------- var numbers = new[] { 0, 1, 2, 2, 2, 3,

  • unique elements - struct array 2009-11-28

    I have a sorted struct of IPs where I need to get the number of unique IPs, for some reason the way I'm doing it, is giving me a "0" as a result. Which in this case there should be 12 unique ips. The struct array containing the following elements: 19

  • Efficiently estimating the number of unique elements in a large list 2009-12-30

    This problem is a little similar to that solved by reservoir sampling, but not the same. I think its also a rather interesting problem. I have a large dataset (typically hundreds of millions of elements), and I want to estimate the number of unique e

  • Return Unique Element with a Tolerance 2010-01-01

    In Matlab, there is this unique command that returns thew unique rows in an array. This is a very handy command. But the problem is that I can't assign tolerance to it-- in double precision, we always have to compare two elements within a precision.

  • jQuery unique elements 2010-03-01

    http://api.jquery.com/jQuery.unique/ lets you get only unique elements. Is there I can find out if an element is already in the list or not. list = $('#container p a'); elem = $('#container div a:first'); Is there a way to find out if elem is already

  • Determining if an unordered vector has all unique elements 2010-05-04

    Profiling my cpu-bound code has suggested I that spend a long time checking to see if a container contains completely unique elements. Assuming that I have some large container of unsorted elements (with < and = defined), I have two ideas on how t

  • How do I find, count, and display unique elements of an array using Perl? 2010-05-12

    I am a novice Perl programmer and would like some help. I have an array list that I am trying to split each element based on the pipe into two scalar elements. From there I would like to spike out only the lines that read ‘PJ RER Apts to Share’ as th

  • unique elements in a haskell list 2010-06-23

    okay, this is probably going to be in the prelude, but: is there a standard library function for finding the unique elements in a list? my (re)implementation, for clarification, is: has :: (Eq a) => [a] -> a -> Bool has [] _ = False has (x:x

  • XSLT 1.0 Unique Elements 2010-10-28

    I am attempting to use preceding-sibling to select unique elements from a group. Using the folliwng xml as an example.. <items> <item> <options> <option> <option-data> <data-ab>TEST1</date-qualifier> <date

  • How to ensure list contains unique elements? 2011-01-20

    I have a class containing a list of strings. Say: ClassName: - list_of_strings I need to enforce that this list of strings contains unique elements. Unfortunately, I can't change this list_of_strings to another type, like a set. In the addToList(str_

  • Does array have a unique element? 2011-02-20

    I am looking for the most efficient algorithm to check if an array has a unique element in it. The result need not be the unique element, just true or false accordingly. I know I can reach O(n) efficiency with a hash table or somesuch, but I am still

  • how to count unique elements of a cell in matlab? 2011-03-12

    I want to count unique elements of a cell array in Matlab. How can I do this? Thank you. c = {'a', 'b', 'c', 'a'}; % count unique elements, return the following struct unique_count.a = 2 unique_count.b = 1 unique_count.c = 1 --------------Solutions--

  • jQuery function to get all unique elements from an array? 2011-03-21

    http://api.jquery.com/jQuery.unique/ lets you get unique elements of an array, but the docs say the function is mostly for internal use and only operates on DOM elements. Another SO response said the unique() function worked on numbers, but that this

  • Determining the number of occurrences of each unique element in a vector 2011-03-22

    How can I determine the relative frequency of a value in a MATLAB vector? vector = [ 2 2 2 2 1 1 1 2 2 1 1 1 2 2 2 2 1 2 ]; What function will return the number of occurrences of each unique element? --------------Solutions------------- You can use u

  • How to Keep Track of the Index of a Non-Unique Element in a Vector? 2011-04-13

    I'm most likely going about this the wrong way, but I was wondering if anyone has some advice for how to keep track of where a non-unique element is within a vector? I'm using glDrawArraysInstanced and using a vector to store translations (i.e. the f

  • Finding Unique Elements in a Matrix 2011-04-20

    I am working with a 2D array called matrix. I need to retrieve the unique elements in the array. nil A B C G F y1 a1 b2 c1 g1 f1 y2 a1 b1 c2 g2 f1 y3 a2 b1 c2 g1 f2 y4 a1 b2 c2 g2 f1 y5 a2 b2 c1 g1 f2 So for instance, for column A, I should get a1 an

Copyright (C) pcaskme.com, All Rights Reserved.

processed in 0.492 (s). 13 q(s)