Majority Element

Well ,to all those who aren't familiar with majority element,a majority element in an array of size N is any element which is present more than N/2 times.

Now our task is given an array of size N ,we need to find the majority element if it exists ,as efficiently as possible.

Here are some of the ways to do it.

Naive approach:

Just scan the array element wise and then make a count of the frequency of each of the distinct elements present in the array and if any element's count is more than N/2 then it is the majority element, otherwise it doesn't exist!!

One needn't ponder much on the complexity of this bruteforce approach.

This requires O(N) additional space and O(N) time.

Recursive Approach:

Here’s a divide-and-conquer algorithm:

function majority (A[1 . . . N])

if N = 1: return A[1]

let AL , AR be the first and second halves of A

ML = majority(AL ) and MR = majority(AR )

if neither half has a majority:

return ‘‘no majority’’

else:

check whether either ML or MR is a majority element of A

if so, return that element; else return ‘‘no majority’’

Brief justification: If A has a majority element x, then x appears more than N/2 times in A and
thus appears more than N/4 times in either AL or AR ; it follows that x must also be a majority
element of one (or both) of these two arrays.
Running time: T (N) = 2T (N/2) + O(N) = O(N log N).

Sorting Based Approach:
It is well known that an array of size N can be sorted in O(NlogN).

A majority element if it exists,should be the median!!

The simple reason to justify the above statement is ,that the majority element is present more than N/2 times hence occupies more than N/2 contiguous positions after sorting,which is bound to contain the middle position.


Linear Time Algorithm:

function majority (A[1 . . . N])

x = prune(A)

if x is a majority element of A:

return x

else:

return ‘‘no majority’’

function prune (S[1 . . . N])

if N = 1: return S[1]

S = [ ] (empty list)

for i = 1 to N/2:

if S[2i − 1] = S[2i]: add S[2i] to S

return prune(S )


Justification: We’ll show that each iteration of the prune procedure maintains the following
invariant: if x is a majority element of S then it is also a majority element of S . The rest then follows.
Suppose x is a majority element of S. In an iteration of prune, we break S into pairs. Suppose
there are k pairs of Type One and l pairs of Type Two:

• Type One: the two elements are different. In this case, we discard both.

• Type Two: the elements are the same. In this case, we keep one of them.

Since x constitutes at most half of the elements in the Type One pairs, x must be a majority element in the Type Two pairs.
At the end of the iteration, what remains are l elements, one
from each Type Two pair. Therefore x is the majority of these elements.

Running time: In each iteration of prune, the number of elements in S is reduced to l ≤ |S|/2.
Therefore, the total time taken is T (N) ≤ T (N/2) + O(N) = O(N).


Moore's Voting Approach:

sweep down the sequence starting at the pointer position shown above.

As we sweep we maintain a pair consisting of a current candidate and a counter. Initially, the current candidate is unknown and the counter is 0.

When we move the pointer forward over an element e:

  • If the counter is 0, we set the current candidate to e and we set the counter to 1.
  • If the counter is not 0, we increment or decrement the counter according to whether e is the current candidate.
When we are done, the current candidate is the majority element, if there is a majority.To ensure that just iterate over the array again and count the number of times the current candidate appears.
This is O(N) approach and more importantly requires only O(1) additional space.


Justification:This simple approach is tougher to crack.But the justification is simple with the help of an example.

Ex:
Imagine a convention center filled with delegates (i.e., voters) each carrying a placard proclaiming the name of his candidate. Suppose a floor fight ensues and delegates of different persuasions begin to knock one another down with their placards.
Suppose that each delegate who knocks down a member of the opposition is
simultaneously knocked down by his opponent. Clearly, should any candidate field more delegates than all the others combined, that candidate would win the floor fight and, when the chaos subsided, the only delegates left standing would be from the majority block. Should no candidate field a clear majority, the outcome is less clear; at the conclusion of the fight, delegates in favor of at most one candidate, say, the nominee, would remain
standing--but the nominee might not represent a majority of all the delegates.
Thus, in general, if someone remains standing at the end of such a fight, the convention chairman is obliged to count the nominee’s placards
(including those held by downed delegates) to determine whether a majority exists.

This is what the above voting algorithm simulates ,if we can correlate carefully!!

7 comments:

  1. On the last method (Moore), what if the sequence is:
    20,7,3,7,1,7,7,7,7
    , doesn't the algorithm fail?
    counter = 0, e = ?
    20 -> counter = 1, e = 20
    7 -> counter = 0, e = 20
    3 -> counter = 1, e = 3
    7 -> counter = 0, e = 3
    1 -> counter = 1, e = 1
    7 -> counter = 0, e = 1
    7 -> counter = 1, e = 7
    7 -> counter = 2, e = 7
    7 -> counter = 3, e = 7

    counter = 3 < 9/2 = 4

    Sorry if I misunderstood the algorithm, but i couldn't figure up this problem.

    Cheers

    ReplyDelete
    Replies
    1. This comment has been removed by the author.

      Delete
    2. Basic Algo MOORE..

      Step 1). initialize max_index=0; count=1;

      step 2).

      for(int i=1;i<A.size();i++)

      if(A.at(max_index)==A.at(i)) count++;
      else count- -;

      if(count==0) {

      count=1;
      max_index=i;
      }
      }

      return A.at(max_index);


      here I m trying to so you demo...of Your eg..
      20,7,3,7,1,7,7,7,7

      /**NMWPSMIE= Not matching with Previous Selected Max_Index Element */
      A[0]=20 i=0) max_index=0; count=1;
      A[1]=7 i=1) NMWPSMIE count--; count becomes=0; do this __ max_index=1; count=1;
      A[2]=3 i=2) NMWPSMIE count--; count becomes=0; do this __ max_index=2; count=1;
      A[3]=7 i=3) NMWPSMIE count--; count becomes=0; do this __ max_index=3; count=1;
      A[4]=1 i=4) NMWPSMIE count--; count becomes=0; do this __ max_index=4; count=1;
      A[5]=7 i=5) NMWPSMIE count--; count becomes=0; do this __ max_index=5; count=1;
      A[6]=7 i=6) count++;
      A[7]=7 i=7) count++;
      A[8]=7 i=8)count++

      max_index=5;

      A[max_index]=7(as A[5]=7)

      you need check also this element also repeat more floor(A.size()/2)..
      A.size()=9
      Repeatation Needed Atleast=9/2+1=5
      frequency of 7 = 6

      satisfy condition..So answer is 7....
      if u still has doubt please mail me your doubt rameshc10695@gmail.com
      or post here...facebook.com/groups/Novice.Programs.at.NITK/

      Happy Coding..................
      Hope to see Your Reply Soon..................

      Delete
  2. Good one. Found the solution here also http://geeksforgeeks.org/?p=503

    ReplyDelete
  3. It seems in this case that the naive approach is actually the best in terms of computations performed. O(N) is better than O(NlogN). I then wouldn't call it the naive approach but the smart approach. It's amazing how many problems Hash table solve in a very efficient manner.

    ReplyDelete
  4. On moore's approach :

    what u think should be mode in : 12141516171833 ??

    it should be 1 but moore's approach will return 3.

    ReplyDelete