Search In A Young tableau - A Sorted Matrix

Young tableau : For our present discussion ,we confine this entity to a table which elements are sorted both column wise and row wise.The degree of orderliness among the elements is loosely bound that a row by row or column by column traversal of this matrix doesn't essentially list out the elements in a sorted manner.So the search on this matrix is not all that simple and straight forward as it looks like.
In the following sections we will look at some interesting approaches to search for a key in this 2D array(after all it is!!).

One interesting yet simple thing worth observing is that
element A[i][j] is always > A[p][q] for i > p and j> q .
The next 2 strategies are based on this simple fact.

Strategy1 - A grid search: About any element A[i][i] divide the matrix in to 4 quadrants.
If the key K we are looking for is

  1. <A[i][j] then we can eliminate the lower right quadrant because all its elements are > A[i][j].

  2. <A[i][j] then we can eliminate the lower right quadrant because all its elements are > A[i][j].

  3. =A[i][j]. then our search is over.The choice of this i can be done in a binary search manner.. reducing the search space by half.


Now we can search the 3 quadrants individually and hence recursively.

T(N)=3*T(N/4)+O(1) which comes out to be O(N^(log3/log4)) which is less than O(N).

Strategy2:Now we move a step further in reducing the search space.Iterate along the diagonal and find i such that A[i][i] <k and A[i+1][i+1] >k.Now we have only 2 search intervals to search for.

T(N)=2*T(N/4)+O(N) which comes out to be O(N).

Strategy3:One more interesting solution and smart solution that I found was(the credit goes to the geek named Novice in the discussion http://inder-gnu.blogspot.com/2008/01/find-element-in-row-and-column-sorted.html ) this.

Start from the point in the last row first column. Every point to its right is greater than this point and every point on its top is smaller than this. So, if the point is greater than this point then move right otherwise move top. So, you will traverse at most 2*n points.


Well, these are 3 interesting solutions I could find till now and at this juncture ,it is not surprising if one wishes to compare these strategies to figure out the best and I don't even rule out any other solutions to this problem.So folks if you do any ,please put them in the comments sections and let others know.

Continued

Here are the C codes for the above discussed strategies.

Strategy1




strategy1

bool novice_search(int **grid,int size, int key,int &x,int &y)
{
int i=size-1,j=0;
while(0<=i && i<size && 0<=j && j<size)
{
if(grid[i][j]==key)
{
x=i;
y=j;
return true;
}
else if(grid[i][j] <key)
{
j++;
}
else
i--;
}
return false;
}



strategy2

bool quadra_partitionsearch(int **grid,int row_min,int row_max,int col_min,int col_max,int key,int &x,int &y)
{
if((row_min > row_max) ||( col_min >col_max))
return false;
else if((row_min==row_max) &&(col_min==col_max))
{
if(grid[row_min][col_min]==key)
{
x=row_min;
y=col_min;
return true;
}
else
return false;
}
else if((grid[row_min][col_min] <=key) && (grid[row_max][col_max]>=key))
{
int row_mid =(row_min +row_max)/2;
int col_mid =(col_min+col_max)/2;
bool flag;
// cout <<row_min <<'\t' <<row_max<<'\t'<<col_min<<'\t'<<col_max<<'\n';
if(grid[row_mid][col_mid]==key)
{
x=row_mid;
y=col_mid;
return true;
}

else if(grid[row_mid][col_mid]>key)
{
if(quadra_partitionsearch(grid,row_min,row_mid,col_min,col_mid,key,x,y))
return true;
}
else
{
if(quadra_partitionsearch(grid,row_mid,row_max,col_mid+1,col_max,key,x,y))
return true;
}
if(quadra_partitionsearch(grid,row_min,row_mid,col_mid+1,col_max,key,x,y))
return true;
else if(quadra_partitionsearch(grid,row_mid+1,row_max,col_min,col_mid,key,x,y))
return true;

}
return false;
}


strategy3

bool binary_partitionsearch(int **grid,int row_min,int row_max,int col_min,int col_max,int key,int &x,int &y)
{
if((row_min > row_max) ||( col_min >col_max))
return false;
else if((row_min==row_max) &&(col_min==col_max))
{
if(grid[row_min][col_min]==key)
{
x=row_min;
y=col_min;
return true;
}
else
return false;
}
else
{
if(grid[row_min][col_min] > key)
return false;
int row_mid=row_min,col_mid=col_min;
while(grid[row_mid][col_mid] < key)
{
row_mid++;
col_mid++;
}
if(grid[row_mid][col_mid]==key)
{
x=row_mid;
y=col_mid;
return true;
}
else
{
if(binary_partitionsearch(grid,row_mid,row_max,col_min,col_mid-1,key,x,y))
return true;
return binary_partitionsearch(grid,row_min,row_mid -1,col_mid,col_max,key,x,y);
}
}
}


I tried to check which of them is efficient by noting the runtimes and well all of them were quite close,though 2nd and 3rd approaches did mostly well compared to the first one.All these strategies worked more or less in the same manner on an grids of size varying from 100 to 1000.The last 2 strategies worked much better than the first one mostly.

9 comments:

  1. Represent each point by a zero based integer index so that for a given i value its corresponding matrix index (x,y) values are determenied by
    (x,y) = (i/n, i modulus n) where n is the size of matrix (one dimension)

    Therefore for eg, the 9th element value of a 4X4 matrix A is A[2,1]

    Having said that use any of the search algorithm by treating the matrix as a one dimensional array of size nXn, hence for eg if we use binary search then running time will be T(n) = log(n^2) = 2log(n)

    ReplyDelete
  2. the previous suggestions doesn't work, dis-regard it

    ReplyDelete
  3. for solution two, you might as well binary search for the spot on the diagonal that you want. Also, if n is the size of a dimension and not area then your subproblems are both n/2 and your big oh equations are all wrong.

    ReplyDelete
  4. Great post, I just read about Young Tableaux in "Introduction to Algorithms" and was wondering about the various strategies for querying them: you cover them well.

    Even though the first method is asymptotically faster I would probably use the 3rd method for small elements because of its straightforward implementation. I would think (guts guess) than it's faster for small tableaux.

    ReplyDelete
  5. There is something you greatly overlooked in your comparison.

    For the 2 first strategies, you express the complexity in term of the number of elements in the table:

    For the 1st strategy, it yields T(NumberElements) = Theta(NumberElements**(log 3 / log 4)).

    However, you use n with a totally different meaning in the 3 strategy: as the dimension of the square!!! In which case the search is O(NumberRows + NumberColumns).

    Now, if we define the array to be of dimension M*N, this means that:

    Strategy 1: O((M*N)**(log 3 / log 4))
    Strategy 3: O(M+N)

    And my choice is pretty clear!

    ReplyDelete
  6. If we choose N to be the size of area, the recurrence of the 2nd strategy would be T(N)=2*T(N/4)+O(log n) which is O(sqrt(N)) and in terms of number of rows is O(R).

    ReplyDelete
  7. could we print this 2D array in sorted order in O(n) time?

    ReplyDelete
  8. Nice matrix. i really like it. Thank you for sharing.

    Upcoming Bank Exams

    ReplyDelete