### Microsoft Interview Questions

These Are one of the Microsoft Interview Questions, if you attended Microsoft interview recently, please share you interview questions with us through comments or mail them at kchaitanyya@gmail.com

1. What are your interests? ( as in academics/topics)

2. Given a string, search it in a set of strings (say among 1000s of string). What data structure would you use to store those 1000 strings and get the results fastest?

`Answer:I answered hash tables but he said suggest a better      one.He said suggest a better one and then gave me one      Tree sort of DS and then asked me to compare the two.`

3. Reverse a linked list?

4. Find if there is a loop in a linked List?

5. Given two arrays of numbers, find if each of the two arrays have the same set of integers ? Suggest an algo which can run faster than NlogN ?

6. Validate a Binary search tree? ( as in the left- right child follow the property )
`  Well i gave a the some weird eg where the struct was not  a Binary tree but if passed through the test will give  positive results.then he asked me to solve for that too.`

The interviewer gets a bit serious with each stage. He will test ur work for all possible set of inputs.

Prologue: Well in my case he started with how they require not only a programmer but a designer and coder who writes perfect code.

1. Write a routine that takes input as a string such as
`          "aabbccdef" and o/p "a2b2c2def"     or          "a4bd2g4" for "aaaabddgggg"`

write it perfectly as if it should ready to be shipped after you code it.

2. In the same Question (q1) why will u o/p "abc" for the i/p "abc" instead of "a1b1c1" ?

3. Given a NxN matrix with 0s and 1s. now whenever you encounter a 0 make the corresponding row and column elements 0.

Flip 1 to 0 and 0 remains as they are.

for example
1 0 1 1 0
0 1 1 1 0
1 1 1 1 1
1 0 1 1 1
1 1 1 1 1

results in

0 0 0 0 0
0 0 0 0 0
0 0 1 1 0
0 0 0 0 0
0 0 1 1 0

1. Some Questions on the projects listed on your resume?
`     For me some Qs on DB Lock Manager?`

2. Given 2 set of arrays of size N(sorted +ve integers ) find the median of the resultent array of size 2N.
(dont even think of sorting the two arrays in a third array , though u can sort them. Try something better than order N ..order LogN )

3. Given 1000 bottles of juice, one of them contains poison and tastes bitter. Spot the spoiled bottle in minimum sips?

4. Whats the difference b/w a thread and a process? are Word and PowerPoint different processes or threads of a single process?

5. How does a spell checker routine (common to both, word and PowerPoint) used? I mean is the code copied 2 times for each of the processes in the main memory, if they are different processes or how is it used if they are threads.

1. HEY

EXCELLENT!!

2. While the interviewers are going to ask
to see code, they also want (as I would)
to see your thought processes as you write
the code. Let's pick question 2:

> Given 2 set of arrays of size N (sorted
> positive integers) find the median of the
> resultant array of size 2N.

Clearly you could combine these two arrays
into a single array, sort, and pick the
middle element. This requires additional
array space of size 2N and sorting this
larger array can be done in O(n log n). But
the aim is likely to find the most efficient
solution:

First, state your assumptions:

1. There may be duplicate integers.
2. The arrays A1 and A2 are sorted.
3. You can compare <, =, >

Second, you need to define what the median
value is. Since the total number 2N of
elements is even then define the median of
2N elements to be an element from either A1
or A2 that is greater than or equal to N
elements in the combined array and lesser
than or equal to N elements in the combined
array. This means you will some ambiguity
because if all elements are distinct then
there are indeed two distinct elements that
could be selected as the median value.

Third, try to identify useful test cases
that will show your algorithm in practice,
and validate you have the right
implementation.
The following test cases come to mind:

A1={1,2,3,4,5}, A2={10,11,12,13,14}, M=5
A1={1,2,3}, A2={1,2,3}, M=2
A1={1,3,5}, A2={2,4,6}, M=3
A1={1,2,9,10}, A2={3,4,7,8}, M=4
A1={1,7,17,90}, A2={2,7,8,9}, M=7

The last test is interesting. Note that the
median value of each of the individual
Arrays is 7 and this is the final median of
the combined array! This is the best case
scenario. But then I'm struck by the
following idea: Since both arrays are
already sorted, you can "mark" the smaller
of the two first elements (if they are the
same mark both of them). Let's call this
operation "compare-and-mark". Once done,
advance (in one array or both) to the next
unmarked entry. And repeat until N numbers
are marked. These N numbers are the smallest
N numbers from the resulting set. This
computation has actually determined the
median, given our definition earlier. So
once N numbers are marked, we choose the
largest of these N numbers, which is a simple
comparison between the largest marked number
in A1 against the largest marked number in A2.

Here's some sample Java code:

public static int median (int[] a1, int[] a2) {
int i1 = 0, i2 = 0;
int marked = 0;
int n = a1.length;
while (marked < n) {
// when comparing, either advance both
// index values or just one.
if (a1[i1] == a2[i2]) {
marked += 2;
i1++;
i2++;
} else if (a1[i1] < a2[i2]) {
marked++;
i1++;
} else {
marked++;
i2++;
}
}

// once done, i1 and i2 point to unmarked
// entries OR are past array bound.

// If last two entries are the same, we may
// have gone one too far and marked N+1
// entries. Since the two numbers are the
// same in this case, it does not affect
// the result.

// all smallest elements are in a2.
if (i1 == 0) { return a2[i2-1]; }

// all smallest elements are in a1.
if (i2 == 0) { return a1[i1-1]; }

// find larger (or equal) and return.
if (a1[i1-1] < a2[i2-1]){ return a2[i1-1];}
return (a1[i1-1]);
}

You can observe that the while loop is
executed no more than N times, thus this
algorithm is able to compute the median in
O(N) time.

For more ideas regarding the principles of
effective algorithms, including divide-
and-conquer, selecting the right data
structure, and greedy approaches, see the
"Algorithms in a Nutshell" book:

http://oreilly.com/catalog/9780596516246

Best of luck!

3. For Round 1, question 2 a possible representation of the tree would look like:

For strings:
aaa
aab
aac
aae
aaf
aag

a
__|...
|
a
|
_____________
| | | | | | |
a b c d e f g

With the above representation it can be implemented below at the end (c# code - not sure if it compiles as I wrote it in a notepad):

The first method creates a tree representation of the strings and the second one checks if the given string exists in the list of strings. The tree

is constructed with hashtable of hashtables data structure, but if the characters in the strings could be let's say a-z or A-Z, then the hashtable

can be replaced with 52 elements of array, with each element contains a bit (0 means no path or no string containing this char, 1 means has path) and

a pointer to another array of same structure and for this case we need a hash function to map the char to array index.

The tree using hashtable approach requires O(n) to construct the tree as we need to access all strings. But since we also parse each string to access

each characters that would add another O(m) time for each string. m represents the average length of the strings, therefore the time complexity is

O(n X m) which is O(n) if m is much...much...much less than n as n approaches to infinity. The above assertion is invalid in cases when m is close to

n as n approaches to infinity, but in reality that would not be the case. Therefore, construction the tree requires O(n), then look up is O(c) so

time it takes to search will be O(n). But once the tree is constructed any search would take an order of constant time O(c).

The whole idea of either Hashtable or the array approach is to speed up look up time, in either case look up will be done in a constant time. But the

hashtable approach would be a better solution in cases when among the 52 array elements only very few of them are used because we are wasting memory

besides the assumptions made.

Another approach that the writer of the article mentioned is to put all the strings in one hashtable, and then search the other string from the

hashtable. Now obviously one has to access all the strings to add them in a hashtable that would be done in linear time O(n), and then look up from a

hashtable takes a constant time O(c) therefore this approach would require O(n) time, so no much difference from the approach that uses a tree data

structure.

Let us see its memory utilization; best case scenario is if the strings are identical, in that case only one string will be added in the hashtable

but the worst would be if none of them are the same that means O(n) memory would be required to put the copy of the strings into the hashtable. But

the tree approach would reduce this memory usage significantly, because any sub strings starting from the beginning that are the same will share only

that substring space, that means if you have a strings array of 1000 elements and 20 char length each and the first 15 chars are the same for all the

1000 elements then you will have to use only 15 char spaces for the 1000 strings that would have required 15000 char spaces if implemented with the

hashtable approach.

Note: There is a char STRING_END_CHAR which is used to mark that the path from root to that char means that path represents a valid string. It is

very crucial to have this otherwise we wouldn't know what is valid string or not.

I would like to hear from any one any opinion and a better solution.

Here is the code:

public static char STRING_END_CHAR = '~';

public static Hashtable ConstructTree(string[] strings)
{
Hashtable mainht = new Hashtable();

foreach( string str in strings)
{
Hashtable ht = mainht;
foreach(char chr in str)
{
if(ht.ContainsKey(chr))
{
ht = ht[chr] as Hashtable;
}
else
{
Hashtable nht = new Hashtable();
ht = nht;
}
}
if(!ht.ContainsKey(STRING_END_CHAR))
{
}
}
return mainht;
}

public static bool Exists(string[] strings, string str)
{

Hashtable ht = ConstructTree(strings);

foreach(char chr in str)
{
if(ht.ContainsKey(chr))
{
ht = ht[chr] as Hashtable;
}
else
{
return false;
}
}

return ht.ContainsKey(STRING_END_CHAR);
}

4. Refering: Given two arrays of numbers, find if each of the two arrays have the same set of integers ? Suggest an algo which can run faster than NlogN ?

This can be done in different ways:

1 - Brute force: for each element in array1 check that element exists in array2. Note this would require to note the position/index so that duplicates can be handled properly. This requires O(n^2) with much complicated code, don't even think of it at all...

2 - Sort both lists, then check each element to see if they're identical. O(n log n) for sorting and O(n) to check so basically O(n log n), sort can be done in-place if messing up the arrays is not a problem, if not you need to have 2n size memory to copy the sorted list.

3 - Add the items and count from one array to a hashtable, then iterate through the other array, checking that each item is in the hashtable and in that case decrement count if it is not zero otherwise remove it from hashtable. O(n) to create a hashtable, and O(n) to check the other array items in the hashtable, so O(n). This introduces a hashtable with memory at most for n elements.

4 - Best of Best (Among the above): Subtract or take difference of each element in the same index of the two arrays and finally sum up the subtacted values. For eg A1={1,2,3}, A2={3,1,2} the Diff={-2,1,1} now sum-up the Diff = 0 that means they have same set of integers. This approach requires an O(n) with no extra memory. A c# code would look like as follows:

public static bool ArrayEqual(int[] list1, int[] list2)
{
if (list1 == null || list2 == null)
{
throw new Exception("Invalid input");
}

if (list1.Length != list2.Length)
{
return false;
}

int diff = 0;

for (int i = 0; i < list1.Length; i++)
{
diff += list1[i] - list2[i];
}

return (diff == 0);
}

5. I should mention that the summation of the differences algorithm proposed in my previous comment works only when either the integers are all positive or negative.

6. The 4th approach doesn't really work, I am lol on my self, eg a1={1,1,1} a2={2,1,0}

7. For round 1 question 5:
if we xor all the elements of the first array and compare that with the result of the xor operation of the second array, we should be able to determine if both the arrays have the same elements.

8. int main()
{
char s[]="aaabbbccccdee",ch;
int i,c=1;
ch=s;

for(i=1;s[i];i++)
{
if(s[i]==ch)
{
c++;
}
else
{
printf("%c",ch);
if(c>1)
{
printf("%d",c);
c=1;
}
ch=s[i];
}

}
printf("%c",ch);
if(c>1)
{
printf("%d",c);
}

getch();
return 0;
}

9. @ rakesh
for Round 1 ques 5
ur approach seems to be completely wrong to me.
check ur approach for
set1={2,3,5,7} and set2={0,1,4,6}
by xoring the elements in each set we will get the answer = 3{011 in binary} but the number in both the sets are completely different.

10. May be if someone else has any idea to solve the ques 5 of Round 1 in O(n) or less time other than Hash Table approach then post it here...

11. hey..
round 1 , question 5:-
build binary search trees for both the array elements.
now perform in order traversal of both the trees, if the sequences are same, then both arrays contain same elements.
the time complexity is O(n), the only catch here is that extra space for creating a tree is required, but there is no space constraint, so i guess this can be an alternative solution to the hashing approach.

12. hey..
round 1 , question 5:-
build binary search trees for both the array elements.
now perform in order traversal of both the trees, if the sequences are same, then both arrays contain same elements.
the complexity is O(n), the only catch here is that extra space for creating a tree is required, but there is no space constraint, so i guess this can be an alternative solution to the hashing approach.

13. In round 2 2nd queston:

#include
#include
main()
{
char *a,ch;
int *b,i;
int* rep(char*);
clrscr();
printf("enter any string\n");

gets(a);

b=rep(a);
for(i=0;i<=25;i++)
{
if(b[i]!=0)
{

if(b[i]>1)

printf("%c%d",tolower(i+65),b[i]);

else
printf("%c",tolower(i+65));
}
}

getch();
return(0);

}

int* rep(char* a)
{
static int b;
int i;

for(i=0;a[i]!='\0';i++)
{
if(isalpha(a[i]))
{
b[toupper(a[i])-65]++;
}

}
return(b);
}

14. Round 1 Q5: to have same elements in both the arrays, the sum of elements of each array should be equal as well as the sum of squares of each array should be equal (and of course both arrays should have equal no. of elements). Complexity O(n) :)

15. 16. Hi

I read this post two times.

I like it so much, please try to keep posting.

Let me introduce other material that may be good for our community.

Source: Microsoft interview questions

Best regards
Henry

17. 18. 19. Reading interview questions prepares you for the ultimate test of your career. You get to know what type of questions are asked in the companies. So you prepare accordingly. Thanks for sharing such an interesting blog posts.

Best Regards,
Crish Watson
Pass Microsoft Certification Without Exam