Monday, August 16, 2010

Interesting Question

Maybe not so interesting to most people. But it is good to have a chance to dig a hole and jump into it by myself. I am implementing this and hopefully get it resolved elegantly.


Target:


Design a data structure + algorithm to be able to respond to query requests in a timely manner.


1. Assume you have a set of string, say N strings.
2. The lengths of the N strings vary from 1 byte to X bytes.
3. Design a data structure + algorithm to provide a fast query based on:
4. If a string S is provided, we want to be able to find out all strings containing S as their substring in O (log (N)) time.

Conditions:

A. Pre-processing time to make the data structure ready for query need not be considered. We can improve it later on. This is to simply the answer at beginning. I started with rigid requirement of this part and had trouble to have an answer quickly at beginning.

B. Assuming we to run the query many times, and hence first priority is to optimize complexity of the query.

C. You can use hash table in design at beginning. But it is really a memory hogger, so eventually you need to change it to something else.