Longest ordered subset

Let's talk about algorithms!

Moderator: Board moderators

Longest ordered subset

Postby niklaus » Mon Feb 23, 2009 3:53 pm

i'm given a set of 3D points (x,y,z) , The first coordinate is unique. I've to choose a longest subset so that the points are increasing.

(a,b,c) < (p,q,r) if (a<p AND b<q AND c<r)..

Say the points were (3,6,3) , (2,5,1), (-1,4,2) (-5,0,0). then longest subset is (-5,0,0) (-1,4,2) (3,6,3).
Say the points were (3,6,3) , (2,5,2), (-1,4,3) (-5,0,0). then longest subset is (-5,0,0) (2,5,2) (3,6,3).

I could think some reduction. If we sort by the x co-ordinate, then we need to find the longest increasing subsequence of (y,z). So the problem boils down to

(0,0) (5,1) (4,2) (6,3) -> 3

(0,0) (4,3) (5,2) (6,3) -> 3

We cannot extend the 1D LIS algorithm to 2D because the choosing of the point (5,1) or (4,2) makes the difference. When we know that (x,y) != (a,b) then we cannot make a decision of the points to choose. In 1D , if x!=y, either x<y or x>y but it doesn't generalize in 2D.

How do we solve this problem in less than O(n^2) ? How to extend this to n dimensional points.
niklaus
New poster
 
Posts: 6
Joined: Thu Nov 16, 2006 2:34 pm

Re: Longest ordered subset

Postby mf » Mon Feb 23, 2009 10:57 pm

You can solve it in O(n log^2 n) with 2D range trees.
mf
Guru
 
Posts: 1244
Joined: Mon Feb 28, 2005 4:51 am
Location: Zürich, Switzerland

Re: Longest ordered subset

Postby tryit1 » Mon Feb 23, 2009 11:23 pm

Can you throw some light on it ? . But how does range trees apply here .
User avatar
tryit1
Experienced poster
 
Posts: 119
Joined: Sun Aug 24, 2008 9:09 pm

Re: Longest ordered subset

Postby mf » Mon Feb 23, 2009 11:49 pm

You can make a range tree which stores a set of 2D points, some value for each of them, and answers queries of these forms: 1) 'what is the largest value, associated with any point (x, y), such that x<=x0, y<=y0', and 2) 'increase value of point (x, y) to z'.

After you have this structure, you can use it to accelerate the usual O(n^2) dynamic programming algorithm. First, build a range tree which contains the points {(y, z) | (x,y,z) is an input point}, and associate a value of zero with each of them. Then for each point (x_i,y_i,z_i) in increasing order of x coordinate, ask the range tree what's the largest value associated with any point (y, z) such that y<=y_i and z<=z_i, and change the value of point (y_i, z_i) to that value plus 1. In the end, the maximum of all point's values will be the answer.
mf
Guru
 
Posts: 1244
Joined: Mon Feb 28, 2005 4:51 am
Location: Zürich, Switzerland

Re: Longest ordered subset

Postby tryit1 » Mon Feb 23, 2009 11:54 pm

How do you compute when there are 2 largest values ? Why isn't it the smallest largest value.

What if we have x,y < (x1,y1) (x2,y2) ie x<x1 && y<y1 and x<x2 && y<y2 but x1 < x2 and y1> y2.

Which one do i choose ?

My understanding of your post is if we have a sequence x1 < x2 < x3 <x4... for x1 you find the next smallest value xk> x1 and do v[xk]=v[x1]+1.

For ties , my understanding is to increase all of v[k's] values. I think i miss your insight.
User avatar
tryit1
Experienced poster
 
Posts: 119
Joined: Sun Aug 24, 2008 9:09 pm

Re: Longest ordered subset

Postby mf » Tue Feb 24, 2009 12:12 am

Sorry, I don't quite understand what you mean.

My understanding of your post is if we have a sequence x1 < x2 < x3 <x4... for x1 you find the next smallest value xk> x1 and do v[xk]=v[x1]+1.

Here's the DP algorithm that I use:
Code: Select all
Input: coordinates of N points x[1..N], y[1..N], z[1..N], sorted by x.

for i = 1..N:
    v[i] = 1
    for j = 1..i-1:
        if y[j] <= y[i] and z[j] <= z[i]:
            v[i] = max(v[i], v[j] + 1)

output max(v[1], ..., v[N])

What I propose is: use a 2D range tree to accelerate the inner loop - it can be implemented as a query to a range tree in O(log^2 n) time.
mf
Guru
 
Posts: 1244
Joined: Mon Feb 28, 2005 4:51 am
Location: Zürich, Switzerland

Re: Longest ordered subset

Postby tryit1 » Tue Feb 24, 2009 9:50 am

Code: Select all
    for j = 1..i-1:
        if y[j] <= y[i] and z[j] <= z[i]:
            v[i] = max(v[i], v[j] + 1)


When you say this .

{ 1,1 }, { 2,3 }, { 3,2 }, { 4,3 }-> 1st,3rd,4th
{ 1,1 }, { 2,3 }, { 3,2 }, { 3,4 }-> 1st,2nd,4th


When you are at (6,3) , you have to check for both (5,1) (4,2) in first case and (4,3) (5,2) in second case where as range tree gives only 1 value which is either 2nd or 3rd pair.
Last edited by tryit1 on Tue Feb 24, 2009 11:46 am, edited 1 time in total.
User avatar
tryit1
Experienced poster
 
Posts: 119
Joined: Sun Aug 24, 2008 9:09 pm

Re: Longest ordered subset

Postby mf » Tue Feb 24, 2009 10:46 am

Sorry, but I don't get you.

Are you saying there is a problem with the DP algorithm above? Or you don't believe that range trees can be used to replace the inner loop in it?
mf
Guru
 
Posts: 1244
Joined: Mon Feb 28, 2005 4:51 am
Location: Zürich, Switzerland

Re: Longest ordered subset

Postby tryit1 » Tue Feb 24, 2009 11:51 am

{ 1,1 }, { 2,3 }, { 3,2 }, { 4,3 }-> 1st,3rd,4th
{ 1,1 }, { 2,3 }, { 3,2 }, { 3,4 }-> 1st,2nd,4th

{ 1,2 }, { 2,3 }, { 3,2 }, { 4,4 }-> 1st,2nd,4th ,

When we come to (4,4) we should select (2,3) not (3,2) even though (3,2) < (4,4).


Dp algorithm is correct but in the range tree , it has to return a set of points, namely { 2,3 }, { 3,2 } for the above example

length 1-> (1,1)
length 2-> ( {2,3} , {3,2} )
When we get point (4,3) you have to see if it matches any in length 2 , then update otherwise in length1 .


In the range tree if we have points

(3,4) (1,6)

which point do i get to put in its parent ? (3,4) or (1,6) , it depends on the next element. If the next element is (4,5) i put (3,4) because (3,4) (4,5) is good.

if the next element is (2,7) i put (1,6) to get (1,6) (2,7). You always have to look at the successive elements.
User avatar
tryit1
Experienced poster
 
Posts: 119
Joined: Sun Aug 24, 2008 9:09 pm

Re: Longest ordered subset

Postby mf » Tue Feb 24, 2009 12:42 pm

Query to a range tree returns a single number, not a subset of points. The tree's responsibility is to organize the elements in such a way, that it can quickly "look" at all points (x, y), such that x<=x0 and y<=y0, and pick the point with maximum value. Range trees can accomplish this "looking" much faster than a naive algorithm, which examines these elements one by one, thanks to their special structure.

Here's my prototype code in Python, I hope it'll clarify some things:
Code: Select all
#!/usr/bin/env python2.5
import random

class RangeTree(object):
    def __init__(self, P):  # P is a list of (x, y) pairs.
        xs = sorted(set([x for (x, y) in P]))
        ys = sorted(set([y for (x, y) in P]))
        self.ys = ys
        self.fenwick = [0] * len(ys)      # 1D Fenwick tree on y axis

        if len(xs) > 1:
            self.mid = xs[len(xs) // 2]   # median x coordinate
            self.left  = RangeTree(filter(lambda (x, y): x < self.mid, P))
            self.right = RangeTree(filter(lambda (x, y): x >= self.mid, P))
        else:
            self.mid = None  # indicates that we're a leaf in outer tree

    def yrank(self, y):      # returns largest i such that ys[i] <= y
        a = -1
        b = len(self.ys)
        while b - a > 1:
            c = (a + b) >> 1
            if self.ys[c] <= y:
                a = c
            else:
                b = c
        return a

    def query1(self, y):
        i = self.yrank(y)
        res = 0
        while i >= 0:
            res = max(res, self.fenwick[i])
            i = (i & (i + 1)) - 1
        return res

    def increase1(self, y, val):
        i = self.yrank(y - 1) + 1
        while i < len(self.fenwick):
            self.fenwick[i] = max(self.fenwick[i], val)
            i |= i + 1

    def query(self, x, y):
        """Return max(value(x', y')) over all points (x', y'),
           represented by this subtree, such that x' <= x and y' <= y"""
        if self.mid is None:
            return self.query1(y)
        elif x < self.mid:
            return self.left.query(x, y)
        else:
            return max(self.left.query1(y),
                       self.right.query(x, y))

    def increase(self, x, y, val):
        """Changes the value of point (x, y) to the maximum of val and its old value."""
        self.increase1(y, val)
        if self.mid is not None:
            if x < self.mid:
                self.left.increase(x, y, val)
                self.right.increase1(y, val)
            else:
                self.right.increase(x, y, val)


# Solution for the main problem.
# Input:
#   a list of (x, y, z) triples
# Output:
#   length of longest sequences of points (x1,y1,z1),...,(x_k,y_k,z_k),
#   such that x_i <= x_{i+1}, y_i <= y_{i+1}, z_i <= z_{i+1}
# Assumes that all x coordinates are distinct.
def solution(input):
    input = sorted(input)                           # sort by x
    T = RangeTree([(y, z) for (x, y, z) in input])  # build tree
    res = 0
    for x, y, z in input:
        v = 1 + T.query(y, z)
        res = max(res, v)
        T.increase(y, z, v)
    return res

assert solution([(3,6,3), (2,5,1), (-1,4,2), (-5,0,0)]) == 3
assert solution([(3,6,3), (2,5,2), (-1,4,3), (-5,0,0)]) == 3



#  unit test for RangeTree:

class DumbRangeTree(object):
    def __init__(self, P):
        self.d = dict()
        for x, y in P: self.d[x, y] = 0
    def query(self, x0, y0):
        res = 0
        for x, y in self.d.iterkeys():
            if x <= x0 and y <= y0:
                res = max(res, self.d[x, y])
        return res
    def increase(self, x, y, val):
        self.d[x, y] = max(self.d[x, y], val)

def testRangeTree():
    r = random.Random(42)
    for test_no in range(10):
        print 'Test %d' % test_no
        N = r.randint(1, 1000)
        M = 1000000000
        P = [(r.randint(0, M), r.randint(0, M)) for each in range(N)]
        tree1 = DumbRangeTree(P)
        tree2 = RangeTree(P)
        for each in range(10000):
            if r.randint(1, 100) < 30:
                x, y = r.choice(P)
                z = r.randint(1, M)
                tree1.increase(x, y, z)
                tree2.increase(x, y, z)
            else:
                x, y = r.choice(P)
            assert tree1.query(x, y) == tree2.query(x, y)
testRangeTree()
mf
Guru
 
Posts: 1244
Joined: Mon Feb 28, 2005 4:51 am
Location: Zürich, Switzerland

Re: Longest ordered subset

Postby tryit1 » Tue Feb 24, 2009 10:06 pm

---------------------------------------Point P
-------------Point X
************
************
---------------------Point Y
********************
-------------------------- Point Z
***************************
***************************
***************************

The sequence of points is X , Y,Z,P. When it comes to P which one do you choose X or Y or Z? I am asking this because the sequence until X can be longest or Y can be longest or Z can be longest.

If you choose based on x direction , then you will choosing point Z, if you choose Y direction , you will be choosing point X . But we could have a point R that is less than Y. So the answer will be R, Y, P.

---------------------------------------Point P
-------------Point X
************
************
---------------------Point Y
********************
-------------- PointR
-------------------------- Point Z
***************************
***************************
***************************
User avatar
tryit1
Experienced poster
 
Posts: 119
Joined: Sun Aug 24, 2008 9:09 pm

Re: Longest ordered subset

Postby mf » Tue Feb 24, 2009 11:53 pm

The tree chooses the point, which has the largest value, associated with it.
And the point's value here is computed by DP - it's the length of longest increasing sequence of points ending at that point.

In your second case, it'll choose point Y, as it has a value of 1 (thanks to point R), while X and Z have 0. In the first case, it can choose any of X,Y,Z.
mf
Guru
 
Posts: 1244
Joined: Mon Feb 28, 2005 4:51 am
Location: Zürich, Switzerland

Re: Longest ordered subset

Postby tryit1 » Fri Aug 14, 2009 3:24 pm

mf wrote:
Code: Select all
    def increase(self, x, y, val):
        """Changes the value of point (x, y) to the maximum of val and its old value."""
        self.increase1(y, val)
        if self.mid is not None:
            if x < self.mid:
                self.left.increase(x, y, val)
 *************  self.right.increase1(y, val) ********************
            else:
                self.right.increase(x, y, val)



************* self.right.increase1(y, val) ********************
What does this line do ? I feel it is wrong , can you explain it to me. I understand the function increase without that line.
User avatar
tryit1
Experienced poster
 
Posts: 119
Joined: Sun Aug 24, 2008 9:09 pm

Re: Longest ordered subset

Postby mf » Fri Aug 14, 2009 4:47 pm

Hm... the code seems to pass the tests without it, so I guess that line is just a harmless mistake.

I would appreciate it if you don't ask me questions about code that I wrote many months ago, because after such a long time I might have no clue what it does (nor why I wrote it that way), too. :)

(And who does...)
mf
Guru
 
Posts: 1244
Joined: Mon Feb 28, 2005 4:51 am
Location: Zürich, Switzerland

Re: Longest ordered subset

Postby tryit1 » Sat Aug 15, 2009 3:03 am

i'm really thankful to your replies and help on problems. I sometimes don't understand or vaguely understand what you wrote. I come back later and on an another day i re-read it and understand it. Only this time it was months ago :)

Well i tried commenting that line and running it and it passed.

But conceptually i thought it was wrong.Most of the times my understanding about the problem /code had to be changed because it was wrong.

What i understood from below code is at every root where (x,y) is possible you increase the value of that particular y. Now if x is found in left subtree , you do a increase for left subtree and otherwise for right subtree. The recursion will take care until the leaf ie (log n) levels.

Lets say our node is left most leaf. So everything (y val) in the left subtree of root gets updated. But why are we increasing in right node of root with y . It should be wrong. But funny thing was i couldn't find an example where it will go wrong. So i thought it was necessary. It doesn't fail because it does extra work but should have failed on subsequent queries.

I feel it works because you are making only 1 query per test case ie you are not querying for the updated values otherwise you would have failed. Maybe i'm wrong.

Code: Select all
  def increase(self, x, y, val):
        """Changes the value of point (x, y) to the maximum of val and its old value."""
        self.increase1(y, val)
        if self.mid is not None:
            if x < self.mid:
                self.left.increase(x, y, val)
*************  self.right.increase1(y, val) ********************
            else:
                self.right.increase(x, y, val)
User avatar
tryit1
Experienced poster
 
Posts: 119
Joined: Sun Aug 24, 2008 9:09 pm

Next

Return to Algorithms

Who is online

Users browsing this forum: No registered users and 0 guests