## Longest ordered subset

Moderator: Board moderators

### Longest ordered subset

i'm given a set of 3D points (x,y,z) , The first coordinate is unique. I've to choose a longest subset so that the points are increasing.

(a,b,c) < (p,q,r) if (a<p AND b<q AND c<r)..

Say the points were (3,6,3) , (2,5,1), (-1,4,2) (-5,0,0). then longest subset is (-5,0,0) (-1,4,2) (3,6,3).
Say the points were (3,6,3) , (2,5,2), (-1,4,3) (-5,0,0). then longest subset is (-5,0,0) (2,5,2) (3,6,3).

I could think some reduction. If we sort by the x co-ordinate, then we need to find the longest increasing subsequence of (y,z). So the problem boils down to

(0,0) (5,1) (4,2) (6,3) -> 3

(0,0) (4,3) (5,2) (6,3) -> 3

We cannot extend the 1D LIS algorithm to 2D because the choosing of the point (5,1) or (4,2) makes the difference. When we know that (x,y) != (a,b) then we cannot make a decision of the points to choose. In 1D , if x!=y, either x<y or x>y but it doesn't generalize in 2D.

How do we solve this problem in less than O(n^2) ? How to extend this to n dimensional points.
niklaus
New poster

Posts: 6
Joined: Thu Nov 16, 2006 2:34 pm

### Re: Longest ordered subset

You can solve it in O(n log^2 n) with 2D range trees.
mf
Guru

Posts: 1244
Joined: Mon Feb 28, 2005 4:51 am
Location: Zürich, Switzerland

### Re: Longest ordered subset

Can you throw some light on it ? . But how does range trees apply here .

tryit1
Experienced poster

Posts: 119
Joined: Sun Aug 24, 2008 9:09 pm

### Re: Longest ordered subset

You can make a range tree which stores a set of 2D points, some value for each of them, and answers queries of these forms: 1) 'what is the largest value, associated with any point (x, y), such that x<=x0, y<=y0', and 2) 'increase value of point (x, y) to z'.

After you have this structure, you can use it to accelerate the usual O(n^2) dynamic programming algorithm. First, build a range tree which contains the points {(y, z) | (x,y,z) is an input point}, and associate a value of zero with each of them. Then for each point (x_i,y_i,z_i) in increasing order of x coordinate, ask the range tree what's the largest value associated with any point (y, z) such that y<=y_i and z<=z_i, and change the value of point (y_i, z_i) to that value plus 1. In the end, the maximum of all point's values will be the answer.
mf
Guru

Posts: 1244
Joined: Mon Feb 28, 2005 4:51 am
Location: Zürich, Switzerland

### Re: Longest ordered subset

How do you compute when there are 2 largest values ? Why isn't it the smallest largest value.

What if we have x,y < (x1,y1) (x2,y2) ie x<x1 && y<y1 and x<x2 && y<y2 but x1 < x2 and y1> y2.

Which one do i choose ?

My understanding of your post is if we have a sequence x1 < x2 < x3 <x4... for x1 you find the next smallest value xk> x1 and do v[xk]=v[x1]+1.

For ties , my understanding is to increase all of v[k's] values. I think i miss your insight.

tryit1
Experienced poster

Posts: 119
Joined: Sun Aug 24, 2008 9:09 pm

### Re: Longest ordered subset

Sorry, I don't quite understand what you mean.

My understanding of your post is if we have a sequence x1 < x2 < x3 <x4... for x1 you find the next smallest value xk> x1 and do v[xk]=v[x1]+1.

Here's the DP algorithm that I use:
Code: Select all
`Input: coordinates of N points x[1..N], y[1..N], z[1..N], sorted by x.for i = 1..N:    v[i] = 1    for j = 1..i-1:        if y[j] <= y[i] and z[j] <= z[i]:            v[i] = max(v[i], v[j] + 1)output max(v[1], ..., v[N])`

What I propose is: use a 2D range tree to accelerate the inner loop - it can be implemented as a query to a range tree in O(log^2 n) time.
mf
Guru

Posts: 1244
Joined: Mon Feb 28, 2005 4:51 am
Location: Zürich, Switzerland

### Re: Longest ordered subset

Code: Select all
`    for j = 1..i-1:        if y[j] <= y[i] and z[j] <= z[i]:            v[i] = max(v[i], v[j] + 1)`

When you say this .

{ 1,1 }, { 2,3 }, { 3,2 }, { 4,3 }-> 1st,3rd,4th
{ 1,1 }, { 2,3 }, { 3,2 }, { 3,4 }-> 1st,2nd,4th

When you are at (6,3) , you have to check for both (5,1) (4,2) in first case and (4,3) (5,2) in second case where as range tree gives only 1 value which is either 2nd or 3rd pair.
Last edited by tryit1 on Tue Feb 24, 2009 11:46 am, edited 1 time in total.

tryit1
Experienced poster

Posts: 119
Joined: Sun Aug 24, 2008 9:09 pm

### Re: Longest ordered subset

Sorry, but I don't get you.

Are you saying there is a problem with the DP algorithm above? Or you don't believe that range trees can be used to replace the inner loop in it?
mf
Guru

Posts: 1244
Joined: Mon Feb 28, 2005 4:51 am
Location: Zürich, Switzerland

### Re: Longest ordered subset

{ 1,1 }, { 2,3 }, { 3,2 }, { 4,3 }-> 1st,3rd,4th
{ 1,1 }, { 2,3 }, { 3,2 }, { 3,4 }-> 1st,2nd,4th

{ 1,2 }, { 2,3 }, { 3,2 }, { 4,4 }-> 1st,2nd,4th ,

When we come to (4,4) we should select (2,3) not (3,2) even though (3,2) < (4,4).

Dp algorithm is correct but in the range tree , it has to return a set of points, namely { 2,3 }, { 3,2 } for the above example

length 1-> (1,1)
length 2-> ( {2,3} , {3,2} )
When we get point (4,3) you have to see if it matches any in length 2 , then update otherwise in length1 .

In the range tree if we have points

(3,4) (1,6)

which point do i get to put in its parent ? (3,4) or (1,6) , it depends on the next element. If the next element is (4,5) i put (3,4) because (3,4) (4,5) is good.

if the next element is (2,7) i put (1,6) to get (1,6) (2,7). You always have to look at the successive elements.

tryit1
Experienced poster

Posts: 119
Joined: Sun Aug 24, 2008 9:09 pm

### Re: Longest ordered subset

Query to a range tree returns a single number, not a subset of points. The tree's responsibility is to organize the elements in such a way, that it can quickly "look" at all points (x, y), such that x<=x0 and y<=y0, and pick the point with maximum value. Range trees can accomplish this "looking" much faster than a naive algorithm, which examines these elements one by one, thanks to their special structure.

Here's my prototype code in Python, I hope it'll clarify some things:
Code: Select all
`#!/usr/bin/env python2.5import randomclass RangeTree(object):    def __init__(self, P):  # P is a list of (x, y) pairs.        xs = sorted(set([x for (x, y) in P]))        ys = sorted(set([y for (x, y) in P]))        self.ys = ys        self.fenwick = [0] * len(ys)      # 1D Fenwick tree on y axis        if len(xs) > 1:            self.mid = xs[len(xs) // 2]   # median x coordinate            self.left  = RangeTree(filter(lambda (x, y): x < self.mid, P))            self.right = RangeTree(filter(lambda (x, y): x >= self.mid, P))        else:            self.mid = None  # indicates that we're a leaf in outer tree    def yrank(self, y):      # returns largest i such that ys[i] <= y        a = -1        b = len(self.ys)        while b - a > 1:            c = (a + b) >> 1            if self.ys[c] <= y:                a = c            else:                b = c        return a    def query1(self, y):        i = self.yrank(y)        res = 0        while i >= 0:            res = max(res, self.fenwick[i])            i = (i & (i + 1)) - 1        return res    def increase1(self, y, val):        i = self.yrank(y - 1) + 1        while i < len(self.fenwick):            self.fenwick[i] = max(self.fenwick[i], val)            i |= i + 1    def query(self, x, y):        """Return max(value(x', y')) over all points (x', y'),           represented by this subtree, such that x' <= x and y' <= y"""        if self.mid is None:            return self.query1(y)        elif x < self.mid:            return self.left.query(x, y)        else:            return max(self.left.query1(y),                       self.right.query(x, y))    def increase(self, x, y, val):        """Changes the value of point (x, y) to the maximum of val and its old value."""        self.increase1(y, val)        if self.mid is not None:            if x < self.mid:                self.left.increase(x, y, val)                self.right.increase1(y, val)            else:                self.right.increase(x, y, val)# Solution for the main problem.# Input:#   a list of (x, y, z) triples# Output:#   length of longest sequences of points (x1,y1,z1),...,(x_k,y_k,z_k),#   such that x_i <= x_{i+1}, y_i <= y_{i+1}, z_i <= z_{i+1}# Assumes that all x coordinates are distinct.def solution(input):    input = sorted(input)                           # sort by x    T = RangeTree([(y, z) for (x, y, z) in input])  # build tree    res = 0    for x, y, z in input:        v = 1 + T.query(y, z)        res = max(res, v)        T.increase(y, z, v)    return resassert solution([(3,6,3), (2,5,1), (-1,4,2), (-5,0,0)]) == 3assert solution([(3,6,3), (2,5,2), (-1,4,3), (-5,0,0)]) == 3#  unit test for RangeTree:class DumbRangeTree(object):    def __init__(self, P):        self.d = dict()        for x, y in P: self.d[x, y] = 0    def query(self, x0, y0):        res = 0        for x, y in self.d.iterkeys():            if x <= x0 and y <= y0:                res = max(res, self.d[x, y])        return res    def increase(self, x, y, val):        self.d[x, y] = max(self.d[x, y], val)def testRangeTree():    r = random.Random(42)    for test_no in range(10):        print 'Test %d' % test_no        N = r.randint(1, 1000)        M = 1000000000        P = [(r.randint(0, M), r.randint(0, M)) for each in range(N)]        tree1 = DumbRangeTree(P)        tree2 = RangeTree(P)        for each in range(10000):            if r.randint(1, 100) < 30:                x, y = r.choice(P)                z = r.randint(1, M)                tree1.increase(x, y, z)                tree2.increase(x, y, z)            else:                x, y = r.choice(P)            assert tree1.query(x, y) == tree2.query(x, y)testRangeTree()`
mf
Guru

Posts: 1244
Joined: Mon Feb 28, 2005 4:51 am
Location: Zürich, Switzerland

### Re: Longest ordered subset

---------------------------------------Point P
-------------Point X
************
************
---------------------Point Y
********************
-------------------------- Point Z
***************************
***************************
***************************

The sequence of points is X , Y,Z,P. When it comes to P which one do you choose X or Y or Z? I am asking this because the sequence until X can be longest or Y can be longest or Z can be longest.

If you choose based on x direction , then you will choosing point Z, if you choose Y direction , you will be choosing point X . But we could have a point R that is less than Y. So the answer will be R, Y, P.

---------------------------------------Point P
-------------Point X
************
************
---------------------Point Y
********************
-------------- PointR
-------------------------- Point Z
***************************
***************************
***************************

tryit1
Experienced poster

Posts: 119
Joined: Sun Aug 24, 2008 9:09 pm

### Re: Longest ordered subset

The tree chooses the point, which has the largest value, associated with it.
And the point's value here is computed by DP - it's the length of longest increasing sequence of points ending at that point.

In your second case, it'll choose point Y, as it has a value of 1 (thanks to point R), while X and Z have 0. In the first case, it can choose any of X,Y,Z.
mf
Guru

Posts: 1244
Joined: Mon Feb 28, 2005 4:51 am
Location: Zürich, Switzerland

### Re: Longest ordered subset

mf wrote:
Code: Select all
`    def increase(self, x, y, val):        """Changes the value of point (x, y) to the maximum of val and its old value."""        self.increase1(y, val)        if self.mid is not None:            if x < self.mid:                self.left.increase(x, y, val) *************  self.right.increase1(y, val) ********************            else:                self.right.increase(x, y, val)`

************* self.right.increase1(y, val) ********************
What does this line do ? I feel it is wrong , can you explain it to me. I understand the function increase without that line.

tryit1
Experienced poster

Posts: 119
Joined: Sun Aug 24, 2008 9:09 pm

### Re: Longest ordered subset

Hm... the code seems to pass the tests without it, so I guess that line is just a harmless mistake.

I would appreciate it if you don't ask me questions about code that I wrote many months ago, because after such a long time I might have no clue what it does (nor why I wrote it that way), too. :)

(And who does...)
mf
Guru

Posts: 1244
Joined: Mon Feb 28, 2005 4:51 am
Location: Zürich, Switzerland

### Re: Longest ordered subset

i'm really thankful to your replies and help on problems. I sometimes don't understand or vaguely understand what you wrote. I come back later and on an another day i re-read it and understand it. Only this time it was months ago

Well i tried commenting that line and running it and it passed.

But conceptually i thought it was wrong.Most of the times my understanding about the problem /code had to be changed because it was wrong.

What i understood from below code is at every root where (x,y) is possible you increase the value of that particular y. Now if x is found in left subtree , you do a increase for left subtree and otherwise for right subtree. The recursion will take care until the leaf ie (log n) levels.

Lets say our node is left most leaf. So everything (y val) in the left subtree of root gets updated. But why are we increasing in right node of root with y . It should be wrong. But funny thing was i couldn't find an example where it will go wrong. So i thought it was necessary. It doesn't fail because it does extra work but should have failed on subsequent queries.

I feel it works because you are making only 1 query per test case ie you are not querying for the updated values otherwise you would have failed. Maybe i'm wrong.

Code: Select all
`  def increase(self, x, y, val):        """Changes the value of point (x, y) to the maximum of val and its old value."""        self.increase1(y, val)        if self.mid is not None:            if x < self.mid:                self.left.increase(x, y, val)*************  self.right.increase1(y, val) ********************            else:                self.right.increase(x, y, val)`

tryit1
Experienced poster

Posts: 119
Joined: Sun Aug 24, 2008 9:09 pm

Next