pku-3368

August 27, 2009, 11:19

Frequent values

Time Limit: 2000MS  Memory Limit: 65536K
Total Submissions: 3831  Accepted: 1314


Description

You are given a sequence of n integers a1 , a2 , ... , an in non-decreasing order. In addition to that, you are given several queries consisting of indices i and j (1 ≤ i ≤ j ≤ n). For each query, determine the most frequent value among the integers ai , ... , aj.

Input

The input consists of several test cases. Each test case starts with a line containing two integers n and q (1 ≤ n, q ≤ 100000). The next line contains n integers a1 , ... , an (-100000 ≤ ai ≤ 100000, for each i ∈ {1, ..., n}) separated by spaces. You can assume that for each i ∈ {1, ..., n-1}: ai ≤ ai+1. The following q lines contain one query each, consisting of two integers i and j (1 ≤ i ≤ j ≤ n), which indicate the boundary indices for the query.

The last test case is followed by a line containing a single 0.

Output

For each query, print one line with one integer: The number of occurrences of the most frequent value within the given range.

Sample Input
10 3
-1 -1 1 1 1 1 3 10 10 10
2 3
1 10
5 10
0

Sample Output
1
4
3

Source
Ulm Local 2007

Solution:

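Clearly this problem calls for compressing the sequence into runs of equal values (a form of discretization). First turn the original array a[1..n] into another array b[1..k], where b[i] is the number of occurrences of the i-th distinct value in a (obviously k<=n). Then use an array c[1..k] to store the prefix sums of b, so c[i]=b[1]+...+b[i]. To find the highest frequency in a query range, work out which b[i] lie completely inside the range, take the maximum of those b[i], and compare it with the partial groups at the two ends of the range.
Take the sample: here a[]={0,-1,-1,1,1,1,1,3,10,10,10}, b[]={0,2,4,1,3}, c[]={0,2,6,7,10}, k=4 (index 0 is padding so that the arrays are 1-indexed; c[0]=0 is the empty prefix sum).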

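For the query 5 10, looking up the c array shows that two complete b[i] lie inside the range, namely b[3]=1 and b[4]=3, whose maximum is 3. Then handle the boundary: a[5] to a[6] holds two 1s, and 2 is less than 3, so the answer is 3.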
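The idea above can be sketched on its own, before any RMQ structure is involved. The following is a minimal illustration, not the author's solution: the names Runs, buildRuns and mostFrequentCount are made up here, the input vector is 0-indexed while query positions stay 1-based, and the maximum over the complete groups is taken with a plain loop, which is too slow for the full limits but keeps the boundary handling easy to see.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper type for illustration only.
// Both arrays are 1-indexed with a dummy entry at index 0, as in the example.
struct Runs {
    std::vector<int> b;  // b[g] = length of the g-th run of equal values
    std::vector<int> c;  // c[g] = b[1] + ... + b[g], with c[0] = 0
};

// a is the non-decreasing input sequence (0-indexed here).
Runs buildRuns(const std::vector<int>& a) {
    Runs r;
    r.b.push_back(0);
    r.c.push_back(0);
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (i > 0 && a[i] == a[i - 1]) {
            ++r.b.back();      // same value: extend the current run
        } else {
            r.b.push_back(1);  // new value: start a new run
        }
    }
    for (std::size_t g = 1; g < r.b.size(); ++g) {
        r.c.push_back(r.c[g - 1] + r.b[g]);
    }
    return r;
}

// Highest frequency among positions x..y (1-based, inclusive, x <= y).
int mostFrequentCount(const Runs& r, int x, int y) {
    // first = smallest g whose run starts at or after x, i.e. c[g-1] >= x-1
    int first = static_cast<int>(
        std::lower_bound(r.c.begin(), r.c.end(), x - 1) - r.c.begin()) + 1;
    // last = largest g whose run ends at or before y, i.e. c[g] <= y
    int last = static_cast<int>(
        std::upper_bound(r.c.begin(), r.c.end(), y) - r.c.begin()) - 1;
    if (first > last + 1) {
        return y - x + 1;  // the whole range lies strictly inside one run
    }
    int best = 0;
    for (int g = first; g <= last; ++g) {
        best = std::max(best, r.b[g]);          // complete runs (linear scan)
    }
    best = std::max(best, r.c[first - 1] - x + 1);  // partial run on the left
    best = std::max(best, y - r.c[last]);           // partial run on the right
    return best;
}
```

On the sample sequence, buildRuns produces the b and c arrays listed above, and the queries 2 3, 1 10 and 5 10 return 1, 4 and 3. Replacing the linear scan over the complete runs with a range-maximum structure is what the two full solutions do.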

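Since I am not great at explaining things in words, the code is posted below; please read the explanation together with the code: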

 Solution using a segment tree for the maximum:

#include <cstdio>
#include <algorithm>
#define MAX 100010
#define Max(a,b) ((a)>(b)?(a):(b))

int a[MAX],b[MAX],c[MAX];

typedef struct node {
    int i,j,max;
    struct node *left,*right;
}segTree;

segTree t[MAX*2];
int used=0;

segTree *creat(int i,int j)
{
    if(i>j){
        return NULL;
    }
    segTree *T=&t[used++];
    T->i = i , T->j = j;
    if(i==j){
        T->left = T->right =NULL;
        T->max = b[i];
    }
    else {
        T->left = creat(i,(i+j)/2);
        T->right = creat((i+j)/2+1,j);
        T->max = Max(T->left->max,T->right->max);
    }
    return T;
}

int search(segTree *T,int i,int j)
{
    if(T== NULL || T->i >j || T->j <i){
        return 0;
    }
    if(T->i>=i && T->j<=j){
        return T->max;
    }
    int leftmax = search(T->left,i,j),rightmax = search(T->right,i,j);
    return Max(leftmax,rightmax);
}

int main()
{
    int n,q,x,y,k;
    while(scanf("%d",&n),n>0){
        scanf("%d",&q);
        for(int i=1;i<=n;i++){
            scanf("%d",&a[i]);
        }
        b[1]=1,x=a[1],k=1;
        for(int i=2;i<=n;i++){
            if(x==a[i]){
                b[k]++;
            }
            else {
                b[++k]=1;
                x=a[i];
            }
        }
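        used=0; /* reuse the node pool for every test case; otherwise t[] can overflow across test cases */
        segTree *T=creat(1,k);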
        c[0]=0;
        for(int i=1;i<=k;i++){
            c[i]=c[i-1]+b[i];
        }
        c[++k]=100001;
        for(int i=0;i<q;i++){
            scanf("%d%d",&x,&y);
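            /* Mind the off-by-one details here:
               tempi = first group that starts at or after x,
               tempj = last group that ends at or before y. */
            int tempi = std::lower_bound(c,c+k+1,x-1) - c +1 ;
            int tempj = std::lower_bound(c,c+k+1,y+1) - c- 1;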
            if(tempi>tempj+1){
                printf("%d\n",y-x+1);
            }
            else if(tempi==tempj+1){
                printf("%d\n",Max(c[tempj]-x+1,y-c[tempj]));
            }
            else {
                int max = search(T,tempi,tempj);
                max = Max(max,c[tempi-1]-x+1);
                max = Max(max,y-c[tempj]);
                printf("%d\n",max);
            }
        }
    }
}

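I had long heard of something called the RMQ problem, apparently about finding the minimum or maximum over an interval, so I looked it up on Baidu. The RMQ (Range Minimum/Maximum Query) problem is: given a sequence A of length n, answer a number of queries RMQ(A,i,j) (i,j<=n), each returning the minimum (maximum) value of A over the indices [i,j]. There are two good approaches: the first is the segment tree used above, and the second is the dynamic-programming ST algorithm (Sparse Table), which I had not learned before.

A brief introduction to the ST algorithm, taking the maximum as the example: let d[i,j] be the maximum over the interval [i,i+2^j-1]. Then the maximum over a queried interval [a,b] is max(d[a,k], d[b-2^k+1,k]), where k is the largest k satisfying 2^k<=b-a+1, i.e. k=[ln(b-a+1)/ln(2)] (square brackets denote rounding down). d can be computed by dynamic programming: d[i,j]=max(d[i,j-1],d[i+2^(j-1),j-1]) (quoted from Baidu Baike); the base case is d[i,0]=A[i]. After learning this algorithm I tried solving this problem with it, which gave the code below: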

 

#include <cstdio>
#include <cmath>
#include <algorithm>
#define MAX 100010
#define Max(a,b) ((a)>(b)?(a):(b))

int a[MAX],b[MAX][20],c[MAX];
void make(int n)
{
    int k=(int)(log(n*1.0)/log(2.0));
    for(int j=1;j<=k;j++){
        for(int i=1;i<=n-(1<<j)+1;i++){
            b[i][j]=Max(b[i][j-1],b[i+(1<<(j-1))][j-1]);
        }
    }
}
int search(int i,int j)
{
    int k=(int)(log(j-i+1.0)/log(2.0));
    int rmq=Max(b[i][k], b[j - (1 <<k) + 1][k]);
    return rmq;
}

int main()
{
    int n,q,x,y,k;
    while(scanf("%d",&n),n>0){
        scanf("%d",&q);
        for(int i=1;i<=n;i++){
            scanf("%d",&a[i]);
        }
        b[1][0]=1,x=a[1],k=1;
        for(int i=2;i<=n;i++){
            if(x==a[i]){
                b[k][0]++;
            }
            else {
                b[++k][0]=1;
                x=a[i];
            }
        }
        make(k);
        c[0]=0;
        for(int i=1;i<=k;i++){
            c[i]=c[i-1]+b[i][0];
        }
        c[++k]=100001;
        for(int i=0;i<q;i++){
            scanf("%d%d",&x,&y);
            int tempi = std::lower_bound(c,c+k+1,x-1) - c +1 ;
            int tempj = std::lower_bound(c,c+k+1,y+1) - c- 1;
            if(tempi>tempj+1){
                printf("%d\n",y-x+1);
            }
            else if(tempi==tempj+1){
                printf("%d\n",Max(c[tempj]-x+1,y-c[tempj]));
            }
            else {
                int max = search(tempi,tempj);
                max = Max(max,c[tempi-1]-x+1);
                max = Max(max,y-c[tempj]);
                printf("%d\n",max);
            }
        }
    }
}

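Note: do not write int k=(int)(log(n*1.0)/log(2.0)); as int k=(int)(log(n)/log(2));. Submitting with G++ causes no problem, but submitting with the judge's C++ compiler option fails to compile, because a call to log with an int argument is ambiguous among the overloads: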
        math.h(567): could be 'long double log(long double)'
        math.h(519): or       'float log(float)'
        math.h(121): or       'double log(double)'
        while trying to match the argument list '(int)'
At first I did not understand the message and got CE (Compile Error) many times. A lesson learned the hard way...
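One way to sidestep the log overload issue altogether is to avoid floating-point logarithms and precompute floor(log2(len)) with integers. The sketch below is an alternative formulation, not the code that was submitted: SparseTableMax, lg and d are names chosen here for illustration, and the table is 0-indexed and stored level by level.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical sparse table for range maximum, using an integer log table
// instead of calls to log().
struct SparseTableMax {
    std::vector<int> lg;              // lg[len] = floor(log2(len)) for len >= 1
    std::vector<std::vector<int>> d;  // d[j][i] = max of v[i .. i + 2^j - 1]

    explicit SparseTableMax(const std::vector<int>& v) {
        int n = static_cast<int>(v.size());
        lg.assign(n + 1, 0);
        for (int len = 2; len <= n; ++len) {
            lg[len] = lg[len / 2] + 1;  // integer recurrence, no rounding issues
        }
        d.assign(lg[n] + 1, v);         // level 0 is the sequence itself
        for (int j = 1; j <= lg[n]; ++j) {
            for (int i = 0; i + (1 << j) <= n; ++i) {
                d[j][i] = std::max(d[j - 1][i], d[j - 1][i + (1 << (j - 1))]);
            }
        }
    }

    // Maximum of v[l..r], 0-indexed and inclusive; requires l <= r.
    int query(int l, int r) const {
        assert(l <= r);
        int j = lg[r - l + 1];
        return std::max(d[j][l], d[j][r - (1 << j) + 1]);
    }
};
```

For example, building the table over the run lengths {2, 4, 1, 3} from the sample and calling query(2, 3) returns 3, the maximum of the last two runs. The integer table costs O(n) extra memory but removes any dependence on how a particular compiler resolves or rounds log.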

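Comparing the two algorithms: in terms of space complexity, the segment tree is O(n) and ST is O(nlogn), so on this point the segment tree is better than ST;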

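Next, preprocessing: building the segment tree is O(n), while ST preprocessing is O(nlogn).
For a single range-maximum query, the segment tree takes O(logn), while ST is faster at O(1).
So in terms of time complexity each has its strengths: when n is much larger than q (the number of queries) the segment tree has the advantage, and when q is much larger than n, ST has the advantage.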
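Adding up the preprocessing and per-query costs stated above gives the total time for q queries. This counts only the range-maximum part; in both solutions as written, the binary search over c adds a further O(log n) per query.

```latex
\begin{aligned}
T_{\text{segment tree}} &= O(n) + q \cdot O(\log n) = O(n + q \log n),\\
T_{\text{ST}} &= O(n \log n) + q \cdot O(1) = O(n \log n + q).
\end{aligned}
```

When q is small relative to n, the first total is dominated by n and the second by n log n; when q is much larger than n, the first grows like q log n and the second like q; with n and q of the same order, both are O(n log n).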

As for code complexity, ST is simpler than the segment tree.

Since n and q have the same range in this problem, the two algorithms perform about the same:
Segment tree: 3368 Accepted 2684K 657MS C++ 2054B
ST: 3368 Accepted 4224K 532MS C++ 1650B

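One more point not to forget: the segment tree is the "online" structure in the sense that it can absorb updates to the data between queries, while ST is static: its table is precomputed once and has to be rebuilt if the data changes.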
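To make this last point concrete, here is a small sketch of a segment tree that supports changing one element between queries. It is an assumption-laden illustration rather than the pointer-based tree used in the solution: MaxSegTree, update and query are names invented here, the layout is the compact array form with leaves stored at positions n..2n-1, and indices are 0-based. The problem itself never modifies the sequence, so none of this is needed to get accepted.

```cpp
#include <algorithm>
#include <climits>
#include <vector>

// Hypothetical array-based segment tree for range maximum with point updates.
struct MaxSegTree {
    int n;
    std::vector<int> t;  // t[1] is the root; leaves are t[n .. 2n-1]

    explicit MaxSegTree(const std::vector<int>& v)
        : n(static_cast<int>(v.size())), t(2 * v.size(), 0) {
        for (int i = 0; i < n; ++i) t[n + i] = v[i];
        for (int i = n - 1; i >= 1; --i) t[i] = std::max(t[2 * i], t[2 * i + 1]);
    }

    // Set element pos (0-indexed) to value and repair its ancestors: O(log n).
    void update(int pos, int value) {
        int i = pos + n;
        t[i] = value;
        for (i /= 2; i >= 1; i /= 2) {
            t[i] = std::max(t[2 * i], t[2 * i + 1]);
        }
    }

    // Maximum over [l, r], 0-indexed and inclusive: O(log n).
    int query(int l, int r) const {
        int best = INT_MIN;
        for (l += n, r += n + 1; l < r; l /= 2, r /= 2) {
            if (l & 1) best = std::max(best, t[l++]);
            if (r & 1) best = std::max(best, t[--r]);
        }
        return best;
    }
};
```

Built over {2, 4, 1, 3}, query(0, 3) returns 4; after update(1, 0) the same query returns 3 without rebuilding anything. A sparse table would need its O(n log n) preprocessing redone after such a change.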