xhSong's Blog

pku-3368

2009年8月27日 11:19

Frequent valuesTime

Limit: 2000MS Memory Limit: 65536K
Total Submissions: 3831 Accepted: 1314

Description

You are given a sequence of n integers a1 , a2 , ... , an in non-decreasing order. In addition to that, you are given several queries consisting of indices i and j (1 ≤ i ≤ j ≤ n). For each query, determine the most frequent value among the integers ai , ... , aj.

Input

The input consists of several test cases. Each test case starts with a line containing two integers n and q (1 ≤ n, q ≤ 100000). The next line contains n integers a1 , ... , an (-100000 ≤ ai ≤ 100000, for each i ∈ {1, ..., n}) separated by spaces. You can assume that for each i ∈ {1, ..., n-1}: ai ≤ ai+1. The following q lines contain one query each, consisting of two integers i and j (1 ≤ i ≤ j ≤ n), which indicate the boundary indices for the
query.

The last test case is followed by a line containing a single 0.

Output

For each query, print one line with one integer: The number of occurrences of the most frequent value within the given range.

Sample Input
10 3
-1 -1 1 1 1 1 3 10 10 10
2 3
1 10
5 10
0

Sample Output
1
4
3

Source
Ulm Local 2007

解：

很显然，这题要用离散化，首先将原始数组a[1..n]，变成另外一个数组b[1..k]，b[i]表示a数组中第i种数的个数（显然k<=n），再用一个数组c[1..k]来记录数组b的前i项和，求频率最高数的时候，我们只要理清所要查询的区间中含有多少个完整的b[i]，求出这些b[i]中的最大值，再于两端边界比较，就能得到结果了。
就拿样例来说，这时a[]={0,-1,-1,1,1,1,1,3,10,10,10}，b[]={0,2,4,1,3}，c[]={0,2,6,7,10}，k=4；

当查询5 10时，通过对c数组的查询可以得到有两个完整的b[i]，就是b[3]=1,b[4]=3，这两个中的最大值是3，再处理边界情况，a[5]到a[6]有两个1，比3小，故可得结果为3。

由于本人的语言表达能力有限，下面贴出代码，请结合代码理解：

线段树求最大值法：

#include <cstdio>
#include <algorithm>
#define MAX 100010
#define Max(a,b) (a>b?a:b)

int a[MAX],b[MAX],c[MAX];

typedef struct node {
    int i,j,max;
    struct node *left,*right;
}segTree;

segTree t[MAX*2];
int used=0;

segTree *creat(int i,int j)
{
    if(i>j){
        return NULL;
    }
    segTree *T=&t[used++];
    T->i = i , T->j = j;
    if(i==j){
        T->left = T->right =NULL;
        T->max = b[i];
    }
    else {
        T->left = creat(i,(i+j)/2);
        T->right = creat((i+j)/2+1,j);
        T->max = Max(T->left->max,T->right->max);
    }
    return T;
}

int search(segTree *T,int i,int j)
{
    if(T== NULL || T->i >j || T->j <i){
        return 0;
    }
    if(T->i>=i && T->j<=j){
        return T->max;
    }
    int leftmax = search(T->left,i,j),rightmax = search(T->right,i,j);
    return Max(leftmax,rightmax);
}

int main()
{
    int n,q,x,y,k;
    while(scanf("%d",&n),n>0){
        scanf("%d",&q);
        for(int i=1;i<=n;i++){
            scanf("%d",&a[i]);
        }
        b[1]=1,x=a[1],k=1;
        for(int i=2;i<=n;i++){
            if(x==a[i]){
                b[k]++;
            }
            else {
                b[++k]=1;
                x=a[i];
            }
        }
        segTree *T=creat(1,k);
        c[0]=0;
        for(int i=1;i<=k;i++){
            c[i]=c[i-1]+b[i];
        }
        c[++k]=100001;
        for(int i=0;i<q;i++){
            scanf("%d%d",&x,&y);
            int tempi = std::lower_bound(c,c+k+1,x-1) - c +1 ;/*这里处理的一些细节要注意一下*/
            int tempj = std::lower_bound(c,c+k+1,y+1) - c- 1;
            if(tempi>tempj+1){
                printf("%d\n",y-x+1);
            }
            else if(tempi==tempj+1){
                printf("%d\n",Max(c[tempj]-x+1,y-c[tempj]));
            }
            else {
                int max = search(T,tempi,tempj);
                max = Max(max,c[tempi-1]-x+1);
                max = Max(max,y-c[tempj]);
                printf("%d\n",max);
            }
        }
    }
}

很久以前就听说有一个RMQ问题貌似是求区间极大极小值，于是baidu了一下，原来RMQ (Range Minimum/Maximum Query)问题是指：对于长度为n的数列A，回答若干询问RMQ(A,i,j)(i,j<=n)，返回数列A中下标在[i,j]里的最小(大）值。这里有两种较优的方法，一是上面写的线段树求法，二是没学过的动态规划ST算法（Sparse Table）。先简要介绍一下ST算法：以求最大值为例，设d[i,j]表示[i,i+2^j-1]这个区间内的最大值，那么在询问到[a,b]区间的最大值时答案就是max(d[a,k], d[b-2^k+1,k])，其中k是满足2^k<=b-a的最大的k,即k=[ln(b-a+1)/ln(2)]，d的求法可以用动态规划，d[i,j]=max(d[i,j-1],d[i+2^(j-1),j-1])（摘自百度百科）。学习了这种算法以后，就试着用这种方法来解这道题，于是又下面代码：

#include <cstdio>
#include <cmath>
#include <algorithm>
#define MAX 100010
#define Max(a,b) (a>b?a:b)

int a[MAX],b[MAX][20],c[MAX];
void make(int n)
{
    int k=(int)(log(n*1.0)/log(2.0));
    for(int j=1;j<=k;j++){
        for(int i=1;i<=n-(1<<j)+1;i++){
            b[i][j]=Max(b[i][j-1],b[i+(1<<(j-1))][j-1]);
        }
    }
}
int search(int i,int j)
{
    int k=(int)(log(j-i+1.0)/log(2.0));
    int rmq=Max(b[i][k], b[j - (1 <<k) + 1][k]);
    return rmq;
}

int main()
{
    int n,q,x,y,k;
    while(scanf("%d",&n),n>0){
        scanf("%d",&q);
        for(int i=1;i<=n;i++){
            scanf("%d",&a[i]);
        }
        b[1][0]=1,x=a[1],k=1;
        for(int i=2;i<=n;i++){
            if(x==a[i]){
                b[k][0]++;
            }
            else {
                b[++k][0]=1;
                x=a[i];
            }
        }
        make(k);
        c[0]=0;
        for(int i=1;i<=k;i++){
            c[i]=c[i-1]+b[i][0];
        }
        c[++k]=100001;
        for(int i=0;i<q;i++){
            scanf("%d%d",&x,&y);
            int tempi = std::lower_bound(c,c+k+1,x-1) - c +1 ;
            int tempj = std::lower_bound(c,c+k+1,y+1) - c- 1;
            if(tempi>tempj+1){
                printf("%d\n",y-x+1);
            }
            else if(tempi==tempj+1){
                printf("%d\n",Max(c[tempj]-x+1,y-c[tempj]));
            }
            else {
                int max = search(tempi,tempj);
                max = Max(max,c[tempi-1]-x+1);
                max = Max(max,y-c[tempj]);
                printf("%d\n",max);
            }
        }
    }
}

注意，不要把int k=(int)(log(n*1.0)/log(2.0));写成int k=(int)(log(n)/log(2));用G++交不会有事，可用C++交就会
        math.h(567): could be 'long double log(long double)'
        math.h(519): or       'float log(float)'
        math.h(121): or       'double log(double)'
        while trying to match the argument list '(int)'
开始没看懂意思，CE了n多次，血的教训呀……

比较两种算法，先从空间复杂度来说，线段树是o(n)的，ST是o(nlogn)，就这点而言线段树要优于ST；

再看预处理，线段树的预处理的o(n)的，而ST的预处理是o(nlogn)的，
而对于一次查询，线段树是o(logn)的，ST更快，是o(1)的，
显然从时间的复杂度来说，他们各有长处，当n远远大于q（查询次数）时，线段树是占优势的，而对于q远远大于n的情况，ST就更占优势了

而对于代码复杂度，ST要优于线段树

由于这题n的范围和p的范围相同，于是两种算法错不多：
线段树：3368 Accepted 2684K 657MS C++ 2054B
ST：3368 Accepted 4224K 532MS C++ 1650B

还有一点不能忘记，线段树是在线的算法，而ST是离线的……

hustsxh

分类

最新评论

最新留言

链接

功能

pku-3368

2009年8月27日 11:19