pku-3368
2009年8月27日 11:19
Frequent values
Time Limit: 2000MS Memory Limit: 65536K
Total Submissions: 3831 Accepted: 1314
Description
You are given a sequence of n integers a1 , a2 , ... , an in non-decreasing order. In addition to that, you are given several queries consisting of indices i and j (1 ≤ i ≤ j ≤ n). For each query, determine the most frequent value among the integers ai , ... , aj.
Input
The input consists of several test cases. Each test case starts with a line containing two integers n and q (1 ≤ n, q ≤ 100000). The next line contains n integers a1 , ... , an (-100000 ≤ ai ≤ 100000, for each i ∈ {1, ..., n}) separated by spaces. You can assume that for each i ∈ {1, ..., n-1}: ai ≤ ai+1. The following q lines contain one query each, consisting of two integers i and j (1 ≤ i ≤ j ≤ n), which indicate the boundary indices for the query.
The last test case is followed by a line containing a single 0.
Output
For each query, print one line with one integer: The number of occurrences of the most frequent value within the given range.
Sample Input
10 3
-1 -1 1 1 1 1 3 10 10 10
2 3
1 10
5 10
0
Sample Output
1
4
3
Source
Ulm Local 2007
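Solution:
Clearly, this problem calls for discretization. First compress the original array a[1..n] into another array b[1..k], where b[i] is the number of occurrences of the i-th distinct value in a (so k<=n). Then use an array c[1..k] to store the prefix sums of b, i.e. c[i]=b[1]+...+b[i]. To answer a query, work out which complete runs b[i] lie entirely inside the queried range, take the maximum of those b[i], and compare it with the two partial runs at the ends of the range; the largest of these is the answer.
Take the sample: a[]={0,-1,-1,1,1,1,1,3,10,10,10}, b[]={0,2,4,1,3}, c[]={0,2,6,7,10}, k=4 (index 0 is only a placeholder, since the arrays are 1-based).
For the query 5 10, looking up c shows that two complete runs fall inside the range, b[3]=1 and b[4]=3, and their maximum is 3. Then handle the boundary: a[5] to a[6] contains two 1s, which is less than 3, so the answer is 3.
Since my powers of explanation are limited, the code is posted below; please read the explanation together with the code: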
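The compression and the boundary handling described above can be sketched in isolation, before any RMQ structure is involved. This is a minimal illustration, not the code from this post: the names Runs, compress and query are hypothetical, the input array is 0-based while query positions are 1-based, and the complete runs are scanned linearly where a segment tree or sparse table would normally be used.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Run-length view of a sorted array (hypothetical helper type).
// cnt[i] = length of the i-th run, pre[i] = cnt[1] + ... + cnt[i];
// index 0 is a placeholder holding 0, matching b[] and c[] in the text.
struct Runs {
    std::vector<int> cnt;
    std::vector<int> pre;
};

Runs compress(const std::vector<int>& a) {
    Runs r;
    r.cnt.push_back(0);
    r.pre.push_back(0);
    for (int i = 0; i < (int)a.size(); ++i) {
        if (i > 0 && a[i] == a[i - 1]) {
            ++r.cnt.back();  // same value: extend the current run
            ++r.pre.back();
        } else {
            r.cnt.push_back(1);  // new value: open a new run
            r.pre.push_back(r.pre.back() + 1);
        }
    }
    return r;
}

// Answer the query [x, y] (1-based positions, x <= y).
int query(const Runs& r, int x, int y) {
    assert(1 <= x && x <= y && y <= r.pre.back());
    // first complete run: smallest i with pre[i-1] >= x-1
    int first = int(std::lower_bound(r.pre.begin(), r.pre.end(), x - 1) - r.pre.begin()) + 1;
    // last complete run: largest i with pre[i] <= y
    int last = int(std::upper_bound(r.pre.begin(), r.pre.end(), y) - r.pre.begin()) - 1;
    if (first > last + 1) return y - x + 1;  // x and y lie inside one run
    // partial run on the left and partial run on the right
    int best = std::max(r.pre[first - 1] - x + 1, y - r.pre[last]);
    // linear scan over the complete runs; an RMQ structure would replace this
    for (int i = first; i <= last; ++i) best = std::max(best, r.cnt[i]);
    return best;
}
```

For the sample array, compress produces cnt = {0,2,4,1,3} and pre = {0,2,6,7,10}, and query(r, 5, 10) returns 3, matching the walkthrough above.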
Segment tree range-maximum approach:
#include <cstdio>
#include <algorithm>
#define MAX 100010
#define Max(a,b) ((a)>(b)?(a):(b))
int a[MAX],b[MAX],c[MAX];
typedef struct node
{
    int i,j,max;
    struct node *left,*right;
}segTree;
segTree t[MAX*2];
int used=0;
/* build the tree over b[i..j]; each node stores the maximum of its range */
segTree *creat(int i,int j)
{
    if(i>j){
        return NULL;
    }
    segTree *T=&t[used++];
    T->i = i , T->j = j;
    if(i==j){
        T->left = T->right =NULL;
        T->max = b[i];
    } else {
        T->left = creat(i,(i+j)/2);
        T->right = creat((i+j)/2+1,j);
        T->max = Max(T->left->max,T->right->max);
    }
    return T;
}
/* maximum of b[i..j]; returns 0 when the node does not overlap the range */
int search(segTree *T,int i,int j)
{
    if(T== NULL || T->i >j || T->j <i){
        return 0;
    }
    if(T->i>=i && T->j<=j){
        return T->max;
    }
    int leftmax = search(T->left,i,j),rightmax = search(T->right,i,j);
    return Max(leftmax,rightmax);
}
int main()
{
    int n,q,x,y,k;
    while(scanf("%d",&n),n>0){
        scanf("%d",&q);
        for(int i=1;i<=n;i++){
            scanf("%d",&a[i]);
        }
        /* b[i] = length of the i-th run of equal values */
        b[1]=1,x=a[1],k=1;
        for(int i=2;i<=n;i++){
            if(x==a[i]){
                b[k]++;
            } else {
                b[++k]=1;
                x=a[i];
            }
        }
        used=0; /* reuse the node pool; otherwise t[] overflows across test cases */
        segTree *T=creat(1,k);
        /* c[i] = prefix sum of b, followed by a sentinel larger than any index */
        c[0]=0;
        for(int i=1;i<=k;i++){
            c[i]=c[i-1]+b[i];
        }
        c[++k]=100001;
        for(int i=0;i<q;i++){
            scanf("%d%d",&x,&y);
            /* mind the off-by-one details here */
            int tempi = std::lower_bound(c,c+k+1,x-1) - c +1 ; /* first complete run */
            int tempj = std::lower_bound(c,c+k+1,y+1) - c- 1; /* last complete run */
            if(tempi>tempj+1){
                /* x and y lie inside the same run */
                printf("%d\n",y-x+1);
            } else if(tempi==tempj+1){
                /* no complete run: only two partial runs */
                printf("%d\n",Max(c[tempj]-x+1,y-c[tempj]));
            } else {
                int max = search(T,tempi,tempj);
                max = Max(max,c[tempi-1]-x+1);
                max = Max(max,y-c[tempj]);
                printf("%d\n",max);
            }
        }
    }
}
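I had long heard of something called the RMQ problem, which seemed to be about range minima and maxima, so I looked it up on Baidu. RMQ (Range Minimum/Maximum Query) means: given a sequence A of length n, answer queries RMQ(A,i,j) (i,j<=n), each returning the minimum (or maximum) value of A over the index range [i,j]. There are two reasonably efficient methods: one is the segment tree used above, the other is the dynamic-programming ST (Sparse Table) algorithm, which I had not studied before. A brief introduction to ST, taking the maximum as an example: let d[i,j] be the maximum over the interval [i,i+2^j-1]. Then the maximum over [a,b] is max(d[a,k], d[b-2^k+1,k]), where k is the largest k with 2^k<=b-a+1, i.e. k=floor(ln(b-a+1)/ln(2)); the two intervals may overlap, which is harmless when taking a maximum. The table d is filled by dynamic programming: d[i,0]=A[i] and d[i,j]=max(d[i,j-1],d[i+2^(j-1),j-1]) (from Baidu Baike). After learning this algorithm I tried it on this problem, which gave the following code: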
#include <cstdio>
#include <cmath>
#include <algorithm>
#define MAX 100010
#define Max(a,b) ((a)>(b)?(a):(b))
int a[MAX],b[MAX][20],c[MAX];
/* fill b[i][j] = maximum of b[i][0] .. b[i+2^j-1][0] */
void make(int n)
{
    int k=(int)(log(n*1.0)/log(2.0));
    for(int j=1;j<=k;j++){
        for(int i=1;i<=n-(1<<j)+1;i++){
            b[i][j]=Max(b[i][j-1],b[i+(1<<(j-1))][j-1]);
        }
    }
}
/* maximum of b[i][0] .. b[j][0], covered by two overlapping blocks of length 2^k */
int search(int i,int j)
{
    int k=(int)(log(j-i+1.0)/log(2.0));
    int rmq=Max(b[i][k], b[j - (1 <<k) + 1][k]);
    return rmq;
}
int main()
{
    int n,q,x,y,k;
    while(scanf("%d",&n),n>0){
        scanf("%d",&q);
        for(int i=1;i<=n;i++){
            scanf("%d",&a[i]);
        }
        /* b[i][0] = length of the i-th run of equal values */
        b[1][0]=1,x=a[1],k=1;
        for(int i=2;i<=n;i++){
            if(x==a[i]){
                b[k][0]++;
            } else {
                b[++k][0]=1;
                x=a[i];
            }
        }
        make(k);
        /* c[i] = prefix sum of run lengths, followed by a sentinel */
        c[0]=0;
        for(int i=1;i<=k;i++){
            c[i]=c[i-1]+b[i][0];
        }
        c[++k]=100001;
        for(int i=0;i<q;i++){
            scanf("%d%d",&x,&y);
            int tempi = std::lower_bound(c,c+k+1,x-1) - c +1 ; /* first complete run */
            int tempj = std::lower_bound(c,c+k+1,y+1) - c- 1; /* last complete run */
            if(tempi>tempj+1){
                /* x and y lie inside the same run */
                printf("%d\n",y-x+1);
            } else if(tempi==tempj+1){
                /* no complete run: only two partial runs */
                printf("%d\n",Max(c[tempj]-x+1,y-c[tempj]));
            } else {
                int max = search(tempi,tempj);
                max = Max(max,c[tempi-1]-x+1);
                max = Max(max,y-c[tempj]);
                printf("%d\n",max);
            }
        }
    }
}
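Note: do not write int k=(int)(log(n*1.0)/log(2.0)); as int k=(int)(log(n)/log(2));. Submitting with G++ is fine, but submitting with C++ fails to compile, because a call to log with an int argument is ambiguous among the overloads: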
math.h(567): could be 'long double log(long double)'
math.h(519): or 'float log(float)'
math.h(121): or 'double log(double)'
while trying to match the argument list '(int)'
At first I could not make sense of the message and got CE (Compile Error) many times. A lesson learned the hard way...
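One way to avoid the overload question altogether is to compute the exponent with integer operations only, which also removes any dependence on floating-point rounding. The helper below is a sketch under that assumption; floorLog2 is a hypothetical name and is not part of the submitted code.

```cpp
#include <cassert>

// Largest k with 2^k <= n, for n >= 1, using shifts only.
int floorLog2(int n) {
    assert(n >= 1);
    int k = 0;
    while (n >>= 1) {  // halve n until it reaches 0, counting the steps
        ++k;
    }
    return k;
}
```

With such a helper, the two log expressions in make and search could be written as floorLog2(n) and floorLog2(j-i+1) respectively.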
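Comparing the two algorithms: in space, the segment tree is O(n) and ST is O(n log n), so on this point the segment tree is better than ST;
for preprocessing, the segment tree takes O(n), while ST takes O(n log n),
and for a single query, the segment tree takes O(log n), while ST is faster at O(1).
So in terms of time each has its strengths: when n is much larger than q (the number of queries), the segment tree has the advantage, with O(n + q log n) in total, and when q is much larger than n, ST has the advantage, with O(n log n + q) in total.
In terms of code complexity, ST is simpler than the segment tree.
Since n and q have the same range in this problem, the two algorithms perform about the same: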
Segment tree: 3368 Accepted 2684K 657MS C++ 2054B
ST: 3368 Accepted 4224K 532MS C++ 1650B
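One more point not to forget: the segment tree is a dynamic structure that can absorb updates to the array between queries, whereas ST is static (it also answers queries as they arrive, but the table has to be rebuilt whenever the data changes)...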
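The practical difference here is that a segment tree can take a change to one element and keep answering queries, which a sparse table cannot do without being rebuilt. The sketch below illustrates this with a compact array-based max tree; MaxSegTree is a hypothetical type using 0-based positions, not the pointer-based tree from this post, and the problem itself never needs updates because its array is fixed.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Array-based max segment tree over positions 0..n-1 (illustrative only).
struct MaxSegTree {
    int n;
    std::vector<int> t;  // t[n..2n-1] are the leaves, t[i] = max(t[2i], t[2i+1])

    explicit MaxSegTree(const std::vector<int>& v)
        : n((int)v.size()), t(2 * v.size(), 0) {
        for (int i = 0; i < n; ++i) t[n + i] = v[i];
        for (int i = n - 1; i >= 1; --i) t[i] = std::max(t[2 * i], t[2 * i + 1]);
    }

    // Set position p to x and repair its ancestors: O(log n).
    void update(int p, int x) {
        assert(0 <= p && p < n);
        for (t[p += n] = x; p > 1; p >>= 1) {
            t[p >> 1] = std::max(t[p], t[p ^ 1]);
        }
    }

    // Maximum over the inclusive range [l, r]: O(log n).
    int query(int l, int r) const {
        assert(0 <= l && l <= r && r < n);
        int best = t[l + n];
        for (l += n, r += n + 1; l < r; l >>= 1, r >>= 1) {
            if (l & 1) best = std::max(best, t[l++]);
            if (r & 1) best = std::max(best, t[--r]);
        }
        return best;
    }
};
```

Built over the run lengths {2,4,1,3} from the sample, query(0, 3) returns 4; after update(2, 9) the same query returns 9 with no rebuild.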