pku-3368
August 27, 2009 11:19
Frequent values
Time Limit: 2000MS Memory Limit: 65536K
Total Submissions: 3831 Accepted: 1314
Description
You are given a sequence of n integers a1 , a2 , ... , an in non-decreasing order. In addition to that, you are given several queries consisting of indices i and j (1 ≤ i ≤ j ≤ n). For each query, determine the most frequent value among the integers ai , ... , aj.
Input
The input consists of several test cases. Each test case starts with a line containing two integers n and q (1 ≤ n, q ≤ 100000). The next line contains n integers a1 , ... , an (-100000 ≤ ai ≤ 100000, for each i ∈ {1, ..., n}) separated by spaces. You can assume that for each i ∈ {1, ..., n-1}: ai ≤ ai+1. The following q lines contain one query each, consisting of two integers i and j (1 ≤ i ≤ j ≤ n), which indicate the boundary indices for the query.
The last test case is followed by a line containing a single 0.
Output
For each query, print one line with one integer: The number of occurrences of the most frequent value within the given range.
Sample Input
10 3
-1 -1 1 1 1 1 3 10 10 10
2 3
1 10
5 10
0
Sample Output
1
4
3
Source
Ulm Local 2007
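Solution:
Clearly this problem calls for discretization. Because a is sorted, equal values sit next to each other, so the original array a[1..n] can be turned into another array b[1..k], where b[i] is the number of occurrences of the i-th distinct value in a (clearly k<=n). A further array c[1..k] records the prefix sums of b, that is c[i]=b[1]+...+b[i], which is also the position in a where the i-th value ends. To find the highest frequency in a query range, we only need to work out which complete b[i] the range contains, take the maximum of those b[i], and compare it with the partial counts at the two ends of the range.
Take the sample: here a[]={0,-1,-1,1,1,1,1,3,10,10,10}, b[]={0,2,4,1,3}, c[]={0,2,6,7,10}, k=4 (index 0 of each array is a placeholder);
For the query 5 10, a lookup in c shows that the range contains two complete b[i], namely b[3]=1 and b[4]=3, whose maximum is 3. Then the boundary is handled: a[5] to a[6] hold two 1s, which is fewer than 3, so the result is 3.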
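The walkthrough above can be reproduced with a small sketch. The names Runs, buildRuns and mostFrequent are hypothetical and do not appear in the solutions below. To keep it short, this version scans the runs between the two end runs linearly (the real solutions replace that scan with an RMQ structure), and it treats the run containing x and the run containing y as end pieces even when they happen to be complete, which gives the same answer.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper names; the 1-based layout mirrors the arrays a, b, c in the text.
struct Runs {
    std::vector<int> b;  // b[i] = length of the i-th run of equal values (b[0] unused)
    std::vector<int> c;  // c[i] = b[1] + ... + b[i], the last position of run i
};

// a is 1-based (a[0] is a dummy) and sorted in non-decreasing order.
Runs buildRuns(const std::vector<int>& a) {
    Runs r;
    r.b.push_back(0);
    r.c.push_back(0);
    int n = static_cast<int>(a.size()) - 1;
    for (int i = 1; i <= n; ++i) {
        if (i > 1 && a[i] == a[i - 1]) {
            ++r.b.back();                    // same value: extend the current run
            ++r.c.back();
        } else {
            r.b.push_back(1);                // new value: start a new run
            r.c.push_back(r.c.back() + 1);
        }
    }
    return r;
}

// Count of the most frequent value in a[x..y] (1-based, inclusive).
int mostFrequent(const Runs& r, int x, int y) {
    assert(1 <= x && x <= y && y <= r.c.back());
    // The run containing position p is the first index i with c[i] >= p.
    int rx = static_cast<int>(std::lower_bound(r.c.begin(), r.c.end(), x) - r.c.begin());
    int ry = static_cast<int>(std::lower_bound(r.c.begin(), r.c.end(), y) - r.c.begin());
    if (rx == ry) return y - x + 1;          // the whole query lies inside one run
    int best = std::max(r.c[rx] - x + 1,     // part of the left end run inside [x, y]
                        y - r.c[ry - 1]);    // part of the right end run inside [x, y]
    for (int i = rx + 1; i < ry; ++i)        // runs strictly between the two end runs
        best = std::max(best, r.b[i]);
    return best;
}
```

On the sample array, buildRuns yields b={0,2,4,1,3} and c={0,2,6,7,10}, and mostFrequent returns 1, 4 and 3 for the queries 2 3, 1 10 and 5 10.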
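Since my ability to explain things in words is limited, I am posting the code below; please read the explanation together with the code:
Segment tree approach to the maximum: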
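#include <cstdio>
#include <algorithm>
#define MAX 100010
#define Max(a,b) ((a)>(b)?(a):(b))
int a[MAX],b[MAX],c[MAX];
typedef struct node {
    int i,j,max;                /* node covers b[i..j]; max is the largest b in it */
    struct node *left,*right;
}segTree;
segTree t[MAX*2];               /* node pool */
int used=0;
/* Build the tree over b[i..j] and return its root. */
segTree *creat(int i,int j)
{
    if(i>j){
        return NULL;
    }
    segTree *T=&t[used++];
    T->i = i , T->j = j;
    if(i==j){
        T->left = T->right =NULL;
        T->max = b[i];
    }
    else {
        T->left = creat(i,(i+j)/2);
        T->right = creat((i+j)/2+1,j);
        T->max = Max(T->left->max,T->right->max);
    }
    return T;
}
/* Maximum of b[i..j]; returns 0 when the node does not intersect the range. */
int search(segTree *T,int i,int j)
{
    if(T== NULL || T->i >j || T->j <i){
        return 0;
    }
    if(T->i>=i && T->j<=j){
        return T->max;
    }
    int leftmax = search(T->left,i,j),rightmax = search(T->right,i,j);
    return Max(leftmax,rightmax);
}
int main()
{
    int n,q,x,y,k;
    while(scanf("%d",&n)==1 && n>0){
        scanf("%d",&q);
        for(int i=1;i<=n;i++){
            scanf("%d",&a[i]);
        }
        /* Run-length encode a[1..n] into b[1..k]. */
        b[1]=1,x=a[1],k=1;
        for(int i=2;i<=n;i++){
            if(x==a[i]){
                b[k]++;
            }
            else {
                b[++k]=1;
                x=a[i];
            }
        }
        used=0;                 /* reuse the node pool for every test case */
        segTree *T=creat(1,k);
        /* c[i] = position in a of the last element of run i. */
        c[0]=0;
        for(int i=1;i<=k;i++){
            c[i]=c[i-1]+b[i];
        }
        c[++k]=100001;          /* sentinel, at least as large as any y+1 */
        for(int i=0;i<q;i++){
            scanf("%d%d",&x,&y);
            /* Mind the off-by-one details here:
               tempi is the first run that starts at or after x,
               tempj is the last run that ends at or before y. */
            int tempi = std::lower_bound(c,c+k+1,x-1) - c +1 ;
            int tempj = std::lower_bound(c,c+k+1,y+1) - c- 1;
            if(tempi>tempj+1){
                /* x and y lie strictly inside the same run */
                printf("%d\n",y-x+1);
            }
            else if(tempi==tempj+1){
                /* no complete run; the range splits at position c[tempj] */
                printf("%d\n",Max(c[tempj]-x+1,y-c[tempj]));
            }
            else {
                int max = search(T,tempi,tempj);
                max = Max(max,c[tempi-1]-x+1);  /* partial run on the left */
                max = Max(max,y-c[tempj]);      /* partial run on the right */
                printf("%d\n",max);
            }
        }
    }
}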
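I had long heard of something called the RMQ problem, apparently about range maxima and minima, so I looked it up on Baidu. The RMQ (Range Minimum/Maximum Query) problem is this: for a sequence A of length n, answer a number of queries RMQ(A,i,j) (i,j<=n), each returning the minimum (or maximum) value of A over the indices in [i,j]. There are two good methods: one is the segment tree written above, the other is the dynamic-programming ST algorithm (Sparse Table), which I had not learned before. A brief introduction to ST, taking the maximum as the example: let d[i,j] be the maximum over the interval [i,i+2^j-1]. For a query on [a,b] the answer is max(d[a,k], d[b-2^k+1,k]), where k is the largest k with 2^k<=b-a+1, that is k=[ln(b-a+1)/ln(2)] with [ ] meaning rounding down; the two intervals of length 2^k may overlap, which is harmless for a maximum. d is computed by dynamic programming: d[i,0]=A[i] and d[i,j]=max(d[i,j-1],d[i+2^(j-1),j-1]) (excerpted from Baidu Baike). After learning this algorithm I tried using it on this problem, which gave the following code: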
#include <cstdio>
#include <cmath>
#include <algorithm>
#define MAX 100010
#define Max(a,b) ((a)>(b)?(a):(b))
/* b[i][j] = maximum run length among runs i .. i+2^j-1; b[i][0] is the run length itself */
int a[MAX],b[MAX][20],c[MAX];
/* Fill levels 1..k of the sparse table from level 0. */
void make(int n)
{
    int k=(int)(log(n*1.0)/log(2.0));
    for(int j=1;j<=k;j++){
        for(int i=1;i<=n-(1<<j)+1;i++){
            b[i][j]=Max(b[i][j-1],b[i+(1<<(j-1))][j-1]);
        }
    }
}
/* Maximum of b[i..j][0], taken from two (possibly overlapping) blocks of length 2^k. */
int search(int i,int j)
{
    int k=(int)(log(j-i+1.0)/log(2.0));
    int rmq=Max(b[i][k], b[j - (1 <<k) + 1][k]);
    return rmq;
}
int main()
{
    int n,q,x,y,k;
    while(scanf("%d",&n)==1 && n>0){
        scanf("%d",&q);
        for(int i=1;i<=n;i++){
            scanf("%d",&a[i]);
        }
        /* Run-length encode a[1..n] into b[1..k][0]. */
        b[1][0]=1,x=a[1],k=1;
        for(int i=2;i<=n;i++){
            if(x==a[i]){
                b[k][0]++;
            }
            else {
                b[++k][0]=1;
                x=a[i];
            }
        }
        make(k);
        /* c[i] = position in a of the last element of run i. */
        c[0]=0;
        for(int i=1;i<=k;i++){
            c[i]=c[i-1]+b[i][0];
        }
        c[++k]=100001;          /* sentinel, at least as large as any y+1 */
        for(int i=0;i<q;i++){
            scanf("%d%d",&x,&y);
            /* tempi: first run starting at or after x; tempj: last run ending at or before y */
            int tempi = std::lower_bound(c,c+k+1,x-1) - c +1 ;
            int tempj = std::lower_bound(c,c+k+1,y+1) - c- 1;
            if(tempi>tempj+1){
                printf("%d\n",y-x+1);
            }
            else if(tempi==tempj+1){
                printf("%d\n",Max(c[tempj]-x+1,y-c[tempj]));
            }
            else {
                int max = search(tempi,tempj);
                max = Max(max,c[tempi-1]-x+1);
                max = Max(max,y-c[tempj]);
                printf("%d\n",max);
            }
        }
    }
}
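Note: do not write int k=(int)(log(n*1.0)/log(2.0)); as int k=(int)(log(n)/log(2));. Submitting the latter with G++ is fine, but submitting it with C++ fails, because the compiler cannot choose among the overloads of log for an int argument: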
math.h(567): could be 'long double log(long double)'
math.h(519): or 'float log(float)'
math.h(121): or 'double log(double)'
while trying to match the argument list '(int)'
At first I did not understand the message and got CE (Compile Error) many times, a lesson learned the hard way...
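One way to sidestep the overloaded log call altogether is to precompute the floor of log2 with integer arithmetic. The sketch below is an alternative I am assuming would fit the ST code above, not what the submitted solution does; floorLog2Table is a hypothetical name.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, not part of the submitted solution.
// Returns lg with lg[v] = floor(log2(v)) for 1 <= v <= n, using integers only.
std::vector<int> floorLog2Table(int n) {
    assert(n >= 1);
    std::vector<int> lg(n + 1, 0);   // lg[0] is unused, lg[1] = 0
    for (int v = 2; v <= n; ++v)
        lg[v] = lg[v / 2] + 1;       // halving v lowers floor(log2 v) by exactly one
    return lg;
}
```

With such a table, make(n) could take k from lg[n] and search(i,j) from lg[j-i+1], which also removes any dependence on floating-point rounding.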
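Comparing the two algorithms: in terms of space, the segment tree is O(n) and ST is O(n log n), so on this point the segment tree beats ST;
for preprocessing, the segment tree takes O(n) while ST takes O(n log n);
for a single query, the segment tree takes O(log n), while ST is faster at O(1).
So in time complexity each has its strengths. The totals are O(n + q log n) for the segment tree and O(n log n + q) for ST: when n is much larger than q (the number of queries) the segment tree has the edge, and when q is much larger than n, ST does.
As for code complexity, ST is simpler than the segment tree.
Since n and q have the same range in this problem, the two algorithms perform about the same: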
Segment tree: 3368 Accepted 2684K 657MS C++ 2054B
ST: 3368 Accepted 4224K 532MS C++ 1650B
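One more point not to forget: the segment tree can absorb a modification to an element in O(log n) and keep answering queries, whereas the ST table is built for a fixed array and has to be rebuilt after any change. (Both structures answer each query as it arrives, so the difference is dynamic versus static data rather than online versus offline in the strict sense.)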
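The closing remark is presumably about modifications: a segment tree can repair itself after one element changes, while a sparse table would need rebuilding. This problem never modifies the array, so the following is only an illustrative sketch. MaxSegTree is a hypothetical array-based, 0-indexed variant, not the pointer-based tree used in the solution above.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical array-based max segment tree (0-based indices), for illustration only.
struct MaxSegTree {
    int n;
    std::vector<int> t;  // leaves are t[n .. 2n-1]; t[p] = max(t[2p], t[2p+1])

    explicit MaxSegTree(const std::vector<int>& v)
        : n(static_cast<int>(v.size())), t(2 * v.size(), 0) {
        for (int i = 0; i < n; ++i) t[n + i] = v[i];
        for (int p = n - 1; p >= 1; --p) t[p] = std::max(t[2 * p], t[2 * p + 1]);
    }

    // Set element i to value and recompute its ancestors: O(log n).
    void update(int i, int value) {
        assert(0 <= i && i < n);
        int p = i + n;
        t[p] = value;
        for (p /= 2; p >= 1; p /= 2) t[p] = std::max(t[2 * p], t[2 * p + 1]);
    }

    // Maximum over the inclusive range [l, r]: O(log n).
    int query(int l, int r) const {
        assert(0 <= l && l <= r && r < n);
        int best = t[l + n];
        for (int lo = l + n, hi = r + n + 1; lo < hi; lo /= 2, hi /= 2) {
            if (lo & 1) best = std::max(best, t[lo++]);  // lo is a right child: take it
            if (hi & 1) best = std::max(best, t[--hi]);  // hi-1 is a left child: take it
        }
        return best;
    }
};
```

For example, building the tree over the run lengths {2,4,1,3} and then calling update(2, 9) changes the answer of query(0, 3) from 4 to 9 without touching the rest of the structure.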