在hive查询中关于聚合和统计函数中涉及到百分比函数,比如PERCENT_RANK,不过这都是依据某个字段行数统计,有些时候我们是需要统计某个字段的当前值除以这个字段值总和来代表相应的百分比,这个目前没有直接的func来完成,但是在实际的数据抽取过程中又会用到,所以今天碰到这个问题卡了一下,不过还是解决了,记录一下 下面给出在mysql上的实际结果,跟hive结果是一致的,因为阿里云ecs上没有安装hive所以就用mysql代替了
mysql> create database t; Query OK, 1 row affected (0.01 sec) mysql> use t; Database changed mysql> show tables; Empty set (0.00 sec) mysql> create table test(id int,num int); Query OK, 0 rows affected (0.04 sec) mysql> insert into table test values(1,4); ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'table test values(1,4)' at line 1 mysql> insert into test values(1,4); Query OK, 1 row affected (0.10 sec) mysql> insert into test values(2,6); Query OK, 1 row affected (0.01 sec) mysql> select id,num/b.d from (select sum(num)d from test)b,test ; +------+---------+ | id | num/b.d | +------+---------+ | 1 | 0.4000 | | 2 | 0.6000 | +------+---------+ 2 rows in set (0.00 sec) mysql>