Hive and HBase integration: error when inserting data into a partitioned table?

Q&A ⋅ liwei131313 ⋅ posted 2018-03-22 15:01:47 ⋅ last reply by timzhang 2020-02-19 17:22:14 ⋅ 5848 views

1. Create a partitioned table
CREATE TABLE hbase_table_1(key int, value string) partitioned by (day string) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val") TBLPROPERTIES ("hbase.table.name" = "xyz");
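
Note: the partition column day lives only in the Hive metastore; nothing in it is mapped to an HBase column. For comparison, the standard non-partitioned Hive-HBase mapping (hbase_table_np and the HBase table xyz_np below are hypothetical names; everything else mirrors the DDL above) would look like:

CREATE TABLE hbase_table_np(key int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")   -- same mapping: row key plus cf1:val
TBLPROPERTIES ("hbase.table.name" = "xyz_np");                    -- hypothetical HBase table name
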
2. Query the records in the source table
(screenshot: rows of the source table pokes)

3. Insert the data into the specified partition of hbase_table_1; execution fails with the error below:
hive> insert overwrite table hbase_table_1 partition (day='2012-01-01') select * from pokes;
Query ID = root_20180322145656_b9e8bd17-2489-41da-a4f3-6d04e42738d3
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
java.io.IOException: java.lang.IllegalArgumentException: Must specify table name
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.checkOutputSpecs(FileSinkOperator.java:1080)
at org.apache.hadoop.hive.ql.io.HiveOutputFormatImpl.checkOutputSpecs(HiveOutputFormatImpl.java:67)
at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:272)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:143)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1307)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1304)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1304)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:578)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:573)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:573)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:564)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:428)
at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:142)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:214)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1978)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1691)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1423)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1207)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1197)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:226)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:175)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:389)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:781)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:699)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:634)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.IllegalArgumentException: Must specify table name
at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:195)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:277)
at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:267)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.checkOutputSpecs(FileSinkOperator.java:1078)
... 37 more
Job Submission failed with exception 'java.io.IOException(java.lang.IllegalArgumentException: Must specify table name)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
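
The "Must specify table name" exception is raised by HBase's TableOutputFormat when the job configuration handed to it carries no output table name (the hbase.mapred.output.outputtable property), which suggests that for this partitioned insert the file sink stage is not picking up the table properties from the HBase storage handler. A hedged way to see which stage ends up writing through TableOutputFormat is the standard HiveQL EXPLAIN:

EXPLAIN
INSERT OVERWRITE TABLE hbase_table_1 PARTITION (day='2012-01-01')
SELECT * FROM pokes;
-- comparing this plan with the LIMIT variant suggested in the replies below shows where the stages differ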

Replies: 6
  • 青牛 — one of China's first big-data practitioners; core R&D engineer on the big-data team at Kingsoft
    2018-03-22 15:30:42

    Try adding a LIMIT to the end of your query: insert overwrite table hbase_table_1 partition (day='2012-01-01') select * from pokes limit 10;

  • liwei131313
    2018-03-22 15:37:31

    @青牛
    After adding the LIMIT it works and the data goes into the partitioned table. What causes this? And if I need to load all of the source table's data into the specified partition, how should I handle it?

    (screenshot of the successful insert result)

  • 青牛 — one of China's first big-data practitioners; core R&D engineer on the big-data team at Kingsoft
    2018-03-22 16:19:32

    @liwei131313 You probably need to put some restriction on the query. See whether adding a WHERE clause helps; if not, just raise the LIMIT.

  • liwei131313
    2018-03-22 16:25:28

    @青牛
    Adding a WHERE clause doesn't work; it still fails with the same error.
    hive> insert overwrite table hbase_table_1 partition (day='2012-01-02') select * from pokes where foo=4;
    Query ID = root_20180322162424_577330d0-5b03-4d17-92be-552eb126557d
    Total jobs = 3
    Launching Job 1 out of 3
    Number of reduce tasks is set to 0 since there's no reduce operator
    java.io.IOException: java.lang.IllegalArgumentException: Must specify table name
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.checkOutputSpecs(FileSinkOperator.java:1080)
    at org.apache.hadoop.hive.ql.io.HiveOutputFormatImpl.checkOutputSpecs(HiveOutputFormatImpl.java:67)
    at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:272)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:143)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1307)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1304)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1304)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:578)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:573)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:573)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:564)
    at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:428)
    at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:142)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:214)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1978)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1691)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1423)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1207)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1197)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:226)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:175)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:389)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:781)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:699)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:634)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
    Caused by: java.lang.IllegalArgumentException: Must specify table name
    at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:195)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:277)
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:267)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.checkOutputSpecs(FileSinkOperator.java:1078)
    ... 37 more
    Job Submission failed with exception 'java.io.IOException(java.lang.IllegalArgumentException: Must specify table name)'
    FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask

  • 青牛 — one of China's first big-data practitioners; core R&D engineer on the big-data team at Kingsoft
    2018-03-22 16:26:44

    @liwei131313 Then just use a larger LIMIT.
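
A hedged sketch of the full-load workaround discussed above: count the source rows first, then use a LIMIT that is at least as large (the 1000000 below is a placeholder, not a value from this thread):

SELECT count(*) FROM pokes;                       -- note the total row count
INSERT OVERWRITE TABLE hbase_table_1 PARTITION (day='2012-01-01')
SELECT * FROM pokes LIMIT 1000000;                -- any LIMIT >= the count above

The same idea may also rescue the filtered insert, e.g. appending a LIMIT after the WHERE clause (select * from pokes where foo=4 limit 1000000), although that combination is not tested in this thread.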

  • timzhang
    2020-02-19 17:22:14

    In my tests, inserting into a different partition ends up making a copy of the other partitions' data. Has anyone else run into this?
    insert overwrite table hbase_table_1 partition (day='2020-01-01') select * from pokes limit 10;
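
A plausible explanation (an assumption, not something confirmed in this thread): every partition of hbase_table_1 is backed by the same physical HBase table xyz, and the partition column day is pure Hive metadata, so rows written under one partition also show up when the table is scanned through another partition. A quick HiveQL check would be:

SELECT day, count(*) FROM hbase_table_1 GROUP BY day;   -- identical counts for every partition would indicate the partitions share the same HBase rows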
