Defining a Hive external table on top of an existing HBase table

There is an empty HBase table with two column families:

    create 'emp', 'personal_data', 'professional_data'

Now I am trying to map a Hive external table onto it, which would naturally have some columns:

    CREATE EXTERNAL TABLE emp(id int, city string, name string, occupation string, salary int)
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":id, personal_data:city, personal_data:name, professional_data:occupation, professional_data:salary")
    TBLPROPERTIES ("hbase.table.name" = "emp", "hbase.mapred.output.outputtable" = "emp");

The error that I get is this:

    FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.RuntimeException: MetaException(message:org.apache.hadoop.hive.serde2.SerDeException org.apache.hadoop.hive.hbase.HBaseSerDe: columns has 5 elements while hbase.columns.mapping has 6 elements (counting the key if implicit))

Could you please help me out? Am I doing something wrong?
Best answer
In your mapping you reference the Hive column name (:id), but the row key has to be referenced with the HBase keyword :key. As stated in the documentation:

    a mapping entry must be either :key or of the form column-family-name:[column-name][#(binary|string)]

Since :id is not recognized as the row key, Hive assumes an implicit key entry in front of the five entries you listed, which is why it counts 6 mapping elements against 5 columns. Just replace :id with :key and that should do it:

    CREATE EXTERNAL TABLE emp(id int, city string, name string, occupation string, salary int)
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,personal_data:city,personal_data:name,professional_data:occupation,professional_data:salary")
    TBLPROPERTIES ("hbase.table.name" = "emp", "hbase.mapred.output.outputtable" = "emp");

I also dropped the spaces after the commas: the documentation advises against whitespace between mapping entries, because it can be interpreted as part of the column name.

The column mapping is based on the order of the columns, not on their names. The example in the "Multiple Columns and Families" section of the documentation shows that the names don't matter:

    CREATE TABLE hbase_table_1(key int, value1 string, value2 int, value3 int)
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,a:b,a:c,d:e")

The mapping is then:

    :key -> key
    a:b  -> value1
    a:c  -> value2
    d:e  -> value3

In your table, :key ends up mapped to id only because both come first in their respective lists.
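The counting behind the error message can be illustrated with a small Python sketch. It is only a simplified model, not Hive's actual code: the function name pair_columns is made up for this example, and the rule that an implicit key is counted in front when no entry equals ":key" is an assumption based on the wording of the error message ("counting the key if implicit").

```python
def pair_columns(hive_columns, mapping):
    """Pair Hive columns with hbase.columns.mapping entries by position.

    Simplified model (an assumption, not Hive's real implementation):
    if no entry is exactly ":key", an implicit key entry is counted first.
    """
    entries = mapping.split(",")  # entries are deliberately not trimmed
    if ":key" not in entries:
        entries = [":key"] + entries  # implicit row key
    if len(entries) != len(hive_columns):
        raise ValueError(
            f"columns has {len(hive_columns)} elements while "
            f"hbase.columns.mapping has {len(entries)} elements"
        )
    # Pairing is purely positional; the Hive column names play no role.
    return list(zip(hive_columns, entries))


HIVE_COLUMNS = ["id", "city", "name", "occupation", "salary"]

# Mapping from the question: ":id" is not the ":key" keyword.
BROKEN = (":id, personal_data:city, personal_data:name, "
          "professional_data:occupation, professional_data:salary")

# Corrected mapping: ":key" first, no whitespace between entries.
FIXED = (":key,personal_data:city,personal_data:name,"
         "professional_data:occupation,professional_data:salary")

if __name__ == "__main__":
    for hive_col, hbase_entry in pair_columns(HIVE_COLUMNS, FIXED):
        print(f"{hbase_entry} -> {hive_col}")
    try:
        pair_columns(HIVE_COLUMNS, BROKEN)
    except ValueError as err:
        print("broken mapping:", err)
```

With the corrected mapping the first pair is id with :key, which is the positional behaviour described above; with the mapping from the question the sketch reports the same 5 versus 6 mismatch as the Hive error.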