yum -y install '*lzo*'
Modify the HDFS core-site.xml:
io.compression.codecs=org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.BZip2Codec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec
Add:
io.compression.codec.lzo.class=com.hadoop.compression.lzo.LzoCodec
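In core-site.xml itself, the two settings above would be written as property elements. The fragment below is a minimal sketch, assuming the standard Hadoop XML configuration layout and that both entries are placed inside the file's existing <configuration> element:

```xml
<!-- Register the LZO codecs alongside the built-in ones -->
<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.BZip2Codec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec</value>
</property>
<!-- Class that implements the LZO codec -->
<property>
  <name>io.compression.codec.lzo.class</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
```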
Set the following parameters (required):
mapreduce.map.output.compress=true;
mapreduce.output.fileoutputformat.compress=true;
mapreduce.map.output.compress.codec=com.hadoop.compression.lzo.LzoCodec;
mapreduce.output.fileoutputformat.compress.codec=com.hadoop.compression.lzo.LzoCodec;
hive.exec.compress.output=true;
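One possible way to apply these parameters per session is to keep them in an initialization file and pass it to the Hive CLI with -i. The sketch below assumes that approach; the file path /tmp/lzo_settings.hql is a hypothetical name chosen for illustration, and the hive call is skipped when the CLI is not installed.

```shell
# Hypothetical location for the settings file; any readable path works.
SETTINGS=/tmp/lzo_settings.hql

# Write the five required parameters as Hive SET statements.
cat > "$SETTINGS" <<'EOF'
SET mapreduce.map.output.compress=true;
SET mapreduce.output.fileoutputformat.compress=true;
SET mapreduce.map.output.compress.codec=com.hadoop.compression.lzo.LzoCodec;
SET mapreduce.output.fileoutputformat.compress.codec=com.hadoop.compression.lzo.LzoCodec;
SET hive.exec.compress.output=true;
EOF

# Start Hive with the settings preloaded and print one of them back.
if command -v hive >/dev/null 2>&1; then
  hive -i "$SETTINGS" -e "SET hive.exec.compress.output;"
fi
```

Settings applied this way last only for that Hive session; to make them permanent they would instead go into the cluster's configuration files.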
4. Test reading LZO with MapReduce
Create a new table lzo_test in Hive:
CREATE TABLE lzo_test (id bigint, name string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS INPUTFORMAT "com.hadoop.mapred.DeprecatedLzoTextInputFormat"
OUTPUTFORMAT "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat";
Download the lzop tool, load an .lzo file into the lzo_test table, and run "select * from lzo_test" and "select count(1) from lzo_test"; both queries should return correct results.
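The test described above could be scripted roughly as follows. This is a sketch under stated assumptions: the sample file path and its two rows are made up for illustration, the table is assumed to use a tab as the field delimiter, and the lzop and hive steps are skipped when those tools are not installed.

```shell
# Hypothetical sample file with two tab-separated rows (id, name).
SAMPLE=/tmp/lzo_test_sample.tsv
printf '1\talice\n2\tbob\n' > "$SAMPLE"

# Compress with lzop; it writes $SAMPLE.lzo and keeps the original file.
if command -v lzop >/dev/null 2>&1; then
  lzop -f "$SAMPLE"
fi

# Load the compressed file into the table and run both check queries.
if command -v hive >/dev/null 2>&1 && [ -f "$SAMPLE.lzo" ]; then
  hive -e "LOAD DATA LOCAL INPATH '$SAMPLE.lzo' INTO TABLE lzo_test;
           SELECT * FROM lzo_test;
           SELECT count(1) FROM lzo_test;"
fi
```

With this sample data, the first query should list the two rows and the count should match the number of lines in the uncompressed file.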
Note: Hive's default field delimiter is \001 (Ctrl-A).