3. Run the options file:
sqoop --options-file m2h-d-acounttype.opt
4. Start Hive:
hive
5. Query the table:
select * from d_accounttype;
6. Show databases/tables to verify the import.
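The options file referenced in step 3 is not shown. Below is a hypothetical sketch of what m2h-d-acounttype.opt might contain: Sqoop option files take one option or value per line, and lines beginning with # are comments. The host, credentials, and database name are placeholder assumptions.

```text
# m2h-d-acounttype.opt -- hypothetical sketch; connection details are placeholders
import
--connect
jdbc:mysql://localhost:3306/testdb
--username
root
--password
123456
--table
d_accounttype
--hive-import
--hive-table
d_accounttype
-m
1
```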
Problems encountered:
1. I imported data without first creating a Hive table matching the MySQL one. The auto-generated output directory then ended up outside the intended Hive warehouse path,
and re-running the import fails with a "directory already exists" error.
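Two ways around the "directory already exists" failure, sketched below; the HDFS path is a placeholder, so substitute the directory named in your own error message:

```shell
# Option 1: remove the stale output directory left by the failed run
# (path is a placeholder -- use the one from your error message).
hadoop fs -rm -r /user/hive/warehouse/d_accounttype

# Option 2: let Sqoop clean up for you by adding this flag to the import:
#   --delete-target-dir
# Sqoop then deletes the target directory before each import.
```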
2. Full Sqoop import argument reference:
usage: sqoop import [GENERIC-ARGS] [TOOL-ARGS]
Common arguments:
--connect <jdbc-uri> Specify JDBC connect string
--connection-manager <class-name> Specify connection manager class name
--connection-param-file <properties-file> Specify connection parameters file
--driver <class-name> Manually specify JDBC driver class to use
--hadoop-mapred-home <dir> Override $HADOOP_MAPRED_HOME_ARG
--help Print usage instructions
-P Read password from console
--password <password> Set authentication password
--username <username> Set authentication username
--verbose Print more information while working
Import control arguments:
--append Imports data in append mode
--as-avrodatafile Imports data to Avro data files
--as-sequencefile Imports data to SequenceFile
--as-textfile Imports data as plain text (default)
--boundary-query <statement> Set boundary query for retrieving max and min value of the primary key
--columns <col,col,col...> Columns to import from table
--compression-codec <codec> Compression codec to use for import
--direct Use direct import fast path
--direct-split-size <n> Split the input stream every 'n' bytes when importing in direct mode
-e,--query <statement> Import results of SQL 'statement'
--fetch-size <n> Set number 'n' of rows to fetch from the database when more rows are needed
--inline-lob-limit <n> Set the maximum size for an inline LOB
-m,--num-mappers <n> Use 'n' map tasks to import in parallel
--mapreduce-job-name <name> Set name for generated mapreduce job
--split-by <column-name> Column of the table used to split work units
--table <table-name> Table to read
--target-dir <dir> HDFS plain table destination
--validate Validate the copy using the configured validator
--validation-failurehandler <validation-failurehandler> Fully qualified class name for ValidationFailureHandler
--validation-threshold <validation-threshold> Fully qualified class name for ValidationThreshold
--validator <validator> Fully qualified class name for the Validator
--warehouse-dir <dir> HDFS parent for table destination
--where <where clause> WHERE clause to use during import
-z,--compress Enable compression
Incremental import arguments:
--check-column <column> Source column to check for incremental change
--incremental <import-type> Define an incremental import of type 'append' or 'lastmodified'
--last-value <value> Last imported value in the incremental check column
Output line formatting arguments:
--enclosed-by <char> Sets a required field enclosing character
--escaped-by <char> Sets the escape character
--fields-terminated-by <char> Sets the field separator character
--lines-terminated-by <char> Sets the end-of-line character
--mysql-delimiters Uses MySQL's default delimiter set:
fields: , lines: \n escaped-by: \ optionally-enclosed-by: '
--optionally-enclosed-by <char> Sets a field enclosing character
Input parsing arguments:
--input-enclosed-by <char> Sets a required field encloser
--input-escaped-by <char> Sets the input escape character
--input-fields-terminated-by <char> Sets the input field separator
--input-lines-terminated-by <char> Sets the input end-of-line char
--input-optionally-enclosed-by <char> Sets a field enclosing character
Hive arguments:
--create-hive-table Fail if the target hive table exists
--hive-delims-replacement <arg> Replace Hive record \0x01 and row delimiters (\n\r) from imported string fields with user-defined string
--hive-drop-import-delims Drop Hive record \0x01 and row delimiters (\n\r) from imported string fields
--hive-home <dir> Override $HIVE_HOME
--hive-import Import tables into Hive (Uses Hive's default delimiters if none are set.)
--hive-overwrite Overwrite existing data in the Hive table
--hive-partition-key <partition-key> Sets the partition key to use when importing to hive
--hive-partition-value <partition-value> Sets the partition value to use when importing to hive
--hive-table <table-name> Sets the table name to use when importing to hive
--map-column-hive <arg> Override mapping for specific column to hive types.
HBase arguments:
--column-family <family> Sets the target column family for the import
--hbase-create-table If specified, create missing HBase tables
--hbase-row-key <col> Specifies which input column to use as the row key
--hbase-table <table> Import to <table> in HBase
Code generation arguments:
--bindir <dir> Output directory for compiled objects
--class-name <name> Sets the generated class name. This overrides --package-name.
When combined with --jar-file, sets the input class.
--input-null-non-string <null-str> Input null non-string representation
--input-null-string <null-str> Input null string representation
--jar-file <file> Disable code generation; use specified jar
--map-column-java <arg> Override mapping for specific columns to java types
--null-non-string <null-str> Null non-string representation
--null-string <null-str> Null string representation
--outdir <dir> Output directory for generated code
--package-name <name> Put auto-generated classes in this package
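To tie several of the arguments above together, here is a hedged example of a plain HDFS import combining common, import-control, and output-formatting arguments. The host, credentials, column names, and target directory are all placeholder assumptions, not values from this article:

```shell
# Hypothetical import using arguments from the reference above;
# connection details, columns, and paths are placeholders.
sqoop import \
  --connect jdbc:mysql://localhost:3306/testdb \
  --username root -P \
  --table d_accounttype \
  --where "id > 0" \
  --fields-terminated-by '\t' \
  --lines-terminated-by '\n' \
  --target-dir /user/demo/d_accounttype \
  -m 1
```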
3. Incremental import: how do you update --last-value dynamically?
A related question: to import only the newly added rows of a table, use the three arguments --check-column, --incremental, and --last-value. --check-column names the column to inspect; --incremental selects the mode and accepts only two values, append and lastmodified. With --incremental append, Sqoop imports the rows whose --check-column value is greater than the value given by --last-value. With --incremental lastmodified, Sqoop imports the rows whose --check-column value (a timestamp) is newer than the timestamp given by --last-value. For example, the command below imports every row whose ID column is greater than 4.
$ sqoop import --hive-import \
    --connect jdbc:oracle:thin:@192.168.56.111:1521:DBNAME \
    --username USERNAME --password PASSWORD \
    --verbose -m 1 --table TABLENAME \
    --check-column ID --incremental append --last-value 4
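One answer to the question above: a saved Sqoop job stores its --last-value in Sqoop's metastore and updates it automatically after each successful run, so there is nothing to edit by hand. A sketch, with connection details and the job name as placeholder assumptions:

```shell
# Create a saved job; Sqoop records the incremental state in its metastore.
# Connection string, credentials, and table are placeholders.
sqoop job --create incr_import -- import \
  --connect jdbc:mysql://localhost:3306/testdb \
  --username root -P \
  --table d_accounttype \
  --check-column ID \
  --incremental append \
  --last-value 4

# Each execution imports only rows with ID greater than the stored value,
# then advances the stored --last-value.
sqoop job --exec incr_import
```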