Monday, November 14, 2016

MapR Sqoop2: MySQL Data Import


Environment:
mapr-sqoop2-server-2.0.0.201607271151-1.noarch
mapr-sqoop2-client-2.0.0.201607271151-1.noarch


Steps
1. Check the available links and connectors
2. Create a link for MySQL
3. Create a link for MapR-FS
4. Create a job
5. Start the job
6. Check the status of the job
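
Before Step 1, launch the Sqoop2 shell and point it at the server. A minimal sketch, assuming the default MapR layout under /opt/mapr/sqoop and the default server port 12000 (adjust the host, path, and version to your cluster):

/opt/mapr/sqoop/sqoop-2.0.0/bin/sqoop.sh client
sqoop:000> set server --host <Sqoop2 Server HostName> --port 12000 --webapp sqoop
sqoop:000> show version --all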


Step 1:

Check available links

sqoop:000> show link
+----+------+--------------+----------------+---------+
| Id | Name | Connector Id | Connector Name | Enabled |
+----+------+--------------+----------------+---------+
+----+------+--------------+----------------+---------+

Check available connectors.

sqoop:000> show connector
+----+------------------------+------------------+------------------------------------------------------+----------------------+
| Id | Name                   | Version          | Class                                                | Supported Directions |
+----+------------------------+------------------+------------------------------------------------------+----------------------+
| 1  | kite-connector         | 1.99.6-mapr-1507 | org.apache.sqoop.connector.kite.KiteConnector        | FROM/TO              |
| 2  | kafka-connector        | 1.99.6-mapr-1507 | org.apache.sqoop.connector.kafka.KafkaConnector      | TO                   |
| 3  | hdfs-connector         | 1.99.6-mapr-1507 | org.apache.sqoop.connector.hdfs.HdfsConnector        | FROM/TO              |
| 4  | generic-jdbc-connector | 1.99.6-mapr-1507 | org.apache.sqoop.connector.jdbc.GenericJdbcConnector | FROM/TO              |
+----+------------------------+------------------+------------------------------------------------------+----------------------+
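
To inspect the configuration inputs each connector expects before filling in the link prompts below, the shell can print full connector details:

sqoop:000> show connector --all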


Step 2:
Create a link for the RDBMS (the database from which we want to import data).
NOTE: Provide the connector Id shown for generic-jdbc-connector as the -c argument. In this example the Id is 4.

Create the MySQL link:

sqoop:000> create link -c 4
Creating link for connector with id 4
Please fill following values to create new link object
Name: <mysql>
Link configuration
JDBC Driver Class: com.mysql.jdbc.Driver
JDBC Connection String: jdbc:mysql://<DB HostName>/<Database>
Username: <sqoop>
Password: <*****>
JDBC Connection Properties:<Optional>
There are currently 0 values in the map:
entry#
New link was successfully created with validation status OK and persistent id 2

sqoop:000> show link
+----+-------+--------------+------------------------+---------+
| Id | Name  | Connector Id | Connector Name         | Enabled |
+----+-------+--------------+------------------------+---------+
| 2  | mysql | 4            | generic-jdbc-connector | true    |
+----+-------+--------------+------------------------+---------+
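
If link creation or the later job run fails with a driver-not-found error, the MySQL JDBC driver jar must be on the Sqoop2 server's classpath. A sketch, assuming the Tomcat-based 1.99.6 layout under /opt/mapr/sqoop (the exact path and jar version are assumptions; adjust to your install):

# On the node running mapr-sqoop2-server
cp mysql-connector-java-5.1.*.jar /opt/mapr/sqoop/sqoop-2.0.0/server/webapps/sqoop/WEB-INF/lib/
# Restart the server so it picks up the driver
/opt/mapr/sqoop/sqoop-2.0.0/bin/sqoop.sh server stop
/opt/mapr/sqoop/sqoop-2.0.0/bin/sqoop.sh server start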


Step 3:
Create a link for the import destination, i.e. the MapR-FS (MFS) location.

sqoop:000> create link -c 3
Creating link for connector with id 3
Please fill following values to create new link object
Name: maprfs
Link configuration
HDFS URI: maprfs://<CLDB HostName>:7222
Hadoop conf directory: /opt/mapr/hadoop/hadoop-0.20.2/conf
New link was successfully created with validation status OK and persistent id 4

sqoop:000> show link
+----+--------+--------------+------------------------+---------+
| Id | Name   | Connector Id | Connector Name         | Enabled |
+----+--------+--------------+------------------------+---------+
| 2  | mysql  | 4            | generic-jdbc-connector | true    |
| 4  | maprfs | 3            | hdfs-connector         | true    |
+----+--------+--------------+------------------------+---------+
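
Before running the job, it helps to confirm the target location exists and is writable from the cluster (the directory name /sqoop-import below is just an example placeholder):

hadoop fs -mkdir -p /sqoop-import
hadoop fs -ls /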


Step 4:
Create a job:

sqoop:000> create job --from 2 --to 4
Creating job for links with from id 2 and to id 4
Please fill following values to create new job object
Name: testjob
From database configuration
Schema name: mysql
Table name: <TableName>
Table SQL statement:<Optional>
Table column names:<Optional>
Partition column name: <Provide a Column Name for Partitioning>
Null value allowed for the partition column: true
Boundary query:<Optional>
Incremental read
Check column:<Optional>
Last value:<Optional>
To HDFS configuration
Override null value:<Optional>
Null value:<Optional>
Output format:
0 : TEXT_FILE
1 : SEQUENCE_FILE
Choose: 0
Compression format:
0 : NONE
1 : DEFAULT
2 : DEFLATE
3 : GZIP
4 : BZIP2
5 : LZO
6 : LZ4
7 : SNAPPY
8 : CUSTOM
Choose: 0
Custom compression format:
Output directory: </MFS LOCATION NAME>
Append mode:<Optional>
Throttling resources
Extractors:<Optional>
Loaders:<Optional>
New job was successfully created with validation status OK and persistent id 12
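
If any of the answers above need to change later, the job can be edited in place instead of recreating it (a standard 1.99.x shell command):

sqoop:000> update job -j 12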
Verify the new job and note its Id:

sqoop:000> show job

Step 5:
Start the job with start job -j <Job Id>, using the Id reported by show job:
sqoop:000> start job -j 11
Submission details
Job ID: 11
Server URL: http://VM207:12000/sqoop/
Created by: mapr
Creation date: 2016-10-27 12:45:48 IST
Lastly updated by: mapr
External ID: job_1476676093162_0026
        http://VM203:8088/proxy/application_1476676093162_0026/

The submission details also print the source table layout, e.g.:
Source Connector schema: Schema{TABLE SCHEMA WILL BE DISPLAYED HERE}
Step 6:
Check the status of the running job:

sqoop:000> status job -j 12
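
Once the status reports SUCCEEDED, the imported files are under the output directory configured in Step 4; a running job can also be aborted with stop job -j <Job Id>. A quick verification sketch (again assuming the example directory /sqoop-import):

hadoop fs -ls /sqoop-import
hadoop fs -cat /sqoop-import/* | head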

