Friday, June 10, 2016

Configuring the network for a VM




Supported OS: Redhat/CentOS/Ubuntu (note: the network-scripts paths below are RHEL-family specific; Ubuntu keeps its interface settings in /etc/network/interfaces instead).
Step 1:
>vi /etc/resolv.conf
nameserver <DNS server IP>

Step 2:
>vi /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
HWADDR=<keep the default>
TYPE=Ethernet
UUID=<keep the default>
ONBOOT=yes
NETMASK=255.255.255.0
NM_CONTROLLED=no
BOOTPROTO=static
IPADDR=172.20.10.42

Step 3:
>vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=host-vm1
GATEWAY=172.20.10.1



Step 4:

>service NetworkManager stop
>service network restart
> ping google.com
PING google.com (216.58.194.206) 56(84) bytes of data.
64 bytes from sfo03s01-in-f14.1e100.net (216.58.194.206): icmp_seq=1 ttl=54 time=6.46 ms
(# You'll see a response like this if everything went well.)
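
As an optional sanity check (assuming the eth0 device and the addresses configured above), verify that the settings took effect:
>ip addr show eth0        # the IPADDR from ifcfg-eth0 should appear here
>route -n                 # the GATEWAY should show up as the default route
>cat /etc/resolv.conf     # the nameserver entry should still be present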






Thursday, June 9, 2016

MapR-Hue and MySQL Integration


Prerequisites:
OS: CentOS-6.6/Redhat-6.7
MapR-5.1
Hue-3.9
MySQL

Steps:
Here Hue and MySQL are installed on different nodes:
MySQL host: Host1
MapR-Hue host: Host2

On Host1, do the steps below. Before installing each package, first confirm whether it is already present using the rpm -qa command.
Step 1:
>rpm -qa | grep mysql-devel
If it is not installed, install it:
>yum install mysql-devel

Step 2:
>rpm -qa | grep mysql-connector-java
If it is not installed, install it:
>yum install mysql-connector-java

Step 3:
> rpm -qa | grep mysql-server
If it is not installed, install it:
>yum install mysql-server

Step 4: Change the /etc/my.cnf file as follows:
>vi /etc/my.cnf
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
bind-address=Host1
#The hostname/IP of the host where MySQL is installed.
default-storage-engine=InnoDB
sql_mode=STRICT_ALL_TABLES

Step 5:
Start the MySQL daemon.
$ sudo service mysqld start

Step 6:
Set the MySQL root password; on a fresh install no password is set, so set one now.
>/usr/bin/mysql_secure_installation
Enter current password for root (enter for none):
OK, successfully used password, moving on...
[...]
Set root password? [Y/n] y
New password:x
Re-enter new password:x
Remove anonymous users? [Y/n] Y
[...]
Disallow root login remotely? [Y/n] N
[...]
Remove test database and access to it [Y/n] Y
[...]
Reload privilege tables now? [Y/n] Y
All done!
(# Here the password has been set to "x".)
Step 7:
Drop any existing "hue" database (skip this if it does not exist), then create a database for Hue named "hue" in MySQL using the commands below:
mysql> drop database hue;
mysql> create database hue;
Query OK, 1 row affected (0.00 sec)

Here Host2 is the hostname of the Hue installation host.
mysql> grant all on hue.* to 'hue'@Host2 identified by 'x';
Query OK, 0 rows affected (0.00 sec)
# Here 'x' is the MySQL password for the hue user.

mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
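
To confirm the grant works across the network, you can log in from Host2 (assuming the mysql client package is installed there):
>mysql -h Host1 -u hue -p hue
# Enter the password ("x" above); landing at the mysql> prompt confirms the grant.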







Step 8:
On the Hue host (Host2), edit the hue.ini file with this information (the [[database]] subsection lives under the [desktop] section):
> vi /opt/mapr/hue/hue-3.9.0/desktop/conf/hue.ini

[[database]]
host=Host1
port=3306
engine=mysql
user=hue
password=x
name=hue

Add the details below under "# Settings for the RDBMS application":
> vi /opt/mapr/hue/hue-3.9.0/desktop/conf/hue.ini

[librdbms]

[[databases]]
# MySQL settings area: fill these in under the [[[mysql]]] subsection.
[[[mysql]]]
host=Host1
port=3306
engine=mysql
user=hue
password=x
name=hue

Step 9:
Do these steps on the Hue installation host, i.e. Host2.
>/opt/mapr/hue/hue-3.9.0/build/env/bin/hue dumpdata > bkpHueData.json
In the dump file, delete every record whose "model" field is "useradmin.userprofile", then save the file.
Verify that no "useradmin.userprofile" strings remain in the backup file:
>cat bkpHueData.json | grep useradmin.userprofile
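
If you prefer to script that cleanup, here is a minimal sketch (assuming the dump is a single JSON array of records, which is what the Django dumpdata behind this command emits):
>python - <<'EOF'
import json
# Load the dump, drop the useradmin.userprofile records, and write it back.
with open('bkpHueData.json') as f:
    records = json.load(f)
records = [r for r in records if r.get('model') != 'useradmin.userprofile']
with open('bkpHueData.json', 'w') as f:
    json.dump(records, f)
EOF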

Step 10:
Restart the Hue service:
> maprcli node services -name hue -action restart -nodes `hostname`

Run the syncdb operation:
/opt/mapr/hue/hue-3.9.0/build/env/bin/hue syncdb --noinput

Then run the migrations:
/opt/mapr/hue/hue-3.9.0/build/env/bin/hue migrate

Step 11:
On the MySQL host (Host1), clear the content-type table:
mysql> DELETE FROM hue.django_content_type;

Step 12:
On Host2 (the Hue installation host), run this command:
> /opt/mapr/hue/hue-3.9.0/build/env/bin/hue loaddata bkpHueData.json

Open the Hue UI in a browser at "http://Host2:8888".

Try adding a user through the "Add User" area as the mapr user, then check the updated table info in MySQL using the command below:
mysql> select * from auth_user;
(It will display the recently added user's info in this table.)


Thursday, June 2, 2016

Configuring MapR-5.0.0 cluster on RHEL-6.7 OS.

Here I am adding the very basic steps to follow for configuring a MapR-5.0.0 cluster on the Redhat-6.7 operating system.
Please follow the steps below for configuring.
Note:
Here we have 2 disks in each node.
Cluster Information:
Cluster Version : MapR-5.0.0
OS version : RHEL-6.7
Number of nodes : 3 nodes
Login as : mapr/mapr
(Note: this assumes each node meets the basic hardware requirements for the cluster.)

Steps:
Step 1:
Add the hostnames and IP addresses of all the cluster nodes to the /etc/hosts file on each node.
$sudo vi /etc/hosts
172.20.10.10 host10
172.20.10.11 host11
172.20.10.12 host12
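
A quick reachability check (assuming the hostnames above) can save debugging later:
$for h in host10 host11 host12; do ping -c 1 $h; done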

Step 2:
On each node, list the raw disks to be used by MapR in a file called disks.txt:
$sudo vi /tmp/disks.txt
xvdb
xvdc

Step 3:
Add the mapr group and user on each node:
$sudo groupadd -g 5000 mapr
$sudo useradd -g 5000 -u 5000 mapr

Step 4:
Set the SELINUX parameter to permissive on each node:
$sudo vi /etc/selinux/config
SELINUX=permissive

Step 5:
Flush the iptables rules (this clears the firewall only until reboot) using the command below on each node:
$sudo iptables -F
Or use the two commands below to disable the firewall permanently on each node:
$chkconfig iptables off
$service iptables stop
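
To confirm, check the status (on RHEL 6 it should report that the firewall is not running):
$service iptables status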

Step 6:
Add the MapR core and ecosystem repos to a maprtech.repo file on each node:
$sudo vi /etc/yum.repos.d/maprtech.repo
[maprtech]
name=MapR Technologies
baseurl=http://package.mapr.com/releases/v5.0.0/redhat/
enabled=1
gpgcheck=0
protect=1
[maprecosystem]
name=MapR Technologies
baseurl=http://package.mapr.com/releases/ecosystem-5.x/redhat
enabled=1
gpgcheck=0
protect=1


Step 7:
Download the EPEL release rpm on each node:
$sudo wget http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm

Step 8:
Install the above downloaded rpm.
$sudo rpm -Uvh epel-release-6-8.noarch.rpm

Step 9:
Import the MapR GPG key:
$sudo rpm --import http://package.mapr.com/releases/pub/maprgpg.key

Step 10:
Install java-1.7.0-openjdk-devel, if it is not already installed:
$sudo yum install java-1.7.0-openjdk-devel

Step 11:
Install the MapR core packages, i.e. the services each node requires (e.g. mapr-fileserver on every node; mapr-cldb and mapr-zookeeper on the designated nodes):
$sudo yum install mapr-fileserver
$sudo yum install mapr-cldb mapr-zookeeper

Step 12:
List out the CLDB and ZooKeeper nodes, then run configure.sh, which is present in the "/opt/mapr/server/" folder, on each node.
List of CLDB installed nodes : 172.20.10.10,172.20.10.11
List of Zookeeper installed nodes : 172.20.10.10,172.20.10.11,172.20.10.12
$sudo /opt/mapr/server/configure.sh -C 172.20.10.10,172.20.10.11 -Z 172.20.10.10,172.20.10.11,172.20.10.12 -N 3Node-MapR-5.0-Cluster

Step 13:
Format the disks listed in disks.txt on each node:
$sudo /opt/mapr/server/disksetup -F /tmp/disks.txt
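
With configure.sh and disksetup done, a typical next step (assuming the init scripts installed by the MapR packages) is to start ZooKeeper on the ZooKeeper nodes and then Warden on every node:
$sudo service mapr-zookeeper start
$sudo service mapr-warden start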

Step 14:
Use the command below to check which services have started on each node:
$ sudo maprcli node list -columns svc
Note: 
Please refer to the link below for more detailed info:
http://doc.mapr.com/display/MapR/Advanced+Installation+Topics

Thursday, May 26, 2016

Working with the Sqoop2 import command



Note:
Please configure PostgreSQL as the Sqoop2 metastore, then follow the steps below.
For configuring it, please follow the previous post "PostgreSQL configuration for Sqoop2".

These steps were tested on MapR-cluster-5.0, CentOS-6.6.

STEPS:
Step 1:
Check the available links and connectors:
sqoop:000> show link
+----+------+--------------+----------------+---------+
| Id | Name | Connector Id | Connector Name | Enabled |
+----+------+--------------+----------------+---------+
+----+------+--------------+----------------+---------+

sqoop:000> show connector
+----+------------------------+------------------+-------------------------------------------------------+----------------------+
| Id | Name                   | Version          | Class                                                 | Supported Directions |
+----+------------------------+------------------+-------------------------------------------------------+----------------------+
| 1  | kite-connector         | 1.99.6-mapr-1507 | org.apache.sqoop.connector.kite.KiteConnector         | FROM/TO              |
| 2  | kafka-connector        | 1.99.6-mapr-1507 | org.apache.sqoop.connector.kafka.KafkaConnector       | TO                   |
| 3  | hdfs-connector         | 1.99.6-mapr-1507 | org.apache.sqoop.connector.hdfs.HdfsConnector         | FROM/TO              |
| 4  | generic-jdbc-connector | 1.99.6-mapr-1507 | org.apache.sqoop.connector.jdbc.GenericJdbcConnector  | FROM/TO              |
+----+------------------------+------------------+-------------------------------------------------------+----------------------+


Step 2:
Create a link for the RDBMS (the database from which we would like to import data).
(NOTE: Provide the connector Id of the generic-jdbc-connector for the -c argument. In this example the Id is 4.)

sqoop:000> create link -c 4
Creating link for connector with id 4
Please fill following values to create new link object
Name: <mysql>
Link configuration
JDBC Driver Class: com.mysql.jdbc.Driver
JDBC Connection String: jdbc:mysql://<DB HostName>/<Database>
Username: <sqoop>
Password: <*****>
JDBC Connection Properties:<Optional>
There are currently 0 values in the map:
entry#
New link was successfully created with validation status OK and persistent id 2

sqoop:000> show link
+----+-------+--------------+------------------------+---------+
| Id | Name | Connector Id | Connector Name | Enabled |
+----+-------+--------------+------------------------+---------+
| 2 | mysql | 4 | generic-jdbc-connector | true |
+----+-------+--------------+------------------------+---------+


Step 3:
Create a link for the import destination, i.e. the MapR-FS (MFS) location.

sqoop:000> create link -c 3
Creating link for connector with id 3
Please fill following values to create new link object
Name: maprfs
Link configuration
HDFS URI:maprfs://<CLDB HostName>:7222
Hadoop conf directory: /opt/mapr/hadoop/hadoop-0.20.2/conf
New link was successfully created with validation status OK and persistent id 4

sqoop:000> show link
+----+--------+--------------+------------------------+---------+
| Id | Name | Connector Id | Connector Name | Enabled |
+----+--------+--------------+------------------------+---------+
| 2 | mysql | 4 | generic-jdbc-connector | true |
| 4 | maprfs | 3 | hdfs-connector | true |
+----+--------+--------------+------------------------+---------+


Step 4:
Create a Job

sqoop:000> create job --from 2 --to 4
Creating job for links with from id 2 and to id 4
Please fill following values to create new job object
Name: testjob
From database configuration
Schema name: mysql
Table name: <TableName>
Table SQL statement:<Optional>
Table column names:<Optional>
Partition column name: <Provide a ColumnNamefor Partitioning>
Null value allowed for the partition column: true
Boundary query:<Optional>
Incremental read
Check column:<Optional>
Last value:<Optional>
To HDFS configuration
Override null value:<Optional>
Null value:<Optional>
Output format:
0 : TEXT_FILE
1 : SEQUENCE_FILE
Choose: 0
Compression format:
0 : NONE
1 : DEFAULT
2 : DEFLATE
3 : GZIP
4 : BZIP2
5 : LZO
6 : LZ4
7 : SNAPPY
8 : CUSTOM
Choose: 0
Custom compression format:
Output directory: </MFS LOCATION NAME>
Append mode:<Optional>
Throttling resources
Extractors:<Optional>
Loaders:<Optional>
New job was successfully created with validation status OK and persistent id 12

sqoop:000> show job

sqoop:000> start job -j <Job Id>
Ex: start job -j 12
Submission details
Job ID: 12
Server URL:
Created by: mapr
Creation date:
Lastly updated by: mapr
External ID: job_<ID>
http://<Host>:8088/proxy/application_1461206632562_0005/
Source Connector schema: Schema{TABLE SCHEMA WILL BE DISPLAYED HERE}

sqoop:000> status job -j <JobID>
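
Once the job reports success, you can list the imported files in the target directory from a cluster node (using the MFS location chosen in the job definition):
> hadoop fs -ls <MFS LOCATION NAME>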

PostgreSQL configuration for Sqoop2




Please follow the steps below to configure PostgreSQL for Sqoop2.
(Sqoop2 will store its metastore in PostgreSQL.)
These steps are specific to a MapR-cluster environment.

Required Steps:
Step 1:
Install the PostgreSQL server using the command below:
$ yum install postgresql-server

Step 2:
Initialize the PostgreSQL data directory using the command below:
$ service postgresql initdb

Step 3: Change the parameter in the file specified below:
$ vim /var/lib/pgsql/data/postgresql.conf
listen_addresses = '10.10.71.19'
#Note : use the IP of the host where PostgreSQL is installed.

Step 4:
Add the parameters below to the specified file:
$ vim /var/lib/pgsql/data/pg_hba.conf

# "local" is for Unix domain socket connections only
#local all all ident
local all all trust
# IPv4 local connections:
#host all all 127.0.0.1/32 trust
host all all 10.10.72.78/32 trust

# IPv6 local connections:
host all all ::1/128 ident

Step 5:
Comment out the existing repository parameters in the file below and add the new PostgreSQL values:
$ vi /opt/mapr/sqoop/sqoop-2.0.0/server/conf/sqoop.properties

org.apache.sqoop.repository.jdbc.handler=org.apache.sqoop.repository.postgresql.PostgresqlRepositoryHandler
org.apache.sqoop.repository.jdbc.transaction.isolation=READ_COMMITTED
org.apache.sqoop.repository.jdbc.maximum.connections=10
org.apache.sqoop.repository.jdbc.url=jdbc:postgresql://10.10.72.110:5432/sqoop
org.apache.sqoop.repository.jdbc.driver=org.postgresql.Driver
org.apache.sqoop.repository.jdbc.user=sqoop
org.apache.sqoop.repository.jdbc.password=sqoop
#org.apache.sqoop.repository.jdbc.properties.property=value


Step 6:
Download the PostgreSQL JDBC driver jar (the org.postgresql.Driver configured in Step 5).
Download link:
Place the downloaded jar into this location:
/opt/mapr/sqoop/sqoop-2.0.0/lib

Step 7: Execute the commands below to enable and start the service:
$ chkconfig postgresql on
$ service postgresql start

Step 8:
Open the PostgreSQL shell using the command below:
$ psql -U postgres

Step 9:
Create the sqoop role and database using the commands below:

$ CREATE ROLE sqoop LOGIN ENCRYPTED PASSWORD 'sqoop'
NOSUPERUSER INHERIT CREATEDB NOCREATEROLE;

$ CREATE DATABASE "sqoop" WITH OWNER = sqoop TABLESPACE = pg_default;

Step 10:
Log in as the postgres user and start the server (if it is not already running):
$/usr/bin/pg_ctl -D /var/lib/pgsql/data -l logfile start
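
As a quick connectivity check against the metastore database (assuming the host and port from the sqoop.properties above):
$ psql -h 10.10.72.110 -p 5432 -U sqoop -d sqoop -c '\conninfo'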



Oozie installation on the MapR platform for an unsecured cluster



Steps:
Log in as the root user and follow the steps below.
Step 1:
$ cd /opt/mapr
$ yum install mapr-oozie

Step 2: Add the properties below to the "core-site.xml" file. (These proxyuser settings let Oozie, which runs as the mapr user, impersonate other users.)

$vi /opt/mapr/hadoop/hadoop-2.7.0/etc/hadoop/core-site.xml
 <property>
  <name>hadoop.proxyuser.mapr.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.mapr.groups</name>
  <value>*</value>
</property>

Step 3: Re-configure the cluster using the command below:
$/opt/mapr/server/configure.sh   -R

Step 4: Export the OOZIE_URL in the CLI (pointing at the Oozie host on port 11000).

$export OOZIE_URL=http://10.10.80.242:11000/oozie

Step 5:
Start (or restart) the Oozie service from the CLI using the command below:
$maprcli node services -name oozie -action restart -nodes `hostname`

Step 6: Check the list of running services using the command below:
$maprcli node list -columns svc
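
With OOZIE_URL exported (Step 4), you can also check the Oozie server itself; a quick sketch, assuming the oozie client script shipped with the package is on your PATH:
$oozie admin -status
System mode: NORMAL
(# A healthy server reports NORMAL.)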

Note:
These steps are for a non-secure cluster only.


HP Vertica cluster to the HDFS platform using Sqoop




Follow the steps below to import data from an HP Vertica cluster to HDFS.

Step 1:
Please download recent versions of the jars below and add them to the Sqoop library directory:
vertica-jdbc-7.1.2-0.jar
vertica-jdk5-6.1.3-0.jar
hadoop-vertica.jar

Step 2:
Please use the command below to run the import using Sqoop:

> sqoop import \
    --driver com.vertica.jdbc.Driver \
    --connect jdbc:vertica://<HOSTNAME>:5433/<DATABASE-NAME> \
    --username <UNAME> \
    -P \
    --table <TABLE-NAME> \
    --target-dir <TARGET-DIRECTORY-NAME> \
    --as-textfile \
    -m <No-Mappers>
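
After the import completes, the map tasks write part files under the target directory; you can spot-check the first one:
> hadoop fs -ls <TARGET-DIRECTORY-NAME>
> hadoop fs -cat <TARGET-DIRECTORY-NAME>/part-m-00000 | head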