Thursday, November 27, 2014

installing hadoop on aws free instance - part 2

second time on AWS

All AWS instances were stopped earlier. Starting them again gives each one a new ID (public DNS name), so the configuration files must be updated.

For convenience, the host names are listed below:
Master
MasterID :- ec2-54-173-207-5.compute-1.amazonaws.com
Slaves
Slave1 :- ec2-54-165-137-226.compute-1.amazonaws.com
Slave2 :- ec2-54-173-232-142.compute-1.amazonaws.com

# connect to the master
# go to the directory where the PEM file is stored and enter the following commands one after another. They should be repeated each time a terminal is opened.

eval `ssh-agent`
ssh-add awssecuritykey.pem
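
To confirm the key was actually loaded into the agent, list the loaded identities; the PEM's fingerprint should appear:

# list keys currently held by ssh-agent
ssh-add -l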

# Update the following fields with the new IDs/IPs

#core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://ec2-54-173-232-142.compute-1.amazonaws.com:8020</value>
    <final>true</final>
  </property>
</configuration>


#mapred-site.xml
----------

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hdfs://ec2-54-173-232-142.compute-1.amazonaws.com:8021</value>
    <final>true</final>
  </property>
</configuration>


# transfer the files to the slaves
scp core-site.xml mapred-site.xml ubuntu@ec2-54-172-79-68.compute-1.amazonaws.com:/home/ubuntu/hadoop/conf

scp core-site.xml mapred-site.xml ubuntu@ec2-54-174-72-143.compute-1.amazonaws.com:/home/ubuntu/hadoop/conf
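
With more slaves, the same transfer can be scripted; a minimal sketch, assuming the host list is kept in sync with the current IDs:

# push the updated configs to every slave in one loop
for h in ec2-54-172-79-68.compute-1.amazonaws.com \
         ec2-54-174-72-143.compute-1.amazonaws.com; do
  scp core-site.xml mapred-site.xml ubuntu@$h:/home/ubuntu/hadoop/conf
done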


#slaves file
# on the master machine, list both slave IDs (host names) in the slaves file:
ec2-54-165-137-226.compute-1.amazonaws.com
ec2-54-173-232-142.compute-1.amazonaws.com


# on each slave machine, only that machine's own ID should be entered. Connect to the slaves and update their slaves files:
# on slave 1, the "slaves" file will contain only the line "ec2-54-165-137-226.compute-1.amazonaws.com".
# on slave 2, the "slaves" file will contain only the line "ec2-54-173-232-142.compute-1.amazonaws.com".

# on master
cd hadoop/bin
start-all.sh
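
To verify the daemons actually started, run jps (ships with the JDK) to list the running Java processes:

# on the master, expect NameNode, JobTracker and (typically) SecondaryNameNode;
# on each slave, expect DataNode and TaskTracker
jps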

# run the following example job to test the cluster
hadoop jar $HADOOP_HOME/hadoop-examples-1.2.1.jar pi 16 100000
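
HDFS health can also be checked from the shell; the report should list both datanodes as live:

# summary of capacity and live/dead datanodes
hadoop dfsadmin -report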

Monday, November 24, 2014

installing hadoop on aws free instance


Assumes:-
Starting from a clean, basic Ubuntu AMI.
If running free instances from AWS, the IP addresses/IDs may change on every restart and need to be updated in the files each time.
Using PuTTY from Windows for connecting to the AWS instances.


For convenience, the host names are listed below:
Master
MasterID :- ec2-54-174-46-166.compute-1.amazonaws.com
Slaves
Slave1 :- ec2-54-165-137-226.compute-1.amazonaws.com
Slave2 :- ec2-54-173-232-142.compute-1.amazonaws.com

PEM file :-
awssecuritykey.pem

PPK file :-
PuttyKey.ppk

Connecting to an AWS instance using PuTTY:-
1) Convert awssecuritykey.pem into PuttyKey.ppk using PuTTYgen.
2) Open PuTTY
a) Go to Session; in Host Name enter username@ID of the master, port 22.
b) Go to Connection --> SSH --> Auth; under "Private key file for authentication" click "Browse" and select the PPK key generated in the first step.
It may ask to store the server's host key in the known hosts; select "Yes" after checking.


# Run the following commands on each node (master and slaves)
sudo apt-get update
sudo apt-get upgrade

#install JDK
sudo apt-get install openjdk-7-jdk
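
A quick sanity check that the JDK landed where the JAVA_HOME setting used later expects it:

java -version                          # should report OpenJDK 1.7
ls /usr/lib/jvm/java-7-openjdk-amd64   # path used as JAVA_HOME below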

# Copy your PEM file from the local folder to all your instances, so that SSH connections between the servers can be authenticated.
# Use the pscp command, available in the PuTTY folder. WinSCP, which has a GUI, can also be used from Windows.

pscp -i D:\Downloads\PuttyKey.ppk D:\Downloads\awssecuritykey.pem ubuntu@ec2-54-174-46-166.compute-1.amazonaws.com:/home/ubuntu/.ssh

pscp -i D:\Downloads\PuttyKey.ppk D:\Downloads\awssecuritykey.pem ubuntu@ec2-54-165-137-226.compute-1.amazonaws.com:/home/ubuntu/.ssh

pscp -i D:\Downloads\PuttyKey.ppk D:\Downloads\awssecuritykey.pem ubuntu@ec2-54-173-232-142.compute-1.amazonaws.com:/home/ubuntu/.ssh


# SSH is strict about key-file permissions; to fix this, issue the following commands in the ~/.ssh directory

chmod 644 authorized_keys

# Quick Tip: If you set the permissions to 'chmod 644', you get a file that can be written by you, but can only be read by the rest of the world.

chmod 400 awssecuritykey.pem

#Quick Tip: chmod 400 is a very restrictive setting, giving only the file owner read-only access: no write/execute capabilities for the owner, and no permissions whatsoever for anyone else.
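
Both changes can be confirmed with ls; the expected permission strings are shown as comments:

ls -l ~/.ssh/authorized_keys      # expect -rw-r--r--
ls -l ~/.ssh/awssecuritykey.pem   # expect -r--------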

#To use ssh-agent and ssh-add, follow the steps below:
# go to the directory where the PEM file is stored and enter the following commands one after another. They should be repeated each time a terminal is opened.

eval `ssh-agent`
ssh-add awssecuritykey.pem

# check SSH to localhost
ssh localhost # if no error is observed, SSH is working


# Note: in the eval command above, make sure you use the backquote ( ` ), located under the tilde ( ~ ), rather than the single quote ( ' ).
# The .pem file now has read-only permission, so this time ssh-add accepts it.
#checking remote SSH

# ssh ubuntu@<your-amazon-public-URL>; if you are on the master, try connecting to slave1 or slave2, and vice versa:
ssh ubuntu@ec2-54-174-46-166.compute-1.amazonaws.com


# download and install Hadoop on each node. The following command downloads into the home folder (/home/ubuntu):
wget http://apache.mirror.gtcomm.net/hadoop/common/hadoop-1.2.1/hadoop-1.2.1.tar.gz

# extracting
tar -xvf hadoop-1.2.1.tar.gz

# renaming for convenience
mv hadoop-1.2.1 hadoop

# optionally remove the downloaded archive
rm hadoop-1.2.1.tar.gz


# updating the PATH (.bashrc on Ubuntu)

# append the following lines to ~/.bashrc:
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export HADOOP_INSTALL=/home/ubuntu/hadoop
export HADOOP_HOME=/home/ubuntu/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
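
Reload the profile and confirm the hadoop command is on the PATH:

source ~/.bashrc
hadoop version   # should print Hadoop 1.2.1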


# Update the following files on the master node only, or update them on the local PC; if updated on the local PC, they must be transferred to the master and slaves via pscp/WinSCP.

# hadoop-env.sh
# search for the JAVA_HOME parameter and update it with the Java home path:

export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
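
Instead of editing by hand, the same change can be made with sed; a sketch, assuming the stock conf file still has the JAVA_HOME line commented out as shipped:

sed -i 's|^# *export JAVA_HOME=.*|export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64|' \
    /home/ubuntu/hadoop/conf/hadoop-env.sh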

#core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://ec2-54-173-232-142.compute-1.amazonaws.com:8020</value>
    <final>true</final>
  </property>
</configuration>


#hdfs-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>




#mapred-site.xml
----------

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hdfs://ec2-54-172-198-182.compute-1.amazonaws.com:8021</value>
    <final>true</final>
  </property>
</configuration>

#========================================================

transfer the above configuration files to hadoop/conf on the slaves

# from master to slaves

cd hadoop/conf

scp hadoop-env.sh core-site.xml hdfs-site.xml mapred-site.xml ubuntu@ec2-54-165-137-226.compute-1.amazonaws.com:/home/ubuntu/hadoop/conf

scp hadoop-env.sh core-site.xml hdfs-site.xml mapred-site.xml ubuntu@ec2-54-174-46-166.compute-1.amazonaws.com:/home/ubuntu/hadoop/conf


# masters file: if the secondary namenode should start on another node, mention that node's ID in this file; otherwise leave it blank.
# on the slave machines, keep the masters file blank.

#slaves file
# on the master machine, list both slave IDs (host names) in the slaves file:
ec2-54-165-137-226.compute-1.amazonaws.com
ec2-54-173-232-142.compute-1.amazonaws.com

# on each slave machine, only that machine's own ID should be entered.
# on slave 1, the "slaves" file will contain only the line "ec2-54-165-137-226.compute-1.amazonaws.com".
# on slave 2, the "slaves" file will contain only the line "ec2-54-173-232-142.compute-1.amazonaws.com".
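
These per-slave edits can also be pushed from the master over SSH instead of logging into each box; a small sketch in which each slave receives only its own host name:

# write each slave's own name into its local slaves file
for h in ec2-54-165-137-226.compute-1.amazonaws.com \
         ec2-54-173-232-142.compute-1.amazonaws.com; do
  ssh ubuntu@$h "echo $h > /home/ubuntu/hadoop/conf/slaves"
done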

#format the HDFS file system (first time only)
hadoop namenode -format

# on the master node, go to the bin directory and run start-all.sh. On Ubuntu, simply typing the command and pressing Enter works.

cd hadoop/bin
start-all.sh



Check the health status through the web UIs:

namenode
http://ec2-54-173-232-142.compute-1.amazonaws.com:50070/dfshealth.jsp

jobtracker
http://ec2-54-173-232-142.compute-1.amazonaws.com:50030/jobtracker.jsp

slave node (tasktracker) status
http://ec2-54-174-46-166.compute-1.amazonaws.com:50060/tasktracker.jsp
http://ec2-54-165-137-226.compute-1.amazonaws.com:50060/tasktracker.jsp

Friday, October 25, 2013

Kill all running excel instances using VBScript


VBScript is very useful for handling Excel applications.
The following code should be placed in a .vbs file.

' force-kill every running Excel.exe process, without confirmation
Set WinShell = CreateObject("WScript.Shell")
Set oExcelKill = WinShell.Exec("taskkill /F /IM Excel.exe")
Set oExcelKill = Nothing
Set WinShell = Nothing

The above code will kill all currently running Excel instances without asking for any confirmation.



Tuesday, July 9, 2013

Copy an open file through Excel VBA

To copy an open file through VBA, the FileCopy statement will not work; instead, follow the steps below to copy an open file.

1. Go to VBA Editor

2. Add a reference to the Microsoft Scripting Runtime (Tools --> References --> Microsoft Scripting Runtime)

3. Below is the sample Code 

Sub test()
    ' requires the Microsoft Scripting Runtime reference added above
    Dim CopyOpenFile As New FileSystemObject
    ' CopyFile works even while the source workbook is open
    CopyOpenFile.CopyFile "c:\test\asd.xlsx", "c:\test2\asd.xlsx", True
End Sub

c:\test\asd.xlsx - Source 
c:\test2\asd.xlsx - Destination 
True - Overwrite (False - Do not Overwrite) 

Wednesday, April 3, 2013

Excel Slicer for Pivot table and Grand Total: Workaround

If you create a slicer for a pivot table and then want to see the details behind the "Grand Total", it may give you the wrong result.

e.g.

1. We have sales data as shown below.

[screenshot: sales data table]
2. And we have created a pivot table from it (without any report filters):

   Report filter: none
   Column labels: Sales Person
   Values: Sum of Unit sold

[screenshot: pivot table]
3. Now we insert a slicer for months.

[screenshot: month slicer]
4. Select any month, e.g. January.

[screenshot: slicer with January selected]
5. You can see now Grand total is 3, if you want to view details of those 3 Sales persons who sold units in January, double click on 3; but you will see all data instead of only 3 people