Mounting and Unmounting Linux EBS volumes on AWS

You can use the following Linux commands to mount and unmount an EBS volume. If you are new to Linux, especially on a cloud infrastructure like AWS, they should be useful.

AWS Instance Type: Amazon Linux (Redhat version)

1. lsblk – Lists all block devices (volumes) and their mount points
2. Then use the following to create a file system on the newly attached volume (this erases any data already on it)
>> sudo mke2fs /dev/xvdf
3. Mount the volume on an existing folder
>> sudo mount /dev/xvdf /mnt
4. Now run lsblk again. You can see that the /dev/xvdf device is mounted on the /mnt directory.
5. Now you can copy files to the mounted folder
6. If you want to unmount the volume, you can use the following
>> sudo umount /mnt
That's it!
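To make the check in step 4 repeatable, the lsblk output can be parsed in a small script. The following is a minimal Python sketch, not part of the original steps: it assumes the list-style output of lsblk -l -n -o NAME,MOUNTPOINT, and the SAMPLE text and the helper names parse_lsblk and is_mounted are hypothetical.

```python
import subprocess


def parse_lsblk(output):
    """Map each device name to its mount point (None when not mounted)."""
    mounts = {}
    for line in output.splitlines():
        parts = line.split(None, 1)
        if not parts:
            continue  # skip blank lines
        name = parts[0]
        mountpoint = parts[1].strip() if len(parts) > 1 else None
        mounts[name] = mountpoint
    return mounts


def is_mounted(device, mountpoint, output):
    """Return True if the device is mounted on the given directory."""
    return parse_lsblk(output).get(device) == mountpoint


def read_lsblk():
    """Run lsblk on a real instance (not called in this sketch)."""
    command = ["lsblk", "-l", "-n", "-o", "NAME,MOUNTPOINT"]
    return subprocess.run(command, capture_output=True, text=True, check=True).stdout


# Hypothetical output, shaped like the volume layout used in this post.
SAMPLE = """\
xvda
xvda1 /
xvdf  /mnt
"""

if __name__ == "__main__":
    print(parse_lsblk(SAMPLE))
    print(is_mounted("xvdf", "/mnt", SAMPLE))
```

On an instance, passing read_lsblk() instead of SAMPLE would check the live state.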


Apache Spark on Ubuntu – Part 01

1.0 Introduction

Spark is a fast, general-purpose cluster computing system for Big Data, written in Scala. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools including Spark SQL for SQL and DataFrames, MLlib for machine learning, GraphX for graph processing, and Spark Streaming for stream processing. Apache Spark is built on one main concept, the “Resilient Distributed Dataset (RDD)”.

Spark Components

Python vs Scala vs Java with Spark?

The most popular languages associated with Spark are Python and Scala. Both have a concise syntax and, compared to Java, are quite easy to follow. However, Scala tends to be faster than Python, mainly because Spark itself is written in Scala, so it avoids the delay of going through another set of libraries, as happens when you use Python. In general, though, both are capable of doing the task in almost all use cases.

Spark with Python

You may install Python using Canopy

Use this link to download the binaries to your system. (This post uses the Linux 64-bit Python 3.5 download.)

Once you have installed Canopy, you have a Python development environment to work with Spark, with all the libraries including PySpark.

Once all of these are installed, you can try PySpark by typing “pyspark” in a terminal window.

$ pyspark

This will allow you to continue to execute your Python scripts on Spark.

Installing Apache Spark

To complete this task, follow the steps below one by one.

Step 1: Install Java

- I assume you already have the Java Development Kit (JDK) installed on your machine. As of March 2018, the JDK version supported by Spark is JDK 8.

- You may verify the Java installation

$ java -version

Step 2: Install Scala

- If you do not have Scala installed in your system, use this link to install it.

- Get the “tgz” bundle and extract it to the /usr/local/scala folder (this is a best practice)

$ tar xvf scala-2.11.12.tgz

// Move the extracted Scala folder to /usr/local/scala
$ sudo mv scala-2.11.12 /usr/local/scala

- Then update .bashrc to set SCALA_HOME and add $SCALA_HOME/bin to the $PATH.

- Finally, verify the Scala installation

$ scala -version

Step 3: Install Spark

After installing both Java and Scala, you are ready to download Spark. Use this link to download the “tgz” file.

$ tar xvf spark-2.3.0-bin-hadoop2.7.tgz
// Move the extracted Spark folder to /usr/local/spark
$ sudo mv spark-2.3.0-bin-hadoop2.7 /usr/local/spark

- Then update .bashrc to set SPARK_HOME and add $SPARK_HOME/bin to the $PATH.

- Now you may verify the Spark installation

$ spark-shell

If all goes well, you will see the Spark prompt displayed. Congratulations!


Securing AWS Lambda Functions

The Default Security – (Permissions)

By default, Lambda functions are “not” authorized to access other AWS services. Hence, you must explicitly grant access (permissions) to each AWS service the function uses (e.g. accessing S3 to store images, or accessing databases such as DynamoDB). These permissions are managed through AWS IAM roles.

Changing the Default Security – (Permissions)

If you are using the Serverless Framework, you can customize the default settings by changing the serverless.yml file (in the “iamRoleStatements:” block).

For example,

iamRoleStatements:
    - Effect: "Allow"
      Action:
        - "lambda:*"
      Resource:
        - "*"

The above will “Allow” the function to perform every Lambda action (“lambda:*”) on all resources (“*”).
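A wildcard statement like this grants far more than most functions need. As a rough illustration of a narrower grant, the Python sketch below builds a statement with the same Effect/Action/Resource structure but with explicit actions and resources. The helper iam_statement and the bucket name my-image-bucket are hypothetical and are not part of the Serverless Framework.

```python
import json


def iam_statement(actions, resources, effect="Allow"):
    """Build one IAM statement with explicit actions and resources."""
    actions = list(actions)
    resources = list(resources)
    if effect not in ("Allow", "Deny"):
        raise ValueError("effect must be 'Allow' or 'Deny'")
    if "*" in actions or "*" in resources:
        # This sketch deliberately refuses blanket wildcards.
        raise ValueError("wildcard '*' is rejected in this sketch")
    return {"Effect": effect, "Action": actions, "Resource": resources}


# Hypothetical bucket name, used only for illustration.
statement = iam_statement(
    actions=["s3:GetObject", "s3:PutObject"],
    resources=["arn:aws:s3:::my-image-bucket/*"],
)

if __name__ == "__main__":
    print(json.dumps([statement], indent=2))
```

The printed structure mirrors what one entry under “iamRoleStatements:” would contain.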

The Default Security – (Network)

By default, Lambda functions are not launched in a VPC. You can change this by configuring the Lambda function to run within a VPC. Furthermore, you can apply “Security Groups” as an additional layer of security within the VPC.

Changing the Default Security – (Network)

If you are using the Serverless Framework, you can customize the default settings by changing the serverless.yml file. Here is a code snippet that you might use for this.

provider:
  name: aws
  runtime: python2.7
  profile: serverless-admin
  region: us-east-1

  vpc:
    securityGroupIds:
      - <security-group-id>
    subnetIds:
      - <subnet-1>
      - <subnet-2>

The Serverless Framework with AWS

The Serverless Framework (https://serverless.com/framework/) is an open-source CLI for building and deploying serverless architectures on cloud providers (AWS, Microsoft Azure, IBM OpenWhisk, Google Cloud Platform, etc).

This article briefly covers the important steps you need to get started on the AWS platform. The Framework works well with CI/CD tools and has full support for AWS CloudFormation, through which it can provision your AWS Lambda functions, events, and infrastructure resources.

Step 1: Installing NodeJS

Serverless is a Node.js CLI tool, so the first thing you need to do is install Node.js on your machine. Refer to the official Node.js web site, download the installer, and follow the instructions to install it.

Serverless Framework runs on Node v6.5.0 or higher. You can verify that NodeJS is installed successfully by executing node -v in your terminal.

If all is fine, we can proceed to the second step.

Step 2: Installing Serverless Framework

$ npm install -g serverless

Once installed, you may verify it.

$ serverless --version

Step 3: Setting up Cloud Provider (AWS) Credentials

The Serverless Framework needs access to your cloud provider’s account so that it can create and manage resources on your behalf. You can set it up by following this YouTube link.

Once the above is completed, you can add the AWS credentials to your client machine so that the CLI can use them. Use the following command to do that.

$ serverless config credentials --provider aws --key XXXXXXXXXXXXXXXXX --secret XXXXXXXXXXXXXXXXX --profile serverless-admin

This adds an entry to the credentials file, which is located in the <home-folder>/.aws folder. (This assumes the AWS user is serverless-admin.)

[serverless-admin]
aws_access_key_id = XXXXXXXXXXXXXXXX
aws_secret_access_key = XXXXXXXXXXXXXXXXXXXXXXXXXXX
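The credentials file uses an INI-style layout, so it can be inspected with Python's standard configparser module. The sketch below is only an illustration: the helper name load_profile is hypothetical, and SAMPLE stands in for the real file with placeholder values.

```python
import configparser

# Placeholder content in the same shape as the entry shown above.
SAMPLE = """\
[serverless-admin]
aws_access_key_id = XXXXXXXXXXXXXXXX
aws_secret_access_key = XXXXXXXXXXXXXXXXXXXXXXXXXXX
"""


def load_profile(text, profile):
    """Return the key/value pairs stored under one profile section."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if not parser.has_section(profile):
        raise KeyError("profile not found: " + profile)
    return dict(parser[profile])


if __name__ == "__main__":
    creds = load_profile(SAMPLE, "serverless-admin")
    # Print only the key names, never the secret values.
    print(sorted(creds))
```

To check the real file, its text could be read from the .aws folder in your home directory and passed in place of SAMPLE.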

If all of the above is OK, you are ready to create your first Serverless function (Lambda function) with AWS.

Step 3: Creating your Serverless Project

You may build your projects based on the templates/ archetypes given by the framework.

By default, multiple templates/ archetypes are provided (e.g. “aws-nodejs”, “aws-python”, “aws-python3”, “aws-groovy-gradle”, “aws-java-maven”, “aws-java-gradle”, “aws-scala-sbt”, “aws-csharp”, etc).

So let's create an “aws-python” project for fun.

$ serverless create --template aws-python --path hello-world-python

The above will create a folder named “hello-world-python”.

Browse the folder and you will see two files.

1. handler.py – (This is the Serverless Function. Your Business Logic goes here)

Here, just edit handler.py to produce a simple output.

def hello(event, context):
        print "Hello Crishantha"
        return "Hello World!"

2. serverless.yml – (The Serverless Function Configuration.)

P.Note: Check the following configuration before you execute the rest of the key commands.

If you are new to YAML and know JSON well, you can use the https://www.jason2yaml.com link to convert JSON to YAML and vice versa.

provider:
  name: aws
  runtime: python2.7
  profile: serverless-admin
  region: us-east-1

If all of the above is OK, you are good to go and deploy the function on AWS. So let's move to the next step (Step 4).

Step 4: Deploy the Serverless Function

As explained, move to the “hello-world-python” folder and execute the following command.

$ serverless deploy -v

The above runs the automated deployment, generating everything needed in the background, including the CloudFormation templates, to deploy the application. It is pretty awesome!

Step 5: Invoke the Serverless Function

Use the following to see the output.

$ serverless invoke -f hello -l

The above will return “Hello World!” (the value returned by handler.py), and with the -l option it also shows the function's log output.

It is that simple!!!
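The difference between what a handler returns and what it logs can also be seen without deploying anything. The following is a rough Python 3 sketch of the same handler (the version in this post uses Python 2 print syntax for the python2.7 runtime). The helper invoke_locally is hypothetical; it simply calls the function directly and is not a Serverless Framework API.

```python
import contextlib
import io


def hello(event, context):
    # Printed text goes to the function's log output.
    print("Hello Crishantha")
    # The return value is what the caller receives.
    return "Hello World!"


def invoke_locally(handler, event=None):
    """Call a handler with an empty event and capture (result, log text)."""
    log = io.StringIO()
    with contextlib.redirect_stdout(log):
        result = handler(event or {}, None)
    return result, log.getvalue()


if __name__ == "__main__":
    result, log = invoke_locally(hello)
    print("returned:", result)
    print("logged:", log.strip())
```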

Step 6: Verify

If you want to verify all this, you can log in to the AWS console and check that what you have done is reflected in the AWS Lambda area.

Step 7: Remove All

OK. We just did some testing, so you probably want to remove the serverless function and all its dependencies (IAM roles, CloudWatch log groups, etc).

- Move to the folder of the function that you want to delete.

- Execute the following

$ serverless remove

The above will clean the whole thing up!

So, if you are an AWS developer, you may find it as useful as I do at the moment. Happy Coding!

[References]

1. Serverless Framework Page – https://serverless.com/framework/docs/providers/aws/guide/services/

2. AWS Provider Documentation – https://serverless.com/framework/docs/providers/aws/

3. Serverless AWS Lambda Guide – https://serverless.com/framework/docs/providers/aws/guide/

4. Serverless Framework GitHub – https://github.com/serverless/serverless

5. YAML to JSON tool – https://www.jason2yaml.com

6. The Serverless Framework: A deep overview of the best AWS Lambda + API Gateway Automation Solution – https://cloudacademy.com/blog/serverless-framework-aws-lambda-api-gateway-python/


Connecting to a remote MySQL instance on an AWS EC2 instance

If you have a “self-managed” MySQL EC2 instance, it can be accessed from other EC2 instances in the same VPC or even from other remote machines. To enable this, there are a few configuration changes you need to carry out.

Here are the steps:

1. Connect to the remote MySQL EC2 instance. By default, you can access MySQL using the “root” user. However, it is not advisable to access a MySQL instance remotely using the “root” user, for security reasons.

[P.Note: Please make sure that port 3306 is added to the inbound rules of the EC2 Security Group prior to attempting this.]

2. Change the <bind-address> parameter to 0.0.0.0, allowing access from all remote addresses. This needs to be changed in the /etc/mysql/mysql.conf.d/my.cnf file.

3. Restart the MySQL instance

mysql-ec2-instance>> sudo /etc/init.d/mysqld restart

4. Create a new MySQL user for remote access. For this, sign in to MySQL and execute the following command(s).

mysql-ec2-instance>> mysql -u root -p<root-password>

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> CREATE USER 'user'@'localhost' IDENTIFIED BY 'user123';

mysql> CREATE USER 'user'@'%' IDENTIFIED BY 'user123';

mysql> GRANT ALL PRIVILEGES ON *.* to user@localhost IDENTIFIED BY 'user123' WITH GRANT OPTION;

mysql> GRANT ALL PRIVILEGES ON *.* to user@'%' IDENTIFIED BY 'user123' WITH GRANT OPTION;

mysql> FLUSH PRIVILEGES;

mysql> EXIT;

5. Now exit from the EC2 MySQL instance and try to log into the MySQL EC2 instance from your local machine.

your-local-machine>> mysql -h <ec2-public-dns-name> -u user -puser123

If all is fine, you should be able to sign in to the remote EC2 instance without any issue!
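If the sign-in hangs or is refused, it helps to first confirm that port 3306 is reachable at all, which separates Security Group or bind-address problems from MySQL user problems. Below is a minimal Python sketch using only the standard library. The helper name port_open is hypothetical, and it only tests the TCP connection; it does not authenticate against MySQL.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Local demonstration: listen on a free port, then probe it.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]
    print(port_open("127.0.0.1", port))
    listener.close()
```

Against the instance, the call would look like port_open("<ec2-public-dns-name>", 3306), with the placeholder replaced by your own host name.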


Docker on Ubuntu 16.04 LTS – [Part 04] Docker Compose

Currently, Docker is the most popular and widely used container management system. In most enterprise applications nowadays, we tend to have components running in separate containers. In such an architecture, “container orchestration” (starting/ shutting down containers and setting up inter-container linkages) is an important factor, and a tool called Fig emerged from the Docker community to handle this requirement. It uses a single YAML file to orchestrate all your Docker containers and configurations. The popularity of Fig led Docker to bring it into its own code base as a separate component called “Docker Compose”.

1. Installing Docker Compose

You are required to follow the steps below:

$ sudo curl -o /usr/local/bin/docker-compose -L "https://github.com/docker/compose/releases/download/1.11.2/docker-compose-$(uname -s)-$(uname -m)"

Set the permissions:

$ sudo chmod +x /usr/local/bin/docker-compose

Now check whether it is installed properly:

$ docker-compose -v

2. Running a Container with Docker Compose

Create a directory called “ubuntu” for a project that pulls an image from Docker Hub. The steps below will download the latest Ubuntu image to the local machine.

$ mkdir ubuntu
$ cd ubuntu

Once this is done, create a configuration file (docker-compose.yml) that describes the container to create.

docker-compose-test:
  image: ubuntu

Now execute the following:

$ docker-compose up // As an interactive job
$ docker-compose up -d // As a daemon job

The above will read docker-compose.yml, pull the relevant images, and bring up the respective container.

Pulling docker-compose-test (ubuntu:latest)...
latest: Pulling from library/ubuntu
e0a742c2abfd: Pull complete
486cb8339a27: Pull complete
dc6f0d824617: Pull complete
4f7a5649a30e: Pull complete
672363445ad2: Pull complete
Digest: sha256:84c334414e2bfdcae99509a6add166bbb4fa4041dc3fa6af08046a66fed3005f
Status: Downloaded newer image for ubuntu:latest
Creating ubuntu_docker-compose-test_1
Attaching to ubuntu_docker-compose-test_1
ubuntu_docker-compose-test_1 exited with code 0

Now execute the following to see whether the ubuntu:latest image has been downloaded and the container has been created.

$ docker images
REPOSITORY                    TAG                 IMAGE ID            CREATED             SIZE
ubuntu                        latest              14f60031763d        4 days ago          120 MB
$ docker ps -a
CONTAINER ID        IMAGE                         COMMAND                  CREATED             STATUS                     PORTS                    NAMES
5705871fe7ed        ubuntu                        "/bin/bash"              2 minutes ago       Exited (0) 2 minutes ago                            ubuntu_docker-compose-test_1

References

1. How to install Docker Compose on Ubuntu 16.04 LTS

2. How to install and use Docker Compose on Ubuntu 14.04 LTS

3. How To Configure a Continuous Integration Testing Environment with Docker and Docker Compose on Ubuntu 16.04

4. How To Install WordPress and PhpMyAdmin with Docker Compose on Ubuntu 14.04

5. Docker Compose (Official Web URL)


Docker on Ubuntu 16.04 LTS – [Part 03] Docker Networking

In my previous post on Docker images, we were able to run certain containers in the foreground. To recall it, here it is:

$ docker run -d -p 80 --name static_web crishantha/static_web  /usr/sbin/apache2ctl -D FOREGROUND

However, with -p 80 Docker publishes the container's port 80 on a randomly chosen host port, so the container is not reachable on a well-known port from outside. If you want to expose it publicly on a fixed port, you need to bind the container's port 80 to a specific port on the host. For example, to map port 80 on the host to port 80 in the container, execute the above command as follows:

$ docker run -d -p 80:80 --name static_web crishantha/static_web  /usr/sbin/apache2ctl -D FOREGROUND

Once this is done, you can reach the container from outside using the host's IP address. Hope this is clear now!


Docker on Ubuntu 16.04 LTS – [Part 02] – Images

In my previous article, I stopped at Docker container management. In this article, I will be covering Docker images.

To run, a typical traditional Linux system needs two file systems:

  1. boot file system (bootfs)
  2. root file system (rootfs)

The bootfs contains the boot loader and the kernel. The user never makes any changes to the boot file system. In fact, soon after the boot process is complete, the entire kernel is in memory, and the boot file system is unmounted to free up the RAM associated with the initrd disk image.

The rootfs includes the typical directory structure we associate with Unix-like operating systems: /dev, /proc, /bin, /etc, /lib, /usr, and /tmp plus all the configuration files, binaries and libraries required to run user applications.

In a traditional Linux boot, the root file system is mounted read-only and then switched to read-write after boot. In Docker, the root file system stays in read-only mode, and Docker takes advantage of a union mount to add more read-only file systems on top of the root file system so that they appear as a single file system. This gives complete control over all the file systems that are added to the Docker container. Finally, when a container is created/ launched, Docker mounts a read-write file system on top of all the other file system image layers. All changes made to the underlying images are stored in this read-write layer, while the original copies are retained in the underlying layers without any changes written to them. This read-write layer + the other layers underneath + the base layer form a Docker container. (See the image below)
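The layering described above can be modelled in a few lines of Python. This is only an analogy, not how Docker is implemented: it uses collections.ChainMap, where lookups fall through the layers from top to bottom and writes only ever touch the top layer. The file paths, layer contents, and the helper new_container are hypothetical.

```python
from collections import ChainMap

# Read-only image layers, modelled as path -> content dictionaries.
base_layer = {"/bin/bash": "bash binary", "/etc/hostname": "base"}
apache_layer = {"/usr/sbin/apache2": "apache binary"}


def new_container(*image_layers):
    """Stack a fresh writable layer on top of the given image layers.

    Layers are listed topmost first; the new empty dict is the
    read-write layer that receives every change.
    """
    return ChainMap({}, *image_layers)


container_a = new_container(apache_layer, base_layer)
container_b = new_container(apache_layer, base_layer)

# A write in one container lands only in that container's top layer.
container_a["/etc/hostname"] = "web01"

if __name__ == "__main__":
    print(container_a["/etc/hostname"])   # value from the writable layer
    print(container_b["/etc/hostname"])   # still the value from the base layer
    print(base_layer["/etc/hostname"])    # the image layer itself is untouched
```

Both containers share the same image layers, which is why creating a second container from an image does not duplicate the image data.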

In Part 01 of this article, we created a container with an ubuntu image. You can see all the available images with:

$ sudo docker images

REPOSITORY TAG    IMAGE ID     CREATED     VIRTUAL SIZE
ubuntu     latest 07f8e8c5e660 4 weeks ago 188.3 MB

You now have the “latest” ubuntu image. If you want a specific version of the image, you need to specify it as a TAG, e.g. ubuntu:12.04. So let's try that now.

$ sudo docker run -t -i --name new_container ubuntu:12.04 /bin/bash

Now, check the image status

$ sudo docker images

REPOSITORY TAG    IMAGE ID     CREATED     VIRTUAL SIZE
ubuntu     latest 07f8e8c5e660 4 weeks ago 188.3 MB
ubuntu     12.04  ac6b0eaa3203 4 weeks ago 132.5 MB

Further, if you want to delete one of the created images you can use,

$ sudo docker rmi <image-id>

While working with multiple images, many unnamed and unwanted (dangling) images can be created. These can take up a lot of disk space, so it is necessary to purge them from the system periodically. Use the following to do the trick:

$ docker rmi $(docker images -q -f dangling=true)

Up to now, we have used the docker run command to create containers. While creating a container, it downloads the given image from Docker Hub, and this download takes some time. If you want to save this time when you are creating the container, you can take the alternate route of first pulling the required image from Docker Hub and then creating the container from the downloaded image. Here are the steps:

// Pulling the image from Docker Hub
$ sudo docker pull fedora

// Creating a Docker Container using the pulled image
$ sudo docker run -i -t fedora /bin/bash

Now if you check, you will have 3 containers.

$ sudo docker ps -a

86476cec9907 fedora:latest ---
4d8b96d1f8b1 ubuntu:12.04  ---
c607547adce2 ubuntu:latest ---

Building your own Docker Images

There are two ways to do this.

Method (1). Via docker commit command

Method (2). Via docker build command with a Dockerfile (This is the recommended method)

To test method (1), first create a container from an already pulled image, make some changes inside the container, and then execute docker commit.

// Creating a Docker Container using an image
$ sudo docker run -i -t ubuntu:14.04 /bin/bash

// Alter the container (run these inside the container)
$ apt-get -yqq update
$ apt-get -y install apache2

// Committing the changes to a new image
// Here, crishantha is the account created
// in the Docker Hub repository
// you may use Docker Hub or any other Docker repo
// 9b48a2b8850f is the Container ID of the container

$ sudo docker commit 9b48a2b8850f crishantha/apache2

// List the Docker images
// Here the Docker altered image ID is shown
$ sudo docker images crishantha/apache2
crishantha/apache2 latest 0a33454e78e4 ....

To test method (2), create a Dockerfile in a given directory and specify the changes required for the image. For example, the Dockerfile can have the following lines for an Ubuntu 14.04 image. FROM pulls the ubuntu 14.04 image, and each RUN command then executes and adds another layer to the image. EXPOSE declares that the container listens on port 80.

Before building from the Dockerfile, it is good to create a new directory and create the Dockerfile within that directory. Here the directory is called static_web.

FROM ubuntu:14.04
RUN apt-get update
RUN apt-get install -y apache2
EXPOSE 80

Once this is done, you can build the image from the Dockerfile with:

$ sudo docker build -t="crishantha/static_web" .

If all is successful, it will return an image ID, and you can then see the image using docker images crishantha/static_web

Checking the Docker Image History

You can further check the history of the image by executing docker history <image Name/ image ID>

$ sudo docker history crishantha/static_web

Now you can run a container from the image with:

$ sudo docker run -d -p 80 --name static_web crishantha/static_web  /usr/sbin/apache2ctl -D FOREGROUND

The above will run as a detached process. You can confirm this by executing docker ps, which shows it running in the background as a Docker container.

If you use Nginx instead of Apache2 as the web server, you may add nginx -g “daemon off;” to the command. The daemon off; directive tells Nginx to stay in the foreground. For containers this is useful, as the best practice is one container = one process: one server (container) runs only one service.

Pushing Docker Images

Once an image is created, we can always push it to a Docker repository. If you are registered with Docker Hub, it is quite easy to push your image to it. Since it is a public repository, anyone interested can then pull it to their own Docker host.

// If you have not already logged in,
// Here the username is the one you registered
// with Docker Hub
$ sudo docker login
Username: crishantha
Password: xxxxxxxxxx

// If login is successful
$ sudo docker push crishantha/static_web

If all is successful, you can see that it is available in Docker Hub.

Pulling Docker Images

Once it is pushed to Docker Hub, you can pull it to any other instance which runs Docker.

$ sudo docker pull crishantha/static_web

Automated Builds in Docker Hub Repositories

In addition to pushing images from our own setups, Docker Hub allows us to automate Docker image builds within Docker Hub by connecting to external repositories (private or public).

You can test this out by connecting your GitHub or Bitbucket repositories to Docker Hub. (Use the Add Repository –> Automated Build option in Docker Hub to follow this process)

However, a Docker Hub automated build needs a Dockerfile in the specified build folder of the repository. The build runs based on the Dockerfile that you specify there. Once the build is completed, you can see the build log as well.


Docker on Ubuntu 16.04 LTS – [Part 01] – Installation and Containers

Docker is an open-source engine, released under the Apache 2 License, that automates the deployment of applications into containers. It adds an application deployment engine on top of a virtualized container execution environment. Docker aims to reduce the cycle time between code being written and code being tested, deployed, and used.

Core components of Docker:

  1. The Docker client and server
  2. Docker images
  3. Registries
  4. Docker Containers

Docker has a client-server architecture. The docker binary acts as both the client and the server. As a client, the docker binary sends requests to the Docker daemon, which processes them and returns the results.

Docker images are the building blocks, or the packaging aspect, of Docker. Containers are launched from Docker images. These images can be shared, stored, and updated easily, and are considered highly portable.

Registries store the Docker images that you create. There are two types of Docker registries: 1) private 2) public. Docker Hub is the public Docker registry maintained by Docker Inc.

Containers are the running and execution aspect of Docker.

Docker does not care what software resides within the container. Each container is loaded in the same way as any other container. You can compare this to a shipping container: a shipping container is not concerned with what it carries inside. It treats all the goods inside in the same way.

So the Docker containers are interchangeable, stackable, portable, and as generic as possible.

Docker can be run on any x64 host that is running a modern Linux kernel.
(The recommended kernel version is 3.10 or later.)

The native Linux container format that Docker uses is libcontainer.

Linux kernel namespaces provide the isolation (file system, processes, network) which is required by Docker containers.

  • File System Isolation – Each container is running its own “root” file system
  • Process Isolation – Each container is running its own process environment
  • Network Isolation – Separate virtual interfaces and IP addressing

Resource allocation, such as CPU and memory, for each container is handled using cgroups. (cgroups is a Linux kernel feature that limits, accounts for, and isolates the resource usage (CPU, memory, disk I/O, network, etc.) of a collection of processes.)

Installing Docker on Ubuntu

Currently Docker is supported on a wide variety of Linux platforms including Ubuntu, Red Hat (RHEL), Debian, CentOS, Fedora, Oracle Linux, etc.

Prerequisites

1. A 64-bit architecture (x86_64 and amd64 only). 32-bit is not supported.

2. Linux 3.8 Kernel or later version.

3. Kernel features such as cgroups and namespaces should be enabled.

Step 1 – Checking the Linux Kernel

To check the current Linux kernel version:

$ uname -r

4.4.0-64-generic

So my Linux kernel is 4.4, which is later than 3.8, and it is x86_64, so it should support Docker easily.

But if your Ubuntu Linux kernel is older than 3.8, you may try to install 3.8.

$ sudo apt-get update
$ sudo apt-get install linux-headers-3.8.0-27-generic linux-image-3.8.0-27-generic linux-headers-3.8.0-27

If the above headers are not available, you can refer to the Docker manuals on the web. (https://docs.docker.com/engine/installation/ubuntulinux/)

Once this is done, you need to update grub and reboot the system.

$ sudo update-grub
$ sudo reboot

After rebooting, please check the Linux kernel version by typing uname -a or uname -r

Step 2 – Installing Docker

Make sure that APT works with “https” and that the CA certificates are installed

$ sudo apt-get update
$ sudo apt-get install -y apt-transport-https ca-certificates

Add the new GPG Key

$ sudo apt-key adv --keyserver hkp://p80.pool.sks-keyservers.net:80 --recv-keys 58118E89F3A912897C070ADBF76221572C52609D

Add the Docker APT repository to the APT sources

$ sudo apt-add-repository 'deb https://apt.dockerproject.org/repo ubuntu-xenial main'

Now update the APT sources

$ sudo apt-get update
// Make sure you are installing from the Docker repositories and not from the Ubuntu repositories
$ apt-cache policy docker-engine

Finally, you are in a position to install Docker and its additional packages using

$ sudo apt-get install -y docker-engine

Docker is now installed, and the daemon should have been started. The service is also enabled to start on reboot. You can check its status using,

$ sudo systemctl status docker

You can avoid typing “sudo” in all the commands by adding your user to the docker group, whose members can talk to the Docker daemon with what amounts to super user privileges.

$ sudo usermod -aG docker $(whoami)

Once you do the above, you have to log out and log in to the system again.

If all is OK, you should now check whether Docker was installed properly using

$ docker info

Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 0
Server Version: 17.03.0-ce
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 977c511eda0925a723debdc94d09459af49d082a
runc version: a01dafd48bc1c7cc12bdb01206f9fea7dd6feb70
init version: 949e6fa
Security Options:
 apparmor
 seccomp
 Profile: default
Kernel Version: 4.4.0-66-generic
Operating System: Ubuntu 16.04.2 LTS
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 990.6 MiB

Step 6 – Creating a Docker Container

There are two types of Docker containers.

1. Interactive Docker Containers

2. Daemonized Docker Containers

1. Interactive Docker Container

Once Docker is installed successfully, we can try to create a Docker container instance. Prior to that, it is good to check the Docker status by typing the sudo docker info command.

If everything is alright, you can go ahead and create the Docker “interactive” container instance.

$ sudo docker run --name crish_container -i -t ubuntu /bin/bash

root@c607547adce2:/#

The above will create a container named crish_container from the ubuntu image. If you do not specify a name, the system will generate a random name in addition to the unique container ID. Once it is created, you will be given an “interactive shell” like the one below.

root@c607547adce2:/#

Here c607547adce2 is the container ID. You can type exit to leave the container's interactive session. Once you exit the interactive session, the container stops, because a container only runs as long as its command (here /bin/bash) is running. That is why these are called “interactive” Docker containers.

Now again you can check the docker status by,

$ sudo docker info

Containers: 1
Images: 4
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 6
 Dirperm1 Supported: false
Execution Driver: native-0.2
Kernel Version: 3.13.0-32-generic
Operating System: Ubuntu 14.04.1 LTS
CPUs: 2
Total Memory: 1.955 GiB
Name: crishantha
ID: QHAR:VN5E:SYKX:5LW4:YOW7:SKUB:SD6I:S4ZG:GEEI:IMU7:6MNI:WYR3
WARNING: No swap limit support

2. Daemonized Docker Container

Other than the interactive docker instance, there is another type called “daemonized containers”. These can be utilized to execute long-running jobs. With these, you will not get an interactive session.

You can create a daemonized container by,

$ sudo docker run --name crish_daemon -d ubuntu /bin/sh

However, these daemonized containers run in the background, and you may not be able to reattach to them as an “interactive” docker session.

Step 7 – Display the Container List

To show all containers in the system

$ sudo docker ps -a

To show all the running containers,

$ sudo docker ps

Step 8 – Attach to a container

A container created with the docker run command restarts with the same options that we specified originally, so once it is started again (see docker start below) its interactive session is waiting for us. You can use the following to reattach to the running container.

$ sudo docker attach crish_container
OR
$ sudo docker attach c607547adce2

Here c607547adce2 is the <container_ID>

Note: Sometimes you are required to press ENTER key to show the bash shell once you execute the attach command.

Step 9 – Extract the Container IP

There is no dedicated command to get the IP address of a running container. You can extract it from the docker inspect output as follows:

$ docker inspect <CONTAINER ID> | grep -w "IPAddress" | awk '{ print $2 }' | head -n 1 | cut -d "," -f1
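The same value can be read from the JSON that docker inspect prints, which avoids the grep/awk/cut pipeline. Here is a minimal Python sketch, assuming the JSON contains the NetworkSettings.IPAddress field (with per-network entries under NetworkSettings.Networks as a fallback). The SAMPLE document is a trimmed, hypothetical example, and container_ips is a hypothetical helper.

```python
import json

# Trimmed, hypothetical docker inspect output for one container.
SAMPLE = """
[
  {
    "Id": "c607547adce2",
    "NetworkSettings": {
      "IPAddress": "172.17.0.2",
      "Networks": {
        "bridge": {"IPAddress": "172.17.0.2"}
      }
    }
  }
]
"""


def container_ips(inspect_json):
    """Return one IP address (or None) per container in the inspect output."""
    ips = []
    for container in json.loads(inspect_json):
        settings = container.get("NetworkSettings") or {}
        ip = settings.get("IPAddress")
        if not ip:
            # Fall back to the first network that reports an address.
            for network in (settings.get("Networks") or {}).values():
                if network.get("IPAddress"):
                    ip = network["IPAddress"]
                    break
        ips.append(ip or None)
    return ips


if __name__ == "__main__":
    print(container_ips(SAMPLE))
```

In practice the output of docker inspect <CONTAINER ID> would be piped into the script in place of SAMPLE.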

Step 10 – Starting and Stopping a container

To start

$ sudo docker start crish_container

To stop

$ sudo docker stop crish_container

Step 11 – Deleting a container

$ sudo docker rm crish_container

Resources

1. Docker Home Page – https://www.docker.com/

2. Docker Hub – http://hub.docker.com

3. Docker Blog – http://blog.docker.com/

4. Docker Documentation – http://docs.docker.com

5. Docker Getting Started Guide – https://www.docker.com/tryit/

6. The Docker Book by James Turnbull

7. Introduction to Docker – http://www.programering.com/a/MDMzAjMwATk.html

8. Digital Ocean Guide – https://www.digitalocean.com/community/tutorials/how-to-install-and-use-docker-on-ubuntu-16-04


Hadoop 2.6 (Part 2) – Running the Mapreduce Job

This is the continuation of my previous article on “Installing Hadoop 2.6 on Ubuntu 16.04“. This article will explain how we run one of the examples given with the Hadoop binary.

Once the Hadoop installation is completed, you can run the “wordcount” example provided with the Hadoop examples in order to test a MapReduce job. This example is bundled with the hadoop-mapreduce-examples jar file in the distribution. (See the steps below for more details)

Step 1: Start the Hadoop Cluster, if not already started.

$ /usr/local/hadoop/sbin/start-dfs.sh
$ /usr/local/hadoop/sbin/start-yarn.sh

Step 2: Copy the text files that you are going to consider for a “wordcount” to a local folder (/home/hadoop/textfiles)

Step 3: Copy the text files (in the local folder) to HDFS.

$ echo "Word Count Text File" > textFile.txt
$ hdfs dfs -mkdir -p /user/hduser/dfs
$ hdfs dfs -copyFromLocal textFile.txt /user/hduser/dfs

Step 4: List the content of the HDFS folder.

$ hdfs dfs -ls /user/hduser/dfs

Step 5: If you were able to complete step 4, you are good to go ahead with the MapReduce job.

$ cd /usr/local/hadoop
$ hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.1.jar wordcount /user/hduser/dfs /user/hduser/dfs-output

If the job was completed successfully, Congratulations!
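To see what the “wordcount” job computes, the map, shuffle, and reduce phases can be imitated on a single machine. The sketch below is plain Python and only illustrates the idea; it is not the code inside the Hadoop examples jar, and all function names are hypothetical. It splits on whitespace and is case-sensitive.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every whitespace-separated word."""
    for word in line.split():
        yield word, 1


def shuffle(pairs):
    """Group the emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for word, count in pairs:
        groups[word].append(count)
    return groups


def reduce_phase(word, counts):
    """Sum all the counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(word, counts) for word, counts in shuffle(pairs).items())


if __name__ == "__main__":
    # The first line is the text written to textFile.txt in Step 3.
    print(word_count(["Word Count Text File", "Word Count"]))
```

In the real job, the map and reduce phases run in parallel across the cluster and the result is written to the output directory in HDFS.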

You can either choose the command line or the web interface to display the contents of the HDFS directories. If you choose the command line you can try the following command.

$ hdfs dfs -ls /user/hduser/dfs-output

OR

http://localhost:50070/