Top 10 Useful HDFS Commands Part-I

Datetime: 2016-08-23 01:46:50          Topic: HDFS

1. Objective

In this tutorial we will learn the most important and frequently used HDFS commands, which let us perform HDFS file operations such as copying files, changing file permissions, viewing file contents, changing file ownership, and creating directories.

2. HDFS Introduction

HDFS is a distributed file system that provides redundant storage for very large files, typically in the range of terabytes to petabytes. To learn more about this storage layer, see the HDFS introductory guide.

Before working with HDFS you need to deploy Hadoop; follow the guide to install and configure Hadoop.

3. HDFS Commands

Hadoop file system shell commands are used to perform HDFS operations and to manage the files stored on HDFS clusters. All of these commands are invoked through the bin/hdfs script.
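Most of the commands below follow the shape hdfs dfs -<command> [options] <args>. As a rough sketch, the helper below (hdfs_run is a hypothetical name, not part of Hadoop) prints the full command line and runs it only when the hdfs client is found on the PATH, so the examples can be dry-run on a machine without Hadoop.

```shell
# hdfs_run is a hypothetical helper, not part of Hadoop.
# It echoes the full command line, then runs it only if the hdfs client is installed.
hdfs_run() {
  echo "+ hdfs dfs $*"
  if command -v hdfs >/dev/null 2>&1; then
    hdfs dfs "$@"
  fi
}

# Example: list the root of the file system (dry run when hdfs is absent).
hdfs_run -ls /
```

Printing the command first makes it easier to see exactly which arguments reach the file system shell.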

3.1. version

Command Usage

version

Command Example

hdfs version

Description

Prints the Hadoop version

3.2. mkdir

Command Usage

mkdir [-p] <path>

Command Example

hdfs dfs -mkdir /user/dataflair/dir1

Description

Takes path URIs as arguments and creates directories.

With the -p option, it also creates any parent directories in the path that are missing (like mkdir -p in Linux).
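A minimal sketch of creating a nested directory only when it is missing, assuming a Hadoop 2 or later client where -mkdir accepts -p and -test -d checks for a directory. The helper name ensure_hdfs_dir and the example path are illustrative, not from the original tutorial.

```shell
# ensure_hdfs_dir is a hypothetical helper: create the directory only if it is missing.
# -test -d returns 0 when the path exists and is a directory.
# -mkdir -p creates missing parent directories along the path.
ensure_hdfs_dir() {
  dir="$1"
  if hdfs dfs -test -d "$dir"; then
    echo "exists: $dir"
  else
    hdfs dfs -mkdir -p "$dir" && echo "created: $dir"
  fi
}

# Usage (requires a running cluster):
#   ensure_hdfs_dir /user/dataflair/dir1/logs/2016
```

Since -mkdir -p already succeeds when the directory exists, the explicit test is only there to report which case occurred.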

3.3. ls

Command Usage

ls <path>

Command Example

hdfs dfs -ls /user/dataflair/dir1

Description

Lists the contents of the directory specified by path, showing the name, permissions, owner, size, and modification date of each entry.

Command Example

hdfs dfs -ls -R /user/dataflair/dir1

Description

Behaves like -ls, but recursively displays entries in all subdirectories of path.
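Because the listing is plain text, it can be piped into standard tools. The sketch below totals the sizes of all files in a recursive listing. It assumes the usual column layout of the listing (permissions, replication, owner, group, size, date, time, path), so the size is field 5. The helper name sum_file_sizes is hypothetical.

```shell
# sum_file_sizes is a hypothetical helper: it reads `hdfs dfs -ls -R` output on stdin
# and adds up the size column (field 5) of regular files (lines starting with "-").
# Directory lines start with "d" and are skipped.
sum_file_sizes() {
  awk '/^-/ { total += $5 } END { print total + 0 }'
}

# Usage (requires a running cluster):
#   hdfs dfs -ls -R /user/dataflair/dir1 | sum_file_sizes
```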

3.4. put

Command Usage

put <localSrc> <dest>

Command Example

hdfs dfs -put /home/dataflair/Desktop/sample /user/dataflair/dir1

Description

Copies a file or directory from the local file system to the destination within HDFS.
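As an illustration, an upload can be guarded by a check that the local source exists. This sketch uses the -f option of put, which overwrites the destination if it already exists. The helper name upload_file is hypothetical.

```shell
# upload_file is a hypothetical helper: verify the local source, then upload it.
# -put -f overwrites the destination file if it is already present in HDFS.
upload_file() {
  src="$1"
  dest="$2"
  if [ ! -e "$src" ]; then
    echo "missing local source: $src" >&2
    return 1
  fi
  hdfs dfs -put -f "$src" "$dest"
}

# Usage (requires a running cluster):
#   upload_file /home/dataflair/Desktop/sample /user/dataflair/dir1
```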


3.5. copyFromLocal

Command Usage

copyFromLocal <localSrc> <dest>

Command Example

hdfs dfs -copyFromLocal /home/dataflair/Desktop/sample /user/dataflair/dir1

Description

Similar to the put command, but the source is restricted to a local file reference.


3.6. get

Command Usage

get [-crc] <src> <localDest>

Command Example

hdfs dfs -get /user/dataflair/dir2/sample /home/dataflair/Desktop

Description

Copies the file or directory in HDFS identified by source to the local file system path identified by local destination.

Command Example

hdfs dfs -getmerge /user/dataflair/dir2/sample /home/dataflair/Desktop

Description

Retrieves all files that match the source path in HDFS and concatenates them into a single merged file on the local file system, identified by local destination.
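Since getmerge writes one merged file, the local destination should name a file rather than a directory. A minimal sketch, assuming the -nl option (which adds a newline after each merged file) is available in your Hadoop version. The helper name merge_parts and the output file name in the usage line are hypothetical.

```shell
# merge_parts is a hypothetical helper: merge every file under an HDFS directory
# into one local file, adding a newline after each part (-nl).
merge_parts() {
  src_dir="$1"
  local_file="$2"
  if [ -d "$local_file" ]; then
    echo "destination must be a file, not a directory: $local_file" >&2
    return 1
  fi
  hdfs dfs -getmerge -nl "$src_dir" "$local_file"
}

# Usage (requires a running cluster; merged.txt is an illustrative name):
#   merge_parts /user/dataflair/dir2 /home/dataflair/Desktop/merged.txt
```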

Command Example

hadoop fs -getfacl  /user/dataflair/dir1/sample
hadoop fs -getfacl -R  /user/dataflair/dir1

Description

It shows the Access Control Lists (ACLs) of files and directories. If a directory contains a default ACL, then getfacl also displays the default ACL.

Options:

-R: It displays a list of all the ACLs of all files and directories recursively.

path: File or directory to list.

Command Example

hadoop fs -getfattr -d /user/dataflair/dir1/sample

Description

Displays the extended attribute names and values, if any, for a file or directory.

Options:

-R: It recursively lists the attributes for all files and directories.

-n name: It displays the named extended attribute value.

-d: It displays all the extended attribute values associated with pathname.

-e encoding: Encodes values after retrieving them. The valid encodings are “text”, “hex”, and “base64”. Values encoded as text strings are enclosed in double quotes (“”), while values encoded as hexadecimal and base64 are prefixed with 0x and 0s, respectively.

path: The file or directory.
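The prefix rules for encoded values can be sketched as a small classifier. The function name xattr_encoding and the attribute name user.myAttr in the usage line are hypothetical, and the function only looks at the shape of the value string as described above.

```shell
# xattr_encoding is a hypothetical helper: guess how a getfattr value was encoded
# from its prefix (0x for hex, 0s for base64, double quotes for text).
xattr_encoding() {
  case "$1" in
    0x*)   echo hex ;;
    0s*)   echo base64 ;;
    \"*\") echo text ;;
    *)     echo unknown ;;
  esac
}

# Usage (requires a running cluster; user.myAttr is an illustrative attribute name):
#   hdfs dfs -getfattr -n user.myAttr -e hex /user/dataflair/dir1/sample
```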

3.7. copyToLocal

Command Usage

copyToLocal <src> <localDest>

Command Example

hdfs dfs -copyToLocal /user/dataflair/dir1/sample /home/dataflair/Desktop

Description

Similar to the get command, except that the destination is restricted to a local file reference.

3.8. cat

Command Usage

cat <file-name>

Command Example

hdfs dfs -cat /user/dataflair/dir1/sample

Description

Displays the contents of the file on the console (stdout).
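Files on HDFS are often very large, so it can help to look at only the first few lines. A minimal sketch that pipes cat into the standard head utility follows. The helper name preview_file and the default of 10 lines are assumptions made for illustration.

```shell
# preview_file is a hypothetical helper: print only the first N lines of an HDFS file.
# The second argument is optional and defaults to 10 lines.
preview_file() {
  path="$1"
  lines="${2:-10}"
  hdfs dfs -cat "$path" | head -n "$lines"
}

# Usage (requires a running cluster):
#   preview_file /user/dataflair/dir1/sample 5
```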

3.9. mv

Command Usage

mv <src> <dest>

Command Example

hadoop fs -mv /user/dataflair/dir1/purchases.txt /user/dataflair/dir2

Description

Moves the file or directory indicated by source to destination, within HDFS.

3.10. cp

Command Usage

cp <src> <dest>

Command Example

hadoop fs -cp /user/dataflair/dir2/purchases.txt /user/dataflair/dir1

Description

Copies the file or directory identified by source to destination, within HDFS.




