Hadoop HDFS command-line operations

Reposted from: https://blog.csdn.net/sjhuangx/article/details/79796388

Full list of Hadoop FileSystem shell commands: https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/FileSystemShell.html

Viewing the available commands: run hadoop fs with no arguments

[hadoop@mini1 mapreduce]$ hadoop fs
Usage: hadoop fs [generic options]
    [-appendToFile <localsrc> ... <dst>]
    [-cat [-ignoreCrc] <src> ...]
    [-checksum <src> ...]
    [-chgrp [-R] GROUP PATH...]
    [-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
    [-chown [-R] [OWNER][:[GROUP]] PATH...]
    [-copyFromLocal [-f] [-p] [-l] [-d] <localsrc> ... <dst>]
    [-copyToLocal [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
    [-count [-q] [-h] [-v] [-t [<storage type>]] [-u] [-x] <path> ...]
    [-cp [-f] [-p | -p[topax]] [-d] <src> ... <dst>]
    [-createSnapshot <snapshotDir> [<snapshotName>]]
    [-deleteSnapshot <snapshotDir> <snapshotName>]
    [-df [-h] [<path> ...]]
    [-du [-s] [-h] [-x] <path> ...]
    [-expunge]
    [-find <path> ... <expression> ...]
    [-get [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
    [-getfacl [-R] <path>]
    [-getfattr [-R] {-n name | -d} [-e en] <path>]
    [-getmerge [-nl] [-skip-empty-file] <src> <localdst>]
    [-help [cmd ...]]
    [-ls [-C] [-d] [-h] [-q] [-R] [-t] [-S] [-r] [-u] [<path> ...]]
    [-mkdir [-p] <path> ...]
    [-moveFromLocal <localsrc> ... <dst>]
    [-moveToLocal <src> <localdst>]

1. Uploading files to HDFS

# Upload XXX.zip to the HDFS root directory /
hadoop fs -put XXX.zip /
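A minimal sketch of an upload, assuming a running HDFS cluster (the file name sample.txt is made up for illustration; the hadoop calls are guarded so the snippet degrades gracefully without a cluster):

```shell
# Create a small local file to upload (hypothetical name).
echo "hello hdfs" > sample.txt

# Guarded: the hadoop calls require a running HDFS cluster.
if command -v hadoop >/dev/null; then
    hadoop fs -put sample.txt /        # upload to the HDFS root
    hadoop fs -put -f sample.txt /     # -f overwrites an existing destination
fi
cat sample.txt
```

-put fails if the destination already exists unless -f is given.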


2. Listing files and directories

[hadoop@mini1 ~]$ hadoop fs -ls /
Found 2 items
-rw-r--r--   2 hadoop supergroup  354635831 2018-04-02 10:27 /jdk-9.0.4_linux-x64_bin.tar.gz
-rw-r--r--   2 hadoop supergroup        279 2018-04-02 10:30 /test.txt


3. Downloading files: hadoop fs -get /XXX

[hadoop@mini1 ~]$ hadoop fs -get /jdk-9.0.4_linux-x64_bin.tar.gz
[hadoop@mini1 ~]$ ls
hadoop-2.9.0  hdfsdata  jdk-9.0.4_linux-x64_bin.tar.gz
[hadoop@mini1 ~]$ 
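As a sketch of two related download forms: -get accepts an optional local destination (which renames the copy), and -getmerge concatenates every file under an HDFS directory into one local file. The local directory name below is an assumption, and the hadoop calls are guarded because they need a live cluster:

```shell
# Hypothetical local target directory for downloads.
mkdir -p hdfs-downloads

# Guarded: these need a running HDFS cluster.
if command -v hadoop >/dev/null; then
    hadoop fs -get /test.txt hdfs-downloads/test-copy.txt            # fetch and rename
    hadoop fs -getmerge /wordCount/output hdfs-downloads/merged.txt  # merge a directory
fi
test -d hdfs-downloads && echo "ready"
```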
4. Creating directories: hadoop fs -mkdir /XXX

[hadoop@mini1 ~]$ hadoop fs -mkdir /wordCount
[hadoop@mini1 ~]$ hadoop fs -ls /
Found 3 items
-rw-r--r--   2 hadoop supergroup  354635831 2018-04-02 10:27 /jdk-9.0.4_linux-x64_bin.tar.gz
-rw-r--r--   2 hadoop supergroup        279 2018-04-02 10:30 /test.txt
drwxr-xr-x   - hadoop supergroup          0 2018-04-02 10:44 /wordCount
[hadoop@mini1 ~]$ 
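-mkdir also accepts -p to create any missing parent directories, analogous to the local mkdir -p. A hedged sketch (the HDFS path is hypothetical, and the hadoop call is guarded because it needs a cluster):

```shell
# Local analogue: -p creates missing parents in one call.
mkdir -p demo/a/b

if command -v hadoop >/dev/null; then
    hadoop fs -mkdir -p /wordCount/input/2018   # same idea on HDFS (hypothetical path)
fi
test -d demo/a/b && echo "created"
```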


5. Running a jar: hadoop jar XXX.jar YYY /wordCount/ /wordCount/output

[hadoop@mini1 mapreduce]$ hadoop jar hadoop-mapreduce-examples-2.9.0.jar wordcount /wordCount/ /wordCount/output
18/04/02 10:48:26 INFO client.RMProxy: Connecting to ResourceManager at mini1/192.168.241.100:8032
18/04/02 10:48:27 INFO input.FileInputFormat: Total input files to process : 2
18/04/02 10:48:28 INFO mapreduce.JobSubmitter: number of splits:2
18/04/02 10:48:28 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
18/04/02 10:48:28 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1522678018314_0001
18/04/02 10:48:29 INFO impl.YarnClientImpl: Submitted application application_1522678018314_0001
18/04/02 10:48:29 INFO mapreduce.Job: The url to track the job: http://mini1:8088/proxy/application_1522678018314_0001/
18/04/02 10:48:29 INFO mapreduce.Job: Running job: job_1522678018314_0001
18/04/02 10:48:36 INFO mapreduce.Job: Job job_1522678018314_0001 running in uber mode : false
18/04/02 10:48:36 INFO mapreduce.Job:  map 0% reduce 0%
18/04/02 10:48:48 INFO mapreduce.Job:  map 100% reduce 0%
18/04/02 10:48:53 INFO mapreduce.Job:  map 100% reduce 100%
18/04/02 10:48:54 INFO mapreduce.Job: Job job_1522678018314_0001 completed successfully
 
 
[hadoop@mini1 mapreduce]$ hadoop fs -cat /wordCount/output/part-r-00000
Apr    2
Don't    1
Jan    1
Mar    1
a    1
and    1
for    2
hadoop    9
 
[hadoop@mini1 mapreduce]$ 
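Conceptually, the wordcount example's mapper splits each line into words and emits (word, 1), and the reducer sums the counts per word. The same aggregation can be sketched locally with standard shell tools (the input text here is made up):

```shell
# Emulate map (one word per line) + reduce (count per word) locally.
printf 'hadoop fs hadoop\nfs put\n' \
  | tr -s ' ' '\n' \
  | sort \
  | uniq -c \
  | awk '{print $2 "\t" $1}'
# → fs	2
#   hadoop	2
#   put	1
```

The output is tab-separated word/count pairs, the same shape as the part-r-00000 file above.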


6. Setting the file replication factor: setrep

hadoop fs -setrep 10 XXX.yyy

The replication factor set here is only the value recorded in the NameNode's metadata. The actual number of replicas depends on whether the cluster has enough machines to hold them: if it does, the real replica count matches the metadata; otherwise it is capped at the maximum the cluster can currently provide.
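For instance, requesting 10 replicas on a cluster with only 3 DataNodes yields 3 actual replicas until more nodes join. The effective count is the smaller of the two values; a tiny sketch (the 3-node cluster size is a made-up assumption):

```shell
# Effective replicas = min(requested factor, available DataNodes).
requested=10
datanodes=3   # assumption: a 3-node cluster
effective=$(( requested < datanodes ? requested : datanodes ))
echo "$effective"
# → 3
```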