The Hadoop Distributed File System (HDFS) provides several interfaces through which clients can interact with it. Besides the HDFS shell, the file system can be accessed over WebDAV, Thrift, FTP and FUSE. In this post we access HDFS over FTP using the hdfs-over-ftp server. We have used Hadoop 0.20.2.
1. Download the hdfs-over-ftp tar from https://issues.apache.org/jira/secure/attachment/12409518/hdfs-over-ftp-0.20.0.tar.gz
2. Untar hdfs-over-ftp-0.20.0.tar.gz.
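For example, from the command line (assuming wget is available; the name of the extracted directory is an assumption):
# download and unpack the hdfs-over-ftp distribution
wget https://issues.apache.org/jira/secure/attachment/12409518/hdfs-over-ftp-0.20.0.tar.gz
tar -xzf hdfs-over-ftp-0.20.0.tar.gz
# change into the extracted directory (name assumed)
cd hdfs-over-ftp-0.20.0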
3. We now need to create the FTP user configuration: register-user.sh hashes the given password and prints a user entry, which we append to users.conf.
./register-user.sh username password >> users.conf
# the username user
ftpserver.user.username.userpassword=0238775C7BD96E2EAB98038AFE0C4279
ftpserver.user.username.homedirectory=/
ftpserver.user.username.enableflag=true
ftpserver.user.username.writepermission=true
ftpserver.user.username.maxloginnumber=0
ftpserver.user.username.maxloginperip=0
ftpserver.user.username.idletime=0
ftpserver.user.username.uploadrate=0
ftpserver.user.username.downloadrate=0
ftpserver.user.username.groups=users
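As a quick sketch, registering a hypothetical user named ftpuser and checking the generated entry could look like this:
# register the (hypothetical) user ftpuser with password secret
./register-user.sh ftpuser secret >> users.conf
# confirm the appended entry; the password is stored as a hash, not in clear text
grep '^ftpserver.user.ftpuser' users.conf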
4. Configure log4j.conf so that you can diagnose what is happening; a minimal sketch follows.
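A minimal log4j configuration, in standard log4j properties syntax, might look like this (the log file name is an assumption; adjust it as you like):
# log INFO and above both to the console and to a file
log4j.rootLogger=INFO, stdout, logfile
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d %p [%c] - %m%n
log4j.appender.logfile=org.apache.log4j.FileAppender
log4j.appender.logfile.File=hdfs-over-ftp.log
log4j.appender.logfile.layout=org.apache.log4j.PatternLayout
log4j.appender.logfile.layout.ConversionPattern=%d %p [%c] - %m%n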
5. Now adjust hdfs-over-ftp.conf to match your setup:
#uncomment this to run ftp server
port = 21
data-ports = 20
#uncomment this to run ssl ftp server
#ssl-port = 990
#ssl-data-ports = 989
# hdfs uri
hdfs-uri = hdfs://localhost:9000
# max number of login
max-logins = 1000
# max number of anonymous login
max-anon-logins = 1000
# have to be a user which runs HDFS
# this allows you to start ftp server as a root to use 21 port
# and use hdfs as a superuser
superuser = hadoop
Set hdfs-uri according to your setup; it must point to your NameNode (see the snippet below if you are unsure of the value).
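In Hadoop 0.20.x the NameNode URI is the value of fs.default.name in conf/core-site.xml, so you can look it up with, for example:
# print the configured NameNode URI; run from the Hadoop installation directory
grep -A 1 'fs.default.name' conf/core-site.xml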
6. Now start the FTP server: sudo ./hdfs-over-ftp.sh start
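To check that the server actually came up, you can look for a listener on port 21 (or watch the log file configured in log4j.conf), for example:
# check that the FTP server is listening on port 21
sudo netstat -tlnp | grep ':21'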
7. To connect to HDFS as an FTP client, run ftp {ip address of namenode machine} and log in with the username and password you registered in users.conf (a sketch of a session follows).
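A typical session with the standard command-line ftp client might look like this (the address and user are placeholders):
# connect to the FTP front end running on the namenode machine
ftp 192.168.1.10
# after logging in as the registered user, the usual FTP commands work:
#   ls                        - list the current HDFS directory
#   put local.txt             - upload a local file into HDFS
#   get /some/hdfs/file out   - download a file from HDFS
#   bye                       - close the session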
8. To upload files to, or write in, any HDFS directory, you first need to grant ownership to your FTP user with Hadoop's chown command (owner and group, in that order):
bin/hadoop fs -chown -R username:group {path}
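For example, to let the hypothetical user ftpuser write under /user/ftpuser:
# hand ownership of the directory tree to the FTP user (owner:group)
bin/hadoop fs -chown -R ftpuser:users /user/ftpuser
# verify the new ownership
bin/hadoop fs -ls /user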
9. You can stop the FTP server with sudo ./hdfs-over-ftp.sh stop