Flume Study Notes
Published: 2019-06-29


##########################################################################################################


Installing Flume: unpack the archive, then edit the conf/flume-env.sh configuration file and set JAVA_HOME.

Copy the HDFS jars into Flume's lib directory (otherwise Flume cannot write data to HDFS).

Common flume-ng command options:

[hadoop@db01 flume-1.5.0]$ bin/flume-ng

commands:

  agent                     run a Flume agent

global options:

  --conf,-c <conf>          use configs in <conf> directory
  -Dproperty=value          sets a Java system property value
 

agent options:

  --name,-n <name>          the name of this agent (required)
  --conf-file,-f <file>     specify a config file (required if -z missing)

For example (the long and short option forms below are equivalent):

bin/flume-ng agent --conf /opt/cdh-5.3.6/flume-1.5.0/conf --name agent-test --conf-file test.conf

bin/flume-ng agent -c /opt/cdh-5.3.6/flume-1.5.0/conf -n agent-test -f test.conf
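
The mapping between the long and short options can be written down as a small helper. This is only an illustrative sketch: the function name flume_agent_command and its parameters are made up for these notes, and it covers just the options listed in the help text above.

```python
# Long-form `flume-ng agent` options and their short aliases (from the help text).
SHORT_FORMS = {"--conf": "-c", "--name": "-n", "--conf-file": "-f"}


def flume_agent_command(conf_dir, agent_name, conf_file, short=False, properties=None):
    """Build the argument list for `bin/flume-ng agent` (hypothetical helper)."""
    options = [("--conf", conf_dir), ("--name", agent_name), ("--conf-file", conf_file)]
    argv = ["bin/flume-ng", "agent"]
    for flag, value in options:
        argv += [SHORT_FORMS[flag] if short else flag, value]
    # -Dproperty=value sets a Java system property, e.g. flume.root.logger.
    for key, value in (properties or {}).items():
        argv.append(f"-D{key}={value}")
    return argv


conf_dir = "/opt/cdh-5.3.6/flume-1.5.0/conf"
print(" ".join(flume_agent_command(conf_dir, "agent-test", "test.conf")))
print(" ".join(flume_agent_command(conf_dir, "agent-test", "test.conf", short=True)))
```

The two printed lines correspond to the two example commands above.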

********************************************************************************************************

Flume example 1: netcat source to logger sink

Define the configuration file /opt/cdh-5.3.6/flume-1.5.0/conf/a1.conf:

# The configuration file needs to define the sources,

# the channels and the sinks.

###################################

a1.sources = r1
a1.channels = c1
a1.sinks = k1

############define source#######################################

a1.sources.r1.type = netcat
a1.sources.r1.bind = db01
a1.sources.r1.port = 55555

#############define channel###################################

a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

##########define sinks#########################

a1.sinks.k1.type = logger
a1.sinks.k1.maxBytesToLog = 1024

#######bind###############################

a1.sources.r1.channels=c1
a1.sinks.k1.channel = c1
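
The last two lines of a1.conf are the easiest to get wrong: a source is bound with the plural key "channels", a sink with the singular key "channel". As a rough sketch (not part of Flume; parse_properties and check_wiring are names invented here, and the parser only handles plain key = value lines), the bindings could be sanity-checked like this:

```python
def parse_properties(text):
    """Parse simple `key = value` lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_wiring(props, agent):
    """Return a list of problems in one agent's source/channel/sink bindings."""
    channels = set(props.get(f"{agent}.channels", "").split())
    problems = []
    for source in props.get(f"{agent}.sources", "").split():
        # A source may feed several channels, hence the plural key "channels".
        bound = props.get(f"{agent}.sources.{source}.channels", "").split()
        if not bound:
            problems.append(f"source {source} has no channels")
        problems += [f"source {source} -> unknown channel {c}" for c in bound if c not in channels]
    for sink in props.get(f"{agent}.sinks", "").split():
        # A sink drains exactly one channel, hence the singular key "channel".
        bound = props.get(f"{agent}.sinks.{sink}.channel", "")
        if bound not in channels:
            problems.append(f"sink {sink} -> unknown channel {bound!r}")
    return problems


# Abbreviated copy of a1.conf from above.
A1_CONF = """
a1.sources = r1
a1.channels = c1
a1.sinks = k1
a1.sources.r1.type = netcat
a1.sinks.k1.type = logger
a1.sources.r1.channels=c1
a1.sinks.k1.channel = c1
"""

print(check_wiring(parse_properties(A1_CONF), "a1"))
```

An empty list means every source and sink points at a declared channel.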

Install telnet:

[root@db01 softwares]# rpm -ivh telnet-*

Preparing...                ########################################### [100%]
   1:telnet-server          ########################################### [ 50%]
   2:telnet                 ########################################### [100%]
[root@db01 softwares]#
[root@db01 softwares]#
[root@db01 softwares]# rpm -ivh xinetd-2.3.14-39.el6_4.x86_64.rpm
Preparing...                ########################################### [100%]
    package xinetd-2:2.3.14-39.el6_4.x86_64 is already installed
[root@db01 softwares]#
[root@db01 softwares]#
[root@db01 softwares]#
[root@db01 softwares]# /etc/rc.d/init.d/xinetd restart
Stopping xinetd:                                           [  OK  ]
Starting xinetd:                                           [  OK  ]

Start Flume:

bin/flume-ng agent \
--conf /opt/cdh-5.3.6/flume-1.5.0/conf \
--name a1 \
--conf-file /opt/cdh-5.3.6/flume-1.5.0/conf/a1.conf \
-Dflume.root.logger=DEBUG,console

Connect with telnet to test:

[root@db01 ~]# telnet db01 55555

Trying 192.168.100.231...
Connected to db01.
Escape character is '^]'.
hello flume
OK
chavin king   
OK
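
If telnet is not available, the same test can be done with a few lines of Python. This is a minimal sketch, assuming the source acknowledges every line with "OK" as in the session above; send_lines is a name invented for these notes. Unlike telnet, it ends each line with "\n" only, so the event body will not carry a trailing carriage return.

```python
import socket


def send_lines(host, port, lines, timeout=5.0):
    """Send each line to a netcat-style source and collect the one-line replies."""
    replies = []
    with socket.create_connection((host, port), timeout=timeout) as conn, \
            conn.makefile("r", encoding="utf-8", newline="\n") as reader:
        for line in lines:
            conn.sendall((line + "\n").encode("utf-8"))
            # Wait for the acknowledgement before sending the next line.
            replies.append(reader.readline().strip())
    return replies
```

Against the agent from a1.conf, send_lines("db01", 55555, ["hello flume", "chavin king"]) should return one "OK" per line, matching the telnet session.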

------------ Log output -------------

2017-03-23 16:48:31,285 (netcat-handler-0) [DEBUG - org.apache.flume.source.NetcatSource$NetcatSocketHandler.run(NetcatSource.java:318)] Chars read = 13

2017-03-23 16:48:31,290 (netcat-handler-0) [DEBUG - org.apache.flume.source.NetcatSource$NetcatSocketHandler.run(NetcatSource.java:322)] Events processed = 1
2017-03-23 16:48:33,234 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:70)] Event: { headers:{} body: 68 65 6C 6C 6F 20 66 6C 75 6D 65 0D             hello flume. }
2017-03-23 16:48:39,224 (conf-file-poller-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:126)] Checking file:/opt/cdh-5.3.6/flume-1.5.0/conf/a1.conf for changes
2017-03-23 16:48:47,031 (netcat-handler-0) [DEBUG - org.apache.flume.source.NetcatSource$NetcatSocketHandler.run(NetcatSource.java:318)] Chars read = 13
2017-03-23 16:48:47,032 (netcat-handler-0) [DEBUG - org.apache.flume.source.NetcatSource$NetcatSocketHandler.run(NetcatSource.java:322)] Events processed = 1
2017-03-23 16:48:48,235 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:70)] Event: { headers:{} body: 63 68 61 76 69 6E 20 6B 69 6E 67 0D             chavin king. }
2017-03-23 16:49:09,225 (conf-file-poller-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:126)] Checking file:/opt/cdh-5.3.6/flume-1.5.0/conf/a1.conf for changes
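
The logger sink prints the event body as a hex dump followed by a printable rendering in which non-printable bytes appear as ".". Decoding the hex explains the numbers in the log: telnet sent "hello flume" plus "\r\n" (13 characters read), the source split on "\n", and the 12-byte body still ends in 0D, the carriage return. A small sketch of the decoding, assuming short single-line dumps like the ones above (decode_event_body is a name invented here):

```python
import re

# One or more " XX" hex pairs following "body:" in a LoggerSink line.
EVENT_BODY = re.compile(r"body:((?: [0-9A-F]{2})+)")


def decode_event_body(log_line):
    """Extract the hex dump from a LoggerSink line and return the raw body bytes."""
    match = EVENT_BODY.search(log_line)
    if match is None:
        raise ValueError("no event body found in log line")
    return bytes.fromhex(match.group(1).strip())


line = "Event: { headers:{} body: 68 65 6C 6C 6F 20 66 6C 75 6D 65 0D             hello flume. }"
body = decode_event_body(line)
print(body)       # b'hello flume\r'
print(len(body))  # 12
```

Only the bytes that were actually logged can be recovered this way; a1.conf raises that limit with maxBytesToLog = 1024.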

***************************************************************************

Flume example 2: collecting the Hive log

Create the target directory on HDFS, /user/hadoop/flume/hive-logs/:

[hadoop@db01 hadoop-2.5.0]$ bin/hdfs dfs -mkdir -p /user/hadoop/flume/hive-logs/

The a2.conf file:

# The configuration file needs to define the sources,

# the channels and the sinks.

###################################

a2.sources = r2
a2.channels = c2
a2.sinks = k2

############define source#######################################

a2.sources.r2.type = exec
a2.sources.r2.command = tail -f /opt/cdh-5.3.6/hive-0.13.1/data/logs/hive.log
a2.sources.r2.shell = /bin/bash -c

#############define channel###################################

a2.channels.c2.type = memory
a2.channels.c2.capacity = 1000
a2.channels.c2.transactionCapacity = 100

##########define sinks#########################

a2.sinks.k2.type = hdfs

#a2.sinks.k2.hdfs.path = hdfs://db02:8020/user/hadoop/flume/hive-logs/

# For Hadoop HA, copy Hadoop's configuration files into Flume's conf directory:
#cp /opt/cdh-5.3.6/hadoop-2.5.0/etc/hadoop/core-site.xml /opt/cdh-5.3.6/hadoop-2.5.0/etc/hadoop/hdfs-site.xml /opt/cdh-5.3.6/flume-1.5.0/conf/
a2.sinks.k2.hdfs.path = hdfs://ns1/user/hadoop/flume/hive-logs/

a2.sinks.k2.hdfs.fileType = DataStream

a2.sinks.k2.hdfs.writeFormat = Text
a2.sinks.k2.hdfs.batchSize = 10

#######bind###############################

a2.sources.r2.channels=c2
a2.sinks.k2.channel = c2
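
The hdfs://ns1/... path in a2.conf only resolves if the hdfs-site.xml copied into Flume's conf directory declares ns1 as a nameservice. A quick check might look like the sketch below. It assumes the standard Hadoop property dfs.nameservices; the helper names and the sample XML are invented for illustration, and a real HA file carries many more properties.

```python
import xml.etree.ElementTree as ET


def hadoop_properties(xml_text):
    """Return the name -> value pairs of a Hadoop *-site.xml document."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name", "").strip(): prop.findtext("value", "").strip()
        for prop in root.iter("property")
    }


def defines_nameservice(hdfs_site_xml, nameservice):
    """True if dfs.nameservices lists the given logical nameservice."""
    declared = hadoop_properties(hdfs_site_xml).get("dfs.nameservices", "")
    return nameservice in [name.strip() for name in declared.split(",")]


# Hypothetical, heavily shortened hdfs-site.xml.
SAMPLE = """<configuration>
  <property><name>dfs.nameservices</name><value>ns1</value></property>
</configuration>"""

print(defines_nameservice(SAMPLE, "ns1"))
```

To check the real file, pass in the contents of /opt/cdh-5.3.6/flume-1.5.0/conf/hdfs-site.xml instead of SAMPLE.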

Test:

bin/flume-ng agent \
--conf /opt/cdh-5.3.6/flume-1.5.0/conf \
--name a2 \
--conf-file /opt/cdh-5.3.6/flume-1.5.0/conf/a2.conf \
-Dflume.root.logger=DEBUG,console

******************************************************************************

Flume example 3: spooling directory source, file channel, HDFS sink

Edit the a3.conf file:

# The configuration file needs to define the sources,

# the channels and the sinks.

######define agent#############################

a3.sources = r3
a3.channels = c3
a3.sinks = k3

############define source#######################################

a3.sources.r3.type = spooldir
a3.sources.r3.spoolDir = /opt/cdh-5.3.6/flume-1.5.0/spoolinglogs
a3.sources.r3.ignorePattern = ^(.)*\\.log$
a3.sources.r3.fileSuffix = .delete

#############define channel###################################

a3.channels.c3.type = file
a3.channels.c3.checkpointDir = /opt/cdh-5.3.6/flume-1.5.0/filechannel/checkpoint
a3.channels.c3.dataDirs = /opt/cdh-5.3.6/flume-1.5.0/filechannel/data

##########define sinks#########################

a3.sinks.k3.type = hdfs

#a3.sinks.k3.hdfs.path = hdfs://db02:8020/user/hadoop/flume/hive-logs/

a3.sinks.k3.hdfs.path = hdfs://ns1/user/hadoop/flume/splogs/%Y%m%d

a3.sinks.k3.hdfs.fileType = DataStream

a3.sinks.k3.hdfs.writeFormat = Text
a3.sinks.k3.hdfs.batchSize = 10
a3.sinks.k3.hdfs.useLocalTimeStamp = true
#######bind###############################
a3.sources.r3.channels=c3
a3.sinks.k3.channel = c3
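
Two settings of the spooldir source in a3.conf decide which files are picked up. In a properties file "\\." is read as "\.", so the ignorePattern that Flume sees is ^(.)*\.log$, which skips any file whose name ends in .log; files that have been fully read are renamed with the .delete suffix. The sketch below mimics that decision for a few file names. It is a simplification written for these notes, not Flume's own code, and the rule that already-suffixed files are skipped is an assumption about how the source treats its fileSuffix.

```python
import re

# The regex Flume receives after properties-file unescaping of "\\.".
IGNORE_PATTERN = re.compile(r"^(.)*\.log$")
FILE_SUFFIX = ".delete"


def spool_action(filename):
    """Describe, roughly, what the source in a3.conf would do with a file name."""
    if filename.endswith(FILE_SUFFIX):
        # Assumption: files already carrying the completed suffix are not re-read.
        return "skipped (already completed)"
    if IGNORE_PATTERN.fullmatch(filename):
        return "ignored"
    return f"ingested, then renamed to {filename}{FILE_SUFFIX}"


for name in ["hive.log", "hive.log.2017-03-23", "access.txt"]:
    print(name, "->", spool_action(name))
```

So with this configuration a live hive.log is left alone, while a rotated file such as hive.log.2017-03-23 is collected.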

Test:
bin/flume-ng agent \
--conf /opt/cdh-5.3.6/flume-1.5.0/conf \
--name a3 \
--conf-file /opt/cdh-5.3.6/flume-1.5.0/conf/a3.conf \
-Dflume.root.logger=DEBUG,console
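
While this agent runs, the %Y%m%d escapes in hdfs.path are expanded into one directory per day. Because useLocalTimeStamp is true, the sink takes the date from the agent's local clock instead of a timestamp header on the event. For these three escapes the result happens to coincide with Python's strftime, which gives a quick way to preview the directory name. This is a sketch only: Flume's escape sequences are not strftime in general, and resolve_path is a name invented here.

```python
from datetime import datetime

HDFS_PATH = "hdfs://ns1/user/hadoop/flume/splogs/%Y%m%d"


def resolve_path(template, when):
    """Expand the %Y, %m and %d escapes of an hdfs.path template for one timestamp."""
    return when.strftime(template)


print(resolve_path(HDFS_PATH, datetime(2017, 3, 23, 16, 48)))
# hdfs://ns1/user/hadoop/flume/splogs/20170323
```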

Reposted from: http://mwozm.baihongyu.com/
