1. Background
2. Analyzing the tmp files
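- Inspecting the tmp files shows that they all carry the same timestamp, 10:28, and that every one of them was generated by the flume2 node. The next step is therefore to check the agent log on flume2 and compare it with the agent logs on flume1 and flume3.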
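The log comparison can be done by hand, but a small filter makes it easier to line up the three machines. The following is a minimal sketch, not part of the original investigation: it assumes each agent log line starts with a timestamp laid out as `dd MMM yyyy HH:mm:ss,SSS` followed by the log level, and the function name `errors_in_window` is hypothetical.

```python
import re
from datetime import datetime

# Assumed line layout: "<dd MMM yyyy HH:mm:ss,SSS> <LEVEL> [thread] (class:line) - message"
LINE_RE = re.compile(r"^(\d{2} \w{3} \d{4} \d{2}:\d{2}:\d{2},\d{3}) (\w+)\s")


def errors_in_window(lines, start, end, level="ERROR"):
    """Return (timestamp, line) pairs at `level` with start <= timestamp <= end."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            # Stack-trace continuation lines carry no timestamp; skip them.
            continue
        # %b parses English month abbreviations under the default C locale.
        ts = datetime.strptime(match.group(1), "%d %b %Y %H:%M:%S,%f")
        if match.group(2) == level and start <= ts <= end:
            hits.append((ts, line.rstrip("\n")))
    return hits
```

To use it, open each machine's agent log, pass the file object as `lines` together with a window that covers the tmp-file timestamp, and compare which errors show up on flume2 but not on flume1 or flume3.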
3. Locating the cause
The following error appears on the flume2 machine:
- 15 Aug 2024 10:00:55,099 ERROR [[channel=channel1] - CheckpointBackUpThread] (org.apache.flume.channel.file.Serialization.copyFile:160) - Error while attempting to copy /data/datum/flume-prod/teflume_prod8/channel1/checkpoint/checkpoint to /data/datum/flume-prod/teflume_prod8/channel1/checkpoint_backup/checkpoint.
- java.io.IOException: Cannot allocate memory
at java.io.RandomAccessFile.writeBytes(Native Method)
at java.io.RandomAccessFile.write(RandomAccessFile.java:525)
at org.apache.flume.channel.file.Serialization.copyFile(Serialization.java:152)
at org.apache