Stopping the collector mid-collection leaves an incorrect file offset in the checkpoint #1657
samtangweicheng asked this question in Help (unanswered, 1 reply)
-
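In the normal flow, when the collector is stopped in step 3, logtail automatically dumps its checkpoint to local disk, then reads that checkpoint after the restart and resumes collection on its own, so there is no need to run touch again to update the target file's modification time. A complete set of logs from before and after the restart would probably make this easier to analyze.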
-
We built a custom flusher on top of flusher_opentelemetry to report our data.
![image](https://private-user-images.githubusercontent.com/8476721/354553372-810abd97-1971-4eb3-8244-f600715f90e7.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MzkzOTgwNjQsIm5iZiI6MTczOTM5Nzc2NCwicGF0aCI6Ii84NDc2NzIxLzM1NDU1MzM3Mi04MTBhYmQ5Ny0xOTcxLTRlYjMtODI0NC1mNjAwNzE1ZjkwZTcucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI1MDIxMiUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNTAyMTJUMjIwMjQ0WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9MDFjMDdlNmM1MTc3MGRhM2U4MDEyODIzNDIzM2RlY2JmZjVmMTU1Y2E2YjRhMjk4OTQ3NmU2NDJlZTMxYmEwNCZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QifQ.CvnTRyNW5U34BDFfyfrkyIMyTXyuJ4OfEm6kqDhmtco)
The flush method of the new flusher_xxx_otlp reads the "file_offset" tag of each logRecord and prints it.
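To make the bookkeeping concrete, here is a minimal Python sketch of the idea described above: scan a batch of records and report the largest "file_offset" value seen. It is not the plugin's code (the real flusher is a Go plugin); the record shape (a mapping of tag name to string value), the function name `max_file_offset`, and the sample values other than 5229336 are assumptions made for illustration.

```python
from typing import Iterable, Mapping, Optional


def max_file_offset(
    records: Iterable[Mapping[str, str]], tag: str = "file_offset"
) -> Optional[int]:
    """Return the largest offset tag in a batch, or None if no record carries one."""
    offsets = [int(r[tag]) for r in records if tag in r]
    return max(offsets) if offsets else None


# Hypothetical batch; only the tag name and 5229336 come from the report.
batch = [
    {"file_offset": "5229100", "content": "line a"},
    {"file_offset": "5229336", "content": "line b"},
    {"content": "record without position metadata"},
]
print(max_file_offset(batch))  # 5229336
```

Logging this value once per flushed batch gives a number that can be compared directly with the offset stored in the checkpoint.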
During testing, we found that some data is missed after the collector is restarted:
1. Collection configuration:
{ "enable" : true, "global" : { "EnableTimestampNanosecond" : true }, "inputs" : [ { "Type" : "input_file", "FilePaths" : [ "/root/zhl/log_file1.log4" ], "MaxDirSearchDepth" : 5, "ExcludeFilePaths" : [ ], "TailSizeKB" : 10485760, "AppendingLogPositionMeta" : true, "AllowingIncludedByMultiConfigs" : true } ], "flushers" : [ { "Type" : "flusher_xxx_otlp", "Logs" : { "Endpoint" : "xxxxxx.xxxxxx.cn:12345", "Timeout" : 10000, "WaitForReady" : true, "Compression" : "gzip" } } ] }
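2. After starting the collector, use the touch command to update the target file's modification time so that the collector starts collecting. /root/xxx/log_file1.log2 is a file of roughly 20 MB.
3. After 1-2 seconds, send SIGTERM to the collector to stop it.
4. Run the collector again, then use touch once more to update the target file's modification time so that the collector continues collecting.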
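The steps above can be scripted so that the timing is repeatable between runs. The following is a rough Python sketch, not an official tool: the helper name `run_then_sigterm`, its parameters, and the startup delay are all assumptions, and the command that launches the collector is left to the caller because the thread does not give it.

```python
import os
import signal
import subprocess
import time
from typing import Sequence


def run_then_sigterm(
    cmd: Sequence[str],
    target: str,
    startup_s: float = 0.5,
    delay_s: float = 1.5,
    timeout_s: float = 30.0,
) -> int:
    """Start cmd, bump target's mtime, wait delay_s, send SIGTERM, return the exit code."""
    proc = subprocess.Popen(list(cmd))
    try:
        time.sleep(startup_s)        # assumed grace period for the process to start
        os.utime(target, None)       # sets atime/mtime to now, like `touch` on an existing file
        time.sleep(delay_s)          # let collection run briefly (step 3: 1-2 seconds)
        proc.send_signal(signal.SIGTERM)
        return proc.wait(timeout=timeout_s)
    finally:
        # Make sure no child is left behind if anything above failed.
        if proc.poll() is None:
            proc.kill()
            proc.wait()
```

Calling the helper once with the collector's start command and the log file path covers steps 2 and 3; step 4 is a second launch of the collector followed by another `os.utime` on the same file. On POSIX, a return value of `-signal.SIGTERM` means the process was killed by the signal rather than exiting through its own shutdown path.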
After step 3, ilogtail.LOG looks like this; it shows that collection started from offset 0:
![image](https://private-user-images.githubusercontent.com/8476721/354550203-195d3c3a-fdc2-493e-b9a9-8edae5b9e843.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MzkzOTgwNjQsIm5iZiI6MTczOTM5Nzc2NCwicGF0aCI6Ii84NDc2NzIxLzM1NDU1MDIwMy0xOTVkM2MzYS1mZGMyLTQ5M2UtYjlhOS04ZWRhZTViOWU4NDMucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI1MDIxMiUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNTAyMTJUMjIwMjQ0WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9ZjgzNmVmODNiZmIyOTdlYWY0NDNlYzkyYzZjYWUyMDA0MzU0NGRkMGEzNzdiYjRmMjI3NWMzMTlkMDFiMWIxMCZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QifQ.zrKRSxvKZ35T1TWcngh5aHBl27tRhhrPROKKQz2o5qo)
logtail_plugin.LOG also shows that the largest offset in the last packet sent was 5229336
![image](https://private-user-images.githubusercontent.com/8476721/354551273-f896931b-354e-4e34-83a9-7696cd1e583a.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MzkzOTgwNjQsIm5iZiI6MTczOTM5Nzc2NCwicGF0aCI6Ii84NDc2NzIxLzM1NDU1MTI3My1mODk2OTMxYi0zNTRlLTRlMzQtODNhOS03Njk2Y2QxZTU4M2EucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI1MDIxMiUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNTAyMTJUMjIwMjQ0WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9MzliODczMmU1MTkxMzU4ZmU2Mzg2ZDdhY2EwODEyOTAyNDIyMWRiYzk2YTZhZTc2MTU1YTNhMDBhNTY1MjUzMCZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QifQ.l1-Ail2LGqWKT1McnX6zAo4GsUfMr7eBxNaFqwFgEHQ)
However, the checkpoint file at this point records an offset of 8388480
![企业微信截图_17225013855869](https://private-user-images.githubusercontent.com/8476721/354551538-72ebd417-bdc7-4a0b-8614-fa89e90481a5.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MzkzOTgwNjQsIm5iZiI6MTczOTM5Nzc2NCwicGF0aCI6Ii84NDc2NzIxLzM1NDU1MTUzOC03MmViZDQxNy1iZGM3LTRhMGItODYxNC1mYTg5ZTkwNDgxYTUucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI1MDIxMiUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNTAyMTJUMjIwMjQ0WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9OTI1NzEzZTYxNTkwZjRhNGNmNDM3Y2FjOWUxZDAyZDU3NzRhMGUyNWVkMzJhNTkyZTcxZjAzZjAyOTk5OWFjOSZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QifQ.f5IjA3gsdjbOR5CDOHL1YAUZ74bsru5HeAKL-yMFmg4)
After the restart, ilogtail.LOG shows collection resuming from 8388480
![企业微信截图_17225014931361](https://private-user-images.githubusercontent.com/8476721/354552248-adf5e9f8-d034-4a01-ace5-2a0bb50605ca.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MzkzOTgwNjQsIm5iZiI6MTczOTM5Nzc2NCwicGF0aCI6Ii84NDc2NzIxLzM1NDU1MjI0OC1hZGY1ZTlmOC1kMDM0LTRhMDEtYWNlNS0yYTBiYjUwNjA1Y2EucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI1MDIxMiUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNTAyMTJUMjIwMjQ0WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9N2JhNmQzZmU5MjU3YmQ3YzMxYjA2MzM5MWQ4NGIwZTFhYjE4NDRhZTNmNzJjYjA3NDY3YmM3YTAwMWJiYTg2NiZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QifQ.qDo3t2F0Iie3DIa4mRfDGbRYVKtEm91AYTBgQJ_4F8w)
The log data between offsets 5229336 and 8388480 was never collected.
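For a sense of scale, the size of the gap can be computed from the two offsets quoted above. This is a small Python sketch with a hypothetical helper name (`missed_bytes`). If "file_offset" marks the start of a record rather than its end, the real gap is smaller by the length of that last record.

```python
def missed_bytes(last_flushed: int, checkpoint: int) -> int:
    """Bytes between the last offset the flusher saw and the checkpoint offset.

    Returns 0 when the checkpoint is not ahead of the flusher.
    """
    return max(0, checkpoint - last_flushed)


gap = missed_bytes(5229336, 8388480)
print(gap, round(gap / 2**20, 2))  # 3159144 bytes, about 3.01 MiB
```

So roughly 3 MiB of the 20 MB file had been read and checkpointed but had not yet been reported by the flusher when the process stopped.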