Checkpoint-related notes (#27)

#28
Reviewed-on: #27
LingZhaoHui 2023-08-25 16:11:40 +00:00
parent 0abb80528f
commit bcd6ab5e4d
2 changed files with 82 additions and 2 deletions


@ -7,6 +7,7 @@
- [Slots](./slot相关.md)
- [Flink basic architecture](./Flink基本架构.md)
- [Side outputs](./旁路输出.md)
- [BlobServer](./blobServer.md)
- [Broadcast](./broadcast.md)
- [Checkpoint](./checkpoint.md)

basic/checkpoint.md Normal file

@ -0,0 +1,79 @@
# Common errors
## The maximum number of queued checkpoint requests exceeded
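More than 1000 unfinished checkpoint requests are queued. Check whether the job is under backpressure or has a similar problem. In general, backpressure on a job causes its checkpoints to fail.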
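
The relationship between slow checkpoints and this error can be illustrated with a toy model. This is a minimal sketch and not Flink's implementation: `ToyCheckpointQueue` and `simulate` are hypothetical names, and the only value taken from the note above is the limit of 1000. It assumes that a new request is rejected once 1000 requests are already waiting.

```python
from collections import deque

MAX_QUEUED_REQUESTS = 1000  # limit quoted in the note above
ERROR = "The maximum number of queued checkpoint requests exceeded"


class ToyCheckpointQueue:
    """Hypothetical stand-in for a coordinator's pending-request queue."""

    def __init__(self, max_queued=MAX_QUEUED_REQUESTS):
        self.max_queued = max_queued
        self.pending = deque()

    def request(self, checkpoint_id):
        # Reject the new request once the queue is already full.
        if len(self.pending) >= self.max_queued:
            raise RuntimeError(ERROR)
        self.pending.append(checkpoint_id)

    def complete_oldest(self):
        # A finished checkpoint frees one slot in the queue.
        return self.pending.popleft()


def simulate(requests, completions_every):
    """Issue `requests` triggers; one checkpoint completes after every
    `completions_every` triggers (0 means checkpoints never complete).

    Returns the id of the first rejected trigger, or None if none is rejected.
    """
    queue = ToyCheckpointQueue()
    for checkpoint_id in range(1, requests + 1):
        try:
            queue.request(checkpoint_id)
        except RuntimeError:
            return checkpoint_id
        if completions_every and checkpoint_id % completions_every == 0:
            queue.complete_oldest()
    return None
```

In this model, `simulate(2000, 1)` returns `None` because every checkpoint completes before the next trigger, while `simulate(2000, 0)` returns `1001`: when nothing completes, as under sustained backpressure, the 1001st request is the first to be rejected.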
## Periodic checkpoint scheduler is shut down
## The minimum time between checkpoints is still pending
## Not all required tasks are currently running
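Some operator tasks have already finished. In dimension-table join scenarios, Flink versions before 1.13 may be unable to recover checkpointing once this happens.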
## An Exception occurred while triggering the checkpoint.
## Asynchronous task checkpoint failed.
## The checkpoint was aborted due to exception of other subtasks sharing the ChannelState file
## Checkpoint expired before completing
## Checkpoint has been subsumed
## Checkpoint was declined
## Checkpoint was declined (tasks not ready)
## Checkpoint was declined (task is closing)
## Checkpoint was canceled because a barrier from newer checkpoint was received
## Task received cancellation from one of its inputs
## Checkpoint was declined because one input stream is finished
## CheckpointCoordinator shutdown
## Checkpoint Coordinator is suspending
## FailoverRegion is restarting
## Task has failed
## Task local checkpoint failure
## Unknown task for the checkpoint to notify
## Failure to finalize checkpoint
## Trigger checkpoint failure