处理Springboot项目启动时streamBridge.send导致的卡住卡死问题

news/2024/11/23 22:56:05/

现象

  • 我们的Spring Boot 项目启动时,偶现卡死的现象,启动到一半卡主不动了
2023-01-16 10:23:10.338  INFO 1 --- [  restartedMain] com.hazelcast.core.LifecycleService      : [172.18.0.14]:5701 [dev] [4.2.4] [172.18.0.14]:5701 is STARTED
2023-01-16 10:23:12.391  INFO 1 --- [  restartedMain] c.h.h.HazelcastCacheRegionFactory        : Starting up HazelcastCacheRegionFactory
2023-01-16 10:23:12.396  INFO 1 --- [  restartedMain] c.h.h.i.IHazelcastInstanceFactory        : Using existing HazelcastInstance [unit].
2023-01-16 10:23:16.685  INFO 1 --- [ntLoopGroup-2-1] com.newatc.socketio.SocketIOServer       : SocketIO server started at port: 7654
2023-01-16 10:23:16.702  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : controllerAgentServer bean listeners added
2023-01-16 10:23:16.972  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : channelParamHandler bean listeners added
2023-01-16 10:23:16.979  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : controllerAlarmHandler bean listeners added
2023-01-16 10:23:16.988  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : controllerFaultHandler bean listeners added
2023-01-16 10:23:16.995  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : controllerModelHandler bean listeners added
2023-01-16 10:23:17.000  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : controllerTimeHandler bean listeners added
2023-01-16 10:23:17.006  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : dayPlanParamHandler bean listeners added
2023-01-16 10:23:17.011  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : detectorEventHandler bean listeners added
2023-01-16 10:23:17.016  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : detectorParamHandler bean listeners added
2023-01-16 10:23:17.022  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : lampOffHandler bean listeners added
2023-01-16 10:23:17.027  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : patternParamHandler bean listeners added
2023-01-16 10:23:17.033  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : phaseFlowHandler bean listeners added
2023-01-16 10:23:17.038  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : phaseParamHandler bean listeners added
2023-01-16 10:23:17.044  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : phaseStatusHandler bean listeners added
2023-01-16 10:23:17.050  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : scheduleParamHandler bean listeners added
2023-01-16 10:23:17.057  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : signalIPParamHandler bean listeners added
2023-01-16 10:23:17.062  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : softwareVersionHandler bean listeners added
2023-01-16 10:23:17.068  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : stageParamHandler bean listeners added
2023-01-16 10:23:17.073  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : trafficParamsHandler bean listeners added
2023-01-16 10:23:17.079  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : vehGPSTrajectoryHandler bean listeners added
2023-01-16 10:23:17.084  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : wirelessOpLogHandler bean listeners added
2023-01-16 10:23:17.091  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : wirelessSecurityHandler bean listeners added
2023-01-16 10:23:18.018  INFO 1 --- [ntLoopGroup-3-1] c.n.socketio.handler.AuthorizeHandler    : channel active cb27f0b9-236a-404c-8cb1-369366729ce2
2023-01-16 10:23:18.466  INFO 1 --- [ntLoopGroup-3-1] c.h.i.p.impl.PartitionStateManager       : [172.18.0.14]:5701 [dev] [4.2.4] Initializing cluster partition table arrangement...
2023-01-16 10:23:18.776  INFO 1 --- [ntLoopGroup-3-1] c.n.unit.agent.ControllerAgentServer     : onConnect -- client = 172.16.1.101 Port=38167 unitId:101
2023-01-16 10:23:18.981  INFO 1 --- [ntLoopGroup-3-2] c.n.socketio.handler.AuthorizeHandler    : channel active 539fb94d-a5de-4fef-b81d-82803885973f
2023-01-16 10:23:19.008  INFO 1 --- [ntLoopGroup-3-2] c.n.unit.agent.ControllerAgentServer     : onConnect -- client = 172.16.1.93 Port=39568 unitId:93
2023-01-16 10:23:22.177  INFO 1 --- [        Timer-0] c.n.unit.agent.ControllerAgentServer     : Sync signal time ipAddress:172.16.1.101
2023-01-16 10:23:22.244  INFO 1 --- [        Timer-2] c.n.unit.agent.ControllerAgentServer     : Sync signal time ipAddress:172.16.1.93
2023-01-16 10:23:25.953  INFO 1 --- [ntLoopGroup-3-3] c.n.socketio.handler.AuthorizeHandler    : channel active 874e3919-489d-4970-8a53-e51e8a880057
2023-01-16 10:23:25.978  INFO 1 --- [ntLoopGroup-3-3] c.n.unit.agent.ControllerAgentServer     : onConnect -- client = 172.16.1.107 Port=45166 unitId:107
2023-01-16 10:23:26.821  WARN 1 --- [  restartedMain] onConfiguration$FunctionBindingRegistrar : You have defined function definition that does not exist: streamBridge
2023-01-16 10:23:29.213  INFO 1 --- [        Timer-4] c.n.unit.agent.ControllerAgentServer     : Sync signal time ipAddress:172.16.1.107
2023-01-16 10:24:00.174  INFO 1 --- [        Timer-1] c.n.unit.agent.ControllerAgentServer     : Get signal time ipAddress:172.16.1.101
2023-01-16 10:24:00.248  INFO 1 --- [        Timer-3] c.n.unit.agent.ControllerAgentServer     : Get signal time ipAddress:172.16.1.93
2023-01-16 10:24:07.212  INFO 1 --- [        Timer-5] c.n.unit.agent.ControllerAgentServer     : Get signal time ipAddress:172.16.1.107
2023-01-16 10:24:37.351  INFO 1 --- [ntLoopGroup-3-4] c.n.socketio.handler.AuthorizeHandler    : channel active bd580b9b-8050-4315-bf0d-07e8acc797b2
2023-01-16 10:24:37.368  INFO 1 --- [ntLoopGroup-3-4] c.n.unit.agent.ControllerAgentServer     : onConnect -- client = 172.16.1.107 Port=45167 unitId:107
2023-01-16 10:24:40.595  INFO 1 --- [        Timer-6] c.n.unit.agent.ControllerAgentServer     : Sync signal time ipAddress:172.16.1.107
2023-01-16 10:24:43.373  INFO 1 --- [ntLoopGroup-3-5] c.n.socketio.handler.AuthorizeHandler    : channel active a7ffe375-462d-4c18-bd67-1811933b8084
2023-01-16 10:24:43.388  INFO 1 --- [ntLoopGroup-3-5] c.n.unit.agent.ControllerAgentServer     : onConnect -- client = 172.16.1.93 Port=39570 unitId:93
2023-01-16 10:24:46.611  INFO 1 --- [        Timer-8] c.n.unit.agent.ControllerAgentServer     : Sync signal time ipAddress:172.16.1.93
2023-01-16 10:25:18.595  INFO 1 --- [        Timer-7] c.n.unit.agent.ControllerAgentServer     : Get signal time ipAddress:172.16.1.107
2023-01-16 10:25:24.612  INFO 1 --- [        Timer-9] c.n.unit.agent.ControllerAgentServer     : Get signal time ipAddress:172.16.1.93
2023-01-16 10:25:51.726  INFO 1 --- [ntLoopGroup-3-6] c.n.socketio.handler.AuthorizeHandler    : channel active fb525bf4-fdd7-4f88-9a0d-88c923ef24df
2023-01-16 10:25:51.743  INFO 1 --- [ntLoopGroup-3-6] c.n.unit.agent.ControllerAgentServer     : onConnect -- client = 172.16.1.107 Port=45168 unitId:107
2023-01-16 10:25:54.964  INFO 1 --- [       Timer-10] c.n.unit.agent.ControllerAgentServer     : Sync signal time ipAddress:172.16.1.107
2023-01-16 10:26:32.965  INFO 1 --- [       Timer-11] c.n.unit.agent.ControllerAgentServer     : Get signal time ipAddress:172.16.1.107
  • 启动卡死不是必现,综合了出现卡死时的情况,发现,当出现kafka发送消息时,偶尔会卡主。这让我想起之前处理的一个启动报错 Spring Cloud Stream kafka项目启动时报错
  • 项目使用的是spring-cloud-starter-stream-kafka,函数式编程。代码业务逻辑是,程序接收到硬件的socket.io消息推送,就会使用kafka转发消息
  • 之前报错原因是项目启动时,在还未与kafka建立连接时,程序就接收到socket消息开始按照程序逻辑发送kafka消息,导致空指针异常,之后加了判断与初始化处理,问题解决
  • 本次现象类似,但初始化部署,程序没有连接硬件时,启动都是成功的。但程序已经使用一段时间,连接上硬件,接收硬件上报数据,就出现了偶发的更新启动卡死现象

排查和解决

  • 先是对比了启动失败和启动成功的日志,没有发现什么异常报错,启动成功日志如下
2023-01-16 10:13:32.705  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : vehGPSTrajectoryHandler bean listeners added
2023-01-16 10:13:32.709  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : wirelessOpLogHandler bean listeners added
2023-01-16 10:13:32.715  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : wirelessSecurityHandler bean listeners added
2023-01-16 10:13:33.786  INFO 1 --- [ntLoopGroup-3-1] c.n.socketio.handler.AuthorizeHandler    : channel active 65c277be-ed56-496a-9a3f-7243b1e7adff
2023-01-16 10:13:33.786  INFO 1 --- [ntLoopGroup-3-2] c.n.socketio.handler.AuthorizeHandler    : channel active f9f1acf4-ada2-4590-b949-46e0214e879b
2023-01-16 10:13:34.263  INFO 1 --- [ntLoopGroup-3-1] c.h.i.p.impl.PartitionStateManager       : [172.18.0.14]:5701 [dev] [4.2.4] Initializing cluster partition table arrangement...
2023-01-16 10:13:34.583  INFO 1 --- [ntLoopGroup-3-1] c.n.unit.agent.ControllerAgentServer     : onConnect -- client = 172.16.1.93 Port=39558 unitId:93
2023-01-16 10:13:34.583  INFO 1 --- [ntLoopGroup-3-2] c.n.unit.agent.ControllerAgentServer     : onConnect -- client = 172.16.1.107 Port=45152 unitId:107
2023-01-16 10:13:37.960  INFO 1 --- [        Timer-0] c.n.unit.agent.ControllerAgentServer     : Sync signal time ipAddress:172.16.1.93
2023-01-16 10:13:37.960  INFO 1 --- [        Timer-1] c.n.unit.agent.ControllerAgentServer     : Sync signal time ipAddress:172.16.1.107
2023-01-16 10:13:38.637  INFO 1 --- [ntLoopGroup-3-3] c.n.socketio.handler.AuthorizeHandler    : channel active 6dce6405-53af-4914-aa6c-dcbf2860babc
2023-01-16 10:13:38.660  INFO 1 --- [ntLoopGroup-3-3] c.n.unit.agent.ControllerAgentServer     : onConnect -- client = 172.16.1.101 Port=38154 unitId:101
2023-01-16 10:13:41.930  INFO 1 --- [        Timer-4] c.n.unit.agent.ControllerAgentServer     : Sync signal time ipAddress:172.16.1.101
2023-01-16 10:13:42.544  WARN 1 --- [  restartedMain] onConfiguration$FunctionBindingRegistrar : You have defined function definition that does not exist: streamBridge
2023-01-16 10:13:43.942  INFO 1 --- [ntLoopGroup-3-1] c.n.socketio.protocol.PacketDecoder      : Rcv Signal Machine IP:172.16.1.93
2023-01-16 10:13:43.942  INFO 1 --- [ntLoopGroup-3-2] c.n.socketio.protocol.PacketDecoder      : Rcv Signal Machine IP:172.16.1.107
2023-01-16 10:13:43.942  INFO 1 --- [ntLoopGroup-3-3] c.n.socketio.protocol.PacketDecoder      : Rcv Signal Machine IP:172.16.1.101
2023-01-16 10:13:43.944  INFO 1 --- [ntLoopGroup-3-3] c.n.u.a.handler.SignalIPParamHandler     : rcv:172.16.1.101
2023-01-16 10:13:43.945  INFO 1 --- [ntLoopGroup-3-2] c.n.u.a.handler.SignalIPParamHandler     : rcv:172.16.1.107
2023-01-16 10:13:43.945  INFO 1 --- [ntLoopGroup-3-1] c.n.u.a.handler.SignalIPParamHandler     : rcv:172.16.1.93
2023-01-16 10:13:47.039  INFO 1 --- [  restartedMain] org.jboss.threads                        : JBoss Threads version 3.1.0.Final
2023-01-16 10:13:47.304  INFO 1 --- [  restartedMain] com.newatc.unit.UnitApp                  : Started UnitApp in 41.426 seconds (JVM running for 44.067)
2023-01-16 10:13:47.319  INFO 1 --- [  restartedMain] com.newatc.unit.UnitApp                  : 
----------------------------------------------------------Application 'unit' is running! Access URLs:Local: 		http://localhost:8282/External: 	http://172.18.0.14:8282/当前版本为: 		1.4.2代码打包时间为: 		202301060900Profile(s): 	[prod, api-docs, no-liquibase]
----------------------------------------------------------
2023-01-16 10:13:47.320  INFO 1 --- [  restartedMain] com.newatc.unit.UnitApp                  : 
----------------------------------------------------------Config Server: 	Connected to Consul Server running in Docker
----------------------------------------------------------
  • 这个启动卡死现象是偶发的,确定与kafka发送消息有关,但不是必现,无法通过错误日志和日志对比找问题,只好另寻他法
  • 由于启动成功的日志里,也有接收硬件消息的日志打印,想着是不是有线程等待,暂时卡主了,需要使用一些自带的java命令来检查下
  • 在docker里使用 jstack 1,结果提示bash: jstack: command not found,去java文件夹看了下,确实没这些命令
    在这里插入图片描述
  • 为了减少镜像包大小,我们使用的是jre版本的java 11 环境docker镜像eclipse-temurin:11-jre-focal(程序镜像包170M左右),没有自带一些java命令,只好换成jdk版本的eclipse-temurin:11-jdk-focal重新打镜像包(300M左右)部署
  • 部署后重现了启动卡死的问题,使用 jstack 1命令查看此时的线程情况,一下子发现了问题Found 1 deadlock.,发现了一个死锁线程,具体日志如下(省略了部分日志):
I have no name!@315764d2cd4b:/$ jstack 1
2023-01-16 11:39:57
Full thread dump OpenJDK 64-Bit Server VM (11.0.17+8 mixed mode, sharing):JNI global refs: 29, weak refs: 0Found one Java-level deadlock:
=============================
"restartedMain":waiting to lock monitor 0x00007f730406bb00 (object 0x00000000e34c2160, a org.springframework.cloud.stream.binder.DefaultBinderFactory),which is held by "nioEventLoopGroup-3-2"
"nioEventLoopGroup-3-2":waiting to lock monitor 0x00007f725c019500 (object 0x00000000e1516e80, a java.util.concurrent.ConcurrentHashMap),which is held by "restartedMain"Java stack information for the threads listed above:
===================================================
"restartedMain":at org.springframework.cloud.stream.binder.DefaultBinderFactory.getBinder(DefaultBinderFactory.java:129)- waiting to lock <0x00000000e34c2160> (a org.springframework.cloud.stream.binder.DefaultBinderFactory)at org.springframework.cloud.stream.function.FunctionConfiguration$FunctionToDestinationBinder.determineTargetProtocol(FunctionConfiguration.java:640)at org.springframework.cloud.stream.function.FunctionConfiguration$FunctionToDestinationBinder.createFunctionHandler(FunctionConfiguration.java:579)at org.springframework.cloud.stream.function.FunctionConfiguration$FunctionToDestinationBinder.bindFunctionToDestinations(FunctionConfiguration.java:562)at org.springframework.cloud.stream.function.FunctionConfiguration$FunctionToDestinationBinder.afterPropertiesSet(FunctionConfiguration.java:421)at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.invokeInitMethods(AbstractAutowireCapableBeanFactory.java:1863)at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1800)at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:620)at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:542)at org.springframework.beans.factory.support.AbstractBeanFactory.lambda$doGetBean$0(AbstractBeanFactory.java:335)at org.springframework.beans.factory.support.AbstractBeanFactory$$Lambda$287/0x00000008401f6040.getObject(Unknown Source)at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:234)- locked <0x00000000e1516e80> (a java.util.concurrent.ConcurrentHashMap)at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:333)at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:208)at org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:953)at org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:918)at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:583)- locked <0x00000000e10d9e48> (a java.lang.Object)at org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext.refresh(ServletWebServerApplicationContext.java:145)at org.springframework.boot.SpringApplication.refresh(SpringApplication.java:740)at org.springframework.boot.SpringApplication.refreshContext(SpringApplication.java:415)at org.springframework.boot.SpringApplication.run(SpringApplication.java:303)at com.newatc.unit.UnitApp.main(UnitApp.java:72)at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(java.base@11.0.17/Native Method)at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(java.base@11.0.17/NativeMethodAccessorImpl.java:62)at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@11.0.17/DelegatingMethodAccessorImpl.java:43)at java.lang.reflect.Method.invoke(java.base@11.0.17/Method.java:566)at org.springframework.boot.devtools.restart.RestartLauncher.run(RestartLauncher.java:49)
"nioEventLoopGroup-3-2":at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:216)- waiting to lock <0x00000000e1516e80> (a java.util.concurrent.ConcurrentHashMap)at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:333)at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:213)at org.springframework.context.event.AbstractApplicationEventMulticaster.retrieveApplicationListeners(AbstractApplicationEventMulticaster.java:264)at org.springframework.context.event.AbstractApplicationEventMulticaster.getApplicationListeners(AbstractApplicationEventMulticaster.java:221)at org.springframework.context.event.SimpleApplicationEventMulticaster.multicastEvent(SimpleApplicationEventMulticaster.java:138)at org.springframework.context.support.AbstractApplicationContext.publishEvent(AbstractApplicationContext.java:421)at org.springframework.context.support.AbstractApplicationContext.publishEvent(AbstractApplicationContext.java:427)at org.springframework.context.support.AbstractApplicationContext.publishEvent(AbstractApplicationContext.java:378)at org.springframework.context.support.AbstractApplicationContext.finishRefresh(AbstractApplicationContext.java:938)at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:586)- locked <0x00000000e3d686a0> (a java.lang.Object)at org.springframework.cloud.stream.binder.DefaultBinderFactory.initializeBinderContextSimple(DefaultBinderFactory.java:471)at org.springframework.cloud.stream.binder.DefaultBinderFactory.getBinderInstance(DefaultBinderFactory.java:265)at org.springframework.cloud.stream.binder.DefaultBinderFactory.doGetBinder(DefaultBinderFactory.java:223)at org.springframework.cloud.stream.binder.DefaultBinderFactory.getBinder(DefaultBinderFactory.java:151)- locked <0x00000000e34c2160> (a org.springframework.cloud.stream.binder.DefaultBinderFactory)at org.springframework.cloud.stream.binding.BindingService.getBinder(BindingService.java:389)at org.springframework.cloud.stream.binding.BindingService.bindProducer(BindingService.java:273)at org.springframework.cloud.stream.function.StreamBridge.resolveDestination(StreamBridge.java:282)- locked <0x00000000e3453128> (a org.springframework.cloud.stream.function.StreamBridge)at org.springframework.cloud.stream.function.StreamBridge.send(StreamBridge.java:210)at org.springframework.cloud.stream.function.StreamBridge.send(StreamBridge.java:164)at org.springframework.cloud.stream.function.StreamBridge.send(StreamBridge.java:144)at com.newatc.unit.agent.ControllerAgentServer.onConnect(ControllerAgentServer.java:184)at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(java.base@11.0.17/Native Method)at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(java.base@11.0.17/NativeMethodAccessorImpl.java:62)at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@11.0.17/DelegatingMethodAccessorImpl.java:43)at java.lang.reflect.Method.invoke(java.base@11.0.17/Method.java:566)at com.newatc.socketio.annotation.OnConnectScanner$1.onConnect(OnConnectScanner.java:41)at com.newatc.socketio.namespace.Namespace.onConnect(Namespace.java:186)at com.newatc.socketio.handler.AuthorizeHandler.connect(AuthorizeHandler.java:151)at com.newatc.socketio.handler.AuthorizeHandler.channelActive(AuthorizeHandler.java:90)at io.netty.channel.AbstractChannelHandlerContext.invokeChannelActive(AbstractChannelHandlerContext.java:230)at io.netty.channel.AbstractChannelHandlerContext.invokeChannelActive(AbstractChannelHandlerContext.java:216)at io.netty.channel.AbstractChannelHandlerContext.fireChannelActive(AbstractChannelHandlerContext.java:209)at io.netty.channel.DefaultChannelPipeline$HeadContext.channelActive(DefaultChannelPipeline.java:1398)at io.netty.channel.AbstractChannelHandlerContext.invokeChannelActive(AbstractChannelHandlerContext.java:230)at io.netty.channel.AbstractChannelHandlerContext.invokeChannelActive(AbstractChannelHandlerContext.java:216)at io.netty.channel.DefaultChannelPipeline.fireChannelActive(DefaultChannelPipeline.java:895)at io.netty.channel.AbstractChannel$AbstractUnsafe.register0(AbstractChannel.java:522)at io.netty.channel.AbstractChannel$AbstractUnsafe.access$200(AbstractChannel.java:429)at io.netty.channel.AbstractChannel$AbstractUnsafe$1.run(AbstractChannel.java:486)at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:469)at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:503)at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)at java.lang.Thread.run(java.base@11.0.17/Thread.java:829)Found 1 deadlock.
  • 看了stack日志,看到了死锁互相等待的线程,就是这个问题了。具体触发死锁的代码为at org.springframework.cloud.stream.function.StreamBridge.send(StreamBridge.java:210),还是kafka问题
  • 发现问题,想着等启动完成后,再去执行发送kafka消息的逻辑。再启动类里增加一个公共变量,启动前为false,启动完成后为true,在kafka发送时,使用此变量判断

// 启动类UnitApp/*** 是否启动完成,默认false,启动完成后置true*/public static Boolean START_COMPLETED = false;public static void main(String[] args) {SpringApplication app = new SpringApplication(UnitApp.class);DefaultProfileUtil.addDefaultProfile(app);Environment env = app.run(args).getEnvironment();logApplicationStartup(env);START_COMPLETED = true;}// 使用时try {if (streamBridge != null && UnitApp.START_COMPLETED) {streamBridge.afterSingletonsInstantiated();streamBridge.send(OUTPUT_BINDING_NAME, mapper.writeValueAsString(offlineEventDTO));} else {log.error("streamBridge is null or UnitApp.START_COMPLETED is {}--------!", UnitApp.START_COMPLETED);}} catch (JsonProcessingException e) {log.error(e.getMessage());}
  • 问题解决

http://www.ppmy.cn/news/623362.html

相关文章

互联网面试知识技术总结(javacore、jvm、redis、kafka、zookeeper、mysql、hystrix)-第四章...

<3>Dubbo的负载均衡策略 负载均衡的实现是在客户端实现的&#xff0c;当服务提供方是集群的时候&#xff0c;为了避免大量请求一直落到一个或几个服务提供方机器上&#xff0c;从而使这些机器负载很高&#xff0c;甚至打死&#xff0c;需要做一定的负载均衡策略。Dubbo提…

macOS Catalina 10.15.6(19G2006)原版镜像 by OpenCore-0.6.1-08-06编译版

Mac 的本领&#xff0c;突飞猛进。 音乐、播客&#xff0c;联袂登台 iTunes 曾深刻影响了人们的视听娱乐方式。如今&#xff0c;音乐和播客这两款全新 app 携手登场&#xff0c;让一切再次改变。每款 app 都彻彻底底重新设计&#xff0c;只为让你能在 Mac 上尽享娱乐的精彩。请…

AppArmor零知识学习二、相识

本文内容参考&#xff1a; AppArmor GitBook&#xff0c; Linux安全模块AppArmor总结-CSDN博客&#xff0c; AppArmor快速入门-CSDN博客&#xff0c; apparmor 初识&#xff08;一&#xff09;_domybest_nsg的博客-CSDN博客&#xff0c; AppArmor与SElinux_apparmor seli…

我的k8s随笔:Kubernetes 1.17.0 部署讲解

k8s集群部署过程实践笔记共两种版本&#xff1a;一为专注部署操作&#xff0c;一为涉及部署操作讲解。本文为后者。 本文介绍了如何在两台 ubuntu 16.04 64 bit 双核 CPU 虚拟机上使用 kubeadm 部署 Kubernetes 1.17.0 集群的过程&#xff0c;网络插件为 flannel v0.11.0&#…

csredis-in-asp.net core理论实战-使用示例

csredis GitHub https://github.com/2881099/csredis 示例源码 https://github.com/luoyunchong/dotnetcore-examples/blob/master/Caching/OvOv.CsRedis/ 前提 安装并配置好redis服务&#xff0c;可用。vs2017或vs2019或vscode.net core 2.2 sdk 创建一个. NET Core Web…

printf段错误(core dump): 一个格式化输出引起的问题

1、printf段错误&#xff08;core dump&#xff09;: 一个格式化输出引起的问题 贴一个简单的例子&#xff1a; #include <stdio.h>int main(){int len sizeof(int);printf("%s\n",len);return 0; }rootubuntu:test#gcc test.c test.c: In function ‘main’…

关于使用ESP报错问题解决记录

1.封装udp透传函数报错 … WiFi connected CORRUPT HEAP: Bad head at 0x3ffdfbe8. Expected 0xabba1234 got 0x3ffdfcb0 abort() was called at PC 0x4008a481 on core 0 ELF file SHA256: 0000000000000000 Backtrace: 0x4008dad4:0x3ffd5a40 0x4008dd4d:0x3ffd5a60 0x4008a4…

Linux core dump在Android上的应用

&#xfeff;&#xfeff; 之前整理过一篇linux core dump的文章&#xff0c;一直想把这个特性在手机上应用起来&#xff0c;帮助排查错误&#xff0c;今天终于如愿以偿&#xff0c;记录如下。 【1】概述 在Android系统上&#xff0c;java应用程序出错时很容易通过logcat获取出…