[GH-ISSUE #1842] Sleep can break ability to reconnect #1454

Closed
opened 2026-05-05 12:55:19 -06:00 by gitea-mirror · 4 comments

Originally created by @keyoti on GitHub (Jun 5, 2020).
Original GitHub issue: https://github.com/fatedier/frp/issues/1842

I am seeing a lot of this in my logs:

```
2020/06/05 04:40:26 [I] [service.go:514] [ce09965c55e3308d] client login info: ip [172.x.x.x:58672] version [0.33.0] hostname [] os [windows] arch [amd64]
2020/06/05 04:40:26 [I] [control.go:289] [ce09965c55e3308d] Replaced by client [ce09965c55e3308d]
2020/06/05 04:40:56 [I] [service.go:514] [ce09965c55e3308d] client login info: ip [172.x.x.x:59868] version [0.33.0] hostname [] os [windows] arch [amd64]
2020/06/05 04:40:56 [I] [control.go:289] [ce09965c55e3308d] Replaced by client [ce09965c55e3308d]
```

This repeats every 30 seconds, forever.

**What version of frp are you using (./frpc -v or ./frps -v)?**

0.33.0

**What operating system and processor architecture are you using (`go env`)?**

Server is Ubuntu x64, client is Windows 10

**Configuration you used:**

Running FRPS as a TCP backend node behind HAProxy
HTTPS proxy service only

**Steps to reproduce the issue:**

  1. Run frps
  2. Connect from frpc
  3. Put the computer to sleep for a long time (it seems to need hours)

**Describe the results you received:**

frpc cannot reconnect properly; it keeps trying to reconnect and stalls. The user must manually kill frpc and restart it.

**Describe the results you expected:**

frpc should be able to reconnect as normal.

**Additional information you deem important (e.g. issue happens only occasionally):**

I am new to Go, but I suspect the problem is here:

In the server's `Service.RegisterControl`, you do:

```
ctl := NewControl(ctx, svr.rc, svr.pxyManager, svr.pluginManager, svr.authVerifier, ctlConn, loginMsg, svr.cfg)
if oldCtl := svr.ctlManager.Add(loginMsg.RunId, ctl); oldCtl != nil {
	oldCtl.allShutdown.WaitDone()
}

ctl.Start()
```

In my situation I don't think `ctl.Start()` is ever called, because `oldCtl.allShutdown.WaitDone()` is hanging.

`oldCtl.allShutdown.WaitDone()` is:

```
func (s *Shutdown) WaitDone() {
	<-s.doneCh
}
```

Is it true that a channel receive can block indefinitely if the TCP connection is dead? Is it possible to time it out?

Thanks!


@fatedier commented on GitHub (Jun 5, 2020):

Can you check whether your log has the `client exit success` message? It means your old control was stopped and `allShutdown.Done()` has been called, so `allShutdown.WaitDone()` will not block.

Set log_level to `debug` to see more info. You may also check the frpc logs to see whether your client re-login succeeds.


@keyoti commented on GitHub (Jun 11, 2020):

> You can check if your log has `client exit success` message? It means your old control was stopped and `allShutdown.Done()` has been called.

It is logging `client exit success` for some clients at some times, but I just reproduced the problem and for my client it is not logging that.

> Set log_level to `debug` to see more info. You may also see if your client relogin success from logs of frpc.

I set it to trace and this is what I am seeing right now on my server and client:

FRPS:

```
ubuntu@ip-172-31-x-x:~/goprojects/bin$ sudo tail frps.log -f | grep 666f
2020/06/11 04:32:41 [I] [service.go:521] [666f8df9186830ae] client login info: ip [172.31.x.x:35650] version [0.33.0] hostname [] os [windows] arch [amd64]
2020/06/11 04:32:41 [I] [control.go:289] [666f8df9186830ae] Replaced by client [666f8df9186830ae]
2020/06/11 04:33:11 [I] [service.go:521] [666f8df9186830ae] client login info: ip [172.31.x.x:37224] version [0.33.0] hostname [] os [windows] arch [amd64]
2020/06/11 04:33:11 [I] [control.go:289] [666f8df9186830ae] Replaced by client [666f8df9186830ae]
2020/06/11 04:33:41 [I] [service.go:521] [666f8df9186830ae] client login info: ip [172.31.x.x:38892] version [0.33.0] hostname [] os [windows] arch [amd64]
2020/06/11 04:33:41 [I] [control.go:289] [666f8df9186830ae] Replaced by client [666f8df9186830ae]
2020/06/11 04:34:11 [I] [service.go:521] [666f8df9186830ae] client login info: ip [172.31.x.x:40132] version [0.33.0] hostname [] os [windows] arch [amd64]
2020/06/11 04:34:11 [I] [control.go:289] [666f8df9186830ae] Replaced by client [666f8df9186830ae]
```

FRPC:

```
2020/06/10 21:27:40 [I] [service.go:184] [666f8df9186830ae] try to reconnect to server...
2020/06/10 21:27:50 [W] [service.go:187] [666f8df9186830ae] reconnect to server error: i/o deadline reached
2020/06/10 21:28:10 [I] [service.go:184] [666f8df9186830ae] try to reconnect to server...
2020/06/10 21:28:20 [W] [service.go:187] [666f8df9186830ae] reconnect to server error: i/o deadline reached
2020/06/10 21:28:40 [I] [service.go:184] [666f8df9186830ae] try to reconnect to server...
2020/06/10 21:28:50 [W] [service.go:187] [666f8df9186830ae] reconnect to server error: i/o deadline reached
2020/06/10 21:29:10 [I] [service.go:184] [666f8df9186830ae] try to reconnect to server...
2020/06/10 21:29:20 [W] [service.go:187] [666f8df9186830ae] reconnect to server error: i/o deadline reached
2020/06/10 21:29:40 [I] [service.go:184] [666f8df9186830ae] try to reconnect to server...
2020/06/10 21:29:50 [W] [service.go:187] [666f8df9186830ae] reconnect to server error: i/o deadline reached
2020/06/10 21:30:10 [I] [service.go:184] [666f8df9186830ae] try to reconnect to server...
2020/06/10 21:30:20 [W] [service.go:187] [666f8df9186830ae] reconnect to server error: i/o deadline reached
2020/06/10 21:30:40 [I] [service.go:184] [666f8df9186830ae] try to reconnect to server...
```

I will try to debug further tomorrow; hopefully I can see on the client what is happening.


@keyoti commented on GitHub (Jun 11, 2020):

OK, I'm sorry: it seems to be present only in my fork, because I added the ability to close proxies to the client admin API. The reconnect issue occurs only when there are no proxies in use.

So I will figure it out, but in case it is of any use to anyone, here is how I remove proxies in client/admin_api.go:

```
func (svr *Service) apiCloseProxy(w http.ResponseWriter, r *http.Request) {
	var (
		buf []byte
		res GeneralResponse
	)
	defer func() {
		log.Info("Http response [/api/closeProxy]: code [%d]", res.Code)
		buf, _ = json.Marshal(&res)
		w.Header().Set("Content-Type", "application/json")
		w.Write(buf)
	}()

	getParams := r.URL.Query()

	log.Info("Http request: [/api/closeProxy]")

	mProxyName := "web-" + getParams.Get("s")

	svr.ctl.sendCh <- &msg.CloseProxy{
		ProxyName: mProxyName,
	}

	stopproxy := svr.ctl.pm.GetProxies()[mProxyName]
	if stopproxy != nil {
		stopproxy.StopWithNoRestart()
	}

	res.Code = 200
	res.Msg = "OK"
	return
}
```


@fatedier commented on GitHub (Jun 12, 2020):

@keyoti Maybe we can support an `add, update, delete` proxies API in the future.

For now, frpc only supports reloading proxy configurations from the configuration file. The simplest way to do this is to translate the `add, update, delete` operations into modifications of the configuration file and then call the reload API; frpc will compare the differences and perform the corresponding operations.
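The reload workflow described above relies on frpc's admin API, which is only available when the admin options are set. A minimal sketch of the relevant `frpc.ini` fragment (addresses, ports, and credentials are illustrative, not taken from this issue):

```ini
; frpc.ini -- enable the admin API so the reload endpoint is reachable
[common]
server_addr = x.x.x.x
server_port = 7000
admin_addr = 127.0.0.1
admin_port = 7400
admin_user = admin
admin_pwd = admin
```

After editing the proxy sections in the file, a GET request to the `/api/reload` endpoint on the admin address asks frpc to diff the configuration and start or stop proxies accordingly.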
