After a QNAP (威联通) NAS restart, safeline-mgt is inaccessible and reports that it cannot connect to the DB

神经蛙

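Updated 1 year ago

What happened?

After installing SafeLine on a QNAP NAS with the SafeLine (雷池) one-click install script, everything worked. But once the NAS is restarted, the mgt page can no longer be accessed, although the site routes configured earlier still take effect.
Uninstalling and reinstalling does not help either.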

Error log

safeline-luigi

2024/08/17 16:14:10 [ERROR] cmd/main.go:36 pg addr not found, wait mgt to set it up.

safeline-mgt

panic: failed to connect to `host=safeline-pg user=safeline-ce database=safeline-ce`: dial error (timeout: dial tcp 172.22.222.2:5432: connect: connection timed out)

goroutine 1 [running]:
main.initDeps()
        /work/cmd/server/main.go:282 +0x34e
main.main()
        /work/cmd/server/main.go:318 +0x54
2024/08/17 16:24:08 [notice] 7#7: using the "epoll" event method
2024/08/17 16:24:08 [notice] 7#7: nginx/1.25.5
2024/08/17 16:24:08 [notice] 7#7: built by gcc 13.2.1 20231014 (Alpine 13.2.1_git20231014)
2024/08/17 16:24:08 [notice] 7#7: OS: Linux 5.10.60-qnap
2024/08/17 16:24:08 [notice] 7#7: getrlimit(RLIMIT_NOFILE): 65535:65535
2024/08/17 16:24:08 [notice] 8#8: start worker processes
2024/08/17 16:24:08 [notice] 8#8: start worker process 10
2024/08/17 16:24:08 [notice] 8#8: start worker process 11
2024/08/17 16:24:39 [error] 10#10: *1 connect() failed (111: Connection refused) while connecting to upstream, client: 127.0.0.1, server: , request: "GET /api/open/health HTTP/2.0", upstream: "http://127.0.0.1:8000/api/open/health", host: "localhost:1443"
2024/08/17 16:25:10 [error] 11#11: *3 connect() failed (111: Connection refused) while connecting to upstream, client: 127.0.0.1, server: , request: "GET /api/open/health HTTP/2.0", upstream: "http://127.0.0.1:8000/api/open/health", host: "localhost:1443"

safeline-mario

connect database failed

safeline-bridge

time=2024-08-17T16:12:26.928+08:00 level=INFO msg="start grpc server" network=unix address=/app/run/safeline.sock

safeline-pg

PostgreSQL init process complete; ready for start up.

2024-08-17 16:12:54.818 HKT [1] LOG:  starting PostgreSQL 15.2 (Debian 15.2-1.pgdg110+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit
2024-08-17 16:12:54.889 HKT [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2024-08-17 16:12:54.889 HKT [1] LOG:  listening on IPv6 address "::", port 5432
2024-08-17 16:12:55.076 HKT [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2024-08-17 16:12:55.289 HKT [72] LOG:  database system was shut down at 2024-08-17 16:12:53 HKT
2024-08-17 16:12:55.424 HKT [1] LOG:  database system is ready to accept connections
2024-08-17 16:17:55.388 HKT [70] LOG:  checkpoint starting: time
2024-08-17 16:18:03.376 HKT [70] LOG:  checkpoint complete: wrote 34 buffers (0.2%); 0 WAL file(s) added, 0 removed, 0 recycled; write=5.006 s, sync=1.222 s, total=7.988 s; sync files=11, longest=0.296 s, average=0.112 s; distance=151 kB, estimate=151 kB

神经蛙

Updated 1 year ago

The compose.yaml file has changed; you need to update it.

神经蛙

Updated 1 year ago

I used https://github.com/chaitin/SafeLine/blob/main/compose.yaml, but the home page is still inaccessible.
safeline-mgt log:

2024/09/23 13:55:29 [notice] 7#7: using the "epoll" event method
2024/09/23 13:55:29 [notice] 7#7: nginx/1.25.5
2024/09/23 13:55:29 [notice] 7#7: built by gcc 13.2.1 20231014 (Alpine 13.2.1_git20231014)
2024/09/23 13:55:29 [notice] 7#7: OS: Linux 5.10.60-qnap
2024/09/23 13:55:29 [notice] 7#7: getrlimit(RLIMIT_NOFILE): 65535:65535
2024/09/23 13:55:29 [notice] 8#8: start worker processes
2024/09/23 13:55:29 [notice] 8#8: start worker process 9
2024/09/23 13:55:29 [notice] 8#8: start worker process 10
2024/09/23 13:55:59 [error] 9#9: *1 connect() failed (111: Connection refused) while connecting to upstream, client: 127.0.0.1, server: , request: "GET /api/open/health HTTP/2.0", upstream: "http://127.0.0.1:8000/api/open/health", host: "localhost:1443"
2024/09/23 13:56:29 [error] 10#10: *3 connect() failed (111: Connection refused) while connecting to upstream, client: 127.0.0.1, server: , request: "GET /api/open/health HTTP/2.0", upstream: "http://127.0.0.1:8000/api/open/health", host: "localhost:1443"
2024/09/23 13:57:00 [error] 9#9: *5 connect() failed (111: Connection refused) while connecting to upstream, client: 127.0.0.1, server: , request: "GET /api/open/health HTTP/2.0", upstream: "http://127.0.0.1:8000/api/open/health", host: "localhost:1443"
2024/09/23 13:57:30 [error] 9#9: *7 connect() failed (111: Connection refused) while connecting to upstream, client: 127.0.0.1, server: , request: "GET /api/open/health HTTP/2.0", upstream: "http://127.0.0.1:8000/api/open/health", host: "localhost:1443"
panic: failed to connect to `host=safeline-pg user=safeline-ce database=safeline-ce`: dial error (timeout: dial tcp 169.254.222.2:5432: connect: connection timed out)

goroutine 1 [running]:
main.initDeps()
        /work/cmd/server/main.go:291 +0x44e
main.main()
        /work/cmd/server/main.go:331 +0x54
2024/09/23 13:57:46 [notice] 6#6: using the "epoll" event method
2024/09/23 13:57:46 [notice] 6#6: nginx/1.25.5
2024/09/23 13:57:46 [notice] 6#6: built by gcc 13.2.1 20231014 (Alpine 13.2.1_git20231014)
2024/09/23 13:57:46 [notice] 6#6: OS: Linux 5.10.60-qnap
2024/09/23 13:57:46 [notice] 6#6: getrlimit(RLIMIT_NOFILE): 65535:65535
2024/09/23 13:57:46 [notice] 7#7: start worker processes
2024/09/23 13:57:46 [notice] 7#7: start worker process 8
2024/09/23 13:57:46 [notice] 7#7: start worker process 9
2024/09/23 13:58:16 [error] 8#8: *1 connect() failed (111: Connection refused) while connecting to upstream, client: 127.0.0.1, server: , request: "GET /api/open/health HTTP/2.0", upstream: "http://127.0.0.1:8000/api/open/health", host: "localhost:1443"

神经蛙

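Updated 1 year ago

It looks like safeline-mgt cannot connect to the database. Check whether the safeline-pg container is in a normal state, or whether there is a problem with communication between the containers.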
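
One way to separate "pg is down" from "the network path to pg is broken" is a plain TCP connect to safeline-pg on port 5432 (the host and port in the panic message) from a container attached to the same Docker network. A minimal sketch, assuming Python is available in such a container (the SafeLine images may not ship it); `can_connect` is a hypothetical helper name, not part of SafeLine:

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # DNS failure, connection refused, and timeouts are all OSError subclasses.
        return False
```

Calling `can_connect("safeline-pg", 5432)` from the container side would return False on the timeout seen in the mgt log. It only tests reachability and says nothing about PostgreSQL authentication.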

神经蛙

Updated 1 year ago

pg shows no error log, but in the 6 hours since it started it has not received a single request.

2024-09-23 13:55:29.004 HKT [1] LOG:  starting PostgreSQL 15.2 (Debian 15.2-1.pgdg110+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit
2024-09-23 13:55:29.014 HKT [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2024-09-23 13:55:29.014 HKT [1] LOG:  listening on IPv6 address "::", port 5432
2024-09-23 13:55:29.960 HKT [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2024-09-23 13:55:30.536 HKT [28] LOG:  database system was shut down at 2024-09-23 08:43:44 HKT
2024-09-23 13:55:30.957 HKT [1] LOG:  database system is ready to accept connections
2024-09-23 14:00:30.634 HKT [26] LOG:  checkpoint starting: time
2024-09-23 14:00:30.907 HKT [26] LOG:  checkpoint complete: wrote 3 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.046 s, sync=0.041 s, total=0.273 s; sync files=2, longest=0.023 s, average=0.021 s; distance=0 kB, estimate=0 kB

神经蛙

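Updated 1 year ago

Then it looks like the container network is the problem: the requests probably never reached pg and are being blocked somewhere along the way.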
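
When chasing a network problem like this, it can help to pull out the address that safeline-mgt actually dialed, since the two panic messages in this thread show different targets for safeline-pg. A minimal sketch; `dial_target` is a hypothetical helper, and the regular expression only assumes the `dial tcp IP:PORT` format seen in the logs above:

```python
import ipaddress
import re

# Matches the "dial tcp <IPv4>:<port>" fragment of the panic message.
DIAL_RE = re.compile(r"dial tcp (\d{1,3}(?:\.\d{1,3}){3}):(\d+)")


def dial_target(line: str):
    """Return (IPv4Address, port) from a panic line, or None if absent."""
    match = DIAL_RE.search(line)
    if match is None:
        return None
    return ipaddress.ip_address(match.group(1)), int(match.group(2))


# The two panic lines quoted in this thread (2024/08/17 and 2024/09/23).
first = ("panic: failed to connect to `host=safeline-pg user=safeline-ce "
         "database=safeline-ce`: dial error (timeout: dial tcp "
         "172.22.222.2:5432: connect: connection timed out)")
second = ("panic: failed to connect to `host=safeline-pg user=safeline-ce "
          "database=safeline-ce`: dial error (timeout: dial tcp "
          "169.254.222.2:5432: connect: connection timed out)")

for line in (first, second):
    addr, port = dial_target(line)
    print(addr, port, "link-local" if addr.is_link_local else "not link-local")
```

The extracted address can then be compared with the IP that `docker inspect safeline-pg` reports for the container. The sketch does not say why the address differs between the two runs.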

齐天大圣孙悟空

Updated 1 year ago

It feels like the docker service is not behaving normally. If nothing else works, try restarting docker.

神经蛙

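Updated 1 year ago

Hmm... I have restarted it many times. The strange thing is that it worked the first time and only stopped working after the NAS was restarted...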

齐天大圣孙悟空

Updated 1 year ago

Run docker ps and take a look.

神经蛙

Updated 1 year ago

The containers whose status shows less than 1 hour of uptime are basically all restarting nonstop.

CONTAINER ID   IMAGE                                               COMMAND                  CREATED        STATUS                        PORTS                                          NAMES
de88c8043b61   docker.1ms.run/chaitin/safeline-bridge-g:latest     "/app/bridge serve -…"   6 hours ago    Up 6 hours                                                                   safeline-bridge
c839c8fd1e5b   docker.1ms.run/chaitin/safeline-luigi-g:latest      "/bin/sh -c /app/lui…"   6 hours ago    Up 18 seconds                 80/tcp                                         safeline-luigi
b8bfbebf3555   docker.1ms.run/chaitin/safeline-mgt-g:latest        "/docker-entrypoint.…"   6 hours ago    Up 2 minutes (unhealthy)      80/tcp, 0.0.0.0:9443->1443/tcp                 safeline-mgt
6bef8d5d07c7   docker.1ms.run/chaitin/safeline-fvm-g:latest        "./fvm /app/config.y…"   6 hours ago    Up 6 hours                                                                   safeline-fvm
9d7bd0c511a6   docker.1ms.run/chaitin/safeline-chaos-g:latest      "./entrypoint.sh"        6 hours ago    Up 6 hours                    9000/tcp                                       safeline-chaos
a98a27f291ea   docker.1ms.run/chaitin/safeline-tengine-g:latest    "entrypoint.sh nginx…"   6 hours ago    Up 6 hours                                                                   safeline-tengine
dd9d9deb4251   docker.1ms.run/chaitin/safeline-detector-g:latest   "/detector/entrypoin…"   6 hours ago    Up About a minute (healthy)   8000-8001/tcp                                  safeline-detector
6ac9c31b811f   docker.1ms.run/chaitin/safeline-mario-g:latest      "/mario/entrypoint.sh"   6 hours ago    Up 4 minutes (unhealthy)                                                     safeline-mario
438fde11b172   docker.1ms.run/chaitin/safeline-postgres:15.2       "docker-entrypoint.s…"   45 hours ago   Up 6 hours (healthy)          5432/tcp                                       safeline-pg
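
The "status under 1 hour" check can be made mechanical by parsing the STATUS column. A rough sketch, assuming the humanized "Up N units" wording shown in the docker ps output above; `uptime_seconds` and `recently_restarted` are hypothetical helper names, and month or year durations are not handled:

```python
import re

# Seconds per unit for the humanized durations seen in the STATUS column.
UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400, "week": 604800}

STATUS_RE = re.compile(r"^Up (?:About )?(an?|\d+) (second|minute|hour|day|week)s?")


def uptime_seconds(status: str):
    """Parse a STATUS such as 'Up 2 minutes (unhealthy)'; None if it does not match."""
    match = STATUS_RE.match(status)
    if match is None:
        return None
    count, unit = match.groups()
    number = 1 if count in ("a", "an") else int(count)
    return number * UNIT_SECONDS[unit]


def recently_restarted(statuses: dict, threshold: int = 3600) -> list:
    """Names whose parsed uptime is below threshold seconds, sorted alphabetically."""
    names = []
    for name, status in statuses.items():
        seconds = uptime_seconds(status)
        if seconds is not None and seconds < threshold:
            names.append(name)
    return sorted(names)


# NAMES and STATUS values copied from the docker ps output above.
statuses = {
    "safeline-bridge": "Up 6 hours",
    "safeline-luigi": "Up 18 seconds",
    "safeline-mgt": "Up 2 minutes (unhealthy)",
    "safeline-fvm": "Up 6 hours",
    "safeline-chaos": "Up 6 hours",
    "safeline-tengine": "Up 6 hours",
    "safeline-detector": "Up About a minute (healthy)",
    "safeline-mario": "Up 4 minutes (unhealthy)",
    "safeline-pg": "Up 6 hours (healthy)",
}
print(recently_restarted(statuses))
# prints ['safeline-detector', 'safeline-luigi', 'safeline-mario', 'safeline-mgt']
```

A short uptime only suggests a restart loop. Note that safeline-pg itself has been up for the full 6 hours and reports healthy, which fits the earlier reading that the database is running but unreachable.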

随风

Updated 1 year ago

I ran into the same problem too.

Is there a solution for this yet?

神经蛙

Updated 1 year ago

Run docker logs safeline-pg to see why the database did not come up.

长亭百川云 - Technical Discussion
