Land structured logging and basic monitoring alerts
@@ -0,0 +1,43 @@
# Log collection and basic monitoring rollout

Related issue: `#11 [P1][T8] 日志收集与基础监控落地` (log collection and basic monitoring rollout)

## 1. Structured logging

- Middleware: `internal/observability/http.go::RequestLogMiddleware`
- Each request emits one JSON log line containing:
  - `request_id`
  - `method`, `path`, `status`
  - `latency_ms`
  - `client_ip`
  - `uid` (if authenticated)
  - `errors` (if any)

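To make the log shape concrete, here is a minimal sketch of a per-request JSON logging middleware. It is not the code in `internal/observability/http.go`: it assumes plain `net/http` and the standard `log/slog` JSON handler, and the names `requestLog`, `statusRecorder`, `logOnce`, and the `X-Request-Id` header are hypothetical. The `uid` and `errors` fields are left out because they depend on the service's auth and error handling.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log/slog"
	"net"
	"net/http"
	"net/http/httptest"
	"time"
)

// statusRecorder remembers the status code written by the wrapped handler.
type statusRecorder struct {
	http.ResponseWriter
	status int
}

func (r *statusRecorder) WriteHeader(code int) {
	r.status = code
	r.ResponseWriter.WriteHeader(code)
}

// requestLog wraps next and emits one JSON log line per request.
// Hypothetical stand-in for RequestLogMiddleware; the real signature may differ.
func requestLog(logger *slog.Logger, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		start := time.Now()
		// Default to 200 in case the handler never calls WriteHeader.
		rec := &statusRecorder{ResponseWriter: w, status: http.StatusOK}
		next.ServeHTTP(rec, r)

		ip, _, err := net.SplitHostPort(r.RemoteAddr)
		if err != nil {
			ip = r.RemoteAddr // no port present, log the raw value
		}
		logger.Info("request",
			// Assumption: the request ID arrives in an X-Request-Id header.
			"request_id", r.Header.Get("X-Request-Id"),
			"method", r.Method,
			"path", r.URL.Path,
			"status", rec.status,
			"latency_ms", time.Since(start).Milliseconds(),
			"client_ip", ip,
		)
	})
}

// logOnce sends one request through the middleware and returns the decoded
// log entry. Demo helper only; it panics if the log line is not valid JSON.
func logOnce(method, path string, status int) map[string]any {
	var buf bytes.Buffer
	logger := slog.New(slog.NewJSONHandler(&buf, nil))
	h := requestLog(logger, http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(status)
	}))
	req := httptest.NewRequest(method, path, nil)
	req.Header.Set("X-Request-Id", "req-1")
	h.ServeHTTP(httptest.NewRecorder(), req)

	entry := map[string]any{}
	if err := json.Unmarshal(buf.Bytes(), &entry); err != nil {
		panic(err)
	}
	return entry
}

func main() {
	entry := logOnce(http.MethodGet, "/ping", http.StatusNotFound)
	fmt.Println(entry["request_id"], entry["method"], entry["path"], entry["status"])
}
```

Wrapping the `ResponseWriter` is what lets the middleware see the final status code, since `net/http` does not expose it after the handler returns.
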
## 2. Core metrics

- Collector: `internal/observability/collector.go`
- Endpoint: `GET /metrics/basic`
- Metric fields:
  - `total_requests`
  - `client_errors`
  - `server_errors`
  - `client_error_rate_pct`
  - `server_error_rate_pct`
  - `avg_latency_ms`
  - `max_latency_ms`

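The fields above can be derived from a handful of counters. The following is a rough sketch of how such a collector might look, not the contents of `internal/observability/collector.go`: the `Collector` and `BasicMetrics` types, the `Observe` method, and the rule that 4xx counts as a client error and 5xx as a server error are all assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// BasicMetrics mirrors the field names listed for GET /metrics/basic.
type BasicMetrics struct {
	TotalRequests      int64   `json:"total_requests"`
	ClientErrors       int64   `json:"client_errors"`
	ServerErrors       int64   `json:"server_errors"`
	ClientErrorRatePct float64 `json:"client_error_rate_pct"`
	ServerErrorRatePct float64 `json:"server_error_rate_pct"`
	AvgLatencyMs       float64 `json:"avg_latency_ms"`
	MaxLatencyMs       int64   `json:"max_latency_ms"`
}

// Collector accumulates raw counters under a mutex (hypothetical type).
type Collector struct {
	mu             sync.Mutex
	total          int64
	clientErrors   int64
	serverErrors   int64
	totalLatencyMs int64
	maxLatencyMs   int64
}

// Observe records one finished request.
// Assumption: 4xx is a client error and 5xx is a server error.
func (c *Collector) Observe(status int, latencyMs int64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.total++
	c.totalLatencyMs += latencyMs
	if latencyMs > c.maxLatencyMs {
		c.maxLatencyMs = latencyMs
	}
	switch {
	case status >= 500:
		c.serverErrors++
	case status >= 400:
		c.clientErrors++
	}
}

// Snapshot derives rates and the average; they stay 0 when there is no traffic.
func (c *Collector) Snapshot() BasicMetrics {
	c.mu.Lock()
	defer c.mu.Unlock()
	m := BasicMetrics{
		TotalRequests: c.total,
		ClientErrors:  c.clientErrors,
		ServerErrors:  c.serverErrors,
		MaxLatencyMs:  c.maxLatencyMs,
	}
	if c.total > 0 {
		n := float64(c.total)
		m.ClientErrorRatePct = float64(c.clientErrors) / n * 100
		m.ServerErrorRatePct = float64(c.serverErrors) / n * 100
		m.AvgLatencyMs = float64(c.totalLatencyMs) / n
	}
	return m
}

// Handler serves the current snapshot as JSON.
func (c *Collector) Handler() http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		_ = json.NewEncoder(w).Encode(c.Snapshot())
	})
}

func main() {
	c := &Collector{}
	c.Observe(200, 10)
	c.Observe(404, 20)
	c.Observe(500, 30)
	c.Observe(200, 40)

	rec := httptest.NewRecorder()
	c.Handler().ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/metrics/basic", nil))
	fmt.Print(rec.Body.String())
}
```

Storing only sums and counts keeps `Observe` cheap on the request path; the percentages and the average are computed when the endpoint is read.
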
## 3. Basic alert thresholds

- Script: `scripts/ops/check_basic_metrics.sh`
- Default thresholds:
  - `SERVER_ERROR_RATE_THRESHOLD=5` (%)
  - `AVG_LATENCY_THRESHOLD_MS=800` (ms)
- When a threshold is exceeded:
  - The script exits with a non-zero status
  - If `OPS_ALERT_WEBHOOK` is set, an alert is sent to it

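The threshold comparison itself is simple enough to sketch. The real check lives in the shell script `scripts/ops/check_basic_metrics.sh`; the Go version below only illustrates the decision logic, using the default values listed above. The `thresholds` type, the `breaches` function, and the choice of a strict greater-than comparison are assumptions, and the webhook call is omitted.

```go
package main

import "fmt"

// thresholds holds the two limits described above (hypothetical type).
type thresholds struct {
	ServerErrorRatePct float64 // SERVER_ERROR_RATE_THRESHOLD, in percent
	AvgLatencyMs       float64 // AVG_LATENCY_THRESHOLD_MS, in milliseconds
}

// defaultThresholds uses the documented defaults: 5 % and 800 ms.
var defaultThresholds = thresholds{ServerErrorRatePct: 5, AvgLatencyMs: 800}

// breaches returns one message per exceeded threshold.
// Assumption: a value must be strictly greater than its limit to trigger.
func breaches(serverErrorRatePct, avgLatencyMs float64, limits thresholds) []string {
	var out []string
	if serverErrorRatePct > limits.ServerErrorRatePct {
		out = append(out, fmt.Sprintf("server_error_rate_pct %.2f > %.2f",
			serverErrorRatePct, limits.ServerErrorRatePct))
	}
	if avgLatencyMs > limits.AvgLatencyMs {
		out = append(out, fmt.Sprintf("avg_latency_ms %.2f > %.2f",
			avgLatencyMs, limits.AvgLatencyMs))
	}
	return out
}

func main() {
	// Sample values standing in for a parsed /metrics/basic response.
	problems := breaches(7.5, 420, defaultThresholds)
	for _, p := range problems {
		fmt.Println("ALERT:", p)
	}
	// A real checker would exit non-zero here and post to the webhook if set.
	fmt.Println("breach count:", len(problems))
}
```

Returning a list of messages rather than a single boolean lets the alert body say which metric tripped.
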
Recommended cron entry (every 5 minutes):

```bash
*/5 * * * * METRICS_URL=http://127.0.0.1:8080/metrics/basic SERVER_ERROR_RATE_THRESHOLD=5 AVG_LATENCY_THRESHOLD_MS=800 OPS_ALERT_WEBHOOK="https://example.com/webhook" /path/to/wx_service/scripts/ops/check_basic_metrics.sh >> /var/log/wx_service-metrics-check.log 2>&1
```
@@ -0,0 +1,20 @@
# Observability rollout verification record (2026-02-28)

Related issue: `#11 [P1][T8] 日志收集与基础监控落地` (log collection and basic monitoring rollout)

## Verification items

1. Unit tests
   - `go test ./internal/observability -v`

2. Full test suite
   - `go test ./...`

3. Alert script syntax check
   - `bash -n scripts/ops/check_basic_metrics.sh`

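## Results

- The structured logging middleware and the metrics collection logic compile and pass their tests.
- The `/metrics/basic` endpoint is wired into the startup flow.
- The basic alert threshold script passes the syntax check.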