驱蚊器喵的插座

Adding airport (proxy provider) subscription-link support to glider

2024-01-10T00:10:59.000Z

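Preface

I have recently been scraping a foreign website with fairly thorough anti-scraping measures, so I need to switch IPs frequently while accessing it.

Approach 1: Clash for Windows with a script that checks and switches nodes

At first I found a way to add a script to Clash for Windows to switch IPs. However, that strategy only guarantees that the current IP is highly available; it does not keep rotating IPs.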

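Reference
https://github.com/Fndroid/clash_for_windows_pkg/issues/1556#issuecomment-1189829336

I had saved this page in my notes. Because of the "Clash incident", the repo has been deleted, and I could not find the script recorded on any other blog.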

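Approach 2: using Clash as the HTTP client's proxy in Golang

Reference

https://p00q.cn/posts/906.html

This approach creates a Clash instance inside the program, and the instance reads a Clash configuration file. Because the Clash repository is gone, importing the library is inconvenient, and it took me a long time to find a backup. The most recent backup I found is https://github.com/Ieooo/clash

But aggregating nodes from multiple airports this way is inconvenient. Would I have to create multiple instances?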

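Approach 3: using glider to aggregate proxy nodes into a proxy pool

Later I found glider, a great tool for turning airport nodes into a proxy pool for crawlers. It fits my needs perfectly.

However, the current version, 0.16.3, does not yet support base64-encoded node links.

At first, I used a script from another repository to convert my airport's configuration file.

glider ran fine after startup, but this was not a long-term solution.

Airports typically change their nodes about once a day, so repeatedly decoding the links and copying nodes into the configuration file is not realistic. Then I came across an article that inspired me, so I started hacking on the code.

Based on the latest version, 0.16.3, I first added base64 decoding and then added subscription-link support.

I originally also wanted to poll the subscription links periodically and update the nodes automatically, but that means replacing nodes that already exist in the group, which seemed troublesome to handle, so I have left it as is for now.

So at startup glider fetches nodes from the subscriptions and puts them into the forwarder group. Forwarders in the group are still health-checked periodically at the interval set in the configuration file.

In summary, the changes are: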

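Added support for base64 vmess links (the base64-decoded payload is JSON)
Added support for base64 ss links (in these links, the method and password are base64-encoded)
Fixed the trojan link format: the links I get all use sni rather than serverName, and skipVerify is true. (trojan in glider 0.16.3 still does not support alpn or udp.)
Added support for airport subscription links. Multiple subscription links can be configured. This has only been tested and confirmed working in the [4.multiple_forwarders] scenario and may have broken other features. (Subscription restrictions: base64-encoded, not in Clash format; the decoded content is multiple node links with no rule configuration.)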
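As a rough illustration of the subscription restriction above, here is a minimal Go sketch of decoding a subscription body into node links. It is not the code from the modified glider. The function name decodeSubscription is hypothetical, and the sketch assumes one link per line after decoding and that providers may use padded or unpadded base64.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// decodeSubscription turns a base64 subscription body into node links.
// Assumption: the decoded text holds one link per line.
func decodeSubscription(body string) ([]string, error) {
	body = strings.TrimSpace(body)
	// Providers differ on padding and alphabet, so try the common variants.
	encodings := []*base64.Encoding{
		base64.StdEncoding, base64.RawStdEncoding,
		base64.URLEncoding, base64.RawURLEncoding,
	}
	var raw []byte
	var err error
	for _, enc := range encodings {
		if raw, err = enc.DecodeString(body); err == nil {
			break
		}
	}
	if err != nil {
		return nil, fmt.Errorf("subscription is not valid base64: %w", err)
	}
	var links []string
	for _, line := range strings.Split(string(raw), "\n") {
		line = strings.TrimSpace(line)
		// Keep only lines that look like scheme://... links.
		if strings.Contains(line, "://") {
			links = append(links, line)
		}
	}
	return links, nil
}

func main() {
	// "c3M6Ly94CnNzOi8veQ==" is base64 for "ss://x\nss://y".
	links, err := decodeSubscription("c3M6Ly94CnNzOi8veQ==")
	if err != nil {
		panic(err)
	}
	for _, l := range links {
		fmt.Println(l)
	}
}
```

Each returned link would still need its own per-protocol parsing (vmess JSON, ss method and password), which this sketch leaves out.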

One new option was added to the configuration file:

forwardprovider=https://www.xxx.com/api/v1/client/subscribe?token=xxxxxx

It can be specified on multiple lines:

forwardprovider=https://www.xxx.com/api/v1/client/subscribe?token=xxxxxx
forwardprovider=https://www.xxx2.com/api/v1/client/subscribe?token=xxxxxx
forwardprovider=https://www.xxx3.com/api/v1/client/subscribe?token=xxxxxx

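Test command to check whether requests go out through rotating proxy IPs:

for i in {1..20};do curl -s -k https://api.ip.sb/ip -H 'user-agent: zsh-proxy' -x "http://127.0.0.1:8443" ;done

Result

While adding base64 support, I also found that glider does not support some newer proxy features:

No support for trojan's alpn and udp
No support for vmess's udp or network: grpc

If some nodes fail to connect because their DNS records cannot be found,

configure DNS servers like this: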

# Setup a dns forwarding server
dns=:53
# global remote dns server (you can specify different dns server in rule file)
dnsserver=1.1.1.1
dnsserver=8.8.8.8

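After startup, glider looks up domain names by connecting to the DNS servers through a working proxy.

It would be even better if DoH were supported. Someone submitted a PR for it earlier (see https://github.com/nadoo/glider/pull/208), but the author felt that it would pull in net/http and make the compiled binary much larger. The PR was later closed after going more than 90 days without a reply.

In my testing, since fetching subscription content also requires net/http, the compiled binary is 12M, 4M more than the 8M before the change. I think that is acceptable.

Easter egg

When I first looked at the repository, README.md was entirely in English and the author's profile did not say whether they were a developer from China, but my gut feeling was that they were. Besides XX, which other country's people would have this kind of need?

A Google search showed that the author posted on v2ex when open-sourcing the code:

https://www.v2ex.com/t/375186

So glider started out as a small tool the developer shared for personal use and only later became widely known, just like ss once did.

After the author shared the tool, someone even packaged it for Arch.

References

Accessing Kerberos-secured pages with curl, Golang, and Python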

2023-08-05T00:00:34.000Z

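I recently needed to collect YARN queue usage. Since the cluster was upgraded to HDP 3.1.5, the YARN ResourceManager pages require Kerberos authentication. This post summarizes three ways to access Kerberos-secured pages.

curl

Used for debugging, together with jq for parsing JSON.

Reference: https://docs.cloudera.com/runtime/7.2.10/scaling-namespaces/topics/hdfs-curl-url-http-spnego.html

Command

Before running it, make sure you have already authenticated with kinit.

curl -u : --negotiate "http://rm.example:8088/jmx"

-u : supplies an empty username and password. With Kerberos, authentication is done with tickets rather than a username and password, so the empty credentials are only there because curl needs a -u option to be present before it activates its authentication code.
--negotiate: enables GSS-Negotiate (SPNEGO) authentication, the mechanism used to carry Kerberos over HTTP.

Example

Accessing the active NameNode

Note: -I, -s, and -v do not affect the access flow.
-I shows only the response headers, not the body.
-v enables curl's verbose mode, which shows the full details of the request and response, including request headers, response headers, and data.
-s is silent mode: no progress meter or error messages.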

$ curl -I -v -s -u : --negotiate http://:50070/jmx
* About to connect() to  port 50070 (#0)
*   Trying ...
* Connected to  () port 50070 (#0)
> HEAD /jmx HTTP/1.1
> User-Agent: curl/7.29.0
> Host: :50070
> Accept: */*
> 
< HTTP/1.1 401 Authentication required
HTTP/1.1 401 Authentication required
< Date: Sat, 05 Aug 2023 10:00:20 GMT
Date: Sat, 05 Aug 2023 10:00:20 GMT
< Date: Sat, 05 Aug 2023 10:00:20 GMT
Date: Sat, 05 Aug 2023 10:00:20 GMT
< Pragma: no-cache
Pragma: no-cache
< X-FRAME-OPTIONS: SAMEORIGIN
X-FRAME-OPTIONS: SAMEORIGIN
< WWW-Authenticate: Negotiate
WWW-Authenticate: Negotiate
< Set-Cookie: hadoop.auth=; Path=/; HttpOnly
Set-Cookie: hadoop.auth=; Path=/; HttpOnly
< Cache-Control: must-revalidate,no-cache,no-store
Cache-Control: must-revalidate,no-cache,no-store
< Content-Type: text/html;charset=iso-8859-1
Content-Type: text/html;charset=iso-8859-1
< Content-Length: 263
Content-Length: 263

< 
* Connection #0 to host  left intact
* Issue another request to this URL: 'http://:50070/jmx'
* Found bundle for host : 0x1c6cfa0
* Re-using existing connection! (#0) with host 
* Connected to  () port 50070 (#0)
* Server auth using GSS-Negotiate with user ''
> HEAD /jmx HTTP/1.1
> Authorization: Negotiate 
> User-Agent: curl/7.29.0
> Host: :50070
> Accept: */*
> 
< HTTP/1.1 200 OK
HTTP/1.1 200 OK
< Date: Sat, 05 Aug 2023 10:00:20 GMT
Date: Sat, 05 Aug 2023 10:00:20 GMT
< Cache-Control: no-cache
Cache-Control: no-cache
< Expires: Sat, 05 Aug 2023 10:00:20 GMT
Expires: Sat, 05 Aug 2023 10:00:20 GMT
< Date: Sat, 05 Aug 2023 10:00:20 GMT
Date: Sat, 05 Aug 2023 10:00:20 GMT
< Pragma: no-cache
Pragma: no-cache
< Content-Type: application/json; charset=utf8
Content-Type: application/json; charset=utf8
< X-FRAME-OPTIONS: SAMEORIGIN
X-FRAME-OPTIONS: SAMEORIGIN
< WWW-Authenticate: Negotiate 
WWW-Authenticate: Negotiate 
< Set-Cookie: hadoop.auth="u=&p=&t=kerberos&e=1691265620290&s="; Path=/; HttpOnly
Set-Cookie: hadoop.auth="u=&p=&t=kerberos&e=1691265620290&s="; Path=/; HttpOnly
< Access-Control-Allow-Methods: GET
Access-Control-Allow-Methods: GET
< Access-Control-Allow-Origin: *
Access-Control-Allow-Origin: *
< Content-Length: 542143
Content-Length: 542143

< 
* Closing connection 0

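How it works

When you run this curl command, it first connects to the given URL and sends an HTTP request with no authentication information.
If the target URL is protected by Kerberos authentication, the server responds with HTTP 401 (Unauthorized) and includes a WWW-Authenticate: Negotiate header. This tells the client to authenticate using the Negotiate mechanism.
On receiving this response, the client uses its Kerberos library to generate a SPNEGO token. The token contains the client's identity, a timestamp, a random number, and other data, and is protected with Kerberos encryption. (It is the long string after Authorization: Negotiate in the second request's headers.)

SPNEGO stands for Simple and Protected GSS-API Negotiation Mechanism. You can think of it as the mechanism Kerberos uses for authentication over HTTP.

If the server successfully validates the SPNEGO token, the client's Kerberos authentication has passed and the server returns HTTP 200. Subsequent communication between the client and server then continues in the authenticated state.
The server also sets a cookie: Set-Cookie: hadoop.auth="u=&p=&t=kerberos&e=1691265620290&s="

u is the Kerberos username
p is the Kerberos principal
t probably stands for type
e is presumably the expiry time, as a millisecond Unix timestamp
s stands for sign, a signature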
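The 401-then-retry exchange described above can be sketched in Go with only the standard library. This is a simulation under stated assumptions: the test server does not validate anything, the token is a placeholder rather than a real SPNEGO token, and the names negotiateHandler, fetchWithNegotiate, and demoHandshake are made up for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// negotiateHandler mimics a SPNEGO-protected endpoint: it answers 401 with
// "WWW-Authenticate: Negotiate" until the request carries a Negotiate token.
// It does NOT validate the token; a real server would hand it to Kerberos.
func negotiateHandler(w http.ResponseWriter, r *http.Request) {
	if !strings.HasPrefix(r.Header.Get("Authorization"), "Negotiate ") {
		w.Header().Set("WWW-Authenticate", "Negotiate")
		w.WriteHeader(http.StatusUnauthorized)
		return
	}
	w.WriteHeader(http.StatusOK)
}

// fetchWithNegotiate performs the two-step exchange and returns the status
// code of each response it received.
func fetchWithNegotiate(url, token string) ([]int, error) {
	resp, err := http.Get(url)
	if err != nil {
		return nil, err
	}
	resp.Body.Close()
	codes := []int{resp.StatusCode}
	challenged := resp.StatusCode == http.StatusUnauthorized &&
		strings.HasPrefix(resp.Header.Get("WWW-Authenticate"), "Negotiate")
	if !challenged {
		return codes, nil
	}
	// Second attempt: same request, now with the Authorization header.
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return codes, err
	}
	req.Header.Set("Authorization", "Negotiate "+token)
	resp, err = http.DefaultClient.Do(req)
	if err != nil {
		return codes, err
	}
	resp.Body.Close()
	return append(codes, resp.StatusCode), nil
}

// demoHandshake runs the exchange against a local test server.
func demoHandshake() ([]int, error) {
	srv := httptest.NewServer(http.HandlerFunc(negotiateHandler))
	defer srv.Close()
	return fetchWithNegotiate(srv.URL, "placeholder-token")
}

func main() {
	codes, err := demoHandshake()
	if err != nil {
		panic(err)
	}
	fmt.Println(codes) // one entry per response received
}
```

In a real client the placeholder token is replaced by the base64-encoded SPNEGO token produced by the Kerberos library, which is the only part that needs a ticket.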

Golang

Use https://github.com/jcmturner/gokrb5

I already use this library for Kerberos authentication in my forks https://github.com/meoww-bot/hadoop_exporter and https://github.com/meoww-bot/hadoop_jmx_exporter

Example

For details, see https://github.com/meoww-bot/hadoop_jmx_exporter/blob/master/lib/krb.go

Only the code for making a request after keytab authentication is listed here.

// Load the keytab
kt, err := keytab.Load(ktPath)
if err != nil {
    return nil, fmt.Errorf("failed to load keytab file: %v", err)
}

// Load the krb5 configuration file
krb5Conf, err := config.Load("/etc/krb5.conf")
if err != nil {
    return nil, fmt.Errorf("failed to load Kerberos config: %v", err)
}

// Extract the username and realm from the principal
username, realm := ExtractUsernameAndRealm(principal)

if username == "" {
    return nil, fmt.Errorf("failed to extract username and realm from principal")
}

cli := client.NewClientWithKeytab(username, realm, kt, krb5Conf)

// Log the client in to get an authenticated client
err = cli.Login()
if err != nil {
    return nil, fmt.Errorf("failed to login krb5 client: %v", err)
}

// Create a new request
r, err := http.NewRequest("GET", url, nil)
if err != nil {
    log.Errorf("could not create request: %v", err)
    return nil, fmt.Errorf("could not create request: %v", err)
}

// Extract the FQDN from the URL
fqdn, err := ExtractDomainFromURL(url)
if err != nil {
    log.Errorf("could not extract fqdn from url: %v", err)
    return nil, fmt.Errorf("could not extract fqdn from url: %v", err)
}

// Build the SPNEGO service principal
spn := fmt.Sprintf("HTTP/%s", fqdn)

// Create a SPNEGO client from the logged-in Kerberos client (cli)
spnegoCl := spnego.NewClient(cli, nil, spn)

// Send the request
resp, err := spnegoCl.Do(r)
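The excerpt above calls two helpers, ExtractUsernameAndRealm and ExtractDomainFromURL, whose bodies are not shown here. The sketch below is a guess at what such helpers could look like using only the standard library. The names splitPrincipal and hostFromURL are hypothetical stand-ins and are not taken from krb.go.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// splitPrincipal splits "user@REALM" at the last "@".
// It returns empty strings when either part is missing.
func splitPrincipal(principal string) (username, realm string) {
	i := strings.LastIndex(principal, "@")
	if i <= 0 || i == len(principal)-1 {
		return "", ""
	}
	return principal[:i], principal[i+1:]
}

// hostFromURL returns the host name of a URL without the port,
// which is the part needed to build the "HTTP/<fqdn>" service principal.
func hostFromURL(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	host := u.Hostname()
	if host == "" {
		return "", fmt.Errorf("no host in URL %q", rawURL)
	}
	return host, nil
}

func main() {
	user, realm := splitPrincipal("user@EXAMPLE.COM")
	host, err := hostFromURL("http://rm.example:8088/jmx")
	if err != nil {
		panic(err)
	}
	fmt.Println(user, realm, "HTTP/"+host)
}
```

Splitting at the last "@" keeps service-style principals such as HTTP/host@REALM intact in the username part.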

While writing https://github.com/meoww-bot/hadoop_jmx_exporter I ran into a pitfall, so I dug into the source and found that the request works the same way as with curl.

Source of spnegoCl.Do(r)

// Do is the SPNEGO enabled HTTP client's equivalent of the http.Client's Do method.
func (c *Client) Do(req *http.Request) (resp *http.Response, err error) {
	var body bytes.Buffer
	if req.Body != nil {
		// Use a tee reader to capture any body sent in case we have to replay it again
		teeR := io.TeeReader(req.Body, &body)
		teeRC := teeReadCloser{teeR, req.Body}
		req.Body = teeRC
	}
	resp, err = c.Client.Do(req)
	if err != nil {
		if ue, ok := err.(*url.Error); ok {
			if e, ok := ue.Err.(redirectErr); ok {
				// Picked up a redirect
				e.reqTarget.Header.Del(HTTPHeaderAuthRequest)
				c.reqs = append(c.reqs, e.reqTarget)
				if len(c.reqs) >= 10 {
					return resp, errors.New("stopped after 10 redirects")
				}
				if req.Body != nil {
					// Refresh the body reader so the body can be sent again
					e.reqTarget.Body = ioutil.NopCloser(&body)
				}
				return c.Do(e.reqTarget)
			}
		}
		return resp, err
	}
	if respUnauthorizedNegotiate(resp) {
		err := SetSPNEGOHeader(c.krb5Client, req, c.spn)
		if err != nil {
			return resp, err
		}
		if req.Body != nil {
			// Refresh the body reader so the body can be sent again
			req.Body = ioutil.NopCloser(&body)
		}
		return c.Do(req)
	}
	return resp, err
}

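The source shows that when the response is a 401 (if respUnauthorizedNegotiate(resp)), SetSPNEGOHeader(c.krb5Client, req, c.spn) sets the SPNEGO header and the method then calls itself again to request the target.

The token in the SPNEGO header is essentially an encrypted Service Ticket that carries the user's identity and their authorization for the service. In other words, even though you are already an authenticated user, the HTTP server does not know what your permissions are. You first have to obtain a Service Ticket from the TGS and present it to the HTTP server before it will let you in.

python

The former ops expert on our team collected YARN ResourceManager usage with a Python script. Because installing Python libraries on the intranet is a hassle, I stopped maintaining that version and switched to the Go version.

Before I first saw that code I was quite curious, since back then I thought Kerberos was very complicated.

It uses HTTPKerberosAuth from requests_kerberos

The simplified code is as follows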

from requests_kerberos import HTTPKerberosAuth
import requests
import os

keytabfile = "/path/to/user.keytab"
principal = "user@EXAMPLE.COM"

# Obtain a ticket from the keytab before making the request
shell_cmd = 'kinit -kt %s %s' % (keytabfile, principal)
os.system(shell_cmd)

# fqdn and active_nn_url are defined elsewhere in the full script
krb5auth = HTTPKerberosAuth(hostname_override=fqdn, principal=principal)

r = requests.get(active_nn_url, auth=krb5auth)

....

macOS mysteriously losing focus

2023-07-03T15:24:27.000Z

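Lately, for no apparent reason, the current window keeps losing focus even though I have not done anything. For example, halfway through typing I notice the text is no longer appearing, and only then see that the three dots in the top-left corner have turned gray. When I have headphones on I hear the "beep beep beep beep" alert sound. I have to click with the mouse to refocus, which is very annoying.

My system version: macOS 12.5
and I have not upgraded recently

A web search showed that I am not the only one with this problem, and someone has written a script to detect it (see reference 1)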

#!/usr/bin/python

from AppKit import NSWorkspace
import time
t = range(1,100)
for i in t:
    time.sleep(3)
    activeAppName = NSWorkspace.sharedWorkspace().activeApplication()['NSApplicationName']
    print(activeAppName)

But running it fails with ModuleNotFoundError: No module named 'AppKit'

even though AppKit is clearly installed

┌─[@MacBook-Pro] - [/usr/local/lib/python3.11/site-packages/appkit] - [Sat Jul 01, 22:54]
└─[$] <> ll
total 3232
-rwxr-xr-x  1   staff   620K Jul  1 22:48 _AppKit.cpython-311-darwin.so
-rw-r--r--  1   staff   5.0K Jul  1 22:48 __init__.py
drwxr-xr-x  7   staff   224B Jul  1 22:48 __pycache__
-rwxr-xr-x  1   staff   114K Jul  1 22:48 _inlines.cpython-311-darwin.so
-rw-r--r--  1   staff   851K Jul  1 22:48 _metadata.py
-rw-r--r--  1   staff   650B Jul  1 22:48 _nsapp.py
drwxr-xr-x  9   staff   288B Jul  1 02:36 api
-rw-r--r--  1   staff   3.6K Jul  1 02:36 app.py
-rw-r--r--  1   staff   4.8K Jul  1 02:36 app.pyc
-rw-r--r--  1   staff   1.3K Jul  1 02:36 test_app.py

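Project homepage: https://github.com/TinKurbatoff/appkit

Later I learned from reference 2 that this package is an impostor.

Solution:

Uninstall appkit and install pyobjc

For a python3 environment:

python3 -m pip uninstall appkit
python3 -m pip install --upgrade --force-reinstall PyObjC PyObjC-core

Then I ran the program above and found the culprit. It turned out to be

iShotHelper

This wretched program had been bothering me for a week!!!

References:

Solving "Client cannot authenticate via:[TOKEN, KERBEROS]"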

2023-03-17T15:24:27.000Z

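Problem

A colleague was deploying a test environment on an old server, found that Kerberos was not working, and asked me to take a look.

The current user has already run kinit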

$ klist
Ticket cache: FILE:/tmp/krb5cc_1059
Default principal: carpo@OSS5.COM

Valid starting     Expires            Service principal
03/17/23 17:29:32  03/16/24 17:29:32  krbtgt/OSS5.COM@OSS5.COM
renew until 03/14/33 17:29:32

Running hdfs dfs -ls / fails with

23/03/17 17:29:32 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
23/03/17 17:29:34 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
23/03/17 17:29:34 WARN ipc.Client: Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
23/03/17 17:29:34 WARN ipc.Client: Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
23/03/17 17:29:34 INFO retry.RetryInvocationHandler: java.io.IOException: DestHost:destPort :8020 , LocalHost:localPort /:0. Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS], while invoking ClientNamenodeProtocolTranslatorPB.getFileInfo over /:8020 after 1 failover attempts. Trying to failover after sleeping for 1160ms.
23/03/17 17:29:35 WARN ipc.Client: Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
23/03/17 17:29:35 INFO retry.RetryInvocationHandler: java.io.IOException: DestHost:destPort :8020 , LocalHost:localPort /:0. Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS], while invoking ClientNamenodeProtocolTranslatorPB.getFileInfo over /:8020 after 2 failover attempts. Trying to failover after sleeping for 2093ms.

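Troubleshooting

At first I thought it was a minor issue and checked the hosts file, but hosts was probably not the problem: if it were, the NameNode hostname could not be resolved to an IP at all.

Then I checked connectivity from this server to the NameNode. No problem there.

I had not run into this problem before. In the cases I had seen, an authentication failure came with a specific reason, such as clocks being out of sync with too large a skew. This one gave no reason.

Turn on DEBUG and take a look

export HADOOP_ROOT_LOGGER=DEBUG,console
export HADOOP_OPTS="-Dsun.security.krb5.debug=true -Djavax.net.debug=ssl"

Try printing the user information

hadoop org.apache.hadoop.security.UserGroupInformation

Output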

$ hadoop org.apache.hadoop.security.UserGroupInformation
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/3.1.5.0-152/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/3.1.5.0-152/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Getting UGI for current user
23/03/17 18:00:43 DEBUG security.SecurityUtil: Setting hadoop.security.token.service.use_ip to true
23/03/17 18:00:43 DEBUG util.Shell: setsid exited with exit code 0
Java config name: null
Native config name: /etc/krb5.conf
Loaded from native config
23/03/17 18:00:43 DEBUG security.Groups:  Creating new Groups object
23/03/17 18:00:43 DEBUG util.NativeCodeLoader: Trying to load the custom-built native-hadoop library...
23/03/17 18:00:43 DEBUG util.NativeCodeLoader: Failed to load native-hadoop with error: java.lang.UnsatisfiedLinkError: /usr/hdp/3.1.5.0-152/hadoop/lib/native/libhadoop.so: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /usr/hdp/3.1.5.0-152/hadoop/lib/native/libhadoop.so)
23/03/17 18:00:43 DEBUG util.NativeCodeLoader: java.library.path=:/usr/hdp/3.1.5.0-152/hadoop/lib/native/Linux-amd64-64:/usr/hdp/3.1.5.0-152/hadoop/lib/native/Linux-amd64-64:/usr/hdp/3.1.5.0-152/hadoop/lib/native
23/03/17 18:00:43 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
23/03/17 18:00:43 DEBUG util.PerformanceAdvisory: Falling back to shell based
23/03/17 18:00:43 DEBUG security.JniBasedUnixGroupsMappingWithFallback: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping
23/03/17 18:00:43 DEBUG security.Groups: Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback; cacheTimeout=7200000; warningDeltaMs=5000
Java config name: null
Native config name: /etc/krb5.conf
Loaded from native config
>>> KdcAccessibility: reset
>>> KdcAccessibility: reset
>>>KinitOptions cache name is /tmp/krb5cc_1059
>>>DEBUG   client principal is carpo@
>>>DEBUG  server principal is krbtgt/@
>>>DEBUG  key type: 18
>>>DEBUG  auth time: Fri Mar 17 17:52:12 CST 2023
>>>DEBUG  start time: Fri Mar 17 17:52:12 CST 2023
>>>DEBUG  end time: Sat Mar 16 17:52:12 CST 2024
>>>DEBUG  renew_till time: Mon Mar 14 17:52:12 CST 2033
>>> CCacheInputStream: readFlags()  FORWARDABLE; RENEWABLE; INITIAL; PRE_AUTH;
>>>DEBUG   client principal is carpo@
>>>DEBUG  server principal is X-CACHECONF:/krb5_ccache_conf_data/fast_avail/krbtgt/@@
>>>DEBUG  key type: 0
>>>DEBUG  auth time: Thu Jan 01 08:00:00 CST 1970
>>>DEBUG  start time: null
>>>DEBUG  end time: Thu Jan 01 08:00:00 CST 1970
>>>DEBUG  renew_till time: null
>>> CCacheInputStream: readFlags() 
>>> unsupported key type found the default TGT: 18
23/03/17 18:00:43 DEBUG security.UserGroupInformation: hadoop login
23/03/17 18:00:43 DEBUG security.UserGroupInformation: hadoop login commit
23/03/17 18:00:43 DEBUG security.UserGroupInformation: using local user:UnixPrincipal: carpo
23/03/17 18:00:43 DEBUG security.UserGroupInformation: Using user: "UnixPrincipal: carpo" with name carpo
23/03/17 18:00:43 DEBUG security.UserGroupInformation: User entry: "carpo"
23/03/17 18:00:43 DEBUG security.UserGroupInformation: UGI loginUser:carpo (auth:SIMPLE)
User: carpo
Group Ids: 
23/03/17 18:00:43 DEBUG security.Groups: GroupCacheLoader - load.
Groups: user 
UGI: carpo (auth:SIMPLE)
Auth method SIMPLE
Keytab false
============================================================

When authentication succeeds, the output looks like this (without DEBUG)

User: carpo@OSS5.COM
Group Ids: 
Groups: carpo 
UGI: carpo@OSS5.COM (auth:KERBEROS)
Auth method KERBEROS
Keytab false

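The run above failed. The reason for the failure:

unsupported key type found the default TGT: 18

This is because the Java on this server does not support encryption type 18, which is AES (specifically aes256-cts-hmac-sha1-96).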
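For reading such debug output, a small lookup table of Kerberos encryption type numbers can help. The Go sketch below lists only the types that appear in this post's krb5.conf and keytab output. The numbers follow the standard Kerberos enctype assignments, and the function name enctypeName is made up for illustration.

```go
package main

import "fmt"

// enctypeNames maps Kerberos encryption type numbers to the names used
// in krb5.conf and klist output. Only a few common types are listed.
var enctypeNames = map[int]string{
	1:  "des-cbc-crc",
	3:  "des-cbc-md5",
	16: "des3-cbc-sha1",
	17: "aes128-cts-hmac-sha1-96",
	18: "aes256-cts-hmac-sha1-96",
	23: "rc4-hmac",
}

// enctypeName returns the name for an enctype number, or a placeholder
// when the number is not in the table.
func enctypeName(id int) string {
	if name, ok := enctypeNames[id]; ok {
		return name
	}
	return fmt.Sprintf("unknown(%d)", id)
}

func main() {
	fmt.Println(enctypeName(18))
}
```

With this table, "key type: 18" in the debug log and the aes256-cts-hmac-sha1-96 entry in the keytab listing are the same thing.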

I searched around, and some people online suggest changing the default encryption types in /etc/krb5.conf

[libdefaults]
  default_tkt_enctypes = rc4-hmac aes256-cts aes128-cts des3-cbc-sha1 des-cbc-md5 des-cbc-crc
  default_tgs_enctypes = rc4-hmac aes256-cts aes128-cts des3-cbc-sha1 des-cbc-md5 des-cbc-crc
  permitted_enctypes = rc4-hmac aes256-cts aes128-cts des3-cbc-sha1 des-cbc-md5 des-cbc-crc

I tried this and it did not work

Possibly because our KDC only supports aes256-cts

# cat /var/kerberos/krb5kdc/kdc.conf
[kdcdefaults]
 kdc_ports = 88
 kdc_tcp_ports = 88
 restrict_anonymous_to_tgt = true

[realms]
 OSS3.COM = {
  master_key_type = aes256-cts
  max_life = 365d
  max_renewable_life = 3650d
  acl_file = /var/kerberos/krb5kdc/kadm5.acl
  dict_file = /usr/share/dict/words
  default_principal_flags = +preauth
;  admin_keytab = /var/kerberos/krb5kdc/kadm5.keytab
  pkinit_identity = FILE:/var/kerberos/krb5kdc/kdc.crt,/var/kerberos/krb5kdc/kdc.key
  pkinit_anchors = FILE:/var/kerberos/krb5kdc/kdc.crt
  pkinit_anchors = FILE:/var/kerberos/krb5kdc/cacert.pem
  pkinit_pool = FILE:/var/lib/ipa-client/pki/ca-bundle.pem
 }

It could also be because the keys in the keytab use aes256-cts

klist -kte carpo.keytab
Keytab name: FILE://carpo.keytab
KVNO Timestamp         Principal
---- ----------------- --------------------------------------------------------
   4 12/13/22 14:30:49 carpo@OSS5.COM (aes256-cts-hmac-sha1-96) 
   4 12/13/22 14:30:49 carpo@OSS5.COM (aes128-cts-hmac-sha1-96)

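I checked this server's OS

# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.5 (Santiago)

Could it be that the OS is too old to support encryption type 18? Our new cluster deployments all require CentOS 7.6.

I was in a hurry to leave work at the time, so I said, "It won't work, the OS is too old."

My team lead said, "Are you kidding? This server used to host the XX platform, so authentication definitely worked on it before."

Knowing that fired up my competitive streak.

It is like being told that there is definitely water 100 meters down: if there is no water yet, it is only because I have not dug to 100 meters.

It worked before and does not now (the configuration changed because of the cluster migration), so the OS version is not the cause.

"Everything is a file in Linux," so some configuration file must be the problem.

(If it annoyed me enough, I would just rsync every file over.)

Later, searching turned up that the JDK's lack of AES-256 support comes down to JCE.

Because of some US cryptography export regulations, strong encryption such as AES-256 was export-restricted.

You only need to download the JCE unlimited strength policy files and put them in $JAVA_HOME/jre/lib/security/

But when I looked in $JAVA_HOME/jre/lib/security/, the JCE files were already there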

# cd /usr/java/jre/lib/security/
# ll
total 172
-rw-r--r-- 1 root root   4054 Dec 12  2017 blacklist
-rw-r--r-- 1 root root   1273 Dec 12  2017 blacklisted.certs
-rw-r--r-- 1 root root 113484 Dec 12  2017 cacerts
-rw-r--r-- 1 root root   2466 Dec 12  2017 java.policy
-rw-r--r-- 1 root root  33404 Dec 12  2017 java.security
-rw-r--r-- 1 root root     98 Dec 12  2017 javaws.policy
-rw-r--r-- 1 root root   3527 Dec 12  2017 local_policy.jar
-rw-r--r-- 1 root root      0 Dec 12  2017 trusted.libraries
-rw-r--r-- 1 root root   3026 Dec 12  2017 US_export_policy.jar

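But the error message was very clear: the type is not supported.

I compared this directory with the same directory on another server, also Red Hat 6.5, where authentication works. The two JCE jar files on that server differ in size from the ones on this failing server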

$ cd /usr/java/jre/lib/security/
$ ll
total 180
-rw-r--r-- 1 root root   4054 Mar 17 18:16 blacklist
-rw-r--r-- 1 root root   1273 Mar 17 18:16 blacklisted.certs
-rw-r--r-- 1 root root 113484 Mar 17 18:16 cacerts
-rw-r--r-- 1 root root   2466 Mar 17 18:16 java.policy
-rw-r--r-- 1 root root  33404 Mar 17 18:16 java.security
-rw-r--r-- 1 root root     98 Mar 17 18:16 javaws.policy
-rw-r--r-- 1 root root   3035 Mar 17 18:16 local_policy.jar
-rw-r--r-- 1 root root   3527 Mar 17 18:16 local_policy.jar.20181029
-rw-r--r-- 1 root root      0 Mar 17 18:16 trusted.libraries
-rw-r--r-- 1 root root   3023 Mar 17 18:16 US_export_policy.jar
-rw-r--r-- 1 root root   3026 Mar 17 18:16 US_export_policy.jar.20181029

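After copying them over with scp, hdfs dfs -ls / authenticated successfully.

Summary

Sometimes the information online does not directly solve your exact problem. Sometimes you also need luck, your own understanding, and a little bit of 🤏 persistence.

Of course, if my team lead had not told me that it "definitely works", I would probably have given up with "the OS is too old to support it".

References

Using go-ora to connect to Oracle in xorm - importing a package at a specific commit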

2023-02-21T15:24:27.000Z

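I strongly recommend that Golang users connecting to Oracle use this library: https://github.com/sijms/go-ora

It is an Oracle driver written in pure Golang

which means you no longer need to install Instant Client!!!

My guess is that it was written by analyzing packet captures

The two libraries I used before for connecting to Oracle both had some inconveniences

github.com/godror/godror
- Requires installing Instant Client
- Because it depends on CGO, I cannot cross-compile Linux binaries on macOS; each time I git push and then pull and build on a Linux machine
- CentOS 6 is not supported because its GLIBC version is too old, and upgrading the server is risky
github.com/mattn/go-oci8
- Requires writing an OCI configuration file
- Requires installing Instant Client

Usage

Use the v2 version

go get github.com/sijms/go-ora/v2

Using it with xorm

The xorm website does not mention go-ora driver support, but the README of the Gitea repository says go-ora is supported

Support was added in a commit on February 4 and, as of February 21, 2023, has not yet been released

https://gitea.com/xorm/xorm/commit/0c9963c6379477764ab4adbab74195f92d3b89dc

Import the package by specifying the commit

go get xorm.io/xorm@0c9963c6379477764ab4adbab74195f92d3b89dc

Connecting

The connection string changes slightly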

godror

const driverName = "godror"

oralInfo = fmt.Sprintf("%s/%s@%s:%d/%s", user, password, host, port, dbname)

db, err := xorm.NewEngine(driverName, oralInfo)

go-ora

const driverName = "oracle"

oralInfo = fmt.Sprintf("oracle://%s:%s@%s:%d/%s", user, password, host, port, dbname)

db, err := xorm.NewEngine(driverName, oralInfo)
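One thing to watch with the URL-style connection string is a password containing characters such as @ or /. The sketch below builds the same oracle:// string with net/url so that those characters are percent-encoded. It assumes the driver parses the string as a standard URL, and the helper name buildOracleURL and the sample credentials are made up for illustration.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
	"strconv"
)

// buildOracleURL assembles oracle://user:password@host:port/service and
// percent-encodes special characters in the user name and password.
func buildOracleURL(user, password, host string, port int, service string) string {
	u := url.URL{
		Scheme: "oracle",
		User:   url.UserPassword(user, password),
		Host:   net.JoinHostPort(host, strconv.Itoa(port)),
		Path:   "/" + service,
	}
	return u.String()
}

func main() {
	// The password below contains "@" and "/", which would otherwise
	// be mistaken for URL delimiters.
	fmt.Println(buildOracleURL("scott", "p@ss/word", "db.example", 1521, "ORCL"))
}
```

The result can be passed to xorm.NewEngine in place of the fmt.Sprintf version when credentials are not plain alphanumerics.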

How I designed a simple CMDB for Prometheus

2023-02-10T15:24:27.000Z

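Background

In December 2022 we built a new HDP 3.1.5 Hadoop cluster. With that, the intranet had five Hadoop clusters and more than 2,000 servers.

Hooking that many servers up to Prometheus is no easy task, not to mention that we occasionally repurpose servers and expand clusters.

For example, we are migrating from the HDP2 cluster to the HDP3 cluster. To adapt the data-production programs to the new version, we have already expanded the 4G and 5G xDR ETL clusters and dual-write their HDFS output to both the HDP2 and HDP3 clusters.

Once the HDP3 migration is complete, the following changes can be expected:

The HDP2 cluster is decommissioned
4G xDR ETL outputs only to HDP3, and its machines are scaled in
5G xDR ETL outputs only to HDP3, with the machine count unchanged (a large workload that had not been onboarded will bring the machines right up to capacity once added)
Machines freed from the HDP2 cluster are added to the HDP3 cluster

...

The impact of these changes:

Hostname and DNS resolution updates
Cluster host membership changes
Adapting alert rules and the ETL cluster monitoring statistics

Planned requirements

I wanted to design a program that meets at least the following requirements

Provide service discovery for Prometheus (Prometheus supports http_sd)
Allow quick export of a hosts file (this requirement later evolved into a DNS server)

Research and selection

I searched online for a long time. For bare-metal server monitoring, people generally use Zabbix or ansible-awx.

Zabbix is not well suited to collecting business metrics
ansible-awx is troublesome to set up

In the end I decided to write it myself in Golang: test it on the external network, cross-compile it into a Linux binary, and run it on the intranet.

Development environment setup

The relational database on our intranet is Oracle. I had never connected to Oracle from Golang before, so I needed a development environment. Oracle can be started with Docker

docker run -d --name oracle --privileged -v $(pwd)/oradata:/u01/app/oracle -p 8080:8080 -p 1521:1521 absolutapps/oracle-12c-ee

The Go driver is https://github.com/godror/godror

Install the Oracle client libraries as described in its documentation and it connects

Loading the spreadsheet into the database

Our machine information currently lives in an Excel file. It has to be read out and written to an Oracle table.

The import program could be written in another language, but if it is also written in Go, the import program and the CMDB program can share structs. I happened to find an Excel-parsing library on GitHub that works out of the box.

Reference: https://github.com/douyacun/go-struct-excel

Import program: https://github.com/meoww-bot/read-excel-go-oracle

The wrong IP

A small detour here: my Navicat could connect to the Oracle database, but my Golang program could not.

After looking around, I noticed that Navicat used the domain name while the Golang program was configured with an IP address. The domain is configured on Cloudflare, but the IP was one I had looked up with dig +short

On closer inspection, it turned out that I had recently bought Surge and turned on Enhanced Mode. Surge creates a virtual network interface (Surge VIF) and sets it as the default route. All DNS requests then get a virtual address in the 198.18.0.0/15 range.

Source: https://www.v2ex.com/t/899087

HTTP service discovery endpoint

Reference
https://prometheus.io/docs/prometheus/latest/http_sd/

Format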

[
  {
    "targets": [ "<host>", ... ],
    "labels": {
      "<labelname>": "<labelvalue>", ...
    }
  },
  ...
]

Because I plan to add machine health status to labels, and health status is per machine, I can only put a single host in each targets list

Result

[
{

    "targets": [
        "xxxx001:9100"
    ],
    "labels": {
        "biz": "master",
        "ip": "10.110.1.1",
        "job": "ose",
        "status": "OK"
    }

},
{

    "targets": [
        "xxxx002:9100"
    ],
    "labels": {
        "biz": "master",
        "ip": "10.110.1.2",
        "job": "ose",
        "status": "OK"
    }

},
{

    "targets": [
        "xxxx003:9100"
    ],
    "labels": {
        "biz": "master",
        "ip": "10.110.1.3",
        "job": "ose",
        "status": "OK"
    }

},
{

    "targets": [
        "xxxx004:9100"
    ],
    "labels": {
        "biz": "master",
        "ip": "10.110.1.4",
        "job": "ose",
        "status": "OK"
    }

},
...
]
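A response in this shape can be produced with encoding/json alone. The sketch below is an illustration rather than the CMDB's actual code: the host struct, its field names, and renderSD are hypothetical, and the labels simply mirror the sample output above with one host per target group.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// targetGroup matches one element of the http_sd response.
type targetGroup struct {
	Targets []string          `json:"targets"`
	Labels  map[string]string `json:"labels"`
}

// host is a hypothetical inventory row.
type host struct {
	Hostname, IP, Biz, Job, Status string
}

// renderSD emits one target group per host so that per-machine labels
// such as status can differ between hosts.
func renderSD(hosts []host, port int) (string, error) {
	groups := make([]targetGroup, 0, len(hosts))
	for _, h := range hosts {
		groups = append(groups, targetGroup{
			Targets: []string{fmt.Sprintf("%s:%d", h.Hostname, port)},
			Labels: map[string]string{
				"biz": h.Biz, "ip": h.IP, "job": h.Job, "status": h.Status,
			},
		})
	}
	out, err := json.Marshal(groups)
	return string(out), err
}

func main() {
	out, err := renderSD([]host{
		{"xxxx001", "10.110.1.1", "master", "ose", "OK"},
	}, 9100)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

An HTTP handler would write this string with Content-Type application/json, which is what Prometheus expects from an http_sd endpoint.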

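The IP is exposed because external systems sometimes need it, for example

After a cloud-pool switch cutover, some machines might still have no network, and we need to know the exact IPs
When other vendors send us data and it is unevenly distributed, we also need to give them our machines' IPs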

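DNS server

consul

The original plan was to use consul for Prometheus service discovery, because

Prometheus supports consul
It is a single binary and easy to deploy
It comes with a DNS server
Host maintenance information can be configured ( https://developer.hashicorp.com/consul/api-docs/agent/service#enable-maintenance-mode )
It has built-in health checks, and the results carry labels that can be synced to Prometheus

After deploying it, I found that consul's DNS names use a specific format
https://developer.hashicorp.com/consul/docs/discovery/dns#node-lookups

<node>.node[.<datacenter>].<domain>

So I gave up on consul

Implementing it in Go

Next I considered using bind or dnsmasq.

Then it suddenly clicked: a DNS server is just a server program listening on UDP port 53 that responds to particular requests.

So I could write one myself

using the github.com/miekg/dns library

This library is also used by coredns and consul

Reference: https://jameshfisher.com/2017/08/04/golang-dns-server/

Note that the domain name ends with a dot (.)

So I only need to query the results from the database, assemble them, and put them into a map[string]string

Code excerpt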

func GetAllHosts() map[string]string {

	d := make(map[string]string)

	invArray, err := db.GetInventoryHosts("all")

	if err != nil {
		panic(err)
	}

	for _, inv := range invArray {

		if inv.Domain != "" {
			fqdn := inv.ShortHostname + "." + inv.Domain
			d[fqdn+"."] = inv.ServiceIp
		}
		d[inv.Hostname+"."] = inv.ServiceIp
		d[strings.ToLower(inv.ShortHostname)+"."] = inv.ServiceIp

	}

	return d

}

var dict = GetAllHosts()

func (h *Handler) ServeDNS(w dns.ResponseWriter, r *dns.Msg) {
	msg := dns.Msg{}
	msg.SetReply(r)

	switch r.Question[0].Qtype {
	case dns.TypeA:
		msg.Authoritative = true
		domain := msg.Question[0].Name
		address, ok := dict[domain]
		if ok {
			msg.Answer = append(msg.Answer, &dns.A{
				Hdr: dns.RR_Header{Name: domain, Rrtype: dns.TypeA, Class: dns.ClassINET, Ttl: 3600},
				A:   net.ParseIP(address),
			})
		}
	}
	w.WriteMsg(&msg)
}

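Reloading host information from the database into DNS

Changes to CMDB data need to be synced to the DNS server, and I cannot restart the cmdb-server program after every change.

Following Prometheus's design, I added a reload endpoint

POST handler

api.POST("/inventory/dns/reload", handler.InventoryDnsReload)

It only needs to fetch the full machine list again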

func InventoryDnsReload(c *gin.Context) {
	dict = GetAllHosts()
	log.Println("[DNS] reload DNS records success")
}
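The reload handler replaces the package-level map while DNS queries may be reading it from other goroutines. One possible way to make that swap safe is a small store guarded by sync.RWMutex. This is a sketch of an alternative, not what the CMDB does, and the recordStore type and its methods are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// recordStore holds the name-to-IP map behind a read/write lock so that
// lookups and reloads can run concurrently.
type recordStore struct {
	mu      sync.RWMutex
	records map[string]string
}

// Lookup returns the IP for a fully qualified name (with trailing dot).
func (s *recordStore) Lookup(name string) (string, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	ip, ok := s.records[name]
	return ip, ok
}

// Replace swaps in a freshly loaded map. Callers must not modify the
// map after handing it over.
func (s *recordStore) Replace(records map[string]string) {
	s.mu.Lock()
	s.records = records
	s.mu.Unlock()
}

func main() {
	var store recordStore
	store.Replace(map[string]string{"xxxx001.": "10.110.1.1"})
	ip, ok := store.Lookup("xxxx001.")
	fmt.Println(ip, ok)
}
```

In this arrangement the DNS handler would call Lookup and the reload endpoint would call Replace with the result of a fresh database query.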

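Requesting a reload with curl

curl -X POST -u 'user':'pass' "http://127.0.0.1:8000/api/inventory/dns/reload"

Reloading can be automated by having an Oracle trigger run an external script that issues the curl request

Load testing

Since the whole intranet may use this DNS server, its resolution capacity also needs to be load tested

Use dnsperf for load testing; reference: https://www.cnblogs.com/cobbliu/p/3872255.html

Load test results

Starting gin and the DNS server together

Put gin's router listener in a goroutine

go router.Run(":8000")

Let the DNS server's listener block the main program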

srv := &dns.Server{Addr: ":53", Net: "udp"}
srv.Handler = &handler.Handler{}

if err := srv.ListenAndServe(); err != nil {
    log.Fatalf("Failed to set udp listener %s\n", err.Error())
}

Hooking the CMDB up to Prometheus

Prometheus configuration

- job_name: 'canal_4gxdr_new'
  http_sd_configs:
    - url: http://:8000/api/inventory/sd?cluster=canal_4gxdr
      basic_auth:
          username: ...
          password: ...
  relabel_configs:
  - source_labels: [__address__]
    regex: "([^:]+):\\d+"
    target_label: instance

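Result screenshot

⚠️ Note:

relabel_configs strips the port number from instance so that Grafana panels look cleaner
Stripping the port does not affect scraping; Prometheus scrapes from the pre-relabel __address__
A job label in labels takes precedence over the job name in the configuration file: I set the job name to canal_4gxdr_new in the configuration file to distinguish it from the existing canal_4gxdr, but the targets still ended up in the old job and polluted the cluster group's metrics 😓. I had no choice but to remove the old canal_4gxdr.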
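To see what the relabel rule in the configuration does, the same regular expression can be tried in Go. Prometheus anchors relabel regexes at both ends and, with the default replacement, writes the first capture group to target_label. The sketch mimics that behavior; the function name instanceFor is made up, and returning the address unchanged on a non-match mirrors Prometheus falling back to __address__ for instance.

```go
package main

import (
	"fmt"
	"regexp"
)

// addrRe is the regex from relabel_configs, anchored the way Prometheus
// anchors relabel regexes.
var addrRe = regexp.MustCompile(`^(?:([^:]+):\d+)$`)

// instanceFor returns the value the instance label would end up with
// for a given __address__.
func instanceFor(address string) string {
	m := addrRe.FindStringSubmatch(address)
	if m == nil {
		// No match: the rule is skipped and instance defaults to the address.
		return address
	}
	return m[1]
}

func main() {
	fmt.Println(instanceFor("xxxx001:9100"))
}
```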

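I used to think I had to configure a separate service-discovery URL for each job. It now looks like a single one that returns everything is enough: labels contains job, and Prometheus groups the targets by it automatically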

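Later updates

Switching the Oracle driver

See the separate post "Using go-ora to connect to Oracle in xorm - importing a package at a specific commit"

Exporting DNS server metrics

I want to know how many DNS lookups the DNS server has served

Again following the Prometheus + node_exporter design

Add a metrics endpoint to gin

router.GET("/metrics", gin.WrapH(promhttp.Handler()))

handler/dns.go

Define the metric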

var (
	dns_request_total = promauto.NewCounter(
		prometheus.CounterOpts{
			Name: "dns_request_total",
			Help: "The total number of processed dns requests",
		},
	)
)

Increment the counter while handling a lookup
Only lookups that find a record are counted, so Inc goes right after if ok

func (h *Handler) ServeDNS(w dns.ResponseWriter, r *dns.Msg) {
	msg := dns.Msg{}
	msg.SetReply(r)

	switch r.Question[0].Qtype {
	case dns.TypeA:
		msg.Authoritative = true
		domain := msg.Question[0].Name
		address, ok := dict[domain]
		if ok {
			// New line: count the lookup once a record has been found
			dns_request_total.Inc()
			msg.Answer = append(msg.Answer, &dns.A{
				Hdr: dns.RR_Header{Name: domain, Rrtype: dns.TypeA, Class: dns.ClassINET, Ttl: 3600},
				A:   net.ParseIP(address),
			})
		}
	}
	w.WriteMsg(&msg)
}

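This can be made more fine-grained if necessary

Total lookup requests received: dns_request_total

Lookup requests answered successfully: dns_answered_request_total

Reverse DNS records (PTR)

Sometimes another system gives me an IP and I want to know its hostname. Do I really have to use ping,

or something as crude as grep 'ip' /etc/hosts?

Is there something more elegant? Now that we have DNS, can it be done with a DNS query?

Yes.

The answer is the PTR record.

A/AAAA records map a domain name to an IP address
PTR records do the opposite and map an IP address to a domain name

For the definition and implementation of PTR records, see RFC 1035

Cloudflare's introduction to PTR records: https://www.cloudflare.com/learning/dns/dns-records/dns-ptr-record/

Lookup process

Here is a brief summary of the lookup process.

Say we want to look up the PTR record for the IP 1.2.3.4.

The client actually queries the PTR record of the name 4.3.2.1.in-addr.arpa.

That is the queried IP address with its octets in reverse order and .in-addr.arpa. appended, because PTR records are stored under the .arpa top-level domain.

.arpa is a domain used mainly for managing network infrastructure, and was the first top-level domain defined for the Internet.

ARPA stands for Advanced Research Projects Agency, an agency of the US Department of Defense.

The word ARPA may be unfamiliar, but what if NET is added after it? Does ARPANET ring a bell?

ARPANET was the predecessor of the Internet, established by ARPA in 1969 as a network for military connectivity.

in-addr.arpa is a namespace within .arpa used for reverse DNS lookups in IPv4.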
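The octet reversal described above is easy to write out with the standard library. The sketch below handles IPv4 only and is meant to illustrate the naming rule. The function name reverseIPv4 is made up for this example.

```go
package main

import (
	"fmt"
	"net"
)

// reverseIPv4 converts an IPv4 address into the name that is queried
// for its PTR record, for example 1.2.3.4 -> 4.3.2.1.in-addr.arpa.
func reverseIPv4(addr string) (string, error) {
	ip := net.ParseIP(addr).To4()
	if ip == nil {
		return "", fmt.Errorf("not an IPv4 address: %q", addr)
	}
	// Octets are written in reverse order, followed by the suffix.
	return fmt.Sprintf("%d.%d.%d.%d.in-addr.arpa.", ip[3], ip[2], ip[1], ip[0]), nil
}

func main() {
	name, err := reverseIPv4("1.2.3.4")
	if err != nil {
		panic(err)
	}
	fmt.Println(name)
}
```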

In practice

Query a PTR record with dig, for example

dig -x 106.10.150.171

Response

; <<>> DiG 9.10.6 <<>> -x 106.10.150.171
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 2929
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 6, ADDITIONAL: 7

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;171.150.10.106.in-addr.arpa.   IN      PTR

;; ANSWER SECTION:
171.150.10.106.in-addr.arpa. 300 IN     PTR     unknown.yahoo.com.

;; AUTHORITY SECTION:
in-addr.arpa.           285     IN      NS      b.in-addr-servers.arpa.
in-addr.arpa.           285     IN      NS      f.in-addr-servers.arpa.
in-addr.arpa.           285     IN      NS      a.in-addr-servers.arpa.
in-addr.arpa.           285     IN      NS      c.in-addr-servers.arpa.
in-addr.arpa.           285     IN      NS      d.in-addr-servers.arpa.
in-addr.arpa.           285     IN      NS      e.in-addr-servers.arpa.

;; ADDITIONAL SECTION:
a.in-addr-servers.arpa. 285     IN      A       199.180.182.53
b.in-addr-servers.arpa. 285     IN      A       199.253.183.183
c.in-addr-servers.arpa. 285     IN      A       196.216.169.10
d.in-addr-servers.arpa. 285     IN      A       200.10.60.53
e.in-addr-servers.arpa. 285     IN      A       203.119.86.101
f.in-addr-servers.arpa. 285     IN      A       193.0.9.1

;; Query time: 2481 msec
;; SERVER: 198.18.0.2#53(198.18.0.2)
;; WHEN: Sun Mar 05 15:46:54 CST 2023
;; MSG SIZE  rcvd: 295

Here is our request

;; QUESTION SECTION:
;171.150.10.106.in-addr.arpa. IN PTR

and the result

;; ANSWER SECTION:
171.150.10.106.in-addr.arpa. 300 IN PTR unknown.yahoo.com.

Implementation

We again need to build a PTR mapping, that is, a map from .in-addr.arpa. names to domain names.

The dns.ReverseAddr() function converts an IP address to .in-addr.arpa. format

func GetAllHostsPTR() map[string]string {

	d := make(map[string]string)

	invArray, err := db.GetInventoryHosts("all")

	if err != nil {
		panic(err)
	}

	for _, inv := range invArray {

		ptr_address, _ := dns.ReverseAddr(inv.ServiceIp)

		d[ptr_address] = inv.Hostname + "."

	}

	return d

}

Add a new case to the DNS request handler

var ptr_dict = GetAllHostsPTR()

func (h *Handler) ServeDNS(w dns.ResponseWriter, r *dns.Msg) {
	msg := dns.Msg{}
	msg.SetReply(r)

	switch r.Question[0].Qtype {
	case dns.TypeA:
		msg.Authoritative = true
		domain := msg.Question[0].Name
		address, ok := dict[domain]
		if ok {
			dns_request_total.Inc()
			msg.Answer = append(msg.Answer, &dns.A{
				Hdr: dns.RR_Header{
					Name:   domain,
					Rrtype: dns.TypeA,
					Class:  dns.ClassINET,
					Ttl:    3600},
				A: net.ParseIP(address),
			})
		}
	case dns.TypePTR:
		msg.Authoritative = true
		ptr_address := msg.Question[0].Name
		// Look up the hostname stored for this reverse name
		domain, ok := ptr_dict[ptr_address]
		if ok {
			dns_request_total.Inc()
			msg.Answer = append(msg.Answer, &dns.PTR{
				Hdr: dns.RR_Header{
					Name:   ptr_address,
					Rrtype: dns.TypePTR,
					Class:  dns.ClassINET,
					Ttl:    3600},
				Ptr: domain,
			})
		}
	}
	w.WriteMsg(&msg)
}

Result

dig @127.0.0.1 -p5300 -x 1.2.3.4

; <<>> DiG 9.10.6 <<>> @127.0.0.1 -p5300 -x 1.2.3.4
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 18540
;; flags: qr aa rd; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available

;; QUESTION SECTION:
;4.3.2.1.in-addr.arpa.    IN      PTR

;; ANSWER SECTION:
4.3.2.1.in-addr.arpa. 3600 IN     PTR     google.com.

;; Query time: 0 msec
;; SERVER: 127.0.0.1#5300(127.0.0.1)
;; WHEN: Thu Mar 02 09:53:28 CST 2023
;; MSG SIZE  rcvd: 148

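Notes:

The returned domain name must end with a dot, otherwise dig shows no result because it treats the answer as invalid. For example, baidu.com does not work; it has to be baidu.com. with the trailing dot.
If the domain name is malformed, for example .baidu.com, dig reports "Got bad packet".

Adding multiple PTR records

Later I wondered whether I could return both the FQDN and the short hostname. That is a one-to-many relationship: one IP maps to several domain names.

I noticed that answers are added with

msg.Answer = append(...)

so adding another result is just one more append.

But how do I build the one-to-many map?

I came up with two approaches:

1. Convert the .in-addr.arpa. name in the DNS request back to an IP, then look up the domain names in allhosts.
2. Generate the .in-addr.arpa. name for every IP in advance. Because the relationship is one-to-many, use the domain name as the key and the PTR name as the value in the dict (map[string]string).

I went with approach 2.

The updated map-building code: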

func GetAllHostsPTR() map[string]string {

d := make(map[string]string)

invArray, err := db.GetInventoryHosts("all")

if err != nil {
panic(err)
}

for _, inv := range invArray {

ptr_address, _ := dns.ReverseAddr(inv.ServiceIp)

if inv.Domain != "" {
fqdn := inv.ShortHostname + "." + inv.Domain
d[fqdn+"."] = ptr_address
}
d[inv.Hostname+"."] = ptr_address
d[strings.ToLower(inv.ShortHostname)+"."] = ptr_address
}

return d

}

DNS request handler

var ptr_dict = GetAllHostsPTR()

func (h *Handler) ServeDNS(w dns.ResponseWriter, r *dns.Msg) {
msg := dns.Msg{}
msg.SetReply(r)

switch r.Question[0].Qtype {
case dns.TypeA:
msg.Authoritative = true
domain := msg.Question[0].Name
address, ok := dict[domain]
if ok {
dns_request_total.Inc()
msg.Answer = append(msg.Answer, &dns.A{
Hdr: dns.RR_Header{
Name:   domain,
Rrtype: dns.TypeA,
Class:  dns.ClassINET,
Ttl:    3600},
A: net.ParseIP(address),
})
}

case dns.TypePTR:
msg.Authoritative = true
ptr_address := msg.Question[0].Name
for k, v := range ptr_dict {

if v == ptr_address {
dns_request_total.Inc()
msg.Answer = append(msg.Answer, &dns.PTR{
Hdr: dns.RR_Header{
Name:   ptr_address,
Rrtype: dns.TypePTR,
Class:  dns.ClassINET,
Ttl:    3600},
Ptr: k,
})
}

}

}
w.WriteMsg(&msg)
}

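While debugging I found that the ptr_dict map holds 5340 entries.

Query result

Even though every PTR query iterates over all 5340 entries, there was no measurable impact on performance:

Query time: 0 msec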
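If the map grows much larger, the full scan can be avoided by grouping names under their PTR name up front. The following is only a sketch of that alternative, not what the post's server does; the host type and buildPTRIndex are hypothetical names.

```go
package main

import "fmt"

// host is a hypothetical, trimmed-down inventory record.
type host struct {
	PTRName string   // e.g. "4.3.2.1.in-addr.arpa."
	Names   []string // domain names, each ending with a dot
}

// buildPTRIndex groups every domain name under its PTR name, so that a
// PTR query becomes a single map lookup instead of a scan over all entries.
func buildPTRIndex(hosts []host) map[string][]string {
	index := make(map[string][]string)
	for _, h := range hosts {
		index[h.PTRName] = append(index[h.PTRName], h.Names...)
	}
	return index
}

func main() {
	index := buildPTRIndex([]host{
		{PTRName: "4.3.2.1.in-addr.arpa.", Names: []string{"web01.example.com.", "web01."}},
	})
	// One lookup yields every name to append to msg.Answer.
	for _, name := range index["4.3.2.1.in-addr.arpa."] {
		fmt.Println(name)
	}
}
```

With 5340 entries the scan is clearly fast enough, so this only becomes worthwhile for a much larger inventory or a much higher query rate.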

distcp stuck at build file listing

2022-12-21T02:24:27.000Z

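Background

We recently built another new cluster (OSE), running HDP 3.1.5.0.

So it was time to migrate data again.

Because OSD cannot be restarted, I configured a one-way trust from OSD to OSE:

I modified the auth_to_local configuration on OSE and restarted the HDFS service.
This lets OSD principals that access OSE be mapped to the right OS users by the auth_to_local rules.
I also verified that an OSD domain user can access HDFS files on OSE.

But when we started copying data from OSD to OSE, distcp hung.

There were no errors at all. At first I suspected a problem renewing the HDFS_DELEGATION_TOKEN, but adding the NN IP addresses of both clusters to -Dmapreduce.job.hdfs-servers.token-renewal.exclude= did not help.

Clearly my skills were not up to it, so I turned to Google.

Search keywords: "distcp stuck Build file listing completed"

Investigation

The first Google result was "Apache Hadoop Distributed Copy – DistCp Guide".

That is the Apache Hadoop distcp tutorial, and I just wanted to solve the problem directly. The next two results described problems similar to mine and pointed me in the right direction.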

Reference 1:
http://people.apache.org/~liuml07/2017/07/05/DistCp-gets-stuck-with-build-listing/

Reference 2:
https://community.cloudera.com/t5/Support-Questions/Distcp-got-stuck-with-the-below-and-doesn-t-do-anything/m-p/292259

Here is a brief account of how I solved the problem.

My environment:

The client configuration currently in use is OSE's.
kinit is done with an OSD kerberos user.

I had already confirmed that the one-way kerberos trust was configured correctly and in effect.

Testing the step from reference 1:

hadoop fs -cp hdfs:///tmp/testfile /tmp/testfile

This proves that the HDFS service works.

According to Arun66's account in reference 2:
We tried to run a sample MR job to test, then it failed with the following exception

That reminded me of the most basic MR job from the textbooks, wordcount.

See the Apache Hadoop site: https://hadoop.apache.org/docs/stable/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html#Example:_WordCount_v1.0

WordCount.java source:

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  public static class TokenizerMapper
       extends Mapper<Object, Text, Text, IntWritable>{

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context
                    ) throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  public static class IntSumReducer
       extends Reducer<Text,IntWritable,Text,IntWritable> {
    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values,
                       Context context
                       ) throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Put the code into a file named WordCount.java.

Set the environment variables:

export JAVA_HOME=/usr/java/default
export PATH=${JAVA_HOME}/bin:${PATH}
export HADOOP_CLASSPATH=${JAVA_HOME}/lib/tools.jar

If JAVA_HOME is already set in your environment, skip the first line.

Compile:

$ hadoop com.sun.tools.javac.Main WordCount.java
$ jar cf wc.jar WordCount*.class

HADOOP_CLASSPATH is exported above so that hadoop can find com.sun.tools.javac.Main.

/user/joe/wordcount/input - input directory in HDFS
/user/joe/wordcount/output - output directory in HDFS

Prepare the input files:

vi file01
Hello World Bye World

vi file02
Hello Hadoop Goodbye Hadoop

hdfs dfs -put file* /user/joe/wordcount/input

Run wordcount:

$ hadoop jar wc.jar WordCount /user/joe/wordcount/input /user/joe/wordcount/output

And sure enough, it failed:

Error: java.io.IOException: initialization of all the collectors failed. Error in last collector was:java.io.IOException: Invalid "mapreduce.task.io.sort.mb":3276.

Some keywords are replaced with [REDACTED].

Prometheus-based alerting and monitoring for private offline servers

2022-10-10T00:00:34.000Z

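Background

On April 20, 2022, the disk of the server hosting our monitoring platform, zabbix, failed, and zabbix finally met its end.

The server has 12 data disks. For historical reasons (laziness, really), each data disk was configured as a single-disk raid0, and the zabbix data lived on one of them.

But we always had too much on our plate, and work on the monitoring platform, which produces no direct business value, kept getting postponed.

It dragged on until August (impressive procrastination), when we happened to discover that the data disks on some servers were already full. Because zabbix was down, no alert had fired, and the data on those servers was fairly important. Burned again.

Rebuilding the monitoring system had become urgent, so I finally started on it.

Monitoring selection: Zabbix vs Prometheus

Should we keep the old Zabbix system or deploy the currently popular Prometheus?

I read up on both; here is the comparison:

Zabbix

Configuration is done in the web UI.
Uses an external database for storage.
Alerting and auto-discovery are all in one.
Integrations are available for clickhouse and hadoop.
Reading disk IO requires dedicated configuration, whereas with prometheus a single node_exporter collects whatever you need.

Prometheus

Configured through configuration files. Downside: writing config files is tedious. Upside: bulk operations, fast editing, easy to back up and migrate.
Built-in tsdb, lightweight.
Many exporters are available, which makes exposing metrics easy (for example kafka_exporter, since zabbix apparently cannot monitor kafka, or clickhouse_exporter; more exporters at https://prometheus.io/docs/instrumenting/exporters/ ).
PromQL can be used in grafana for metric calculations.
Metric alerts can be configured in grafana or in configuration files.
Alerting uses the alertmanager component; service discovery can read from files or use consul.
Custom metrics can be configured.

Conclusion

The existing Zabbix no longer meets our monitoring needs.
Our business is complex and uses a variety of databases, so we need a modern monitoring stack.
We need to monitor not only Linux systems but also business processes, databases, and middleware.
Prometheus's node_exporter can collect our custom metrics through the textfile module.
Alert rules can be written on grafana panels, which is intuitive.
Drawback: grafana alerting does not support multi-dimensional data.
That is, after one series in a graph has triggered an alert, grafana does not treat another series reaching the threshold as a new alert.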

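Monitoring topology

10.110.38.1 is the former zabbix server. Servers are scarce in this project, so only this one machine was allocated for monitoring.

Specs:

cpu: 32 cores
ram: 128 G
disk:
- / ssd 986G
- /data01 raid5 28T
- /data02 raid5 28T

10.101.235.6 is a server with Internet access. It runs the webhook program, which receives alerts and calls a script to send SMS messages.

Components

node_exporter

Startup script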

CentOS 7 - /etc/systemd/system/node_exporter.service

[Unit]
Description=Node Exporter
After=network.target


[Service]
User=nodeusr
Group=nodeusr
Type=simple
ExecStart=/usr/local/bin/node_exporter --collector.textfile.directory=/var/lib/node_exporter/textfile_collector --collector.systemd --collector.systemd.unit-include=(sshd|supervisor).service --collector.filesystem.ignored-mount-points=^/(sys|proc|dev|run|boot|host|etc)($|/) --collector.filesystem.fs-types-exclude=^(autofs|binfmt_misc|cgroup|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|mqueue|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|sysfs|tracefs|rootfs)$ --no-collector.hwmon --no-collector.nfsd

[Install]
WantedBy=multi-user.target

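Notes:

--collector.textfile.directory=/var/lib/node_exporter/textfile_collector sets the directory from which custom metrics are collected; the textfile collector is enabled by default.
--collector.systemd enables systemd monitoring. Unlike the above, the systemd collector is disabled by default, so it has to be enabled explicitly.
--collector.systemd.unit-include=(sshd|supervisor).service restricts which systemd services are collected; without it there would be more than 700 series, which wastes storage.
--collector.filesystem.ignored-mount-points=^/(sys|proc|dev|run|boot|host|etc)($|/) skips mount points that are not worth collecting.
--collector.filesystem.fs-types-exclude=^(autofs|binfmt_misc|cgroup|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|mqueue|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|sysfs|tracefs|rootfs)$ excludes some filesystem types.
--no-collector.hwmon disables the hwmon collector. hwmon covers server hardware monitoring: power, chip, sensor, temp. Hardware issues are handled by the vendor, so we do not need to manage them; on cloud servers you can consider disabling it as well.
--no-collector.nfsd disables the nfsd collector.

If there are other collectors you want to disable, or you want to see which collectors are enabled by default, see:
https://github.com/prometheus/node_exporter/blob/master/README.md#collectors

CentOS 6 - node_exporter.rhel6.service

There are only a few such machines, so the collectors have not been tuned yet.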

#!/bin/bash
#
# /etc/rc.d/init.d/node_exporter
#
#  Prometheus node exporter
#
#  description: Prometheus node exporter
#  processname: node_exporter

# Source function library.
. /etc/rc.d/init.d/functions

PROGNAME=node_exporter
PROG=/usr/local/bin/$PROGNAME
USER=nodeusr
LOGFILE=/var/log/node_exporter.log
LOCKFILE=/var/run/$PROGNAME.pid

start() {
    echo -n "Starting $PROGNAME: "
    cd /usr/local/bin/
    daemon --user $USER --pidfile="$LOCKFILE" "$PROG &>$LOGFILE &"
    echo $(pidofproc $PROGNAME) >$LOCKFILE
    echo
}

stop() {
    echo -n "Shutting down $PROGNAME: "
    killproc $PROGNAME
    rm -f $LOCKFILE
    echo
}


case "$1" in
    start)
    start
    ;;
    stop)
    stop
    ;;
    status)
    status $PROGNAME
    ;;
    restart)
    stop
    start
    ;;
    reload)
    echo "Sending SIGHUP to $PROGNAME"
    kill -SIGHUP $(pidofproc $PROGNAME)
    ;;
    *)
        echo "Usage: service node_exporter {start|stop|status|reload|restart}"
        exit 1
    ;;
esac

Install script

One-line install

bash <(curl -s http://10.110.38.1/files/install_node_exporter.sh)

It also supports centos 6, since some of our old machines still run that version.

#!/bin/bash

if [ "$USER" != "root" ];then
        echo "You must use root user to run..."
        exit 1
else
        echo "USER: ROOT"
fi


echo "INFO - check node_exporter if installed already"

service node_exporter status >> /dev/null

if [ $? -eq 0 ];then
  echo "Already installed, exit ..."
  exit 1
else
  echo "NOT install .. install now"
fi

command_exists() {
command -v "$@" > /dev/null 2>&1
}

# check os version
# learn from https://get.docker.com
get_distribution() {

  if command_exists lsb_release; then
          dist_version="$(lsb_release --release | cut -f2)"
  fi
  if [ -z "$dist_version" ] && [ -r /etc/os-release ]; then
          dist_version="$(. /etc/os-release && echo "$VERSION_ID")"
  fi
  echo "$dist_version"
}



do_install() {
  dist_version=$( get_distribution )

  useradd -rs /bin/false nodeusr
  mkdir -p /var/lib/node_exporter/textfile_collector
  chmod -R 777 /var/lib/node_exporter/
  yum install -y -q wget

  echo "install node_exporter ..."
  wget -q http://10.110.38.1/files/node_exporter
  mv node_exporter /usr/local/bin/
  chmod +x /usr/local/bin/node_exporter

  echo "install node_exporter directory-size.sh"
  wget -q http://10.110.38.1/files/directory-size.sh
  chmod +x directory-size.sh
  mv directory-size.sh /usr/local/bin/

  echo "System dist version: $dist_version"

  dist_version=$( echo "$dist_version" | cut -d'.' -f1)

  echo "install node_exporter service ..."
  if [ "$dist_version" == "7" ]; then

    wget -q http://10.110.38.1/files/node_exporter.service
    mv node_exporter.service /etc/systemd/system/

    systemctl daemon-reload
    systemctl start node_exporter
    systemctl enable node_exporter

  elif [ "$dist_version" == "6" ]; then

    wget -q http://10.110.38.1/files/node_exporter.rhel6.service
    mv node_exporter.rhel6.service /etc/init.d/node_exporter
    chmod +x /etc/init.d/node_exporter

    touch /var/log/node_exporter.log
    chmod 777 /var/log/node_exporter.log

    /etc/init.d/node_exporter start
    # auto start
    # chkconfig --add node_exporter
  
  else

    echo "Unsupported dist version: $dist_version"
    exit 1
  
  fi

}


do_install

kafka_exporter

Our kafka version is fairly old: 0.10.1

We use https://github.com/danielqsj/kafka_exporter

After trying several releases, I found that 1.1.0 supports our kafka.

It is mainly used to monitor kafka consumer lag.

/etc/systemd/system/kafka_exporter.service

[Unit]
Description=kafka_exporter
After=local-fs.target network-online.target network.target
Wants=local-fs.target network-online.target network.target

[Service]
ExecStart=/opt/kafka_exporter/kafka_exporter --kafka.server=x.x.x.x:6667
Restart=on-failure

[Install]
WantedBy=multi-user.target

grafana dashboard: https://grafana.com/grafana/dashboards/7589-kafka-exporter-overview/

prometheus (no authentication configured)

10.110.38.1:9090

Scrapes the metrics exposed by the exporters.

Alert rules

See https://awesome-prometheus-alerts.grep.to/rules for reference.

Alert priority

severity: critical (high) -> send an SMS
severity: warning (low) -> show on the karma dashboard

Some of the alert rules we use

HostOutOfDiskSpace: the server disk is almost full

- alert: HostOutOfDiskSpace
  expr: (node_filesystem_avail_bytes * 100) / node_filesystem_size_bytes < 10 and ON (instance, device, mountpoint) node_filesystem_readonly == 0
  for: 10m
  labels:
    severity: critical
  annotations:
    summary: Host out of disk space (instance {{ $labels.instance }})
    description: "Disk is almost full (< 10% left)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

PrometheusTargetMissing: a host has disappeared (it may be down, or the network may be interrupted)

- alert: PrometheusTargetMissing
  expr: up == 0
  for: 5m
  labels:
    severity: critical
  annotations:
    summary: Prometheus target missing (instance {{ $labels.instance }})
    description: "A Prometheus target has disappeared. An exporter might be crashed.\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

Here is a business-related rule as another example.

The /data/input/ directory exceeds 500G

- alert: OutputErrorSizeAbove500G
  expr: node_directory_size_bytes{directory="/data/input/",job="canal_xxxx"}/1024/1024/1024 > 500
  for: 10m
  labels:
    team: node
    severity: critical
  annotations:
    summary: "Canal xxxx  Output Error Size High"
    description: '{{$labels.instance}}: Error Size: {{ $value | printf "%.2f" }}'

printf "%.2f" formats $value with 2 decimal places.

Notes

Check that the configuration files are valid before restarting:

config
promtool check config /etc/prometheus/prometheus.yml
rules
promtool check rules /etc/prometheus/rules/*.yml

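Consul Agent (no authentication configured)

10.110.38.1:8500

Service registry that provides service discovery for prometheus, so that we do not have to keep editing the prometheus configuration file and restarting the service to pick up new hosts.

This component is not in use yet.

Alertmanager (no authentication configured)

http://10.110.38.1:9093/

The alerting component that ships with Prometheus.
Mainly used to receive the alerts sent by prom.
Supports a rich set of notification channels.
Deduplicates alerts, reduces noise, and groups them.

Karma - Alertmanager UI (no authentication)

GitHub: https://github.com/prymitive/karma

Alertmanager ships with its own UI for viewing alerts and managing silences, but it lacks some features a dashboard needs, such as alert history. karma helps fill that gap in Alertmanager's visualization.

Its predecessor is cloudflare/unsee.

Configuration file:
karma.yaml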

alertmanager:
  interval: 60s
  servers:
    - name: local
      uri: http://:9093
      timeout: 10s
      proxy: true
      readonly: false
annotations:
  default:
    hidden: false
  hidden:
    - help
  visible: []
debug: false
karma:
  name: karma-prod
listen:
  address: "0.0.0.0"
  port: 8080
  prefix: /
log:
  config: false
  level: info
ui:
  refresh: 30s
  hideFiltersWhenIdle: true
  colorTitlebar: true
  minimalGroupWidth: 420
  alertsPerGroup: 5
  collapseGroups: collapsedOnMobile

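Start

nohup karma --config.file /etc/prometheus/karma.yaml &

grafana (authentication enabled)

10.110.38.1:3000

Visualizes the metrics collected by prometheus; it can also handle some alert rules.

Version: v8.2.7

grafana 9 seems to have reworked alerting and renamed things in a confusing way. I had no time to learn the new system, so I rolled back to 8.

But not every 8.x release uses legacy alerting: the last few 8.x releases, such as 8.5.x, also use the rewritten alerting. In the end I downgraded to 8.2.7.

Limitations

Alert queries often return multiple series. Grafana's aggregation functions and threshold checks evaluate every series, but Grafana currently does not track the alert rule state per series, which affects the alerting result. For example:

The alert query returns 2 series: server1 and server2.
The server1 series triggers the alert rule and the rule switches to the alerting state.
A notification is sent, for example: load has peaked (server1).
In a later evaluation of the same rule, the server2 series also triggers the alert.
No new notification is sent, because the rule is already in the alerting state.

As this scenario shows, once a rule is in the alerting state, Grafana sends no notification when other series also meet the alert condition. Grafana has plans to support multi-series queries and to track the state of each series in a future release, so for now this is a limitation of Grafana alerting.

In practice I also found that the alerts grafana sends to alertmanager mix firing and resolved, which makes them awkward for the webhook script to extract and consolidate.

Notes

A grafana dashboard can be cloned with save as.
A dashboard json model can be imported directly (back up dashboards by backing up the json).

webhook and handler script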

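Used as the notification channel for alertmanager.

alertmanager's built-in notification channels cannot call a script (there is only webhook), and the maintainers do not want to add that ( https://github.com/prometheus/alertmanager/issues/2046 ), so the only option is to call a script through a webhook.

You can write one yourself with python-flask, or use an existing one:
https://github.com/adnanh/webhook
Configuration guide:
https://github.com/prometheus/alertmanager/issues/2046#issuecomment-535072123

Or https://github.com/imgix/prometheus-am-executor

I ended up using https://github.com/adnanh/webhook

Configuration

hooks.yaml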

- id: alertmanager
  execute-command: "/home//alertmanager/sms.sh"
  command-working-directory: "/home//alertmanager/"

  pass-arguments-to-command:
  - source: payload
    name: status
  - source: payload
    name: alerts

sms.sh

#!/bin/bash

status=$1
alerts=$2

if [ "$status" == "firing" ];then
status="!!警告!!"
else
status="xx恢复xx"
fi

alertname=`echo $alerts | jq '.[0]["labels"]["alertname"]'`

title=`echo "[$status] $alertname"`
content=`echo $alerts | jq '.[]["annotations"]["description"]'`

sms=`echo -e "${title}\n${content}"`

phone='136xxxxxxxx'

curl -v -X POST http://ip:8768/sms/cmppSender -F phoneNumbers="$phone" -F "smContent=$sms"

Start webhook

./webhook -hooks hooks.yaml -verbose

What the received SMS looks like

[!!警告!!] "PrometheusTargetMissing"
"A Prometheus target has disappeared. An exporter might be crashed.
 VALUE = 0
 LABELS = map[__name__:up instance:sbi193:9100 job:canal_xxxxx]"

The recovery SMS

[xx恢复xx] "PrometheusTargetMissing"
"A Prometheus target has disappeared. An exporter might be crashed.
 VALUE = 0
 LABELS = map[__name__:up instance:sbi193:9100 job:canal_xxxx]"

Besides alerting, you can also write recovery scripts that run when an alert fires (self-healing).
Example: https://www.modb.pro/db/194943

The body of the HTTP POST request that alertmanager sends to the webhook

{
  "version": "4",
  "groupKey": <string>,    // key identifying the group of alerts (e.g. to deduplicate)
  "status": "<resolved|firing>",
  "receiver": <string>,
  "groupLabels": <object>,
  "commonLabels": <object>,
  "commonAnnotations": <object>,
  "externalURL": <string>,  // backlink to the Alertmanager.
  "alerts": [
    {
      "labels": <object>,
      "annotations": <object>,
      "startsAt": "<rfc3339>",
      "endsAt": "<rfc3339>"
    }
  ]
}

🌰

{
  "version": "4",
  "groupKey": ,    // key identifying the group of alerts (e.g. to deduplicate)
  "status": "",
  "receiver": "web.hook",
  "groupLabels": {"alertname":"OutputErrorSize"}, // corresponds to group_by in am
  "commonLabels": ,
  "commonAnnotations":  {"description":"sbi163:9100: Output Error is above 500G (current value is: 1129.2498016357422)","summary":"sbi163:9100: Canal Worker Output Error High"},
  "externalURL": ,  // backlink to the Alertmanager.
  "alerts": [
{
  "annotations": {
    "summary": "/data/input/error/ size alert"
  },
  "endsAt": "0001-01-01T00:00:00Z",
  "fingerprint": "ac7d1a1af47b8b92",
  "generatorURL": "http://10.110.38.1:3000/d/WKkYjbiVk/canal_xxxxx?tab=alert&viewPanel=3&orgId=1",
  "labels": {
    "__name__": "node_directory_size_bytes",
    "alertname": "/data/input/error/ size alert",
    "directory": "/data/input/error/",
    "instance": "sbi139:9100",
    "job": "canal_xxxxx"
  },
  "startsAt": "2022-08-20T18:20:30Z",
  "status": "firing"
},
{
  "annotations": {
    "summary": "/data/input/error/ size alert"
  },
  "endsAt": "0001-01-01T00:00:00Z",
  "fingerprint": "2b0fb28362ff0526",
  "generatorURL": "http://10.110.38.1:3000/d/WKkYjbiVk/canal_xxxxx?tab=alert&viewPanel=3&orgId=1",
  "labels": {
    "__name__": "node_directory_size_bytes",
    "alertname": "/data/input/error/ size alert",
    "directory": "/data/input/error/",
    "instance": "sbi140:9100",
    "job": "canal_xxxxx"
  },
  "startsAt": "2022-08-20T18:20:30Z",
  "status": "firing"
}]}
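For anyone who would rather handle this payload in Go than with jq, here is a minimal sketch that decodes the body and builds the same kind of SMS text as sms.sh. The struct and function names are hypothetical, only the fields used in this post are modelled, and the sample body in main is made up for the demo.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// alert models only the per-alert fields used here; others are ignored.
type alert struct {
	Status      string            `json:"status"`
	Labels      map[string]string `json:"labels"`
	Annotations map[string]string `json:"annotations"`
}

// payload models the top-level webhook body.
type payload struct {
	Status string  `json:"status"`
	Alerts []alert `json:"alerts"`
}

// smsText builds "[status] alertname" from the first alert, followed by
// one description line per alert, mirroring what sms.sh extracts.
func smsText(body []byte) (string, error) {
	var p payload
	if err := json.Unmarshal(body, &p); err != nil {
		return "", err
	}
	if len(p.Alerts) == 0 {
		return "", fmt.Errorf("payload contains no alerts")
	}
	text := fmt.Sprintf("[%s] %s", p.Status, p.Alerts[0].Labels["alertname"])
	for _, a := range p.Alerts {
		text += "\n" + a.Annotations["description"]
	}
	return text, nil
}

func main() {
	// Made-up sample body, reduced to the fields read above.
	body := []byte(`{"status":"firing","alerts":[{"status":"firing",
		"labels":{"alertname":"PrometheusTargetMissing"},
		"annotations":{"description":"A Prometheus target has disappeared."}}]}`)
	text, err := smsText(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(text)
}
```

Because each alert carries its own status field, a handler like this could also split firing and resolved alerts before sending, which is exactly the part that is awkward in the shell script.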

Monitoring configuration

Directory size monitoring

exporter.sh

#!/bin/bash

LockFile="/var/tmp/exporter.lock"

if [ -f $LockFile ];then
    echo "Compare time:"
    Time=`date +%s`
    LogTime=`stat -c %Y $LockFile`
    if [ $[$Time - $LogTime ] -lt 900 ];then
    echo "Another process is running."
    exit 1;
    else
    rm -f $LockFile;
    kill -15 `pgrep exporter.sh`;
    exit 1;
    fi
fi

touch $LockFile;

# make sure sub-dir put in front
/usr/local/bin/directory-size.sh /data/input/ /data/output/ > /tmp/metrics.prom.$$ && mv /tmp/metrics.prom.$$ /var/lib/node_exporter/textfile_collector/metrics.prom

rm -f $LockFile;
kill -15 `pgrep exporter.sh`;
exit 0;

Then set up a scheduled job for exporter.sh that runs every 10 minutes.

directory-size.sh comes from the official repository https://github.com/prometheus-community/node-exporter-textfile-collector-scripts/blob/master/directory-size.sh

#!/bin/sh
#
# Expose directory usage metrics, passed as an argument.
#
# Usage: add this to crontab:
#
# */5 * * * * prometheus directory-size.sh /var/lib/prometheus | sponge /var/lib/node_exporter/directory_size.prom
#
# sed pattern taken from https://www.robustperception.io/monitoring-directory-sizes-with-the-textfile-collector/
#
# Author: Antoine Beaupré 
echo "# HELP node_directory_size_bytes Disk space used by some directories"
echo "# TYPE node_directory_size_bytes gauge"
du --block-size=1 --summarize "$@" \
  | sed -ne 's/\\/\\\\/;s/"/\\"/g;s/^\([0-9]\+\)\t\(.*\)$/node_directory_size_bytes{directory="\2"} \1/p'

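The generated file has to be written atomically. The crontab example above uses sponge for that, a command from the moreutils package.

Our intranet systems do not have that package and I could not be bothered to install it, so I use this instead:

/usr/local/bin/directory-size.sh /data/input/ /data/output/ > /tmp/metrics.prom.$$ && mv /tmp/metrics.prom.$$ /var/lib/node_exporter/textfile_collector/metrics.prom

For details see https://www.modb.pro/db/150234

Custom metrics

Custom metrics must follow the format in which exporters expose metrics, and they also need comment lines. The comments carry the metric type, which prometheus needs to know when scraping.

# HELP metric_name description
# TYPE metric_name metric_type
metric_name{label1="1", label2="2"} value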

I wrote one myself as an example:
canal-size.sh

#!/bin/sh
#
# Author: Meow-bot  
LOG_PATH=/app/canal_xxx/log/
#DATA_MODEL=xxxx
day_id=`date +%Y%m%d`
#
#echo "# HELP canal_output_current_day_size_bytes canal output file size only current day"
#echo "# TYPE canal_output_current_day_size_bytes gauge"
#
#cat ${LOG_PATH}/info.log  | grep "${DATA_MODEL}_${day_id}" | awk -F '[:,]' 'BEGIN{sum=0}{sum+=$12}END{print "canal_output_current_day_size_bytes " sum}'
#
#echo "# HELP canal_output_size_bytes canal output file size"
#echo "# TYPE canal_output_size_bytes gauge"
#
#cat ${LOG_PATH}/info.log  | grep "${DATA_MODEL}_" | awk -F '[:,]' 'BEGIN{sum=0}{sum+=$12}END{print "canal_output_size_bytes " sum}'

echo "# HELP canal_error_log_size_bytes canal error log file size"
echo "# TYPE canal_error_log_size_bytes gauge"

ls -l ${LOG_PATH}/error.log  | awk  '{print "canal_error_log_size_bytes " $5}'

echo "# HELP canal_input_size_bytes canal input file size"
echo "# TYPE canal_input_size_bytes gauge"

du -s --block-size=1 /data/input/bak/${day_id} | awk '{print "canal_input_size_bytes " $1}'

exporter.sh

1	/usr/local/bin/canal-size.sh > /tmp/canal.prom.$$ && mv /tmp/canal.prom.$$ /var/lib/node_exporter/textfile_collector/canal.prom

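Process monitoring

If the process is a system service, see https://medium.com/kartbites/process-level-monitoring-and-alerting-in-prometheus-915ed7508058

Process monitoring by reading /proc:
https://github.com/ncabatoff/process-exporter

A hand-written script that pushes to pushgateway:
https://devconnected.com/monitoring-linux-processes-using-prometheus-and-grafana/

I use the first approach: enable the systemd collector in the node_exporter.service start parameters. It is simple to configure, provided the process you monitor is a system service. If it is not, turning it into a service is easy too.

fsimage monitoring

Monitor fsimage:
https://github.com/marcelmay/hadoop-hdfs-fsimage-exporter

TODO: future plans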

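prometheus storage configuration needs tuning.
We currently have 200+ servers onboarded, and from 8.25 to 10.10 storage has grown to 229G. We have 1500+ servers in total; the disk can hold all of them, but queries may become slower, which is something to watch.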
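A rough back-of-the-envelope estimate, assuming storage grows linearly with the number of servers and taking 8.25 to 10.10 as 46 days; both are my simplifications, and dailyGrowthGB is a hypothetical helper.

```go
package main

import "fmt"

// dailyGrowthGB scales an observed storage growth linearly to another fleet size.
func dailyGrowthGB(usedGB float64, days, servers, targetServers int) float64 {
	perServerPerDay := usedGB / float64(days) / float64(servers)
	return perServerPerDay * float64(targetServers)
}

func main() {
	// 229G used by 200 servers over 46 days, scaled up to 1500 servers.
	perDay := dailyGrowthGB(229, 46, 200, 1500)
	fmt.Printf("estimated growth: %.1f GB/day, %.1f TB/year\n", perDay, perDay*365/1024)
}
```

Comparing the yearly figure with the 28T raid5 volumes gives a feel for how long data can be kept before retention has to be limited.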
hadoop yarn queue resource monitoring
Via jmx, to be decided
https://github.com/prometheus/jmx_exporter

[x] 3. Integrate Ambari metrics
Version requirement: Grafana 4.5.x - 5.x.x, which we do not meet
github: https://github.com/prajwalrao/ambari-metrics-grafana
We could consider adapting it ourselves.

Other references

Validating alert rules https://blog.cloudflare.com/monitoring-our-monitoring/

clickhouse capacity estimation framework
https://translation.meow.page/post/clickhouse-capacity-estimation-framework/
(Original: https://blog.cloudflare.com/clickhouse-capacity-estimation-framework/ )

Follow-up to reverse engineering a certain endpoint security assistant

2022-06-19T23:24:27.000Z

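⚠️ Disclaimer:

This article is for technical research only. Do not use it for illegal purposes. No files are provided for download.
This article is for technical research only. Do not use it for illegal purposes. No files are provided for download.
This article is for technical research only. Do not use it for illegal purposes. No files are provided for download.

Preface

Picking up where we left off: I wrote a program in golang that can replace the office endpoint security assistant for connecting to the office network.

Logging in still requires an account and a verification code that changes daily (once logged in, you stay connected as long as you do not log out). But because golang can cross-compile, the program is no longer limited to Windows. It also sends only the login requests that are strictly necessary and skips the policy check, which makes it leaner and cleaner (and makes the network less secure 😆).

For the past few months I have been looking for a router suitable for running openwrt.

Starting in June I moved to a different contracting company, one that does not even provide a computer. My Mac could not join the office network, and I did not want to burden the poor machine with a virtual machine, so this pushed me to finally get the router-based login working.

After months of browsing Xianyu, it finally recommended a router that is small yet powerful and good-looking too, and I bought it there for the princely sum of 160 yuan.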

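Router and specs

Model: GL-SFT1200
Ports: Type-C power x1, WAN x1, LAN x2, USB 2.0 x1
CPU: SF19A28, Dual-Core @ 1GHz
(a Chinese-made chip, mips architecture)
Memory: DDR3 128 MB / Nand Flash 128 MB
(after the router's own operating system, about 80M remain; the compiled program is 9.7M, so there is plenty of room)
Ethernet: Gigabit
Wireless: 2.4GHz 300 Mbps / 5GHz 867 Mbps

One step up from the GL-SFT1200 is the GL-MT1300, priced at 379 yuan.
Its CPU is the MediaTek MT7621A, Dual-Core @ 880 MHz.
Its 2.4GHz wireless rate is 100Mbps higher than the SFT1200's, which makes no practical difference.

Definitely not worth 379 yuan.

Building the program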

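At first I built for the mips architecture:

CGO_ENABLED=0 GOOS=linux GOARCH=mips go build main.go

It would not run on the router.

Then I switched to mipsle:

CGO_ENABLED=0 GOOS=linux GOARCH=mipsle go build main.go

This time it ran. GOARCH=mips targets big-endian MIPS and mipsle targets little-endian, so the router evidently runs little-endian.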

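Upload the program to the router with scp:

scp xxxx-client root@192.168.80.1:/root/

Edit the configuration file, fill in the account, password, and additional code, then run the program:

chmod +x xxxx-client
./xxxx-client

Press Enter.

Joining the office network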

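The moment I pressed Enter, WeChat notifications started popping up. My computer was online, and I knew it had worked.

Tunneling into the intranet

Once logged in, I started working on tunneling: I wanted to reach this network from the outside.

I tried my usual tool, KSA (kanxue security access). KSA has authentication, offers reasonable security, and is easy to use.

But neither the mips nor the mipsel build of ksa would run on the router.

The SF19A28 is a mips chip too, but it may have been heavily customized, which I find rather annoying.

Fortunately there is frp, which is also written in golang and ships mipsle binaries. I tried it and it runs. I had previously bought a server in China with 6m of bandwidth, which is enough for tunneling, but I did not feel like configuring it, and I was worried that the router's RAM might not cope.

Running KSA on a Raspberry Pi

In the end I attached a Raspberry Pi to the router and ran the armhf build of ksa on it, which works perfectly.

My Mac connects through ksa back to the office intranet, and the connection is stable. The only downside is that the extra box is a bit of an eyesore.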

Copying between secure hadoop clusters

2022-04-25T00:00:34.000Z

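In the previous section, OSD gained one-way trusted access to OSC. We only set up one-way access because OSD is our main production cluster and cannot easily be restarted, whereas OSC is less critical.

OSD –> OSC

Although the trust is one-way, OSC can in effect still access OSD.

Normally, OSD credentials can access OSC.
On the OSC cluster's gateway server (whose client environment is configured for OSC), set up a dedicated user that runs kinit with an OSD keytab, which indirectly lets OSC access OSD.

With both hdfs instances reachable, they can copy to each other.

Copy command

See the official documentation:

https://hadoop.apache.org/docs/stable/hadoop-distcp/DistCp.html#Update_and_Overwrite

Example command: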

/bin/hadoop distcp -Dmapreduce.job.hdfs-servers.token-renewal.exclude=, -Dipc.client.fallback-to-simple-auth-allowed=true -Dmapreduce.job.queuename=default -m 140 -pb -skipcrccheck  -update -filters /data/filters.txt /path/to/file hdfs://:8020/path/to/file

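Parameters:

-Dmapreduce.job.queuename=default
Sets the resource queue (mapred.job.queue.name=default is the older, deprecated name of the same property).

-Dipc.client.fallback-to-simple-auth-allowed=true
Needs to be set because kerberos authentication is in use.

-m 140
The number of maps to launch, i.e. the maximum number of simultaneous copies.

-pb
Preserve the block size.

-skipcrccheck
Skip the crc check, which makes the copy faster.

-update
The copy mode is update: the target is overwritten only when its size, blocksize, or checksum differs from the source.

-filters
Exclude files by regular expression. The argument is the path of a file that contains the regular expressions.
For example, to exclude .tmp files:

.*\.tmp

Several lines can be written to exclude files by multiple rules. For details see

https://cloudera.ericlin.me/2016/01/how-to-use-filters-to-exclude-files-when-in-distcp/

Common errors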

Failed to renew token: Kind: HDFS_DELEGATION_TOKEN

Failed to renew token: Kind: HDFS_DELEGATION_TOKEN

Reference:

https://community.cloudera.com/t5/Support-Questions/Problem-when-Distcp-between-two-HA-Cluster/td-p/216463

-Dmapreduce.job.hdfs-servers.token-renewal.exclude=

Add the IP addresses of the nn of both clusters to this property, separated by commas.

This tells the RM on both clusters to skip delegation token renewal for the listed NN nodes.

Caused by: java.io.IOException: Couldn’t run retriable-command

Caused by: java.io.IOException: Couldn't run retriable-command: Copying hdfs:///20220331/13/_2022033113_20220331130307_.dat to hdfs://:8020/apps/5G_N1N2/20220331/30/_2022033113_20220331130307_.dat
at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:101)
at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:296)
... 10 more
Caused by: java.io.IOException: Check-sum mismatch between hdfs:///20220331/13/_2022033113_20220331130307_.dat and hdfs://:8020//20220331/30/.distcp.tmp.attempt_1642595741320_786418_m_000088_2. Source and target differ in block-size. Use -pb to preserve block-sizes during copy. Alternatively, skip checksum-checks altogether, using -skipCrc. (NOTE: By skipping checksums, one runs the risk of masking data-corruption during file-transfer.)

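The previous copy may have been interrupted, leaving the target file with a checksum that differs from the source.

Use -pb to preserve the block size during the copy.

References: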

https://community.cloudera.com/t5/Support-Questions/not-able-to-distcp-from-insecure-cluster-to-secure-cluster/td-p/202187

https://docs.cloudera.com/cdp-private-cloud-base/7.1.4/replication-manager/topics/rm-dc-kerbs-distcp-secure-clstrs-wout-xrealm-auth.html

One-way cross-realm Kerberos trust between Hadoop clusters

2022-04-20T13:00:34.000Z

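Update: in August 2022 I finally got to put this into practice. This time I configured the trust from cluster D to cluster F myself, following the steps below, and it works!

Our production environment has several Hadoop clusters, all secured with Kerberos. One requirement was to sync hdfs data from one cluster (D) to another (C), which requires a trust relationship. Because cluster D is the more important one, we wanted to restart C rather than D, so we set up only a one-way trust: grant D access on cluster C, so that only C needs a restart.

D –> C

Cluster C realm: OSS2.COM ( HDP-3.1.5.0 )
Cluster D realm: OSS3.COM ( HDP-2.6.4.0 )

The process is recorded below.

Preparation

On every node of clusters C and D, add the hosts of the other cluster to /etc/hosts.
On the hdfs client machines, add the realm configuration of both clusters to /etc/krb5.conf.

Configuration

Add the cross-realm krbtgt trust

ssh to the ipa hosts of clusters C and D:

kinit admin

kadmin.local -q 'addprinc -pw 12345678 krbtgt/OSS2.COM@OSS3.COM' -x ipa-setup-override-restrictions

The part after @ is the current realm; here it is cluster D's realm.

In other words: add cluster C's principal to cluster D.

For a two-way trust, also add 3@2 on both sides:

kadmin.local -q 'addprinc -pw 12345678 krbtgt/OSS3.COM@OSS2.COM' -x ipa-setup-override-restrictions

Configure the auth_to_local rules

In the Ambari UI:

HDFS - ADVANCED core-site - hadoop.security.auth_to_local

This corresponds to core-site.xml.

After changing it in Ambari and restarting, the configuration is pushed to all hdfs nodes (datanode).

Add the following after DEFAULT: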

RULE:[1:$1@$0](.*@OSS3.COM)s/@.*//
RULE:[2:$1@$0](dn@OSS3.COM)s/.*/hdfs/
RULE:[2:$1@$0](nm@OSS3.COM)s/.*/yarn/
RULE:[2:$1@$0](nn@OSS3.COM)s/.*/hdfs/
RULE:[2:$1@$0](rm@OSS3.COM)s/.*/yarn/
RULE:[2:$1@$0](yarn@OSS3.COM)s/.*/yarn/

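That is how I watched the expert team configure it at the time; the experts would not explain what any of it means.

The RULE syntax is indeed puzzling at first glance. The configuration description in the Ambari UI reads: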

The mapping from kerberos principal names to local OS mapreduce.job.user.names.
  So the default rule is just "DEFAULT" which takes all principals in your default domain to their first component.
  "omalley@APACHE.ORG" and "omalley/admin@APACHE.ORG" to "omalley", if your default domain is APACHE.ORG.
The translations rules have 3 sections:
      base     filter    substitution
The base consists of a number that represents the number of components in the principal name excluding the realm and the pattern for building the name from the sections of the principal name. The base uses $0 to mean the realm, $1 to mean the first component and $2 to mean the second component.

[1:$1@$0] translates "omalley@APACHE.ORG" to "omalley@APACHE.ORG"
[2:$1] translates "omalley/admin@APACHE.ORG" to "omalley"
[2:$1%$2] translates "omalley/admin@APACHE.ORG" to "omalley%admin"

The filter is a regex in parens that must the generated string for the rule to apply.

"(.*%admin)" will take any string that ends in "%admin"
"(.*@ACME.COM)" will take any string that ends in "@ACME.COM"

Finally, the substitution is a sed rule to translate a regex into a fixed string.

"s/@ACME\.COM//" removes the first instance of "@ACME.COM".
"s/@[A-Z]*\.COM//" removes the first instance of "@" followed by a name followed by ".COM".
"s/X/Y/g" replaces all of the "X" in the name with "Y"

So, if your default realm was APACHE.ORG, but you also wanted to take all principals from ACME.COM that had a single component "joe@ACME.COM", you'd do:

RULE:[1:$1@$0](.*@ACME.ORG)s/@.*//
DEFAULT

To also translate the names with a second component, you'd make the rules:

RULE:[1:$1@$0](.*@ACME.ORG)s/@.*//
RULE:[2:$1@$0](.*@ACME.ORG)s/@.*//
DEFAULT

If you want to treat all principals from APACHE.ORG with /admin as "admin", your rules would look like:

RULE:[2:$1%$2@$0](.*%admin@APACHE.ORG)s/.*/admin/
DEFAULT

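In other words, with my notes:

This setting maps Kerberos principal names to local operating system user names (written as mapreduce.job.user.names in the description).
The default rule is just "DEFAULT": if the principal's realm is your default realm (the default_realm configured in /etc/krb5.conf), the rule takes the first component of the principal name as the OS user name.
If your default realm is APACHE.ORG, both "omalley@APACHE.ORG" and "omalley/admin@APACHE.ORG" map to "omalley".
A translation rule has three sections:
base filter substitution
The base consists of a number and a pattern. The number is how many components the principal name has, not counting the realm; the pattern builds a name from those components. $0 is the realm, $1 is the first component, and $2 is the second component.

The format is:

[number:pattern]

If the number is 1, there is one string before the @.
If the number is 2, there are two strings before the @, separated by /.

[1:$1@$0] turns "omalley@APACHE.ORG" into "omalley@APACHE.ORG"
[2:$1] turns "omalley/admin@APACHE.ORG" into "omalley"
[2:$1%$2] turns "omalley/admin@APACHE.ORG" into "omalley%admin"
The filter is a regular expression in parentheses. It is matched against the generated string.
"(.*%admin)" matches any string that ends in "%admin"
"(.*@ACME.COM)" matches any string that ends in "@ACME.COM"

. matches any single character
* matches zero or more occurrences of the preceding character

The substitution is a sed rule that replaces whatever the regular expression matches with a fixed string.
"s/@ACME\.COM//" removes the first match of "@ACME.COM".
"s/@[A-Z]*\.COM//" removes the first match of "@" followed by an uppercase name followed by ".COM".
"s/X/Y/g" replaces every "X" with "Y".
If your default realm is APACHE.ORG but you also want to match all single-component principals from the ACME.COM realm, such as "joe@ACME.COM", write the rules like this:
RULE:[1:$1@$0](.*@ACME.ORG)s/@.*//
DEFAULT
To also map principal names that have a second component, add a second rule:
RULE:[1:$1@$0](.*@ACME.ORG)s/@.*//
RULE:[2:$1@$0](.*@ACME.ORG)s/@.*//
DEFAULT
If you want every principal from the APACHE.ORG realm with /admin to be treated as "admin", write:
RULE:[2:$1%$2@$0](.*%admin@APACHE.ORG)s/.*/admin/
DEFAULT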

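The description is not very clear. https://web.mit.edu/kerberos/krb5-latest/doc/admin/conf_files/krb5_conf.html#realms explains it better.

Overall, the format is:

[n:string](regexp)s/pattern/replacement/g

n: how many components the principal name must have
string: the output format the original principal is turned into
regexp: a regular expression that filters the resulting string
s/pattern/replacement/g: a substitution applied to the output

You can write multiple rules. Once a principal matches a rule, the remaining rules are skipped.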
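
To make the [n:string](regexp)s/pattern/replacement/ format concrete, here is a minimal shell sketch that mimics how a rule is applied. It is only an illustration, not Hadoop's implementation: apply_rule and map_principal are hypothetical helper names, only one- and two-component principals are handled, and the rules are three of the ones used in this post.

```shell
# apply_rule N FORMAT FILTER SUBST PRINCIPAL
# Prints the mapped name and returns 0 if the rule applies; returns 1 otherwise.
apply_rule() {
    ncomp=$1 format=$2 filter=$3 subst=$4 principal=$5

    realm=${principal##*@}     # $0 in the rule
    name=${principal%@*}       # everything before the realm
    first=${name%%/*}          # $1
    second=
    count=1
    case $name in
        */*) second=${name#*/}; count=2 ;;   # $2 (two components at most in this sketch)
    esac

    # n: the principal must have exactly this many components
    [ "$count" -eq "$ncomp" ] || return 1

    # string: build the base by substituting $0, $1 and $2
    base=$(printf '%s' "$format" |
        sed -e 's|\$0|'"$realm"'|g' -e 's|\$1|'"$first"'|g' -e 's|\$2|'"$second"'|g')

    # regexp: the filter has to match the whole generated string
    printf '%s\n' "$base" | grep -Eqx -- "$filter" || return 1

    # s/pattern/replacement/: rewrite the generated string
    printf '%s\n' "$base" | sed -E "$subst"
}

# Try the rules in order; the first rule that applies wins.
map_principal() {
    apply_rule 1 '$1@$0' '.*@OSS3.COM' 's/@.*//'    "$1" && return 0
    apply_rule 2 '$1@$0' 'dn@OSS3.COM' 's/.*/hdfs/' "$1" && return 0
    apply_rule 2 '$1@$0' 'nm@OSS3.COM' 's/.*/yarn/' "$1" && return 0
    return 1
}
```

Calling map_principal aaa@OSS3.COM goes through the first rule, while map_principal dn/host2@OSS3.COM skips it (two components) and is handled by the dn rule. For real checks, the hadoop org.apache.hadoop.security.HadoopKerberosName command remains the authority.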

Now back to our configuration. DEFAULT handles principals in the default realm (OSS2.COM), so we put the cross-realm rules after DEFAULT:

RULE:[1:$1@$0](.*@OSS3.COM)s/@.*//
RULE:[2:$1@$0](dn@OSS3.COM)s/.*/hdfs/
RULE:[2:$1@$0](nm@OSS3.COM)s/.*/yarn/
RULE:[2:$1@$0](nn@OSS3.COM)s/.*/hdfs/
RULE:[2:$1@$0](rm@OSS3.COM)s/.*/yarn/
RULE:[2:$1@$0](yarn@OSS3.COM)s/.*/yarn/

Going through it in detail:

Line 1

RULE:[1:$1@$0](.*@OSS3.COM)s/@.*//

This maps aaa@OSS3.COM to aaa.

We can test it with hadoop org.apache.hadoop.security.HadoopKerberosName:

$ hadoop org.apache.hadoop.security.HadoopKerberosName aaa@OSS3.COM
Name: aaa@OSS3.COM to aaa

Line 2

RULE:[2:$1@$0](dn@OSS3.COM)s/.*/hdfs/

When the principal has two components and "first component@realm" comes out as dn@OSS3.COM, it is mapped to hdfs.
Test:

$ hadoop org.apache.hadoop.security.HadoopKerberosName dn/host2@OSS3.COM
Name: dn/host2@OSS3.COM to hdfs

This kind of rule mainly maps service principals, which have the form

Service/Hostname@REALM

The service named dn is the DataNode, which has to be mapped to the hdfs user.

The remaining lines are, in order:

nm, NodeManager, mapped to the yarn user
nn, NameNode, mapped to the hdfs user
rm, ResourceManager, mapped to the yarn user
yarn, mapped to the yarn user

Our requirement only involves copying HDFS files across clusters, so the rules for these components are enough.

Note: all of these rules convert principals in the OSS3.COM realm.

Beyond these, what other rules are usually configured? Here are the rules that sit above DEFAULT:

RULE:[1:$1@$0](ambari-qa-ossd@OSS3.COM)s/.*/ambari-qa/
RULE:[1:$1@$0](hdfs-ossd@OSS3.COM)s/.*/hdfs/
RULE:[1:$1@$0](spark-ossd@OSS3.COM)s/.*/spark/
RULE:[1:$1@$0](.*@OSS3.COM)s/@.*//
RULE:[2:$1@$0](amshbase@OSS3.COM)s/.*/ams/
RULE:[2:$1@$0](amszk@OSS3.COM)s/.*/ams/
RULE:[2:$1@$0](beacon@OSS3.COM)s/.*/beacon/
RULE:[2:$1@$0](dn@OSS3.COM)s/.*/hdfs/
RULE:[2:$1@$0](hive@OSS3.COM)s/.*/hive/
RULE:[2:$1@$0](jhs@OSS3.COM)s/.*/mapred/
RULE:[2:$1@$0](jn@OSS3.COM)s/.*/hdfs/
RULE:[2:$1@$0](knox@OSS3.COM)s/.*/knox/
RULE:[2:$1@$0](nm@OSS3.COM)s/.*/yarn/
RULE:[2:$1@$0](nn@OSS3.COM)s/.*/hdfs/
RULE:[2:$1@$0](rangeradmin@OSS3.COM)s/.*/ranger/
RULE:[2:$1@$0](rangertagsync@OSS3.COM)s/.*/rangertagsync/
RULE:[2:$1@$0](rangerusersync@OSS3.COM)s/.*/rangerusersync/
RULE:[2:$1@$0](rm@OSS3.COM)s/.*/yarn/
RULE:[2:$1@$0](yarn@OSS3.COM)s/.*/yarn/
RULE:[1:$1@$0](ambari-qa-kylin_ossa@OSS.COM)s/.*/ambari-qa/
RULE:[1:$1@$0](hbase-kylin_ossa@OSS.COM)s/.*/hbase/
RULE:[1:$1@$0](hdfs-kylin_ossa@OSS.COM)s/.*/hdfs/
RULE:[1:$1@$0](spark-kylin_ossa@OSS.COM)s/.*/spark/
RULE:[1:$1@$0](.*@OSS.COM)s/@.*//
RULE:[2:$1@$0](amshbase@OSS.COM)s/.*/ams/
RULE:[2:$1@$0](amszk@OSS.COM)s/.*/ams/
RULE:[2:$1@$0](beacon@OSS.COM)s/.*/beacon/
RULE:[2:$1@$0](dn@OSS.COM)s/.*/hdfs/
RULE:[2:$1@$0](hbase@OSS.COM)s/.*/hbase/
RULE:[2:$1@$0](hive@OSS.COM)s/.*/hive/
RULE:[2:$1@$0](jhs@OSS.COM)s/.*/mapred/
RULE:[2:$1@$0](jn@OSS.COM)s/.*/hdfs/
RULE:[2:$1@$0](knox@OSS.COM)s/.*/knox/
RULE:[2:$1@$0](nm@OSS.COM)s/.*/yarn/
RULE:[2:$1@$0](nn@OSS.COM)s/.*/hdfs/
RULE:[2:$1@$0](rangeradmin@OSS.COM)s/.*/ranger/
RULE:[2:$1@$0](rangerusersync@OSS.COM)s/.*/rangerusersync/
RULE:[2:$1@$0](rm@OSS.COM)s/.*/yarn/
RULE:[2:$1@$0](yarn@OSS.COM)s/.*/yarn/
DEFAULT

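Hmm, so why were these OSS3 and OSS rules added above DEFAULT this time? Rules are tried in order and DEFAULT only matches principals in the default realm, so for principals from other realms it makes no difference whether their rules come before or after DEFAULT. I stopped worrying about it.

As you can see, there are quite a lot of components.

Configure `[capaths]` in `/etc/krb5.conf`

This configures the trust path between realms.

Note that this changes /etc/krb5.conf on the client machine; /etc/krb5.conf on the cluster nodes is not modified.

Append the following to the end of /etc/krb5.conf: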

[capaths]

OSS3.COM = {
                OSS2.COM = .
 }

The resulting krb5.conf:

#File modified by ipa-client-install

includedir /etc/krb5.conf.d/
includedir /var/lib/sss/pubconf/krb5.include.d/

[libdefaults]
  default_realm = OSS3.COM
  dns_lookup_realm = false
  dns_lookup_kdc = false
  rdns = false
  dns_canonicalize_hostname = false
  ticket_lifetime = 365d
  renew_lifetime = 3650d
  forwardable = true
  udp_preference_limit = 0
  default_ccache_name = /tmp/krb5cc_%{uid}


[realms]
   OSS3.COM = {
     kdc = ipa010.oss3.com
     kdc = ipa011.oss3.com
     kdc = ipa012.oss3.com
     master_kdc = ipa010.oss3.com
     master_kdc = ipa011.oss3.com
     master_kdc = ipa012.oss3.com
     admin_server = ipa010.oss3.com
     admin_server = ipa011.oss3.com
     admin_server = ipa012.oss3.com
     kpasswd_server = ipa010.oss3.com
     kpasswd_server = ipa011.oss3.com
     kpasswd_server = ipa012.oss3.com
     default_domain = oss3.com
     pkinit_anchors = FILE:/var/lib/ipa-client/pki/kdc-ca-bundle.pem
     pkinit_pool = FILE:/var/lib/ipa-client/pki/ca-bundle.pem
 
   }

  OSS2.COM = {
    kdc = ipa006.oss2.com:88
    master_kdc = ipa006.oss2.com:88
    admin_server = ipa006.oss2.com:749
    kpasswd_server = ipa006.oss2.com:464
    kdc = ipa005.oss2.com:88
    master_kdc = ipa005.oss2.com:88
    admin_server = ipa005.oss2.com:749
    kpasswd_server = ipa005.oss2.com:464
    default_domain = oss2.com
    pkinit_anchors = FILE:/var/lib/ipa-client/pki/kdc-ca-bundle.pem
    pkinit_pool = FILE:/var/lib/ipa-client/pki/ca-bundle.pem

  }


 
 [domain_realm]
   .oss3.com = OSS3.COM
   oss3.com = OSS3.COM


   .oss2.com = OSS2.COM
   oss2.com = OSS2.COM
 
 [capaths]

 OSS3.COM = {
                 OSS2.COM = .
  }

'dfs.namenode.kerberos.principal.pattern' in hdfs-site.xml

Configure it in Ambari:

custom hdfs-site
dfs.namenode.kerberos.principal.pattern=*

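Restart the HDFS component of cluster C in Ambari.

Don't click the wrong thing here: only the HDFS component needs a restart. Last time I restarted every component and broke out in a cold sweat.

Verification

How do you check that it works?

Create an IPA user with the same name on both C and D, with the same password and encryption type. The encryption type is usually not specified explicitly, in which case you can ignore it.

On a client machine that uses cluster D's credentials, run hdfs dfs -ls against an IP address to list files on cluster C.

Use the IP of cluster C's active NameNode.

If the listing works, the trust is working.

The clusters involved in our earlier cross-cluster copy script were fairly old, so the script obtained the active NameNode IP like this: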

getActiveNameNode(){
  namenodes='10.110.123.1 10.110.123.2'
  for namenode in ${namenodes}
  do
    # The NameNodeStatus bean contains "active" only on the active NameNode
    if curl -s "http://${namenode}:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus" | grep -q 'active'; then
        active_namenode=${namenode}
        break
    fi
  done
}

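Cluster C runs HDP 3.x, where accessing port 50070 requires Kerberos authentication. I didn't find another way to authenticate, so I took a detour (a hack): whichever NameNode hdfs dfs -ls can list is the active NameNode.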

getActiveNameNode(){
  namenodes='10.110.123.1 10.110.123.2'
  for namenode in ${namenodes}
  do
    # The listing only succeeds on the active NameNode; a standby rejects reads
    if hdfs dfs -ls "hdfs://${namenode}:8020/" > /dev/null 2>&1; then
        active_namenode=${namenode}
        break
    fi
  done
}
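
The same probing idea can be written so that the caller notices when no NameNode answers at all. This is a minimal sketch under assumptions: find_active and probe_hdfs_ls are hypothetical names, and the probe is the hdfs dfs -ls check from above.

```shell
# find_active PROBE HOST...
# Prints the first host for which "PROBE HOST" succeeds; returns 1 if none does.
find_active() {
    probe=$1
    shift
    for host in "$@"; do
        if "$probe" "$host" > /dev/null 2>&1; then
            printf '%s\n' "$host"
            return 0
        fi
    done
    return 1
}

# Probe used in this post: listing / only works against the active NameNode.
probe_hdfs_ls() {
    hdfs dfs -ls "hdfs://$1:8020/"
}

# Usage (not executed here):
#   active_namenode=$(find_active probe_hdfs_ls 10.110.123.1 10.110.123.2) || exit 1
```

Passing the probe as an argument keeps the loop independent of how "active" is detected, so the older curl-based check could be plugged in the same way.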

References:
https://www.cnblogs.com/xiaodf/p/10689092.html

https://community.cloudera.com/t5/Community-Articles/Kerberos-cross-realm-trust-for-distcp/ta-p/245590

https://medium.com/remotehero-co/how-to-configure-distcp-on-2-kerberized-clusters-22a24658a7e1

https://www.cnblogs.com/yinzhengjie2020/p/13655547.html

http://t.zoukankan.com/devos-p-5448938.html

Forcing the SSH client to use password authentication

2022-04-15T00:00:34.000Z

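Background

This actually happened in 2021. We had to download data over SFTP from another vendor's server. That server used a floating IP shared by a primary and a standby machine; when the primary failed, the IP floated over to the standby.

Here is the problem: SFTP runs over an SSH channel, and the client verifies the host fingerprint by default when connecting. When the remote machine changes, the host fingerprint changes too, so the SSH client concludes that you are under a man-in-the-middle attack: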

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the ED25519 key sent by the remote host is
24:48:c9:a3:de:04:57:9f:f8:a5:ab:28:e0:d6:1a:bb.

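Approaches

Clear the cached host fingerprint

If you play with VPSes a lot, you have probably seen this: after reinstalling the OS on a VPS, SSH connections fail because the host fingerprint no longer matches the one from your earlier connections. Remove the corresponding line from ~/.ssh/known_hosts and connect again.

Alternatively, the terminal suggests running the following command, which has the same effect (assuming you are in the current user's home directory):

ssh-keygen -f ".ssh/known_hosts" -R hostname

What the options mean: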

-f filename
        Specifies the filename of the key file.

-R hostname | [hostname]:port
        Removes all keys belonging to the specified hostname (with optional port number) from a known_hosts file.  This
        option is useful to delete hashed hosts (see the -H option above).

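You can clear the entry before every SSH connection, or first detect that the fingerprint has changed (grep for a keyword, check for a failing SSH exit status, or use expect) and clear it only then.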
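
The "detect, then clear" variant could look roughly like this. It is a sketch under assumptions: host_key_changed and refresh_known_host are hypothetical names, the probe uses ssh in BatchMode so it never prompts, and it only looks for the warning text shown above.

```shell
# Succeeds if the text on stdin contains OpenSSH's host-key-changed warning.
host_key_changed() {
    grep -q 'REMOTE HOST IDENTIFICATION HAS CHANGED'
}

# refresh_known_host HOST
# Probes HOST once and removes its cached key if the warning shows up.
refresh_known_host() {
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true 2>&1 | host_key_changed; then
        ssh-keygen -f "$HOME/.ssh/known_hosts" -R "$1"
    fi
}

# Usage (not executed here):
#   refresh_known_host 10.239.1.2
```

The probe is expected to fail authentication on a password-only server; that is fine, because only the warning in its output matters.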

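Skip fingerprint verification with command-line options (recommended)

I still wasn't satisfied; clearing the cache didn't feel like the most elegant approach. Is there an option that makes the SSH client use password authentication only? After all, sshd (the SSH server) has an option that controls whether password authentication is allowed (PasswordAuthentication).

And I actually found one.

References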
https://unix.stackexchange.com/questions/15138/how-to-force-ssh-client-to-use-only-password-auth
https://serverfault.com/questions/559885/temporarily-ignore-my-ssh-known-hosts-file
https://linux.livejournal.com/1884229.html

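The final command:

sshpass -p password sftp -o PreferredAuthentications=password -o PubkeyAuthentication=no -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null user@ip

Explanation:

sshpass passes the password to sftp when running non-interactively
-o PreferredAuthentications=password prefer password authentication
-o PubkeyAuthentication=no do not use public key authentication
-o StrictHostKeyChecking=no do not check the host fingerprint
-o UserKnownHostsFile=/dev/null do not use the known_hosts file

In my tests, all of these options were needed before the check was really skipped. With StrictHostKeyChecking=no alone, a changed key still triggers the warning and OpenSSH disables password authentication, which is why UserKnownHostsFile=/dev/null is needed as well.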
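
If the command is called from several scripts, the options can be kept in one place. A minimal sketch, assuming sshpass is installed and the password is exported as SSHPASS (sshpass -e reads it from that environment variable, which keeps it off the command line); sftp_password_only is a hypothetical name.

```shell
# sftp_password_only [sftp arguments...]
# Runs sftp with password authentication only and without host key checks.
sftp_password_only() {
    sshpass -e sftp \
        -o PreferredAuthentications=password \
        -o PubkeyAuthentication=no \
        -o StrictHostKeyChecking=no \
        -o UserKnownHostsFile=/dev/null \
        "$@"
}

# Usage (not executed here):
#   export SSHPASS='password'
#   sftp_password_only user@ip
```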

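Skip fingerprint verification with a config file (not recommended, hard to maintain)

In a later upgrade we switched to lftp (another client, which supports sftp/ftp) to pull the files, apparently because lftp supports mget (fetching several files at once).

The -o options no longer had any effect. I looked at the lftp source: at its core it also wraps sftp, but I couldn't find a way to pass these -o options through to it.

However, sftp/ssh read ssh_config by default, without it being specified explicitly (from experience).

So I wrote a configuration in ~/.ssh/config: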

~/.ssh/config

Host 10.239.*.*
     PreferredAuthentications password
     PubkeyAuthentication no
     StrictHostKeyChecking no
     UserKnownHostsFile /dev/null

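Finally, set the file permissions to 600:

chmod 600 ~/.ssh/config

This achieved the goal too, but I find this kind of configuration ugly and hard to maintain. If the pulling client is later moved to another server, whoever maintains it may not know that there is configuration hiding in ~/.ssh/config.

Appendix: ssh options

Here are all the ssh options. If you are interested, spend some time looking at what else can be configured.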

ssh option
     -o ssh_option
             Can be used to pass options to ssh in the format used in ssh_config(5).  This is useful for specifying options for which there is no separate sftp command-line flag.
             For example, to specify an alternate port use: sftp -oPort=24.  For full details of the options listed below, and their possible values, see ssh_config(5).

                   AddressFamily
                   BatchMode
                   BindAddress
                   BindInterface
                   CanonicalDomains
                   CanonicalizeFallbackLocal
                   CanonicalizeHostname
                   CanonicalizeMaxDots
                   CanonicalizePermittedCNAMEs
                   CASignatureAlgorithms
                   CertificateFile
                   ChallengeResponseAuthentication
                   CheckHostIP
                   Ciphers
                   Compression
                   ConnectionAttempts
                   ConnectTimeout
                   ControlMaster
                   ControlPath
                   ControlPersist
                   GlobalKnownHostsFile
                   GSSAPIAuthentication
                   GSSAPIDelegateCredentials
                   HashKnownHosts
                   Host
                   HostbasedAuthentication
                   HostbasedKeyTypes
                   HostKeyAlgorithms
                   HostKeyAlias
                   Hostname
                   IdentitiesOnly
                   IdentityAgent
                   IdentityFile
                   IPQoS
                   KbdInteractiveAuthentication
                   KbdInteractiveDevices
                   KexAlgorithms
                   LogLevel
                   MACs
                   NoHostAuthenticationForLocalhost
                   NumberOfPasswordPrompts
                   PasswordAuthentication
                   PKCS11Provider
                   Port
                   PreferredAuthentications
                   ProxyCommand
                   ProxyJump
                   PubkeyAcceptedKeyTypes
                   PubkeyAuthentication
                   RekeyLimit
                   SendEnv
                   ServerAliveInterval
                   ServerAliveCountMax
                   SetEnv
                   StrictHostKeyChecking
                   TCPKeepAlive
                   UpdateHostKeys
                   User
                   UserKnownHostsFile
                   VerifyHostKeyDNS

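Fixing the "Are you sure you want to continue connecting (yes/no)?" prompt on ssh login

In a script, where interactive input is not possible, set it with an ssh -o option:

ssh -o StrictHostKeyChecking=no root@192.168.111.22

Thinking a bit further: where does the fingerprint come from?

Later I wondered how this fingerprint is generated. On Linux everything is a file, so the fingerprint must live somewhere.

How the fingerprint is generated:
https://superuser.com/questions/421997/what-is-a-ssh-key-fingerprint-and-how-is-it-generated

It mentions this file: /etc/ssh/ssh_host_rsa_key.pub

But our error looks like this: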

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the ED25519 key sent by the remote host is
24:48:c9:a3:de:04:57:9f:f8:a5:ab:28:e0:d6:1a:bb.

Note: ED25519 key

So the file actually in use is /etc/ssh/ssh_host_ed25519_key.pub

Compute the fingerprint with ssh-keygen:

ssh-keygen -E md5 -lf /etc/ssh/ssh_host_ed25519_key.pub
256 MD5:24:48:c9:a3:de:04:57:9f:f8:a5:ab:28:e0:d6:1a:bb no comment (ED25519)

As you can see, it matches the fingerprint in the error message.
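
The MD5 fingerprint is the MD5 digest of the base64-decoded key blob, that is, the second field of the .pub file. Here is a minimal sketch that reproduces the hex part of what ssh-keygen -E md5 -lf prints, assuming GNU coreutils (base64 -d, md5sum); md5_fingerprint is a hypothetical helper name.

```shell
# md5_fingerprint PUBKEY_FILE
# Prints the colon-separated MD5 fingerprint of the key in PUBKEY_FILE.
md5_fingerprint() {
    awk '{print $2}' "$1" |          # second field: base64-encoded key blob
        base64 -d |                  # decode to the raw blob
        md5sum |                     # hash it
        awk '{print $1}' |           # keep only the hex digest
        sed 's/../&:/g; s/:$//'      # insert ":" between bytes
}

# Usage (not executed here):
#   md5_fingerprint /etc/ssh/ssh_host_ed25519_key.pub
```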

Deploying Ceph in an offline environment

2022-04-10T00:00:34.000Z

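We need storage inside k8s, and files have to persist after a container is deleted, so we need a distributed storage system. The most popular one at the moment is Ceph.

This post is a running log of deploying Ceph.

Choosing a tool

Reference: https://docs.ceph.com/en/latest/install/

Methods recommended on that page:

cephadm, uses containers
- automated deployment pulls images from docker.io; even if you import them offline first, it still makes a request to fetch the latest commit id, so it is unusable offline
rook, uses containers
ceph-ansible, uses Ansible
- means Ansible has to be installed on the server side
- does not integrate the orchestrator API introduced in Nautilus and Octopus, so the new management features and the dashboard integration are unavailable
ceph-deploy
- a tool for deploying Ceph quickly
- not actively maintained; not tested upstream on releases after Nautilus; does not support RHEL 8, CentOS 8, or newer operating systems

Our machines run CentOS 7. After going back and forth, I chose ceph-deploy over a manual install; the newest release it supports is Nautilus.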

Host

10.101.235.84 ceph-1

sda 500G
data 3T raid0 x 12 ( sdb - sdm)
mem 128G
cpu 24c

10.101.235.217 ceph-2

sda 446G
data 6T raid0 x 12 ( sdb - sdm)
mem 256G
cpu 32c

10.101.235.252 ceph-3

sda 414G
sdb 60T 12*6T raid5 (RAID not rebuilt yet)
mem 256G
cpu 32c

Preparation

host

hostnamectl set-hostname ceph-1
hostnamectl set-hostname ceph-2
hostnamectl set-hostname ceph-3

Write to /etc/hosts:

10.101.235.84 ceph-1
10.101.235.217 ceph-2
10.101.235.38 ceph-3

Passwordless ssh

ssh-keygen

ssh-copy-id ceph-1

ssh-copy-id ceph-2

ssh-copy-id ceph-3

Security settings

Disable SELinux and the firewall:

for i in {1..3};do echo $i;ssh ceph-$i "systemctl disable --now firewalld";done

for i in {1..3};do echo $i;ssh ceph-$i "setenforce 0";done

for i in {1..3};do echo $i;ssh ceph-$i "sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config";done

On ceph-1:
vi /etc/ntp.conf

server 10.110.38.240 minpoll 4 maxpoll 5

Check the sync status with ntpq -pn:

[root@ceph-1 ceph-cluster]# ntpq -pn
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*10.110.38.240   10.108.84.45     3 u   10   16  377    0.901  -283.35   0.407

On ceph-2 and ceph-3, point at ceph-1's address, 10.101.235.84:
vi /etc/ntp.conf

server 10.101.235.84 iburst

ntpq -pn

Check the sync status.

Once synchronization is complete, a * is shown in front of the IP.
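
Looking for the * by eye gets tedious across three nodes, so the check can be scripted. A minimal sketch; ntp_synced is a hypothetical helper name, and it only checks that some peer line in the ntpq -pn output starts with *.

```shell
# Reads `ntpq -pn` output on stdin.
# Succeeds if a peer is marked with "*", i.e. selected as the sync source.
ntp_synced() {
    grep -q '^\*'
}

# Usage (not executed here):
#   for i in 1 2 3; do
#       ssh "ceph-$i" 'ntpq -pn' | ntp_synced || echo "ceph-$i is not synchronized yet"
#   done
```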

In production it is best to configure multiple NTP servers.

for i in {1..3};do echo $i;ssh ceph-$i "date";done

yum repository

ceph-deploy version 2.0.0+

http://download.ceph.com/rpm-nautilus/el7/noarch/

ceph.repo

[ceph]
name=ceph
baseurl=http://10.110.38.20/ceph-nautilus/
enabled=1
gpgcheck=0

yum makecache

Install ceph-deploy

On ceph-1:

yum install python-setuptools ceph-deploy

ceph-deploy --version

deploy

On ceph-1:

mkdir my-cluster
cd my-cluster

This step generates ceph.conf and a keyring:

ceph-deploy new --cluster-network=10.101.235.1/24 ceph-1

[root@ceph-1 ceph]# ceph-deploy new --cluster-network=10.101.235.1/24 ceph-1
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.1): /bin/ceph-deploy new --cluster-network=10.101.235.1/24 ceph-1
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  func                          : 
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : 
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  ssh_copykey                   : True
[ceph_deploy.cli][INFO  ]  mon                           : ['ceph-1']
[ceph_deploy.cli][INFO  ]  public_network                : None
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  cluster_network               : 10.101.235.1/24
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.cli][INFO  ]  fsid                          : None
[ceph_deploy.new][DEBUG ] Creating new cluster named ceph
[ceph_deploy.new][INFO  ] making sure passwordless SSH succeeds
[ceph-1][DEBUG ] connected to host: ceph-1 
[ceph-1][DEBUG ] detect platform information from remote host
[ceph-1][DEBUG ] detect machine type
[ceph-1][DEBUG ] find the location of an executable
[ceph-1][INFO  ] Running command: /usr/sbin/ip link show
[ceph-1][INFO  ] Running command: /usr/sbin/ip addr show
[ceph-1][DEBUG ] IP addresses found: [u'10.101.235.84']
[ceph_deploy.new][DEBUG ] Resolving host ceph-1
[ceph_deploy.new][DEBUG ] Monitor ceph-1 at 10.101.235.84
[ceph_deploy.new][DEBUG ] Monitor initial members are ['ceph-1']
[ceph_deploy.new][DEBUG ] Monitor addrs are ['10.101.235.84']
[ceph_deploy.new][DEBUG ] Creating a random mon key...
[ceph_deploy.new][DEBUG ] Writing monitor keyring to ceph.mon.keyring...
[ceph_deploy.new][DEBUG ] Writing initial config to ceph.conf...

ceph-deploy install {ceph-node} […]
automatically overwrites the yum repository configuration, so here we install manually:

yum install -y ceph ceph-mon ceph-mgr ceph-radosgw ceph-mds

Error: Package: librdkafka-0.11.5-1.el7.x86_64 (ceph)
           Requires: liblz4.so.1()(64bit)
Error: Package: 2:ceph-base-14.2.22-0.el7.x86_64 (ceph)
           Requires: liblz4.so.1()(64bit)
Error: Package: policycoreutils-2.5-34.el7.x86_64 (ceph)
           Requires: libsemanage >= 2.5-14
           Installed: libsemanage-2.5-8.el7.x86_64 (@anaconda)
               libsemanage = 2.5-8.el7
Error: Package: policycoreutils-2.5-34.el7.x86_64 (ceph)
           Requires: libsepol >= 2.5-10
           Installed: libsepol-2.5-6.el7.x86_64 (@anaconda)
               libsepol = 2.5-6.el7
Error: Package: selinux-policy-3.13.1-268.el7_9.2.noarch (ceph)
           Requires: libsemanage >= 2.5-13
           Installed: libsemanage-2.5-8.el7.x86_64 (@anaconda)
               libsemanage = 2.5-8.el7
Error: Package: 2:ceph-mon-14.2.22-0.el7.x86_64 (ceph)
           Requires: liblz4.so.1()(64bit)
Error: Package: policycoreutils-2.5-34.el7.x86_64 (ceph)
           Requires: libselinux-utils >= 2.5-14
           Installed: libselinux-utils-2.5-11.el7.x86_64 (@anaconda)
               libselinux-utils = 2.5-11.el7
Error: Package: 2:ceph-common-14.2.22-0.el7.x86_64 (ceph)
           Requires: liblz4.so.1()(64bit)
Error: Package: 2:ceph-osd-14.2.22-0.el7.x86_64 (ceph)
           Requires: liblz4.so.1()(64bit)
 You could try using --skip-broken to work around the problem
 You could try running: rpm -Va --nofiles --nodigest

On a machine with internet access:

yumdownloader --resolve libsemanage.x86_64
yumdownloader --resolve libsepol.x86_64
yumdownloader --resolve libselinux-utils
yumdownloader --resolve libselinux
yumdownloader --resolve lz4

liblz4 https://serverfault.com/questions/917688/unable-to-update-centos-7-yum-update-broken

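Put the rpm files in one directory and install them together:

yum install *

If you yum install them one at a time, it complains about one unmet dependency after another. It is not that smart, so they have to be installed together.

Install on nodes 1, 2, and 3:

yum install -y ceph ceph-mon ceph-mgr ceph-radosgw ceph-mds

On node 1:

cd my-cluster

ceph-deploy mon create-initial # initialize the mon on node 1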

[root@ceph-1 ceph]# ceph-deploy mon create-initial
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.1): /bin/ceph-deploy mon create-initial
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  subcommand                    : create-initial
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : 
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  func                          : 
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.cli][INFO  ]  keyrings                      : None
[ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts ceph-1
[ceph_deploy.mon][DEBUG ] detecting platform for host ceph-1 ...
[ceph-1][DEBUG ] connected to host: ceph-1 
[ceph-1][DEBUG ] detect platform information from remote host
[ceph-1][DEBUG ] detect machine type
[ceph-1][DEBUG ] find the location of an executable
[ceph_deploy.mon][INFO  ] distro info: CentOS Linux 7.4.1708 Core
[ceph-1][DEBUG ] determining if provided host has same hostname in remote
[ceph-1][DEBUG ] get remote short hostname
[ceph-1][DEBUG ] deploying mon to ceph-1
[ceph-1][DEBUG ] get remote short hostname
[ceph-1][DEBUG ] remote hostname: ceph-1
[ceph-1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-1][DEBUG ] create the mon path if it does not exist
[ceph-1][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-ceph-1/done
[ceph-1][DEBUG ] done path does not exist: /var/lib/ceph/mon/ceph-ceph-1/done
[ceph-1][INFO  ] creating keyring file: /var/lib/ceph/tmp/ceph-ceph-1.mon.keyring
[ceph-1][DEBUG ] create the monitor keyring file
[ceph-1][INFO  ] Running command: ceph-mon --cluster ceph --mkfs -i ceph-1 --keyring /var/lib/ceph/tmp/ceph-ceph-1.mon.keyring --setuser 167 --setgroup 167
[ceph-1][INFO  ] unlinking keyring file /var/lib/ceph/tmp/ceph-ceph-1.mon.keyring
[ceph-1][DEBUG ] create a done file to avoid re-doing the mon deployment
[ceph-1][DEBUG ] create the init path if it does not exist
[ceph-1][INFO  ] Running command: systemctl enable ceph.target
[ceph-1][INFO  ] Running command: systemctl enable ceph-mon@ceph-1
[ceph-1][WARNIN] Created symlink from /etc/systemd/system/ceph-mon.target.wants/ceph-mon@ceph-1.service to /usr/lib/systemd/system/ceph-mon@.service.
[ceph-1][INFO  ] Running command: systemctl start ceph-mon@ceph-1
[ceph-1][INFO  ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-1.asok mon_status
[ceph-1][DEBUG ] ********************************************************************************
[ceph-1][DEBUG ] status for monitor: mon.ceph-1
[ceph-1][DEBUG ] {
[ceph-1][DEBUG ]   "election_epoch": 3, 
[ceph-1][DEBUG ]   "extra_probe_peers": [], 
[ceph-1][DEBUG ]   "feature_map": {
[ceph-1][DEBUG ]     "mon": [
[ceph-1][DEBUG ]       {
[ceph-1][DEBUG ]         "features": "0x3ffddff8ffecffff", 
[ceph-1][DEBUG ]         "num": 1, 
[ceph-1][DEBUG ]         "release": "luminous"
[ceph-1][DEBUG ]       }
[ceph-1][DEBUG ]     ]
[ceph-1][DEBUG ]   }, 
[ceph-1][DEBUG ]   "features": {
[ceph-1][DEBUG ]     "quorum_con": "4611087854035861503", 
[ceph-1][DEBUG ]     "quorum_mon": [
[ceph-1][DEBUG ]       "kraken", 
[ceph-1][DEBUG ]       "luminous", 
[ceph-1][DEBUG ]       "mimic", 
[ceph-1][DEBUG ]       "osdmap-prune", 
[ceph-1][DEBUG ]       "nautilus"
[ceph-1][DEBUG ]     ], 
[ceph-1][DEBUG ]     "required_con": "2449958747315912708", 
[ceph-1][DEBUG ]     "required_mon": [
[ceph-1][DEBUG ]       "kraken", 
[ceph-1][DEBUG ]       "luminous", 
[ceph-1][DEBUG ]       "mimic", 
[ceph-1][DEBUG ]       "osdmap-prune", 
[ceph-1][DEBUG ]       "nautilus"
[ceph-1][DEBUG ]     ]
[ceph-1][DEBUG ]   }, 
[ceph-1][DEBUG ]   "monmap": {
[ceph-1][DEBUG ]     "created": "2021-07-21 17:31:24.761362", 
[ceph-1][DEBUG ]     "epoch": 1, 
[ceph-1][DEBUG ]     "features": {
[ceph-1][DEBUG ]       "optional": [], 
[ceph-1][DEBUG ]       "persistent": [
[ceph-1][DEBUG ]         "kraken", 
[ceph-1][DEBUG ]         "luminous", 
[ceph-1][DEBUG ]         "mimic", 
[ceph-1][DEBUG ]         "osdmap-prune", 
[ceph-1][DEBUG ]         "nautilus"
[ceph-1][DEBUG ]       ]
[ceph-1][DEBUG ]     }, 
[ceph-1][DEBUG ]     "fsid": "d26fd4cc-7ba1-4744-91d5-f5ccf291c5eb", 
[ceph-1][DEBUG ]     "min_mon_release": 14, 
[ceph-1][DEBUG ]     "min_mon_release_name": "nautilus", 
[ceph-1][DEBUG ]     "modified": "2021-07-21 17:31:24.761362", 
[ceph-1][DEBUG ]     "mons": [
[ceph-1][DEBUG ]       {
[ceph-1][DEBUG ]         "addr": "10.101.235.84:6789/0", 
[ceph-1][DEBUG ]         "name": "ceph-1", 
[ceph-1][DEBUG ]         "public_addr": "10.101.235.84:6789/0", 
[ceph-1][DEBUG ]         "public_addrs": {
[ceph-1][DEBUG ]           "addrvec": [
[ceph-1][DEBUG ]             {
[ceph-1][DEBUG ]               "addr": "10.101.235.84:3300", 
[ceph-1][DEBUG ]               "nonce": 0, 
[ceph-1][DEBUG ]               "type": "v2"
[ceph-1][DEBUG ]             }, 
[ceph-1][DEBUG ]             {
[ceph-1][DEBUG ]               "addr": "10.101.235.84:6789", 
[ceph-1][DEBUG ]               "nonce": 0, 
[ceph-1][DEBUG ]               "type": "v1"
[ceph-1][DEBUG ]             }
[ceph-1][DEBUG ]           ]
[ceph-1][DEBUG ]         }, 
[ceph-1][DEBUG ]         "rank": 0
[ceph-1][DEBUG ]       }
[ceph-1][DEBUG ]     ]
[ceph-1][DEBUG ]   }, 
[ceph-1][DEBUG ]   "name": "ceph-1", 
[ceph-1][DEBUG ]   "outside_quorum": [], 
[ceph-1][DEBUG ]   "quorum": [
[ceph-1][DEBUG ]     0
[ceph-1][DEBUG ]   ], 
[ceph-1][DEBUG ]   "quorum_age": 2, 
[ceph-1][DEBUG ]   "rank": 0, 
[ceph-1][DEBUG ]   "state": "leader", 
[ceph-1][DEBUG ]   "sync_provider": []
[ceph-1][DEBUG ] }
[ceph-1][DEBUG ] ********************************************************************************
[ceph-1][INFO  ] monitor: mon.ceph-1 is running
[ceph-1][INFO  ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-1.asok mon_status
[ceph_deploy.mon][INFO  ] processing monitor mon.ceph-1
[ceph-1][DEBUG ] connected to host: ceph-1 
[ceph-1][DEBUG ] detect platform information from remote host
[ceph-1][DEBUG ] detect machine type
[ceph-1][DEBUG ] find the location of an executable
[ceph-1][INFO  ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-1.asok mon_status
[ceph_deploy.mon][INFO  ] mon.ceph-1 monitor has reached quorum!
[ceph_deploy.mon][INFO  ] all initial monitors are running and have formed quorum
[ceph_deploy.mon][INFO  ] Running gatherkeys...
[ceph_deploy.gatherkeys][INFO  ] Storing keys in temp directory /tmp/tmplLWg_z
[ceph-1][DEBUG ] connected to host: ceph-1 
[ceph-1][DEBUG ] detect platform information from remote host
[ceph-1][DEBUG ] detect machine type
[ceph-1][DEBUG ] get remote short hostname
[ceph-1][DEBUG ] fetch remote file
[ceph-1][INFO  ] Running command: /usr/bin/ceph --connect-timeout=25 --cluster=ceph --admin-daemon=/var/run/ceph/ceph-mon.ceph-1.asok mon_status
[ceph-1][INFO  ] Running command: /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-1/keyring auth get client.admin
[ceph-1][INFO  ] Running command: /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-1/keyring auth get client.bootstrap-mds
[ceph-1][INFO  ] Running command: /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-1/keyring auth get client.bootstrap-mgr
[ceph-1][INFO  ] Running command: /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-1/keyring auth get client.bootstrap-osd
[ceph-1][INFO  ] Running command: /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-1/keyring auth get client.bootstrap-rgw
[ceph_deploy.gatherkeys][INFO  ] Storing ceph.client.admin.keyring
[ceph_deploy.gatherkeys][INFO  ] Storing ceph.bootstrap-mds.keyring
[ceph_deploy.gatherkeys][INFO  ] Storing ceph.bootstrap-mgr.keyring
[ceph_deploy.gatherkeys][INFO  ] keyring 'ceph.mon.keyring' already exists
[ceph_deploy.gatherkeys][INFO  ] Storing ceph.bootstrap-osd.keyring
[ceph_deploy.gatherkeys][INFO  ] Storing ceph.bootstrap-rgw.keyring
[ceph_deploy.gatherkeys][INFO  ] Destroy temp directory /tmp/tmplLWg_z

This generated a lot of keyrings.

Push the keyrings to all nodes:

ceph-deploy admin ceph-1 ceph-2 ceph-3

[root@ceph-1 ceph-cluster]# ceph-deploy admin ceph-1 ceph-2 ceph-3
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.1): /bin/ceph-deploy admin ceph-1 ceph-2 ceph-3
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : 
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  client                        : ['ceph-1', 'ceph-2', 'ceph-3']
[ceph_deploy.cli][INFO  ]  func                          : <function admin at 0x1e3e0c8>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph-1
[ceph-1][DEBUG ] connected to host: ceph-1 
[ceph-1][DEBUG ] detect platform information from remote host
[ceph-1][DEBUG ] detect machine type
[ceph-1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph-2
[ceph-2][DEBUG ] connected to host: ceph-2 
[ceph-2][DEBUG ] detect platform information from remote host
[ceph-2][DEBUG ] detect machine type
[ceph-2][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph-3
[ceph-3][DEBUG ] connected to host: ceph-3 
[ceph-3][DEBUG ] detect platform information from remote host
[ceph-3][DEBUG ] detect machine type
[ceph-3][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf

Run ceph -s:

[root@ceph-1 ceph-cluster]# ceph -s
  cluster:
    id:     d26fd4cc-7ba1-4744-91d5-f5ccf291c5eb
    health: HEALTH_WARN
            mon is allowing insecure global_id reclaim
 
  services:
    mon: 1 daemons, quorum ceph-1 (age 3h)
    mgr: no daemons active
    osd: 0 osds: 0 up, 0 in
 
  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:

It warns: mon is allowing insecure global_id reclaim

Fix:

ceph config set mon auth_allow_insecure_global_id_reclaim false

ceph -s

[root@ceph-1 ceph-cluster]# ceph -s
  cluster:
    id:     d26fd4cc-7ba1-4744-91d5-f5ccf291c5eb
    health: HEALTH_OK
 
  services:
    mon: 1 daemons, quorum ceph-1 (age 3h)
    mgr: no daemons active
    osd: 0 osds: 0 up, 0 in
 
  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:

There is one monitor, ceph-1.
There is no manager yet.
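
When comparing several saved ceph -s outputs like the ones above, it can help to pull out just the health word. The following is a minimal sketch, assuming the text layout shown above; the function name ceph_health_from_status is hypothetical and not part of Ceph. It only parses text on stdin, so it can be tried without a live cluster.

```shell
# Print the health status (for example HEALTH_OK) found in `ceph -s` text on stdin.
# Hypothetical helper: it only parses text and never talks to a cluster.
ceph_health_from_status() {
    awk '$1 == "health:" { print $2; exit }'
}

# Example with a captured snippet instead of a live cluster.
printf '  cluster:\n    id:     d26fd4cc\n    health: HEALTH_WARN\n' | ceph_health_from_status
```

On a live cluster the same function can be fed with ceph -s | ceph_health_from_status.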

Contents of the current cluster directory:

[root@ceph-1 ceph-cluster]# ll
total 48
-rw------- 1 root root   113 Jul 21 17:31 ceph.bootstrap-mds.keyring
-rw------- 1 root root   113 Jul 21 17:31 ceph.bootstrap-mgr.keyring
-rw------- 1 root root   113 Jul 21 17:31 ceph.bootstrap-osd.keyring
-rw------- 1 root root   113 Jul 21 17:31 ceph.bootstrap-rgw.keyring
-rw------- 1 root root   151 Jul 21 17:31 ceph.client.admin.keyring
-rw-r--r-- 1 root root   231 Jul 21 15:34 ceph.conf
-rw-r--r-- 1 root root 17956 Jul 21 21:09 ceph-deploy-ceph.log
-rw------- 1 root root    73 Jul 21 15:34 ceph.mon.keyring

Make ceph-1 a mgr:

ceph-deploy mgr create ceph-1

[root@ceph-1 ceph-cluster]# ceph-deploy mgr create ceph-1
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.1): /bin/ceph-deploy mgr create ceph-1
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  mgr                           : [('ceph-1', 'ceph-1')]
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  subcommand                    : create
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : 
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  func                          : 
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.mgr][DEBUG ] Deploying mgr, cluster ceph hosts ceph-1:ceph-1
[ceph-1][DEBUG ] connected to host: ceph-1 
[ceph-1][DEBUG ] detect platform information from remote host
[ceph-1][DEBUG ] detect machine type
[ceph_deploy.mgr][INFO  ] Distro info: CentOS Linux 7.4.1708 Core
[ceph_deploy.mgr][DEBUG ] remote host will use systemd
[ceph_deploy.mgr][DEBUG ] deploying mgr bootstrap to ceph-1
[ceph-1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-1][WARNIN] mgr keyring does not exist yet, creating one
[ceph-1][DEBUG ] create a keyring file
[ceph-1][DEBUG ] create path recursively if it doesn't exist
[ceph-1][INFO  ] Running command: ceph --cluster ceph --name client.bootstrap-mgr --keyring /var/lib/ceph/bootstrap-mgr/ceph.keyring auth get-or-create mgr.ceph-1 mon allow profile mgr osd allow * mds allow * -o /var/lib/ceph/mgr/ceph-ceph-1/keyring
[ceph-1][INFO  ] Running command: systemctl enable ceph-mgr@ceph-1
[ceph-1][WARNIN] Created symlink from /etc/systemd/system/ceph-mgr.target.wants/ceph-mgr@ceph-1.service to /usr/lib/systemd/system/ceph-mgr@.service.
[ceph-1][INFO  ] Running command: systemctl start ceph-mgr@ceph-1
[ceph-1][INFO  ] Running command: systemctl enable ceph.target

ceph -s

[root@ceph-1 ceph-cluster]# ceph -s
  cluster:
    id:     d26fd4cc-7ba1-4744-91d5-f5ccf291c5eb
    health: HEALTH_OK
 
  services:
    mon: 1 daemons, quorum ceph-1 (age 4h)
    mgr: ceph-1(active, since 2s)
    osd: 0 osds: 0 up, 0 in
 
  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:

Preparing the OSDs

Delete the existing partitions:

umount

Clean up /etc/fstab

fdisk /dev/sdb
d
w

ceph-deploy osd create ceph-1 --data /dev/sdb
ceph-deploy osd create ceph-2 --data /dev/sdb
ceph-deploy osd create ceph-3 --data /dev/sdb

[root@ceph-1 ceph-cluster]# ceph-deploy osd create ceph-1 --data /dev/sdb
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.1): /bin/ceph-deploy osd create ceph-1 --data /dev/sdb
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  bluestore                     : None
[ceph_deploy.cli][INFO  ]  cd_conf                       : 
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  fs_type                       : xfs
[ceph_deploy.cli][INFO  ]  block_wal                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  journal                       : None
[ceph_deploy.cli][INFO  ]  subcommand                    : create
[ceph_deploy.cli][INFO  ]  host                          : ceph-1
[ceph_deploy.cli][INFO  ]  filestore                     : None
[ceph_deploy.cli][INFO  ]  func                          : 
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  zap_disk                      : False
[ceph_deploy.cli][INFO  ]  data                          : /dev/sdb
[ceph_deploy.cli][INFO  ]  block_db                      : None
[ceph_deploy.cli][INFO  ]  dmcrypt                       : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  dmcrypt_key_dir               : /etc/ceph/dmcrypt-keys
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  debug                         : False
[ceph_deploy.osd][DEBUG ] Creating OSD on cluster ceph with data device /dev/sdb
[ceph-1][DEBUG ] connected to host: ceph-1 
[ceph-1][DEBUG ] detect platform information from remote host
[ceph-1][DEBUG ] detect machine type
[ceph-1][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO  ] Distro info: CentOS Linux 7.4.1708 Core
[ceph_deploy.osd][DEBUG ] Deploying osd to ceph-1
[ceph-1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-1][WARNIN] osd keyring does not exist yet, creating one
[ceph-1][DEBUG ] create a keyring file
[ceph-1][DEBUG ] find the location of an executable
[ceph-1][INFO  ] Running command: /usr/sbin/ceph-volume --cluster ceph lvm create --bluestore --data /dev/sdb
[ceph-1][WARNIN] usage: ceph-volume lvm create [-h] [--crush-device-class CRUSH_DEVICE_CLASS]
[ceph-1][WARNIN]                               [--data-slots DATA_SLOTS]
[ceph-1][WARNIN]                               [--data-size DATA_SIZE]
[ceph-1][WARNIN]                               [--cluster-fsid CLUSTER_FSID] [--dmcrypt]
[ceph-1][WARNIN]                               [--osd-fsid OSD_FSID] --data DATA
[ceph-1][WARNIN]                               [--osd-id OSD_ID] [--no-systemd]
[ceph-1][WARNIN]                               [--block.wal-size BLOCK_WAL_SIZE]
[ceph-1][WARNIN]                               [--block.wal-slots BLOCK_WAL_SLOTS]
[ceph-1][WARNIN]                               [--block.db-size BLOCK_DB_SIZE]
[ceph-1][WARNIN]                               [--block.wal BLOCK_WAL]
[ceph-1][WARNIN]                               [--block.db-slots BLOCK_DB_SLOTS] [--bluestore]
[ceph-1][WARNIN]                               [--block.db BLOCK_DB] [--filestore]
[ceph-1][WARNIN]                               [--journal-size JOURNAL_SIZE]
[ceph-1][WARNIN]                               [--journal JOURNAL]
[ceph-1][WARNIN]                               [--journal-slots JOURNAL_SLOTS]
[ceph-1][WARNIN] ceph-volume lvm create: error: GPT headers found, they must be removed on: /dev/sdb
[ceph-1][ERROR ] RuntimeError: command returned non-zero exit status: 2
[ceph_deploy.osd][ERROR ] Failed to execute command: /usr/sbin/ceph-volume --cluster ceph lvm create --bluestore --data /dev/sdb
[ceph_deploy][ERROR ] GenericError: Failed to create 1 OSDs

[ceph-1][WARNIN] ceph-volume lvm create: error: GPT headers found, they must be removed on: /dev/sdb

https://www.jianshu.com/p/d7fcf1cb5a48

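The key point of this error is "GPT headers found, they must be removed". The likely cause is that the disk was partitioned before: the partitions were deleted, but the GPT data structures are still there. Use the sgdisk command to clear them.

sgdisk --zap-all /dev/sdX

Our yum repository does not provide sgdisk.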

https://zhuanlan.zhihu.com/p/73479251

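As you can see, a GPT partition table was found on the newly added disk, so adding it failed. We have to clear the disk's partition table by hand (if the disk were brand new, this step would presumably have succeeded).

Here we simply wipe the partition table by brute force, without bothering to deal with the PV and VG.

Note: check again and again that the target disk really is the one you intend. If you run this on the wrong disk, its partition table is gone.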

[root@storage03-ib ceph-9]# dd if=/dev/zero of=/dev/sde bs=512K count=1
1+0 records in
1+0 records out
524288 bytes (524 kB) copied, 0.00109677 s, 478 MB/s

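The dd command fills the first 512K of the disk with zeros, which wipes out the partition information directly.

According to the comments, there seems to be another option:

You can use wipefs -a /dev/sde, which is faster. You can then use partprobe to make the kernel re-read the partition table, so no reboot is needed.

I have not tried this.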

dd if=/dev/zero of=/dev/sdb bs=512K count=1
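
Because this dd command is destructive, it may be worth rehearsing it on a scratch file first. The sketch below assumes GNU dd and coreutils; the function name zero_head and the scratch file are hypothetical stand-ins, and nothing here touches a real disk.

```shell
# Zero the first 512 KiB of a target, like the dd command above.
# conv=notrunc keeps a regular file at its original size instead of truncating it.
zero_head() {
    dd if=/dev/zero of="$1" bs=512K count=1 conv=notrunc 2>/dev/null
}

# Rehearsal on a scratch file (a hypothetical stand-in for a disk).
img=$(mktemp)
# Fill 1 MiB with non-zero bytes.
head -c 1048576 /dev/zero | tr '\0' 'x' > "$img"
zero_head "$img"
# Count the non-zero bytes that remain after the wipe.
tr -d '\0' < "$img" | wc -c
rm -f "$img"
```

Only the leading 512 KiB is overwritten; the rest of the file, like the rest of a disk, is left untouched.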

Continue:

ceph-deploy osd create ceph-1 --data /dev/sdb

[root@ceph-1 ceph-cluster]# ceph-deploy osd create ceph-1 --data /dev/sdb
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.1): /bin/ceph-deploy osd create ceph-1 --data /dev/sdb
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  bluestore                     : None
[ceph_deploy.cli][INFO  ]  cd_conf                       : 
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  fs_type                       : xfs
[ceph_deploy.cli][INFO  ]  block_wal                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  journal                       : None
[ceph_deploy.cli][INFO  ]  subcommand                    : create
[ceph_deploy.cli][INFO  ]  host                          : ceph-1
[ceph_deploy.cli][INFO  ]  filestore                     : None
[ceph_deploy.cli][INFO  ]  func                          : <function osd at 0x2415758>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  zap_disk                      : False
[ceph_deploy.cli][INFO  ]  data                          : /dev/sdb
[ceph_deploy.cli][INFO  ]  block_db                      : None
[ceph_deploy.cli][INFO  ]  dmcrypt                       : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  dmcrypt_key_dir               : /etc/ceph/dmcrypt-keys
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  debug                         : False
[ceph_deploy.osd][DEBUG ] Creating OSD on cluster ceph with data device /dev/sdb
[ceph-1][DEBUG ] connected to host: ceph-1 
[ceph-1][DEBUG ] detect platform information from remote host
[ceph-1][DEBUG ] detect machine type
[ceph-1][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO  ] Distro info: CentOS Linux 7.4.1708 Core
[ceph_deploy.osd][DEBUG ] Deploying osd to ceph-1
[ceph-1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-1][DEBUG ] find the location of an executable
[ceph-1][INFO  ] Running command: /usr/sbin/ceph-volume --cluster ceph lvm create --bluestore --data /dev/sdb
[ceph-1][WARNIN] Running command: /bin/ceph-authtool --gen-print-key
[ceph-1][WARNIN] Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 749563a0-bcb0-4246-a5b7-9a9681b2bb68
[ceph-1][WARNIN] Running command: /sbin/vgcreate --force --yes ceph-2adefc7a-f41f-48d4-92eb-6427b5a50e37 /dev/sdb
[ceph-1][WARNIN]  stdout: Physical volume "/dev/sdb" successfully created.
[ceph-1][WARNIN]  stdout: Volume group "ceph-2adefc7a-f41f-48d4-92eb-6427b5a50e37" successfully created
[ceph-1][WARNIN] Running command: /sbin/lvcreate --yes -l 7867810 -n osd-block-749563a0-bcb0-4246-a5b7-9a9681b2bb68 ceph-2adefc7a-f41f-48d4-92eb-6427b5a50e37
[ceph-1][WARNIN]  stdout: Wiping xfs signature on /dev/ceph-2adefc7a-f41f-48d4-92eb-6427b5a50e37/osd-block-749563a0-bcb0-4246-a5b7-9a9681b2bb68.
[ceph-1][WARNIN]  stdout: Logical volume "osd-block-749563a0-bcb0-4246-a5b7-9a9681b2bb68" created.
[ceph-1][WARNIN] Running command: /bin/ceph-authtool --gen-print-key
[ceph-1][WARNIN] Running command: /bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-0
[ceph-1][WARNIN] Running command: /bin/chown -h ceph:ceph /dev/ceph-2adefc7a-f41f-48d4-92eb-6427b5a50e37/osd-block-749563a0-bcb0-4246-a5b7-9a9681b2bb68
[ceph-1][WARNIN] Running command: /bin/chown -R ceph:ceph /dev/dm-3
[ceph-1][WARNIN] Running command: /bin/ln -s /dev/ceph-2adefc7a-f41f-48d4-92eb-6427b5a50e37/osd-block-749563a0-bcb0-4246-a5b7-9a9681b2bb68 /var/lib/ceph/osd/ceph-0/block
[ceph-1][WARNIN] Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-0/activate.monmap
[ceph-1][WARNIN]  stderr: 2021-07-21 22:40:12.775 7fb864843700 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.bootstrap-osd.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
[ceph-1][WARNIN] 2021-07-21 22:40:12.775 7fb864843700 -1 AuthRegistry(0x7fb860066aa8) no keyring found at /etc/ceph/ceph.client.bootstrap-osd.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,, disabling cephx
[ceph-1][WARNIN]  stderr: got monmap epoch 1
[ceph-1][WARNIN] Running command: /bin/ceph-authtool /var/lib/ceph/osd/ceph-0/keyring --create-keyring --name osd.0 --add-key AQDLMfhg4mvlDxAAanSsTFDDOf03gDL2/rJxTA==
[ceph-1][WARNIN]  stdout: creating /var/lib/ceph/osd/ceph-0/keyring
[ceph-1][WARNIN] added entity osd.0 auth(key=AQDLMfhg4mvlDxAAanSsTFDDOf03gDL2/rJxTA==)
[ceph-1][WARNIN] Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-0/keyring
[ceph-1][WARNIN] Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-0/
[ceph-1][WARNIN] Running command: /bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 0 --monmap /var/lib/ceph/osd/ceph-0/activate.monmap --keyfile - --osd-data /var/lib/ceph/osd/ceph-0/ --osd-uuid 749563a0-bcb0-4246-a5b7-9a9681b2bb68 --setuser ceph --setgroup ceph
[ceph-1][WARNIN]  stderr: 2021-07-21 22:40:13.390 7f7493915a80 -1 bluestore(/var/lib/ceph/osd/ceph-0/) _read_fsid unparsable uuid
[ceph-1][WARNIN] --> ceph-volume lvm prepare successful for: /dev/sdb
[ceph-1][WARNIN] Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-0
[ceph-1][WARNIN] Running command: /bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-2adefc7a-f41f-48d4-92eb-6427b5a50e37/osd-block-749563a0-bcb0-4246-a5b7-9a9681b2bb68 --path /var/lib/ceph/osd/ceph-0 --no-mon-config
[ceph-1][WARNIN] Running command: /bin/ln -snf /dev/ceph-2adefc7a-f41f-48d4-92eb-6427b5a50e37/osd-block-749563a0-bcb0-4246-a5b7-9a9681b2bb68 /var/lib/ceph/osd/ceph-0/block
[ceph-1][WARNIN] Running command: /bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-0/block
[ceph-1][WARNIN] Running command: /bin/chown -R ceph:ceph /dev/dm-3
[ceph-1][WARNIN] Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-0
[ceph-1][WARNIN] Running command: /bin/systemctl enable ceph-volume@lvm-0-749563a0-bcb0-4246-a5b7-9a9681b2bb68
[ceph-1][WARNIN]  stderr: Created symlink from /etc/systemd/system/multi-user.target.wants/ceph-volume@lvm-0-749563a0-bcb0-4246-a5b7-9a9681b2bb68.service to /usr/lib/systemd/system/ceph-volume@.service.
[ceph-1][WARNIN] Running command: /bin/systemctl enable --runtime ceph-osd@0
[ceph-1][WARNIN]  stderr: Created symlink from /run/systemd/system/ceph-osd.target.wants/ceph-osd@0.service to /usr/lib/systemd/system/ceph-osd@.service.
[ceph-1][WARNIN] Running command: /bin/systemctl start ceph-osd@0
[ceph-1][WARNIN] --> ceph-volume lvm activate successful for osd ID: 0
[ceph-1][WARNIN] --> ceph-volume lvm create successful for: /dev/sdb
[ceph-1][INFO  ] checking OSD status...
[ceph-1][DEBUG ] find the location of an executable
[ceph-1][INFO  ] Running command: /bin/ceph --cluster=ceph osd stat --format=json
[ceph_deploy.osd][DEBUG ] Host ceph-1 is now ready for osd use.

Now run ceph -s:

[root@ceph-1 ceph-cluster]# ceph -s
  cluster:
    id:     d26fd4cc-7ba1-4744-91d5-f5ccf291c5eb
    health: HEALTH_WARN
            OSD count 1 < osd_pool_default_size 3
 
  services:
    mon: 1 daemons, quorum ceph-1 (age 5h)
    mgr: ceph-1(active, since 55m)
    osd: 1 osds: 1 up (since 100s), 1 in (since 100s)
 
  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   1.0 GiB used, 30 TiB / 30 TiB avail
    pgs:

ceph-deploy osd create ceph-2 --data /dev/sdb
ceph-deploy osd create ceph-3 --data /dev/sdb
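
The per-host commands can also be written as a loop. This is a minimal sketch, not what was actually run here: the function create_osds and the DRY_RUN switch are hypothetical, and by default the function only prints the commands so they can be reviewed first.

```shell
# Print (dry run) or execute `ceph-deploy osd create` for each host.
# DRY_RUN is a hypothetical switch: the default (1) only prints the commands,
# DRY_RUN=0 really runs them and stops at the first failure.
create_osds() {
    dev=$1
    shift
    for host in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "ceph-deploy osd create $host --data $dev"
        else
            ceph-deploy osd create "$host" --data "$dev" || return 1
        fi
    done
}

# Dry run for the two remaining hosts.
create_osds /dev/sdb ceph-2 ceph-3
```

Stopping at the first failure matters here, because an error such as the GPT one above has to be fixed on that host before moving on.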

[root@ceph-1 ceph-cluster]# ceph -s
  cluster:
    id:     d26fd4cc-7ba1-4744-91d5-f5ccf291c5eb
    health: HEALTH_OK
 
  services:
    mon: 1 daemons, quorum ceph-1 (age 5h)
    mgr: ceph-1(active, since 59m)
    osd: 3 osds: 3 up (since 7s), 3 in (since 7s)
 
  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   3.0 GiB used, 94 TiB / 94 TiB avail
    pgs:

The following warning is no longer shown:

HEALTH_WARN OSD count 1 < osd_pool_default_size 3

Check the OSD status:

[root@ceph-1 ceph-cluster]# ceph osd status 
+----+--------+-------+-------+--------+---------+--------+---------+-----------+
| id |  host  |  used | avail | wr ops | wr data | rd ops | rd data |   state   |
+----+--------+-------+-------+--------+---------+--------+---------+-----------+
| 0  | ceph-1 | 1025M | 30.0T |    0   |     0   |    0   |     0   | exists,up |
| 1  | ceph-2 | 1025M | 60.0T |    0   |     0   |    0   |     0   | exists,up |
| 2  | ceph-3 | 1025M | 3904G |    0   |     0   |    0   |     0   | exists,up |
+----+--------+-------+-------+--------+---------+--------+---------+-----------+

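Expanding mon and mgr

Reference: https://docs.ceph.com/en/octopus/install/ceph-deploy/quick-ceph-deploy/#expanding-your-cluster

mon needs to be highly available.
If mon goes down, the whole cluster goes down.
The Paxos algorithm calls for an odd number of mons.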
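
The reason for preferring an odd number can be shown with a little arithmetic: a quorum needs a strict majority of the monitors, so adding a fourth monitor to three does not let the cluster survive any more failures. A minimal sketch (the function names are hypothetical):

```shell
# Majority needed for a quorum of n monitors: floor(n/2) + 1.
quorum_majority() {
    echo $(( $1 / 2 + 1 ))
}

# Monitors that may fail while a majority is still available.
tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 1 2 3 4 5; do
    echo "mons=$n majority=$(quorum_majority "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

With 3 or 4 monitors one failure is tolerated, and with 5 monitors two are, which is why 3 is the usual minimum for high availability.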

ceph-deploy mon add ceph-2 ceph-3

This fails with:

[ceph-2][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[ceph-2][WARNIN] ceph-2 is not defined in `mon initial members`
[ceph-2][WARNIN] monitor ceph-2 does not exist in monmap
[ceph-2][WARNIN] neither `public_addr` nor `public_network` keys are defined for monitors
[ceph-2][WARNIN] monitors may not be able to form quorum
[ceph-2][INFO  ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-2.asok mon_status
[ceph-2][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[ceph-2][WARNIN] monitor: mon.ceph-2, might not be running yet

Cause: the ceph.conf configuration file is missing the public network setting.
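
Adding the setting could look roughly like the sketch below. Several things here are assumptions: the subnet 10.101.235.0/24 is only guessed from the monitor addresses in the logs (10.101.235.x), the helper add_public_network is hypothetical, and appending to the end of the file is only correct if [global] is the last (or only) section in ceph.conf. The example runs against a scratch file, not the real configuration.

```shell
# Append a "public network" line to a ceph.conf-style file unless one is already set.
# Hypothetical helper; assumes [global] is the last section of the file.
add_public_network() {
    target=$1
    cidr=$2
    if grep -Eq '^[[:space:]]*public[ _]network[[:space:]]*=' "$target"; then
        echo "public network already set in $target"
    else
        printf 'public network = %s\n' "$cidr" >> "$target"
    fi
}

# Try it on a scratch file; the subnet is an assumed value.
conf=$(mktemp)
printf '[global]\nmon_initial_members = ceph-1\n' > "$conf"
add_public_network "$conf" 10.101.235.0/24
cat "$conf"
rm -f "$conf"
```

The check before appending keeps the helper safe to run more than once on the same file.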

After adding it, push the configuration to every node:

ceph-deploy --overwrite-conf config push ceph-1 ceph-2 ceph-3

ceph-deploy mon add ceph-2

[root@ceph-1 ceph-cluster]# ceph-deploy mon add ceph-2 
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.1): /bin/ceph-deploy mon add ceph-2
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  subcommand                    : add
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : 
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  mon                           : ['ceph-2']
[ceph_deploy.cli][INFO  ]  func                          : <function mon at 0x224c2a8>
[ceph_deploy.cli][INFO  ]  address                       : None
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.mon][INFO  ] ensuring configuration of new mon host: ceph-2
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph-2
[ceph-2][DEBUG ] connected to host: ceph-2 
[ceph-2][DEBUG ] detect platform information from remote host
[ceph-2][DEBUG ] detect machine type
[ceph-2][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.mon][DEBUG ] Adding mon to cluster ceph, host ceph-2
[ceph_deploy.mon][DEBUG ] using mon address by resolving host: 10.101.235.217
[ceph_deploy.mon][DEBUG ] detecting platform for host ceph-2 ...
[ceph-2][DEBUG ] connected to host: ceph-2 
[ceph-2][DEBUG ] detect platform information from remote host
[ceph-2][DEBUG ] detect machine type
[ceph-2][DEBUG ] find the location of an executable
[ceph_deploy.mon][INFO  ] distro info: CentOS Linux 7.4.1708 Core
[ceph-2][DEBUG ] determining if provided host has same hostname in remote
[ceph-2][DEBUG ] get remote short hostname
[ceph-2][DEBUG ] adding mon to ceph-2
[ceph-2][DEBUG ] get remote short hostname
[ceph-2][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-2][DEBUG ] create the mon path if it does not exist
[ceph-2][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-ceph-2/done
[ceph-2][DEBUG ] create a done file to avoid re-doing the mon deployment
[ceph-2][DEBUG ] create the init path if it does not exist
[ceph-2][INFO  ] Running command: systemctl enable ceph.target
[ceph-2][INFO  ] Running command: systemctl enable ceph-mon@ceph-2
[ceph-2][INFO  ] Running command: systemctl start ceph-mon@ceph-2
[ceph-2][INFO  ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-2.asok mon_status
[ceph-2][WARNIN] ceph-2 is not defined in `mon initial members`
[ceph-2][WARNIN] monitor ceph-2 does not exist in monmap
[ceph-2][INFO  ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-2.asok mon_status
[ceph-2][DEBUG ] ********************************************************************************
[ceph-2][DEBUG ] status for monitor: mon.ceph-2
[ceph-2][DEBUG ] {
[ceph-2][DEBUG ]   "election_epoch": 0, 
[ceph-2][DEBUG ]   "extra_probe_peers": [], 
[ceph-2][DEBUG ]   "feature_map": {
[ceph-2][DEBUG ]     "mon": [
[ceph-2][DEBUG ]       {
[ceph-2][DEBUG ]         "features": "0x3ffddff8ffecffff", 
[ceph-2][DEBUG ]         "num": 1, 
[ceph-2][DEBUG ]         "release": "luminous"
[ceph-2][DEBUG ]       }
[ceph-2][DEBUG ]     ]
[ceph-2][DEBUG ]   }, 
[ceph-2][DEBUG ]   "features": {
[ceph-2][DEBUG ]     "quorum_con": "0", 
[ceph-2][DEBUG ]     "quorum_mon": [], 
[ceph-2][DEBUG ]     "required_con": "2449958197560098820", 
[ceph-2][DEBUG ]     "required_mon": [
[ceph-2][DEBUG ]       "kraken", 
[ceph-2][DEBUG ]       "luminous", 
[ceph-2][DEBUG ]       "mimic", 
[ceph-2][DEBUG ]       "osdmap-prune", 
[ceph-2][DEBUG ]       "nautilus"
[ceph-2][DEBUG ]     ]
[ceph-2][DEBUG ]   }, 
[ceph-2][DEBUG ]   "monmap": {
[ceph-2][DEBUG ]     "created": "2021-07-21 17:31:24.761362", 
[ceph-2][DEBUG ]     "epoch": 1, 
[ceph-2][DEBUG ]     "features": {
[ceph-2][DEBUG ]       "optional": [], 
[ceph-2][DEBUG ]       "persistent": [
[ceph-2][DEBUG ]         "kraken", 
[ceph-2][DEBUG ]         "luminous", 
[ceph-2][DEBUG ]         "mimic", 
[ceph-2][DEBUG ]         "osdmap-prune", 
[ceph-2][DEBUG ]         "nautilus"
[ceph-2][DEBUG ]       ]
[ceph-2][DEBUG ]     }, 
[ceph-2][DEBUG ]     "fsid": "d26fd4cc-7ba1-4744-91d5-f5ccf291c5eb", 
[ceph-2][DEBUG ]     "min_mon_release": 14, 
[ceph-2][DEBUG ]     "min_mon_release_name": "nautilus", 
[ceph-2][DEBUG ]     "modified": "2021-07-21 17:31:24.761362", 
[ceph-2][DEBUG ]     "mons": [
[ceph-2][DEBUG ]       {
[ceph-2][DEBUG ]         "addr": "10.101.235.84:6789/0", 
[ceph-2][DEBUG ]         "name": "ceph-1", 
[ceph-2][DEBUG ]         "public_addr": "10.101.235.84:6789/0", 
[ceph-2][DEBUG ]         "public_addrs": {
[ceph-2][DEBUG ]           "addrvec": [
[ceph-2][DEBUG ]             {
[ceph-2][DEBUG ]               "addr": "10.101.235.84:3300", 
[ceph-2][DEBUG ]               "nonce": 0, 
[ceph-2][DEBUG ]               "type": "v2"
[ceph-2][DEBUG ]             }, 
[ceph-2][DEBUG ]             {
[ceph-2][DEBUG ]               "addr": "10.101.235.84:6789", 
[ceph-2][DEBUG ]               "nonce": 0, 
[ceph-2][DEBUG ]               "type": "v1"
[ceph-2][DEBUG ]             }
[ceph-2][DEBUG ]           ]
[ceph-2][DEBUG ]         }, 
[ceph-2][DEBUG ]         "rank": 0
[ceph-2][DEBUG ]       }
[ceph-2][DEBUG ]     ]
[ceph-2][DEBUG ]   }, 
[ceph-2][DEBUG ]   "name": "ceph-2", 
[ceph-2][DEBUG ]   "outside_quorum": [], 
[ceph-2][DEBUG ]   "quorum": [], 
[ceph-2][DEBUG ]   "rank": -1, 
[ceph-2][DEBUG ]   "state": "probing", 
[ceph-2][DEBUG ]   "sync_provider": []
[ceph-2][DEBUG ] }
[ceph-2][DEBUG ] ********************************************************************************
[ceph-2][INFO  ] monitor: mon.ceph-2 is currently at the state of probing

ceph-deploy mon add ceph-3

ceph -s

cluster:
  id:     d26fd4cc-7ba1-4744-91d5-f5ccf291c5eb
  health: HEALTH_WARN
          clock skew detected on mon.ceph-2, mon.ceph-3
 
services:
  mon: 3 daemons, quorum ceph-1,ceph-2,ceph-3 (age 14s)
  mgr: ceph-1(active, since 9m)
  osd: 3 osds: 3 up (since 27m), 3 in (since 27m)
 
data:
  pools:   0 pools, 0 pgs
  objects: 0 objects, 0 B
  usage:   3.0 GiB used, 94 TiB / 94 TiB avail
  pgs:

ceph quorum_status --format json-pretty

[root@ceph-1 ceph-cluster]# ceph quorum_status --format json-pretty

{
    "election_epoch": 12,
    "quorum": [
        0,
        1,
        2
    ],
    "quorum_names": [
        "ceph-1",
        "ceph-2",
        "ceph-3"
    ],
    "quorum_leader_name": "ceph-1",
    "quorum_age": 63,
    "monmap": {
        "epoch": 3,
        "fsid": "d26fd4cc-7ba1-4744-91d5-f5ccf291c5eb",
        "modified": "2021-07-21 23:12:49.336700",
        "created": "2021-07-21 17:31:24.761362",
        "min_mon_release": 14,
        "min_mon_release_name": "nautilus",
        "features": {
            "persistent": [
                "kraken",
                "luminous",
                "mimic",
                "osdmap-prune",
                "nautilus"
            ],
            "optional": []
        },
        "mons": [
            {
                "rank": 0,
                "name": "ceph-1",
                "public_addrs": {
                    "addrvec": [
                        {
                            "type": "v2",
                            "addr": "10.101.235.84:3300",
                            "nonce": 0
                        },
                        {
                            "type": "v1",
                            "addr": "10.101.235.84:6789",
                            "nonce": 0
                        }
                    ]
                },
                "addr": "10.101.235.84:6789/0",
                "public_addr": "10.101.235.84:6789/0"
            },
            {
                "rank": 1,
                "name": "ceph-2",
                "public_addrs": {
                    "addrvec": [
                        {
                            "type": "v2",
                            "addr": "10.101.235.217:3300",
                            "nonce": 0
                        },
                        {
                            "type": "v1",
                            "addr": "10.101.235.217:6789",
                            "nonce": 0
                        }
                    ]
                },
                "addr": "10.101.235.217:6789/0",
                "public_addr": "10.101.235.217:6789/0"
            },
            {
                "rank": 2,
                "name": "ceph-3",
                "public_addrs": {
                    "addrvec": [
                        {
                            "type": "v2",
                            "addr": "10.101.235.38:3300",
                            "nonce": 0
                        },
                        {
                            "type": "v1",
                            "addr": "10.101.235.38:6789",
                            "nonce": 0
                        }
                    ]
                },
                "addr": "10.101.235.38:6789/0",
                "public_addr": "10.101.235.38:6789/0"
            }
        ]
    }
}

ceph mon stat

[root@ceph-1 ceph-cluster]# ceph mon stat
e3: 3 mons at {ceph-1=[v2:10.101.235.84:3300/0,v1:10.101.235.84:6789/0],ceph-2=[v2:10.101.235.217:3300/0,v1:10.101.235.217:6789/0],ceph-3=[v2:10.101.235.38:3300/0,v1:10.101.235.38:6789/0]}, election epoch 12, leader 0 ceph-1, quorum 0,1,2 ceph-1,ceph-2,ceph-3

ceph mon dump

[root@ceph-1 ceph-cluster]# ceph mon dump 
epoch 3
fsid d26fd4cc-7ba1-4744-91d5-f5ccf291c5eb
last_changed 2021-07-21 23:12:49.336700
created 2021-07-21 17:31:24.761362
min_mon_release 14 (nautilus)
0: [v2:10.101.235.84:3300/0,v1:10.101.235.84:6789/0] mon.ceph-1
1: [v2:10.101.235.217:3300/0,v1:10.101.235.217:6789/0] mon.ceph-2
2: [v2:10.101.235.38:3300/0,v1:10.101.235.38:6789/0] mon.ceph-3
dumped monmap epoch 3

Create the mgr daemons

ceph-deploy mgr create ceph-2 ceph-3

[root@ceph-1 ceph-cluster]# ceph-deploy mgr create ceph-2 ceph-3
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.1): /bin/ceph-deploy mgr create ceph-2 ceph-3
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  mgr                           : [('ceph-2', 'ceph-2'), ('ceph-3', 'ceph-3')]
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  subcommand                    : create
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : 
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  func                          : 
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.mgr][DEBUG ] Deploying mgr, cluster ceph hosts ceph-2:ceph-2 ceph-3:ceph-3
[ceph-2][DEBUG ] connected to host: ceph-2 
[ceph-2][DEBUG ] detect platform information from remote host
[ceph-2][DEBUG ] detect machine type
[ceph_deploy.mgr][INFO  ] Distro info: CentOS Linux 7.4.1708 Core
[ceph_deploy.mgr][DEBUG ] remote host will use systemd
[ceph_deploy.mgr][DEBUG ] deploying mgr bootstrap to ceph-2
[ceph-2][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-2][WARNIN] mgr keyring does not exist yet, creating one
[ceph-2][DEBUG ] create a keyring file
[ceph-2][DEBUG ] create path recursively if it doesn't exist
[ceph-2][INFO  ] Running command: ceph --cluster ceph --name client.bootstrap-mgr --keyring /var/lib/ceph/bootstrap-mgr/ceph.keyring auth get-or-create mgr.ceph-2 mon allow profile mgr osd allow * mds allow * -o /var/lib/ceph/mgr/ceph-ceph-2/keyring
[ceph-2][INFO  ] Running command: systemctl enable ceph-mgr@ceph-2
[ceph-2][WARNIN] Created symlink from /etc/systemd/system/ceph-mgr.target.wants/ceph-mgr@ceph-2.service to /usr/lib/systemd/system/ceph-mgr@.service.
[ceph-2][INFO  ] Running command: systemctl start ceph-mgr@ceph-2
[ceph-2][INFO  ] Running command: systemctl enable ceph.target
[ceph-3][DEBUG ] connected to host: ceph-3 
[ceph-3][DEBUG ] detect platform information from remote host
[ceph-3][DEBUG ] detect machine type
[ceph_deploy.mgr][INFO  ] Distro info: CentOS Linux 7.4.1708 Core
[ceph_deploy.mgr][DEBUG ] remote host will use systemd
[ceph_deploy.mgr][DEBUG ] deploying mgr bootstrap to ceph-3
[ceph-3][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-3][WARNIN] mgr keyring does not exist yet, creating one
[ceph-3][DEBUG ] create a keyring file
[ceph-3][DEBUG ] create path recursively if it doesn't exist
[ceph-3][INFO  ] Running command: ceph --cluster ceph --name client.bootstrap-mgr --keyring /var/lib/ceph/bootstrap-mgr/ceph.keyring auth get-or-create mgr.ceph-3 mon allow profile mgr osd allow * mds allow * -o /var/lib/ceph/mgr/ceph-ceph-3/keyring
[ceph-3][INFO  ] Running command: systemctl enable ceph-mgr@ceph-3
[ceph-3][WARNIN] Created symlink from /etc/systemd/system/ceph-mgr.target.wants/ceph-mgr@ceph-3.service to /usr/lib/systemd/system/ceph-mgr@.service.
[ceph-3][INFO  ] Running command: systemctl start ceph-mgr@ceph-3
[ceph-3][INFO  ] Running command: systemctl enable ceph.target

ceph -s

[root@ceph-1 ceph-cluster]# ceph -s
  cluster:
    id:     d26fd4cc-7ba1-4744-91d5-f5ccf291c5eb
    health: HEALTH_WARN
            clock skew detected on mon.ceph-2, mon.ceph-3
 
  services:
    mon: 3 daemons, quorum ceph-1,ceph-2,ceph-3 (age 7m)
    mgr: ceph-1(active, since 16m), standbys: ceph-2, ceph-3
    osd: 3 osds: 3 up (since 34m), 3 in (since 34m)
 
  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   3.0 GiB used, 94 TiB / 94 TiB avail
    pgs:

Dashboard

Available in 12.x (Luminous) and later

https://docs.ceph.com/en/nautilus/mgr/dashboard/

WebUI based on Angular/TypeScript

Features

https://docs.ceph.com/en/nautilus/mgr/dashboard/#feature-overview

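Multiple users with different permissions
SSO (single sign-on) support
SSL/TLS support: self-signed certificates or CA-issued certificates
Auditing: command audit
i18n (internationalization)

Management and monitoring features

Overall cluster health
Embedded Grafana dashboards
Cluster logs
Host management
Performance monitoring
Monitoring
Configuration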

yum install ceph-mgr-dashboard

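This automatically installs ceph-grafana-dashboards

The ceph-grafana-dashboards package ships the Grafana dashboard JSON files

for i in {1..3};do echo $i;ssh ceph-$i "yum install -y ceph-mgr-dashboard";done

Before enabling the dashboard, remember to install the dashboard package on every mgr node, otherwise enabling it reports an error

Even if you ignore the error with --force, the dashboard becomes unusable as soon as the active mgr dies and the mgr role moves to a standby node that lacks the package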

ceph mgr module enable dashboard

SSL

ceph dashboard create-self-signed-cert

[root@ceph-1 ceph-cluster]# ceph dashboard create-self-signed-cert
Self-signed certificate created

openssl req -new -nodes -x509 -subj "/O=IT/CN=ceph-mgr-dashboard" -days 3650 -keyout dashboard.key -out dashboard.crt -extensions v3_ca

[root@ceph-1 ceph-cluster]# openssl req -new -nodes -x509   -subj "/O=IT/CN=ceph-mgr-dashboard" -days 3650   -keyout dashboard.key -out dashboard.crt -extensions v3_ca
Generating a 2048 bit RSA private key
.....................................+++
........+++
writing new private key to 'dashboard.key'
-----

ceph config-key set mgr mgr/dashboard/crt -i dashboard.crt
ceph config-key set mgr mgr/dashboard/key -i dashboard.key
ceph config-key set mgr/dashboard/ceph1/crt -i dashboard.crt

[root@ceph-1 ceph-cluster]# ceph config-key set mgr mgr/dashboard/crt -i dashboard.crt
set mgr
[root@ceph-1 ceph-cluster]# ceph config-key set mgr mgr/dashboard/key -i dashboard.key
set mgr
[root@ceph-1 ceph-cluster]# ceph config-key set mgr/dashboard/ceph1/crt -i dashboard.crt
WARNING: it looks like you might be trying to set a ceph-mgr module configuration key.  Since Ceph 13.0.0 (Mimic), mgr module configuration is done with `config set`, and new values set using `config-key set` will be ignored.
set mgr/dashboard/ceph1/crt

ceph dashboard set-ssl-certificate -i dashboard.crt
ceph dashboard set-ssl-certificate-key -i dashboard.key
ceph dashboard set-ssl-certificate ceph-1 -i dashboard.crt
ceph dashboard set-ssl-certificate-key ceph-1 -i dashboard.key

[root@ceph-1 ceph-cluster]# ceph dashboard set-ssl-certificate -i dashboard.crt
SSL certificate updated
[root@ceph-1 ceph-cluster]# ceph dashboard set-ssl-certificate-key -i dashboard.key
SSL certificate key updated
[root@ceph-1 ceph-cluster]# ceph dashboard set-ssl-certificate  ceph-1 -i dashboard.crt
SSL certificate updated
[root@ceph-1 ceph-cluster]# ceph dashboard set-ssl-certificate-key ceph-1 -i dashboard.key
SSL certificate key updated

ceph mgr services

[root@ceph-1 ceph-cluster]# ceph mgr services
{
    "dashboard": "https://ceph-1:8443/"
}

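Create a user
On Nautilus, the password has to be written to a text file and passed in
Template:
ceph dashboard ac-user-create admin -i password.txt administrator

I don't fully understand the point of this; presumably it keeps the password out of the shell history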
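A minimal sketch of the password-file step, assuming the goal is indeed to keep the password out of the shell history and out of reach of other users. The helper names write_password_file and ac_user_create_cmd are hypothetical; the command shape is the template above.

```python
import os
import tempfile


def write_password_file(password, directory):
    """Write the password to a new file that only the owner can read."""
    # mkstemp creates the file with mode 0600, so other users cannot read it.
    fd, path = tempfile.mkstemp(prefix="dashboard-pw-", dir=directory)
    with os.fdopen(fd, "w") as handle:
        handle.write(password)
    return path


def ac_user_create_cmd(user, password_file, role="administrator"):
    """Build the argument list for the ac-user-create template shown above."""
    return ["ceph", "dashboard", "ac-user-create", user, "-i", password_file, role]


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    pw_file = write_password_file("change-me", workdir)
    print(" ".join(ac_user_create_cmd("admin", pw_file)))
    os.remove(pw_file)  # delete the file once the user exists
```

Deleting the file afterwards matters as much as the file mode, since the password otherwise stays on disk.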

Deploy MDS

ceph-deploy mds create ceph-1 ceph-2 ceph-3

[root@ceph-1 ceph-cluster]# ceph-deploy mds create ceph-1 ceph-2 ceph-3
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.1): /bin/ceph-deploy mds create ceph-1 ceph-2 ceph-3
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  subcommand                    : create
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : 
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  func                          : 
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  mds                           : [('ceph-1', 'ceph-1'), ('ceph-2', 'ceph-2'), ('ceph-3', 'ceph-3')]
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.mds][DEBUG ] Deploying mds, cluster ceph hosts ceph-1:ceph-1 ceph-2:ceph-2 ceph-3:ceph-3
[ceph-1][DEBUG ] connected to host: ceph-1 
[ceph-1][DEBUG ] detect platform information from remote host
[ceph-1][DEBUG ] detect machine type
[ceph_deploy.mds][INFO  ] Distro info: CentOS Linux 7.4.1708 Core
[ceph_deploy.mds][DEBUG ] remote host will use systemd
[ceph_deploy.mds][DEBUG ] deploying mds bootstrap to ceph-1
[ceph-1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-1][WARNIN] mds keyring does not exist yet, creating one
[ceph-1][DEBUG ] create a keyring file
[ceph-1][DEBUG ] create path if it doesn't exist
[ceph-1][INFO  ] Running command: ceph --cluster ceph --name client.bootstrap-mds --keyring /var/lib/ceph/bootstrap-mds/ceph.keyring auth get-or-create mds.ceph-1 osd allow rwx mds allow mon allow profile mds -o /var/lib/ceph/mds/ceph-ceph-1/keyring
[ceph-1][INFO  ] Running command: systemctl enable ceph-mds@ceph-1
[ceph-1][WARNIN] Created symlink from /etc/systemd/system/ceph-mds.target.wants/ceph-mds@ceph-1.service to /usr/lib/systemd/system/ceph-mds@.service.
[ceph-1][INFO  ] Running command: systemctl start ceph-mds@ceph-1
[ceph-1][INFO  ] Running command: systemctl enable ceph.target
[ceph-2][DEBUG ] connected to host: ceph-2 
[ceph-2][DEBUG ] detect platform information from remote host
[ceph-2][DEBUG ] detect machine type
[ceph_deploy.mds][INFO  ] Distro info: CentOS Linux 7.4.1708 Core
[ceph_deploy.mds][DEBUG ] remote host will use systemd
[ceph_deploy.mds][DEBUG ] deploying mds bootstrap to ceph-2
[ceph-2][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-2][WARNIN] mds keyring does not exist yet, creating one
[ceph-2][DEBUG ] create a keyring file
[ceph-2][DEBUG ] create path if it doesn't exist
[ceph-2][INFO  ] Running command: ceph --cluster ceph --name client.bootstrap-mds --keyring /var/lib/ceph/bootstrap-mds/ceph.keyring auth get-or-create mds.ceph-2 osd allow rwx mds allow mon allow profile mds -o /var/lib/ceph/mds/ceph-ceph-2/keyring
[ceph-2][INFO  ] Running command: systemctl enable ceph-mds@ceph-2
[ceph-2][WARNIN] Created symlink from /etc/systemd/system/ceph-mds.target.wants/ceph-mds@ceph-2.service to /usr/lib/systemd/system/ceph-mds@.service.
[ceph-2][INFO  ] Running command: systemctl start ceph-mds@ceph-2
[ceph-2][INFO  ] Running command: systemctl enable ceph.target
[ceph-3][DEBUG ] connected to host: ceph-3 
[ceph-3][DEBUG ] detect platform information from remote host
[ceph-3][DEBUG ] detect machine type
[ceph_deploy.mds][INFO  ] Distro info: CentOS Linux 7.4.1708 Core
[ceph_deploy.mds][DEBUG ] remote host will use systemd
[ceph_deploy.mds][DEBUG ] deploying mds bootstrap to ceph-3
[ceph-3][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-3][WARNIN] mds keyring does not exist yet, creating one
[ceph-3][DEBUG ] create a keyring file
[ceph-3][DEBUG ] create path if it doesn't exist
[ceph-3][INFO  ] Running command: ceph --cluster ceph --name client.bootstrap-mds --keyring /var/lib/ceph/bootstrap-mds/ceph.keyring auth get-or-create mds.ceph-3 osd allow rwx mds allow mon allow profile mds -o /var/lib/ceph/mds/ceph-ceph-3/keyring
[ceph-3][INFO  ] Running command: systemctl enable ceph-mds@ceph-3
[ceph-3][WARNIN] Created symlink from /etc/systemd/system/ceph-mds.target.wants/ceph-mds@ceph-3.service to /usr/lib/systemd/system/ceph-mds@.service.
[ceph-3][INFO  ] Running command: systemctl start ceph-mds@ceph-3
[ceph-3][INFO  ] Running command: systemctl enable ceph.target
[root@ceph-1 ceph-cluster]#

ceph -s

[root@ceph-1 ceph-cluster]# ceph -s
  cluster:
    id:     d26fd4cc-7ba1-4744-91d5-f5ccf291c5eb
    health: HEALTH_WARN
            clock skew detected on mon.ceph-2, mon.ceph-3
 
  services:
    mon: 3 daemons, quorum ceph-1,ceph-2,ceph-3 (age 68m)
    mgr: ceph-1(active, since 84m), standbys: ceph-2, ceph-3
    mds:  3 up:standby
    osd: 3 osds: 3 up (since 2h), 3 in (since 2h)
 
  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   3.0 GiB used, 94 TiB / 94 TiB avail
    pgs:

All 3 MDS daemons are up:standby

because no file system exists yet

Redoing a single-disk RAID 0

https://medium.com/@george.shuklin/how-to-remove-osd-from-ceph-cluster-b4c37cc0ec87

osd invalid

ceph osd out osd.11
If you see “osd.11 is already out” — it’s ok.
ceph osd down osd.11
Remove it: ceph osd rm osd.11. If it says ‘Error EBUSY: osd.11 is still up; must be down before removal.’ that means OSD is not dead yet. Go to the host it resides on and kill it (systemctl stop ceph-osd@11), and repeat rm operation.
Now it would list in ceph osd tree with ‘DNE’ status (DNE = do not exists). To clean up this status, remove it from CRUSH map: ceph osd crush rm osd.11
Last step: remove it authorization (it should prevent problems with ‘couldn’t add new osd with same number’): ceph auth del osd.
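The removal steps quoted above can be sketched as a command list. This is only an illustration: the function name is hypothetical, the daemon stop is moved ahead of the rm step (in the quoted text it is only needed when rm reports EBUSY), and the final auth step is left out because its command is truncated in the text above.

```python
def osd_removal_commands(osd_id):
    """Return the commands from the quoted steps, in order, for one OSD."""
    name = "osd.{}".format(osd_id)
    return [
        ["ceph", "osd", "out", name],
        ["ceph", "osd", "down", name],
        # Must run on the host that carries the OSD, not on the admin node.
        ["systemctl", "stop", "ceph-osd@{}".format(osd_id)],
        ["ceph", "osd", "rm", name],
        ["ceph", "osd", "crush", "rm", name],
    ]


if __name__ == "__main__":
    # Print only; review before running anything against a real cluster.
    for argv in osd_removal_commands(11):
        print(" ".join(argv))
```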

How to capture HTTPS traffic on Linux

2022-04-05T10:30:59.000Z

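I've recently been re-implementing the server side of a command-line client program. The client is open source, but it is written in Go, which I'm not very familiar with, so I decided to capture and replay its traffic to help with the work.

mitmproxy

Use mitmproxy. On Linux, a prebuilt binary can be downloaded straight from the official site

Official site: https://mitmproxy.org/

Download: https://snapshots.mitmproxy.org/8.0.0/mitmproxy-8.0.0-linux.tar.gz

Extract it after downloading

tar zxvf mitmproxy-8.0.0-linux.tar.gz

Usage

./mitmproxy

The program listens on 127.0.0.1:8080

Install the man-in-the-middle certificate

Reference: https://docs.mitmproxy.org/stable/concepts-certificates/#installing-the-mitmproxy-ca-certificate-manually

Set the proxy

Open another terminal and set the proxy address to 127.0.0.1:8080

export http_proxy=127.0.0.1:8080
export https_proxy=127.0.0.1:8080

Run your program, and mitmproxy will show the corresponding requests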
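As a rough sketch, the same proxy setup can be applied to a single child process instead of the whole terminal. The helper names are hypothetical, and the sketch assumes the target program honors the http_proxy / https_proxy variables and that mitmproxy listens on the default 127.0.0.1:8080.

```python
import os
import subprocess


def proxied_env(proxy="http://127.0.0.1:8080", base=None):
    """Return a copy of the environment with the proxy variables set."""
    env = dict(os.environ if base is None else base)
    # Some programs read the lowercase names, others the uppercase ones.
    for name in ("http_proxy", "https_proxy", "HTTP_PROXY", "HTTPS_PROXY"):
        env[name] = proxy
    return env


def run_through_proxy(argv, proxy="http://127.0.0.1:8080"):
    """Run one command with the proxy variables set only for that command."""
    return subprocess.run(argv, env=proxied_env(proxy), check=False)


if __name__ == "__main__":
    # curl is only an example target; any proxy-aware client works.
    run_through_proxy(["curl", "-s", "https://example.com/"])
```

Scoping the variables to one process keeps other programs in the same terminal from going through the capture.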

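Saving the capture

In the mitmproxy interactive window, press w, and the following appears at the bottom

@shown

After @shown, leave one space and type an absolute path, for example /root/1.mitm, then press Enter to save.

Other shortcuts are covered in the official docs, or in the table at https://quickref.me/mitmproxy .

Browsing the capture locally

After copying the capture file back to your local machine, install mitmproxy there as well. To browse the capture, run the mitmweb command; it opens the web UI at http://127.0.0.1:8081/ automatically

In the web UI, use File - Open to open the capture file

Why not mitmproxy's transparent proxy

mitmproxy also offers a transparent proxy mode, which roughly means you don't have to set https_proxy and http_proxy

See: https://docs.mitmproxy.org/stable/howto-transparent/

I couldn't get it to work and gave up. I'm no longer set on making it work either: whatever hack gets the capture done is good enough.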

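How I got here

At first I planned to use tcpdump. But tcpdump captures on the network interface rather than at the application layer, so without a MITM certificate installed it cannot see the payload. I knew that much, but I wasn't sure what exactly would still be visible, so I ran an experiment.

The conclusion:

Only the SNI can be captured.
The request path and the payload are fully encrypted.

SNI stands for Server Name Indication, a TLS extension. For the details, see https://www.cloudflare.com/zh-cn/learning/ssl/what-is-sni/ .

The SNI may be the domain name of the HTTPS site; if a CDN such as Cloudflare is in front, it may instead be the domain on the Cloudflare certificate.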
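The observation that the SNI travels unencrypted can be checked without a network connection. The sketch below uses Python's ssl module with in-memory BIOs to produce the first handshake message (the ClientHello) and shows that the hostname appears in it as plain bytes. The function name is hypothetical and example.com is just a placeholder; this is an illustration, not how tcpdump or a DPI box parses the field.

```python
import ssl


def client_hello(hostname):
    """Return the raw bytes a TLS client would send first for this hostname."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    incoming = ssl.MemoryBIO()  # bytes from the "server" (none are provided)
    outgoing = ssl.MemoryBIO()  # bytes the client wants to send
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        # Expected: the client has written its ClientHello and now waits
        # for a server reply that never comes.
        pass
    return outgoing.read()


if __name__ == "__main__":
    hello = client_hello("example.com")
    print(b"example.com" in hello)
```

Everything after the handshake is encrypted, which matches the experiment: the hostname is visible, the path and body are not.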

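I happen to be working on DPI-related things lately and looked at the DPI record specification for HTTPS; the only somewhat useful field in it is also the SNI.

If the next step is for the SNI to be encrypted too, HTTPS can probably be considered fairly secure.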

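Reverse-engineering an endpoint security assistant and joining the network around its policy rules

2022-03-19T02:24:27.000Z

⚠️ Disclaimer:

Preface

In my office environment, an endpoint security assistant is required: you log in with a password plus a verification code that changes daily before you can reach the office network and the internet. Before login it checks a set of mandatory security policies, including but not limited to whether antivirus software is running and whether important system security patches are installed. The software also runs only on Windows, so the only way to get Wi-Fi is to share a hotspot from the PC. Network sharing is checked before login too, yet the check never flagged mine, so perhaps it is lenient on purpose.

One day this week the software force-upgraded itself, and afterwards it required WPS to be installed before allowing network access, which thoroughly annoyed me. Still, that is not the real reason I decided to reverse it.

Of course the security checks above have their place, but they also shut out every other kind of device: Linux and macOS cannot join the network and have to resort to a virtual machine.

Mandating WPS is a decision driven by current national circumstances, to avoid being held hostage by foreign vendors. So was my 40-a-year Office 365 subscription a waste? (grumbling)

The reasons I actually set out to reverse this assistant are:

To port it to Linux, so a Raspberry Pi or another Linux device can join the network and provide a hotspot
To understand how the network login works; at first I assumed it was something like 802.1X authentication, but it turned out not to be
To bypass the policies and guard against even more obnoxious ones in the future (secondary)

Starting the reverse engineering

IDA

I first tried IDA, the famous tool I had never used before. Opening the launcher executable in IDA showed some comments, so I thought it would go smoothly. But when I opened the main program there was a lot of garbled text. I searched online, assuming an encoding problem, and then wondered whether I could decompile it and read the source instead.

IDA is really better suited to interactive reversing.

Decompiling with dnSpy

Looking closely at the program directory, each exe has a config file of the same name next to it, which looked like C#. I remembered using dnSpy for decompiling before, opened the program in it, and sure enough the source was visible, with a great many files. The Chinese strings were still garbled, though, and I still assumed an encoding problem.

On closer inspection there was a function, urlencode.b, decoding these garbled strings. I wondered why anyone would write code this way and how tedious it must be (I was being dense)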

internal static string b(string A_0, int A_1)
{
char[] array = A_0.ToCharArray();
int num = 1567956420 + A_1;
int num3;
int num2;
if ((num2 = (num3 = 0)) < 1)
{
goto IL_47;
}
IL_14:
int num5;
int num4 = num5 = num2;
char[] array2 = array;
int num6 = num5;
char c = array[num5];
byte b = (byte)((int)(c & 'ÿ') ^ num++);
byte b2 = (byte)((int)(c >> 8) ^ num++);
byte b3 = b2;
b2 = b;
b = b3;
array2[num6] = (ushort)((int)b2 << 8 | (int)b);
num3 = num4 + 1;
IL_47:
if ((num2 = num3) >= array.Length)
{
return string.Intern(new string(array));
}
goto IL_14;
}

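I read the code carefully and set out to reproduce it in Python. Then I found that Python has no ushort and didn't know how to write it, so I switched to Go. When I finished, I realized I had written it for nothing!!!

The return value array comes from A_0, but as far as I could see, nothing ever operates on array itself from start to finish.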
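For reference, here is a minimal sketch of how the decompiled loop could be transcribed to Python. It rests on two assumptions: that array2 is only an alias of array in the decompiled code, so the writes through array2 do end up in the returned string, and that every character fits in one 16-bit code unit. Python has no ushort, but masking each byte with 0xFF and recombining the two bytes gives the same 16-bit value. The encode helper is hypothetical; it isn't in the decompiled source and exists only to check the round trip.

```python
KEY_BASE = 1567956420  # constant taken from the decompiled method


def decode(text, key):
    """Transcription of the decompiled b(A_0, A_1) loop."""
    num = KEY_BASE + key
    out = []
    for ch in text:
        c = ord(ch)
        low = ((c & 0xFF) ^ num) & 0xFF   # (byte)((c & 0xFF) ^ num++)
        num += 1
        high = ((c >> 8) ^ num) & 0xFF    # (byte)((c >> 8) ^ num++)
        num += 1
        # The two bytes are swapped before being recombined.
        out.append(chr((low << 8) | high))
    return "".join(out)


def encode(text, key):
    """Hypothetical inverse of decode, used only for testing."""
    num = KEY_BASE + key
    out = []
    for ch in text:
        p = ord(ch)
        low = ((p >> 8) ^ num) & 0xFF
        num += 1
        high = ((p & 0xFF) ^ num) & 0xFF
        num += 1
        out.append(chr((high << 8) | low))
    return "".join(out)


if __name__ == "__main__":
    scrambled = encode("WPS", 7)
    print(decode(scrambled, 7))
```

Only the low 8 bits of num affect each byte, so C#'s 32-bit overflow behavior doesn't need to be emulated.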

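Eventually I realized the code had probably been obfuscated, which is what people loosely call packing.

The program I decompiled with dnSpy last time wasn't obfuscated, so I didn't recognize obfuscation when I saw it.

Checking with ScanId 1.5 (https://down.52pojie.cn/Tools/NET/ScanId_1_5.zip) confirmed that it was protected, and the protector is Dotfuscator.

Unpacking with de4dot

Use de4dot (https://down.52pojie.cn/Tools/NET/de4dot.zip) to strip the protection

de4dot detects the protector type automatically and then strips it. A very powerful tool.

The cleaned program is saved separately under a new name, so there is no risk of overwriting the original file.

Decompiling with dnSpy again

Dragging the exe cleaned by de4dot into dnSpy, the source was readable this time, with no garbled text.

Reading the source

Next came the part everyone loves: reading the source.

The first thing I did with the source was a global search for the root of all evil, the keyword "WPS", and it returned nothing.

GenAndSend: encryption of local data files

Reading more closely, the security policy is called Policy and the mandatory WPS installation is called LicensedSoftware. On every start the program downloads the latest policy from the server and writes it to local disk through an encryption function in a class called GenAndSend. I did find that file in the program directory.

The other data files in the program directory, and other data downloaded from the server, all go through GenAndSend as well. It seemed the only way to learn their contents was to re-implement GenAndSend's decryption function in a more convenient language.

GenAndSend's encryption and decryption are based on 3DES, a symmetric cipher, and the key is hard-coded in plaintext in the program, so it can naturally be reproduced.

After writing it, I found I didn't need it at all, because...

Capturing http/https requests with Fiddler

From the source, the policy is downloaded over the network and then encrypted before being saved to a local file. So there must be an HTTP request, and before encryption the data is plaintext, which means the traffic can simply be captured.

I normally capture with Burp Suite because it lets me decide manually whether to let each request through, but I didn't have much time for the Burp setup: installing Java, configuring the certificate, configuring the proxy port.

I needed something that just works, and that is Fiddler. It really is convenient: install it, tick "start capturing", and it is ready.

Through Fiddler I captured every request made while logging in and joining the network, and saw the policy contents as well. Combined with the program, I learned that the assistant checks the registry to detect whether WPS is installed.

What puzzled me most was how network access is actually granted. If it went through 802.1X-style authentication, I would have to study that protocol too.

There was also too much source, and I don't know C#, so I could only get the gist. Decompilation does show source, but function names and locations may have changed. Around the places where the string "登录成功" (login succeeded) appears, I saw nothing that modified the local network. I spent 2 hours skimming all of the source and am confident there was no such operation.

The login process

A close look at the login source and the requests captured by Fiddler showed that the username, password and extra code are combined,
encrypted with DES (the earlier case was 3DES), and POSTed to a URL.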

string text4 = "CMethod=login&LoginName={0}&IPs={1}&ExCode={2}&Password={3}";
text4 = string.Format(text4, new object[]
{
this.UserName,
ipv,
this.ExCode,
UrlEncode.Encoder(this.UserPass)
});
this.LoginProcess = "正在登录...";
string text5 = "";
try
{
Dictionary<string, string> dictionary2 = new Dictionary<string, string>();
string value3 = Class39.smethod_2(text4);
dictionary2.Add("Message", value3);
text5 = HttpHelper.GetUrlData(this.ServerUrl + "/TSClientReport.aspx", dictionary2, webBrowser_Form_New.urlType, Encoding.UTF8);
CommonUtility.WriteInfoLog("LoginState", "Login", "TSclient", text5, new DateTime[]
{
DateTime.Now
});
}
catch (Exception)
{
this.LoginResult = -5;
this.LoginDesc = "无法连接认证服务器，请检查您的网络设置。";
return;
}

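Class39.smethod_2(text4) is the DES encryption function.

I opened that URL and, oh, it is the authentication web page. Without network access, visiting any web page force-redirects to it.

Then it hit me!!!

The assistant is just a wrapper. What actually grants network access is this authentication page; the access logic lives on the server side, and none of it is local.

In other words, once login on this page succeeds, the upstream device authorizes our device to use the network.

So I tried logging in through the web page. The page is generic, and because the browser has to parse it, it cannot be encrypted; all of the logic is in the JS.

Logging in through the page also works and grants network access, but the security policy is still checked. A browser obviously cannot detect patches or installed programs, so the page asks for the assistant to be running. During login the page sends requests to port 8695 on localhost, where the assistant listens, to have the assistant check the security policy, and decides whether to proceed once the result comes back.

I summarized the login process roughly as follows:

Enter the username, password and extra code and click log in. This sends a login request, which we don't need to care about
Request http://localhost:8695/TSPolicyService/CheckPolicy?message=xxxx

xxxx is the DES ciphertext; the plaintext has the form 'timestamp=1647493100000&pageCode=910bea86-xxxx-xxxx-xxxx-d95cbd046c87&userName=xxxxxx'

Field description:

timestamp: timestamp
pageCode: similar to a session ID, shaped like a UUID; used later
userName: login name

Response:
{"d":null}

Request http://localhost:8695/TSPolicyService/ChangeUser

Response:
{"d":null}

The assistant uploads the policy check result

Method: POST
URL:

"http:///TSWebService/TSSafeCheckResult/UploadTSSafeCheckResult?r={}".format(ticks)

ticks is a .NET concept: the number of 100 ns intervals (1/10000 ms) elapsed since 0001-01-01 00:00:00.000.

Python implementation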

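import datetime

def getTicks():
    # .NET ticks: number of 100 ns intervals since 0001-01-01 00:00:00.
    t0 = datetime.datetime(1, 1, 1)
    # The client reports local wall-clock time; UTC+8 is assumed here.
    tz = datetime.timezone(datetime.timedelta(hours=8))
    now = datetime.datetime.now(tz).replace(tzinfo=None)
    # Integer division avoids the float rounding of total_seconds();
    # one microsecond equals 10 ticks.
    ticks = (now - t0) // datetime.timedelta(microseconds=1) * 10

    return ticks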

POST data:

data = {
    'UserLoginName': userName,
    'CheckScore': '95', 
    'IPAddress': 'xx.xx.xx.xx',
    'XmlCheckResult': XmlCheckResult,
    'PageCode': PageCode
}

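Field description:

UserLoginName: login name
CheckScore: check score
IPAddress: IP address
XmlCheckResult: the policy check result in XML format; copying a captured one is enough
PageCode: obtained by decrypting the message of the second request

Once the server receives the result, login succeeds and the device is on the network.

The assistant shows that the network is connected even though we never logged in through it. This again confirms my guess that the assistant is just a wrapper.

I handled the requests above with Python's Flask framework. With the assistant closed and the Python program running, logging in on the web page succeeds.

A side story

One hiccup: the library used for DES encryption and decryption, pycryptodome, failed at decryption on Windows. So I had to run a Flask backend on my own computer as well; when the Windows PC needs something decrypted during login, it sends a request to that Flask backend.

Later I still want to re-implement this in Go to solve the problem.

Go can cross-compile, so I hope to build a Windows binary first for testing, and then build for Linux to run the hotspot.

Heartbeat packets

The Python program was still running, but after barely two minutes the network was gone.

what-is-going-on.jpg

I quit the Python program and ran the genuine assistant while capturing traffic. It turns out heartbeat data is sent every ? minutes to keep the connection alive.

Looking at the heartbeat data, it is DES-encrypted again, with a different function from the CheckPolicy encryption above, and the payload is different every time.

Looking at the source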

string str = ConfigurationManager.AppSettings.Get("Client_HeartBeatURL");
string message = string.Format("CMethod=report&LoginName={0}&IPs={1}&ComputerName={2}&AssemblyVersion={3}&EstStatus={4}&SysVolNum={5}&TerminalNumber={6}", new object[]
{
webBrowser_Form_New.Username,
this.clientInformation_0.IPv4,
this.clientInformation_0.HostName,
this.version,
num,
ComputerInfor.OnlySysVoluNum,
ComputerInfor.TerminalNumber
});
string str2 = "/TSClientReport.aspx";
str += str2;
TSResult tsresult = new TSResult();
Dictionary<string, string> dictionary = new Dictionary<string, string>();
string value = Class39.smethod_2(message);
dictionary.Add("Message", value);
try
{
string urlData = HttpHelper.GetUrlData(str, dictionary, webBrowser_Form_New.urlType, Encoding.UTF8);
tsresult.Result = 0;
tsresult.Message = urlData;
}
catch (Exception ex)
{
tsresult.Result = -1;
tsresult.Message = ex.ToString();
}
if (tsresult.Result == 0 && tsresult.Message.ToLower().IndexOf("html") == -1)
{
this.sysLog_1.Description = "心跳已经发送。状态：成功，描述：" + tsresult.Message;
}
else
{
this.sysLog_1.Description = "心跳已经发送。状态：可能失败，描述：" + tsresult.Message;
}
CommonUtility.WriteBeatLog(this.sysLog_1);

Class39.smethod_2(message) is the encryption function.

The last line also writes a log.

In the program directory there is a BeatLog directory that holds the heartbeat logs:

[] 2022-03-17 00:01:40::time_NetWordCatdCheck_Tick()心跳已经发送。状态：成功，描述：0;
[] 2022-03-17 00:03:40::time_NetWordCatdCheck_Tick()心跳已经发送。状态：成功，描述：0;
[] 2022-03-17 00:05:40::time_NetWordCatdCheck_Tick()心跳已经发送。状态：成功，描述：0;
[] 2022-03-17 00:07:40::time_NetWordCatdCheck_Tick()心跳已经发送。状态：成功，描述：0;
[] 2022-03-17 00:09:40::time_NetWordCatdCheck_Tick()心跳已经发送。状态：成功，描述：0;
[] 2022-03-17 00:11:40::time_NetWordCatdCheck_Tick()心跳已经发送。状态：成功，描述：0;
[] 2022-03-17 00:13:41::time_NetWordCatdCheck_Tick()心跳已经发送。状态：成功，描述：0;

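描述：0; is the server response, matching the server response in the screenshot above.

So I implemented this in Python as well. After decryption the contents turned out to be exactly the same; the ciphertext differs only because a random value is involved.

I also thought of another approach: replaying the packets. But the other side's backend could see that, so to blend in better, reversing the heartbeat encryption function was still worthwhile.

I had never implemented an encryption function in Python before, though, and hit this error: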

--> 244         return self._cipher.encrypt(plaintext)
    245 
    246     def decrypt(self, ciphertext):

ValueError: Input strings must be a multiple of 8 in length

In other words, the plaintext length must be a multiple of 8; shorter input has to be padded, typically with PKCS7 padding.
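A minimal sketch of PKCS7 padding for an 8-byte block cipher such as DES, written with the standard library only. The function names are hypothetical; whether the original program uses exactly this padding is an assumption based on the error message above.

```python
def pkcs7_pad(data, block_size=8):
    """Append n bytes of value n so the length becomes a multiple of block_size."""
    n = block_size - len(data) % block_size  # always between 1 and block_size
    return data + bytes([n]) * n


def pkcs7_unpad(data, block_size=8):
    """Remove PKCS7 padding, raising ValueError if it is malformed."""
    if not data or len(data) % block_size != 0:
        raise ValueError("input is not a whole number of blocks")
    n = data[-1]
    if n < 1 or n > block_size or data[-n:] != bytes([n]) * n:
        raise ValueError("invalid padding")
    return data[:-n]


if __name__ == "__main__":
    padded = pkcs7_pad(b"1234567")
    print(padded)               # 7 bytes of data plus one byte of value 1
    print(pkcs7_unpad(padded))
```

Input that is already a multiple of 8 still gets a full block of padding, which is what makes unpadding unambiguous.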

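I finally got the encryption function written, but the server response was not the normal 0;

Normal data ends with %3d%3d, meaning the ciphertext before urlencoding ends with two "=". That is because the last step of the DES encryption routine is Base64 encoding.

The data I sent, however, ended with %253d, which looks like it was urlencoded twice, so the % was encoded again as %25 (25 is the hex code of %)

>>> urllib.parse.quote('%')
'%25'

Stepping through the code line by line, I found that Python's requests.post urlencodes the data once more on its own. So in my Python port of the C# source, the encryption function must not urlencode its return value.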
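The double encoding can be reproduced with the standard library alone. In this sketch urlencode stands in for the form encoding that requests applies to a dict payload, and the Base64 string is a made-up placeholder, not real ciphertext.

```python
from urllib.parse import quote, urlencode

ciphertext_b64 = "QUJDRA=="  # placeholder value ending in two "="

# Encoding once, as the ported encryption function originally did.
once = quote(ciphertext_b64, safe="")

# Form-encoding the already-encoded value encodes the "%" again.
double_encoded_form = urlencode({"Message": once})

# Passing the raw Base64 string and letting the form encoding do the work.
single_encoded_form = urlencode({"Message": ciphertext_b64})

if __name__ == "__main__":
    print(once)
    print(double_encoded_form)
    print(single_encoded_form)
```

The fix is therefore to hand the raw Base64 string to the HTTP client and let it encode exactly once.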

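Once I found this, I fixed my encryption function right away; Flask reloaded automatically, and the server response was normal again.

At this point, every step needed to join the network was implemented in Python.

There is one more difference between the C# and Python functions:
C#'s urlencode emits lowercase hex digits for special characters, while Python emits uppercase.

But because the urlencoding happens automatically during the POST, I can't control it, and I can't fix it up with a regex either.

Fortunately the server responds normally to uppercase urlencoding as well, so parsing should not be a problem.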
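A short sketch of why the case difference is harmless for a server that decodes the value before using it: both spellings decode to the same characters. The helper name is hypothetical, and whether the real server decodes this way is an assumption.

```python
from urllib.parse import quote, unquote


def same_after_decoding(a, b):
    """True if two percent-encoded strings decode to the same text."""
    return unquote(a) == unquote(b)


python_style = quote("==", safe="")   # Python emits uppercase hex digits
csharp_style = python_style.lower()   # lowercase variant as produced by the C# code

if __name__ == "__main__":
    print(python_style, csharp_style)
    print(same_after_decoding(python_style, csharp_style))
```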

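Summary

I quit the assistant and ran my stripped-down knockoff. It joins the network normally and bypasses the policy check, and with a heartbeat sent every 2 minutes the network stays up and the server's heartbeat responses are normal.

Actually, my network didn't drop even when I apparently sent malformed heartbeats, so it remains to be explored whether sending arbitrary data is enough to keep the connection alive.

With this done, some odd behavior I had seen before also makes sense:

Sometimes after a reboot, opening the assistant shows "已经接入到办公网络" (already connected to the office network), because it is still within the 2 minutes.
After exiting via the right-click menu, the network keeps working for a while and then drops, because heartbeats are no longer sent.
Disconnecting from the office network via the right-click menu cuts the network immediately, because a disconnect request is sent. Its message decrypts to "CMethod=logoff&IPs=xx.xx.xx.xx&LoginName=xxxxx&ComputerName=xxxxx\x01", where the trailing \x01 is the padding byte.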

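N26 Germany video verification retrospective

2021-11-22T15:24:27.000Z

I'll skip the introduction to N26; see these two articles by bluesky:

This only records my own video verification for the German region, in the hope that it helps those who come after. I've written down every detail I can remember. Just listen for the keywords during verification. If a weakling like me who only passed CET-4 can get through, it really isn't hard.

No proxy is needed, a direct connection is fine. I didn't register in the German region for the AMEX card; I mainly wanted to challenge my spoken English (not really). I didn't expect to crash on the first attempt.

On the first attempt, the network at my rental was so laggy that the agent sounded like a slideshow. I said my English wasn't very good, she asked whether I spoke any other language, I said Chinese, and she said: sorry, you must understand English to register for N26, we cannot verify your account, goodbye! Then she hung up on me. That left a scar and reminded me of my CET-6 listening exam.... I instantly regretted choosing the German region; I should have picked the Netherlands, which doesn't require video verification.

This time I tried again from home and passed in one go. Here is a replay of the process.

It starts with a long introduction I couldn't follow; ignore it. Then the questions began. She paused, and only then did I realize it was my turn to speak. I said yes, my network here is bad, can you speak slowly?
Another long stretch I couldn't follow, apparently asking why I was applying for the account. I said yes, I registered it for my own use.
She asked me to hold up 5 fingers next to my face.
Look at front camera: look at the front camera for a headshot.
After that the app switched to the rear camera to look at the passport and take photos. She told me to hold the passport; I didn't understand and closed it. She said open the passport, so I opened it; she said hold it, and I closed it again and put it on the table. She said, don't close it. I said, didn't you tell me to hold it? She said, I told you to hold it, not close it. Fine.... The flash fires automatically while the photos are taken.
After the photos, she asked me to point the up right area at the camera, then something something from side to side, meaning I should tilt it so she could see the security watermark.
Then she asked me to read out the passport number.
After reading it, close the passport and keep the camera on the cover. With the passport contents out of my sight, she asked questions: when were you born. My network really was bad and I heard when as where, so I said China. She looked baffled. I asked what she had said, and she said I said when.... I said, oh, you said when (inner voice: couldn't you just say birthday?). Then I thought: birthday, when is my birthday again, do I say the month first or the day? I froze for 5 seconds before getting it out.
Finally she asked whether I could receive a verification code. By then the app had moved on to the phone number confirmation step. I even asked her whether I should click that button, and she said yes. I received the code and entered it, and she said my account was verified and ready to use, have a good day. I said thank you.

The app blocks screenshots and screen recording throughout. Near the end, while she wasn't looking, I took a photo with another phone. I should have turned on audio recording but forgot. I'm tempted to close the account and register again (doge)

Some people online say they got an older female agent who hangs up the moment your English seems shaky. I didn't get her either time, and this young lady was patient: when she saw I didn't understand, she held up a card to demonstrate what I should do.

Deposit 20 euros to activate; paying directly with a linked CMB Mastercard works. N26 can be linked to Alipay, but Taobao purchases carry a 3% fee. Shameless of Alipay, given that card network fees are normally paid by the merchant.

Spent 44.9 HKD on Google Play; the card was charged 5.12 euros, a rate of 0.114, no fee
Bought steamed buns and soy milk for 4.5 CNY, converted to 0.63 euros at the real-time rate of 0.14, no fee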

Telegram 2021 Translator Tests Walkthrough?

2021-08-31T00:00:34.000Z

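Applying

On 2021-04-27 I saw in the 「荔枝木」 channel that Telegram was hiring:

Telegram's official job applications: telegram.org/jobs

Source: https://t.me/lychee_wood/21558

So I applied for the Translator position. I had been translating blog posts and documentation on and off anyway, partly to practice and partly to contribute to the community.

To apply, talk to @jobs_bot

Job descriptions

The ones I was most interested in: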

Translator

Responsibilities: translate app and website interfaces, posts on the Telegram blog, articles and more.
Requirements: fluency in English and at least one other language.

Site Reliability Engineer

Responsibilities: automate routine tasks, proactively identify and solve potential reliability issues, increase fault-tolerance of Telegram’s multi-datacenter infrastructure.
Preferred qualifications: experience administering *nix like systems, experience developing in C, Python or Perl; knowledge of bash, network protocols, and network equipment of major manufacturers.

Others include:

C/C++ Software Engineer
Junior Accountant
Assistant to the CEO

In the end I chose Translator

Then I forgot all about it. Who knew whether they were really hiring? Until....

Preliminary test

2021-08-19 23:55 (UTC+8)

Received a notification from @jobs_bot

Thank you very much for expressing an interest in joining Telegram as a Translator. We would like to offer you a series of test tasks to make sure that we fit well together.
The first stage will require you to complete a test of your English skills (you will receive a separate announcement with all the details).
Before that, we invite you to participate in a preliminary test — an easy quiz that will not be used for evaluation. This is not a requirement, but we highly recommend to participate because it will introduce you to all the necessary features of the Quiz Platform.
The preliminary test will start on August 20, 18:00 UTC and take less than 30 minutes. Here is the quiz link:
https://quiz.directory/quiz/PM0jsfZ9
In case you encounter any issues with the interface, please report them to this dedicated bug report account: @test_feedback.
During the quiz, please note that the interface does not allow changing your answer after clicking on one of the answer options.

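This test exists to get you used to the quiz format: you can get a feel for the per-question countdown and for how answers are selected.

On the night of 08-21, 8 minutes before the test started, I received the start notification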

The preliminary test has almost started!
Here is the quiz link:
https://quiz.directory/quiz/PM0jsfZ9
The quiz will stay open for 30 minutes and you can take it several times to get used to the interface.
If you encounter any problems during the quiz (e.g. media fails to load, long response times from the interface or other issues), please report them to this dedicated bug report account: @test_feedback.
As a reminder, this quiz will NOT be used for evaluation — so don’t worry about choosing the wrong answers.

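After finishing, once the end time has passed you can see the correct answers, how many questions you got right, and your ranking

Formal test

On the evening of the preliminary test day

2021-08-21 22:10 received a notification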

As part of the first evaluation task, we would like to invite you to complete a series of aptitude tests on Sunday, August 22. The tests will start exactly at 15:00 UTC and will take approximately 3 hours.
We understand that this invitation comes with very short notice. If you are unable to take the tests this Sunday, there will be another opportunity in September or October of this year.
That said, urgent translation tasks with ambitious deadlines that appear at unexpected times are not uncommon at Telegram, so it will definitely count as bonus points for your application if you are able to take the tests tomorrow.
We will send you more information about the tests on Sunday, before the task starts.
You will need:
A computer with internet access (desktop preferred).
A pen and paper for notes.
Do this now:
Check your local time! (UTC time now) The task will open at 15:00 UTC and close at 18:00 UTC, you will not have time to complete it if you are late.
Log in on quiz.directory ahead of time to ensure you are able to start the task at exactly 15:00 UTC.

At least it isn't an ungodly hour this time: it starts at 11 p.m. my local time (UTC+8), lasts 3 hours, and has 3 parts.
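The conversion from the announced UTC time to local time can be double-checked with a few lines of Python. This is only a sketch; the function name is hypothetical and the offset of 8 hours is the UTC+8 zone used elsewhere in this post.

```python
from datetime import datetime, timedelta, timezone


def to_local(utc_naive, offset_hours):
    """Convert a naive UTC datetime to a fixed-offset local datetime."""
    local_zone = timezone(timedelta(hours=offset_hours))
    return utc_naive.replace(tzinfo=timezone.utc).astimezone(local_zone)


# Test window from the notification: Sunday, August 22, 15:00 to 18:00 UTC.
start_local = to_local(datetime(2021, 8, 22, 15, 0), 8)
end_local = to_local(datetime(2021, 8, 22, 18, 0), 8)

if __name__ == "__main__":
    print(start_local.strftime("%Y-%m-%d %H:%M"))
    print(end_local.strftime("%Y-%m-%d %H:%M"))
```

The end of the window falls on the next calendar day locally, which is easy to miss when only the start time is converted.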

Wait, why pen and paper?

2021-08-22: received the detailed test information

You have applied as a Telegram translator. Today, you are invited to take three aptitude tests one after another.
There will be another opportunity to take the tests for the translator vacancy later this year, however, we recommend that you take the tests today if you have the opportunity as this will count as bonus points for your application. You will only be able to take each of the tests once.
Test Schedule
15:00 UTC. English Language test.
https://quiz.directory/course/pzEAZwEc/quiz/HRLodYvD
Maximum 30 minutes, 1 minute per question.
== 10 minutes break ==
15:40 UTC. Logic / Spatial thinking test.
https://quiz.directory/course/pzEAZwEc/quiz/UnZ5TCfY
Maximum 1 hour, 2 minutes per question.
== 10 minutes break ==
16:50 UTC. Math Test.
https://quiz.directory/course/pzEAZwEc/quiz/eWDdeE3L
Maximum 1 hour, 2 minutes per question.
If you complete the tests in less than the maximum time, you will have longer breaks between tests. Don’t hurry too much — correct answers are more important than speed.
Please complete all of the tests
Many of the questions were intended to be difficult — and you are likely to make at least some mistakes. But don’t worry! The goal here is not to get 100% correct results, but rather to demonstrate how you solve progressively more difficult tasks under stress. We advise you to finish the tests no matter what.
Test 1. English Language
English is the universal language of communication at Telegram. As a translator, it is vital for you to demonstrate exceptional English skills because translating interfaces and other texts requires a deep understanding of the finest details in the original text, especially when it comes to manuals, blog posts, etc.
Test 2. Logic/Spatial
These questions will help you show off how you handle logical tasks. While they are not directly connected with any of the activities you might perform as a potential member of our team, they will help us learn more about you and which types of tasks and formats of work are best suited for you.
Test 3. Math
While math skills are not required for the position you applied for, we would be grateful if you could also complete the full math test — even if you find it difficult. This will help us to know you a little better and may help to unlock additional opportunities, should you become a member of our team. You are welcome to use a calculator for the math tasks.
===
We will send another notification immediately before the task, but you are welcome to begin as soon as the task opens at 15:00 UTC.

There are actually math questions. Well, that must be the reason for the pen and paper.

Again 8 minutes before the test

The aptitude tests are about to begin. Please open the English language test and be ready to start at 15:00 UTC.
Test Schedule
15:00 UTC. English Language test.
https://quiz.directory/course/pzEAZwEc/quiz/HRLodYvD
Maximum 30 minutes, 1 minute per question.
== 10 minutes break ==
15:40 UTC. Logic / Spatial thinking test.
https://quiz.directory/course/pzEAZwEc/quiz/UnZ5TCfY
Maximum 1 hour, 2 minutes per question.
== 10 minutes break ==
16:50 UTC. Math Test.
https://quiz.directory/course/pzEAZwEc/quiz/eWDdeE3L
Maximum 1 hour, 2 minutes per question.
Good luck and see you on the other side! 💪💪

If your account has not applied for a position, opening the link may show

Sorry, the course you are looking for does not exist.

So you cannot look up past test questions.

Actual test questions

I backed up the questions here for reference

English Language test

Quiz: [202108A] Telegram English Language Evaluation Quiz
Questions 30
1 min Per question
Average time 18:16
Community Rating 4.2
Participants 3,279


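[Filler] Some casual notes on importing a certain dataset

2020-09-29T00:00:34.000Z

Last month, by chance, I got hold of 5e (500 million) records from some website I can't identify and 8e (800 million) records from another piece of software. Here is a rough log of loading them into a database.

If this data reminds you of anything, all I can say is: any resemblance is purely coincidental.

Some keywords have been removed.

File name: www_removed_com_tel.zip
File size: 5G

Extracted, it is a single txt of 11.2G. Opening it in Notepad is out of the question; UltraEdit opens it and shows the data structure

XXXXXXXXXXX XXXXXXXXXX

11 digits, then one space, then 10 digits

mongoimport only accepts JSON, CSV and TSV input. JSON is too much trouble, so I chose CSV and replaced the space separator with a comma.

tr -s '[:blank:]' ',' < www_removed_com_tel.txt > w.csv

Import into MongoDB. There is no header row, so the field names have to be given explicitly: --fields="tel.string(),oid.string()"

mongoimport -d removed -c w --type=csv --file w.csv --fields="tel.string(),oid.string()" --columnsHaveTypes -u "myUserAdmin" -p "password" --port 27017 --authenticationDatabase "admin"

Create the indexes, with background specified so they are built in the background; otherwise index creation blocks queries

> db.w.createIndex({"tel":1},{background:true})
> db.w.createIndex({"oid":1},{background:true})

The other 8e dataset uses four hyphens as the separator; tr handles the conversion here too

tr -s '[=-=]' ',' <8e.txt > 8e.csv

Import

mongoimport -d removed -c q --type=csv --file 8e.csv --fields="qq.string(),tel.string()" --columnsHaveTypes -u "myUserAdmin" -p "password" --port 27017 --authenticationDatabase "admin"

Create the indexes

> db.q.createIndex({"qq":1},{background:true})
> db.q.createIndex({"tel":1},{background:true})

One QQ number may be bound to several tel numbers, so the index cannot be a unique index

References: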

https://stackoverflow.com/questions/43108359/how-to-remove-all-special-characters-in-linux-text

https://stackoverflow.com/questions/83329/how-can-i-extract-a-predetermined-range-of-lines-from-a-text-file-on-unix

https://alvinalexander.com/blog/post/linux-unix/how-remove-non-printable-ascii-characters-file-unix/

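Creating a KVM virtual machine template on Proxmox (image heavy)

2020-06-16T00:24:27.000Z

This post records how to create a CentOS 7 KVM virtual machine template on Proxmox, and how to clone the template to create KVM virtual machines.

The previous version had too many problems, so I deleted it and rewrote it. (I googled it myself and still found the old one, how embarrassing.)

References:

Download the system image

This time the template is built from CentOS 7.8.

You can either upload the image file to a storage, or download it to /var/lib/vz/template/iso on the host

cd /var/lib/vz/template/iso/
wget http://mirror.sesp.northwestern.edu/centos/7.8.2003/isos/x86_64/

Create the VM

We first create a VM and then convert it into a template.

Open the Proxmox web management UI:
https://[console-ip]:8006

In the top right corner, click Create VM; this creates a KVM guest
(the other button, Create CT, creates a container, which is LXC rather than OpenVZ in current Proxmox releases)

The VM ID can be changed; I picked a number that will never be taken.

VM name: centos7-template
Disk: 20 G (smaller is better; 5G is enough and makes later backups easier)
CPU: 2 cores, so the install goes a bit faster; this can be changed on the clone.
Memory: 4 G (4096MiB)
Network: bridged to vmbr1 (vmbr1 is the internal network bridge configured earlier)

The configuration summary is as follows

Click "Finish" to save.

Before powering on, add a CloudInit Drive under Hardware and place it on the same storage that holds the VM disk. (It doesn't have to be added before first boot; it only needs to be there before the VM is converted to a template.)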

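Install the system

Start the VM; installation is the same as usual

What needs configuring is the installation destination and the network (network & hostname); adjust the software selection as needed.

Language and similar settings: defaults

Language, location, time zone and keyboard layout stay at their defaults

Disk partitioning

Put the whole disk into the root partition with an ext4 file system, and do not configure LVM (otherwise the disk cannot be grown automatically on clone)

Choose manual partitioning, then click Done in the top left corner.

Select Standard Partition as the partitioning scheme

Then use the + in the bottom left to add a mount point with a capacity of 20 x 1024 = 20480 MiB; for 5G it is 5 x 1024 = 5120.

Set the file system to ext4 and click Update Settings

Partitioning is now done. Click Done in the top left; a warning appears at the bottom of the screen saying that no swap partition is set and recommending one. A template doesn't need it, so click Done again.

After clicking Done again, a dialog lists the changes that will be made; click Accept Changes.

Network configuration

Any hostname will do
For the network, choose manual configuration. I configured an internal IP here; a public IP works too.

Click Configure in the bottom right

Configure as shown in the figure; for a public IP, change the values accordingly. Click Save when done.

After that, also turn on the NIC switch, which is the equivalent of plugging in the cable.

Click Done to return to the main screen, then click Begin Installation in the bottom right.

Here you are asked to set up users; only set the root password and don't create a user.

Entering the system

After installation, check that the VM can reach the internet. curl ip.sb returning the host's IP address means the network works and DNS is fine.

Install any extra packages you need

For example, I install vim, htop and so on

yum update && yum install -y vim htop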

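Disable SELinux and firewalld

I can't get the hang of SELinux, and turning it off is the first thing I do on every fresh CentOS install anyway, so it might as well be off in the template.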

systemctl disable firewalld
setenforce 0
sed -ri '/^[^#]*SELINUX=/s#=.+$#=disabled#' /etc/selinux/config
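To make the sed expression easier to read, here is a rough Python equivalent that works on the file contents as a string. The function name and the sample text are made up for illustration; the two regular expressions are the ones from the sed command.

```python
import re


def disable_selinux(config_text):
    """Set SELINUX=disabled on non-comment lines, like the sed command does."""
    result = []
    for line in config_text.splitlines():
        # Address part of the sed command: a line containing SELINUX=
        # with no "#" anywhere before it.
        if re.match(r"^[^#]*SELINUX=", line):
            # Substitution part: replace everything from the first "=" onward.
            line = re.sub(r"=.+$", "=disabled", line, count=1)
        result.append(line)
    return "\n".join(result) + "\n"


SAMPLE = (
    "# SELINUX= can take one of these three values:\n"
    "SELINUX=enforcing\n"
    "SELINUXTYPE=targeted\n"
)

if __name__ == "__main__":
    print(disable_selinux(SAMPLE), end="")
```

The comment line and the SELINUXTYPE line are left alone because neither matches the address pattern.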

Cloud-init

Cloud-init 会在克隆的时候，进行一些基础配置，比如 IP 配置和用户创建、密码设置。

安装 Cloud-init

1 2	yum install cloud-init cloud-utils qemu-guest-agent systemctl enable qemu-guest-agent

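On CentOS 7 this also pulls in cloud-utils-growpart, the package that grows the disk automatically.

qemu-guest-agent may already be installed by default; it is what lets Proxmox show the VM's IP address.

I could not make sense of the Cloud-init documentation, so the configuration here is kept simple.

Edit the Cloud-init configuration, normally located at /etc/cloud/cloud.cfg

Near the top of the file: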

disable_root: 1
ssh_pwauth: 0

Change it to:

disable_root: 0
ssh_pwauth: 1

disable_root: disables the root account; 0 allows root to log in.
ssh_pwauth: pw stands for password, i.e. SSH password authentication; 1 enables it.
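
The same two-line change can be scripted instead of edited by hand. A minimal sketch, assuming GNU sed and that both keys start at the beginning of a line; allow_root_password_login is a hypothetical helper name, and the demo runs on a scratch file rather than /etc/cloud/cloud.cfg:

```shell
# Flip the two settings in a cloud.cfg-style file.
# allow_root_password_login is an illustrative name; $1 is the file to edit.
allow_root_password_login() {
    sed -i \
        -e 's/^disable_root:.*/disable_root: 0/' \
        -e 's/^ssh_pwauth:.*/ssh_pwauth: 1/' \
        "$1"
}

# Demonstrate on a scratch file rather than the real configuration.
cfg=$(mktemp)
printf 'disable_root: 1\nssh_pwauth:   0\n' > "$cfg"
allow_root_password_login "$cfg"
cat "$cfg"
```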

Network configuration

/etc/sysconfig/network-scripts/ifcfg-eth0 is already configured at this point.

Cleaning up the system

Run the following commands:

yum clean all
> /etc/machine-id
rm -f /etc/ssh/ssh_host_*
rm -rf /root/.ssh/
rm -f /root/anaconda-ks.cfg
rm -f /root/.bash_history
unset HISTFILE
rm -f /var/log/boot.log
rm -f /var/log/cron
rm -f /var/log/dmesg
rm -f /var/log/grubby
rm -f /var/log/lastlog
rm -f /var/log/maillog
rm -f /var/log/messages
rm -f /var/log/secure
rm -f /var/log/spooler
rm -f /var/log/tallylog
rm -f /var/log/wpa_supplicant.log
rm -f /var/log/wtmp
rm -f /var/log/yum.log
rm -f /var/log/audit/audit.log
rm -f /var/log/ovirt-guest-agent/ovirt-guest-agent.log
rm -f /var/log/tuned/tuned.log
rm -f /etc/udev/rules.d/70-persistent-*.rules
sys-unconfig

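The system shuts down after the last command runs.

Enabling the Qemu Agent for the VM

Enable Qemu Agent under Options in the VM panel.

Verifying a clone

Do not rush to convert the VM into a template. Clone it directly first and check for problems, because once it is converted to a template it can no longer be modified, and any problem means reinstalling from scratch.

The clone here is a Full Clone; as the log below shows, a full clone takes a little while.

When the clone is done, configure Cloud-init.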

User: root
Password: 
DNS domain: 1.1.1.1 
DNS servers: 1.1.1.1
SSH public key: none
IP config: ip=192.168.1.101/24,gw=192.168.1.1
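
The form above can also be filled in from the host shell with qm set. The sketch below only prints the command for review instead of running it; VMID 101 and the CHANGE_ME password placeholder are assumptions for illustration, and the option names are the Cloud-Init options of qm as I understand them.

```shell
# Print (not run) a qm command matching the Cloud-Init form above.
# VMID 101 and the password placeholder are assumptions for illustration.
print_cloudinit_cmd() {
    echo "qm set $1" \
        "--ciuser root" \
        "--cipassword 'CHANGE_ME'" \
        "--searchdomain 1.1.1.1" \
        "--nameserver 1.1.1.1" \
        "--ipconfig0 ip=192.168.1.101/24,gw=192.168.1.1"
}

print_cloudinit_cmd 101
```

Review the printed line, replace the placeholder, and run it on the Proxmox host.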

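DNS domain and DNS servers

The default is "use host settings", but in my tests a clone left at the default ended up without any DNS configuration, possibly because cloud-init did not pick up the host's settings. To be safe, fill these in.

IP config

For a public IP, enter the CIDR prefix that the subnet actually uses, because the broadcast address is computed from it. When I entered /32 in testing, the computed values came out wrong.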
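
To illustrate why the prefix length matters, here is a small sketch of the broadcast calculation in plain shell arithmetic. The function names are made up for this example, and this is not how cloud-init itself computes the value; it only shows the arithmetic.

```shell
# Convert a dotted quad to an integer. The subshell body keeps IFS local.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Convert an integer back to a dotted quad.
int_to_ip() {
    echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# Print the broadcast address for an address in CIDR form, e.g. 10.0.0.5/24.
cidr_broadcast() (
    addr=${1%/*}
    prefix=${1#*/}
    ip=$(ip_to_int "$addr")
    mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
    int_to_ip $(( (ip & mask) | (~mask & 0xFFFFFFFF) ))
)

cidr_broadcast 192.168.1.101/24
cidr_broadcast 192.168.1.101/32
```

With /24 the broadcast address is the last address of the subnet; with /32 the subnet holds a single address, so the broadcast collapses onto the VM's own IP and the gateway falls outside the subnet.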

Click Regenerate Image to rebuild the Cloud-init image.

Growing the disk

Hardware - Hard Disk - resize disk

The value entered here is an increment, and the disk can only grow, never shrink.

To go from 20G to 100G, enter 80.
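
The increment is simply the target size minus the current size. A short sketch that computes it and prints what should be the CLI equivalent of the dialog; resize_increment is a made-up helper, and VMID 101 and the disk name scsi0 are assumptions that depend on your VM:

```shell
# Print the increment (in G) needed to grow a disk from $1 to $2.
# Refuses targets that are not larger, since the disk cannot shrink.
resize_increment() {
    if [ "$2" -le "$1" ]; then
        echo "target must be larger than the current size" >&2
        return 1
    fi
    echo $(( $2 - $1 ))
}

inc=$(resize_increment 20 100)
# Printed for review only; VMID 101 and disk scsi0 are assumptions.
echo "qm resize 101 scsi0 +${inc}G"
```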

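If nothing else needs adjusting, boot the VM.

After boot, the IPs field in the VM's Summary shows its IP address.

Converting to a template

In the web interface, right-click the VM that was just shut down and convert it to a template.

Cloning VMs

Right-click the new template and click Clone; you are asked whether to create a linked clone or a full clone.

The difference:

Linked Clone
The cloned system depends on the original; if anything happens to the original, the clone breaks too.
Full Clone
The cloned system is an identical copy of the original and runs independently.

For a complete copy, Full Clone is the usual choice.

For the initial system settings, see the "Verifying a clone" section above.

Then just boot it.
The console (VNC) shows Cloud-init at work.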

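Backing up the template

A backup exports a KVM virtual machine to a file, which makes migration and recovery easier.
Steps: VM panel - Backup, set the backup mode to Stop, then click the backup button.

If the Storage drop-down is empty, none of your storages has backups enabled.

Under Datacenter - Storage, select a storage of Type Dir, click Edit, tick VZDump backup file in the Content drop-down, and save.

A backup is an ordinary file rather than a raw disk image, so it can only be kept on directory-based storage.

Location of the generated vma.gz file:
/var/lib/vz/dump/
Naming pattern:
vzdump-qemu-{vmid}-{year_month_day}-{hour_minute_second}.vma.gz
Example:
vzdump-qemu-100-2018_10_10-23_22_33.vma.gz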
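
When scripting around these archives (for example to find the backups of one VM), the name can be split with plain parameter expansion. A minimal sketch; parse_vzdump_name is a hypothetical helper and assumes the qemu naming pattern shown above:

```shell
# Split a vzdump archive name into VM ID and timestamp.
# parse_vzdump_name is an illustrative helper, not a Proxmox command.
parse_vzdump_name() {
    name=${1##*/}                 # drop any directory part
    rest=${name#vzdump-qemu-}     # vmid-date-time.vma.gz
    vmid=${rest%%-*}              # everything before the first "-"
    stamp=${rest#*-}              # everything after the first "-"
    stamp=${stamp%.vma.gz}        # strip the extension
    echo "$vmid $stamp"
}

parse_vzdump_name /var/lib/vz/dump/vzdump-qemu-100-2018_10_10-23_22_33.vma.gz
```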

(End)

Configuring a private network for Proxmox

2020-06-15T00:00:34.000Z

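References:

I did the math on my IPv4 addresses: the host takes one, which leaves four. One now goes to code-server (the web version of VS Code, which is also what I am writing this post in), because in my tests extensions could not be installed over an IPv6 address. That leaves three. Add a database, and if I want to learn k8s I am down to two IPs, which is rather tight.

Then it occurred to me that I could give the VMs a private network, the setup commonly called a NAT VM. SSH can be forwarded through NAT, or not forwarded at all, since the VMs can still be managed over VNC, just less conveniently.

Adding a virtual bridge on Proxmox

On the Proxmox host, edit /etc/network/interfaces and append a bridge interface vmbr1 at the end, configured as follows: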

auto vmbr1
iface vmbr1 inet static
        address 192.168.1.1
        netmask 255.255.255.0
        bridge_ports none
        bridge_stp off
        bridge_fd 0

        post-up echo 1 > /proc/sys/net/ipv4/ip_forward
        post-up   iptables -t nat -A POSTROUTING -s '192.168.1.0/24' -o vmbr0 -j MASQUERADE
        post-down iptables -t nat -D POSTROUTING -s '192.168.1.0/24' -o vmbr0 -j MASQUERADE

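A quick explanation of the configuration.

The first part is the interface definition.
bridge_ports none means the bridge is not attached to any physical NIC; during installation I had attached vmbr0 to the physical NIC.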
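I have not looked into bridge_stp off and bridge_fd 0 in depth; they disable the Spanning Tree Protocol on the bridge and set its forwarding delay to 0.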

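The second part is the extra configuration tied to the interface.
The prefix post- means "after" or "behind".
post-up and post-down can be read as "after the interface comes up" and "after the interface goes down".

After the interface comes up, IP forwarding is enabled in the kernel, and then a NAT rule is added to iptables: traffic with a source address in '192.168.1.0/24' that leaves through the vmbr0 interface is masqueraded, meaning its source address is rewritten to the host's. -A appends the rule.

After the interface goes down, -D deletes that rule.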
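
The MASQUERADE rule only covers outbound traffic. For the "SSH through NAT" case mentioned earlier, an inbound DNAT rule can follow the same post-up/post-down pattern. The sketch below only prints the two lines to add to the vmbr1 stanza; print_port_forward is a made-up helper, and host port 2222 and the VM address are example values, not part of my setup.

```shell
# Print post-up/post-down lines that forward a host TCP port arriving on
# vmbr0 to a VM. Arguments: host port, VM address, VM port (example values).
print_port_forward() {
    for action in "post-up   iptables -t nat -A" "post-down iptables -t nat -D"; do
        echo "        $action PREROUTING -i vmbr0 -p tcp --dport $1 -j DNAT --to-destination $2:$3"
    done
}

print_port_forward 2222 192.168.1.2 22
```

With such a rule in place, connecting to port 2222 on the host's public address would reach SSH on the VM.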

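Note:
This configuration file gets no syntax check, so make sure it is correct. A single mistyped letter and your host is offline after the restart.

Once you are sure, restart the networking service to apply the change. It is best to shut down the VMs before restarting.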

/etc/init.d/networking restart

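If the configuration is fine, the shell reconnects right away.

If a VM was running while the networking service restarted, its network drops, and the VM has to be rebooted.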

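Network configuration on the VM

The VM's network settings are as follows:

IP address:  192.168.1.2
Netmask:     255.255.255.0
Gateway:     192.168.1.1

DNS:         1.1.1.1

Use an external DNS server. Entering the gateway address does not work, because nothing on the host is handling DNS.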
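
On a CentOS 7 guest these values would typically end up in an ifcfg file. A minimal sketch, assuming a CentOS-style guest and an interface named eth0 (neither is stated for this particular VM); write_ifcfg is a hypothetical helper, and the demo writes to a scratch file:

```shell
# Write a static ifcfg file with the values listed above.
# The CentOS-style format and the eth0 device name are assumptions.
write_ifcfg() {
    cat > "$1" <<'EOF'
DEVICE=eth0
BOOTPROTO=none
ONBOOT=yes
IPADDR=192.168.1.2
NETMASK=255.255.255.0
GATEWAY=192.168.1.1
DNS1=1.1.1.1
EOF
}

# Written to a scratch file here; on a real guest the target would be
# /etc/sysconfig/network-scripts/ifcfg-eth0.
out=$(mktemp)
write_ifcfg "$out"
cat "$out"
```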