diff --git a/2026 Ubuntu 安装.md b/2026 Ubuntu 安装.md index e4fe4cd..770afd5 100644 --- a/2026 Ubuntu 安装.md +++ b/2026 Ubuntu 安装.md @@ -1,3 +1,88 @@ + + +20260525 +下载 https://cn.ubuntu.com/download 最新 Ubuntu 桌面系统 26.04 LTS + +sudo umount /dev/sdX* +sudo dd if=/path/to/image.iso of=/dev/sdX bs=4M status=progress +sync 验证数据同步完成 + + + +## 软链接 + +```bash +touch ~/资源/@/@000000/2026.******.txt +ln -s ~/资源/@/@000000/2026.******.txt ~ 建立软链接 +``` + +## 微信 Linux 版 +https://linux.weixin.qq.com/ + +WeChatLinux_x86_64.deb 下载 deb包 +sudo dpkg -i WeChatLinux_x86_64.deb + + +## 邮箱 + +网易邮箱大师 +https://dashi.163.com/ +sudo dpkg -i mail.deb +sudo apt-get install -f 修复依赖包缺失 + +## 安装 Typora + + [下载]( https://release-assets.githubusercontent.com/github-production-release-asset/387719913/8fc9123c-8ab6-42c4-beb8-51aed8a90380?sp=r&sv=2018-11-09&sr=b&spr=https&se=2026-02-23T12%3A07%3A58Z&rscd=attachment%3B+filename%3DTypora_Linux_0.11.18_amd64.deb&rsct=application%2Foctet-stream&skoid=96c2d410-5711-43a1-aedd-ab1947aa7ab0&sktid=398a6654-997b-47e9-b12b-9515b896b4de&skt=2026-02-23T11%3A07%3A11Z&ske=2026-02-23T12%3A07%3A58Z&sks=b&skv=2018-11-09&sig=WkCLwB1oETFZhzyEfwv%2Bg6rgXToyxVjTEoHs%2FhNv5Vk%3D&jwt=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmVsZWFzZS1hc3NldHMuZ2l0aHVidXNlcmNvbnRlbnQuY29tIiwia2V5Ijoia2V5MSIsImV4cCI6MTc3MTg0ODIwNCwibmJmIjoxNzcxODQ2NDA0LCJwYXRoIjoicmVsZWFzZWFzc2V0cHJvZHVjdGlvbi5ibG9iLmNvcmUud2luZG93cy5uZXQifQ.t0qa0XCAmHMMaI1riV8M9-VMB4v932b0Uhb-5QJtbeM&response-content-disposition=attachment%3B%20filename%3DTypora_Linux_0.11.18_amd64.deb&response-content-type=application%2Foctet-stream) 0.11.18 或安装 /home/zimy/Tdisk/deb安装备份/Typora_Linux_0.11.18_amd64.deb + +```sh +sudo apt install ./Typora_Linux_0.11.18_amd64.deb +``` +sudo apt install pandoc + +安装主题 zeus 和 Tailwind + + + +## 安装 VPN + +https://ikuuu.win/user/tutorial?os=linux&client=ikuuu-vpn + +sudo dpkg -i ikuuu_vpn-0.16.4-b3a29b9a-linux-amd64.deb + +sudo apt-get install -f 修复依赖包缺失 +ikuuuvpn 打开客户端 + 
+### DNS + +``` +resolvectl status 查看当前真实使用的 DNS +sudo nano /etc/systemd/resolved.conf +增加 +DNS=1.1.1.1 8.8.8.8 +FallbackDNS=1.0.0.1 8.8.4.4 + +sudo systemctl restart systemd-resolved 重启服务 +``` + + + +## 便签 + +安装短裤便签 + +```bash +$ sudo apt install flatpak 安装 flatpak 应用商店 +$ flatpak remote-add --if-not-exists flathub https://flathub.org/repo/flathub.flatpakrepo 添加 flathub 的远程仓库 +$ flatpak install flathub io.github.ellie_commons.jorts +``` + +## + + + + + + sudo apt upgrade 其他包管理工具 @@ -42,41 +127,16 @@ https://extensions.gnome.org/extension/261/kimpanel/ https://blog.csdn.net/KaminZzz/article/details/160000087?spm=1001.2101.3001.6650.2&utm_medium=distribute.pc_relevant.none-task-blog-2%7Edefault%7EYuanLiJiHua%7EPosition-2-160000087-blog-158711659.235%5Ev43%5Epc_blog_bottom_relevance_base5&depth_1-utm_source=distribute.pc_relevant.none-task-blog-2%7Edefault%7EYuanLiJiHua%7EPosition-2-160000087-blog-158711659.235%5Ev43%5Epc_blog_bottom_relevance_base5&utm_relevant_index=5 -## 安装微信 -4.1.0 下载 https://dldir1v6.qq.com/weixin/Universal/Linux/WeChatLinux_x86_64.deb -sudo dpkg -i WeChatLinux_x86_64.deb -## 安装 Typora - [下载]( 
https://release-assets.githubusercontent.com/github-production-release-asset/387719913/8fc9123c-8ab6-42c4-beb8-51aed8a90380?sp=r&sv=2018-11-09&sr=b&spr=https&se=2026-02-23T12%3A07%3A58Z&rscd=attachment%3B+filename%3DTypora_Linux_0.11.18_amd64.deb&rsct=application%2Foctet-stream&skoid=96c2d410-5711-43a1-aedd-ab1947aa7ab0&sktid=398a6654-997b-47e9-b12b-9515b896b4de&skt=2026-02-23T11%3A07%3A11Z&ske=2026-02-23T12%3A07%3A58Z&sks=b&skv=2018-11-09&sig=WkCLwB1oETFZhzyEfwv%2Bg6rgXToyxVjTEoHs%2FhNv5Vk%3D&jwt=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmVsZWFzZS1hc3NldHMuZ2l0aHVidXNlcmNvbnRlbnQuY29tIiwia2V5Ijoia2V5MSIsImV4cCI6MTc3MTg0ODIwNCwibmJmIjoxNzcxODQ2NDA0LCJwYXRoIjoicmVsZWFzZWFzc2V0cHJvZHVjdGlvbi5ibG9iLmNvcmUud2luZG93cy5uZXQifQ.t0qa0XCAmHMMaI1riV8M9-VMB4v932b0Uhb-5QJtbeM&response-content-disposition=attachment%3B%20filename%3DTypora_Linux_0.11.18_amd64.deb&response-content-type=application%2Foctet-stream) 0.11.18 或安装 /home/zimy/Tdisk/deb安装备份/Typora_Linux_0.11.18_amd64.deb -```sh -sudo apt install ./Typora_Linux_0.11.18_amd64.deb -``` - -## 安装 Dove -看 /home/zimy/Tdisk/deb安装备份/dove + ## 安装 chrome wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb - -## 安装 office -卸载现有版本 -sudo apt remove --purge libreoffice* -sudo apt autoremove - -更新包列表 -sudo apt update - -安装中文语言包和核心组件 -sudo apt install libreoffice-l10n-zh-cn libreoffice-help-zh-cnz在 - -安装核心程序、Writer、Calc、Impress -sudo apt install libreoffice-core libreoffice-writer libreoffice-calc libreoffice-impress - ## 网页版AI应用封装为桌面应用 ``` # 需先安装 Node.js 和 npm @@ -132,16 +192,6 @@ https://atrustcdn.sangfor.com/standard/linux/2.5.16.20/ubuntu/amd64/aTrustInstal sudo dpkg -i aTrustInstaller_amd64.deb -## 便签 - -安装短裤便签 - -```bash -$ sudo apt install flatpak 安装 flatpak 应用商店 -$ flatpak remote-add --if-not-exists flathub https://flathub.org/repo/flathub.flatpakrepo 添加 flathub 的远程仓库 -$ flatpak install flathub io.github.ellie_commons.jorts -``` - ## 修改拼音的键盘布局 将键盘布局改为标准美式,避免“死键”行为: @@ 
-150,61 +200,10 @@ $ flatpak install flathub io.github.ellie_commons.jorts gsettings set org.gnome.desktop.input-sources sources "[('xkb', 'us'), ('ibus', 'libpinyin')]" ``` -## 邮箱 -网易邮箱大师 -``` -cd /home/zimy/Tdisk/deb安装备份 -sudo dpkg -i mail.deb -sudo apt-get install -f 修复依赖包缺失 -``` -## 开发环境 - -### DNS & Docker - -``` -resolvectl status 查看当前真实使用的 DNS -sudo nano /etc/systemd/resolved.conf -增加 -DNS=1.1.1.1 8.8.8.8 -FallbackDNS=1.0.0.1 8.8.4.4 - -sudo systemctl restart systemd-resolved 重启服务 -``` - -安装 Docker - -``` -# 添加 Docker 官方 GPG 密钥 -curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg - -# 添加 Docker 仓库 -echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null - -# 安装 Docker -sudo apt update -sudo apt install docker-ce docker-ce-cli containerd.io - -# 验证 -$ docker version 29.2.1 -$ docker compose version v5.1.0 -``` - -sudo vim /etc/docker/daemon.json - -``` -{ - "registry-mirrors": [ - "https://docker.1ms.run", - "https://docker-0.unsee.tech", - "https://docker.m.daocloud.io" - ], - "live-restore": true, - "features": { "buildkit": true } -} -``` +## ### python diff --git a/Docker技术.md b/Docker技术.md new file mode 100644 index 0000000..c802341 --- /dev/null +++ b/Docker技术.md @@ -0,0 +1,84 @@ +# Docker相关 + +## 安装 + +``` +# 添加 Docker 官方 GPG 密钥 +curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg + +# 添加 Docker 仓库 +echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null + +# 安装 Docker +sudo apt update +sudo apt install docker-ce docker-ce-cli containerd.io + +# 验证 +$ docker 
version   # 29.5.2
+$ docker compose version   # v5.1.4
+```
+
+sudo vim /etc/docker/daemon.json
+
+```
+{
+  "registry-mirrors": [
+    "https://docker.1ms.run",
+    "https://docker-0.unsee.tech",
+    "https://docker.m.daocloud.io"
+  ],
+  "live-restore": true,
+  "features": { "buildkit": true }
+}
+```
+
+### Add the current user to the docker group
+
+This is the usual way to run Docker without typing `sudo` every time (note that membership in the `docker` group is effectively root-level access):
+
+```
+# 1. Add the current user to the docker group
+sudo usermod -aG docker $USER
+
+# 2. Refresh group membership (or log out of the terminal completely and log back in)
+newgrp docker
+```
+
+## Usage
+
+```
+# Docker notes
+
+# Give the current user permission to run docker
+sudo usermod -aG docker $USER
+newgrp docker
+
+# SaaS image release: submit-, then on the remote host run docker load -i my*.tar and restart the container
+docker compose logs coze-server --tail 10 | grep -i "error"   # look for errors
+docker logs container_name 2>&1 | grep -C 20 "error"          # show 20 lines of context around each error
+docker compose exec <service> /bin/sh                         # open a shell in the container (or /bin/bash)
+docker image prune -a                                         # remove all unused images
+docker rmi $(docker images -q)                                # remove all images
+docker stop $(docker ps -q) ; docker rm $(docker ps -aq)      # stop, then remove all containers
+docker pull coleifer/sqlite-web:latest                        # pull an image
+docker save coleifer/sqlite-web:latest -o sqlite_web_latest.tar   # save the image for upload
+docker load -i sqlite_web_latest.tar                          # load the image on the server
+
+# Clear logs (the log files normally live under /var/lib/docker, so truncate needs root):
+sudo truncate -s 0 $(docker inspect --format='{{.LogPath}}' $(docker compose ps -q))
+```
+
+ - Restart all services: `docker compose restart`
+ - Restart only the gateway service: `docker compose restart hermes`
+ - Follow the live logs of all services: `docker compose logs -f`
+ - Upgrade to the latest image version: `docker compose pull && docker compose up -d`
+ - Stop all services: `docker compose down`
+
+Target a specific compose file, for example production (`ps` lists its services; use `up -d` to start them): docker compose -f docker-compose.prod.yml ps
\ No newline at end of file
diff --git a/GIT技术.md b/GIT技术.md
new file mode 100644
index 0000000..2eae5d2
--- /dev/null
+++ b/GIT技术.md
@@ -0,0 +1,185 @@
+# Git features and usage
+
+## Installation
+
+```bash
+sudo apt install git
+sudo apt install gitk git-cola   # graphical tools
+
+# Global settings
+git config --global user.email "zimyx@.com"
+git config --global user.name "Your Name"
+```
+
+Company: http://117.78.60.236:8000/users/sign_in zhanghonggang@csbr.cn w~7!
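The identity settings from the installation step can be tried out without touching the real global configuration. The following is a minimal sketch, assuming `git` is installed; the temporary file and the `you@example.com` address are placeholders introduced here for illustration and do not come from these notes. The `--file` option points `git config` at an arbitrary file, so the same write and read commands can be exercised safely:

```shell
# Assumption: git is installed. A throwaway file stands in for ~/.gitconfig.
cfg="$(mktemp)"

# Write the two identity fields (placeholder values, not taken from the notes).
git config --file "$cfg" user.name "Your Name"
git config --file "$cfg" user.email "you@example.com"

# Read them back, the same way `git config --global --get ...` reads the real file.
name="$(git config --file "$cfg" --get user.name)"
email="$(git config --file "$cfg" --get user.email)"
printf '%s <%s>\n' "$name" "$email"

# Clean up the throwaway file.
rm -f "$cfg"
```

Against the real configuration, `git config --global --get user.email` reads back the value written by the `--global` commands.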
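+
+## Branch management conventions
+
+Branch types and naming conventions
+
+### Branch naming format
+
+| Branch type | Naming format | Example |
+|---------|---------|------|
+| Main branch | `main` | `main` |
+| Development branch | `dev` | `dev` |
+| Feature branch | `feat/<feature-name>` | `feat/event`, `feat/stream` |
+| Hotfix branch | `hotfix/<issue-name>` | `hotfix/event` |
+| Release branch | `release/<version>` | `release/v1.0.0` |
+
+### Branch purposes
+
+| Branch type | Purpose |
+|---------|---------|
+| `main` | Stable branch; corresponds directly to the code running in production |
+| `dev` | Main development branch; every feature branch is merged here first |
+| `feat/*` | Feature branch, used to develop a new feature |
+| `hotfix/*` | Hotfix branch, used for urgent fixes to production issues |
+| `release/*` | Release branch, used to prepare a version for release |
+
+### Branch workflow rules
+
+#### Core rule: dev must not be merged into main
+
+**Important constraint: the `dev` branch must never be merged directly into `main`.**
+
+- `dev` is the development branch; its code has not been fully tested and is not fit to ship straight to production
+- Everything merged into `main` must arrive through a `release/*` branch or directly from a `hotfix/*` branch
+
+#### Standard merge flow
+
+```
+Feature development flow:
+feat/* → dev → release/* → main
+
+Hotfix flow:
+hotfix/* → main (may be merged directly in an emergency)
+hotfix/* → dev (sync the fix back to the development branch)
+```
+
+#### Merge rules per branch
+
+| From | Into | Allowed | Notes |
+|-------|-------|---------|------|
+| `feat/*` | `dev` | ✅ Allowed | Merge into the development branch once the feature is complete |
+| `dev` | `main` | ❌ Forbidden | dev must not be merged directly into main |
+| `dev` | `release/*` | ✅ Allowed | The development branch may be merged into a release branch |
+| `release/*` | `main` | ✅ Allowed | A release branch is merged into main after it passes testing |
+| `release/*` | `dev` | ✅ Allowed | A release branch may be synced back to the development branch |
+| `hotfix/*` | `main` | ✅ Allowed | A hotfix may be merged directly into main |
+| `hotfix/*` | `dev` | ✅ Allowed | A hotfix must also be synced to the development branch |
+| `feat/*` | `main` | ❌ Forbidden | Feature branches must not be merged directly into main |
+| `feat/*` | other `feat/*` | ❌ Forbidden | Feature branches must not be merged into each other |
+
+### Current project branch layout
+
+```
+Local branches:
+  - dev (main development branch, current working branch)
+  - feat/event (event feature branch)
+  - feat/stream (stream feature branch)
+  - hotfix/event (event hotfix branch)
+  - main (production branch)
+
+Remote branches:
+  - origin/main
+  - origin/dev
+  - origin/feat/event
+```
+
+### Git operation guide
+
+#### Create a feature branch
+
+```bash
+# Create a feature branch from main
+git checkout main
+git pull origin main
+git checkout -b feat/your-feature-name
+```
+
+#### Merge a feature into dev
+
+```bash
+# Once the feature is finished, switch to the dev branch
+git checkout dev
+git pull origin dev
+
+# Merge the feature branch
+git merge feat/your-feature-name
+
+# Push the dev branch
+git push origin dev
+```
+
+#### Create a release branch
+
+```bash
+# Create a release branch from dev
+git checkout dev
+git pull origin dev
+git checkout -b release/v1.0.0
+```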
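The merge rules table above can also be written down as a small check to run before merging. This is only a sketch: the function name `merge_allowed` is hypothetical, and treating every branch pair that the table does not list as forbidden is an assumption made here, not a rule stated in these notes.

```shell
# merge_allowed SOURCE TARGET
# Exit status 0 when the merge rules table permits merging SOURCE into TARGET,
# 1 otherwise. Pairs not listed in the table are rejected (an assumption).
merge_allowed() {
  case "$1:$2" in
    feat/*:dev)      return 0 ;;  # finished feature goes to dev
    dev:release/*)   return 0 ;;  # dev feeds a release branch
    release/*:main)  return 0 ;;  # tested release goes to main
    release/*:dev)   return 0 ;;  # release synced back to dev
    hotfix/*:main)   return 0 ;;  # hotfix goes straight to main
    hotfix/*:dev)    return 0 ;;  # hotfix synced to dev
    *)               return 1 ;;  # dev -> main, feat -> main, feat -> feat, anything else
  esac
}

# Example: report the verdict for a few branch pairs.
for pair in "feat/event dev" "dev main" "release/v1.0.0 main"; do
  set -- $pair
  if merge_allowed "$1" "$2"; then
    echo "$1 -> $2: allowed"
  else
    echo "$1 -> $2: forbidden"
  fi
done
```

A check like this could be called at the start of a merge script, for example `merge_allowed "$(git branch --show-current)" main || exit 1`, although a rule that must not be bypassed is better enforced on the server side.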
+
+#### Publish to main
+
+```bash
+# After testing is complete on the release branch, merge it into main
+git checkout main
+git pull origin main
+git merge release/v1.0.0
+git push origin main
+
+# Delete the release branch
+git branch -d release/v1.0.0
+git push origin --delete release/v1.0.0
+```
+
+#### Hotfix procedure
+
+```bash
+# Create a hotfix branch from main
+git checkout main
+git pull origin main
+git checkout -b hotfix/issue-description
+
+# Once the fix is done, merge it into main
+git checkout main
+git merge hotfix/issue-description
+git push origin main
+
+# Merge it into dev as well
+git checkout dev
+git merge hotfix/issue-description
+git push origin dev
+
+# Delete the hotfix branch
+git branch -d hotfix/issue-description
+```
+
+```
+git add .
+git commit -m "Update code"
+git push
+```
+
diff --git a/Linux技术.md b/Linux技术.md
new file mode 100644
index 0000000..41209e2
--- /dev/null
+++ b/Linux技术.md
@@ -0,0 +1,44 @@
+
+- Show listeners on port 9090 and ports 789*: `ss -tunlp | grep -E '9090|789.'`
+- Find every file ending in .php under the root directory: `find / -name '*.php'`
+- Batch-convert docx to md: `for file in *.docx; do pandoc "$file" -o "${file%.docx}.md"; done`
+- Convert pdf to docx: `pdf2docx convert a.pdf a.docx`
+- Top 5 processes by CPU usage: `ps aux --sort=-%cpu | head -n 6`; memory usage: `free -h`
+- Generate a random password from 12 random bytes (16 base64 characters): `openssl rand -base64 12`
+- Look up this machine's outbound IPv4 address (change 4 to 6 for IPv6): `curl 4.ipw.cn`
+- Run a directory listing every second: `watch -n 1 'ls -l'`
+- Create a tar archive (uncompressed): `tar -cvf coze-studio.tar coze-studio`
+- Extract it: `tar -xvf coze-studio.tar`
+- Mount a remote directory: in Nautilus (GNOME Files), open "Other Locations" and enter sftp://1.95.127.205
+
+sudo apt update
+sudo apt upgrade
+
+=================================================== Shortcuts
+Alt + F2: run an application    Copilot (AI key): open a terminal (cmd)
+WIN + Alt + → / ←: switch between workspaces    WIN + `: switch between windows of the same application
+Shift + numpad 0~5: 0 screenshot, 1 files, 2 Firefox, 3 mail, 4 Typora, 5 notes, .
camera
+Emoji symbols: entered through a Firefox extension
+
+```
+VIM shortcuts
+yy -> p (paste)   copy the current line      nyy -> p (paste)   copy n lines starting at the current one
+dd   delete the current line      u   undo      5 -> Shift+g   jump to line 5
+/xx -> n (next)   search for text; press N for the previous match
+G   end of file      gg   first line      :set nu   show line numbers
+==========================================================
+```
+
+==============
+- Install ufw (firewall): `sudo apt install ufw` (on Arch Linux: `sudo pacman -S ufw`)
+- Enable ufw: `sudo ufw enable` (when connected over SSH, allow ssh first so the session is not cut off)
+- Set the default policy to deny all incoming and outgoing connections: `sudo ufw default deny incoming` and `sudo ufw default deny outgoing` (denying outgoing blocks all outbound traffic until it is explicitly allowed)
+- Allow specific services and ports, for example SSH: `sudo ufw allow ssh` or `sudo ufw allow 22/tcp`
+- Finally, check the ufw status: `sudo ufw status`
diff --git a/Python技术.md b/Python技术.md
new file mode 100644
index 0000000..3d425f9
--- /dev/null
+++ b/Python技术.md
@@ -0,0 +1,21 @@
+
+
+```
+# PYTHON
+sudo apt install python3-pip python3-dev   # pip and the Python development headers (Python 3.12.3)
+python3 -m venv .venv                      # create a virtual environment in the current directory
+source .venv/bin/activate                  # once active, the prompt shows the (.venv) prefix
+
+# Managing multiple Python versions with pyenv
+$ curl https://pyenv.run | bash            # install pyenv
+echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bashrc
+echo '[[ -d $PYENV_ROOT/bin ]] && export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bashrc
+echo 'eval "$(pyenv init - bash)"' >> ~/.bashrc
+source ~/.bashrc
+pyenv install 3.10.0                       # install Python 3.10
+pyenv global 3.10.0                        # set the global Python version
+# Enter the project directory, then create a virtual environment with Python 3.10
+python3.10 -m venv .venv_py310
+source .venv_py310/bin/activate
+```
+
diff --git a/hermes-agent.md b/hermes-agent.md
new file mode 100644
index 0000000..206be0c
--- /dev/null
+++ b/hermes-agent.md
@@ -0,0 +1,616 @@
+# Hermes Agent 0.9.0
+
+https://hermes-agent.nousresearch.com/docs/user-guide/docker?_highlight=docker#docker-compose-example
+
+1.95.126.170
+
+## Deployment
+
+/www/hermes/docker-compose.yml
+
+```yml
+services:
+  hermes:
+    image: nousresearch/hermes-agent:latest
+    container_name: hermes
+    restart: unless-stopped
+    command: gateway run
+    ports:
+      - "8642:8642"
+    volumes:
+      - ~/.hermes:/opt/data
+    networks:
+      - hermes-net
+    # Uncomment to forward specific env vars instead of using .env file:
+    # environment:
+    #   - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
+    #   - OPENAI_API_KEY=${OPENAI_API_KEY}
+    #   - 
TELEGRAM_BOT_TOKEN=${TELEGRAM_BOT_TOKEN} + deploy: + resources: + limits: + memory: 4G + cpus: "2.0" + + dashboard: + image: nousresearch/hermes-agent:latest + container_name: hermes-dashboard + restart: unless-stopped + command: dashboard --host 0.0.0.0 --insecure + ports: + - "9119:9119" + volumes: + - ~/.hermes:/opt/data + environment: + - GATEWAY_HEALTH_URL=http://hermes:8642 + networks: + - hermes-net + depends_on: + - hermes + deploy: + resources: + limits: + memory: 512M + cpus: "0.5" + +networks: + hermes-net: + driver: bridge +``` + +## 配置 + +cat ~/.hermes/.env 配置仪表盘、模型、消息平台 + +```ini +# Hermes Agent Environment Configuration +# Copy this file to .env and fill in your API keys + +# ============================================================================= +# LLM PROVIDER (OpenRouter) +# ============================================================================= +# OpenRouter provides access to many models through one API +# All LLM calls go through OpenRouter - no direct provider keys needed +# Get your key at: https://openrouter.ai/keys +# OPENROUTER_API_KEY= + +# Default model is configured in ~/.hermes/config.yaml (model.default). +# Use 'hermes model' or 'hermes setup' to change it. +# LLM_MODEL is no longer read from .env — this line is kept for reference only. +# LLM_MODEL=anthropic/claude-opus-4.6 + +# ============================================================================= +# LLM PROVIDER (Google AI Studio / Gemini) +# ============================================================================= +# Native Gemini API via Google's OpenAI-compatible endpoint. 
+# Get your key at: https://aistudio.google.com/app/apikey +# GOOGLE_API_KEY=your_google_ai_studio_key_here +# GEMINI_API_KEY=your_gemini_key_here # alias for GOOGLE_API_KEY +# Optional base URL override (default: Google's OpenAI-compatible endpoint) +# GEMINI_BASE_URL=https://generativelanguage.googleapis.com/v1beta/openai + +# ============================================================================= +# LLM PROVIDER (z.ai / GLM) +# ============================================================================= +# z.ai provides access to ZhipuAI GLM models (GLM-4-Plus, etc.) +# Get your key at: https://z.ai or https://open.bigmodel.cn +# GLM_API_KEY= +# GLM_BASE_URL=https://api.z.ai/api/paas/v4 # Override default base URL + +# ============================================================================= +# LLM PROVIDER (Kimi / Moonshot) +# ============================================================================= +# Kimi Code provides access to Moonshot AI coding models (kimi-k2.5, etc.) +# Get your key at: https://platform.kimi.ai (Kimi Code console) +# Keys prefixed sk-kimi- use the Kimi Code API (api.kimi.com) by default. +# Legacy keys from platform.moonshot.ai need KIMI_BASE_URL override below. 
+# KIMI_API_KEY=
+# KIMI_BASE_URL=https://api.kimi.com/coding/v1 # Default for sk-kimi- keys
+# KIMI_BASE_URL=https://api.moonshot.ai/v1 # For legacy Moonshot keys
+# KIMI_BASE_URL=https://api.moonshot.cn/v1 # For Moonshot China keys
+# KIMI_CN_API_KEY= # Dedicated Moonshot China key
+
+# =============================================================================
+# LLM PROVIDER (Arcee AI)
+# =============================================================================
+# Arcee AI provides access to Trinity models (trinity-mini, trinity-large-*)
+# Get an Arcee key at: https://chat.arcee.ai/
+# ARCEEAI_API_KEY=
+# ARCEE_BASE_URL= # Override default base URL
+
+# =============================================================================
+# LLM PROVIDER (MiniMax)
+# =============================================================================
+# MiniMax provides access to MiniMax models (global endpoint)
+# Get your key at: https://www.minimax.io
+# MINIMAX_API_KEY=
+# MINIMAX_BASE_URL=https://api.minimax.io/v1 # Override default base URL
+
+# MiniMax China endpoint (for users in mainland China)
+# MINIMAX_CN_API_KEY=
+# MINIMAX_CN_BASE_URL=https://api.minimaxi.com/v1 # Override default base URL
+
+# =============================================================================
+# LLM PROVIDER (OpenCode Zen)
+# =============================================================================
+# OpenCode Zen provides curated, tested models (GPT, Claude, Gemini, MiniMax, GLM, Kimi)
+# Pay-as-you-go pricing. Get your key at: https://opencode.ai/auth
+# OPENCODE_ZEN_API_KEY=
+# OPENCODE_ZEN_BASE_URL=https://opencode.ai/zen/v1 # Override default base URL
+
+# =============================================================================
+# LLM PROVIDER (OpenCode Go)
+# =============================================================================
+# OpenCode Go provides access to open models (GLM-5, Kimi K2.5, MiniMax M2.5)
+# $10/month subscription. 
Get your key at: https://opencode.ai/auth +# OPENCODE_GO_API_KEY= + +# ============================================================================= +# LLM PROVIDER (Hugging Face Inference Providers) +# ============================================================================= +# Hugging Face routes to 20+ open models via unified OpenAI-compatible endpoint. +# Free tier included ($0.10/month), no markup on provider rates. +# Get your token at: https://huggingface.co/settings/tokens +# Required permission: "Make calls to Inference Providers" +# HF_TOKEN= +# OPENCODE_GO_BASE_URL=https://opencode.ai/zen/go/v1 # Override default base URL + +# ============================================================================= +# LLM PROVIDER (Qwen OAuth) +# ============================================================================= +# Qwen OAuth reuses your local Qwen CLI login (qwen auth qwen-oauth). +# No API key needed — credentials come from ~/.qwen/oauth_creds.json. +# Optional base URL override: +# HERMES_QWEN_BASE_URL=https://portal.qwen.ai/v1 + +# ============================================================================= +# LLM PROVIDER (Xiaomi MiMo) +# ============================================================================= +# Xiaomi MiMo models (mimo-v2-pro, mimo-v2-omni, mimo-v2-flash). 
+# Get your key at: https://platform.xiaomimimo.com +# XIAOMI_API_KEY=your_key_here +# Optional base URL override: +# XIAOMI_BASE_URL=https://api.xiaomimimo.com/v1 + +# ============================================================================= +# TOOL API KEYS +# ============================================================================= + +# Exa API Key - AI-native web search and contents +# Get at: https://exa.ai +# EXA_API_KEY= + +# Parallel API Key - AI-native web search and extract +# Get at: https://parallel.ai +# PARALLEL_API_KEY= + +# Firecrawl API Key - Web search, extract, and crawl +# Get at: https://firecrawl.dev/ +# FIRECRAWL_API_KEY= + + +# FAL.ai API Key - Image generation +# Get at: https://fal.ai/ +# FAL_KEY= + +# Honcho - Cross-session AI-native user modeling (optional) +# Builds a persistent understanding of the user across sessions and tools. +# Get at: https://app.honcho.dev +# Also requires ~/.honcho/config.json with enabled=true (see README). +# HONCHO_API_KEY= + +# ============================================================================= +# TERMINAL TOOL CONFIGURATION +# ============================================================================= +# Backend type: "local", "singularity", "docker", "modal", or "ssh" +# Terminal backend is configured in ~/.hermes/config.yaml (terminal.backend). +# Use 'hermes setup' or 'hermes config set terminal.backend docker' to change. +# Supported: local, docker, singularity, modal, ssh +# +# Only override here if you need to force a backend without touching config.yaml: +# TERMINAL_ENV=local + +# Override the container runtime binary (e.g. to use Podman instead of Docker). +# Useful on systems where Docker's storage driver is broken or unavailable. 
+# HERMES_DOCKER_BINARY=/usr/local/bin/podman + +# Container images (for singularity/docker/modal backends) +# TERMINAL_DOCKER_IMAGE=nikolaik/python-nodejs:python3.11-nodejs20 +# TERMINAL_SINGULARITY_IMAGE=docker://nikolaik/python-nodejs:python3.11-nodejs20 +TERMINAL_MODAL_IMAGE=nikolaik/python-nodejs:python3.11-nodejs20 + + +# Working directory for terminal commands +# For local backend: "." means current directory (resolved automatically) +# For remote backends (ssh/docker/modal/singularity): use an absolute path +# INSIDE the target environment, or leave unset for the backend's default +# (/root for modal, / for docker, ~ for ssh). Do NOT use a host-local path. +# Usually managed by config.yaml (terminal.cwd) — uncomment to override +# TERMINAL_CWD=. + +# Default command timeout in seconds +TERMINAL_TIMEOUT=60 + +# Cleanup inactive environments after this many seconds +TERMINAL_LIFETIME_SECONDS=300 + +# ============================================================================= +# SSH REMOTE EXECUTION (for TERMINAL_ENV=ssh) +# ============================================================================= +# Run terminal commands on a remote server via SSH. +# Agent code stays on your machine, commands execute remotely. +# +# SECURITY BENEFITS: +# - Agent cannot read your .env file (API keys protected) +# - Agent cannot modify its own code +# - Remote server acts as isolated sandbox +# - Can safely configure passwordless sudo on remote +# +# TERMINAL_SSH_HOST=192.168.1.100 +# TERMINAL_SSH_USER=agent +# TERMINAL_SSH_PORT=22 +# TERMINAL_SSH_KEY=~/.ssh/id_rsa + +# ============================================================================= +# SUDO SUPPORT (works with ALL terminal backends) +# ============================================================================= +# If set, enables sudo commands by piping password via `sudo -S`. +# Works with: local, docker, singularity, modal, and ssh backends. +# +# SECURITY WARNING: Password stored in plaintext. 
Only use on trusted machines. +# +# ALTERNATIVES: +# - For SSH backend: Configure passwordless sudo on the remote server +# - For containers: Run as root inside the container (no sudo needed) +# - For local: Configure /etc/sudoers for specific commands +# - For CLI: Leave unset - you'll be prompted interactively with 45s timeout +# +# SUDO_PASSWORD=your_password_here + +# ============================================================================= +# MODAL CLOUD BACKEND (Optional - for TERMINAL_ENV=modal) +# ============================================================================= +# Modal uses CLI authentication, not environment variables. +# Run: pip install modal && modal setup +# This will authenticate via browser and store credentials locally. +# No API key needed in .env - Modal handles auth automatically. + +# ============================================================================= +# BROWSER TOOL CONFIGURATION (agent-browser + Browserbase) +# ============================================================================= +# Browser automation requires Browserbase cloud service for remote browser execution. +# This allows the agent to navigate websites, fill forms, and extract information. 
+# +# STEALTH MODES: +# - Basic Stealth: ALWAYS active (random fingerprints, auto CAPTCHA solving) +# - Advanced Stealth: Requires BROWSERBASE_ADVANCED_STEALTH=true (Scale Plan only) + +# Browserbase API Key - Cloud browser execution +# Get at: https://browserbase.com/ +# BROWSERBASE_API_KEY= + +# Browserbase Project ID - From your Browserbase dashboard +# BROWSERBASE_PROJECT_ID= + +# Enable residential proxies for better CAPTCHA solving (default: true) +# Routes traffic through residential IPs, significantly improves success rate +BROWSERBASE_PROXIES=true + +# Enable advanced stealth mode (default: false, requires Scale Plan) +# Uses custom Chromium build to avoid bot detection altogether +BROWSERBASE_ADVANCED_STEALTH=false + +# Browser session timeout in seconds (default: 300) +# Sessions are cleaned up after this duration of inactivity +BROWSER_SESSION_TIMEOUT=300 + +# Browser inactivity timeout - auto-cleanup inactive sessions (default: 120 = 2 min) +# Browser sessions are automatically closed after this period of no activity +BROWSER_INACTIVITY_TIMEOUT=120 + +# ============================================================================= +# SESSION LOGGING +# ============================================================================= +# Session trajectories are automatically saved to logs/ directory +# Format: logs/session_YYYYMMDD_HHMMSS_UUID.json +# Contains full conversation history in trajectory format for debugging/replay + +# ============================================================================= +# VOICE TRANSCRIPTION & OPENAI TTS +# ============================================================================= +# Required for voice message transcription (Whisper) and OpenAI TTS voices. +# Uses OpenAI's API directly (not via OpenRouter). +# Named VOICE_TOOLS_OPENAI_KEY to avoid interference with OpenRouter. 
+# Get at: https://platform.openai.com/api-keys +# VOICE_TOOLS_OPENAI_KEY= + +# ============================================================================= +# SLACK INTEGRATION +# ============================================================================= +# Slack Bot Token - From Slack App settings (OAuth & Permissions) +# Get at: https://api.slack.com/apps +# SLACK_BOT_TOKEN=xoxb-... + +# Slack App Token - For Socket Mode (App-Level Tokens in Slack App settings) +# SLACK_APP_TOKEN=xapp-... + +# Slack allowed users (comma-separated Slack user IDs) +# SLACK_ALLOWED_USERS= + +# ============================================================================= +# TELEGRAM INTEGRATION +# ============================================================================= +# Telegram Bot Token - From @BotFather (https://t.me/BotFather) +# TELEGRAM_BOT_TOKEN= +# TELEGRAM_ALLOWED_USERS= # Comma-separated user IDs +# TELEGRAM_HOME_CHANNEL= # Default chat for cron delivery +# TELEGRAM_HOME_CHANNEL_NAME= # Display name for home channel + +# Webhook mode (optional — for cloud deployments like Fly.io/Railway) +# Default is long polling. Setting TELEGRAM_WEBHOOK_URL switches to webhook mode. 
+# TELEGRAM_WEBHOOK_URL=https://my-app.fly.dev/telegram +# TELEGRAM_WEBHOOK_PORT=8443 +# TELEGRAM_WEBHOOK_SECRET= # Recommended for production + +# WhatsApp (built-in Baileys bridge — run `hermes whatsapp` to pair) +# WHATSAPP_ENABLED=false +# WHATSAPP_ALLOWED_USERS=15551234567 + +# Email (IMAP/SMTP — send and receive emails as Hermes) +# For Gmail: enable 2FA → create App Password at https://myaccount.google.com/apppasswords +# EMAIL_ADDRESS=hermes@gmail.com +# EMAIL_PASSWORD=xxxx xxxx xxxx xxxx +# EMAIL_IMAP_HOST=imap.gmail.com +# EMAIL_IMAP_PORT=993 +# EMAIL_SMTP_HOST=smtp.gmail.com +# EMAIL_SMTP_PORT=587 +# EMAIL_POLL_INTERVAL=15 +# EMAIL_ALLOWED_USERS=your@email.com +# EMAIL_HOME_ADDRESS=your@email.com + +# Gateway-wide: allow ALL users without an allowlist (default: false = deny) +# Only set to true if you intentionally want open access. +# GATEWAY_ALLOW_ALL_USERS=false + +# ============================================================================= +# RESPONSE PACING +# ============================================================================= +# Human-like delays between message chunks on messaging platforms. +# Makes the bot feel less robotic. +# HERMES_HUMAN_DELAY_MODE=off # off | natural | custom +# HERMES_HUMAN_DELAY_MIN_MS=800 # Min delay in ms (custom mode) +# HERMES_HUMAN_DELAY_MAX_MS=2500 # Max delay in ms (custom mode) + +# ============================================================================= +# DEBUG OPTIONS +# ============================================================================= +WEB_TOOLS_DEBUG=false +VISION_TOOLS_DEBUG=false +MOA_TOOLS_DEBUG=false +IMAGE_TOOLS_DEBUG=false + +# ============================================================================= +# CONTEXT COMPRESSION (Auto-shrinks long conversations) +# ============================================================================= +# When conversation approaches model's context limit, middle turns are +# automatically summarized to free up space. 
+# +# Context compression is configured in ~/.hermes/config.yaml under compression: +# CONTEXT_COMPRESSION_ENABLED=true # Enable auto-compression (default: true) +# CONTEXT_COMPRESSION_THRESHOLD=0.85 # Compress at 85% of context limit +# Model is set via compression.summary_model in config.yaml (default: google/gemini-3-flash-preview) + +# ============================================================================= +# RL TRAINING (Tinker + Atropos) +# ============================================================================= +# Run reinforcement learning training on language models using the Tinker API. +# Requires the rl-server to be running (from tinker-atropos package). + +# Tinker API Key - RL training service +# Get at: https://tinker-console.thinkingmachines.ai/keys +# TINKER_API_KEY= + +# Weights & Biases API Key - Experiment tracking and metrics +# Get at: https://wandb.ai/authorize +# WANDB_API_KEY= + +# RL API Server URL (default: http://localhost:8080) +# Change if running the rl-server on a different host/port +# RL_API_URL=http://localhost:8080 + +# ============================================================================= +# SKILLS HUB (GitHub integration for skill search/install/publish) +# ============================================================================= + +# GitHub Personal Access Token — for higher API rate limits on skill search/install +# Get at: https://github.com/settings/tokens (Fine-grained recommended) +# GITHUB_TOKEN=ghp_xxxxxxxxxxxxxxxxxxxx + +# GitHub App credentials (optional — for bot identity on PRs) +# GITHUB_APP_ID= +# GITHUB_APP_PRIVATE_KEY_PATH= +# GITHUB_APP_INSTALLATION_ID= + +# Groq API key (free tier — used for Whisper STT in voice mode) +# GROQ_API_KEY= + +# ============================================================================= +# STT PROVIDER SELECTION +# ============================================================================= +# Default STT provider is "local" (faster-whisper) — runs on your 
machine, no API key needed.
+# Install with: pip install faster-whisper
+# Model downloads automatically on first use (~150 MB for "base").
+# To use cloud providers instead, set GROQ_API_KEY or VOICE_TOOLS_OPENAI_KEY above.
+# Provider priority: local > groq > openai
+# Configure in config.yaml: stt.provider: local | groq | openai
+
+# =============================================================================
+# STT ADVANCED OVERRIDES (optional)
+# =============================================================================
+# Override default STT models per provider (normally set via stt.model in config.yaml)
+# STT_GROQ_MODEL=whisper-large-v3-turbo
+# STT_OPENAI_MODEL=whisper-1
+
+# Override STT provider endpoints (for proxies or self-hosted instances)
+# GROQ_BASE_URL=https://api.groq.com/openai/v1
+# STT_OPENAI_BASE_URL=https://api.openai.com/v1
+
+# Basic configuration
+GATEWAY_ALLOW_ALL_USERS=true
+
+# Dashboard authentication (change the password to your own; the comment sits on
+# its own line because some .env loaders treat an inline comment as part of the value)
+DASHBOARD_USERNAME=admin
+DASHBOARD_PASSWORD=your_secure_password_here
+
+# Time zone
+TZ=Asia/Shanghai
+ARCEEAI_API_KEY=dec57933-c3e4-4f4a-a114-b0ae8104e396
+ARCEE_BASE_URL=https://ark.cn-beijing.volces.com/api/coding/v3
+
+
+WECOM_BOT_ID=aib1TzEwFCq0ia1...qlDLt2Fv
+WECOM_SECRET=1U12qLqTdV......ZRIt0ts
+GATEWAY_ALLOW_ALL_USERS=true
+WECOM_ALLOWED_USERS=LiuRan,ZhangHongGang,WangJian
+WECOM_ENABLE_ATTACHMENTS=true
+```
+
+### Feishu
+
+**Feishu developer console**
+
+1. Log in to the [Feishu Open Platform](https://open.feishu.cn/app?lang=zh-CN) and create a **custom enterprise app**
+2. Note the `App ID` cli_a968...9cb6 and the `App Secret` Q8ykzs7vJdhnda.....qpxi1yfAJIUQH
+3. Go to "Add App Capabilities" → **Bot** → allow receiving direct messages and group messages
+4. Go to "Permission Management" and enable:
+   - `im:message:send_as_bot` (send messages as the bot)
+   - `im:message:readonly` (read messages)
+   - `im:message.group_at_msg:readonly` (group @ mentions)
+   - `im:message.p2p_msg:readonly` (direct messages)
+   - `contact:user.employee_id:readonly` (read user IDs)
+5. Event subscription (WebSocket mode)
+   - Left sidebar: Development Configuration → Events & Callbacks
+   - Receive method: **Use long connection (WebSocket)**
+   - Subscribed event: **`im.message.receive_v1`** (receive messages)
+   - Save
+6.
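Version release → **Publish** (custom enterprise apps are usable immediately)
+
+**Server: install the Feishu dependencies**
+
+```bash
+# Enter the Hermes environment (depends on how you deployed it)
+cd /path/to/hermes
+
+# Install the Feishu SDK
+uv pip install lark-oapi websockets
+# Or with pip (if uv is not available)
+pip install lark-oapi websockets
+```
+
+### WeCom (企业微信)
+
+1. Open the WeCom admin console and create a bot: https://work.weixin.qq.com/wework_admin/frame#/aiHelper/create and choose the API method to create **CS Agent**
+   - Connection method: long connection
+   - Bot ID aib1T ...t2Fv
+   - Secret 1U12qLqTdV pd8kKBrcpdk9pZRIt0ts
+2. Available permissions
+   - User info
+   - Documents
+
+## Run
+
+```
+cd /www/hermes/
+docker run -it --rm -v ~/.hermes:/opt/data nousresearch/hermes-agent
+```
+
+
+
+
+
+I'd like you to learn to become a case manager in the field of medical weight loss. What do you think?
+
+I can complete the learning in three steps, "build an authoritative knowledge base → adapt to your work standards → iterate through hands-on practice", fully tailored to your work needs:
+
+### Step 1: Build an authoritative medical weight-loss knowledge foundation (I can do this on my own)
+
+I will first search for and study the authoritative guidelines recognized in China and internationally as the underlying knowledge framework, including:
+
+- 《中国超重/肥胖医学营养治疗指南(2023版)》 (Chinese guideline on medical nutrition therapy for overweight/obesity, 2023 edition)
+- WHO clinical guidelines on the prevention and management of adult obesity
+- Individualized intervention standards for obesity with comorbidities such as hypertension, diabetes, and polycystic ovary syndrome
+- Practice standards, follow-up standards, and risk-warning indicators for medical weight-loss case managers
+
+I will save all core knowledge to persistent memory and follow the standards strictly when using it, without errors.
+
+
+
+### Step 2: Align with your own work requirements (you need to provide the materials)
+
+You can send me the work materials you use regularly, and I will learn and retain all of them:
+
+1. Your institution's internal work standards, weight-loss plan templates, and case assessment / follow-up record templates
+2. High-quality cases you have handled in the past (private information can be redacted)
+3. The report formats you need me to produce and any special requirements to follow
+
+I will turn these rules into dedicated skills, so that later tasks conform 100% to your work standards.
+
+
+
+### Step 3: Iterate through hands-on practice
+
+You can start by giving me a small number of test cases to handle. After you give feedback I will correct and optimize quickly, and after a few rounds I can reach the accuracy you require.
+
+
+
+### Work I can help you with after learning:
+
+✅ Case pre-screening: automatically assess weight-loss risk from the user's physical-exam data and medical history, and determine indications/contraindications
+✅ Plan generation: generate personalized diet/exercise/behavior intervention plans for each individual, automatically adapting to the special requirements of people with comorbidities
+✅ Follow-up management: automatically generate follow-up plans and follow-up scripts; a scheduled task can be configured to push you the list of cases to follow up each day
+✅ Statistics: automatically compute each case's weight-loss target attainment rate, adherence rate, adverse-reaction rate, and similar data, and generate summary reports
+
+Does this learning path match your expectations? Are there materials I should learn first, or features I should focus on?
+
+## mini machine
+
+https://github.com/NousResearch/hermes-agent
+
+
+
+```
+# Step 1: create a virtual environment (sidesteps the system Python restrictions)
+cd ~/hermes-agent-main
+python3 -m venv venv
+source venv/bin/activate
+# Step 2: install all dependencies
+pip install -r requirements.txt
+# Step 3: run the project
+./hermes
+# Why this method works: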
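+# The earlier error occurred because the system Python was used directly, and pip is not allowed to modify or uninstall system packages.
+# The virtual environment is a separate, clean Python environment, so those permission conflicts do not arise.
+# Packages install into the venv without touching the system packages.
+# Before each use, activate the environment first. In every new terminal, run this line:
+source ~/hermes-agent-main/venv/bin/activate
+# Then run (from the project directory):
+./hermes
+# Summary: the following commands cover the whole setup:
+cd ~/hermes-agent-main
+python3 -m venv venv
+source venv/bin/activate
+pip install -r requirements.txt
+./hermes
+# This avoids the ModuleNotFound, cannot-uninstall, and permission errors seen earlier.
+# Running Python projects inside a virtual environment is the standard approach on Linux.
+```
+
diff --git a/备份与恢复.md b/备份与恢复.md
new file mode 100644
index 0000000..8a8ac9e
--- /dev/null
+++ b/备份与恢复.md
@@ -0,0 +1,102 @@
+
+
+
+
+
+
+Export Firefox bookmarks
+
+Back up Typora themes
+
+
+
+```
+==============
+Backup:
+Step 1: Safely remove and reconnect
+
+# 1. Make sure the device is not mounted
+sudo umount /dev/sda1 2>/dev/null
+
+# 2. Safely remove
+sudo udisksctl power-off -b /dev/sda
+
+# 3. Wait 5 seconds, then replug the USB drive
+sleep 5
+# (manually unplug and replug the USB cable)
+
+Step 2: Verify that UAS is disabled
+
+# Check the USB device mode
+sudo dmesg | tail -20
+# You should see "usb-storage" rather than "uas"
+
+# Or check the SCSI hosts
+sudo lsscsi -t
+# It should show "usb: " rather than "uas: "
+
+Step 3: Repartition and format
+
+# 1. Create a GPT partition table
+sudo parted /dev/sda --script mklabel gpt
+
+# 2. Create a single partition
+sudo parted /dev/sda --script mkpart primary 0% 100%
+
+# 3. Wait for the partition to be recognized
+sleep 2
+sudo partprobe /dev/sda
+
+# 4. Format as ext4 (label set, no reserved blocks)
+sudo mkfs.ext4 -F -L "BACKUP_DISK" -m 0 /dev/sda1
+
+# 5. Check the result
+sudo blkid /dev/sda1
+sudo fsck -f /dev/sda1
+
+Step 4: Tune the mount options
+
+# 1. Create the mount point
+sudo mkdir -p /mnt/backup
+
+# 2. Mount (with conservative options)
+sudo mount -o noatime,nodiratime,errors=remount-ro /dev/sda1 /mnt/backup
+
+# 3. Verify
+df -h /mnt/backup
+mount | grep sda1
+
+Step 5: Use a stable backup strategy
+
+# 1.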
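Create the backup script
+cat > /tmp/backup.sh << 'EOF'
+#!/bin/bash
+LOG="/tmp/backup_$(date +%Y%m%d_%H%M%S).log"
+{
+echo "=== Backup started: $(date) ==="
+
+# Back up in stages, starting with the large directory.
+# Exclude patterns are anchored at the transfer root (/home/), so "/*/" stands for each user's home.
+echo "Stage 1: back up home directories"
+rsync -av --no-owner --no-group --timeout=180 \
+  --exclude='/*/.cache/' \
+  --exclude='/*/.local/share/Trash/' \
+  /home/ /mnt/backup/
+
+# --relative with the "/./" marker keeps "<user>/.cache/" under /mnt/backup/;
+# a glob in the destination path would not work.
+echo "Stage 2: back up cache directories (excluding large browser caches)"
+rsync -av --no-owner --no-group --timeout=180 \
+  --exclude='chromium/Default/Cache/' \
+  --exclude='google-chrome/Default/Cache/' \
+  --exclude='*/.cache/yay/pkg/' \
+  --relative /home/./*/.cache/ /mnt/backup/
+
+echo "=== Backup finished: $(date) ==="
+} > "$LOG" 2>&1 &
+EOF
+
+# 2. Run the backup (the script runs the rsync stages in the background)
+sudo bash /tmp/backup.sh
+
+# 3. Watch the log
+tail -f /tmp/backup_*.log
+```
+
diff --git a/电子秤图片识别系统设计.md b/电子秤图片识别系统设计.md
new file mode 100644
index 0000000..f8ca245
--- /dev/null
+++ b/电子秤图片识别系统设计.md
@@ -0,0 +1,271 @@
+# Electronic Scale Image Recognition System: Design Document
+
+> Version: v1.0
+> Last updated: 2026-05-20
+> Status: Draft (pending iteration)
+
+## 1. Project Background
+
+### 1.1 Business Scenario
+Users photograph the display of an electronic scale, and the **reading (number + unit)** must be extracted automatically for data entry or verification.
+Image sources include household bathroom scales, bench scales, and smartphone app screenshots.
+
+### 1.2 Current State
+- The existing script `preprocess.py` only performs ROI cropping and contrast enhancement; it has no recognition capability
+- 122 samples were recognized by a human plus a multimodal model, with an accuracy of about 80%; the main error sources are listed in §2
+
+### 1.3 Goals
+- **Short term**: raise automatic recognition accuracy to **≥85%**, with the remainder checked manually
+- **Medium term**: reach **≥95%**, so that only extremely blurry or abnormal displays need a human
+- **Long term**: support automatic model identification for scales of any brand, plus unit disambiguation
+
+
+## 2. Problem Analysis (based on 122 samples)
+
+### 2.1 Distribution of Error Sources
+
+| Error type | Share | Typical example | Severity |
+|---|---|---|---|
+| Photo taken upside down | ~25% | An inverted "119.35" is read back to front | 🔴 High |
+| ROI cropped too tightly | ~12% | Only "6" remains; the other digits are lost | 🔴 High |
+| 7-segment character confusion | ~16% | `7↔L`, `2↔5`, `6↔9` | 🟡 Medium |
+| Display glare/blur | ~8% | Small screen on the pink scale | 🟡 Medium |
+| Unit ambiguity (kg/jin) | ~5% | The Xiaomi scale looks the same in both modes | 🟢 Low |
+| Abnormal display (startup/error) | ~2% | Incomplete segment strokes | 🟢 Low |
+
+### 2.2 Distribution of Device Types
+
+| Device type | Share | Display characteristics | Difficulty |
+|---|---|---|---|
+| Blue-backlit LCD bathroom scale | ~40% | Black digits on blue | Glare; adaptive thresholding is hard |
+| LED 7-segment scale (orange/white) | ~20% | High-contrast glowing digits | Often photographed upside down |
+| Xiangshan (香山) bench scale (multi-screen) | ~10% | White-on-black three-section LCD | Interference from multiple windows |
+| App screenshot | ~25% | High resolution, many fields | The "main weight" field must be located |
+| Other (pink, HUAWEI, LEFU, etc.) | ~5% | Varied | Long-tail problems |
+
+
+## 3.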
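Technical Approach
+
+### 3.1 Overall Pipeline
+
+```
+┌───────────┐   ┌──────────┐   ┌────────────┐   ┌─────────────┐   ┌─────────┐   ┌─────┐
+│ Raw image │ → │ EXIF fix │ → │ ROI detect │ → │ Orientation │ → │ Enhance │ → │ OCR │
+└───────────┘   └──────────┘   └────────────┘   └─────────────┘   └─────────┘   └─────┘
+                                                                                   ↓
+                                                  ┌───────────────────────────────────┐
+                                                  │ Parse value + unit                │
+                                                  │ Sanity check                      │
+                                                  └───────────────────────────────────┘
+                                                                                   ↓
+                                                  ┌───────────────────────────────────┐
+                                                  │ Output: value / unit / confidence │
+                                                  └───────────────────────────────────┘
+```
+
+### 3.2 Technology Choices per Stage
+
+#### Step 1: EXIF Orientation Correction
+**Purpose**: JPGs taken on a phone carry Orientation metadata, so they can be turned upright at essentially no cost.
+**Approach**: read the Orientation tag via `PIL.ExifTags` and rotate according to the three cases 3/6/8.
+**Expected benefit**: resolves about 30% of the orientation problems at almost no cost.
+
+#### Step 2: ROI Detection (locating the screen region)
+**Purpose**: crop the display rectangle out of the full image to reduce OCR interference.
+**Approach**: weighted scoring over 4 strategies:
+- **Strategy A**: bright, low-saturation LED (white/yellow 7-segment digits), boost ×1.3
+- **Strategy B**: blue-backlit LCD (HSV: H∈[85,140])
+- **Strategy C**: white/light LCD (V>180, S<60)
+- **Strategy D**: Canny edges + horizontal dilation (for white-on-black bench scales), boost ×0.8
+
+**Scoring dimensions**: area ratio, aspect ratio, contrast, brightness.
+**Risk**: the current 15% margin is too tight; the new version raises it to 20%, which still needs to be monitored.
+
+#### Step 3: Orientation Correction (4-way brute force + scoring)
+**Purpose**: handle upside-down and sideways photos.
+**Scoring terms**:
+1. **Shape prior**: a display is usually `w>h` (landscape) → +1
+2. **7-segment "gravity" feature**: in the horizontal projection, the bottom third of rows ≥ the top third of rows → +1 (the typical distribution when digits are upright)
+3. **Tesseract OSD**: if it returns `rotate=0` → +2 (strong evidence)
+
+**Selection**: take the highest-scoring orientation among 0°/90°/180°/270°.
+**Advantage**: even when OSD fails, the 7-segment feature still provides a fallback.
+
+#### Step 4: Image Enhancement
+**Purpose**: make 7-segment digits readable for OCR.
+**Key steps**:
+1. CLAHE (contrast-limited adaptive histogram equalization)
+2. Automatic polarity detection: if the image mean is <127, treat it as a dark background and invert
+3. Adaptive thresholding (Gaussian, blockSize=25)
+4. **Critical**: one dilation pass with a 2×2 kernel to bridge the gaps between segments; otherwise Tesseract splits one digit into several characters
+
+#### Step 5: OCR (multiple backends)
+| Backend | Strengths | Weaknesses | Recommended for |
+|---|---|---|---|
+| **Tesseract** | Easy to deploy; LCD trained data available | 7-segment needs parameter tuning | Default |
+| **SSOCR** | Designed for 7-segment displays | Command-line only; must be compiled by hand | Clean LED displays |
+| **PaddleOCR** | Strong on both Chinese and digits | Large footprint | App screenshots |
+| **Custom-trained CNN** | Highest accuracy ceiling | Needs training data | Long-term investment |
+
+**Current default**: Tesseract, PSM=7 (single line), with the whitelist restricted to digits and unit characters.
+
+#### Step 6: Numeric Sanity Check
+**Regex extraction**: `(\d{1,3}\.?\d{0,2})\s*(kg|KG|公斤|千克|斤|两|lb)?`
+**Plausible ranges**:
+- kg / 公斤 / 千克: 10–200
+- 斤 / 两 (jin / liang): 20–400
+- lb: 20–440
+
+**Effect**: rejects typical OCR errors whose result falls outside the range (for example, a `1` misread as `7` that yields 710 kg is rejected outright). A misread that stays inside the range, such as 170 kg, passes this check and has to be caught by other means.
+
+
+## 4.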
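File Structure
+
+```
+新建文件夹 1/
+├── 未识别/              # Original images awaiting processing
+├── 预处理/              # Output of the old version (v1)
+├── 预处理v2/            # Output of the new version
+│   ├── *.png           # Binarized preprocessing result for each image
+│   └── _results.csv    # Recognition summary (filename/rotation/OCR/confidence/parsed value)
+├── preprocess_v2.py    # New version (preprocessing + recognition + validation)
+└── 设计文档.md          # This document
+```
+
+
+## 5. Evaluation Metrics
+
+### 5.1 Accuracy Metrics
+- **Top-1 accuracy**: share of images where a single recognition is completely correct (both number and unit)
+- **Digit accuracy**: only the number is correct (unit errors tolerated)
+- **±5% tolerance accuracy**: relative error between recognized and true value ≤5% (so that a misread decimal digit is not counted as a complete miss)
+
+### 5.2 Recall / Rejection Rate
+- **Rejection rate**: share of outputs marked "invalid" after the sanity check (a high rejection rate is better than wrong readings)
+- **Confidence threshold**: accept only when Tesseract conf > 60; otherwise fall back to manual review
+
+### 5.3 Performance
+- Processing time per image < 1 s (single CPU thread)
+- Full run over 122 images < 2 minutes
+
+
+## 6. Known Limitations and Open Issues
+
+| # | Problem | Current mitigation | Long-term improvement |
+|---|---|---|---|
+| L1 | The pink scale's display is too small (<1% of the image) | Widen the margin | Preprocess with super-resolution (ESRGAN) |
+| L2 | Startup screens/abnormal displays are misrecognized | Caught by the sanity check | Add an "is this a valid reading" classifier |
+| L3 | The Xiangshan bench scale has three screens; only the weight field is needed | ROI takes the largest LCD | Train a dedicated detector that crops by field label |
+| L4 | The Xiaomi scale's kg/jin mode cannot be determined from the display | Guess the unit from the value range by default | Add brand + model recognition (CNN or logo matching) |
+| L5 | OCR reads "6/9" inverted | 4-way brute-force rotation | Train an orientation detector |
+| L6 | App screenshots contain many fields | For now, generic OCR + pick the largest number | Train a layout model dedicated to app screenshots |
+
+
+## 7. Improvement Roadmap
+
+### v2.0 (current): General Pipeline
+- ✅ EXIF orientation correction
+- ✅ 4-strategy ROI detection
+- ✅ 4-way brute-force rotation + scoring
+- ✅ Choice of Tesseract / SSOCR / PaddleOCR
+- ✅ Numeric sanity check
+- ✅ CSV summary output
+
+### v2.1: Fallback Mechanisms
+- [ ] Automatic handoff to manual review at low confidence: emit a flag when confidence <60
+- [ ] Voting across OCR backends (Tesseract + PaddleOCR)
+- [ ] Unit inference from context: infer from the numeric range (e.g. 50–100 is most likely kg, 120+ is most likely jin)
+
+### v2.2: Targeted Optimizations
+- [ ] **App screenshot branch**: take a dedicated path when app UI elements are detected (locate the "target number" color block directly)
+- [ ] **Scale brand recognition**: use logos in the image (mi, SENSSUN, HUAWEI, LEFU, HCS, etc.) to help choose OCR parameters
+- [ ] **Super-resolution enhancement**: upscale small displays such as the pink scale 4× before recognition
+
+### v3.0: Deep Learning
+- [ ] Build a training set from the existing 122 images plus ongoing collection (target ≥1000 images)
+- [ ] Fine-tune a 7-segment-specific CNN (see [Seven-Segment-OCR](https://github.com/SachaIZADI/Seven-Segment-OCR))
+- [ ] End-to-end YOLO model combining detection and recognition (see the Goodfellow multi-digit recognition paper)
+
+### v4.0: Productionization
+- [ ] Provide an HTTP API (FastAPI)
+- [ ] Provide a Web UI for labeling and verification
+- [ ] Active learning: samples corrected by a human are added to the training set automatically
+
+
+## 8. Testing and Validation
+
+### 8.1 Current Test Set
+- 122 sample images (in `未识别/`, each named "<recognized reading>.jpg" to serve as ground truth)
+- Ground-truth source: double-checked by a human and a multimodal model
+- **Note**: the ground truth of about 10 upside-down images may still contain errors and needs later verification
+
+### 8.2 Validation Procedure
+```bash
+# 1. Run recognition
+python preprocess_v2.py --backend tesseract
+
+# 2.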
Compare against the ground truth (the filename is the ground truth)
+python evaluate.py  # To be implemented: extract the ground truth from the filename and compare with _results.csv
+```
+
+### 8.3 Acceptance Criteria (v2.0)
+- Top-1 accuracy ≥ 70%
+- Digit accuracy ≥ 80%
+- ±5% tolerance accuracy ≥ 85%
+- Rejection rate ≤ 20% (the remainder goes to manual review)
+
+
+## 9. Dependencies
+
+### Required
+```bash
+pip install opencv-python pillow pytesseract imutils numpy
+```
+
+### System Dependencies (Linux)
+```bash
+sudo apt install tesseract-ocr tesseract-ocr-chi-sim
+# Optional: download the LCD-specific trained data (the tessdata directory below depends on the installed Tesseract version)
+wget https://github.com/Shreeshrii/tessdata_ssd/raw/master/ssd.traineddata
+sudo mv ssd.traineddata /usr/share/tesseract-ocr/4.00/tessdata/
+```
+
+### Optional (stronger backends)
+```bash
+pip install paddlepaddle paddleocr   # PaddleOCR
+sudo apt install ssocr               # 7-segment-specific OCR (packaged in Debian/Ubuntu)
+```
+
+
+## 10. References
+
+### Papers
+- Goodfellow et al., [Multi-digit Number Recognition from Street View Imagery using Deep CNNs](https://arxiv.org/abs/1312.6082)
+
+### Open-Source Projects
+- [SSOCR](https://www.unix-ag.uni-kl.de/~auerswal/ssocr/): dedicated 7-segment OCR (C)
+- [tessdata_ssd](https://github.com/Shreeshrii/tessdata_ssd): Tesseract 7-segment trained data
+- [LCD-OCR](https://github.com/DevashishPrasad/LCD-OCR): Tesseract + OpenCV pipeline
+- [Seven-Segment-OCR](https://github.com/SachaIZADI/Seven-Segment-OCR): CNN/MNIST style
+- [seven-segment-ocr](https://github.com/renjithsasidharan/seven-segment-ocr): TensorFlow Lite
+
+### Engineering Articles
+- [PyImageSearch: Correcting Text Orientation with Tesseract](https://pyimagesearch.com/2022/01/31/correcting-text-orientation-with-tesseract-and-python/)
+- [Medium: Correct image orientation using Pytesseract + Imutils](https://medium.com/@birenmer/correct-image-orientation-using-python-pytesseract-and-imutils-5a524d386857)
+
+
+## 11. Maintenance and Collaboration
+
+### 11.1 Document Maintenance
+- Every change to the approach must be checked off in the §7 roadmap, with the version number noted
+- Newly found problems are added to the §6 known-limitations table
+- Major architectural changes require updating the §3 pipeline diagram
+
+### 11.2 Code Maintenance
+- `preprocess.py` is kept as the v1 baseline (no further changes)
+- New features evolve in `preprocess_v2.py`; major versions move to `preprocess_v3.py`
+- Experimental features go in the `experiments/` subdirectory
+
+### 11.3 Data Maintenance
+- New ground-truth samples go in the `未识别/` directory; the filename is the ground truth
+- Misrecognized samples are flagged separately as a hard-example set (for later training)
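
Because the filename doubles as the ground truth, a reading can be parsed from it with the same regex and plausible ranges that Step 6 lists. The following is only a minimal sketch under stated assumptions: the names `parse_reading`, `truth_from_filename`, `RANGES`, and `FALLBACK_RANGE` are hypothetical, the filename shape (`65.3kg.jpg`, `119.35.jpg`) is a guess at the naming scheme, and the fallback range used when no unit is present is not specified anywhere in this document.

```python
import re
from pathlib import Path
from typing import Optional, Tuple

# Regex copied from Step 6: up to 3 integer digits, optional decimals, optional unit.
READING_RE = re.compile(r"(\d{1,3}\.?\d{0,2})\s*(kg|KG|公斤|千克|斤|两|lb)?")

# Plausible ranges per unit, as listed in Step 6.
RANGES = {
    "kg": (10, 200), "KG": (10, 200), "公斤": (10, 200), "千克": (10, 200),
    "斤": (20, 400), "两": (20, 400),
    "lb": (20, 440),
}

# Assumption (not from the design): with no unit, accept the union of all ranges.
FALLBACK_RANGE = (10, 440)


def parse_reading(text: str) -> Optional[Tuple[float, Optional[str]]]:
    """Return (value, unit) for the first reading in text, or None if it is rejected."""
    match = READING_RE.search(text)
    if match is None:
        return None
    value = float(match.group(1))
    unit = match.group(2)  # None when no unit was captured
    low, high = RANGES.get(unit, FALLBACK_RANGE)
    if not low <= value <= high:
        return None  # out of range: reject instead of emitting a wrong value
    return value, unit


def truth_from_filename(path: str) -> Optional[Tuple[float, Optional[str]]]:
    """Parse the ground-truth reading from a filename such as '65.3kg.jpg'."""
    return parse_reading(Path(path).stem)
```

An out-of-range value such as `710kg` is rejected, while an in-range misread such as `170kg` still passes, so the range check works as a coarse filter and does not replace confidence thresholds or manual review.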