commit 8cbf3a484441fa15b7282881422ef50b5a355691 Author: 爱喝水的木子 Date: Tue Mar 10 08:58:27 2026 +0800 Add crawler-reverse skill diff --git a/crawler-reverse/LICENSE b/crawler-reverse/LICENSE new file mode 100644 index 0000000..14fac91 --- /dev/null +++ b/crawler-reverse/LICENSE @@ -0,0 +1,21 @@ +MIT License + +Copyright (c) 2026 + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. 
diff --git a/crawler-reverse/README.md b/crawler-reverse/README.md new file mode 100644 index 0000000..585cf36 --- /dev/null +++ b/crawler-reverse/README.md @@ -0,0 +1,143 @@ +# crawler-reverse + +A reusable OpenClaw-style skill package for **authorized** web traffic analysis, JS deobfuscation support, request-signature tracing, anti-bot workflow inspection, and browser-assisted reverse engineering.
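A recurring task for this skill is explaining why a scripted request fails where the browser succeeds. A minimal, hypothetical sketch of the first step, diffing captured headers (all header values here are invented placeholders, not from any real site):

```python
# Diff a browser-captured request against a script's request to see
# which headers the script is missing or sending differently.
# Both dicts are hypothetical captures, e.g. copied from DevTools.

browser_headers = {
    "User-Agent": "Mozilla/5.0 ...",
    "Accept": "application/json",
    "X-Sign": "d41d8cd98f00b204",   # dynamic, likely computed in frontend JS
    "X-Timestamp": "1767900000",
    "Cookie": "session=abc123",
}

script_headers = {
    "User-Agent": "python-requests/2.31",
    "Accept": "application/json",
}

# Headers present only in the browser capture.
missing = sorted(set(browser_headers) - set(script_headers))

# Headers present in both but with different values.
differing = sorted(
    k for k in set(browser_headers) & set(script_headers)
    if browser_headers[k] != script_headers[k]
)

print("missing from script:", missing)
print("differing values:  ", differing)
```

Headers that are missing or differ, especially custom `x-` ones, are the first candidates to trace back into the frontend JS.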
+ +### What this skill is for + +Use `crawler-reverse` when you need to: + +- inspect a page's request chain +- locate where `sign`, `token`, `timestamp`, `nonce`, or custom headers are generated +- compare browser requests with script requests +- analyze obfuscated frontend JS related to requests +- understand cookie / localStorage / sessionStorage / header dependencies +- reproduce an observed request flow with a minimal script + +### Safety boundary + +This skill is intended **only for authorized, defensive, educational, self-owned, or explicitly permitted analysis**. + +It must **not** be used for: + +- unauthorized access +- bypassing authentication, paywalls, permissions, captchas, or rate limits +- credential stuffing / account abuse +- large-scale scraping in violation of authorization +- evasion of security controls for abusive purposes + +If authorization is unclear, ask first. + +### Package contents + +- `SKILL.md` — full skill instructions +- `skill.json` — basic metadata for registry/indexing +- `examples/example.md` — example invocation patterns +- `LICENSE` — MIT + +### Suggested usage + +Typical workflow: + +1. Reproduce the user action in a browser +2. Observe XHR / fetch / websocket / document requests +3. Identify dynamic parameters +4. Trace where they are generated +5. Compare browser and script requests +6. Produce a minimal validation script + +### Recommended tools + +This skill is designed to pair well with tools such as: + +- browser automation / browser inspection tools +- local file readers +- shell / grep / ripgrep +- small Python or JavaScript validation scripts + +### Install + +Copy this folder into your OpenClaw-compatible skills directory, or add it as a GitHub-hosted custom skill source depending on your OpenClaw setup. 
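As a concrete companion to step 3 of the suggested usage above (identify dynamic parameters), the usual starting point is diffing two captures of the same endpoint. A minimal sketch with two invented example URLs:

```python
from urllib.parse import parse_qsl, urlsplit

# Two hypothetical captures of the same endpoint, taken a few seconds apart.
capture_a = "https://example.com/api/list?page=1&ts=1767900001&sign=9f2ab1"
capture_b = "https://example.com/api/list?page=1&ts=1767900007&sign=c03de8"

qa = dict(parse_qsl(urlsplit(capture_a).query))
qb = dict(parse_qsl(urlsplit(capture_b).query))

# Parameters whose values change between requests are the dynamic ones
# worth tracing back into the frontend JS; stable ones can usually be
# copied verbatim when reproducing the request.
dynamic = sorted(k for k in qa if k in qb and qa[k] != qb[k])
stable = sorted(k for k in qa if qb.get(k) == qa[k])

print("dynamic:", dynamic)   # ['sign', 'ts']
print("stable: ", stable)    # ['page']
```

The same diffing idea extends to request bodies and headers; the goal is always to isolate the smallest set of changing fields before reading any obfuscated code.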
+ +### Skill summary + +- **Name:** crawler-reverse +- **Category:** web-analysis / reverse-engineering / debugging +- **Primary output:** request-chain analysis, parameter-origin explanation, safe reproduction steps + +### Publishing note + +This package was generated in a generic GitHub skill-repo layout so it can be adapted to a specific OpenClaw registry format later if needed. diff --git a/crawler-reverse/SKILL.md b/crawler-reverse/SKILL.md new file mode 100644 index 0000000..5853e2b --- /dev/null +++ b/crawler-reverse/SKILL.md @@ -0,0 +1,178 @@ +--- +name: crawler-reverse +description: "Use this skill for authorized web traffic analysis, JS obfuscation troubleshooting, request-signature tracing, replay debugging, anti-bot workflow inspection, cookie/token/header origin analysis, and browser-assisted reverse engineering. Refuse or redirect high-risk unauthorized misuse such as bypassing access controls, abusive scraping, credential abuse, or evasion for harmful purposes." +metadata: + { "emoji": "🕷️", "category": "web-analysis", "authoring_format": "generic-openclaw" } +--- + +# Crawler Reverse + +Use this skill for **authorized** browser/network analysis, request tracing, frontend JS reverse engineering, and anti-bot workflow debugging. + +## When to use + +Use this skill when the task involves: + +- analyzing a page's request chain +- locating where signature parameters are generated +- understanding how token / cookie / header values are produced +- tracing request-building logic inside frontend JavaScript +- comparing browser behavior with Python / JS script behavior +- replaying an observed request for validation +- isolating anti-bot request differences + +## Goals + +Break the problem into layers: + +1. **Entry point** + - user action + - button click + - route change + - form submit + - lazy load + +2. **Network activity** + - XHR + - fetch + - document + - websocket + - static resources + +3. 
**Dynamic inputs** + - query + - body + - header + - cookie + - localStorage + - sessionStorage + +4. **Generation logic** + - signature + - timestamp + - nonce + - random values + - encryption + - serialization + - compression + +5. **Reproduction** + - minimal script + - browser automation flow + - diff against browser traffic + +## Recommended workflow + +### A. Confirm authorization and scope first + +Clarify: + +- target site/system +- whether the user has permission or a legitimate testing purpose +- whether the focus is page behavior, API flow, signature generation, or login/session analysis + +If the request is clearly about unauthorized access, bypassing protections, mass abuse, or harmful evasion, do not provide an operational bypass. + +### B. Observe page behavior and requests + +1. open the page +2. reproduce the relevant user action +3. record the important requests +4. capture: + - URL + - method + - status + - headers + - payload + - response structure + - page state before/after + +### C. Trace dynamic parameter origins + +Prioritize searching for: + +- `sign` +- `token` +- `timestamp` +- `nonce` +- `secret` +- `encrypt` +- `signature` +- `authorization` +- custom `x-` headers + +Methods: + +- search source files for keywords +- inspect page variables/functions in-browser +- trace upward from the request call site +- compare multiple requests to find changing fields + +### D. Check common anti-bot points + +Look for: + +- cookie-bound sessions +- CSRF tokens +- dynamic headers +- encrypted / wrapped request bodies +- timestamp/random participation in signatures +- temporary tokens from websocket or bootstrap APIs +- parameters assembled after render +- localStorage/sessionStorage/memory dependencies + +### E. 
Produce a safe, testable output + +Prefer output that includes: + +- key request list +- explanation of parameter origins +- summary of generation logic +- minimal validation steps +- if appropriate and safe, a minimal verification script + +## Recommended tool pairing + +Useful companions include: + +- browser automation / browser inspection tools +- local text/file readers +- shell search tools +- short Python / JavaScript validation scripts + +## Output template + +- **Target page/action:** +- **Key requests:** +- **Suspicious/dynamic parameters:** +- **Evidence and origin hypothesis:** +- **Open questions:** +- **Suggested next step:** + +## Safety boundary + +### Allowed + +- authorized API/page analysis +- debugging your own system +- tracing frontend parameter generation +- local validation scripts +- reproducing observed browser behavior for diagnosis + +### Not allowed + +- unauthorized bypass of access controls +- bypassing captcha/paywall/permission systems with abuse intent +- abusive scraping at scale +- credential abuse +- operational evasion guidance for harmful misuse + +If intent or authorization is unclear, ask before proceeding. + +## Practical reminders + +- start from replaying an already observed legitimate request +- identify the smallest changing fields first +- compare at least two requests when analyzing signatures +- for large bundles, center analysis around the actual request trigger path +- if a visible browser helps, use a visible-browser workflow diff --git a/crawler-reverse/examples/example.md b/crawler-reverse/examples/example.md new file mode 100644 index 0000000..1b97d80 --- /dev/null +++ b/crawler-reverse/examples/example.md @@ -0,0 +1,21 @@ +# Example Usage + +## Example prompts + +- Analyze the request chain of this page and identify where the signature is generated. +- Compare the browser request and my Python request and explain why the script fails. +- Help me trace where this token/header comes from in the frontend. 
+- Inspect this page's JS and find the request-building logic. +- Reproduce this observed API call with a minimal validation script. + +## Expected outputs + +- key request list +- dynamic field explanation +- request diff summary +- likely signature origin +- next debugging steps + +## Safety reminder + +Use only for systems you own, are authorized to test, or are analyzing for legitimate educational/defensive purposes. diff --git a/crawler-reverse/skill.json b/crawler-reverse/skill.json new file mode 100644 index 0000000..b3aa7ad --- /dev/null +++ b/crawler-reverse/skill.json @@ -0,0 +1,20 @@ +{ + "name": "crawler-reverse", + "version": "1.0.0", + "title": "Crawler Reverse", + "description": "Authorized web traffic analysis, JS obfuscation troubleshooting, request-signature tracing, anti-bot workflow inspection, and browser-assisted reverse engineering.", + "category": "web-analysis", + "tags": [ + "reverse-engineering", + "web-analysis", + "anti-bot", + "request-signature", + "debugging", + "browser" + ], + "author": "Generated with ChatGPT", + "license": "MIT", + "entry": "SKILL.md", + "repository_format": "generic-openclaw", + "safe_use_only": true +}
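The metadata file above can be sanity-checked before wiring it into a registry. A minimal sketch; the required-field list is an assumption for illustration, not a documented OpenClaw contract:

```python
import json

# Inline copy of a few fields for illustration; in practice you would
# read crawler-reverse/skill.json from disk with open(...).
raw = '{"name": "crawler-reverse", "version": "1.0.0", "entry": "SKILL.md", "license": "MIT"}'

REQUIRED = ("name", "version", "entry", "license")

meta = json.loads(raw)
missing = [k for k in REQUIRED if k not in meta]
assert not missing, f"skill.json is missing fields: {missing}"
print(f"{meta['name']} v{meta['version']} -> entry {meta['entry']}")
```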