Add crawler-reverse skill

2026-03-10 08:58:27 +08:00
commit 8cbf3a4844
5 changed files with 383 additions and 0 deletions
--- a/crawler-reverse/LICENSE
+++ b/crawler-reverse/LICENSE
@@ -0,0 +1,21 @@
 MIT License
 Copyright (c) 2026
 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal
 in the Software without restriction, including without limitation the rights
 to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 copies of the Software, and to permit persons to whom the Software is
 furnished to do so, subject to the following conditions:
 The above copyright notice and this permission notice shall be included in all
 copies or substantial portions of the Software.
 THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 SOFTWARE.
--- a/crawler-reverse/README.md
+++ b/crawler-reverse/README.md
@@ -0,0 +1,143 @@
 # crawler-reverse
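 Chinese | [English](#english)
 A reusable skill package for **OpenClaw-style skill repositories**, intended for web traffic capture analysis, frontend JS obfuscation troubleshooting, request-signature location, anti-crawling workflow mapping, and browser-assisted reverse engineering, **all under lawful authorization**.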
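 ## What this skill can do
 Use `crawler-reverse` when you need to:
 - analyze a page's request chain
 - find where `sign`, `token`, `timestamp`, `nonce`, or custom headers are generated
 - compare browser requests against script requests
 - troubleshoot obfuscated frontend JS logic related to requests
 - analyze Cookie / localStorage / sessionStorage / header dependencies
 - reproduce an observed request flow and output a minimal validation script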
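 ## Safety boundary
 This skill is **only for lawfully authorized work, legitimate testing, debugging your own systems, teaching demonstrations, or explicitly approved analysis**.
 **It should not be used for:**
 - unauthorized access
 - bypassing login, permissions, paywalls, captchas, or rate limits
 - credential stuffing or account abuse
 - unauthorized large-scale collection
 - providing ways to evade security controls for offensive abuse
 If the scope of authorization is unclear, confirm it before continuing.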
 ## Repository contents
 - `SKILL.md`: main skill instructions
 - `skill.json`: basic metadata, usable for indexing/registration
 - `examples/example.md`: example prompts and usage
 - `LICENSE`: MIT license
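 ## Recommended usage
 Typical analysis workflow:
 1. Reproduce the user action in a browser
 2. Observe XHR / fetch / websocket / document requests
 3. Identify dynamic parameters
 4. Trace where those parameters are generated
 5. Compare browser requests with script requests
 6. Produce a minimal validation script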
 ## Recommended companion tools
 This skill works well alongside:
 - browser automation / browser inspection tools
 - local file readers
 - shell / grep / ripgrep
 - small Python / JavaScript validation scripts
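 ## Installation
 Copy this directory into your OpenClaw-compatible skills directory, or, depending on your OpenClaw configuration, add this GitHub repository as a custom skill source.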
 ## Skill summary
 - **Name:** crawler-reverse
 - **Category:** web-analysis / reverse-engineering / debugging
 - **Primary output:** request-chain analysis, parameter-origin explanation, safe reproduction steps
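 ## Notes
 This repository is currently generated in a **generic GitHub skill-repo layout**; it can be adjusted further if it later needs to fit a particular OpenClaw skill registry or a specific format.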
 ---
 ## English
 A reusable OpenClaw-style skill package for **authorized** web traffic analysis, JS deobfuscation support, request-signature tracing, anti-bot workflow inspection, and browser-assisted reverse engineering.
 ### What this skill is for
 Use `crawler-reverse` when you need to:
 - inspect a page's request chain
 - locate where `sign`, `token`, `timestamp`, `nonce`, or custom headers are generated
 - compare browser requests with script requests
 - analyze obfuscated frontend JS related to requests
 - understand cookie / localStorage / sessionStorage / header dependencies
 - reproduce an observed request flow with a minimal script
 ### Safety boundary
 This skill is intended **only for authorized, defensive, educational, self-owned, or explicitly permitted analysis**.
 It must **not** be used for:
 - unauthorized access
 - bypassing authentication, paywalls, permissions, captchas, or rate limits
 - credential stuffing / account abuse
 - large-scale scraping without authorization
 - evasion of security controls for abusive purposes
 If authorization is unclear, ask first.
 ### Package contents
 - `SKILL.md` — full skill instructions
 - `skill.json` — basic metadata for registry/indexing
 - `examples/example.md` — example invocation patterns
 - `LICENSE` — MIT
 ### Suggested usage
 Typical workflow:
 1. Reproduce the user action in a browser
 2. Observe XHR / fetch / websocket / document requests
 3. Identify dynamic parameters
 4. Trace where they are generated
 5. Compare browser and script requests
 6. Produce a minimal validation script
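Step 5 of the workflow above (comparing browser and script requests) can be sketched as a small header diff. This is only an illustration under stated assumptions: the function name `diff_headers`, the header `X-Sign`, and the sample values are hypothetical and not part of this package.

```python
from typing import Dict, Optional, Tuple


def diff_headers(
    browser: Dict[str, str], script: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return headers whose values differ, keyed by lower-cased name.

    Each value is (browser_value, script_value); None means the header is absent.
    """
    # HTTP header names are case-insensitive, so normalise before comparing.
    b = {k.lower(): v for k, v in browser.items()}
    s = {k.lower(): v for k, v in script.items()}
    return {
        name: (b.get(name), s.get(name))
        for name in sorted(set(b) | set(s))
        if b.get(name) != s.get(name)
    }


if __name__ == "__main__":
    # Hypothetical values copied by hand from a browser capture and a script run.
    browser_headers = {
        "User-Agent": "Mozilla/5.0",
        "X-Sign": "abc123",
        "Accept": "application/json",
    }
    script_headers = {
        "user-agent": "python-requests/2.x",
        "accept": "application/json",
    }
    for name, (bv, sv) in diff_headers(browser_headers, script_headers).items():
        print(f"{name}: browser={bv!r} script={sv!r}")
```

Headers that appear only on the browser side are usually the first candidates to trace back into the frontend code.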
 ### Recommended tools
 This skill is designed to pair well with tools such as:
 - browser automation / browser inspection tools
 - local file readers
 - shell / grep / ripgrep
 - small Python or JavaScript validation scripts
 ### Install
 Copy this folder into your OpenClaw-compatible skills directory, or add it as a GitHub-hosted custom skill source depending on your OpenClaw setup.
 ### Skill summary
 - **Name:** crawler-reverse
 - **Category:** web-analysis / reverse-engineering / debugging
 - **Primary output:** request-chain analysis, parameter-origin explanation, safe reproduction steps
 ### Publishing note
 This package was generated in a generic GitHub skill-repo layout so it can be adapted to a specific OpenClaw registry format later if needed.
--- a/crawler-reverse/SKILL.md
+++ b/crawler-reverse/SKILL.md
@@ -0,0 +1,178 @@
 ---
 name: crawler-reverse
 description: "Use this skill for authorized web traffic analysis, JS obfuscation troubleshooting, request-signature tracing, replay debugging, anti-bot workflow inspection, cookie/token/header origin analysis, and browser-assisted reverse engineering. Refuse or redirect high-risk unauthorized misuse such as bypassing access controls, abusive scraping, credential abuse, or evasion for harmful purposes."
 metadata:
  { "emoji": "🕷️", "category": "web-analysis", "authoring_format": "generic-openclaw" }
 ---
 # Crawler Reverse
 Use this skill for **authorized** browser/network analysis, request tracing, frontend JS reverse engineering, and anti-bot workflow debugging.
 ## When to use
 Use this skill when the task involves:
 - analyzing a page's request chain
 - locating where signature parameters are generated
 - understanding how token / cookie / header values are produced
 - tracing request-building logic inside frontend JavaScript
 - comparing browser behavior with Python / JS script behavior
 - replaying an observed request for validation
 - isolating anti-bot request differences
 ## Goals
 Break the problem into layers:
 1. **Entry point**
   - user action
   - button click
   - route change
   - form submit
   - lazy load
 2. **Network activity**
   - XHR
   - fetch
   - document
   - websocket
   - static resources
 3. **Dynamic inputs**
   - query
   - body
   - header
   - cookie
   - localStorage
   - sessionStorage
 4. **Generation logic**
   - signature
   - timestamp
   - nonce
   - random values
   - encryption
   - serialization
   - compression
 5. **Reproduction**
   - minimal script
   - browser automation flow
   - diff against browser traffic
 ## Recommended workflow
 ### A. Confirm authorization and scope first
 Clarify:
 - target site/system
 - whether the user has permission or a legitimate testing purpose
 - whether the focus is page behavior, API flow, signature generation, or login/session analysis
 If the request is clearly about unauthorized access, bypassing protections, mass abuse, or harmful evasion, do not provide an operational bypass.
 ### B. Observe page behavior and requests
 1. open the page
 2. reproduce the relevant user action
 3. record the important requests
 4. capture:
   - URL
   - method
   - status
   - headers
   - payload
   - response structure
   - page state before/after
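One way to record the fields listed above is to export a HAR file from the browser's network panel and load it into a small structure. The sketch below assumes the standard HAR layout (`log.entries[].request` / `response`); the `CapturedRequest` class and `load_har_entries` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class CapturedRequest:
    url: str
    method: str
    status: int
    headers: Dict[str, str]
    payload: str


def load_har_entries(har: Dict[str, Any]) -> List[CapturedRequest]:
    """Flatten a parsed HAR document into a list of CapturedRequest records."""
    captured = []
    for entry in har["log"]["entries"]:
        req, resp = entry["request"], entry["response"]
        captured.append(
            CapturedRequest(
                url=req["url"],
                method=req["method"],
                status=resp["status"],
                # HAR stores headers as a list of {"name": ..., "value": ...} objects.
                headers={h["name"]: h["value"] for h in req.get("headers", [])},
                # postData is only present for requests that carry a body.
                payload=req.get("postData", {}).get("text", ""),
            )
        )
    return captured


if __name__ == "__main__":
    # Tiny inline sample; a real capture would come from json.load() on an exported file.
    sample = {
        "log": {
            "entries": [
                {
                    "request": {
                        "url": "https://example.test/api/list",
                        "method": "GET",
                        "headers": [{"name": "Accept", "value": "application/json"}],
                    },
                    "response": {"status": 200},
                }
            ]
        }
    }
    for item in load_har_entries(sample):
        print(item.method, item.url, item.status)
```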
 ### C. Trace dynamic parameter origins
 Prioritize searching for:
 - `sign`
 - `token`
 - `timestamp`
 - `nonce`
 - `secret`
 - `encrypt`
 - `signature`
 - `authorization`
 - custom `x-` headers
 Methods:
 - search source files for keywords
 - inspect page variables/functions in-browser
 - trace upward from the request call site
 - compare multiple requests to find changing fields
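The keyword search described above can be approximated with a short regex scan over a downloaded script. This is a rough sketch, not a replacement for ripgrep: `find_keyword_hits` is a hypothetical helper, and the pattern simply mirrors the keyword list in this section plus a loose match for custom `x-` headers.

```python
import re
from typing import List, Tuple

KEYWORDS = [
    "sign", "token", "timestamp", "nonce",
    "secret", "encrypt", "signature", "authorization",
]

# Longest keywords first so "signature" is reported instead of its prefix "sign".
_PATTERN = re.compile(
    "|".join(re.escape(k) for k in sorted(KEYWORDS, key=len, reverse=True))
    + r"|\bx-[a-z0-9-]+",
    re.IGNORECASE,
)


def find_keyword_hits(source: str) -> List[Tuple[int, str, str]]:
    """Return (line_number, matched_text_lowercased, stripped_line) per match."""
    hits = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            hits.append((line_no, match.group(0).lower(), line.strip()))
    return hits


if __name__ == "__main__":
    # Hypothetical snippet standing in for a bundled frontend script.
    js = 'var a = 1;\nheaders["X-Sign"] = makeSignature(ts, nonce);\n'
    for line_no, keyword, text in find_keyword_hits(js):
        print(f"{line_no}: {keyword}: {text}")
```

On minified bundles a single line can be very long, so it may help to pretty-print the script first or report character offsets instead of line numbers.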
 ### D. Check common anti-bot points
 Look for:
 - cookie-bound sessions
 - CSRF tokens
 - dynamic headers
 - encrypted / wrapped request bodies
   - timestamps or random values that feed into signatures
 - temporary tokens from websocket or bootstrap APIs
 - parameters assembled after render
 - localStorage/sessionStorage/memory dependencies
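To make the "timestamp and random value in the signature" pattern concrete, here is a generic signing sketch of the kind you might write when debugging your own API. It is an assumption for illustration only: the canonical form (sorted `key=value` pairs joined with `&`), the HMAC-SHA256 choice, and the name `sign_params` are hypothetical and do not describe any particular site.

```python
import hashlib
import hmac
from typing import Dict


def sign_params(params: Dict[str, str], timestamp: int, nonce: str, secret: bytes) -> str:
    """Sign request parameters together with a timestamp and a nonce."""
    # Timestamp and nonce are folded into the signed fields, so replaying the
    # same business parameters later produces a different signature.
    fields = dict(params, timestamp=str(timestamp), nonce=nonce)
    canonical = "&".join(f"{key}={fields[key]}" for key in sorted(fields))
    return hmac.new(secret, canonical.encode("utf-8"), hashlib.sha256).hexdigest()


if __name__ == "__main__":
    secret = b"local-test-secret"  # placeholder for a key you control
    first = sign_params({"page": "1"}, 1700000000, "n1", secret)
    second = sign_params({"page": "1"}, 1700000005, "n2", secret)
    print(first != second)
```

When two captured requests share every business parameter but carry different signatures, a scheme of this shape is one hypothesis worth checking against the frontend code.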
 ### E. Produce a safe, testable output
 Prefer output that includes:
 - key request list
 - explanation of parameter origins
 - summary of generation logic
 - minimal validation steps
 - if appropriate and safe, a minimal verification script
 ## Recommended tool pairing
 Useful companions include:
 - browser automation / browser inspection tools
 - local text/file readers
 - shell search tools
 - short Python / JavaScript validation scripts
 ## Output template
 - **Target page/action:**
 - **Key requests:**
 - **Suspicious/dynamic parameters:**
 - **Evidence and origin hypothesis:**
 - **Open questions:**
 - **Suggested next step:**
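If the template above is filled in programmatically, a small data class keeps the fields consistent between runs. This is a minimal sketch; `AnalysisReport` and its field names are hypothetical and only mirror the template headings.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AnalysisReport:
    target: str
    key_requests: List[str] = field(default_factory=list)
    dynamic_parameters: List[str] = field(default_factory=list)
    evidence: str = ""
    open_questions: List[str] = field(default_factory=list)
    next_step: str = ""

    def to_markdown(self) -> str:
        """Render the report using the same bullet labels as the template."""

        def join(items: List[str]) -> str:
            return ", ".join(items) if items else "(none)"

        return "\n".join(
            [
                f"- **Target page/action:** {self.target}",
                f"- **Key requests:** {join(self.key_requests)}",
                f"- **Suspicious/dynamic parameters:** {join(self.dynamic_parameters)}",
                f"- **Evidence and origin hypothesis:** {self.evidence or '(none)'}",
                f"- **Open questions:** {join(self.open_questions)}",
                f"- **Suggested next step:** {self.next_step or '(none)'}",
            ]
        )


if __name__ == "__main__":
    report = AnalysisReport(
        target="search page / submit query",
        key_requests=["GET /api/search"],
        dynamic_parameters=["sign", "timestamp"],
        next_step="compare two captures taken a few seconds apart",
    )
    print(report.to_markdown())
```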
 ## Safety boundary
 ### Allowed
 - authorized API/page analysis
 - debugging your own system
 - tracing frontend parameter generation
 - local validation scripts
 - reproducing observed browser behavior for diagnosis
 ### Not allowed
 - unauthorized bypass of access controls
 - bypassing captcha/paywall/permission systems with abusive intent
 - abusive scraping at scale
 - credential abuse
 - operational evasion guidance for harmful misuse
 If intent or authorization is unclear, ask before proceeding.
 ## Practical reminders
 - start from replaying an already observed legitimate request
 - identify the smallest changing fields first
 - compare at least two requests when analyzing signatures
 - for large bundles, center analysis around the actual request trigger path
 - if watching the page helps, run the browser in visible (non-headless) mode
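The reminders about comparing at least two requests and finding the smallest changing fields can be sketched as a query-string comparison. The helper name `changing_query_fields` and the example URLs are hypothetical; the same idea applies to bodies and headers.

```python
from typing import Dict, List, Optional
from urllib.parse import parse_qsl, urlsplit


def changing_query_fields(urls: List[str]) -> Dict[str, List[Optional[str]]]:
    """Return query parameters whose values are not identical across all URLs.

    Each value lists the parameter's value per URL, in order; None means the
    parameter was missing from that URL.
    """
    parsed = [
        dict(parse_qsl(urlsplit(url).query, keep_blank_values=True)) for url in urls
    ]
    names = set().union(*parsed)
    result = {}
    for name in sorted(names):
        values = [query.get(name) for query in parsed]
        if len(set(values)) > 1:
            result[name] = values
    return result


if __name__ == "__main__":
    # Two hypothetical captures of the same action, a few seconds apart.
    captures = [
        "https://example.test/api?page=1&ts=1700000000&sign=aaa",
        "https://example.test/api?page=1&ts=1700000005&sign=bbb",
    ]
    for name, values in changing_query_fields(captures).items():
        print(name, values)
```

Fields that stay constant across captures can usually be set aside, which narrows the search to the handful that actually change.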
--- a/crawler-reverse/examples/example.md
+++ b/crawler-reverse/examples/example.md
@@ -0,0 +1,21 @@
 # Example Usage
 ## Example prompts
 - Analyze the request chain of this page and identify where the signature is generated.
 - Compare the browser request and my Python request and explain why the script fails.
 - Help me trace where this token/header comes from in the frontend.
 - Inspect this page's JS and find the request-building logic.
 - Reproduce this observed API call with a minimal validation script.
 ## Expected outputs
 - key request list
 - dynamic field explanation
 - request diff summary
 - likely signature origin
 - next debugging steps
 ## Safety reminder
 Use only for systems you own, are authorized to test, or are analyzing for legitimate educational/defensive purposes.
--- a/crawler-reverse/skill.json
+++ b/crawler-reverse/skill.json
@@ -0,0 +1,20 @@
 {
  "name": "crawler-reverse",
  "version": "1.0.0",
  "title": "Crawler Reverse",
  "description": "Authorized web traffic analysis, JS obfuscation troubleshooting, request-signature tracing, anti-bot workflow inspection, and browser-assisted reverse engineering.",
  "category": "web-analysis",
  "tags": [
    "reverse-engineering",
    "web-analysis",
    "anti-bot",
    "request-signature",
    "debugging",
    "browser"
  ],
  "author": "Generated with ChatGPT",
  "license": "MIT",
  "entry": "SKILL.md",
  "repository_format": "generic-openclaw",
  "safe_use_only": true
 }
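Before publishing, the metadata file can be sanity-checked with a few lines of Python. This is a minimal sketch: which keys count as required is an assumption made here for illustration, since no OpenClaw registry schema is given in this commit, and `check_metadata` is a hypothetical helper.

```python
import json
from typing import List

# Assumed minimum set of keys; adjust to whatever the target registry expects.
REQUIRED_KEYS = {"name", "version", "entry", "license"}


def check_metadata(text: str) -> List[str]:
    """Return a list of human-readable problems found in a skill.json document."""
    data = json.loads(text)
    problems = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - set(data))]
    if not isinstance(data.get("tags", []), list):
        problems.append("tags must be a list")
    if data.get("safe_use_only") is not True:
        problems.append("safe_use_only should be true")
    return problems


if __name__ == "__main__":
    sample = json.dumps(
        {
            "name": "crawler-reverse",
            "version": "1.0.0",
            "entry": "SKILL.md",
            "license": "MIT",
            "tags": ["debugging"],
            "safe_use_only": True,
        }
    )
    print(check_metadata(sample))
```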