knowns init → AI extract → Review and supplement → Repeat → Knowledge ready

A real-world problem

You inherit a project that has code and nothing else: no README, no architecture decisions, no business flow. Every time you onboard someone new or start a fresh AI session, the code has to be read again from scratch. This article shows how to use Knowns together with an AI assistant to extract the knowledge in a codebase into structured docs.

The problem

Many real-world projects end up in this state:

- The code runs, but nobody knows why it was designed that way
- Business logic is scattered across the code, with no document describing the flow
- Every time an AI assistant starts a new session, it has to reread the entire codebase
- Onboarding a new developer takes weeks just to understand the project

Knowns addresses this by having AI extract knowledge from the code into structured docs, and by keeping those docs persistent across sessions.

Initialize Knowns in the project

First, run knowns init in the project directory:

cd your-legacy-project
knowns init

knowns init

✓ Created .knowns/ directory
✓ Generated KNOWNS.md guidelines
✓ MCP server configured
✓ Ready to go

File structure

your-legacy-project/
├── src/                    ← existing code, no docs
│   ├── controllers/
│   ├── services/
│   ├── models/
│   └── routes/
├── .knowns/                ← newly created
│   ├── docs/
│   ├── tasks/
│   └── config.json
└── KNOWNS.md               ← guidelines for AI

At this point .knowns/docs/ is still empty. The next step is to have AI fill it.

Use AI to extract knowledge

Open an AI assistant (Claude Code, Codex, OpenCode, or any tool that supports MCP) and start extracting.

No need to extract everything at once

The first pass may only capture the overview. Later passes add more: business flows, architecture, patterns, conventions. With each pass, the AI has more context and can extract more accurately.

Pass 1: Extract the project overview

Prompt for AI

You: Read this entire codebase and extract knowledge into Knowns docs.
   Create the following docs:
   - Project overview (tech stack, purpose, directory structure)
   - Architecture (the layers and how they connect)
   - Main business flow (user journey, data flow)

AI:  [Reading code, analyzing structure...]

   ✓ Created @doc/architecture/overview
     → Tech stack: Express + TypeScript + Prisma + PostgreSQL
     → Monolith with service layer pattern
     → 12 API endpoints, 5 database models

   ✓ Created @doc/architecture/layers
     → Routes → Controllers → Services → Repositories
     → Middleware chain: auth → validate → rate-limit

   ✓ Created @doc/business/user-flow
     → Registration → Email verify → Login → Dashboard
     → Payment flow: Cart → Checkout → Stripe → Webhook
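
The layering and middleware order reported in this pass can be pictured with a small framework-free sketch. Everything in it (the `Request`/`Response` shapes, `UserService`, `UserRepository`, `compose`, the in-memory storage) is hypothetical and only stands in for the real Express and Prisma code, which the article does not show.

```typescript
// Hypothetical sketch of Routes → Controllers → Services → Repositories with a
// middleware chain auth → validate → rate-limit. Not the project's real code.

interface Request { userId?: string; body: { email?: string }; }
interface Response { status: number; body: unknown; }
type Handler = (req: Request) => Response;
type Middleware = (req: Request, next: Handler) => Response;

// Repository: the only layer that touches storage (an in-memory Map here).
class UserRepository {
  private emails = new Map<string, string>();
  save(id: string, email: string): void { this.emails.set(id, email); }
  find(id: string): string | undefined { return this.emails.get(id); }
}

// Service: business rules, with no knowledge of HTTP.
class UserService {
  private repo: UserRepository;
  constructor(repo: UserRepository) { this.repo = repo; }
  updateEmail(id: string, email: string): string {
    this.repo.save(id, email.toLowerCase());
    return this.repo.find(id) as string;
  }
}

// Controller: turns a request into a service call and a response.
const makeController = (service: UserService): Handler => (req) => ({
  status: 200,
  body: { email: service.updateEmail(req.userId as string, req.body.email as string) },
});

const auth: Middleware = (req, next) =>
  req.userId ? next(req) : { status: 401, body: "unauthenticated" };

const validate: Middleware = (req, next) =>
  (req.body.email ?? "").indexOf("@") > 0 ? next(req) : { status: 400, body: "invalid email" };

// Simplified limiter: a global call counter instead of a per-client time window.
const rateLimit = (maxCalls: number): Middleware => {
  let calls = 0;
  return (req, next) => {
    calls += 1;
    return calls > maxCalls ? { status: 429, body: "too many requests" } : next(req);
  };
};

// Route: wraps the controller so the first middleware in the list runs first.
const compose = (middlewares: Middleware[], handler: Handler): Handler =>
  middlewares.reduceRight<Handler>((next, mw) => (req) => mw(req, next), handler);

const route = compose(
  [auth, validate, rateLimit(2)],
  makeController(new UserService(new UserRepository())),
);
```

Because `auth` and `validate` sit in front of the limiter, rejected requests never count toward the limit in this sketch; whether the real project behaves the same way is something the extracted doc should state explicitly.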

Pass 2: Extract patterns and conventions

Adding more knowledge

You: Now extract more:
   - Error handling patterns
   - Detailed authentication flow
   - Database conventions (naming, relations)
   - API response format

AI:  [Reading code with context from the docs already created...]

   ✓ Created @doc/patterns/error-handling
     → Custom AppError class with error codes
     → Global error middleware catches all
     → Structured response: { success, data, error }

   ✓ Created @doc/patterns/auth
     → JWT with refresh token rotation
     → bcrypt for password hashing
     → Role-based access: admin, user, guest

   ✓ Created @doc/conventions/database
     → snake_case for columns, PascalCase for models
     → Soft delete with deleted_at column
     → All timestamps in UTC
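
When reviewing a doc like @doc/patterns/error-handling, it helps to have a concrete picture of what the pattern usually looks like. The following is a minimal sketch, assuming an `AppError` that carries a code and an HTTP status and a `{ success, data, error }` envelope where the unused field is `null`. The helper names (`ok`, `toErrorResponse`, `handle`), the `httpStatus` field, and the `INTERNAL_ERROR` code are assumptions, not taken from the project.

```typescript
// Hypothetical sketch of the error-handling pattern. Only AppError and the
// { success, data, error } envelope come from the extracted doc summary.

type ApiResponse<T> =
  | { success: true; data: T; error: null }
  | { success: false; data: null; error: { code: string; message: string } };

class AppError extends Error {
  readonly code: string;
  readonly httpStatus: number;
  constructor(code: string, message: string, httpStatus = 400) {
    super(message);
    // Keeps instanceof working even when the class is compiled down to ES5.
    Object.setPrototypeOf(this, AppError.prototype);
    this.name = "AppError";
    this.code = code;
    this.httpStatus = httpStatus;
  }
}

const ok = <T>(data: T): ApiResponse<T> => ({ success: true, data, error: null });

// Stand-in for a global error middleware: every thrown value becomes the same envelope.
function toErrorResponse(err: unknown): { status: number; body: ApiResponse<never> } {
  if (err instanceof AppError) {
    return {
      status: err.httpStatus,
      body: { success: false, data: null, error: { code: err.code, message: err.message } },
    };
  }
  // Unknown errors are masked so internal details do not leak to clients.
  return {
    status: 500,
    body: { success: false, data: null, error: { code: "INTERNAL_ERROR", message: "Unexpected error" } },
  };
}

// Wraps a handler so callers always receive the structured response.
function handle<T>(fn: () => T): { status: number; body: ApiResponse<T> } {
  try {
    return { status: 200, body: ok(fn()) };
  } catch (err) {
    return toErrorResponse(err);
  }
}
```

If the extracted doc describes a different envelope (for example, omitting `error` on success), that difference is exactly the kind of detail worth correcting during review.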

Pass 3: Extract complex business logic

Deep dive into business logic

You: Extract the business logic of the payment flow and
   inventory management in detail. Spell out the edge cases.

AI:  ✓ Created @doc/business/payment-flow
     → Stripe integration with webhook verification
     → Idempotency key for retry safety
     → Edge case: partial refund recalculates inventory

   ✓ Created @doc/business/inventory
     → Optimistic locking for concurrent updates
     → Low stock threshold triggers notification
     → Backorder flow when stock = 0
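
"Optimistic locking for concurrent updates" is easy to write down and easy to misread, so a reviewer may want a reference picture of the technique. This is a minimal sketch, assuming a version column that is checked on every write. `InventoryStore`, `StockRow`, `VersionConflictError`, and `reserve` are hypothetical names, and the in-memory Map stands in for a database table.

```typescript
// Hypothetical sketch of optimistic locking with a version number per row.

interface StockRow { sku: string; quantity: number; version: number; }

class VersionConflictError extends Error {}

class InventoryStore {
  private rows = new Map<string, StockRow>();

  insert(sku: string, quantity: number): void {
    this.rows.set(sku, { sku, quantity, version: 1 });
  }

  read(sku: string): StockRow {
    const row = this.rows.get(sku);
    if (!row) throw new Error(`Unknown SKU: ${sku}`);
    return { ...row }; // a copy, so callers hold a snapshot
  }

  // Similar in spirit to:
  //   UPDATE stock SET quantity = ?, version = version + 1
  //   WHERE sku = ? AND version = ?
  // where zero updated rows means another writer got there first.
  update(sku: string, quantity: number, expectedVersion: number): StockRow {
    const current = this.read(sku);
    if (current.version !== expectedVersion) {
      throw new VersionConflictError(`Stale version for ${sku}`);
    }
    const next: StockRow = { sku, quantity, version: expectedVersion + 1 };
    this.rows.set(sku, next);
    return { ...next };
  }
}

// Reserve stock, re-reading and retrying when a concurrent writer won the race.
function reserve(store: InventoryStore, sku: string, amount: number, maxRetries = 3): StockRow {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const snapshot = store.read(sku);
    if (snapshot.quantity < amount) throw new Error("Insufficient stock");
    try {
      return store.update(sku, snapshot.quantity - amount, snapshot.version);
    } catch (err) {
      if (!(err instanceof VersionConflictError)) throw err;
    }
  }
  throw new VersionConflictError(`Gave up after ${maxRetries} retries`);
}
```

The sketch throws on insufficient stock, whereas the extracted doc mentions a backorder flow when stock reaches 0; the real behavior at that boundary is an edge case to confirm against the code.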

Review and edit

Once the AI has finished extracting, review the result for accuracy:

# List the docs that were created
knowns doc list --plain

# Read a single doc
knowns doc "architecture/overview" --plain --smart

# Open the Web UI for a visual review
knowns browser --open

AI can be wrong

AI extracts from the code alone, so it can miss the context behind why a decision was made. Review carefully and add the context only humans know, for example: "We use Stripe because the client required it, not for a technical reason."

Edit directly through the CLI or the Web UI:

# Append content to a doc
knowns doc edit "architecture/overview" \
  --append "## Decision Log\n\n- Chose Prisma over TypeORM because the team knows it better"

Or ask AI to edit:

AI edits

You: Doc @doc/patterns/auth is missing the details of refresh token rotation.
   Add the refresh token flow in detail.

AI:  ✓ Updated @doc/patterns/auth
   → Added: Refresh token rotation flow
   → Added: Token blacklist on logout
   → Added: 7-day refresh token expiry
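
To check whether the updated doc describes rotation correctly, it can help to compare it against a bare-bones model of the technique. The sketch below assumes single-use refresh tokens, a 7-day expiry, and revocation on logout (a `revoked` flag standing in for the blacklist). `RefreshTokenService`, `TokenGenerator`, and the method names are hypothetical, and the token generator is injected so the example stays dependency-free.

```typescript
// Hypothetical in-memory sketch of refresh token rotation. Not the project's code.

type TokenGenerator = () => string;

interface RefreshRecord { userId: string; expiresAt: number; revoked: boolean; }

const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;

class RefreshTokenService {
  private records = new Map<string, RefreshRecord>();
  private generate: TokenGenerator;

  constructor(generate: TokenGenerator) { this.generate = generate; }

  // Issue a new refresh token valid for seven days from `now` (epoch milliseconds).
  issue(userId: string, now: number): string {
    const token = this.generate();
    this.records.set(token, { userId, expiresAt: now + SEVEN_DAYS_MS, revoked: false });
    return token;
  }

  // Rotation: a refresh token is single-use. Presenting it revokes it and
  // returns a replacement; presenting it a second time is rejected.
  rotate(token: string, now: number): string {
    const record = this.records.get(token);
    if (!record || record.revoked || record.expiresAt <= now) {
      throw new Error("Invalid refresh token");
    }
    record.revoked = true;
    return this.issue(record.userId, now);
  }

  // Logout: revoke the token so it can no longer be rotated.
  logout(token: string): void {
    const record = this.records.get(token);
    if (record) record.revoked = true;
  }
}
```

In real use the generator should produce cryptographically secure random values, and the records would live in a database or cache rather than in process memory.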

Repeat until it is enough

Knowledge extraction is an iterative process. Each time you work on the project, extract what you have learned:

Continuous extraction

Session 1: Project overview + architecture
Session 2: Auth patterns + error handling
Session 3: Payment business flow + edge cases
Session 4: Deployment conventions + env config
Session 5: AI now understands nearly the whole project

File structure

.knowns/docs/
├── architecture/
│   ├── overview.md          ← tech stack, structure
│   └── layers.md            ← service layer pattern
├── business/
│   ├── user-flow.md         ← registration → dashboard
│   ├── payment-flow.md      ← cart → stripe → webhook
│   └── inventory.md         ← stock management
├── patterns/
│   ├── error-handling.md    ← AppError, middleware
│   └── auth.md              ← JWT, refresh tokens
└── conventions/
    ├── database.md          ← naming, relations
    └── api-response.md      ← response format
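
The references used in the transcripts (such as @doc/patterns/auth) appear to line up with files in this tree. As a small illustration, here is a sketch of that mapping. It is an assumption inferred from the tree and the transcripts, not from Knowns' own documentation, and `docRefToPath` is a hypothetical helper.

```typescript
// Assumption: "@doc/patterns/auth" corresponds to ".knowns/docs/patterns/auth.md".

function docRefToPath(ref: string, docsRoot = ".knowns/docs"): string {
  const prefix = "@doc/";
  if (ref.indexOf(prefix) !== 0) {
    throw new Error(`Not a doc reference: ${ref}`);
  }
  const segments = ref.slice(prefix.length).split("/");
  // Reject empty, "." and ".." segments so a reference cannot escape docsRoot.
  if (segments.some((s) => s === "" || s === "." || s === "..")) {
    throw new Error(`Invalid doc reference: ${ref}`);
  }
  return `${docsRoot}/${segments.join("/")}.md`;
}
```

A helper like this is mainly useful in scripts that lint docs for dangling references.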

Pro tip: Use sub-agents to extract faster

Sub-agents = parallel extraction

Instead of having one AI agent read the whole codebase sequentially, you can spawn several sub-agents that extract in parallel, each focused on a different part.

In Claude Code, Codex, or OpenCode, you can ask the AI to create sub-agents:

Prompt using sub-agents

You: Extract knowledge from this codebase. Use sub-agents to work in parallel:
   - Agent 1: Extract architecture and tech stack
   - Agent 2: Extract business flows
   - Agent 3: Extract patterns and conventions

AI:  Spawning 3 sub-agents...

   [Agent 1] Analyzing architecture...
   [Agent 2] Tracing business flows...
   [Agent 3] Identifying patterns...

   [Agent 1] ✓ Created @doc/architecture/overview
   [Agent 1] ✓ Created @doc/architecture/layers
   [Agent 2] ✓ Created @doc/business/user-flow
   [Agent 2] ✓ Created @doc/business/payment-flow
   [Agent 3] ✓ Created @doc/patterns/error-handling
   [Agent 3] ✓ Created @doc/patterns/auth
   [Agent 3] ✓ Created @doc/conventions/database

   All agents complete. 7 docs created.

This can save a lot of time, especially on a large codebase.
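
The fan-out idea is the same one used for any concurrent work: start all tasks, then wait for all of them. The sketch below models it with plain async functions. It does not use any real sub-agent API from Claude Code, Codex, or OpenCode; `Agent`, `makeAgent`, and `extractInParallel` are hypothetical names, and each "agent" simply returns the doc names from the transcript above.

```typescript
// Hypothetical sketch: sub-agents modeled as async functions run with Promise.all.

interface AgentResult { agent: string; docs: string[]; }
type Agent = () => Promise<AgentResult>;

// A fake agent that immediately resolves with the docs it "created".
const makeAgent = (agent: string, docs: string[]): Agent => async () => ({ agent, docs });

async function extractInParallel(agents: Agent[]): Promise<string[]> {
  // Start every agent before awaiting any of them, so they run concurrently.
  const results = await Promise.all(agents.map((run) => run()));
  // Promise.all preserves input order, so docs come back grouped by agent.
  return results.reduce<string[]>((all, r) => all.concat(r.docs), []);
}

const agents: Agent[] = [
  makeAgent("Agent 1", ["architecture/overview", "architecture/layers"]),
  makeAgent("Agent 2", ["business/user-flow", "business/payment-flow"]),
  makeAgent("Agent 3", ["patterns/error-handling", "patterns/auth", "conventions/database"]),
];
```

With real agents doing slow work, total time tends toward that of the slowest agent rather than the sum. Note that `Promise.all` rejects as soon as one agent fails; `Promise.allSettled` is the usual alternative when partial results are acceptable.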

Results

Before Knowns	After extraction
Code has no docs, AI rereads it every session	AI reads the extracted docs and starts right away
Onboarding takes weeks	Reading the docs for 30 minutes is enough to understand the project
Business logic lives only in former developers' heads	Business flows are clearly documented
Every AI session starts from zero	AI has memory + docs and picks up where it left off
Nobody knows why the code was written that way	The decision log records the reasons

Extraction checklist

When extracting knowledge from legacy code, make sure you cover the following:

Item	Examples
Project overview	Tech stack, purpose, structure
Architecture	Layers, patterns, dependencies
Business flows	User journeys, data flows
Patterns	Error handling, auth, validation
Conventions	Naming, file structure, API format
Decision log	Why X was chosen over Y
Edge cases	Special cases in the business logic

Get started

All you need is knowns init and an AI assistant. There is no need to write docs by hand: let AI read the code and extract for you. Review, supplement, repeat. After a few sessions, your project's knowledge will be captured in docs that any AI can read.

Extract Knowledge from Legacy Code
