Knowledge Base Optimization: Maximizing RuleWise Agent Performance
Your RuleWise AI agents are only as good as the knowledge base they access. A well-organized, current, and comprehensive knowledge base enables agents to deliver accurate, relevant, and jurisdiction-specific responses. A poorly maintained knowledge base leads to generic answers, missed information, and frustrated users. This guide shows you how to optimize your knowledge base for maximum agent performance.
Understanding the RuleWise Knowledge Base
The RuleWise knowledge base is a vector database (Pinecone) that stores your organization's policies, procedures, and regulatory information as embedded text chunks, so that AI agents can quickly retrieve relevant content.
How the Knowledge Base Works
Document Upload: When you upload a PDF policy or regulation, RuleWise:
- Extracts all text from the document (with OCR if needed)
- Chunks the content into ~1000 character segments
- Generates semantic embeddings for each chunk
- Stores chunks in Pinecone with metadata
- Scopes content to appropriate namespaces
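The chunking step above can be pictured with a minimal sketch. The ~1000 character target comes from the description above; the function name chunk_text, the sentence-boundary splitting, and the lack of overlap between chunks are assumptions made for illustration, not a description of RuleWise's actual chunker.

```python
import re


def chunk_text(text: str, max_chars: int = 1000) -> list[str]:
    """Split text into chunks of at most max_chars, preferring sentence boundaries.

    Hypothetical helper for illustration; the real chunker may behave differently.
    """
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # A single sentence longer than the limit is cut into fixed-size pieces.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # Adding this sentence would exceed the limit, so start a new chunk.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

A section that ends cleanly on a sentence boundary is more likely to land in one coherent chunk, which is why document structure matters later in this guide.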
Agent Retrieval: When an agent needs information, it:
- Converts the user's question into an embedding
- Searches vector database for semantically similar chunks
- Automatically scopes search to user's organization and jurisdictions
- Returns most relevant chunks with similarity scores
- Uses retrieved content to inform responses
Key Insight: Agents don't read entire documents—they retrieve small, relevant chunks. Optimization focuses on ensuring the right chunks are retrieved for each query.
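To make the retrieval step concrete, here is a small sketch of ranking chunks by similarity to a query. It assumes cosine similarity as the scoring function and uses hand-written three-dimensional vectors in place of real embeddings; in RuleWise the embeddings come from a model and the search runs inside Pinecone, so the names and data below are purely illustrative.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vec: list[float], chunks: list[dict], k: int = 2) -> list[tuple[float, str]]:
    """Score every chunk against the query and return the k best (score, text) pairs."""
    scored = [(cosine_similarity(query_vec, c["embedding"]), c["text"]) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy data: made-up embeddings standing in for real model output.
SAMPLE_CHUNKS = [
    {"text": "Gifts and entertainment: pre-approval rules", "embedding": [0.9, 0.1, 0.0]},
    {"text": "High-risk customer criteria", "embedding": [0.1, 0.9, 0.1]},
    {"text": "Regulatory reporting deadlines", "embedding": [0.0, 0.2, 0.9]},
]

if __name__ == "__main__":
    # A query vector pointing mostly along the first axis ranks the gifts chunk first.
    for score, text in top_k_chunks([1.0, 0.1, 0.0], SAMPLE_CHUNKS):
        print(f"{score:.3f}  {text}")
```

Only the top-scoring chunks reach the agent, so a chunk that lacks the words and context of the question may never be seen, even if the full document contains the answer.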
The Knowledge Base Hierarchy
RuleWise organizes knowledge in two primary scopes:
Organization-Scoped Content
What: Your firm's internal policies, procedures, and compliance frameworks.
Namespace: org-${organizationId} (e.g., org-123)
Who Uploads: Organization administrators and compliance officers
Examples:
- AML/CFT policy
- Conflicts of interest policy
- Insider trading policy
- Risk assessment frameworks
- Compliance testing procedures
- Board governance policies
- Training materials
- Procedure manuals
Jurisdiction-Scoped Content
What: Regulatory requirements specific to jurisdictions you operate in.
Namespace: Jurisdiction slug (e.g., guernsey, jersey, eu)
Who Uploads: System administrators (RuleWise team)
Examples:
- GFSC Handbook sections
- Financial services regulations
- AML/CFT legislation
- Data protection regulations
- Market conduct rules
- Regulatory guidance and notices
How Agents Use Both Scopes
When you ask an agent a question, it automatically searches:
- Your organization's namespace (firm-specific content)
- All enabled jurisdiction namespaces (regulations applicable to your firm)
This ensures responses combine your internal policies with relevant regulatory requirements.
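The scoping rule can be sketched as a small function that builds the list of namespaces a query would search. The org-{id} format and the use of jurisdiction slugs as namespace names come from the descriptions above; the function name and the lowercasing and de-duplication steps are assumptions for illustration.

```python
def namespaces_for_query(organization_id: str, enabled_jurisdictions: list[str]) -> list[str]:
    """Return the namespaces searched for a user: the org namespace plus enabled jurisdictions."""
    namespaces = [f"org-{organization_id}"]
    for slug in enabled_jurisdictions:
        # Assumption: slugs are lowercase; normalise and skip duplicates defensively.
        normalised = slug.strip().lower()
        if normalised and normalised not in namespaces:
            namespaces.append(normalised)
    return namespaces


if __name__ == "__main__":
    print(namespaces_for_query("123", ["guernsey", "jersey"]))
```

A document uploaded to the wrong scope will simply not appear in this list for the users who need it, which is why appropriate scoping is listed among the quality standards later in this guide.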
Document Preparation Best Practices
High-quality document preparation dramatically improves agent performance.
Document Quality
Use Text-Based PDFs: Text-based PDFs are processed fastest and most accurately. Image-only PDFs require OCR, which may introduce errors.
Ensure Readability: Documents should be:
- Clear and legible (not blurry or low resolution)
- Properly oriented (not sideways or upside down)
- Complete (no missing pages)
- Searchable (test by opening in PDF reader and searching)
Avoid Scanned Handwriting: Handwritten documents don't work well even with OCR. Convert to typed text before uploading.
Document Structure
Use Clear Headings: Well-structured documents with clear section headings improve chunking quality.
Example - Good Structure:
AML/CFT Policy
1. Purpose and Scope
[content]
2. Regulatory Framework
[content]
3. Risk Assessment
3.1 Customer Risk Assessment
[content]
3.2 Geographic Risk Assessment
[content]
Why It Matters: Clear structure ensures chunks contain coherent, self-contained information rather than fragments split mid-sentence.
Include Table of Contents: TOCs help agents understand document structure and navigate to relevant sections.
Number Sections Consistently: Numbered sections (1.1, 1.2, etc.) help agents cite specific policy sections.
Document Metadata
Use Descriptive Filenames: Filenames should clearly identify content.
Good: AML-CFT-Policy-v3-Approved-2024-10-15.pdf
Poor: Document1.pdf
Include Dates: Indicate document version and approval dates.
Specify Jurisdiction: If a document applies to specific jurisdictions, include the jurisdiction in the filename.
Example: Guernsey-Regulatory-Reporting-Procedures-2024.pdf
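A pre-upload check along these lines could catch weak filenames before they reach the knowledge base. This is a sketch only: RuleWise does not define a formal naming convention in this guide, so the rules below (PDF extension, no generic placeholder name, a year or full date somewhere in the name) are assumptions distilled from the examples above.

```python
import re

# Matches a four-digit year such as 2024, including one inside 2024-10-15.
YEAR_RE = re.compile(r"(?<!\d)(?:19|20)\d{2}(?!\d)")
# Matches placeholder names like "Document1" or "scan_02" (assumed list of prefixes).
GENERIC_RE = re.compile(r"^(document|scan|untitled|new)[-_ ]?\d*$", re.IGNORECASE)


def filename_issues(filename: str) -> list[str]:
    """Return a list of naming problems; an empty list means the name looks descriptive."""
    issues: list[str] = []
    stem, dot, ext = filename.rpartition(".")
    if not dot or ext.lower() != "pdf":
        issues.append("not a PDF")
        stem = stem or filename  # no extension at all: treat the whole name as the stem
    if GENERIC_RE.match(stem):
        issues.append("generic name")
    if not YEAR_RE.search(stem):
        issues.append("no year or date")
    return issues


if __name__ == "__main__":
    for name in ["AML-CFT-Policy-v3-Approved-2024-10-15.pdf", "Document1.pdf"]:
        print(name, "->", filename_issues(name) or "ok")
```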
Content Organization Strategies
Organize knowledge base content strategically.
Policy Documents
Upload Current Versions Only: Don't upload outdated policy versions—they confuse agents.
Exception: If you need historical reference, clearly label: AML-Policy-v2-SUPERSEDED-2023.pdf
Separate Policies by Topic: Upload each policy as a separate document rather than one massive policy manual.
Why: Agents retrieve content from focused, single-topic documents more accurately than from a 200-page manual.
Include Policy Dates: Always include effective dates, review dates, and version numbers in the document itself.
Procedures and Guides
Create Standalone Procedures: Each procedure should be a complete, standalone document.
Example: Upload separate documents for:
- Customer onboarding procedures
- Transaction monitoring procedures
- SAR filing procedures
- Risk assessment procedures
Rather than one "AML Procedures" manual containing all of them.
Use Step-by-Step Formats: Procedures work best when structured as clear steps:
Transaction Monitoring Alert Investigation Procedure
1. Alert Receipt
1.1 Review alert details in system
1.2 Verify alert threshold and trigger
2. Initial Assessment
2.1 Review customer profile
2.2 Review transaction history
...
Regulatory Content
Upload Relevant Sections: For large regulatory handbooks, upload sections relevant to your business rather than entire handbooks.
Example: If you're an investment adviser, upload:
- Investment business regulations
- AML/CFT regulations
- Data protection regulations
- Market conduct rules
Rather than the entire financial services handbook covering banking, insurance, etc.
Keep Regulations Current: Update regulatory content when regulations change. Remove superseded versions.
Training Materials
Upload Training Content: Training materials help agents understand how your firm explains compliance concepts to employees.
Include:
- Training presentations
- Training manuals
- Case studies
- Procedure guides
Why: When employees ask compliance questions, agents can respond in a way that is consistent with how those employees were trained.
Document Lifecycle Management
Maintain knowledge base currency and relevance.
Upload Cadence
Immediate Upload: Upload these documents immediately:
- Newly approved policies
- Updated procedures
- New regulatory guidance
- Training materials after approval
Quarterly Review: Every quarter, review:
- Are all current policies uploaded?
- Have any policies been updated but not re-uploaded?
- Are there new regulatory requirements to upload?
- Should any outdated content be removed?
Annual Audit: Once per year, comprehensively audit:
- Complete policy inventory against uploads
- All uploaded documents still current
- Jurisdiction regulatory content is up-to-date
- Remove any unnecessary or outdated content
Version Control
Replace Rather Than Accumulate: When a policy is updated, replace the old version rather than having both in the knowledge base.
Process:
- Upload new version with clear filename: AML-Policy-v4-2025-01-15.pdf
- Remove old version: AML-Policy-v3-2024-10-15.pdf
- Verify new version is retrievable by agents
Exception: If you need to maintain historical versions for reference, clearly label them as superseded and consider separate storage.
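If filenames carry a version token, finding stale versions can be partly automated. The sketch below assumes names shaped like Name-vN-anything.pdf, as in the examples above; the function name and the third filename in the demo are hypothetical, and files that do not follow the pattern are simply ignored.

```python
import re
from collections import defaultdict

# Captures the base name and the numeric version, e.g. "AML-Policy" and "4".
VERSIONED_RE = re.compile(r"^(?P<base>.+?)-v(?P<version>\d+)(?:-.*)?\.pdf$", re.IGNORECASE)


def superseded_files(filenames: list[str]) -> list[str]:
    """Return every file that has a higher-numbered version with the same base name."""
    by_base: dict[str, list[tuple[int, str]]] = defaultdict(list)
    for name in filenames:
        match = VERSIONED_RE.match(name)
        if match:
            by_base[match.group("base").lower()].append((int(match.group("version")), name))
    stale: list[str] = []
    for versions in by_base.values():
        versions.sort()
        # Everything except the highest version is a candidate for removal.
        stale.extend(name for _, name in versions[:-1])
    return sorted(stale)


if __name__ == "__main__":
    uploaded = [
        "AML-Policy-v4-2025-01-15.pdf",
        "AML-Policy-v3-2024-10-15.pdf",
        "Conflicts-Policy-v1-2024-03-01.pdf",  # hypothetical filename
    ]
    print(superseded_files(uploaded))
```

Treat the output as a review list rather than a delete list, since a file may be intentionally retained as a labeled historical version.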
Removal Guidelines
Remove:
- Superseded policy versions (after new version uploaded)
- Draft documents (never upload drafts)
- Outdated regulatory guidance
- Expired temporary procedures
- Irrelevant content accidentally uploaded
Verify Before Removing: Before removing a document, ask agents questions that should reference it and note what they cite, so you can confirm that removal will not leave a gap in agent responses.
Optimizing for Agent Retrieval
Structure content to maximize agent effectiveness.
Write for Discoverability
Include Keywords: Policies should include terms people actually use.
Example: A conflicts policy should mention:
- "Conflicts of interest"
- "Personal account dealing"
- "Outside business interests"
- "Gifts and entertainment"
- "Related party transactions"
Even if these are addressed in subsections, mentioning them in the introduction helps agents find the right policy.
Use Question-Answer Format
Consider including FAQ sections in policies:
Frequently Asked Questions
Q: How do I determine if a gift is acceptable?
A: Gifts under £100 are generally acceptable. Gifts over £100 require compliance approval...
Q: What if a client invites me to an entertainment event?
A: Entertainment must be reasonable, business-related, and pre-approved if over £200...
Why: When users ask these exact questions, agents can retrieve these answers directly.
Create Summary Sections
Include executive summaries or key points sections:
Key Policy Requirements - Quick Reference
1. All conflicts must be disclosed to compliance within 48 hours
2. Gifts over £100 require pre-approval
3. Personal trading requires pre-clearance for restricted securities
4. Outside business interests must be approved annually
These summaries improve agent retrieval accuracy.
Cross-Reference Related Documents
Explicitly reference related policies:
This AML Policy should be read in conjunction with:
- Customer Due Diligence Procedures
- Transaction Monitoring Procedures
- Sanctions Screening Policy
- Suspicious Activity Reporting Procedures
These explicit cross-references help agents identify and retrieve related documents when needed.
Testing Knowledge Base Effectiveness
Verify that your knowledge base delivers expected results.
Regular Testing Protocol
Monthly Testing: Test agent responses to common questions.
Test Set Example:
- "What is our policy on accepting gifts from clients?"
- "What customer due diligence is required for high-risk clients?"
- "What are the Guernsey regulatory reporting deadlines?"
- "How do we determine if a transaction requires enhanced due diligence?"
- "What is our process for filing suspicious activity reports?"
Evaluate:
- Does the agent retrieve the correct policy/procedure?
- Is the response accurate and complete?
- Do citations point to the correct document sections?
- Is jurisdiction-specific information accurate?
- Are responses current (not based on outdated content)?
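The monthly test set can be run as a simple scripted check. This sketch assumes a hypothetical ask_agent callable that returns a dictionary with a "citations" list of document names; RuleWise's real interface is not described in this guide, so the stub agent and document names below exist only to make the example runnable.

```python
from typing import Callable


def run_test_set(
    ask_agent: Callable[[str], dict],
    test_cases: list[tuple[str, str]],
) -> tuple[float, list[dict]]:
    """Ask each question and check whether the expected document appears in the citations."""
    results = []
    for question, expected_doc in test_cases:
        response = ask_agent(question)
        cited = response.get("citations", [])
        results.append({"question": question, "passed": expected_doc in cited, "cited": cited})
    pass_rate = sum(r["passed"] for r in results) / len(results) if results else 0.0
    return pass_rate, results


def fake_agent(question: str) -> dict:
    """Stand-in for a real agent call: only knows about the gifts policy."""
    if "gifts" in question.lower():
        return {"answer": "...", "citations": ["Conflicts-Policy.pdf"]}
    return {"answer": "I don't have information about this", "citations": []}


if __name__ == "__main__":
    cases = [
        ("What is our policy on accepting gifts from clients?", "Conflicts-Policy.pdf"),
        ("What is our process for filing suspicious activity reports?", "SAR-Procedure.pdf"),
    ]
    rate, details = run_test_set(fake_agent, cases)
    print(f"pass rate: {rate:.0%}")
    for row in details:
        print("PASS" if row["passed"] else "FAIL", row["question"])
```

A citation match only covers the first evaluation point; accuracy, completeness, and currency of the answer still need a human reviewer.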
Gap Identification
When agents fail to answer questions correctly:
Diagnose the Issue:
Problem: Agent says "I don't have information about this"
Solution: Relevant document likely not uploaded. Upload the policy/procedure.
Problem: Agent provides generic answer without citing firm policies
Solution: Policy may be uploaded but not easily retrievable. Improve document structure or add keywords.
Problem: Agent cites outdated version
Solution: Old version still in knowledge base. Remove it.
Problem: Agent can't find specific information that exists in documents
Solution: Information may be buried in a large document. Consider extracting into standalone procedure.
Remediation Process
When gaps are identified:
- Identify missing or problematic content
- Upload missing documents or improve existing ones
- Re-test to verify improvement
- Document the issue and solution for future reference
Advanced Optimization Techniques
Strategic Document Chunking
While RuleWise automatically chunks documents, you can optimize for chunking:
Use Logical Section Breaks: Ensure each section is self-contained. Avoid sections that only make sense if you've read previous sections.
Optimal Section Length: Aim for sections of 300-1500 words. Very short sections may lack context; very long sections may be split mid-concept.
Repeat Context: Include enough context in each section that chunks make sense standalone.
Example:
Instead of:
3. High-Risk Customers
3.1 Criteria
- PEPs
- High-risk jurisdictions
- Complex structures
Use:
3. High-Risk Customers
Our firm applies enhanced due diligence to customers meeting high-risk criteria under our risk-based approach to AML compliance.
3.1 High-Risk Customer Criteria
Customers are classified as high-risk if they meet any of these criteria:
- Politically Exposed Persons (PEPs)
- Customers from high-risk jurisdictions
- Customers with complex ownership structures
The second version provides context even if only that chunk is retrieved.
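Section lengths can be checked before upload with a short script. The sketch below uses the 300-1500 word guideline from this section and assumes headings are lines that start with a section number such as "3." or "3.1"; that heading rule is an assumption, and it will also treat numbered list items in the body as headings, so results need a quick manual review.

```python
import re

# A heading is assumed to be a line starting with a section number like "3." or "3.1".
HEADING_RE = re.compile(r"^\d+(?:\.\d+)*\.?\s+\S")


def section_word_counts(text: str) -> dict[str, int]:
    """Map each numbered heading to the number of words in the body beneath it."""
    counts: dict[str, int] = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if HEADING_RE.match(stripped):
            current = stripped
            counts[current] = 0
        elif current is not None:
            counts[current] += len(stripped.split())
    return counts


def sections_outside_range(text: str, min_words: int = 300, max_words: int = 1500) -> list[str]:
    """Return headings whose body is shorter or longer than the recommended range."""
    return [
        heading
        for heading, words in section_word_counts(text).items()
        if not min_words <= words <= max_words
    ]


if __name__ == "__main__":
    sample = "1. Purpose\n" + "word " * 350 + "\n2. Scope\nshort text here\n"
    print(sections_outside_range(sample))
```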
Jurisdiction-Specific Optimization
Clearly Identify Jurisdiction Applicability: In multi-jurisdiction firms, clearly indicate which jurisdictions each policy applies to.
Example:
Data Protection Policy
Applicable Jurisdictions: Guernsey, Jersey, EU
This policy implements data protection requirements under:
- Guernsey Data Protection Law
- Jersey Data Protection Law
- EU General Data Protection Regulation (GDPR)
Separate When Necessary: If requirements differ significantly across jurisdictions, consider separate policies or clearly delineated sections.
Regular Content Refresh
Monitor Usage Patterns: Pay attention to which documents agents reference most frequently.
Enhance High-Use Documents: Invest in making frequently-referenced policies more comprehensive and better structured.
Evaluate Low-Use Documents: If documents are never referenced, consider whether they're:
- Not actually useful (consider removing)
- Poorly structured (improve and re-upload)
- Mislabeled (fix filename and metadata)
Knowledge Base Governance
Establish organizational practices around knowledge base management.
Roles and Responsibilities
Compliance Manager: Overall knowledge base quality and completeness
Policy Owners: Ensuring their policies are current in knowledge base
System Administrator: Technical management and optimization
Regular Users: Providing feedback on agent response quality
Quality Standards
Document Standards:
- All uploaded documents are approved, current versions
- Documents are properly formatted and readable
- Filenames follow naming conventions
- Superseded versions are removed
- Documents are appropriately scoped (organization vs. jurisdiction)
Performance Standards:
- Agents correctly answer 90%+ of common questions
- Response accuracy verified through monthly testing
- Gaps identified and remediated within 5 business days
- Knowledge base currency verified quarterly
Change Management
When Policies Change:
- Policy approved by appropriate authority
- New version uploaded to knowledge base within 24 hours
- Old version removed
- Test agent responses to verify correct retrieval
- Communicate policy change to organization
When Regulations Change:
- System administrator notified of regulatory change
- Updated regulatory content uploaded
- Superseded content removed
- Affected organizations notified
- Test agent responses for accuracy
Measuring Knowledge Base Quality
Track metrics that indicate knowledge base health.
Coverage Metrics
Policy Coverage: Percentage of required policies uploaded
- Target: 100% of approved policies
Currency: Percentage of uploaded policies that are current versions
- Target: 100% current
Completeness: Percentage of policy framework areas with documentation
- Target: 100% of material areas
Performance Metrics
Agent Answer Rate: Percentage of user questions agents can answer with citations
- Target: 90%+
Retrieval Accuracy: Percentage of retrievals that cite correct, relevant documents
- Target: 95%+
User Satisfaction: User ratings of agent response quality
- Target: 4.0/5.0+
Operational Metrics
Upload Timeliness: Average days from policy approval to knowledge base upload
- Target: <2 days
Gap Remediation Time: Average days from gap identification to remediation
- Target: <5 days
Review Frequency: Currency review conducted quarterly
- Target: 100% compliance
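Several of these metrics reduce to simple arithmetic over records you already keep. The sketch below shows one way to compute an answer rate and an upload timeliness figure; the helper names and the sample dates and counts are illustrative assumptions, not data from RuleWise.

```python
from datetime import date


def percentage(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage, or 0.0 when there is no data."""
    return 100.0 * numerator / denominator if denominator else 0.0


def average_days(pairs: list[tuple[date, date]]) -> float:
    """Average number of days between each (start, end) pair, e.g. approval to upload."""
    if not pairs:
        return 0.0
    return sum((end - start).days for start, end in pairs) / len(pairs)


if __name__ == "__main__":
    # Illustrative numbers: 47 of 50 test questions answered with citations.
    answer_rate = percentage(47, 50)
    # Illustrative approval and upload dates for two policies.
    timeliness = average_days([
        (date(2025, 1, 13), date(2025, 1, 14)),
        (date(2025, 1, 13), date(2025, 1, 16)),
    ])
    print(f"Agent answer rate: {answer_rate:.1f}% (target 90%+)")
    print(f"Upload timeliness: {timeliness:.1f} days (target <2 days)")
```

Recomputing these figures after each monthly test run gives a trend line, which is usually more informative than any single reading.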
Real-World Knowledge Base Example
Here's how a Guernsey investment adviser optimized their knowledge base:
Initial State (Poor Performance)
Problems:
- 47 documents uploaded haphazardly
- Mix of current and outdated versions
- One 300-page "Compliance Manual" containing everything
- No clear organization
- Agents frequently couldn't find information
Agent Performance:
- Successfully answered only 62% of test questions
- Often provided generic responses without firm-specific citations
- User satisfaction: 2.8/5
Optimization Process
Week 1 - Audit:
- Inventoried all uploaded documents
- Identified outdated versions (removed 18 documents)
- Identified missing policies (discovered 8 policies approved but never uploaded)
Week 2 - Restructure:
- Broke 300-page manual into 23 separate policy documents
- Uploaded 8 missing policies
- Standardized filenames
- Total: 60 well-organized documents
Week 3 - Enhancement:
- Added FAQ sections to top 10 most-referenced policies
- Included executive summaries
- Improved section headings and structure
- Added jurisdiction identifiers
Week 4 - Testing:
- Tested agent responses to 50 common questions
- Remediated remaining gaps
- Validated all policies retrievable
Results After Optimization
Agent Performance:
- Successfully answered 94% of test questions with firm-specific citations
- Response quality dramatically improved
- User satisfaction: 4.4/5
Operational Improvement:
- Established quarterly review process
- New policies uploaded within 24 hours
- Monthly testing protocol implemented
Business Impact:
- Compliance team more efficient
- Employees get faster, more accurate answers
- Regulatory inspection preparation improved
- Training development faster and more accurate
Conclusion
Your knowledge base is the foundation of RuleWise agent performance. Investing in knowledge base optimization—through quality document preparation, strategic organization, lifecycle management, and regular testing—delivers dramatically better agent responses and, ultimately, better compliance outcomes.
Start by auditing your current knowledge base, then remove outdated content, upload missing policies, and establish ongoing maintenance processes. The investment in knowledge base quality pays continuous dividends through superior agent performance.
A well-maintained knowledge base transforms RuleWise agents from helpful tools into indispensable compliance team members that consistently deliver accurate, relevant, firm-specific guidance.
Ready to optimize your knowledge base? Start with a comprehensive audit today.
Related articles: Agent Best Practices and Insight Agent Guide