Posted by Nodus Labs | January 30, 2023
Knowledge Base Text Analysis with NLP
Many organizations find that their knowledge base has become bloated or poorly organized. The ontology created a few years ago is no longer relevant, and it becomes hard to find content, let alone categorize it. This hurts not only the quality of customer support but also search engine optimization, as it becomes increasingly hard to get the support portal to the top of Google search results.
In this case study, we will demonstrate how text analysis, NLP, and the latest AI algorithms can be used to improve any knowledge base, get an overview of its content, and understand how to develop it further. We will also demonstrate how this approach can be used to improve search engine rankings for target keywords.
We will be using our own Nodus Labs knowledge base, hosted by Zendesk, one of the most popular knowledge base CMS providers. Our goals are to
1) Retrieve the main topics to get an overview
2) Categorize the content in a better way so that it’s easier to navigate
3) See what content is missing to create more relevant content for our users
4) Improve SEO to increase acquisition via Google search
We will be updating this article with the Google Analytics data that will demonstrate whether this approach was successful in attracting more readers and if the time / engagement increased.
Update (19 February 2023): As we predicted, restructuring our Zendesk support portal based on the text analysis below improved its search engine rankings. Here’s a before-and-after comparison from Google Analytics:
We also saw a 20% increase in category page views, which suggests that readers’ navigation of the support portal became much more structured.
Exporting Knowledge Base Content from Zendesk
First, we need to export the Zendesk content in order to analyze it.
To do that, we wrote an open-source Python script, which is available in our public repo: https://github.com/noduslabs/zendesk_export
If you would like to do the same, just follow the instructions on the README.md page.
You can also use this script to back up your knowledge base, as it can include the images and the category structure.
For our purposes, we export the text files only, in MD format. They will be saved in the backups/yyyy-mm-dd folder.
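Once the export is done, the MD files can be gathered into a single text for analysis. The snippet below is a minimal sketch that assumes only what is stated above (a dated folder containing .md files). The function names `load_markdown_files` and `merge_documents` are our own illustration and are not part of the zendesk_export script.

```python
from pathlib import Path


def load_markdown_files(folder):
    """Return {relative file name: text} for every .md file under `folder`."""
    root = Path(folder)
    docs = {}
    # rglob also picks up files in sub-folders (e.g. one folder per category)
    for path in sorted(root.rglob("*.md")):
        docs[path.relative_to(root).as_posix()] = path.read_text(encoding="utf-8")
    return docs


def merge_documents(docs):
    """Join all articles into one text, with a blank line between articles."""
    return "\n\n".join(text.strip() for text in docs.values())
```

For example, `merge_documents(load_markdown_files("backups/2023-01-30"))` would return the whole knowledge base as one string, where the dated folder name is a hypothetical instance of the backups/yyyy-mm-dd pattern.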
Knowledge Base Content Analysis
The next step is to get those MD files into the text analysis tool InfraNodus, which can retrieve the main topics and keywords from our knowledge base, as well as the connections between the different pages.
If you’d like to learn more about the specifics of importing the data, please read the help article on the Nodus Labs Support Portal on importing Zendesk data to InfraNodus for text analysis.
To make it easier, you can import just the content, without linking the page structure. In the example below, we imported the content, the pages, and the links between them, but you can choose to see only the concepts in the graph view dialogue (1).
This is the result we get. The graph shows us the main concepts contained in our knowledge base. The Analytics panel shows the main topical groups and keywords:
As we can see, the main keywords (Analytics > Most Influential Elements) are:
graph, node, topic, infranodus, idea, add, text
Together, they describe how you can add ideas and text to the InfraNodus graph to get the topics.
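To give a rough idea of how keywords can be ranked from a text network, here is a minimal sketch. It is not InfraNodus’ actual algorithm, which is not described in this article. As a simple stand-in, it links words that appear near each other and ranks them by weighted degree. The stopword list, the window size, and all function names are assumptions made for illustration.

```python
import re
from collections import defaultdict

# A tiny illustrative stopword list (assumption, not the one InfraNodus uses)
STOPWORDS = {"the", "a", "an", "to", "and", "of", "in", "is", "your", "you", "can", "how"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def cooccurrence_graph(text, window=4):
    """Link each word to the words that follow it inside a sliding window of `window` tokens."""
    graph = defaultdict(lambda: defaultdict(int))
    words = tokenize(text)
    for i, word in enumerate(words):
        for other in words[i + 1 : i + window]:
            if other != word:
                graph[word][other] += 1
                graph[other][word] += 1
    return graph


def top_keywords(graph, n=5):
    """Rank words by weighted degree (sum of co-occurrence counts), ties alphabetically."""
    scores = {word: sum(links.values()) for word, links in graph.items()}
    return sorted(scores, key=lambda w: (-scores[w], w))[:n]
```

Weighted degree is the simplest possible influence measure. Other centrality measures could be swapped into `top_keywords` without changing the rest of the sketch.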
One immediate takeaway is that the word “graph” is used too often. This is not ideal for our search engine optimization, as people are unlikely to search for “graph” to find this type of content. However, our readers may use it, so we will keep the word but also try to use synonyms.
For instance, InfraNodus’ own Keyword Research tool shows that the context where the word “graph” is used is associated with charts and mathematical functions, which is not what InfraNodus is about:
Then we can also see the main categories for the topics of the knowledge base:
1. Graph Node
2. InfraNodus Import
3. AI Ideation
4. Cluster Gaps
5. Relation Panel
6. Text Network Discourse
Loosely speaking, these categories cover:
a) graphs and nodes (technical terms from network science)
b) various import functions (important for InfraNodus)
c) AI ideation (a trending topic and an important functionality)
d) structural gaps between clusters (the special sauce of InfraNodus)
e) data panels (explaining how to use the tool)
f) text network discourse (text analysis)
Combined with the previous insight, this suggests that the content is perhaps a bit too technical, focusing on graph and network analysis rather than text analysis. We might want to shift that. Otherwise, all the important topics like AI ideation and structural gap detection are there.
Actionable insights:
- talk less about the “graphs”, use less technical, more SEO-friendly terms
- talk more about “text analysis” and various text analysis techniques and approaches
- keep a good coverage of AI ideation and structural gap insights (already exists)
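To keep an eye on how dominant a single term such as “graph” is across the exported articles, a simple frequency check can help. The sketch below is only an illustration: the 0.2 threshold is an arbitrary assumption, and the function names are hypothetical.

```python
import re
from collections import Counter


def term_shares(documents):
    """Return {term: share of all tokens} across a list of texts."""
    counts = Counter()
    for text in documents:
        counts.update(re.findall(r"[a-z]+", text.lower()))
    total = sum(counts.values())
    return {term: count / total for term, count in counts.items()} if total else {}


def overused_terms(documents, threshold=0.2):
    """List terms whose share of all tokens exceeds `threshold`, most frequent first."""
    shares = term_shares(documents)
    flagged = [t for t in shares if shares[t] > threshold]
    return sorted(flagged, key=lambda t: (-shares[t], t))
```

In practice, stopwords would need to be removed first, and the threshold tuned to the size of the knowledge base.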
…
Modifying the Structure of the Knowledge Base
The current structure of our knowledge base is the following:
- How to Use InfraNodus
– Core Workflows
– Discovering Information
– Finding a Niche
– Developing an Idea
– Cognitive Reconfiguration
– Presenting Ideas and Exporting Graphs
– Importing External Data
– Social Network Analysis
– FAQ
- Tools and Methodologies
– AI-Augmented Writing and Thinking
– Ideation and Brainstorming
– Personal Knowledge Management
– SEO Keyword Research
- Network and Graph Concepts
– Essential Network Concepts
– Measures of Influence
– Network Structure Measures
– Structural Gaps
– Graph Theory Applications
- Subscriptions and Payments
- Case Studies
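An outline like the one above can be flattened into parent-child pairs before it is visualized as a network. The following is a minimal sketch that assumes the two-level convention used in this article ("- " for categories, "– " for sections). The root label “Knowledge Base” and the function name are hypothetical.

```python
def outline_to_edges(outline, root="Knowledge Base"):
    """Turn a two-level outline into (parent, child) edges.

    Lines starting with "- " are categories (children of `root`);
    lines starting with "– " are sections of the most recent category.
    """
    edges = []
    category = None
    for raw in outline.splitlines():
        line = raw.strip()
        if line.startswith("- "):
            category = line[2:].strip()
            edges.append((root, category))
        elif line.startswith("– ") and category is not None:
            edges.append((category, line[2:].strip()))
    return edges
```

The resulting edge list can be saved as a two-column CSV file or loaded into any graph library.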
If we visualize it as a graph with the InfraNodus > Add a New Text app, we see the following structure:
We can see right away that while the structure mentions SEO research, there are not many articles on SEO in the actual content (actionable insight: add more of them).
At the same time, the structure doesn’t reflect the Text Analysis and Structural Gap categories that are present in the content.
Actionable insights:
- talk more about SEO research in the knowledge base
- add categories on text analysis and structural gap detection
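The comparison between the category structure and the actual content can be approximated with a simple set difference. This is a rough sketch rather than what InfraNodus does internally: it only compares the words used in category names with the words used in the topic labels, and the function names are our own.

```python
import re


def terms(labels):
    """Collect lowercase words from a list of labels."""
    found = set()
    for label in labels:
        found.update(re.findall(r"[a-z]+", label.lower()))
    return found


def coverage_gaps(structure_labels, content_topics):
    """Return (words in structure only, words in content only) as sorted lists."""
    structure, content = terms(structure_labels), terms(content_topics)
    return sorted(structure - content), sorted(content - structure)
```

Words that appear only in the structure point to categories that lack content, while words that appear only in the content point to topics that deserve their own category. There is no stemming here, so “import” and “importing” would count as different words.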
Therefore, the new category structure proposed at this point of analysis is:
- How to Use InfraNodus
– Introduction: Starting to Use InfraNodus
– Add a New Text and Explore a Graph
– Getting an Overview of a Topic with the Google App
– Using InfraNodus x GPT-3 AI: a Basic Workflow Tutorial
– Adding a Text and AI Ideation
– Using the Text Editor to Add and Edit Content (to check!)
– Live AI Ideation Workflow: Develop Ideas using GPT-3 and Text Visualization
– Explore a Topic using GPT-3 AI and Text Data Network Visualization
– Delete a Statement from a Graph
– How to Write a Text using OpenAI’s GPT-3 as a Conversational Partner
– Text Analysis
– Analyze an Existing Discourse: Network Science and Text Mining – use this Case Study: Text Mining and Topic Modeling but extend
– How to Analyze a Book with Visual Text Mining and Networks
– How to Make a Visual Summary of an Article
– Generate a Summary for a Book or an Article with GPT-3 AI and Text Network Analysis
– Importing Files and External Data
– How to Import a CSV / Excel Spreadsheet Data
– Adding a Text File or a PDF Document
– How to Visualize Google Search Results
– Visualize the Tweets from a Twitter List
– Sentiment Analysis using Amazon Product Review Data
– Live Graph Updates: RSS, Twitter, Google Search
– Import Scientific Papers, Visualize the Scientific Discourse, Perform Literature Review
– How to Scrape Data from Any Web Page
– How to Scrape the Content of a URL / Website Page Behind a Paywall
– Mind Mapping and Visual Ideation
– Generate a Mind Map from any Text
– How to Generate a Word Cloud with a Context
– Convert Your Mind Maps into Text to Gain a Different Perspective
– How to Develop Ideas with Networks
– Workflow: Network Thinking and Mindmapping for Ideation and Brainstorming
– Brainstorming and Writing using the Network Thinking Approach
– Using the Graph Interface
– How to Read and Interpret Text Network Graphs
– How to Merge Nodes into Topics and Unlock Merged Nodes
– How to Search and Find the Content in Your Graphs
– How to Add the Nodes and Edges into the Graph Manually
– Delete the Nodes / Words and Add them to the Stopwords List
– Search and Find Relevant Parts of Text using a Network Graph
– How to see all the nodes’ labels and words on the graph?
– How to Rename a Node / Word in the Graph
– Dynamic Graph: Filtering a Certain Time Span
– Text Categorization
– Automatic Text Categorization with InfraNodus
– How to Add Tags to your Graph Data and Statements
– Filter the Graph and Statements by Tags / Categories
– Text Classification and Taxonomy using Topic Categories
– Language Settings
– How to Change Your Language Setting
– In Your Own Language: Lemmatization and Stopwords Removal
– How to Disable Stopwords Removal
– Combining Words and / or Nodes: Named Entities in a Graph
– How to Automatically Translate CSV File Data
– Organizing Your Workspace
– Open or Find an Existing Graph
– How to Rename a Graph or a Context
– How to Add Your Graph into Favorites?
– Save and Retrieve Meta Information about Your Graphs
– Delete a Whole Text Graph
– Top Graphs Synthesis
– Comparative Text Analysis
– How to Compare Text Statements with Different Tag Categories
– How to Compare Text Graphs to Find the Similarities and Differences
– Saving Your Work
– How to Use Project Notes to Interpret Existing Content
– How to Save the Current Version of the Graph?
– How to Save Your Text Graph Analytics
– Export the Mind Network
– Export your Graphs, Text Data and Analytics Results
– How to Export the Topical Clusters for Further Statistical Analysis or Machine Learning Models
– How to Share your Graph and Data: from Private to Public
– Showcase Your Work
– 🎥 How to Watch the Dynamic Evolution of a Graph?
– How to Change the Appearance and the Settings for Your Graphs
– Export High Resolution (SVG) Graph Images and Post-Production
– How to Embed Your Graphs to Other Sites
– Troubleshooting
– My Graph Doesn’t Show Up Correctly (no nodes or strange symbols)
– Lemmatization Techniques and Word Endings Cut Offs for Spanish, Portuguese, Indonesian
– Problem: My CSV / MD / PDF / Text Files are Not Recognized
– Cannot Log In: How to Fix Login Issues
- Advanced Network Thinking
– Cognitive Variability Framework
– Cognitive Variability: InfraNodus Thinking Dynamics Sensor
– Mind Viral Immunity
– Conversational Chatbot based on Text Network Analysis and GPT-3 AI
– Discourse Structure Analysis
– Measure the Urgency of a Discourse Using Its Network Structure
– Case Study: Measure Diversity of a Discourse
– Reveal Non-Obvious
– Revealing the Non-Obvious: Graph Exploration Workflow
– A Recommender System for Thinking and Insight Generation
– Discovering Niches
– Finding Opportunity within the Attention-Knowledge Gap
– Finding Relations Between Ideas
– What is in a Relation: How to Measure the Importance of Phrases and Bigrams in a Discourse
– Discovering Gaps in Thinking
– Identifying Structural Gaps in a Discourse
– How to Find the Gaps in a Discourse to Generate New Ideas (with a little help from GPT-3:)
– Network Art
– Making Music with Network Graphs: Auditory Feedback on Your Research Process via MIDI
- Network Science
– Essential Concepts
– Measures of Influence
– Network Structure
– Structural Gaps
– Graph Theory Applications
- Subscriptions & Payments
– Registration
– Managing Subscriptions
– Cancellations and Refunds
- Personal Knowledge Management
– Personal Knowledge Management
PKM Workflow: AI-generated Insights for Your Obsidian / LogSeq Knowledge Graph
Visualize Your Evernote Notes to Get an Overview and Discover the New Idea
How to Import and Visualize Your Roam Research, Obsidian and Zettelkasten Markdown Format Notes
How does the [[backlink]] syntax work in InfraNodus and what’s its difference from LogSeq / Obsidian?
How to Add [Wiki-Links] / [[Square Brackets]] / Tags / Categories into the Graph
How Are Backlinks from Roam Research / Obsidian / Logseq Converted into a Network Graph
Obsidian vs Roam Research vs LogSeq vs RemNote
How to Export a Public Part of a Private Knowledge Graph in any PKM System
– Creative Thinking
🎥 Case Study: Creative Thinking using the Insight Recommender System
Case Study: Generate Insight using Text Network Analysis
- Marketing, SEO, and Consulting
– SEO
How to Use SEO Keyword Research and GPT-3 AI to Generate an Outline for a Blog Article or Product Description
SERP: Study the Context Around Any Search Query
Google Keyword Suggestions: How to Know What People Search For
SEO: Using Text Network Analysis to Find a Content Niche
SEO: How to Analyze a Website’s Content and Extract the Top Keyword Phrases
SEO: How to Compare Two Web Pages and Find What’s Missing
Generate Top SEO Topics for Your Website from Google Search Console Keywords
– Marketing
🎥 How to Discover a New Market Niche
Case Study: Sentiment Analysis of Customer Feedback
Zendesk Knowledge Base SEO Optimization and Text Analysis
– Business Development
Crunchbase Data Visualization and Discourse Analysis
– Social Network Analysis
Import and Visualize the Social Network on Twitter for a Search Term or a Topic
CrunchBase Analysis: Investors’ Social Networks
– Media Study and News Analysis
Case Study: Analyze the News Discourse with Graphs
As can be seen, the new structure gives much more prominence to the topical clusters on text mining, discourse (gap) analysis, AI ideation, and SEO, which are the topics we wanted to highlight in the knowledge base. We will wait 2 weeks for Google to re-index the support portal and report the SEO results here.
Also, the structure of the knowledge base became more granular, so it is now easier to find the content relevant to each specific use case.
It will also make it easier to add new content, as we can now easily see which content is sufficiently covered (e.g. SEO) and which is missing (e.g. AI Ideation and getting insight from PKM graphs).
…