In this article I explain why SEO analysts should treat terminology databases and their artefacts (e. g. database models, controlled vocabularies, synonym lists and links to other terminological resources or knowledge bases) as a strategic source for keyword and taxonomy extraction in a knowledge domain, and why they should connect to them via APIs.
What are terminology databases?
Terminology databases (also term bases or termbases) are applications used to create sector-specific controlled vocabularies (knowledge domain terminologies). They are paramount for ensuring consistent and correct communication within an enterprise and are particularly relevant in multilingual communication (technical documentation, software, webshop or website localization).
Terminology databases in the Product Life Cycle
As the result of a specialized evaluation of terms, terminology databases play an important role along the entire product life cycle. From R&D to the end of a product's life, they ensure that quality-assured vocabularies are used across all of the company's documents and systems (CRM, ERP, CMS, CCMS, PIM, PLM, intranets), which yields substantial operational savings in multilingual contexts.
Terminology Databases as a core component of Natural Language Processing
Commercial terminology management systems (e. g. RWS SDL MultiTerm, across, Transit) have been around for about 25 years and have become a standard application in medium and large enterprises in the automotive, aerospace, medical and engineering sectors. By setting the rules for the correct use of a special field's terminology, termbases ensure cost-effective translation management and precise communication with stakeholders, e. g. customers, users, suppliers, certification bodies (TÜV, ISO, …) or insurers. As a core component of machine learning and machine translation, terminology databases are, along with information models, the driving force of the semantic web.
Terminology Databases in Business Intelligence & Knowledge Management
Large companies such as Schaeffler, Bosch, Brose, Siemens, Daimler, SAP and Hugo Boss have been exploiting terminology databases for business intelligence and knowledge management for at least fifteen years. In these companies, terminology management processes are considered core business processes. Custom-built interfaces (e. g. Babylon APIs) ensure that terminology data are included in business intelligence suites and semantic knowledge management systems. This enables better trend scouting in BI, controlled indexing in DMS and smart information retrieval in intranets.
Terminology Databases in Content Optimization strategies
Having explained the relevance of terminology databases in the enterprise context, it is time to provide some reasons and use cases for including terminology databases in SEO Analytics and Search Engine Optimization strategies.
Reason 1: Terminology Databases as Structured Data
Terminology databases contain highly structured and annotated linguistic data that data transformation software can easily make available for the semantic web. For SEO analysts, they are a valuable container of semantic relations expressed through a controlled vocabulary.
Reason 2: Terminology databases are created by experts in semantic & multilingual taxonomies
Terminology databases are created and managed by terminologists. Terminologists have a solid background in linguistics and semantics, know the rules of taxonomy creation and can provide well-founded arguments for choosing the preferred term among synonyms. Being usually fluent in several languages, they are also able to identify taxonomy gaps from a cross-lingual perspective.
Reason 3: Quality assured data sets through process management and semi-automated workflows
Every entry in the terminology database has gone through an assessment process. The process usually involves specialists from different departments (e. g. marketing, R&D, language services, technical documentation, customer service, legal) who verify the validity of terms. New term suggestions are evaluated in terms of their semantics, usage, conformity to norms, position within the taxonomy system and relevance for the terminology collection. SEO strategists can therefore rely on the quality of the information contained in the terminology database.
Reason 4: Terminology Databases can provide strategic artefacts for data analysis
Terminology databases can also provide useful insights into the specificity of the vocabulary collection, its relations to other terminology collections (Linked Data) and subject field intersections. Below I list some artefacts (products) of the terminology database that SEO strategists should evaluate at the beginning of any SEO project:
artefact | use case description |
---|---|
the company’s key terminology and its frequency of use | useful at the beginning of an SEO analytics project for evaluating (semantic auditing) the information architecture of the website; it helps to set priorities for the topics/terms/keywords to be included in the website infrastructure (e. g. navigation); |
list of forbidden terms | useful at the beginning of an SEO analytics project (semantic auditing) as it states which terms should not be used in the information architecture of the website and in on-page optimization activities (URLs, headings, anchor texts, backlinks, keywords, …); |
list of company-specific terminology | useful in competitor analysis as it provides a reference list for analysing the concepts/terms/entities in relation to the terminology in use on competitors’ websites; this list enables you to highlight differences in terminology within a specific sector (knowledge domain); |
lists of synonyms (with degree of synonymity, usage, register and preferred terms) | useful when defining the information architecture of the website and when evaluating the keywords collected via APIs from the Google Search Console; this list enables a comparison between what "is known and used on the web" in terms of terminology and keywords (content of websites and search queries) and what "is known and used in the enterprise"; |
lists of terms with collocations (phraseological units) | particularly useful in a B2B scenario as it suggests phrases that expert users are probably going to use in search queries; |
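The synonym and forbidden-term lists from the table can be compared with search queries in a few lines of code. The following is a minimal sketch, not a reference implementation: the `TermEntry` structure, the function names and the sample terms are invented for illustration, and the queries are assumed to have already been exported from the Google Search Console as plain strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TermEntry:
    """Hypothetical, simplified view of one termbase concept."""
    preferred: str
    synonyms: tuple = ()
    forbidden: tuple = ()


def build_lookup(entries):
    """Map every known term (lowercased) to (status, preferred term)."""
    lookup = {}
    for entry in entries:
        lookup[entry.preferred.lower()] = ("preferred", entry.preferred)
        for term in entry.synonyms:
            lookup[term.lower()] = ("synonym", entry.preferred)
        for term in entry.forbidden:
            lookup[term.lower()] = ("forbidden", entry.preferred)
    return lookup


def classify_queries(queries, entries):
    """Label each query with the status of the first termbase term it contains.

    Matching is a naive whole-word test on whitespace-normalized text;
    punctuation and inflected forms are not handled in this sketch.
    """
    lookup = build_lookup(entries)
    # Try longer terms first so multi-word terms win over shorter ones.
    terms = sorted(lookup, key=len, reverse=True)
    results = []
    for query in queries:
        padded = " " + " ".join(query.lower().split()) + " "
        match = next((t for t in terms if f" {t} " in padded), None)
        if match is None:
            results.append((query, "unknown", None))
        else:
            status, preferred = lookup[match]
            results.append((query, status, preferred))
    return results


# Invented sample data, for illustration only.
entries = [TermEntry("user manual", ("operating manual",), ("handbook",))]
queries = ["download user manual pdf", "operating manual pump", "pump price"]
for row in classify_queries(queries, entries):
    print(row)
```

Queries labelled "unknown" are the interesting ones for the comparison described in the table: they show vocabulary that is used on the web but not yet recorded in the enterprise termbase.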
Reason 5: Database model as intrinsic information for advanced analytics and Competitive Intelligence
If modelled according to terminology management standards (ISO 26162-1:2019), the data model of a terminology database can offer SEO analysts additional information such as product and subject field classifications. This intrinsic information enables ad-hoc exports of vocabularies, e. g. by product line, target group (marketing, technical documentation, customer support) or country and language. Particularly useful in Search Engine Advertising (SEA) for defining campaign groups and keywords, the terminology database model can also serve as a reference for defining the information architecture of the website.
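An ad-hoc export of this kind can be sketched as follows. The record layout (`term`, `language`, `product_line`, `audiences`) is an assumption made for this example, not the field set of ISO 26162 or of any specific terminology management system, and the sample records are invented.

```python
from collections import defaultdict

# Invented sample records; the field names are hypothetical.
SAMPLE_RECORDS = [
    {"term": "ball bearing", "language": "en", "product_line": "bearings",
     "audiences": ("marketing", "technical documentation")},
    {"term": "roller bearing", "language": "en", "product_line": "bearings",
     "audiences": ("technical documentation",)},
    {"term": "Kugellager", "language": "de", "product_line": "bearings",
     "audiences": ("marketing",)},
    {"term": "gear pump", "language": "en", "product_line": "pumps",
     "audiences": ("customer support",)},
]


def export_vocabulary(records, language, audience=None):
    """Group the terms of one language by product line.

    If `audience` is given, only terms released for that target group
    are exported. The result maps product line to a sorted term list,
    which could serve as a starting point for SEA campaign groups.
    """
    groups = defaultdict(list)
    for record in records:
        if record["language"] != language:
            continue
        if audience is not None and audience not in record["audiences"]:
            continue
        groups[record["product_line"]].append(record["term"])
    return {line: sorted(terms) for line, terms in sorted(groups.items())}


print(export_vocabulary(SAMPLE_RECORDS, language="en"))
print(export_vocabulary(SAMPLE_RECORDS, language="en", audience="marketing"))
```

Each key of the returned dictionary corresponds to one candidate campaign group, and its terms to the seed keywords for that group.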
Reason 6: Semantic Interoperability, easy access via APIs, Linked Data
Vocabularies are managed in relational databases and can be exported in different formats (XML, HTML, .rtf, .txt); TBX (TermBase eXchange, ISO 30042) ensures interoperability between systems. Interfaces between terminology management systems, translation memories and Component Content Management Systems (CCMS) ensure the use of correct terminology in technical communication. Thanks to data transformation software, TBX can be easily transformed into SKOS and RDF and made available in the semantic web as a Linked Data resource. Specially developed APIs enable the integration of terminology into marketing CMS for advanced marketing analytics.
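To give an idea of what such a TBX-to-SKOS transformation involves, here is a minimal sketch using only the Python standard library. It assumes a small TBX-Basic-style file with `termEntry`, `langSet` and `tig` elements; other TBX dialects and versions use different element names. The base URI, the sample data and the mapping of deprecated terms to `skos:hiddenLabel` are choices made for this example, not something prescribed by TBX or SKOS.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Assumed mapping from TBX-Basic administrative status to SKOS label types.
STATUS_TO_SKOS = {
    "preferredTerm-admn-sts": "skos:prefLabel",
    "admittedTerm-admn-sts": "skos:altLabel",
    "deprecatedTerm-admn-sts": "skos:hiddenLabel",
}

# Invented sample data in a TBX-Basic-like layout.
SAMPLE_TBX = """<martif type="TBX-Basic" xml:lang="en">
  <text><body>
    <termEntry id="c1">
      <langSet xml:lang="en">
        <tig><term>user manual</term>
          <termNote type="administrativeStatus">preferredTerm-admn-sts</termNote></tig>
        <tig><term>operating manual</term>
          <termNote type="administrativeStatus">admittedTerm-admn-sts</termNote></tig>
        <tig><term>handbook</term>
          <termNote type="administrativeStatus">deprecatedTerm-admn-sts</termNote></tig>
      </langSet>
      <langSet xml:lang="de">
        <tig><term>Benutzerhandbuch</term>
          <termNote type="administrativeStatus">preferredTerm-admn-sts</termNote></tig>
      </langSet>
    </termEntry>
  </body></text>
</martif>"""


def tbx_to_skos_turtle(tbx_xml, base_uri="https://example.com/terms/"):
    """Convert TBX term entries into SKOS concepts serialized as Turtle."""
    root = ET.fromstring(tbx_xml)
    lines = ["@prefix skos: <http://www.w3.org/2004/02/skos/core#> .", ""]
    for entry in root.iter("termEntry"):
        statements = ["a skos:Concept"]
        for lang_set in entry.iter("langSet"):
            lang = lang_set.get(XML_LANG, "und")
            for tig in lang_set.iter("tig"):
                term = tig.findtext("term")
                if not term:
                    continue
                status = tig.findtext("termNote[@type='administrativeStatus']")
                # Terms without a known status fall back to altLabel.
                prop = STATUS_TO_SKOS.get(status, "skos:altLabel")
                literal = term.replace("\\", "\\\\").replace('"', '\\"')
                statements.append(f'{prop} "{literal}"@{lang}')
        subject = f"<{base_uri}{entry.get('id')}>"
        lines.append(subject + " " + " ;\n    ".join(statements) + " .")
    return "\n".join(lines)


print(tbx_to_skos_turtle(SAMPLE_TBX))
```

The sketch does not check SKOS integrity conditions (for instance, at most one `skos:prefLabel` per language and concept) and ignores definitions, subject fields and cross-references; a production transformation would normally rely on a dedicated RDF library and a full TBX parser.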
Summary
Terminology databases play an important role along the entire customer journey and product life cycle. In this article I have provided 6 reasons why SEO analysts should consider the artefacts of terminology databases (vocabularies, database models, links to other resources) as a strategic source for keyword and taxonomy extraction, and presented 5 use cases for the practical application of terminology databases within SEO analytics.