
Comparison of German publications in OpenAlex and Web of Science
Introduction
OpenAlex is an open and comprehensive catalogue of research publications from all over the world. It has been maintained by OurResearch since 2021 and draws on a large amount of data from sources including CrossRef, PubMed, preprint repositories such as arXiv, and institutional repositories. OpenAlex is free of charge and is financed by research funders, foundations and scholarly institutions. The German Competence Network for Bibliometrics (KB) intends to use OpenAlex as its sole data source in the future in order to eliminate its reliance on commercial data providers. OpenAlex prioritizes sourcing metadata from trusted organizations such as CrossRef, but the platform does not verify the reliability of the content of the indexed works. Specific metadata can be incomplete, and OpenAlex does not filter out questionable journals. Quality control therefore remains an issue. This analysis focuses on the gain or loss of publications if OpenAlex were used instead of Web of Science to identify publications with German affiliations.
Data and methods to delineate the corpora
In a first step, all German papers covered in Web of Science (WoS) that were published between 2010 and 2024 were identified. In total, there are 2,451,126 publications with at least one German affiliation in the KB’s own version of WoS. Similarly, all German papers published between 2010 and 2024 in the in-house version of OpenAlex were identified, resulting in a total of 3,743,519 items. A simple join on the DOI of WoS and OpenAlex publications resulted in 2,054,750 matches. The PubMed ID (PMID) was used as an additional unique identifier to join items in WoS and OpenAlex: beyond the DOI-based matches, a further 14,671 publications were matched via their PMID. Thus, a total of 55.3 % of OpenAlex items were matched in the WoS database (see footnote 1).
A further 1,811,709 publications are contained only in OpenAlex or could not be matched with WoS publications. The full outer join of the German publications in WoS and OpenAlex shows that 4,266,344 publications are assigned a German country code in WoS, in OpenAlex, or in both internal databases.
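The matching cascade described above (DOI first, PMID as a fallback, then a full outer join) can be sketched as follows. This is a minimal illustration, not the KB's actual implementation: the record layout (dicts with `id`, `doi` and `pmid` keys), the function names and the DOI normalization step are all assumptions made for the example.

```python
from typing import Optional

# Resolver prefixes that are stripped before comparing DOIs (assumed list).
DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(doi: Optional[str]) -> Optional[str]:
    """Lower-case a DOI and strip a resolver prefix so equal DOIs compare equal."""
    if not doi:
        return None
    doi = doi.strip().lower()
    for prefix in DOI_PREFIXES:
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi or None


def match_corpora(wos: list, openalex: list) -> dict:
    """Match WoS to OpenAlex records by DOI, then by PMID for the remainder."""
    by_doi, by_pmid = {}, {}
    for rec in openalex:
        doi = normalize_doi(rec.get("doi"))
        if doi:
            by_doi.setdefault(doi, rec["id"])
        if rec.get("pmid"):
            by_pmid.setdefault(str(rec["pmid"]), rec["id"])

    doi_matches, pmid_matches, wos_only = [], [], []
    matched_openalex = set()
    for rec in wos:
        doi = normalize_doi(rec.get("doi"))
        pmid = str(rec["pmid"]) if rec.get("pmid") else None
        if doi is not None and doi in by_doi:
            doi_matches.append((rec["id"], by_doi[doi]))
            matched_openalex.add(by_doi[doi])
        elif pmid is not None and by_pmid.get(pmid) not in (None, *matched_openalex):
            pmid_matches.append((rec["id"], by_pmid[pmid]))
            matched_openalex.add(by_pmid[pmid])
        else:
            wos_only.append(rec["id"])

    # The rest of the full outer join: OpenAlex records with no WoS partner.
    openalex_only = [r["id"] for r in openalex if r["id"] not in matched_openalex]
    return {
        "doi_matches": doi_matches,
        "pmid_matches": pmid_matches,
        "wos_only": wos_only,
        "openalex_only": openalex_only,
    }
```

Running the PMID step only on records that did not match by DOI mirrors the order described in the text, where the PMID matches are counted in addition to the DOI matches.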
Analysis
WoS items not in OpenAlex
A small number of WoS items (10,753) were not matched in OpenAlex. A random sample of 50 items was drawn to determine the reasons for the failed matches. Of these, 32 items could be found in OpenAlex using metadata other than DOI or PMID. The most common reason why these items were not matched is an incorrect DOI in the WoS database (19 out of 32 cases). In most cases, a Google search for the DOI given in WoS did not return any results. Another reason is that the DOI is missing in the OpenAlex metadata overview. In the remaining 18 out of 50 cases, a search for the WoS publication by article title did not return any results in OpenAlex. It is striking that most of these items missing in OpenAlex (11) were published in the last few years (2021-2024). A further reason why WoS publications are not included in OpenAlex is that there is no valid DOI on the publisher’s site: the DOI in WoS leads only to ResearchGate or does not return a result in Google.
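Drawing a sample of 50 unmatched items for manual inspection can be made reproducible with a fixed seed. The following is only a sketch: the source does not say how the sample was drawn, and the function name, the seed value and the identifier format are assumptions.

```python
import random


def draw_sample(item_ids, k=50, seed=2025):
    """Return a reproducible random sample of at most k item identifiers."""
    rng = random.Random(seed)   # local generator; global random state is untouched
    pool = sorted(item_ids)     # fixed order so the same seed gives the same sample
    return rng.sample(pool, min(k, len(pool)))


# Hypothetical identifiers standing in for the 10,753 unmatched WoS items.
unmatched = [f"WOS:{i:06d}" for i in range(10_753)]
sample = draw_sample(unmatched)
```

Sorting the pool before sampling means the result depends only on the set of identifiers and the seed, not on the order in which the database returned the rows.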
OpenAlex items not matched in WoS
A further analysis looks at the publications that are included in OpenAlex but were not found in WoS. In total, there are 643,004 journal articles published between 2010 and 2024 that were not found in WoS. A random sample of 50 items was drawn from the items marked as “journal” in OpenAlex to determine the reasons for the failed matches. The manual analysis shows that 21 items were published in sources that were never covered in WoS. In 14 cases, the journal was covered in WoS only for a specific period. Of these 14 items, 9 were published outside the coverage period and 5 were published within the coverage period but are nevertheless missing. One reason could be the item type: one of the five items, for example, is a poster.
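The three coverage-related outcomes distinguished above can be expressed as a small decision rule. This sketch assumes that the coverage period of a source is available as a (first year, last year) pair and is `None` for sources never covered in WoS. The function name and the labels are illustrative and not part of the KB database.

```python
from typing import Optional, Tuple


def classify_coverage(pub_year: int, coverage: Optional[Tuple[int, int]]) -> str:
    """Classify an unmatched OpenAlex item by the WoS coverage of its source."""
    if coverage is None:
        return "source never covered in WoS"
    first_year, last_year = coverage
    if first_year <= pub_year <= last_year:
        # Covered source and year, yet no match: item type may be the cause.
        return "within coverage period but missing"
    return "outside coverage period"


print(classify_coverage(2015, None))          # source never covered in WoS
print(classify_coverage(2012, (2014, 2020)))  # outside coverage period
print(classify_coverage(2016, (2014, 2020)))  # within coverage period but missing
```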
Five publications have the same DOI in the online version of WoS and in the online version of OpenAlex but are not recorded in the KB database, which suggests that they were published in products not licensed by the KB. In eight cases there was no match because the DOI is missing in WoS; six of these items are meeting abstracts.
OpenAlex items without a German affiliation in OpenAlex
A final analysis was devoted to the 217,326 OpenAlex items that have no German affiliation but are German publications according to WoS. To determine the reasons, the country information was checked both in the online version of the database and in the internal database version for Web of Science and Scopus. The results show that there are various reasons for missing country information. In 22 of 50 cases there is no affiliation in the online version and therefore no entry in KB’s items_affiliations table. For 15 of these 22 cases, the publisher’s website does not provide an affiliation either. In three of the 22 cases, the DOI link stored in OpenAlex leads to a PDF and not to a publisher page with affiliation data.
Another important reason why German publications were not identified as such in OpenAlex is that the items_affiliations table only shows the full affiliation data for some countries, but not for Germany, although Germany (DE) is listed in the online version of OpenAlex. In total, there are 10 cases where German addresses only appear in the address_full column, but are not split into organization, city, state and country.
In addition, there are three cases in which the affiliation of a German organization is present in the online version without country information. The KB-internal version of OpenAlex therefore provides only the institution in the address_full column, without detailed affiliation metadata. Other reasons for missing German affiliations in the online version are that only institutional information without a country, or only e-mail addresses, are available instead of the full affiliation.
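For affiliations where only address_full is populated, a fallback check on the unsplit string could recover some German addresses. The sketch below is a heuristic and not a method used in the analysis: the `country` key, the accepted country labels and the assumption that the country is the last comma-separated component are all illustrative.

```python
GERMAN_CODES = {"DE", "DEU", "GERMANY"}
GERMAN_NAMES = {"germany", "deutschland"}


def is_german_affiliation(affiliation: dict) -> bool:
    """Use the structured country if present, otherwise inspect address_full."""
    country = (affiliation.get("country") or "").strip().upper()
    if country:
        return country in GERMAN_CODES
    address = affiliation.get("address_full") or ""
    # Assumption: the country, if given, is the last comma-separated component.
    last_part = address.split(",")[-1].strip(" .;").lower()
    return last_part in GERMAN_NAMES


rows = [
    {"address_full": "TU Berlin, Berlin, Germany", "country": None},
    {"address_full": "University of Vienna, Vienna, Austria", "country": None},
    {"address_full": "Charité Berlin", "country": None},
]
flags = [is_german_affiliation(r) for r in rows]
```

The third row shows the limit of this approach: an institution name without any country information, as in the three cases described above, is still not recognized as German.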
Summary of results
The following figure summarizes the results:
In total, there are 4,266,344 items in WoS or OpenAlex with a German affiliation in the 15-year period 2010-2024. Of these publications, 43.4 % are identical in WoS and OpenAlex, so these 1,852,095 German publications could be retrieved in a similar way using OpenAlex alone. A total of 1,613,704 publications (37.8 %) are available only in OpenAlex and would constitute a substantial gain in publications if OpenAlex were used exclusively. In addition, there are 217,326 publications without a DOI or a PMID in OpenAlex, so it is not clear how many of these are also covered in WoS.
The loss of publications if retrieval were based only on OpenAlex would be as follows. There are 8,768 items that are included only in WoS and are not covered by OpenAlex. Another valuable set consists of the 217,326 publications with a missing German affiliation, as the comparison with WoS data shows. A final set that would be lost if the database were restricted to OpenAlex comprises 376,446 publications (8.8 %) without DOI or PMID. Overall, if the analyses were restricted to OpenAlex, around 14 % of German publications would be missed.
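The percentages reported in this summary, and the match rate in footnote 1, can be recomputed from the counts given in the text. The short check below uses only those counts; the variable and function names are illustrative.

```python
TOTAL = 4_266_344  # German items in WoS or OpenAlex, 2010-2024


def share(count: int, total: int = TOTAL) -> float:
    """Percentage of the total, rounded to one decimal place."""
    return round(100 * count / total, 1)


in_both = share(1_852_095)         # 43.4
only_openalex = share(1_613_704)   # 37.8
no_identifier = share(376_446)     # 8.8

# Sets that would be lost with OpenAlex alone: WoS-only items, items with a
# missing German affiliation, and items without DOI or PMID.
lost = 8_768 + 217_326 + 376_446   # 602,540
lost_share = share(lost)           # 14.1

# Footnote 1: DOI plus PMID matches relative to all German OpenAlex items.
match_rate = round((2_054_750 + 14_671) / 3_743_519, 4)  # 0.5528
```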
Footnotes
1. (2,054,750 + 14,671) / 3,743,519 = 0.5528
Citation
@online{aman2025,
author = {Aman, Valeria},
title = {Comparison of {German} Publications in {OpenAlex} and {Web}
of {Science}},
date = {2025-08-12},
url = {http://www.open-bibliometrics.de/posts/20250812-GermanPublications/},
langid = {en}
}