FAQs - Open Knowledge Maps

Q1 How do you define "most relevant" when you are talking about most relevant resources?

At the moment, we use the relevance ranking provided by either the PubMed API or the BASE API, depending on the data source you choose. Both mainly use text similarity between your query and the article metadata to determine relevance. PubMed has a detailed description of its relevance ranking. BASE uses Lucene (via Solr), which also describes its ranking on this page.

Q2 Why are you only using the top 100 resources to create the map?

We want to keep the number of resources manageable. A map with 100 resources already contains 10 times more content than is presented on a standard search results page. Nevertheless, we are investigating how to enable the exploration of larger amounts of content while keeping cognitive load to a minimum. At the moment, you can drill deeper into a topic by providing a more specific search query. One way to do this is to expand your query with the topic of a sub-area.

In addition to cognitive considerations, there are also technical limitations. Most data providers either impose a limit on how many items can be retrieved or considerably slow down larger queries. In order to give you more content in a single knowledge map, we would have to build our own index first. This is on our roadmap, but due to the significant development and maintenance effort involved, we are still seeking funding for it.

Q3 Why are there important resources missing in my map?

At the moment, we are using the top 100 resources from the selected data source to create the map. While this is already 10 times more content than is presented on a standard search results page, we may still miss important resources due to this restriction. In the future, we hope to overcome this problem by including more resources in a knowledge map or by providing additional visualisation types. In the meantime, please let us know of cases of major omissions via info@openknowledgemaps.org.

Q4 Why does it take up to 30 seconds to create a knowledge map?

While you wait, each query is processed live: first, it is sent to the selected data provider (either PubMed or BASE), which returns the most relevant results for it. This process takes around 15 seconds. Then, our AI pipeline analyses this data to create a knowledge map for it, which takes another 15 seconds. We are committed to reducing loading times, but doing so would almost certainly require us to create our own index. This is on our roadmap, but due to the significant development and maintenance effort involved, we are still seeking funding for it.

Q5 How do you ensure the quality of the research included in a knowledge map?

We use only trusted data providers such as BASE, PubMed and OpenAIRE. These providers carefully review the data sources to make sure that they only include academic content. However, even with the most careful review, mistakes can occur. If you find content on Open Knowledge Maps that you deem unscientific, please contact us at info@openknowledgemaps.org.

Q6 Are the maps generated based on full text analysis or on metadata analysis?

The grouping of resources is based on article metadata. Currently, we use titles, abstracts, authors, journals, and subject keywords to create a word co-occurrence matrix between articles. On top of this matrix, we run clustering and ordination algorithms. The labels for the sub-areas (bubbles) are generated from the subject keywords of the articles in this area. Where subject keywords are missing from the metadata, we approximate them from the abstract and title. More information can be found in this article.
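The idea of grouping by shared words can be illustrated with a minimal sketch. It is not the actual Open Knowledge Maps pipeline: the stopword list, the cosine distance on binary term sets, the naive average-linkage clustering and all function names are assumptions made for illustration, and the ordination step (placing articles in two dimensions) is left out.

```python
import math
import re

# Illustrative stopword list; the list used in production is not given here.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}

def tokenize(text):
    """Lowercase the text, keep alphabetic words, drop stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def cooccurrence_matrix(docs):
    """Entry [i][j] is the number of distinct terms shared by articles i and j."""
    terms = [tokenize(d) for d in docs]
    return [[len(a & b) for b in terms] for a in terms]

def distance_matrix(cooc):
    """Cosine distance on binary term sets: 1 - shared / sqrt(n_i * n_j)."""
    n = len(cooc)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            norm = math.sqrt(cooc[i][i] * cooc[j][j])
            dist[i][j] = 1.0 - (cooc[i][j] / norm if norm else 0.0)
    return dist

def average_linkage(dist, k):
    """Repeatedly merge the two closest clusters until only k remain."""
    clusters = [[i] for i in range(len(dist))]

    def gap(a, b):
        # Mean pairwise distance between the members of two clusters.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        pairs = [(gap(clusters[x], clusters[y]), x, y)
                 for x in range(len(clusters))
                 for y in range(x + 1, len(clusters))]
        _, x, y = min(pairs)
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]

# Made-up article titles standing in for the combined metadata of each article.
docs = [
    "Climate change adaptation in coastal cities",
    "Coastal cities and climate adaptation policy",
    "Deep learning for protein structure prediction",
    "Protein structure prediction with neural networks",
]
cooc = cooccurrence_matrix(docs)
groups = average_linkage(distance_matrix(cooc), k=2)
print(groups)
```

In this toy example the two climate articles share four terms and the two protein articles share three, so they end up in separate groups.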

Q7.1 How are the area titles created?

Area titles are created from subject keywords of resources that have been assigned to the same area. We select those keywords and phrases that appear frequently in one area, and seldom in other areas.

This happens in three steps:

First we collect single keywords and phrases with a length of up to three words for each area (e.g. “mitigation” or “climate change adaptation”).
Then we score keywords and phrases according to how frequently they occur in one area compared to other areas (extractive summarization with a bag-of-words TF-IDF algorithm).
Finally, we select the top three keywords or phrases for each area.

In many cases, subject keywords are missing from a paper’s metadata, and we infer them from titles and abstracts. In addition, we perform smaller clean-up tasks: for example, we remove stopwords and try to identify the correct upper- and lower-casing for abbreviations.
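The three steps can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the production code: the function name area_titles, the smoothed inverse-frequency formula, the alphabetical tie-breaking and the sample keywords are all hypothetical.

```python
import math
from collections import Counter

def area_titles(areas, top_n=3, max_words=3):
    """Pick the top_n keywords per area with a TF-IDF-style score.

    areas maps an area id to the list of subject keywords of its resources.
    """
    # Step 1: collect keywords and phrases of up to max_words words per area.
    counts = {
        name: Counter(k.lower() for k in kws if len(k.split()) <= max_words)
        for name, kws in areas.items()
    }
    n_areas = len(counts)
    # Number of areas in which each keyword occurs at least once.
    area_freq = Counter(term for c in counts.values() for term in c)

    titles = {}
    for name, c in counts.items():
        # Step 2: frequency within the area, weighted down for keywords
        # that also occur in other areas (smoothed inverse area frequency).
        scores = {
            term: tf * (math.log((1 + n_areas) / (1 + area_freq[term])) + 1)
            for term, tf in c.items()
        }
        # Step 3: keep the best-scoring keywords; ties are broken alphabetically.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        titles[name] = ranked[:top_n]
    return titles

# Hypothetical subject keywords for two areas.
areas = {
    "A": ["climate change adaptation", "mitigation", "climate change adaptation",
          "policy", "mitigation", "resilience", "sea level rise projections"],
    "B": ["policy", "renewable energy", "renewable energy",
          "solar power", "policy", "grid"],
}
titles = area_titles(areas)
print(titles)
```

Here "policy" occurs in both areas, so it is weighted down: in area A it falls behind "resilience" even though both occur once there. The four-word phrase is dropped in the first step.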

Q7.2 Why are area titles sometimes in different languages?

Area titles can be multilingual when a knowledge map contains resources in different languages or when keywords are provided in more than one language. This is a difficult problem that is made even more challenging by poor metadata quality: at the moment, we only have correct language metadata for a minority of resources.

Q8.1 What does the placement of the areas (bubbles) and the resources mean?

In general, the placement of areas (bubbles) can be interpreted as follows:

Closeness of areas implies subject similarity. The closer two areas, the closer they are subject-wise. The overlap of two areas implies strong subject similarity, but it does not mean that the two areas share common resources. Resources are always assigned to a single area only.
Centrality of areas implies subject similarity with the rest of the map, not importance. The closer an area is to the center, the closer it is subject-wise to all the other areas in the map.

Nevertheless, the placement of the areas should only be taken as an indication, because the map is untangled at the start to improve readability. The placement of resources within an area has no specific meaning, as they are moved around significantly during the initial arrangement of the map to avoid overlap. More information can be found in this article.

Q8.2 What determines the size of the areas and the resources?

The calculation of the size of areas and resources differs between data integrations.

In our PubMed integration, the size of a paper is determined by its citation count: the more citations a paper has received, the larger it is. The size of an area is determined by the sum of the citations that the resources in this area have received. The citation metrics are provided by Crossref.

In our BASE integration, the size of an area is determined by the number of resources it contains: the more resources there are in an area, the larger it is. All resources are the same size, because BASE offers no additional metrics that could provide a specific size for each paper.
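The two sizing rules can be summarised in a short sketch. The function name, the input format and the sample numbers are assumptions for illustration, and the code only computes the underlying metric; how a metric is translated into a drawn size is not described here.

```python
def size_metrics(integration, area_citations):
    """Return (paper_metric, area_metric) for each area.

    area_citations maps an area name to the citation counts of its papers.
    """
    if integration == "pubmed":
        # Papers scale with their own citations, areas with the citation sum.
        papers = {a: list(c) for a, c in area_citations.items()}
        areas = {a: sum(c) for a, c in area_citations.items()}
    elif integration == "base":
        # No per-paper metric: every paper gets the same weight,
        # and areas scale with the number of papers they contain.
        papers = {a: [1] * len(c) for a, c in area_citations.items()}
        areas = {a: len(c) for a, c in area_citations.items()}
    else:
        raise ValueError(f"unknown integration: {integration}")
    return papers, areas

# Hypothetical map with two areas and made-up citation counts.
sample = {"genomics": [12, 3, 0], "imaging": [5, 5]}
pubmed_papers, pubmed_areas = size_metrics("pubmed", sample)
base_papers, base_areas = size_metrics("base", sample)
print(pubmed_areas, base_areas)
```

With the same five papers, "genomics" is the larger area under both rules, but for different reasons: it has the higher citation sum in the PubMed case and simply more papers in the BASE case.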

Q9 Why does the overview visualisation work better for some research topics than others?

The visualisation depends on the search results that we get for a given query. If there are for example not enough articles on the topic, or if the metadata quality is low, this will impact the visualisation.

Here are some general search tips:

Our service is not yet optimised for natural language prompts. Try keywords instead of long phrases or questions. For example, search for “climate change” and impact instead of “How does climate change impact our lives?”
To increase coverage, try the different search options under “Refine your search”. By default, we only consider certain resource types and prefer resources with an abstract. You can override these default settings, which may increase your chances of finding what you are looking for.

We have a number of routines in place to improve your chances of getting a useful map, but we do not always succeed. If you come across a map that needs improvement, we'd love to hear from you at info@openknowledgemaps.org.

Q10 How should I cite Open Knowledge Maps?

To cite an individual map, please use the citation provided for that map, which you can find by clicking the cite button on the left-hand side of the map.
To cite the open source software Head Start, please see the README on GitHub, which also lists relevant research resources.
To reference the website and the search, please use the following citation:
Open Knowledge Maps (2019). Open Knowledge Maps: A Visual Interface to the World's Scientific Knowledge. https://openknowledgemaps.org

Q11 Where can I find more information on the background of Open Knowledge Maps?

Please see our GitHub page for a list of relevant research resources and project reports.

Q12 How can I include my repository / data source on Open Knowledge Maps?

Open Knowledge Maps uses BASE as its main data source. You can check if your data source is already indexed by BASE on this page. If not, you can suggest it as a new source using this form.

To get included in PubMed, check if your journal is already included using information on this page. If not, you can suggest it as a new title for MEDLINE.

Q13 How did Open Knowledge Maps come about?

Open Knowledge Maps was founded by Peter Kraker in 2015. Peter had worked on knowledge domain visualisations during his PhD and developed the first version of the open source visualisation framework Headstart out of frustration with the existing discovery tools for scientific knowledge. In January 2016, Peter posted a Call for Collaborators on his blog, which brought a first team of volunteers together. Open Knowledge Maps has been a registered non-profit organization since 2016.

Q14 Can I use Open Knowledge Maps to visualise my own collection(s)?

Using our Custom Services, you are able to embed Open Knowledge Maps services in your own discovery systems and provide attractive, visual entry points to your holdings. The Custom Services can be used to complement existing discovery systems such as a library catalog, or to make the contents of a specific collection (e.g. a research data management system) more visible. More information is available here.

Q15 How is Open Knowledge Maps funded?

We are a charitable non-profit organization run by a group of dedicated team members and volunteers. We propose to fund Open Knowledge Maps in a collective effort. Organizations are invited to become supporting members and co-create the platform with us. If your organization is interested in becoming a supporting member, please get in touch with founder Peter Kraker at pkraker@openknowledgemaps.org.

We are also seeking third-party funding for our roadmap to realize the full potential of the idea. If you are interested in funding specific efforts, please contact us at info@openknowledgemaps.org.

You can also help sustain Open Knowledge Maps by making a donation.

Q16 How can I contribute?

You can contribute in a number of ways: we love to hear your feedback and ideas as this helps us to improve Open Knowledge Maps. If you like the project, please spread the word as far as you can.

You can also help sustain Open Knowledge Maps by making a donation.

Q17 I would like to introduce Open Knowledge Maps to my peers. Do you have any materials available?

We do! Check out our training and promotional materials including presentations in English and Spanish and a How-To for running an Open Knowledge Maps workshop.

Q18 How do I increase the visibility of my research online?

We have created a workshop on this topic entitled "Academic SEO". You can find a recording of this workshop on YouTube. We have also published the presentation, including speaker notes and a short introduction, in our training materials.

Q19 Are you available for collaborations and joint projects?

Yes! We partner with funders, research organizations and infrastructures that share our goals to develop innovative open science projects. Get in touch if you are interested in such a collaboration.