Answers to the most frequently asked questions about Open Knowledge Maps.
Q1 How do you define "most relevant" when you are talking about most relevant papers?
At the moment, we use the relevance ranking provided by either the PubMed API or the BASE API, depending on the data source you choose. Both mainly use text similarity between your query and the article metadata to determine relevance. PubMed has a detailed description of its relevance ranking. BASE uses Lucene (via Solr), which also describes its ranking on this page.
Q2 Why are you only using the top 100 papers to create the map?
We want to keep the number of papers to a manageable amount. 100 papers are already 10 times more content than is presented on a standard search results page. Nevertheless, we are investigating how to enable the exploration of larger amounts of content, while keeping cognitive load to a minimum. At the moment, you can drill deeper into a topic by providing a more specific search query. One way to do this is to expand your query with the topic of a sub-area.
In addition to cognitive considerations, there are also technical limitations. Most data providers either impose a limit on how many items can be retrieved or considerably slow down larger queries. In order to give you more content in a single knowledge map, we would have to build our own index first. This is on our roadmap, but due to the significant development and maintenance effort involved, we are still seeking funding for it.
Q3 Why are there important papers missing in my map?
At the moment, we are using the top 100 papers from the selected data source to create the map. While this is already 10 times more content than is presented on a standard search results page, we may still miss important papers due to this restriction. In addition, we can only use papers that have an abstract - otherwise we do not have enough content for our automatic analysis.
In the future, we hope to overcome this problem by including more papers in a map and by enabling users to manually add papers to automatically created maps. In the meantime, please let us know of cases of major omissions via firstname.lastname@example.org.
Q4 Why does it take up to 30 seconds to create a knowledge map?
While you wait, each query is processed live: first, it is sent to the selected data provider (either PubMed or BASE), which returns the most relevant results for it. This step takes around 15 seconds. Then, we analyse the returned data to create a knowledge map, which takes another 15 seconds. We are committed to reducing loading times, but this would almost certainly require us to create our own index. This is on our roadmap, but due to the significant development and maintenance effort involved, we are still seeking funding for it.
Q5 How do you ensure the quality of the research included in a knowledge map?
We use only trusted data providers such as BASE, PubMed and OpenAIRE. These providers carefully review the data sources to make sure that they only include academic content. However, even with the most careful review, mistakes can occur. If you find content on Open Knowledge Maps that you deem unscientific, please contact us at email@example.com.
Q6 Are the maps generated based on full text analysis or on metadata analysis?
The grouping of papers is based on article metadata. Currently, we use titles, abstracts, authors, journals, and subject keywords to create a word co-occurrence matrix between articles. On top of this matrix, we run clustering and ordination algorithms. The labels for the sub-areas (bubbles) are generated from the subject keywords of the articles in this area. Where subject keywords are missing from the metadata, we approximate them from the abstract and title. More information can be found in this article.
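As a rough illustration of the first step, the sketch below builds word-count vectors from article metadata and compares articles pairwise. The field names, the tokenizer, and the use of cosine similarity are assumptions made for this example; they are not taken from the Open Knowledge Maps code. Clustering and ordination would then operate on a matrix of this kind.

```python
import math
import re
from collections import Counter

# Hypothetical metadata field names, chosen for this example only.
FIELDS = ("title", "abstract", "authors", "journal", "keywords")

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def term_counts(record):
    """Count the words that occur across all metadata fields of one article."""
    return Counter(tokenize(" ".join(record.get(f, "") for f in FIELDS)))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def similarity_matrix(records):
    """Article-by-article matrix of word-overlap similarity."""
    vectors = [term_counts(r) for r in records]
    return [[cosine(u, v) for v in vectors] for u in vectors]

records = [
    {"title": "Deep learning for protein folding", "keywords": "deep learning protein"},
    {"title": "Protein structure prediction with deep networks", "keywords": "protein deep learning"},
    {"title": "Medieval trade routes in Europe", "keywords": "history trade"},
]
matrix = similarity_matrix(records)
```

In this toy data, the two protein papers share several words and score well above zero, while the history paper shares no words with them and scores zero.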
Q7.1 How are the area titles created?
Area titles are created from subject keywords of documents that have been assigned to the same area. We select those keywords and phrases that appear frequently in one area, and seldom in other areas.
This happens in three steps:
In many cases, subject keywords are missing from a paper’s metadata, and we infer them from titles and abstracts. In addition, we perform smaller tasks: for example, we remove stopwords and try to identify the correct upper- and lower-casing for abbreviations.
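The idea of preferring keywords that are frequent in one area and rare in others can be sketched as follows. This is a minimal illustration, assuming a TF-IDF-style weighting; the function name `label_areas`, the input layout, and the exact formula are hypothetical and not taken from the Open Knowledge Maps implementation.

```python
import math
from collections import Counter

def label_areas(area_keywords, top_n=1):
    """Pick, per area, the keywords that are frequent there and rare elsewhere.

    area_keywords maps an area id to the flat list of subject keywords
    of all papers assigned to that area.
    """
    n_areas = len(area_keywords)
    counts = {area: Counter(kws) for area, kws in area_keywords.items()}
    # In how many areas does each keyword occur at least once?
    area_freq = Counter(k for c in counts.values() for k in c)
    labels = {}
    for area, c in counts.items():
        # Assumed weighting: count in this area times a penalty for
        # keywords that are spread over many areas.
        scored = {k: n * math.log((1 + n_areas) / area_freq[k]) for k, n in c.items()}
        # Highest score first; ties are broken alphabetically.
        ranked = sorted(scored, key=lambda k: (-scored[k], k))
        labels[area] = ranked[:top_n]
    return labels

areas = {
    "a": ["climate change", "climate change", "policy", "modelling"],
    "b": ["machine learning", "machine learning", "modelling", "policy"],
}
labels = label_areas(areas)
```

Here "policy" and "modelling" occur in both areas and are therefore down-weighted, so each area is labelled with the keyword that is specific to it.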
Q7.2 Why are area titles sometimes in different languages?
In some cases, area titles can be multilingual: this happens when the knowledge map contains papers in different languages or when keywords are provided in more than one language. This is a difficult problem that is made even more challenging by poor metadata quality: at the moment, we only have correct language metadata for a minority of documents.
As part of the EU-funded project TRIPLE, we are working on better solutions for these cases, including the ability to select which language(s) you would like to restrict your knowledge map to. Stay tuned: we will keep you updated on our progress via our news channels.
Q8.1 What does the placement of the areas (bubbles) and the papers mean?
In general, the placement of areas (bubbles) can be interpreted as follows:
Nevertheless, the placement of the areas should only be taken as an indication, because the map is untangled at the start to improve readability. The placement of papers within an area has no specific meaning, as they are moved around significantly during the initial arrangement of the map to avoid overlap. More information can be found in this article.
Q8.2 What determines the size of the areas and the papers?
The calculation of the size of areas and papers differs between data integrations.
In our PubMed integration, the size of a paper is determined by the citation count of the paper. The more citations a paper has received, the larger it is.
The size of an area is determined by the sum of the citations that the papers in this area have received. The citation metrics are provided by PubMed.
In our BASE integration, the size of an area is determined by the number of documents it contains. The more papers there are in an area, the larger it is.
All papers are of the same size. There are no additional metrics available in BASE to provide a specific size for each paper.
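The sizing rules for the two integrations can be summarised in a short sketch. The function names, the `citations` field, and the integration identifiers `"pubmed"` and `"base"` are assumptions made for illustration; the returned values are relative size weights, not the actual dimensions drawn on the map.

```python
def paper_size(paper, integration):
    """Relative size of one paper under the given data integration."""
    if integration == "pubmed":
        # PubMed: size grows with the paper's citation count.
        return paper.get("citations", 0)
    if integration == "base":
        # BASE: no per-paper metric is available, so all papers are equal.
        return 1
    raise ValueError(f"unknown integration: {integration}")

def area_size(papers, integration):
    """Relative size of an area under the given data integration."""
    if integration == "pubmed":
        # PubMed: sum of the citations received by the papers in the area.
        return sum(p.get("citations", 0) for p in papers)
    if integration == "base":
        # BASE: number of documents in the area.
        return len(papers)
    raise ValueError(f"unknown integration: {integration}")

papers = [{"citations": 3}, {"citations": 7}]
pubmed_area = area_size(papers, "pubmed")
base_area = area_size(papers, "base")
```

With the two example papers, the PubMed rule gives an area weight of 10 (3 + 7 citations), while the BASE rule gives 2 (two documents).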
Q9 Why does the overview visualization work better for some research topics than others?
The visualization depends on the search results that we get for a given query. If, for example, there are not enough articles on the topic, or if the metadata quality is low, this will impact the visualization. We have a number of routines in place to improve your chances of getting a useful map, but we do not always succeed. If you come across a map that needs improvement, we'd love to hear from you at firstname.lastname@example.org.
Q10 How should I cite Open Knowledge Maps?
Q11 Where can I find more information on the background of Open Knowledge Maps?
Please see our GitHub page for a list of relevant research papers and project reports.
Q12 How can I include my repository / data source on Open Knowledge Maps?
Q13 How did Open Knowledge Maps come about?
Open Knowledge Maps was founded by Peter Kraker in 2015. Peter had worked on knowledge domain visualizations in his PhD and developed the first version of the open source visualization framework Headstart out of frustration with the existing discovery tools for scientific knowledge. In January 2016, Peter posted a Call for Collaborators on his blog, which brought a first team of volunteers together. Open Knowledge Maps has been a registered non-profit organization since 2016.
Q14 Can I use Open Knowledge Maps to visualize my own collection(s)?
Absolutely! Open Knowledge Maps is based on the open source software Headstart, which is able to create knowledge maps from a wide variety of data, including text, metadata and references. If you have a collection that you would like to visualize with Open Knowledge Maps, check out our docs to get started. If you are interested in a collaboration project, check out our present and past collaboration projects and learn more about how we can work together. Get in touch with your project proposal ideas at email@example.com.
Q15 How is Open Knowledge Maps funded?
We are a charitable non-profit organization run by a group of dedicated team members and volunteers. We propose to fund Open Knowledge Maps in a collective effort. Organizations are invited to become supporting members and co-create the platform with us. If your organization is interested in becoming a supporting member, please get in touch with founder Peter Kraker at firstname.lastname@example.org.
You can also help sustain Open Knowledge Maps by making a donation.
Q16 How can I contribute?
You can help sustain Open Knowledge Maps by making a donation.
Q17 I would like to introduce Open Knowledge Maps to my peers. Do you have any materials available?
We do! Check out our training and promotional materials including presentations in English and Spanish and a How-To for running an Open Knowledge Maps workshop.
Q18 How do I increase the visibility of my research online?
We have created a workshop on this topic, entitled "Academic SEO". You can find a recording of this workshop on YouTube. We have also published the presentation, including speaker notes and a short introduction, in our training materials.
Q19 Are you available for collaborations and joint projects?
Yes! Check out our present and past collaboration projects and learn more about how we can work together.
You couldn't find an answer to your question? Get in touch and we will get back to you as soon as we can.