
Rethinking Search in Science

Everyone occasionally needs information: forgotten knowledge from school, or an answer to a question even experts can't agree on. This article is about finding the most elusive information.


1. Popularity scores: a hindrance to new discoveries

Groundbreaking science is often hindered by popularity bias. Search engines rank content by general appeal and prior interactions, which conflicts with the scientific goal of surfacing new discoveries: a new finding has, by definition, little interaction history and therefore ranks low. Useful-but-unpopular information includes:

2. Keyword searches

Scientific concepts and their interrelations can rarely be reduced to simple keywords without losing essential context. Researchers end up sifting through large numbers of loosely related documents in a tedious, iterative process of refining search terms. The result is a high rate of false positives (irrelevant documents that happen to contain the terms) and a real risk of false negatives (relevant documents that are missed because they use different wording).
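The false-positive and false-negative problem can be illustrated with a minimal sketch. The three-document corpus and the `keyword_search` helper below are invented for illustration and do not describe any particular search engine; the matcher simply requires every query term to appear in the text.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase a text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def keyword_search(query: str, documents: dict[str, str]) -> list[str]:
    """Return the ids of documents that contain every query term."""
    terms = tokenize(query)
    return [doc_id for doc_id, text in documents.items() if terms <= tokenize(text)]


# Hypothetical corpus, invented for this example.
documents = {
    "relevant_match": "Aspirin reduced fever in the treated mice.",
    "false_positive": "We did not study aspirin; fever was outside our scope.",
    "false_negative": "Acetylsalicylic acid lowered body temperature in rodents.",
}

hits = keyword_search("aspirin fever", documents)
print(hits)  # ['relevant_match', 'false_positive']
```

The second document matches although it reports nothing about the effect, while the third describes the same finding in different words and is never returned.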

3. Semantic searches

Semantic search is a step forward, but it is not flawless. Today’s vectorization tools cannot losslessly convert running text into vector representations. Scientific articles that report observations only loosely related to their main topic are particularly affected: during vectorization such details get lost, so a query for one of those details cannot surface the document.
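One way to picture this loss is a toy model in which a document vector is the average of its sentence vectors. Both the mean pooling and the two-axis "topic" vectors below are simplifying assumptions made for illustration; real embedding models work differently, but the dilution effect is similar in spirit.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mean_pool(vectors: list[list[float]]) -> list[float]:
    """Average a list of vectors component by component."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# Hypothetical sentence vectors: axis 0 = main topic, axis 1 = side observation.
MAIN_TOPIC = [1.0, 0.0]
SIDE_OBSERVATION = [0.0, 1.0]

# An article with nine sentences on its main topic and one tangential remark.
article_sentences = [MAIN_TOPIC] * 9 + [SIDE_OBSERVATION]
article_vector = mean_pool(article_sentences)

query = SIDE_OBSERVATION  # the researcher is looking for exactly that remark
whole_document_score = cosine(query, article_vector)
best_sentence_score = max(cosine(query, s) for s in article_sentences)

print(f"whole document: {whole_document_score:.2f}")
print(f"best sentence:  {best_sentence_score:.2f}")
```

In this toy setting the single matching sentence is a perfect hit on its own, yet the pooled document vector is dominated by the main topic and scores poorly against the same query.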

4. A novel solution: graph-based document representation

Each scientific document can be conceptualized as a network of interconnected pieces of information rather than a linear block of text. Each observation, result, and concept becomes a node; the relationships become edges. This allows nuanced indexing based on the strength and nature of connections, not mere occurrence.
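As a rough sketch of what such a representation might look like in code, the example below stores each document as typed nodes and weighted, labeled edges, and indexes documents by the pairs of nodes they connect. The class names (`DocumentGraph`, `GraphIndex`), the relation labels, and the weights are hypothetical choices made for illustration, not a description of an existing system.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    relation: str  # nature of the connection, e.g. "causes"
    weight: float  # strength of the connection, assumed to lie in [0, 1]


@dataclass
class DocumentGraph:
    doc_id: str
    nodes: dict[str, str] = field(default_factory=dict)  # node id -> kind
    edges: list[Edge] = field(default_factory=list)

    def add_node(self, node_id: str, kind: str) -> None:
        self.nodes[node_id] = kind

    def add_edge(self, source: str, target: str, relation: str, weight: float) -> None:
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges.append(Edge(source, target, relation, weight))


class GraphIndex:
    """Index documents by the pairs of nodes their edges connect."""

    def __init__(self) -> None:
        self._by_pair: dict[frozenset, list[tuple[str, Edge]]] = defaultdict(list)

    def add(self, graph: DocumentGraph) -> None:
        for edge in graph.edges:
            self._by_pair[frozenset((edge.source, edge.target))].append((graph.doc_id, edge))

    def search(self, a: str, b: str, min_weight: float = 0.0) -> list[tuple[str, Edge]]:
        """Return (doc_id, edge) pairs linking a and b, strongest connection first."""
        hits = [h for h in self._by_pair.get(frozenset((a, b)), []) if h[1].weight >= min_weight]
        return sorted(hits, key=lambda hit: hit[1].weight, reverse=True)


# Two hypothetical papers that mention the same pair of concepts.
paper_a = DocumentGraph("paper_a")
paper_a.add_node("aspirin", "concept")
paper_a.add_node("fever reduction", "result")
paper_a.add_edge("aspirin", "fever reduction", "causes", 0.9)

paper_b = DocumentGraph("paper_b")
paper_b.add_node("aspirin", "concept")
paper_b.add_node("fever reduction", "concept")
paper_b.add_edge("aspirin", "fever reduction", "mentioned alongside", 0.2)

index = GraphIndex()
index.add(paper_a)
index.add(paper_b)

all_hits = [doc_id for doc_id, _ in index.search("aspirin", "fever reduction")]
strong_hits = [doc_id for doc_id, _ in index.search("aspirin", "fever reduction", min_weight=0.5)]
print(all_hits)     # ['paper_a', 'paper_b']
print(strong_hits)  # ['paper_a']
```

Both papers contain the same two terms, but only the one that asserts a strong relationship between them survives the weight threshold, which is the difference between indexing connections and indexing occurrences.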

Advantages:

Although implementing graph-based representations can be resource-intensive, the potential for transforming scientific research is profound.

Conclusion

The adoption of graph-based document representations challenges the status quo of document search and opens a pathway to more effective scientific inquiry — ensuring crucial information is no longer obscured by the limitations of conventional search. This leap could well be a pivotal moment in accelerating scientific progress.
