Dr. Philipp Scharpf

Senior Researcher

During my master’s thesis research in physics at the University of Konstanz, on the topic “Simulation and Visualization of Gravitational Waves from Binary Black Holes”, I recognized the need for effective tools to process mathematical formulas in document analysis and Mathematical Information Retrieval (MathIR).

Since then, I have been reviewing, developing, and evaluating Mathematical Entity Linking (MathEL) methods and applications. Methods include annotation recommendation for formulas and identifiers in STEM documents, formula concept retrieval (classification & clustering), and nearest-neighbor retrieval. Applications include mathematical (STEM) document classification & clustering, Wikipage readability & accessibility, formula search, question answering, and question generation. I focus in particular on the Wikidata knowledge graph.


  • Mathematical Information Retrieval (MathIR)
  • Mathematical Entity Linking (MathEL)
  • Formula search
  • Math document classification & clustering
  • Question answering
  • Question generation
  • Recommender systems
  • Wikipedia & Wikidata


The following list gives examples of student projects (see pdf file here). Feel free to contact me if you are interested in any of these or in my other research topics.


For a list of my publications, see also my profiles on Google Scholar and ResearchGate.

  • Discovery and Recognition of Formula Concepts using Machine Learning
    P. Scharpf, M. Schubotz, H. S. Cohl, C. Breitinger, and B. Gipp
    in Scientometrics Journal (Springer) 2023 PDF
  • Mining Mathematical Documents for Question Answering via Unsupervised Formula Labeling
    P. Scharpf, M. Schubotz, and B. Gipp
    in Proceedings of the ACM/IEEE Joint Conference on Digital Libraries (JCDL) 2022 PDF
  • Fast Linking of Mathematical Wikidata Entities in Wikipedia Articles Using Annotation Recommendation
    P. Scharpf, M. Schubotz, and B. Gipp
    in Companion Proceedings of the Web Conference (WWW) 2021
  • Classification and Clustering of arXiv Documents, Sections, and Abstracts, Comparing Encodings of Natural and Mathematical Language
    P. Scharpf, M. Schubotz, A. Youssef, F. Hamborg, N. Meuschke, and B. Gipp
    in Proceedings of the ACM/IEEE Joint Conference on Digital Libraries (JCDL) 2020
  • Introducing MathQA – A Math-Aware Question Answering System
    M. Schubotz, P. Scharpf, and B. Gipp
    in Information Discovery and Delivery Journal (Emerald Publishing) 2018


2022 – present

Big Data and Learning Analytics, University of Stuttgart

2017 – present

Senior Researcher
Gipp Lab, Universities of Konstanz, Wuppertal, and Göttingen

2022 – present

Data and AI Consultant
AI4Future Dataconsulting, Konstanz, Germany

2018 – 2021

Data and AI Consultant
BMT Business Meets Technology AG, Bottighofen, Switzerland

2019 – 2020

Machine Learning and Information Science Researcher
Chair for Data & Knowledge Engineering, University of Wuppertal

2018 – 2019

Machine Learning and Information Science Researcher
Dept. of Computer and Information Science, University of Konstanz


Physics, M. Sc.
Dept. of Physics, University of Konstanz


Physics, B. Sc.
Dept. of Physics, University of Zürich