John Philip McCrae

I am the leader of the Unit for Linguistic Data at the Data Science Institute of the National University of Ireland Galway. My work has focussed on the intersection of NLP and data science, and I have led the development of the linguistic linked open data cloud, a large-scale integration of many language resources. I am the co-ordinator of the Prêt-à-LLOD project, funded under the European Union's H2020 programme, which aims to make linguistic linked open data ready-to-use. I am also a work package leader in the ELEXIS project on building a new lexicographic infrastructure for Europe. In addition, I am funded by the Irish Research Council under the Laureate program with the Cardamom project, which focuses on the development of comparable deep models for minority and historical languages. Finally, I am a PI in the SFI Insight Centre for Data Analytics and, from 2021, a PI in the SFI ADAPT Centre. I am also a member of the Centre for Applied Linguistics and Multilingualism (CALM) and have active research collaborations with Fidelity Investments and Huawei.

I completed my PhD within 3 years while still publishing a journal article (with 47 citations) and contributing to the BioCaster system for detecting disease outbreaks by processing texts in East Asian languages. After joining Bielefeld University in 2009, I played a leading role in at least two major scientific breakthroughs. Firstly, the development of the lemon Lexicon Model for Ontologies was a major contribution to the representation of semantics relative to natural language. It is now being used by most relevant research groups and was one of the most significant outcomes of the Monnet project, an FP7-funded project. Secondly, out of the work on this topic I have been instrumental in establishing linguistic linked open data as a major research theme, which has been supported by over a dozen workshops and events and was a major theme of the 2016 Language Resources and Evaluation Conference (LREC). This topic led to the Lider project, funded by FP7, which used linguistic linked open data as an enabler for content analytics in enterprise, and where I played a major role in writing the grant and in implementing the work plan. More recently, my work in linked data has played a pivotal role in obtaining funding for the ELEXIS project (under H2020-INFRAIA), where we will apply linked data technologies to lexicography.

My work has led to over 100 publications, and nearly all of the citations to them are for work that did not involve my PhD supervisor. I have co-authored with over 150 co-authors from institutions around the world. You can see more about my publications here.

Full CV

Education

Employment