Oxford BioNLP

OxBioNLP is a digital platform for data extraction, inference and visualisation. It helps automate the writing of bias-free systematic literature reviews and the discovery of new, evidence-backed public health questions in the context of the COVID-19 pandemic.

Research

OxBioNLP proposes a new set of standards for writing systematic reviews. The benefits include a wider literature survey through automated abstract and full-text screening with transformer-based language models, faceted bias detection using advanced NLP and information retrieval techniques, clustering-based search, question discovery and answering, and data visualisation.
This is a step forward in science automation on which Oxford can lead in international COVID-19 research collaborations.

In this extended version, beyond the remaining tasks stated in the previous section, we focus on the knowledge discovery and search component. OxBioNLP-v1.1 aims to provide the fastest and most advanced knowledge discovery and search experience on COVID-19 data lakes, built by our research and development team in Oxford (Computer Science, Pembroke College and Nuffield Department of Medicine). This vision became a reality after a series of cross-divisional meetings within the University of Oxford. With the support of our sponsor, Amazon Web Services (AWS), we receive in-kind consultancy on using AWS technologies efficiently, as well as direct technical support from AWS UK, thanks to Laura Hyatt. During the project we have also been in communication with AWS research scientists in Cambridge, UK, and with the Alexa team in the US.

OxBioNLP-v1.0.0 was presented at ICASR 2021.

Latest Publications

System Architecture: Deliverables

Data Collection: CORD-19 is a rapidly growing repository of over 400,000 scholarly articles about COVID-19 and SARS-CoV-2. If the research community is to stay abreast of this accelerating growth in the literature, it is vital to extract and visualise insights efficiently. Our goal is to complement CORD-19 by extracting articles and opinion reviews from the following platforms, journals and preprint servers: Nature, Science, CDC, NEJM, JAMA, Lancet, Cell, Wiley, BMJ, OUP, Elsevier, PubMed, Embase, medRxiv and bioRxiv.
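
Complementing CORD-19 with articles from other sources means that the same paper can arrive more than once. The sketch below is one possible way to merge such records by DOI; it is not the project's actual collection code. The sample rows are invented, the column names `cord_uid`, `title`, `doi` and `source_x` follow the public CORD-19 metadata.csv layout, and the second table's layout and the helper names `normalise_doi` and `merge_records` are assumptions made for illustration.

```python
import csv
import io

# Invented sample rows; the column names follow the CORD-19 metadata.csv layout.
CORD19_SAMPLE = """cord_uid,title,doi,source_x
abc123,Example study A,10.1000/a,PMC
def456,Example study B,10.1000/b,medRxiv
"""

# Invented rows standing in for articles collected from an additional source.
EXTRA_SAMPLE = """title,doi,source
Example study B (duplicate),10.1000/B,medRxiv
Example opinion review C,10.1000/c,BMJ
"""


def normalise_doi(doi):
    """Lower-case a DOI and strip a resolver prefix so duplicates compare equal."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


def merge_records(cord19_csv, extra_csv):
    """Return CORD-19 rows plus any extra rows whose DOI is not already present."""
    merged = {}
    for row in csv.DictReader(io.StringIO(cord19_csv)):
        merged[normalise_doi(row["doi"])] = {
            "title": row["title"],
            "source": row["source_x"],
        }
    for row in csv.DictReader(io.StringIO(extra_csv)):
        # setdefault keeps the existing CORD-19 entry when the DOI is a duplicate.
        merged.setdefault(
            normalise_doi(row["doi"]),
            {"title": row["title"], "source": row["source"]},
        )
    return merged


records = merge_records(CORD19_SAMPLE, EXTRA_SAMPLE)
print(sorted(records))
```

Keying on a normalised DOI is the simplest choice; records without a DOI would need a fallback such as title matching, which this sketch does not attempt.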

NLP Pipeline: We will design and develop a data processing pipeline to analyse the full text of scientific publications. Articles will be enriched with new annotations. We expect this natural language processing pipeline to understand the content of a large number of articles and to discover new questions that can be addressed with evidence.
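
To make the idea of enriching articles with annotations concrete, here is a minimal sketch of a staged pipeline. It is an illustration under stated assumptions and not the OxBioNLP implementation: the `Article` class, the stage functions and the `TERMS` list are hypothetical, and simple string matching stands in for the trained language models a real pipeline would use.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Article:
    title: str
    text: str
    annotations: dict = field(default_factory=dict)


# Hypothetical term list standing in for a trained entity recogniser.
TERMS = ("sars-cov-2", "covid-19", "vaccine", "remdesivir")


def annotate_terms(article):
    """Record which known terms occur in the article text."""
    lowered = article.text.lower()
    article.annotations["terms"] = [t for t in TERMS if t in lowered]
    return article


def annotate_questions(article):
    """Record sentences that end in a question mark as candidate questions."""
    sentences = re.split(r"(?<=[.?!])\s+", article.text.strip())
    article.annotations["questions"] = [s for s in sentences if s.endswith("?")]
    return article


# Stages run in order; each one adds a key to article.annotations.
PIPELINE = (annotate_terms, annotate_questions)


def run_pipeline(article):
    for stage in PIPELINE:
        article = stage(article)
    return article


doc = run_pipeline(Article(
    title="Hypothetical example",
    text="Remdesivir was given to patients with COVID-19. "
         "Does early treatment shorten recovery?",
))
print(doc.annotations)
```

Because each stage only adds to the annotations, a stage can be replaced by a model-based annotator without changing the rest of the pipeline.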
