Microsoft launches text analytics to organise a deluge of healthcare data

The software company’s new feature comes as the healthcare industry is flooded with data amid the COVID-19 pandemic.

Updated - July 14, 2020 02:35 pm IST

Published - July 14, 2020 01:10 pm IST

File photo.

Microsoft has announced a new text analytics feature in Azure Cognitive Services that will allow developers to process and extract insights from unstructured medical data.

“Text Analytics enables health care providers, researchers, and companies to extract rich insights and relationships from unstructured medical data,” said Eric Boyd, Corporate Vice President, Azure AI.

Most of the data is in the form of unstructured text, such as doctors’ notes, medical publications, electronic health records, clinical trial protocols, medical encounter transcripts and more, the company said.

However, healthcare organisations, providers, researchers, pharmaceutical companies, and others are facing challenges in identifying and drawing insights from this unstructured information.

Microsoft’s Text Analytics for Health helps providers quickly access and process this information to find solutions that will improve health outcomes.

Azure Text Analytics for Health is a containerised service that extracts and labels relevant medical information from texts such as doctors’ notes, discharge summaries, clinical documents, and electronic health records.

Text Analytics for Health currently performs named entity recognition (NER), relation extraction, entity negation and entity linking on English-language text. It runs in the developer’s own environment, which meets the developer’s security and data governance requirements.

NER identifies words and phrases in unstructured text that can be associated with one or more semantic types, such as diagnosis, medication name, symptom or age. It supports only English-language documents, and it lists and labels only the relevant medical entities found in unstructured text.
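
To illustrate the idea of labelling phrases with semantic types, here is a minimal Python sketch. It is a toy, dictionary-based tagger and not Microsoft's method or API; the `LEXICON`, the `AGE_PATTERN` and the `tag_entities` function are all hypothetical names invented for this example, and the labels simply reuse the types mentioned above.

```python
import re
from typing import List, NamedTuple


class Entity(NamedTuple):
    text: str      # the phrase as it appears in the note
    category: str  # the semantic type assigned to it
    start: int     # character offset of the phrase


# Hypothetical toy lexicon; a real service relies on trained models.
LEXICON = {
    "ibuprofen": "medication name",
    "headache": "symptom",
    "diabetes": "diagnosis",
}

# Assumed pattern for ages written like "54-year-old".
AGE_PATTERN = re.compile(r"\b\d{1,3}-year-old\b")


def tag_entities(text: str) -> List[Entity]:
    """Return the lexicon terms and ages found in text, in reading order."""
    found = []
    lowered = text.lower()
    for term, category in LEXICON.items():
        for match in re.finditer(r"\b" + re.escape(term) + r"\b", lowered):
            found.append(Entity(text[match.start():match.end()], category, match.start()))
    for match in AGE_PATTERN.finditer(text):
        found.append(Entity(match.group(), "age", match.start()))
    return sorted(found, key=lambda entity: entity.start)


note = "A 54-year-old patient with diabetes took ibuprofen for a headache."
for entity in tag_entities(note):
    print(entity.category, "->", entity.text)
```

Anything outside the lexicon is ignored, which mirrors the point that only relevant medical terms are labelled.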

Image of named entity recognition. Picture by special arrangement.

Relation extraction detects meaningful connections between concepts mentioned in text. For example, a “time of condition” relation is found by associating a condition name with a time.
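
The “time of condition” example can be sketched in a few lines of Python. This is only an illustration under simple assumptions: entities are already tagged, and each condition is paired with the nearest time expression by character offset. The `time_of_condition` function and the tuple layout are hypothetical and are not taken from the Azure service.

```python
from typing import List, Tuple

# (text, category, character offset); the layout is an assumption for this sketch.
Entity = Tuple[str, str, int]


def time_of_condition(entities: List[Entity]) -> List[Tuple[str, str, str]]:
    """Pair each condition with the closest time expression, if any exists."""
    conditions = [e for e in entities if e[1] == "condition"]
    times = [e for e in entities if e[1] == "time"]
    relations = []
    for condition in conditions:
        if not times:
            break
        nearest = min(times, key=lambda t: abs(t[2] - condition[2]))
        relations.append(("time of condition", condition[0], nearest[0]))
    return relations


tagged = [("chest pain", "condition", 22), ("two days", "time", 37)]
print(time_of_condition(tagged))
```

A production system would weigh grammar and context rather than distance alone; proximity is used here only to keep the example short.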

Image of relation extraction. Picture by special arrangement.

Entity linking associates named entities mentioned in text with concepts found in a predefined database. Text Analytics for Health supports linking to the health and biomedical vocabularies found in the Unified Medical Language System (UMLS) Metathesaurus knowledge source.
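
In its simplest form, entity linking is a lookup from a mention to a concept identifier. The Python sketch below assumes a tiny hand-made vocabulary; the `VOCABULARY` table, the `link_entity` function and the `CONCEPT-…` identifiers are placeholders made up for illustration and are not real UMLS codes.

```python
from typing import Dict, Optional

# Hypothetical stand-in for a medical vocabulary. The identifiers are
# placeholders, not real UMLS concept codes.
VOCABULARY: Dict[str, str] = {
    "heart attack": "CONCEPT-0001",
    "myocardial infarction": "CONCEPT-0001",
    "high blood pressure": "CONCEPT-0002",
    "hypertension": "CONCEPT-0002",
}


def link_entity(mention: str) -> Optional[str]:
    """Normalise case and spacing, then look the mention up in the vocabulary."""
    key = " ".join(mention.lower().split())
    return VOCABULARY.get(key)


print(link_entity("Heart  Attack"))
print(link_entity("hypertension") == link_entity("High blood pressure"))
```

The useful property is that different surface forms of the same idea resolve to one identifier, so records that use different wording can still be compared.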

The feature supports negation detection for different entities mentioned in the text. The meaning of medical content is affected by modifiers such as negation, which can have critical implications if it leads to a misdiagnosis.
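
Why negation matters can be shown with a small rule-based Python sketch: the same entity means the opposite when a cue such as “no” or “denies” precedes it. The cue list, the three-word window and the `is_negated` function are assumptions chosen for this example, not a description of how the Azure feature works.

```python
import re

# Assumed cue words and window size, chosen only for illustration.
NEGATION_CUES = ("no", "not", "denies", "without")
WINDOW = 3  # number of words before the entity that are scanned for a cue


def is_negated(sentence: str, entity: str) -> bool:
    """Return True if a negation cue appears shortly before the entity."""
    words = re.findall(r"[a-z']+", sentence.lower())
    target = entity.lower().split()
    if not target:
        return False
    for i in range(len(words) - len(target) + 1):
        if words[i:i + len(target)] == target:
            preceding = words[max(0, i - WINDOW):i]
            if any(cue in preceding for cue in NEGATION_CUES):
                return True
    return False


print(is_negated("Patient denies chest pain.", "chest pain"))
print(is_negated("Patient reports chest pain.", "chest pain"))
```

A fixed window is a crude rule, and it will misjudge longer or more complex sentences, which is part of why trained models are used for this task.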

COVID-19 has accelerated the urgency for organisations to find new ways to process these data and generate new insights. The new feature will help the healthcare industry to extract richer insights, save time and reduce costs, and improve customer engagement, Microsoft said in a statement.

Microsoft has trained the health feature on a diverse range of medical data covering various formats of clinical notes, clinical trials protocols, and more, said Hadas Bitran, Group Manager, Microsoft Healthcare.

“It is capable of processing a broad range of data types and tasks, without the need for time-intensive, manual development of custom models to extract insights from the data.”

Microsoft has also partnered with the Allen Institute for AI and research groups to prepare the COVID-19 Open Research Dataset, a free resource of over 47,000 scholarly articles for use by the global research community.

“With Cognitive Search and Text Analytics, we developed the COVID-19 search engine, which enables researchers to evaluate and gain insights from the overwhelming amount of information,” Bitran said.
