Mohammad Al-Ubaydli’s blog

Semantic wikis

Posted in Medicine, People / organisations, Technology by Dr Mohammad Al-Ubaydli on August 18, 2008

The semantic web is the most important technology development that I am waiting to see reach mass adoption. Even the Economist’s essayists understand this. But mass adoption is still far away.

In the meantime, the technology is maturing nicely, and over the last week I have begun making use of Semantic MediaWiki (SMW) software to document clinical knowledge. Part of the course that I teach at UCL’s medical school this September is to get medical students to populate this wiki.

It is worth reading the Semantic MediaWiki user manual to understand why the technology is so useful. Here is its list of five reasons, modified with my own explanations of the clinical benefits:

  1. Manually generated lists. Wikipedia is full of manually edited listings such as the causes of secondary hypertension. Errors are common when a list has to be updated manually. Furthermore, the number of potentially interesting lists is huge, and it is impossible to provide all of them in acceptable quality. In SMW, lists are generated automatically like this. They are always up-to-date and can easily be customised to obtain further information.
  2. Searching information. Much of Wikipedia’s knowledge is hopelessly buried within millions of pages of text, and can hardly be retrieved at all. For example, there is no list of diseases that present with coughing and weight loss in Wikipedia. An SMW query would be much more effective than a text search.
  3. Inflationary use of categories. The need for better structuring becomes apparent from the enormous use of categories in Wikipedia. While this is generally helpful, it has also led to a number of categories that would be mere query results in SMW. For example, the category deaths from leukemia lists people who have died from the disease, but the disease itself is absent from the category vascular disorders, which only includes migraine, cluster headache and reflex neurovascular dystrophy. The contents of such categories could easily be replaced by simple queries that use just a handful of annotations. For example, Category:Disease, Property:body system, Category:People, Property:death from, and Property:date of death would suffice to create thousands of similar listings on the fly, and to remove hundreds of Wikipedia categories.
  4. Inter-language consistency. Apart from overcoming the differences between hematologists in the USA and haematologists in the UK, you can ask for the incidence of leukemia (or even leukaemia) in the Chinese Wikipedia without reading a single word of the language. This can be exploited to detect possible inconsistencies that can then be resolved by editors. For example, the classification is slightly different for leukemia in the English Wikipedia and Leukämie in the German one.
  5. External reuse. Some desktop tools today make use of Wikipedia’s content, e.g. the media player Amarok displays articles about artists during playback. However, such reuse is limited to fetching an article for immediate reading. The program cannot exploit the information (e.g. to find songs by artists that have worked for the same label), but can only show the text in some other context. SMW makes a wiki’s knowledge usable outside the context of its textual article. Since semantic data can be published under a free license, it could even be shipped with software to save bandwidth and download time.
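To make the second and third reasons more concrete, here is a minimal Python sketch of the idea behind annotations and queries. It is not how SMW is implemented, and none of the data comes from my wiki: the page names, the "presents with" property and the sample values are all assumptions chosen for illustration. Each annotation is a (page, property, value) triple, and a query is simply the set of pages that satisfy every condition.

```python
from collections import defaultdict

# Illustrative annotations only. The page names, property names and values
# are assumptions for this sketch, not content taken from any real wiki.
ANNOTATIONS = [
    ("Tuberculosis", "category", "Disease"),
    ("Tuberculosis", "presents with", "cough"),
    ("Tuberculosis", "presents with", "weight loss"),
    ("Lung cancer", "category", "Disease"),
    ("Lung cancer", "presents with", "cough"),
    ("Lung cancer", "presents with", "weight loss"),
    ("Asthma", "category", "Disease"),
    ("Asthma", "presents with", "cough"),
    ("Migraine", "category", "Disease"),
    ("Migraine", "presents with", "headache"),
]


def build_index(annotations):
    """Map each (property, value) pair to the set of pages annotated with it."""
    index = defaultdict(set)
    for page, prop, value in annotations:
        index[(prop, value)].add(page)
    return index


def ask(index, *conditions):
    """Return the sorted pages that satisfy every (property, value) condition."""
    if not conditions:
        return []
    # A condition nobody is annotated with matches the empty set.
    matches = [index.get(condition, set()) for condition in conditions]
    return sorted(set.intersection(*matches))


if __name__ == "__main__":
    index = build_index(ANNOTATIONS)
    # "Diseases that present with coughing and weight loss"
    result = ask(
        index,
        ("category", "Disease"),
        ("presents with", "cough"),
        ("presents with", "weight loss"),
    )
    print(result)
```

The point of the sketch is that the list is never edited by hand. Add one more annotated page and the same query returns it the next time it runs, which is why such a listing does not need its own manually maintained category.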

The last reason is crucial for me. The development of decision support systems is stymied by each provider trying to create its own text-based, proprietary system. The semantic incompatibility of these systems means that creators of electronic medical records cannot integrate the work of decision support system providers.

It remains to say that I learned about semantic wikis by attending HealthCamp Md, which was organized by the very wonderful Mark Scrimshire. Aside from inspiring me to try Twitter (I am still waiting for my new smartphone to arrive in the UK) and to start HealthCamp UK, he allowed me to meet Melanie Swan and Mike Cariaso. As a bona fide futurist, Melanie had already received her DNA analysis from 23andMe and gladly shared the data with Mike. Mike ran the data through the tools he had created at SNPedia. Within a couple of hours he was able to give Melanie an explanation of what was known about her DNA sequence, using the latest information documented in the SNPedia semantic wiki.

I was shocked and awed, and wanted to create something similar for doctors to use.