Posted in Medicine, People / organisations, Technology by Dr Mohammad Al-Ubaydli on August 18, 2008

The semantic web is the most important technology development that I am waiting to see reach mass adoption. Even the Economist’s essayists understand this. But mass adoption is still far away.

In the meantime, the technology is maturing nicely, and over the last week I have begun making use of Semantic MediaWiki (SMW) software to document clinical knowledge. Part of the course that I teach at UCL’s medical school this September is to get medical students to populate this wiki.

It is worth reading the Semantic MediaWiki user manual to understand why the technology is so useful. Here is the list of five reasons, modified with my own explanations of the clinical benefits:

  1. Manually generated lists. Wikipedia is full of manually edited listings such as the causes of secondary hypertension. Errors are common when a list has to be updated manually. Furthermore, the number of potentially interesting lists is huge, and it is impossible to provide all of them in acceptable quality. In SMW, lists are generated automatically like this. They are always up-to-date and can easily be customised to obtain further information.
  2. Searching information. Much of Wikipedia’s knowledge is hopelessly buried within millions of pages of text, and can hardly be retrieved at all. For example, there is no list of diseases that present with coughing and weight loss in Wikipedia. A SMW query would be much more effective than a text search is.
  3. Inflationary use of categories. The need for better structuring becomes apparent from the enormous use of categories in Wikipedia. While this is generally helpful, it has also led to a number of categories that would be mere query results in SMW. For example, the category deaths from leukemia lists people who have died from the disease, but the disease itself is absent from the category vascular disorders, which only includes Migraine, cluster headache and reflex neurovascular dystrophy. The contents of such categories could easily be replaced by simple queries that use just a handful of annotations. For example, Category:Disease, Property:body system, Category:People, Property:death from, and Property:date of death would suffice to create thousands of similar listings on the fly, and to remove hundreds of Wikipedia categories.
  4. Inter-language consistency. Apart from overcoming the differences between hematologists in the USA and haematologists in the UK, you can ask for the incidence of leukemia (or even leukaemia) in the Chinese Wikipedia without reading a single word of this language. This can be exploited to detect possible inconsistencies that can then be resolved by editors. For example, the classification is slightly different for leukemia in the English Wikipedia and leukämie in the German one.
  5. External reuse. Some desktop tools today make use of Wikipedia’s content, e.g. the media player Amarok displays articles about artists during playback. However, such reuse is limited to fetching some article for immediate reading. The program cannot exploit the information (e.g. to find songs of artists that have worked for the same label), but can only show the text in some other context. SMW makes a wiki’s knowledge usable outside the context of its textual article. Since semantic data can be published under a free license, it could even be shipped with software to save bandwidth and download time.
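
To make the second and third reasons concrete, here is a minimal sketch in Python of what a query over annotated pages looks like. It is not Semantic MediaWiki code and does not use SMW's own query syntax. The page names, the property names ("presents with", "body system") and the `ask` helper are all assumptions invented for illustration, and the handful of entries is not clinical reference data.

```python
from typing import Dict, List, Set

# Hypothetical annotations: each wiki page maps property names to sets of values.
# Page names, property names and values are illustrative assumptions only.
Annotations = Dict[str, Dict[str, Set[str]]]

PAGES: Annotations = {
    "Tuberculosis": {
        "category": {"Disease"},
        "presents with": {"Cough", "Weight loss", "Night sweats"},
        "body system": {"Respiratory"},
    },
    "Lung cancer": {
        "category": {"Disease"},
        "presents with": {"Cough", "Weight loss", "Haemoptysis"},
        "body system": {"Respiratory"},
    },
    "Asthma": {
        "category": {"Disease"},
        "presents with": {"Cough", "Wheeze"},
        "body system": {"Respiratory"},
    },
    "Hyperthyroidism": {
        "category": {"Disease"},
        "presents with": {"Weight loss", "Tremor"},
        "body system": {"Endocrine"},
    },
}


def ask(pages: Annotations, conditions: Dict[str, Set[str]]) -> List[str]:
    """Return the sorted names of pages whose annotations contain
    every required value for every property in `conditions`."""
    matches = []
    for name, props in pages.items():
        # A page matches when each required set is a subset of its annotations.
        if all(required <= props.get(prop, set())
               for prop, required in conditions.items()):
            matches.append(name)
    return sorted(matches)


if __name__ == "__main__":
    # "Diseases that present with coughing and weight loss"
    print(ask(PAGES, {
        "category": {"Disease"},
        "presents with": {"Cough", "Weight loss"},
    }))
```

The list is computed from the annotations rather than maintained by hand, so adding a new annotated page updates every listing that depends on it.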

The last reason is crucial for me. The development of decision support systems is stymied by each provider trying to create their own, text-based, proprietary system. The semantic incompatibility of these systems means that creators of electronic medical records cannot integrate the work of decision support systems providers.

It remains to say that I learned about the semantic wikipedia by attending HealthCamp Md, which was organized by the very wonderful Mark Scrimshire. Aside from inspiring me to try Twitter (I am still waiting for my new smartphone to arrive in the UK) and to start HealthCamp UK, he allowed me to meet Melanie Swan and Mike Cariaso. As a bona fide futurist, Melanie had already received her DNA analysis from 23andMe and gladly shared the data with Mike. Mike ran the data through the tools he had created at SNPedia. Within a couple of hours he was able to give Melanie an explanation of what was known about her DNA sequence, using the latest information documented in the SNPedia semantic wiki.

I was shocked and awed, and wanted to create something similar for doctors to use.

  1. cariaso said, on August 18, 2008 at 3:58 pm

    Melanie Swan didn’t just share her genome with me, she shared it with the whole world. So for those who are curious you can see what she learned at

    In addition to the ways in which Semantic MediaWiki lets you browse your data, there are also query interfaces. Some are quite simple and designed to be embedded directly into your wiki. But for anyone who feels this can’t be a ‘real’ database, because real databases speak SQL, you should be aware that Semantic MediaWiki can also expose a SPARQL interface, which is a real standard and extremely SQL-like.

  2. Greg said, on August 18, 2008 at 4:09 pm

    Any doctor worldwide can use the SNPedia database for information – and we know there are at least a handful of doctors now who are savvy enough to – and for their own personal use. It would be great if we also got more medical students to use (and contribute to or even edit) SNPedia.

    The biggest hurdle is getting the health care community up to speed on all of this. Frankly, since in the vast majority of cases the information is not (yet) useful enough to help guide diagnoses or treatment decisions, we’re anticipating that it will take some time, and most likely won’t really come together until after full genomic sequence is readily available. [Microarrays are just sampling a subset of the genome.] It will also happen sooner in countries with more progressive health care systems (like most European countries).

    As to creating a report of a patient’s genome for physicians and other health care providers to use, we encourage you and others to use and help improve our existing applications. The Promethease software you saw at HealthCamp will get better and better as the user community grows and contributes suggestions.

  3. VocabControl » Semantic wikis said, on September 3, 2008 at 11:08 am

    […] Semantic wikis is a description of how semantic technology could be used to overcome retrieval problems in large-scale resources – in this case medical information. Once we start looking at DNA there is just so much data that we have to find clever ways of organising it. An excellent post from a fascinating and highly entertaining blog that ranges over many subjects. […]

  4. Fran said, on September 3, 2008 at 11:15 am

    This is a great analysis of some of the key knowledge organisation issues that crop up all over the place, but in medical information it strikes me that the problems are particularly severe. The vagueness around symptom description especially requires some very clever and flexible categorisation. Biowisdom have been doing some interesting things cross-linking databases and working on data visualisations.

  5. […] attended HealthCamp Maryland back in June, had a great time and learned a huge amount. Judging by the early applications there […]

