Making sense of HIV prevalence data

30th November, 2009

Marching on with my bold ambition to engage people with science, and prompted by the upcoming World AIDS Day (1st December, in case you did not know) and by the book I am reading, Bad Science by Ben Goldacre (a revelation on the misrepresentation of science in the media), I have decided to focus this blog on AIDS research in Malawi. A quick and dirty search on Medline, the largest international biomedical journal database, reveals that so far in 2009, 35 scientific papers have been published that report on AIDS in Malawi. After a close examination of these articles, I excluded three that are not AIDS-specific (one was on visually impaired children and the other two on human resources). Before I proceed, I would like to caution that not all published studies are indexed in Medline, and searches of other databases, e.g. African Journals Online, might reveal more papers. That said, a journal’s inclusion in Medline is a mark of quality. I will discuss articles that were of particular interest because they discussed a sensitive issue, had far-reaching implications for policy and practice, or highlighted an important public health crisis.

One article that caught my eye is titled Refusal bias in HIV prevalence estimates from nationally representative seroprevalence surveys, by Reniers and Eaton, published in the journal AIDS [1]. This article discusses the relationship between prior knowledge of one’s HIV status and the likelihood of refusing an HIV test, and how this can bias HIV prevalence estimates. It got me thinking about HIV prevalence, a very common statistic that is frequently cited by activists, policy makers, scientists and all manner of institutions.

HIV prevalence, as defined by the WHO, is the percentage of people with HIV infection among all people aged 15-49 years. This statistic is very important for advocacy, programme planning and evaluation, e.g. adequate stocking of ARVs in clinics, and monitoring the trend and the impact of programmes. The WHO website provides some insight into the rationale for its use and the methods for its estimation. I won’t bore you with the calculations, but you should know that in the 1990s this figure was generated from surveillance systems that used data on pregnant women who attended preselected antenatal clinics (ANC). The major assumption was that prevalence amongst pregnant women was a close approximation of the prevalence in the adult population of men and women between the ages of 15 and 49. Now you may well ask, why specifically the age range 15-49? Unfortunately, I was only able to get a vague answer from a UNAIDS document, which states that this age range covers people in their most sexually active years, when they are most likely to become infected with HIV. Depending on who is reporting the prevalence in Malawi, it can range from 12% to 15%, but the official figures according to various sources are as follows:

–         The UNAIDS/WHO 2007 AIDS epidemic update reports a prevalence of 12.7% for Malawi in 2004, based on a population-based survey, while the figures for 2001, 2003 and 2005 are 15%, 14.2% and 14.1% respectively.

–         The 2004 Malawi Demographic and Health Survey (MDHS) reports an observed prevalence of 11.8% and a nonresponse-adjusted estimate of 12.7%. Nonresponse adjustment means adjusting for people who were not tested, either because they refused or because they were not at home when the health worker came for the test. (The MDHS project is a goldmine of information, gathering data at the regional and national level on a range of health and demographic indicators such as fertility, childhood mortality levels, awareness and use of family planning methods, and knowledge and behaviours related to HIV/AIDS and other sexually transmitted infections. They do not survey everyone but use fancy statistics to select a sample of Malawians whose results are extrapolated as representative of the whole population. They conduct interviews with subjects and, in the case of HIV, do blood tests.)

–         I did try to access figures from the National AIDS Commission but NAC’s website was not responding (!*^!%!).
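
As a rough illustration of what a nonresponse adjustment does to the MDHS figure above, the sketch below treats the adjusted prevalence as a weighted average of the prevalence among people who were tested and an assumed prevalence among those who were not. The 70% testing rate and the 14.8% prevalence among the untested are made-up numbers, chosen only so that the result lands on the published 12.7%. They are not taken from the MDHS report, which relies on a more involved statistical model.

```python
def adjusted_prevalence(tested_share, prev_tested, prev_untested):
    """Weighted average of prevalence among tested and untested people."""
    return tested_share * prev_tested + (1 - tested_share) * prev_untested


# Observed prevalence among those who were tested (2004 MDHS, as quoted above).
observed = 0.118

# HYPOTHETICAL values for illustration only; not from the MDHS report.
tested_share = 0.70       # assumed share of sampled people who were tested
assumed_untested = 0.148  # assumed prevalence among those not tested

adjusted = adjusted_prevalence(tested_share, observed, assumed_untested)
print(f"Observed: {observed:.1%}, adjusted: {adjusted:.1%}")
```

The point is simply that if the people who were missed are more likely to be infected than the people who were tested, the adjusted figure ends up higher than the observed one.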

If you look at the figures you may notice (with horror) that the figure was 14.2% in 2003, 12.7% in 2004 and then went up to 14.1% in 2005, but it is incorrect to analyse it in this way because the sources of the data differ and are not comparable. The data for 2004 were sourced through a population-based survey, the MDHS, which actually tested a sample of the population, while the other figures are based on ANC data. Both methods have their pros and cons. It is now recognized that prevalence estimates based on ANC data typically overestimate true prevalence because the women who attend ANC are not representative of all pregnant women, e.g. women in rural and remote areas do not attend ANC and would therefore be underrepresented in such a sample. Additionally, women with HIV-associated infertility are not captured, and men and non-pregnant women are not included. The UNAIDS Reference Group on Estimates, Modelling and Projections, which is responsible for developing methods for calculating prevalence, has now developed schemes to correct and improve ANC data. There is an interesting comment on the WHO website: “The main indicator proposed for monitoring progress towards achieving the international goals is HIV prevalence among young people aged 15-24 years which is a better proxy for monitoring HIV incidence than prevalence among ages 15-49 years.”

Population-based surveys like the MDHS are expensive and logistically difficult to carry out and are therefore not conducted every year. But taken together, the two sources complement each other and provide a clear picture of overall trends, the geographical distribution of HIV, and information on high-risk groups.

Going back to the study by Reniers and Eaton, they revise the 2004 HIV prevalence estimate to 13.3% (in one place it is quoted as 13.2%, which is wrong!) and suggest that “our estimates are conservative”. What the authors are putting forward is that people who know they are HIV positive are 4.62 times more likely to refuse to participate in a health survey such as the MDHS. This refusal can bias the final prevalence estimate; bearing this in mind, they have revised the figure to a prevalence of 13.3% for Malawi in 2004. So what does this mean in actual numbers? We would need to know the actual population of people aged 15-49 in 2004, but unfortunately censuses (yes, censuses is the plural of census) have only been done in 1998 and 2008. So let us, for example, say there were 6 million people aged 15-49 in 2004 (assumed from the CIA World Factbook, Malawi 2004). The MDHS figure of 12.7% then means 762,000 HIV-infected people and the authors’ 13.3% means 798,000 HIV-infected people, a difference of 36,000 people. If you remember that these figures are important for planning and monitoring purposes, then 36,000 can be significant. (I am not a statistician, so I am unable to say whether this is statistically significant.)
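
For anyone who wants to check the arithmetic or try a different population figure, here is a minimal sketch of the calculation. The 6 million is the same assumed population as in the text, not a census count, and the function name is just something I made up for the example.

```python
# ASSUMED population aged 15-49 in 2004 (illustrative, not a census figure).
population_15_49 = 6_000_000


def infected_count(prevalence_percent, population):
    """Number of infected people implied by a prevalence given in percent."""
    return round(population * prevalence_percent / 100)


mdhs_count = infected_count(12.7, population_15_49)     # MDHS adjusted estimate
revised_count = infected_count(13.3, population_15_49)  # revised estimate
difference = revised_count - mdhs_count

print(f"MDHS 12.7%:    {mdhs_count:,}")
print(f"Revised 13.3%: {revised_count:,}")
print(f"Difference:    {difference:,}")
```

Changing `population_15_49` shows how sensitive the head count is to the population assumption: every extra million people adds 6,000 to the gap between the two estimates.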

I think I have rambled for far too long on statistics and science, but I hope you now have a better understanding of prevalence, how it is derived, and how figures may vary depending on whether they come from population-based survey data or ANC data. Finally, and importantly, remember that a prevalence of 10% means that 1 in 10 people between the ages of 15 and 49 is infected, not every tenth person!

As promised, in the next blog I will look at some other scientific papers on AIDS research in Malawi.

Don’t forget to buy your red ribbon, support an AIDS charity, and say a prayer for those living with and affected by HIV and AIDS.


1. Reniers G, Eaton J (2009). Refusal bias in HIV prevalence estimates from nationally representative seroprevalence surveys. AIDS 23(5):621-9.

2 thoughts on “Making sense of HIV prevalence data”

  1. So what, in your view, is the closest-to-true current prevalence? 13.3%?

    Just read that “In 2004 every hour 10 Malawians were dying of HIV. That figure has gone down by 80 percent to 1.5 to two people dying per hour,” – Dr Mary Shaba.

  2. To be honest…it’s always better to have a higher approximation for planning purposes…but it’s odd because in the actual paper they don’t discuss the implications of the difference in the prevalence figures.

    If Mary Shaba is right then that would mean that close to 90,000 people died of HIV in 2004 which is close enough to figures UNAIDS reported i.e. in 2003 low estimate was 60,000 and the high estimate 120,000…for sure life saving ARVs are making a difference!
