
04.12.2008
“U-S-A! Num-ber One! U-S-A! Num-ber One!” So goes the patriotic chant. Many American patriots may be disappointed to find out that, if you look at governance indicators, the U.S.A. is rarely ranked number one. Governance indicators are those country scores or rankings you may have heard about in news reports or political speeches. Several companies and non-profit organisations purport to measure a multifaceted area of public policy or public administration by measuring each facet. A set of indicators is the result. Each indicator may be analyzed on its own or, more likely, it will be clumped together with others to create a composite indicator: a number that summarizes the larger policy area, such as standard of living, respect for human rights, quality of democratic institutions, and the like. And when several composite indicators result, there is a temptation to clump those together to create one big aggregate indicator, a kind of “how good is your country’s governance system” summary score. In any case, the results usually don’t make for catchy chants: “U-S-A! Num-ber Seven! Just Ahead of Finland! On Four Out of Five Indicators!”
My fellow Canadians have a complex about governance indicators. For a long time, Canada sat at the top of the United Nations’ Human Development Index, which most people interpret as a proxy for the standard of living. The 2007-08 Index puts Canada at number four. There are no more bragging rights. News reporters interrogate politicians about what’s gone wrong whenever the new numbers come out. Ordinary citizens sing songs of despair from their rooftops.
These governance indicators raise more questions than they answer. Do these indicators really measure what pundits and researchers claim they measure? How accurate are they? Is a rank of, say, fourth much better than the rank of fifth … or twelfth … or twentieth? These questions are important because the profile of these indicators is growing. These catch-all numbers make it easy for news organisations to quickly post stories that stoke nationalist sentiment. That’s not a coincidence. A single number that summarizes a complicated policy area is easier to report than the messy details of policy. Most of these indicators were created with the specific intent to shame and cajole governments into making policy and institutional improvements. Activists fuss and bleat whenever their home nation isn’t atop the pile. Pundits have yet another excuse to play the blame game. The lazier policy wonks write alarmist reports of national decline. Investors fidget nervously. But do governance indicators really deserve all of this Sturm und Drang?

Uses and Abuses of Governance Indicators by Christiane Arndt and Charles Oman (Organisation for Economic Co-operation and Development, 2006), 122 pp.
The answer is no. Christiane Arndt and Charles Oman explain why in their book Uses and Abuses of Governance Indicators. The book is written in a technocratic style that comes with the obligatory constructive suggestions and robotic tone. However, the analysis screams: if you only knew how these indicators are cooked up, you wouldn’t be paying attention to small differences in ranking; you may even decide to ignore the indicators entirely! As with all statistics, the methodological details are what really matter when it comes to interpretation. So let’s take a closer look.
First, let’s get the particulars out of the way. Arndt & Oman focus on five governance indicators.
- The International Country Risk Guide rating system (by Companyrisk.com) grades countries in terms of investment risk.
- The Freedom in the World rating system (by Freedom House) grades the political rights and civil liberties of countries.
- The Corruption Perceptions Index (by Transparency International) rates countries according to citizens’ views about state corruption.
- The World Bank’s Country Policy & Institutions Assessments rate countries according to economic, trade, regulation, labour market, social welfare, and public management policies.
- The World Bank also produces a set of governance indicators called the KKZ, which stands for the last names of the inventors (Kaufmann, Kraay, and Zoido-Lobaton). This indicator rates countries on topics such as quality of democracy, corruption, political stability, regulation, the rule of law, and government effectiveness.
This selection represents only a fraction of the indicators available. These particular indicators were developed, for the most part, because of a dissatisfaction with traditional assessments of investment risk for most countries. Traditional assessments failed to alert investors and traders to regimes that aren’t always “business friendly” or “freedom loving” … or unstable political systems on the verge of revolution … or kleptocracies that fleece biz-folk of their pocket change by collecting bribes and kickbacks … or, or. So, the argument goes, let’s just rate countries for investment worthiness much like watchdog agencies (e.g., Moody’s) rate companies for credit worthiness. That’s easier said than done.
The first thing to know about governance indicators is that the vast majority of them measure the subjective perceptions of residents, not independently verifiable facts. In the case of corruption indicators, for example, the indicator doesn’t actually measure the extent of corruption. Instead, it measures the extent to which the locals think there is corruption. So if the locals have high standards and a petty scandal happens, then the score may be low. Conversely, if corruption is prevalent and the locals have grown accustomed to a certain amount of it, they may not rate the country low on the scale. Even if a governance indicator is based on facts, these facts are almost always about the formal policies and organisational arrangements of the state. The indicators don’t measure whether these policies and arrangements work as intended, nor do they provide any indication of the degree of success. So, at first blush, these indicators would seem to have little resemblance to those of credit-rating agencies. Those agencies measure things such as indebtedness and profits which show a company’s affairs quite plainly (notwithstanding the creative accounting and disclosure trickery of some companies).
I’m not going to discuss the strengths and weaknesses of each set of indicators mentioned above. You can read the book for that. Instead, I’m just going to itemize some of the problems with these indicators and their use. Obviously, each problem doesn’t necessarily relate to every indicator.
- The inappropriate tracking of trends. Some users show the evolution of governance indicators over time even though indicators are not comparable from one time period to the next. These comparability problems may be because of changes in the way indicators are measured or because countries are rated relative to others for a given time period.
- Measurement errors. Almost all statistics have some amount of measurement error; measurers and measuring conditions are rarely perfect. However, many governance indicators have measurement error that is so large that it is inappropriate to compare countries with similar scores. When compound indicators are created from sets of indicators, the errors are also compounded. Moreover, the amount of measurement error can vary depending on the country, with poor countries and politically restrictive countries causing particular problems.
- Country exclusions. Some composite indicators kick particular countries out of the mix for various methodological reasons. Excluded countries are usually the poorly performing ones. Some sets of indicators exclude countries because not enough information is available. Again, the countries suspected of being the worst performers tend to be the ones that are less forthcoming with information.
- Grouping together apples and oranges. Some composite indicators lump together radically different areas of policy and administration. This is particularly a problem with the all-summarising aggregate indicators. So if a country scores well in one area and poorly in another, the summary score will give the mistaken impression that the country is satisfactory. Likewise, if a country scores very well on most indicators but scores very badly on one extremely important indicator, the country will be graded as doing very well. The point is that, when combining indicators, there needs to be some sort of logic and coherence to the grouping (i.e., correlation). That’s not always the case.
- Missing details. A number of governance indicators are publicly available but all of the methodological details may not be disclosed. Proprietary governance indicators rarely come with adequate methodological details, including the basics about how a particular indicator is put together. I suspect that is because companies don’t want to make it easy for others to reproduce (and possibly improve on) their indicators. These companies may also be hiding questionable methodological practices and other quality shortcomings. Arndt & Oman emphasize the complexity of creating composite indicators as a major disclosure problem. The “lack of transparency” is what concerns them the most because it makes it difficult to detect other shortcomings and complications.
- Unrealistic, embedded assumptions. Indicators are models of a sort and all models are based on assumptions. Statistical procedures used to create indicators also involve underlying assumptions. Even among the most carefully constructed governance indicators, underlying assumptions can be unrealistic. For example, Arndt & Oman single out the KKZ indicator. When a composite indicator is constructed (for, say, the subject “rule of law”), the individual indicators are weighted according to their strength of correlation with each other. This is to improve the coherence of the grouping, so to speak. Yet to perform the calculation, it is also assumed that the measurement errors associated with each indicator are not correlated, which is a grossly unrealistic assumption. This creates large margins of error and undermines the comparability of indicators across countries.
- Sample bias. Sets of indicators are built from several sources. For example, some sources may be major international surveys (such as Latinobarometer), while other sources may be private companies (such as The Economist Intelligence Unit). So a particular indicator may be heavily reliant on, say, business surveys and expert assessments (as the KKZ indicator is). Major segments of society are excluded, a serious bias. In the case of a private-sector bias, an indicator may just be measuring a bandwagon of perceptions among business-minded people, not the perceptions of the adult population of a country.
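To make the point about correlated measurement errors concrete, here is a minimal sketch in Python. It is not the KKZ method, which uses a more elaborate weighting scheme. It assumes a simple, equally weighted average of k indicators, each with the same error standard deviation, and every number in it is hypothetical. It only illustrates the general statistical point: if the errors of the underlying sources move together, a margin of error calculated on the assumption that they are independent will be too small.

```python
import math


def composite_standard_error(sigma, k, rho):
    """Standard error of an equally weighted mean of k indicators.

    Assumes (hypothetically) that every indicator has the same error
    standard deviation `sigma` and that every pair of errors has the
    same correlation `rho`.
    """
    variance = (sigma ** 2 / k) * (1 + (k - 1) * rho)
    return math.sqrt(variance)


# Hypothetical values: five source indicators, each with error SD of 0.5.
se_if_independent = composite_standard_error(sigma=0.5, k=5, rho=0.0)
se_if_correlated = composite_standard_error(sigma=0.5, k=5, rho=0.6)

print(f"Assuming independent errors: {se_if_independent:.3f}")
print(f"With error correlation 0.6:  {se_if_correlated:.3f}")
```

With these made-up inputs, the margin of error under correlated errors is nearly twice the one reported under the independence assumption, so averaging several sources buys much less precision than it appears to.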
Taken together, these problems suggest that governance indicators are vulnerable to all sorts of threats to accuracy, reliability, and comparability (between countries and across time). In practice, each indicator suffers from several of these problems. The difference between a first-place country and a tenth-place country, therefore, is likely to be the result of errors and bias, especially if the scores are very close.
For a couple of years, my research work tended to revolve around the construction and analysis of internationally comparable measures (including the ranking of countries). The subjects I was measuring were fairly tangible things, such as different types of income (e.g., personal versus household income). I was astonished by the number of caveats that you have to add because it’s so rare for countries to measure something in exactly the same way (e.g., the household unit was defined differently, different income sources and taxes were included, and different survey methods were used). It’s equally astonishing how those footnoted caveats were abandoned as soon as someone used my numbers to “construct a story line.”
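The claim about first place versus tenth place can be checked with a rough rule of thumb: if the margins of error around two scores overlap, the ranking between them should not be trusted. The sketch below uses invented countries, scores, and standard errors (none of them come from the book), and it treats non-overlapping 90 percent intervals as a conservative heuristic, not as a formal statistical test.

```python
Z_90 = 1.645  # normal critical value for a two-sided 90% interval


def interval(score, standard_error, z=Z_90):
    """Return the (low, high) margin-of-error interval around a score."""
    return (score - z * standard_error, score + z * standard_error)


def distinguishable(a, b):
    """True only if the intervals around two (score, se) pairs do not overlap."""
    low_a, high_a = interval(*a)
    low_b, high_b = interval(*b)
    return high_a < low_b or high_b < low_a


# Hypothetical (score, standard error) pairs, keyed by rank position.
countries = {
    "first place": (1.90, 0.20),
    "second place": (1.75, 0.20),
    "tenth place": (1.40, 0.20),
    "near the bottom": (-0.50, 0.25),
}

top = countries["first place"]
for name, pair in countries.items():
    if name != "first place":
        verdict = "distinguishable" if distinguishable(top, pair) else "not distinguishable"
        print(f"first place vs {name}: {verdict}")
```

In this made-up example, the first-place country cannot be separated from the tenth-place one. Only the comparison with a country near the bottom of the table survives the margin of error.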
During that time, I also analyzed governance indicators, such as those related to government labour market policies. Here too, I struggled with the different ways in which governments defined things and the difficulties of establishing equivalence between government programmes across countries. Even though each national government submitted data based on standard criteria for measurement, some governments wouldn’t or couldn’t follow the criteria very closely. For example, one measure didn’t take account of state structures in a consistent way. For some federations, provincial or state activities were included; for other federations, provincial or state activities were ignored. That particular inconsistency wasn’t even documented despite the massive amount of error it caused (actual policy spending was higher by several hundred percent in the Canadian case). It wasn’t even documented inside the original database (which I had direct access to). It was only known by a small network of insiders and policy wonks who learned about it through word-of-mouth.
In summary, when it comes to governance indicators, I’m reluctant to take the indicators at their face value. You should be reluctant too. If you can’t obtain the methodological details and understand their consequences, then walk away from the indicator. Unfortunately, some people think that constructing arguments with a handy number is better than having no number at all, regardless of how flawed or irrelevant that number may be.
Arndt & Oman’s book helps make this clear, provided that you have a long attention span and an affinity for statistical details. There is also a lot of repetition because of the book’s report-like format. Nonetheless, this book is the first place you should look before working with international governance indicators. The book contains references to several other guides to governance indicators. It’s good to see that the Organisation for Economic Co-operation and Development is publishing books that feature critical analysis and debunking.
Review by Peter Stoyko
Update (14.09.09)
There’s an interesting article in The New York Times today. It talks about a report commissioned by Nicolas Sarkozy, the President of France. The report calls for an alternative to Gross Domestic Product (GDP), a measure of the market value of goods and services produced within an economy. The idea is to create a measure that speaks of “national well-being” or “social progress”. The inspiration includes some of the indicators I mention above. Given the problems I’ve discussed, all I can say about the proposal is … uh … good luck with that.
It is no coincidence that France is advocating this. Historically, France has bristled whenever a new governance indicator has been published by an international body because France has never taken pride in its inevitably mediocre ranking. This began in the early 1970s when the OECD published a (deeply flawed) report about income inequality that portrayed France as a relatively unequal society. Political turmoil ensued in a country that supposedly values égalité. What French politicians have always wanted, I suspect, is a “why can’t every country be like France?” indicator. (I’m only being slightly facetious.) However, there is a whiff of hypocrisy here. Among OECD member states, France is one of the most difficult places to gather the statistics to feed into an aggregate economic or governance indicator. That’s because of a legal regime which, for better or worse, forbids the government from gathering all sorts of data about citizens. So if national statistical agencies succeeded in creating a sophisticated measure (or set of measures) of economic activity that includes social considerations, I wouldn’t be surprised if France had a hard time producing its own numbers without gutting all of those so-called privacy safeguards. Politicians have been unwilling to do that in the past. And that’s why it’s so difficult to compare France with other countries using statistics … unless you drop all of the qualifying footnotes and just pretend French measures are equivalent.
