Written by: Director Robert Groves
Several weeks ago, at the initiative of Brian Pink, the Australian statistician, leaders of the government statistical agencies from Australia, Canada, New Zealand, United Kingdom, and the United States held a summit meeting to identify common challenges and share information about current initiatives. While there had been casual sharing of partial information in previous years among these leaders, this event was unprecedented.
The five countries share languages and some cultural features; they vary in size and in the organization of their statistical systems. They also vary in the current health of their national economies, their regional economic foci, and key social and political issues. None of them have population registers with mandatory updating features. The legal frameworks of the countries’ statistical systems give different powers to the chief statistician.
While meetings of this character happen periodically in many sectors, the findings of the meeting were notable on one dimension – the five countries’ statisticians report that the strategic activities now being mounted are very nearly identical. They perceive the same likely future challenges for central government statistical agencies, and they are making similar organizational changes to prepare for the future. While they vary in specific current innovations, the components of the full future vision are remarkably similar.
Ingredients of the future vision:
- The volume of data generated outside the government statistical systems is increasing much faster than the volume of data collected by the statistical systems; almost all of these data are digitized in electronic files.
- As this occurs, the leaders expect that the traditional survey and census approaches of the agencies may become less attractive in relative cost, timeliness, and effectiveness.
- Blending together multiple available data sources (administrative and other records) with traditional surveys and censuses (using paper, internet, telephone, face-to-face interviewing) to create high quality, timely statistics that tell a coherent story of economic, social and environmental progress must become a major focus of central government statistical agencies.
- This requires efficient record linkage capabilities, the building of master universe frames that act as core infrastructure to the blending of data sources, and the use of modern statistical modeling to combine data sources with highest accuracy.
- Agencies will need to develop the analytical and communication capabilities to distill insights from more integrated views of the world and impart a stronger systems view across government and private sector information.
- There are growing demands from researchers and policy-related organizations to analyze the micro-data collected by the agencies, to extract more information from the data.
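The record linkage capability mentioned in the list above can be illustrated with a minimal sketch. It is not any agency's method: the record layout (`id`, `name`, `postcode`), the sample records, and the 0.85 threshold are all hypothetical, and production linkage systems typically use probabilistic models rather than a single string-similarity cutoff. The sketch compares records only within the same postcode (a simple blocking step) and accepts the best name match if it is similar enough.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences do not block a match."""
    return " ".join(text.lower().split())


def similarity(a, b):
    """String similarity in [0, 1] using the standard library's SequenceMatcher."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def link_records(survey, admin, threshold=0.85):
    """Link each survey record to its best-matching administrative record.

    Records are compared only within the same postcode (blocking), and a
    link is kept only if the name similarity reaches the threshold.
    """
    # Index administrative records by postcode so each survey record is
    # compared against a small candidate set instead of the whole file.
    blocks = {}
    for rec in admin:
        blocks.setdefault(rec["postcode"], []).append(rec)

    links = []
    for s in survey:
        best, best_score = None, 0.0
        for a in blocks.get(s["postcode"], []):
            score = similarity(s["name"], a["name"])
            if score > best_score:
                best, best_score = a, score
        if best is not None and best_score >= threshold:
            links.append((s["id"], best["id"], round(best_score, 2)))
    return links


# Hypothetical records, invented for illustration only.
survey = [
    {"id": "S1", "name": "Acme Widgets Ltd", "postcode": "2600"},
    {"id": "S2", "name": "Blue Harbour Cafe", "postcode": "6011"},
]
admin = [
    {"id": "A1", "name": "ACME Widgets Ltd.", "postcode": "2600"},
    {"id": "A2", "name": "Green Valley Farm", "postcode": "6011"},
]

print(link_records(survey, admin))
```

In this toy data only the first survey record finds a partner; the second shares a postcode with an administrative record but not a similar name, so it is left unlinked rather than forced into a false match.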
In some of the countries the difficulty of obtaining high participation rates in surveys and censuses is growing, creating cost inflation due to the need for greater efforts to contact and persuade sample units of the value of their participation. At the same time, central government budgets are constrained and not amenable to major initiatives.
In most of the countries the agencies realize that there are data resources that are not being fully used to the benefit of the country’s statistics. Many of these are controlled by other government agencies that use the data for program administration. In all of the countries, many of the record systems of lower-level geographical units, businesses, and program agencies of the central government are increasingly digitized. These record systems often contain some data of relevance to the statistical agency’s mandate to describe its society and economy.
Further, there are different surveys the agencies conduct that could be linked together to increase the amount of information – sometimes different economic surveys of the same unit; sometimes a mix of household data and employer data. Such linking can produce new statistical information without the need to collect any new data. All agencies report efforts to link such data resources together.
The global internet is currently offering near real-time data on durable and nondurable goods prices, housing sales, and other relevant events. Since the data reflect global phenomena, new conceptual puzzles arise in using the data to describe the nation-state. The global internet search capabilities also generate verbal data describing billions of information requests (e.g., Google Search) and behavioral reports (e.g., tweets). All of these sources of data are fallible. They fail to offer complete coverage of the population and behaviors of interest. They tend to be lean, reporting only the behavior and time and geography, not other characteristics of the person who performed the behavior. In contrast to other data sources, the internet data often have global reach, with little concern about nation-state boundaries.
There are practical implications for the management and governance of internal activities of the agencies:
- The traditional functional separations among population census, economic surveys, and household/person surveys are not well-fitted to a world of censuses and surveys that draw on multiple data sources. Hence, management changes are being considered to unify data collection processes under the same structures.
- Generalized IT systems that serve census, economic surveys, and demographic surveys are being developed. These have the advantages of reduced maintenance costs, flexibility in rotating staff across subunits, and new functions suited to linked files.
- The staffs of the statistical agencies need to learn about the purposes and procedures of program data resources; sometimes this involves placing statistical agency staff in program agencies (e.g., within tax authorities). Staff will be less wedded to collecting data and more attentive to generating and using the data that are most appropriate and cost-effective.
- There is an increasing need for high-speed, “big data” software systems for record linkage and extraction of key information from massive files.
- Efficient and sophisticated imputation procedures are needed to make the combined data sources jointly useful.
- There is more use of statistical modeling for statistical estimation, to provide more timely and small area estimates.
- The agencies are inventing new ways to give secure access to micro-data for legitimate research purposes, to increase the impact of their work.
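As a small illustration of how statistical modeling can combine fallible sources, the sketch below averages two estimates of the same quantity, weighting each by its precision (the inverse of its variance). It is a textbook technique shown under strong assumptions, namely that the estimates are independent and unbiased, and it is not presented as the approach any of the five agencies uses. The function name and all numbers are hypothetical.

```python
def combine_estimates(estimates):
    """Precision-weighted average of independent, unbiased estimates.

    `estimates` is a list of (value, variance) pairs with variance > 0.
    Returns the combined value and its variance.
    """
    # Weight each source by its precision: noisier sources count for less.
    weights = [1.0 / var for _, var in estimates]
    total = sum(weights)
    value = sum(w * v for w, (v, _) in zip(weights, estimates)) / total
    # Under the independence assumption the combined variance is 1 / total,
    # which is smaller than the variance of any single source.
    return value, 1.0 / total


# Hypothetical inputs: a survey estimate and an administrative-records
# estimate of the same rate in one small area.
survey_est = (0.080, 0.0004)
admin_est = (0.070, 0.0001)

value, variance = combine_estimates([survey_est, admin_est])
print(f"combined estimate: {value:.4f}, variance: {variance:.6f}")
```

The combined figure lies closer to the more precise source, and its variance is lower than either input's, which is the basic reason blending sources can support estimates for small areas where a survey alone would be too noisy. If a source is biased, as incomplete-coverage data often are, this simple weighting no longer applies without adjustment.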
For some decades these organizations have been increasing their use of software systems to improve the efficiency of data collection and data processing. These systems have held the costs of producing statistical information at lower levels than would otherwise have been expected. The future will see the integration of new data resources, each fallible in some way, combined through linking and statistical modeling to produce the requisite statistical information for their countries. Through this, the agencies hope to provide more statistics at lower cost.
In short, the five countries are actively inventing a future unlike the past, requiring new ways of thinking and calling for new skills. The payoff sought is timelier, more trustworthy, and lower cost statistical information measuring new components of the society, economy, and environment, telling a richer story of our countries’ progress.