Virtual Pooled Registry Cancer Linkage System

The Virtual Pooled Registry Cancer Linkage System (VPR-CLS) is a secure online service designed to:

  • efficiently connect researchers performing minimal risk linkage studies with multiple U.S. population-based cancer registries;
  • perform linkages utilizing a single cohort file, standard linkage software, and consistent matching algorithms;
  • provide researchers with initial aggregate match count results across participating registries (Phase I); and
  • streamline the process of applying for release of individual-level data on matched cases (Phase II).

Coordinated by the North American Association of Central Cancer Registries (NAACCR) with funding from the National Cancer Institute, the VPR-CLS provides a single location to facilitate timely access to and use of high-quality cancer surveillance data for minimal risk linkage studies. Use of an automated, standard linkage methodology and a streamlined application process will significantly reduce the level of effort researchers and registries must dedicate to the linkage and approval process. The technology for the VPR-CLS has been developed by Information Management Services, Inc. (IMS), which also serves as the system’s third-party honest broker.

The VPR-CLS consists of, and has been developed in, two distinct phases:

  • Phase I supports a standardized linkage with participating registries and release of aggregate match counts (by registry and diagnosis year), allowing the researcher to prioritize registries from which to request additional data. Phase I includes a web-based application, secure data transfer protocols between researchers, IMS, and registries, and use of a single record linkage software (Match*Pro) optimized for linkages between cancer registries and research cohorts.
  • Phase II supports the researcher’s process of applying to registries and their IRBs for release of individual-level cancer data on matched cases identified in Phase I. It includes use of a Templated IRB/Registry Application, compilation of any required registry applications or agreements, a Templated Data Use Agreement, use of a Central IRB for many of the participating registries, and a tracking and reminder system.



Map of VPR-CLS participating registries, 2022


The VPR-CLS was officially launched in February 2022 after going through various phases of development and pilot testing.  Below are highlights of this progress:

  • Registry participation in the VPR-CLS has steadily increased and currently includes 45 registries representing 95% of the U.S. population plus Puerto Rico.
  • 15 studies ranging in size from 14,273 to 10.9 million participants have completed Phase I linkages with registries.
  • Nearly all of these studies have leveraged the VPR-CLS streamlined application and tracking system to request individual-level data on matched cases.
  • Various mechanisms have been implemented to simplify and streamline the previously cumbersome process of filling out individual registry/IRB applications and data use agreements, including:
    • A single Templated IRB/Registry Application (TIRA) that has been adopted by nearly all registries and their IRBs, reducing the number of individual applications from 58 down to 6 plus the TIRA. The TIRA is completed once by the researcher and submitted to all accepting registries through the VPR-CLS.
    • A comprehensive list of any registry and IRB application requirements with links to the applications, agreements, and the data request contact person.
    • A VPR Templated DUA (VPR DUA) that has been adopted by 25 registries to date, reducing the number of individual DUAs the researcher must complete while also standardizing the terms and conditions with which the researcher must comply.
    • An NCI contract with the Biomedical Research Alliance of New York (BRANY) Central IRB to perform review of VPR linkage studies in lieu of the local/state IRBs that can cede review.
    • Application tracking to easily monitor the review, approval, and data release status across registries.
    • Automated system notifications to remind registries and researchers what needs to be done to ensure progress of the request.


The completeness of population-based cancer surveillance in the United States offers an unprecedented opportunity for large scale epidemiologic research. Yet, a significant barrier to cancer epidemiology in the United States has been the disjointed and cumbersome process of applying for and linking with individual registries one at a time. The resources and time associated with approaching each registry individually were significant and could result in few matched cases. In addition, there was no consistency among states in application, approval, and linkage procedures.

The Virtual Pooled Registry Cancer Linkage System (VPR-CLS) will address many of these limitations.  The VPR-CLS is designed to facilitate minimal risk cohort linkage and provides the following efficiencies:

  • Researchers submit a single application for multi-registry linkages.
  • A single cohort file is securely exchanged and simultaneously linked with multiple cancer registries using a standardized protocol and linkage software.
  • Linkage occurs behind the registry firewall, and only the number of matched cases, by registry and diagnosis year, is released back to the researchers.
  • Match counts help researchers prioritize which registries to approach for release of cancer information on the matched cases.
  • Researchers are provided with a streamlined and efficient registry and IRB application process.
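
The aggregate release described in the list above can be pictured as a simple count. The sketch below is illustrative only and is not the VPR-CLS or Match*Pro implementation; the function name and the record fields (`registry`, `diagnosis_year`, `person_id`) are hypothetical. It assumes linkage has already produced matched records behind the registry firewall and shows that only counts by registry and diagnosis year are returned.

```python
from collections import Counter


def aggregate_match_counts(matched_records):
    """Return match counts keyed by (registry, diagnosis_year).

    `matched_records` is a hypothetical iterable of dicts. Only the
    registry and diagnosis year are read, so no individual-level
    identifiers appear in the result.
    """
    counts = Counter(
        (rec["registry"], rec["diagnosis_year"]) for rec in matched_records
    )
    return dict(counts)


# Made-up matched cases, for illustration only.
example_matches = [
    {"registry": "Registry A", "diagnosis_year": 2014, "person_id": "p1"},
    {"registry": "Registry A", "diagnosis_year": 2014, "person_id": "p2"},
    {"registry": "Registry B", "diagnosis_year": 2015, "person_id": "p3"},
]

phase_one_counts = aggregate_match_counts(example_matches)
```

A researcher could then rank registries by these counts when deciding where to apply for individual-level data.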


For cohort studies that typically spend large amounts of money on follow-up for cancer cohorts or to identify cancer cases among existing cohorts, the VPR-CLS would provide the following benefits:

  1. Facilitate a systematic process for linking with multiple registries
  2. Provide follow-up information on vital status
  3. Allow researchers to determine the number of matches prior to completing the lengthy and costly IRB review and approval process

For post-marketing surveillance with smaller sample sizes, limited years of follow-up, and high rates of loss to follow-up, the VPR-CLS would provide these additional benefits:

  1. More efficient identification of cancer diagnoses following receipt of a drug or device
  2. Ability to link exposed patients routinely over extended time intervals
  3. Ability to leverage population-based cancer registries to provide maximum power for identification of relatively rare cancers that may have significant risk

For cancer registries, in addition to supporting research cohort linkages, the VPR-CLS has the added potential to facilitate the following activities to enhance data quality, completeness, and accuracy:

  1. De-duplication between central registries: The VPR-CLS provides a platform to de-duplicate cases that are captured in multiple states and counted as incident cases in each state. Currently there is no systematic way to identify cases that claim residency in multiple states, and therefore incidence rates may be inflated. Cross-state de-duplication would address this issue and provide more accurate estimates of incidence.
  2. Identification of multiple primaries: Currently, multiple primary cancer incidence is based on information contained within a single registry. Therefore if an individual has a diagnosis of two separate cancers in two states, there is currently no mechanism to capture this information and link it to a single individual. The VPR-CLS would make such quality assurance possible and therefore provide more accurate estimates of the incidence of multiple primary cancers.
  3. Sharing of additional treatment, progression, and follow-up information: Identification of patients treated in more than one state could result in the exchange of treatment data between states, along with information on recurrence and disease progression. Registries could also share vital status, saving time on manual searching while improving survival statistics.
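
As a rough illustration of the cross-state de-duplication idea in item 1, the sketch below flags cases reported by more than one registry. It is a minimal sketch under stated assumptions, not how the VPR-CLS or Match*Pro performs linkage: it uses a simple exact match on hypothetical fields (`last_name`, `birth_date`, `primary_site`), whereas real record linkage software relies on more sophisticated matching.

```python
from collections import defaultdict


def find_cross_registry_duplicates(records):
    """Return {match_key: [registries]} for cases seen in 2+ registries.

    The exact-match key below is a simplifying assumption made for
    illustration; all field names are hypothetical.
    """
    registries_by_key = defaultdict(set)
    for rec in records:
        key = (
            rec["last_name"].strip().lower(),
            rec["birth_date"],
            rec["primary_site"],
        )
        registries_by_key[key].add(rec["registry"])
    return {
        key: sorted(regs)
        for key, regs in registries_by_key.items()
        if len(regs) > 1
    }


# Made-up records, for illustration only.
example_cases = [
    {"registry": "State X", "last_name": "Doe ", "birth_date": "1950-01-02", "primary_site": "C50.9"},
    {"registry": "State Y", "last_name": "doe", "birth_date": "1950-01-02", "primary_site": "C50.9"},
    {"registry": "State X", "last_name": "Roe", "birth_date": "1961-07-08", "primary_site": "C18.7"},
]

duplicates = find_cross_registry_duplicates(example_cases)
```

Each flagged key would be a candidate for review so that the case is counted as incident in only one state.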


Cancer registries capture a wealth of standardized data that can enhance research studies. Routinely collected information includes patient demographics (residence and age at diagnosis, race, sex, etc.), cancer diagnosis (date of diagnosis, primary site, histology, behavior, etc.), stage of disease (SEER Summary Stage and AJCC TNM Stage), treatment (surgery, radiation, etc.), and follow-up information (vital status, date of last contact, etc.). Completeness and availability of this information may vary from registry to registry depending on data collection requirements and registry resources.

Cancer registries generally make their data available 24 months after the close of a diagnosis year (e.g., cancers diagnosed in 2014 are made available at the end of 2016). This timeframe gives reporting facilities time to report complete information and gives central registries time to edit the data, consolidate multiple reports for a single case, and link with other data sources, including death files, to ensure the most complete, high-quality data. The 12-month data (e.g., cancer cases diagnosed in 2015 made available at the end of 2016) may not be as complete; however, these data may be used for linkages and preliminary cancer statistics.
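
The timing rule above is simple year arithmetic. The helper below is a hypothetical illustration (the function name and the `lag_months` parameter are assumptions, not part of the VPR-CLS) of the general 24-month and 12-month timeframes; individual registries may differ.

```python
def release_year(diagnosis_year, lag_months=24):
    """Return the year at whose end data for `diagnosis_year` are generally available.

    The lag is counted from the close of the diagnosis year and is
    assumed here to be a whole number of years (12 or 24 months).
    """
    if lag_months % 12 != 0:
        raise ValueError("lag_months must be a multiple of 12")
    return diagnosis_year + lag_months // 12


standard_release = release_year(2014)                    # 24-month data
preliminary_release = release_year(2015, lag_months=12)  # 12-month data
```

Both calls reproduce the examples in the text: 2014 diagnoses under the 24-month rule and 2015 diagnoses under the 12-month rule become available at the end of 2016.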

Questions about the VPR-CLS can be addressed to Castine Clerkin, VPR-CLS Program Manager, .

Copyright © 2018 NAACCR, Inc. All Rights Reserved