Training Provider Outcomes Toolkit

The Training Provider Outcomes Toolkit (TPOT) is a collection of tools for securely collecting, connecting, analyzing, aggregating, and publishing data on wage and employment outcomes for education and training participants.

State and local workforce development boards aim to improve employment and workforce investment in their regions.

Collecting data from training providers

Each workforce development board may have hundreds or thousands of training providers within its purview. Each provider must securely upload its participant data to its workforce board in order to be eligible for federal funds. Workforce development boards must therefore be equipped to receive and validate the data. The resulting aggregate statistics on participant outcomes can also serve as marketing material for successful programs, regardless of federal funding.

Training providers range from small trade apprenticeships to community colleges to multi-state organizations, with a wide range of data sophistication. The process by which the workforce development board collects participant outcomes must be easy and accessible to all organizations. At the same time, it must be easy for the board itself to automatically process and validate the datasets.

The Data Package

To support these use cases, the toolkit uses the Data Package specification to encapsulate and describe training provider outcome data in a systematic fashion.

Create a Data Package specification

Each workforce development board should create a data package specification that defines the specific fields and values that they require the training providers to use.
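As a rough illustration of what such a specification might contain, the sketch below builds a Data Package descriptor as a Python dictionary and prints it as JSON. The descriptor layout (resources, schema, fields, constraints) follows the Data Package and Table Schema specifications, but the package name, file name, field names, and allowed values are all hypothetical. Each board would substitute its own requirements.

```python
import json

# Hypothetical descriptor for one provider's outcome file. Every name and
# allowed value below is an illustrative assumption, not a toolkit default.
descriptor = {
    "name": "provider-participant-outcomes",
    "resources": [
        {
            "name": "participants",
            "path": "participants.csv",
            "schema": {
                "fields": [
                    {"name": "participant_id", "type": "string",
                     "constraints": {"required": True}},
                    {"name": "program_code", "type": "string",
                     "constraints": {"required": True}},
                    {"name": "exit_date", "type": "date"},
                    {"name": "completed", "type": "boolean"},
                    {"name": "credential", "type": "string",
                     "constraints": {"enum": ["certificate", "license",
                                              "degree", "none"]}},
                ]
            },
        }
    ],
}


def field_names(desc):
    """List the column names the first resource is expected to contain."""
    return [field["name"] for field in desc["resources"][0]["schema"]["fields"]]


if __name__ == "__main__":
    # This JSON is what would be saved as datapackage.json.
    print(json.dumps(descriptor, indent=2))
```

Saving the printed JSON as datapackage.json next to participants.csv gives a package that generic Data Package tools should be able to read.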

The Data Packagist tool allows you to easily define these packages in the browser. You can also view existing package specifications with the online data package viewer. Finally, you can validate your specification with an online validator.

Uploading website

Once a data package specification is created, it can be used to customize a self-hosted website that validates and uploads data packages from training providers. The site is designed to be simple to modify, so that a custom instance can be set up for each training provider. A demo uploading website is available, and the source code is available for customization on GitHub.
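To give a sense of the checks an upload step might run, here is a minimal sketch that validates CSV text against a Table Schema style field list using only the Python standard library. It is not the upload website's actual code. The function name validate_csv, the sample schema, and the subset of checks (header, required, enum, ISO dates) are assumptions made for illustration.

```python
import csv
import io
from datetime import date


def validate_csv(csv_text, schema):
    """Return a list of error strings; an empty list means the file passed."""
    fields = schema["fields"]
    expected = [field["name"] for field in fields]
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames != expected:
        return ["header mismatch: expected %s, got %s"
                % (expected, reader.fieldnames)]
    errors = []
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        for field in fields:
            name = field["name"]
            value = (row[name] or "").strip()
            constraints = field.get("constraints", {})
            if value == "":
                if constraints.get("required"):
                    errors.append("row %d: %s is required" % (row_number, name))
                continue
            if "enum" in constraints and value not in constraints["enum"]:
                errors.append("row %d: %s has unexpected value %r"
                              % (row_number, name, value))
            if field.get("type") == "date":
                try:
                    date.fromisoformat(value)
                except ValueError:
                    errors.append("row %d: %s is not an ISO date: %r"
                                  % (row_number, name, value))
    return errors


# Hypothetical schema and sample uploads.
schema = {"fields": [
    {"name": "participant_id", "type": "string",
     "constraints": {"required": True}},
    {"name": "exit_date", "type": "date"},
    {"name": "credential", "type": "string",
     "constraints": {"enum": ["certificate", "license", "degree", "none"]}},
]}
good_upload = "participant_id,exit_date,credential\nA1,2016-05-31,certificate\n"
bad_upload = "participant_id,exit_date,credential\n,05/31/2016,diploma\n"

if __name__ == "__main__":
    print(validate_csv(good_upload, schema))
    for message in validate_csv(bad_upload, schema):
        print(message)
```

Returning every error at once, rather than stopping at the first, lets a provider fix a file in a single pass.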

The data warehouse

The upload website is designed to securely upload the data packages to a data warehouse for each workforce board. Stay tuned for updates.

Data matching

Once the workforce board has collected the training providers' outcomes, it must link the participant information with wage and employment information from other departments. To facilitate this matching, we have developed an easy-to-use, open-source deduplication package, SuperDeduper. It combines exact record linkage with probabilistic fuzzy matching in order to link as many participants as possible to the other datasets.
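The general idea of combining exact and fuzzy linkage can be sketched with the Python standard library. This is not SuperDeduper's implementation or API. A simple string-similarity threshold from difflib stands in for probabilistic matching, and the record fields (person_key, name, birth_date, wage_id) and the 0.85 threshold are hypothetical.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(name.lower().split())


def link(participants, wage_records, threshold=0.85):
    """Map participant_id -> (wage_id, method), exact key match first."""
    by_key = {w["person_key"]: w for w in wage_records if w.get("person_key")}
    links = {}
    for p in participants:
        exact = by_key.get(p.get("person_key"))
        if exact is not None:
            links[p["participant_id"]] = (exact["wage_id"], "exact")
            continue
        # Fallback: same birth date and a sufficiently similar name.
        best_score, best = 0.0, None
        for w in wage_records:
            if w["birth_date"] != p["birth_date"]:
                continue
            score = SequenceMatcher(
                None, normalize(p["name"]), normalize(w["name"])).ratio()
            if score > best_score:
                best_score, best = score, w
        if best is not None and best_score >= threshold:
            links[p["participant_id"]] = (best["wage_id"], "fuzzy")
    return links


# Hypothetical sample records.
participants = [
    {"participant_id": "P1", "person_key": "k-001",
     "name": "Maria Lopez", "birth_date": "1990-02-14"},
    {"participant_id": "P2", "person_key": None,
     "name": "Jon  Smith", "birth_date": "1985-07-01"},
    {"participant_id": "P3", "person_key": None,
     "name": "Alex Chen", "birth_date": "1992-11-30"},
]
wage_records = [
    {"wage_id": "W1", "person_key": "k-001",
     "name": "Maria Lopez", "birth_date": "1990-02-14"},
    {"wage_id": "W2", "person_key": "k-002",
     "name": "John Smith", "birth_date": "1985-07-01"},
    {"wage_id": "W3", "person_key": "k-003",
     "name": "Pat Jones", "birth_date": "1992-11-30"},
]

if __name__ == "__main__":
    print(link(participants, wage_records))
```

Recording whether each link was exact or fuzzy makes it possible to review or exclude the less certain matches later.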

Data processing

A wide variety of tools support the Data Package format. Data packages can be loaded into many analysis languages, including Python and R, and into many database systems, including SQL Server and PostgreSQL. Work is underway to support Excel and Google Sheets.
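Dedicated Data Package libraries handle loading for you, but the format is simple enough that the mechanics can be sketched with the standard library alone. The example below reads a descriptor, casts CSV values according to the schema, and copies the rows into an in-memory SQLite table, which stands in for a production database. The helper names, the sample file contents, and the small set of supported types are assumptions for illustration.

```python
import csv
import json
import sqlite3
import tempfile
from pathlib import Path

# Only a few Table Schema types are handled in this sketch.
CASTS = {"integer": int, "number": float, "string": str}


def cast(field, text):
    """Convert one CSV cell using the field's declared type."""
    if text == "":
        return None
    return CASTS.get(field.get("type"), str)(text)


def load_resource(package_dir, resource_name):
    """Read one resource of a data package into a list of typed dicts."""
    package_dir = Path(package_dir)
    descriptor = json.loads((package_dir / "datapackage.json").read_text())
    resource = next(r for r in descriptor["resources"]
                    if r["name"] == resource_name)
    fields = resource["schema"]["fields"]
    with open(package_dir / resource["path"], newline="") as handle:
        rows = [{f["name"]: cast(f, raw[f["name"]]) for f in fields}
                for raw in csv.DictReader(handle)]
    return fields, rows


def to_sqlite(connection, table, fields, rows):
    """Create a table named after the resource and insert all rows."""
    names = [f["name"] for f in fields]
    columns = ", ".join('"%s"' % n for n in names)
    placeholders = ", ".join("?" for _ in names)
    connection.execute('CREATE TABLE "%s" (%s)' % (table, columns))
    connection.executemany(
        'INSERT INTO "%s" (%s) VALUES (%s)' % (table, columns, placeholders),
        [tuple(row[n] for n in names) for row in rows])


def demo():
    """Write a tiny hypothetical package to a temp dir, load it, query it."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        (tmp / "datapackage.json").write_text(json.dumps({
            "name": "demo",
            "resources": [{
                "name": "participants",
                "path": "participants.csv",
                "schema": {"fields": [
                    {"name": "participant_id", "type": "string"},
                    {"name": "quarterly_wage", "type": "integer"},
                ]},
            }],
        }))
        (tmp / "participants.csv").write_text(
            "participant_id,quarterly_wage\nP1,8200\nP2,0\n")
        fields, rows = load_resource(tmp, "participants")
    connection = sqlite3.connect(":memory:")
    to_sqlite(connection, "participants", fields, rows)
    return connection.execute(
        "SELECT participant_id, quarterly_wage FROM participants "
        "ORDER BY participant_id").fetchall()


if __name__ == "__main__":
    print(demo())
```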

Data export

Once the individual outcome data has been aggregated into ETP scorecards for each program, the data must be publicly accessible for others to analyze and build upon. We are developing an API that serves these scorecards in a machine-readable format, which will allow more sophisticated tools and analyses to be built on top of them. A framework for easily deploying this API as a web service is under development in the etp-api repository on GitHub. An alpha version with sample data is available.
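To make the aggregation step concrete, the sketch below rolls individual matched records up into one scorecard per program and serializes the result as JSON, the kind of machine-readable payload an API could return. It does not reflect the etp-api code or its response format. The record fields and the three metrics shown are hypothetical choices.

```python
import json
from collections import defaultdict
from statistics import median


def build_scorecards(records):
    """Aggregate individual outcomes into one summary dict per program."""
    by_program = defaultdict(list)
    for record in records:
        by_program[record["program_code"]].append(record)
    scorecards = []
    for code in sorted(by_program):
        group = by_program[code]
        # Wages are summarized only for participants who were employed.
        wages = [r["quarterly_wage"] for r in group if r["employed"]]
        scorecards.append({
            "program_code": code,
            "participants": len(group),
            "employment_rate": round(len(wages) / len(group), 3),
            "median_quarterly_wage": median(wages) if wages else None,
        })
    return scorecards


# Hypothetical matched records for two programs.
records = [
    {"program_code": "WELD-101", "employed": True, "quarterly_wage": 9000},
    {"program_code": "WELD-101", "employed": True, "quarterly_wage": 7000},
    {"program_code": "WELD-101", "employed": False, "quarterly_wage": 0},
    {"program_code": "NURS-200", "employed": True, "quarterly_wage": 12000},
]

if __name__ == "__main__":
    print(json.dumps(build_scorecards(records), indent=2))
```

Because only aggregates leave this step, the published scorecards contain no individual participant rows.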