Synthetic Data

NDSSL Proto-Entities

NDSSL has produced several synthetic data sets that are being released to the larger academic community for research. The data sets are based on detailed microscopic simulation-based modeling and integration techniques. Data Sets 1 and 2 represent a synthetic population of the city of Portland. Data Set 3 is a movie showing how the topology of a hypothetical ad-hoc network would change as the vehicles carrying the network nodes move. It is based on a microscopic simulation of vehicular traffic in Washington, DC.

Data Set Release One

Name | Description | Details | Sample
Portland ProtoPopulation | ProtoPopulation for Portland, OR. with demographics and home locations | demographics-portland-1-v1.txt | Sample Population
Portland ProtoLocations | ProtoPopulation for Portland, OR. with demographics and home locations | locations-portland-1-v1.txt | Sample Locations
Portland Activities | Protopopulation activities for Portland, OR. | activities-portland-1-v1.txt | Sample Activities
Portland Contact Graph | Protopopulation contact graph for Portland, OR. | contact-portland-2-v1.txt | Sample Contact Graph
Portland Dendrogram | Dendrogram of infections for Portland, OR. | dendro-portland-1-v1.txt | Sample Dendrogram

Data Set Release Two

Name | Description | Details | Sample
Portland ProtoPopulation | ProtoPopulation for Portland, OR. with demographics and home locations | demographics-person-portland-1-v2.txt; demographics-household-portland-1-v2.txt | Sample Population; Sample Household
Portland ProtoLocations | ProtoPopulation for Portland, OR. with demographics and home locations | locations-portland-1-v2.txt | Sample Locations
Portland Activities | Protopopulation activities for Portland, OR. | activities-portland-1-v2.txt | Sample Activities
Portland Contact Graph | Protopopulation contact graph for Portland, OR. | contact-portland-1-v2.txt | Sample Contact Graph
Portland Dendrogram | Dendrogram of infections for Portland, OR. | dendro-portland-1-v2.txt | Sample Dendrogram

Data Set Release Three

Name | Description | Details | Sample
Washington Network Movie | Ad-hoc network topology for Washington, DC | DCWireless.avi (20 MB) | Sample Frame

Data Set Release Four

This data set is being released in conjunction with the following paper:

Samarth Swarup, Stephen G. Eubank, Madhav V. Marathe, Computational Epidemiology as a Challenge Domain for Multiagent Systems, In Proceedings of The Thirteenth International Conference on Autonomous Agents and Multiagent Systems (AAMAS). Paris, France, May 5-9 2014.

Click here to download the data.

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License.

If you make use of this data, please acknowledge its source by citing the above paper.

Montgomery County, Virginia, USA

We are making available a synthetic population of Montgomery County, which contains 77,820 people grouped into 32,827 households. Each person has a daily activity schedule. The population performs 429,590 activities in total, at 26,941 distinct locations (in addition to home locations). The resulting social contact network has 77,820 nodes (one per person) and 2,019,220 edges.
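As a quick sanity check on these figures, the short Python sketch below derives a few per-person averages from the counts quoted above. It assumes that the edges are undirected and that each edge is counted once, which the text does not state explicitly; the variable names are illustrative only.

```python
# Counts quoted in the text for the Montgomery County synthetic population.
people = 77_820
households = 32_827
activities = 429_590
edges = 2_019_220

mean_household_size = people / households
activities_per_person = activities / people
# Assumption: edges are undirected and counted once, so each edge adds
# one to the degree of both of its endpoints.
mean_degree = 2 * edges / people

print(f"{mean_household_size:.2f} people per household")
print(f"{activities_per_person:.2f} activities per person")
print(f"{mean_degree:.2f} contacts per person on average")
```

Under that assumption, the population averages about 2.4 people per household, about 5.5 activities per person, and a mean degree close to 52.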

The data set contains several demographic variables associated with each synthetic person. Age and household income distributions are shown below, for example.

 

[Figure: Age distribution for Montgomery County]

[Figure: Household income distribution for Montgomery County]

 

Each person has an associated, geo-located daily activity schedule. When two people are at the same location for an overlapping time, we induce an edge between them in the social contact network. This social contact network, along with the contact duration for each edge, is also provided. The degree distribution of the social contact network is shown below.

[Figure: Montgomery County Social Contact Network: Degree Distribution]
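The edge-induction rule described above can be illustrated with a minimal Python sketch. It is not the code used to build the released network: the `Visit` record, its field names, the integer time units, and the choice to sum durations when a pair meets more than once are all assumptions made for illustration, and the released files may use a different layout.

```python
from collections import Counter, defaultdict
from itertools import combinations
from typing import NamedTuple


class Visit(NamedTuple):
    """One activity: a person at a location during [start, end).

    Hypothetical record layout, not the format of the released files.
    """
    person: int
    location: int
    start: int  # assumed unit: minutes since midnight
    end: int


def contact_edges(visits):
    """Return {(a, b): total overlap duration} for person pairs with a < b."""
    by_location = defaultdict(list)
    for v in visits:
        by_location[v.location].append(v)

    durations = defaultdict(int)
    for group in by_location.values():
        for u, v in combinations(group, 2):
            if u.person == v.person:
                continue  # two visits by the same person are not a contact
            overlap = min(u.end, v.end) - max(u.start, v.start)
            if overlap > 0:  # the two visits share some time at this location
                edge = (min(u.person, v.person), max(u.person, v.person))
                durations[edge] += overlap  # assumption: sum repeated meetings
    return dict(durations)


def degree_distribution(edges):
    """Map each degree to the number of people who have that degree."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return Counter(degree.values())


# Toy schedule: persons 1 and 2 overlap, 2 and 3 overlap, 1 and 3 only touch.
visits = [
    Visit(person=1, location=10, start=0, end=60),
    Visit(person=2, location=10, start=30, end=90),
    Visit(person=3, location=10, start=60, end=120),
    Visit(person=3, location=20, start=0, end=50),
]
edges = contact_edges(visits)
histogram = degree_distribution(edges)
```

Comparing every pair of visits at a location is quadratic in the number of visits there, which is fine for a toy example but would likely need a sweep over sorted start and end times for busy locations. People with no contacts do not appear in the histogram, since only edge endpoints are counted.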


Data Set Release Five

This is a release of the code used for the simulations in the following paper:

Kristian Lum, Samarth Swarup, Stephen G. Eubank, James Hawdon, The Contagious Nature of Imprisonment: An Agent-based Model to Explain Racial Disparities in Incarceration Rates, J. R. Soc. Interface 11(98):20140409, June 2014.

Click here to download the code.

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License.

If you make use of this code, please acknowledge its source by citing the above paper.


Selected Publications

The following were presented at the Workshop on Spatial Data Mining, part of the 2006 SIAM Conference on Data Mining.

M.V. Marathe. Interaction Based Computer Modeling for Comprehensive Incident Characterization to Support Pandemic Preparedness. NDSSL-TR-06-019, Network Dynamics and Simulation Science Laboratory, Virginia Polytechnic Institute and State University, 1880 Pratt Dr, Building XV, Blacksburg, VA, 24061. http://ndssl.vbi.vt.edu/Publications/ndssl-tr-06-019.pdf

M.V. Marathe. Synthetic Data for Data Mining to Support Epidemiological Modeling. NDSSL-TR-06-020, Network Dynamics and Simulation Science Laboratory, Virginia Polytechnic Institute and State University, 1880 Pratt Dr, Building XV, Blacksburg, VA, 24061. http://ndssl.vbi.vt.edu/Publications/ndssl-tr-06-020.pdf

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License.

If you make use of this data, please acknowledge its source with the following citations:
Synthetic Data Products for Societal Infrastructures and Proto-Populations: Data Set 1.0, NDSSL-TR-06-006, Network Dynamics and Simulation Science Laboratory, Virginia Polytechnic Institute and State University, 1880 Pratt Dr, Building XV, Blacksburg, VA, 24061, ndssl.vbi.vt.edu/Publications/ndssl-tr-06-006.pdf

Synthetic Data Products for Societal Infrastructures and Proto-Populations: Data Set 2.0, NDSSL-TR-07-003, Network Dynamics and Simulation Science Laboratory, Virginia Polytechnic Institute and State University, 1880 Pratt Dr, Building XV, Blacksburg, VA, 24061, ndssl.vbi.vt.edu/Publications/ndssl-tr-07-003.pdf

Synthetic Data Products for Societal Infrastructures and Proto-Populations: Data Set 3.0, NDSSL-TR-07-010, Network Dynamics and Simulation Science Laboratory, Virginia Polytechnic Institute and State University, 1880 Pratt Dr, Building XV, Blacksburg, VA, 24061, ndssl.vbi.vt.edu/Publications/ndssl-tr-07-010.pdf

We are always interested in how people are making use of our data. If you publish work using any of this data, please send us a reference. If you have any suggestions on how we can make this data more useful to you, we would like to know about that as well. Finally, we have other data that we can make available through individual arrangements. Contact us at ndssl@vbi.vt.edu.

Click here to download the complete data sets.