README
This document provides a summary of the dataset contents:
Citation metadata
DOI - https://doi.org/10.7910/DVN/SKP9IB
title
Development of an AI/ML-ready knee ultrasound dataset in a population-based cohort
grantNumber
grantNumberValue: R01AR077060
grantNumberAgency: NIAMS
grantNumberValue: 3R01AR077060-03S1
dsDescription
dsDescriptionValue:
About the data
An ultrasound dataset for discovering ultrasound features associated with pain and radiographic change in knee osteoarthritis (KOA) is innovative and a major step forward for the field. These ultrasound images originate from the diverse and inclusive population-based Johnston County Health Study (JoCoHS). The dataset is designed to adhere to FAIR principles and was funded in part by an Administrative Supplement to Improve the AI/ML-Readiness of NIH-Supported Data (3R01AR077060-03S1). It includes a subset of JoCoHS participants who underwent ultrasound imaging and radiographic evaluations.
dsDescriptionDate: 2024-10-29
dsDescriptionValue:
To begin learning about the dataset, visit our User Guide for an all-in-one document containing statistics and other details to help you work with the data.
dsDescriptionDate: 2024-10-29
publication
publicationCitation: Yerich NV, Alvarez C, Schwartz TA, Savage-Guin S, Renner JB, Bakewell CJ, Kohler MJ, Lin J, Samuels J, Nelson AE. A Standardized, Pragmatic Approach to Knee Ultrasound for Clinical Research in Osteoarthritis: The Johnston County Osteoarthritis Project. ACR Open Rheumatol. 2020 Jul;2(7):438-448. doi: 10.1002/acr2.11159. PMID: 32597564; PMCID: PMC7368135.
publicationIDType: pmid
publicationIDNumber: 32597564
publicationURL: https://doi.org/10.1002/acr2.11159
author
authorName: Nelson, Amanda
authorAffiliation: University of North Carolina at Chapel Hill
authorIdentifierScheme: ORCID
authorIdentifier: 0000-0002-9344-7877
dateOfCollection
dateOfCollectionStart: 2019-03-14
dateOfCollectionEnd: 2024-06-01
subject
Medicine, Health and Life Sciences
Dataset files
25 files currently in this dataset
filename | directory | categories | description |
---|---|---|---|
README.notebook.public.ipynb Size: 55.0 kB | code | ['code', 'Jupyter', 'notebook'] | This file provides documentation describing the dataset and a preview of the data. This notebook also includes code to use as a dataset documenting framework for your own projects. |
_notebook.config.template.json Size: 3.9 kB | code | ['code', 'json'] | This file is used with the README.notebook.public.ipynb and contains the notebook variables. It also contains an embedded schema to describe/validate the JSON config. |
_notebook.instructions.md Size: 3.5 kB | code | ['code', 'Jupyter', 'Markdown'] | This file is used with the README.notebook.public.ipynb and contains instructions on using the notebook. |
_notebook_installer.py Size: 699 Bytes | code | ['code', 'Jupyter', 'notebook', 'Python script'] | The purpose of this file is to install any Python modules used by the Jupyter notebook that are not included in the Jupyter environment by default. This script is only useful to those wanting to run the code within the Jupyter notebook. |
_notebook_worker.py Size: 18.2 kB | code | ['code', 'Jupyter', 'notebook', 'Python script'] | This file contains the scripts to help describe the dataset within the Jupyter Notebook. This script is ONLY useful to those wanting to run the code within the Jupyter Notebook. This code is kept separate from the Notebook to prevent the Notebook becoming bloated with code. |
example.us.image.png Size: 127.5 kB | data/image/example | ['data', 'example'] | This is an example ultrasound file that represents one of the images from the image archives and should not be used for analysis purposes. Images in this dataset are saved in lossless .png format. |
imageArchive.11.zip Size: 104.7 MB | data/image/ultrasound | ['data', 'image'] | This file contains an archive of right anterior suprapatellar longitudinal ultrasound images (file count: 867). There should be only one image per subject in this archive. Images are saved in lossless .png format. |
imageArchive.12.zip Size: 133.0 MB | data/image/ultrasound | ['data', 'image'] | This file contains an archive of right anterior suprapatellar longitudinal with power Doppler ultrasound images (file count: 867). There should be only one image per subject in this archive. Images are saved in lossless .png format. |
imageArchive.13.zip Size: 112.6 MB | data/image/ultrasound | ['data', 'image'] | This file contains an archive of right anterior suprapatellar transverse in 30 degrees flexion ultrasound images (file count: 866). There should be only one image per subject in this archive. Images are saved in lossless .png format. |
imageArchive.14.zip Size: 83.6 MB | data/image/ultrasound | ['data', 'image'] | This file contains an archive of right medial longitudinal ultrasound images (file count: 867). There should be only one image per subject in this archive. Images are saved in lossless .png format. |
imageArchive.15.zip Size: 81.8 MB | data/image/ultrasound | ['data', 'image'] | This file contains an archive of right lateral longitudinal ultrasound images (file count: 867). There should be only one image per subject in this archive. Images are saved in lossless .png format. |
imageArchive.16.zip Size: 88.5 MB | data/image/ultrasound | ['data', 'image'] | This file contains an archive of right anterior suprapatellar transverse in maximal flexion ultrasound images (file count: 865). There should be only one image per subject in this archive. Images are saved in lossless .png format. |
imageArchive.17.zip Size: 101.6 MB | data/image/ultrasound | ['data', 'image'] | This file contains an archive of right posterior medial transverse ultrasound images (file count: 835). There should be only one image per subject in this archive. Images are saved in lossless .png format. |
imageArchive.21.zip Size: 104.2 MB | data/image/ultrasound | ['data', 'image'] | This file contains an archive of left anterior suprapatellar longitudinal ultrasound images (file count: 866). There should be only one image per subject in this archive. Images are saved in lossless .png format. |
imageArchive.22.zip Size: 132.3 MB | data/image/ultrasound | ['data', 'image'] | This file contains an archive of left anterior suprapatellar longitudinal with power Doppler ultrasound images (file count: 866). There should be only one image per subject in this archive. Images are saved in lossless .png format. |
imageArchive.23.zip Size: 112.6 MB | data/image/ultrasound | ['data', 'image'] | This file contains an archive of left anterior suprapatellar transverse in 30 degrees flexion ultrasound images (file count: 866). There should be only one image per subject in this archive. Images are saved in lossless .png format. |
imageArchive.24.zip Size: 83.2 MB | data/image/ultrasound | ['data', 'image'] | This file contains an archive of left medial longitudinal ultrasound images (file count: 866). There should be only one image per subject in this archive. Images are saved in lossless .png format. |
imageArchive.25.zip Size: 81.7 MB | data/image/ultrasound | ['data', 'image'] | This file contains an archive of left lateral longitudinal ultrasound images (file count: 866). There should be only one image per subject in this archive. Images are saved in lossless .png format. |
imageArchive.26.zip Size: 89.3 MB | data/image/ultrasound | ['data', 'image'] | This file contains an archive of left anterior suprapatellar transverse in maximal flexion ultrasound images (file count: 865). There should be only one image per subject in this archive. Images are saved in lossless .png format. |
imageArchive.27.zip Size: 101.2 MB | data/image/ultrasound | ['data', 'image'] | This file contains an archive of left posterior medial transverse ultrasound images (file count: 834). There should be only one image per subject in this archive. Images are saved in lossless .png format. |
dataTable.IMAGE_REF.tab Size: 1.2 MB Row count: 12,063 | data/reference | ['data', 'reference'] | This file contains the ultrasound image metadata. Use this file to determine the number of ultrasound image types available, file sizes, and references to the files found in the ancillary image archives. |
dataTable.SUBJECT.tab Size: 45.0 kB Row count: 881 | data/reference | ['data', 'reference'] | This is the first file generated for this dataset. It contains the participant/subject reference IDs, basic subject demographics, Kellgren-Lawrence (KL) grades read from posteroanterior (PA) knee radiographs, and self-reported knee pain. |
dvDatasetMetadata.json Size: 87.8 kB | data/reference | ['data', 'reference'] | This file contains metadata created BEFORE data was uploaded to the repository dataset to describe the data we EXPECT to have uploaded. It was created because the repository does not contain built-in tools to describe data files and variables in depth, which is needed both to document the data and to perform validation. This file is used to validate and describe categorical variables within the datatable files. |
README.md Size: 31.3 kB | documentation | ['README'] | This document contains an overview of the dataset and is a good starting point to understanding the data. |
curation_log.md Size: 17.1 kB | documentation | ['curation'] | This document contains the dataset curation checklist and notes regarding the considerations and steps taken to curate the data. |
25 files EXPECTED in this dataset
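The archive names in the table above appear to follow the pattern imageArchive.NN.zip, where NN matches the E03USIMGT image type code: the first digit indicates the knee (1 = right, 2 = left) and the second digit indicates the scan view. The following is a minimal sketch of decoding an archive name under that assumption. The function name `describe_archive` and the lookup tables are illustrative and are not part of the dataset's own scripts.

```python
import re

# Assumed decoding of the two-digit archive code, inferred from the file table:
# first digit = knee side, second digit = scan view.
SIDES = {"1": "right", "2": "left"}
VIEWS = {
    "1": "anterior suprapatellar longitudinal",
    "2": "anterior suprapatellar longitudinal with power Doppler",
    "3": "anterior suprapatellar transverse in 30 degrees flexion",
    "4": "medial longitudinal",
    "5": "lateral longitudinal",
    "6": "anterior suprapatellar transverse in maximal flexion",
    "7": "posterior medial transverse",
}

ARCHIVE_PATTERN = re.compile(r"imageArchive\.([12])([1-7])\.zip")


def describe_archive(filename: str) -> str:
    """Return the image type label implied by an archive file name."""
    match = ARCHIVE_PATTERN.fullmatch(filename)
    if match is None:
        raise ValueError(f"not a recognized image archive name: {filename!r}")
    side, view = match.groups()
    return f"{SIDES[side]} {VIEWS[view]}"


print(describe_archive("imageArchive.17.zip"))
```

A helper like this can be used to label images by view after extracting an archive, without consulting the IMAGE_REF table for every file.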
File dataTable.SUBJECT.csv
Preview
E03SUBJECTID | E03GENDER | E03PASKR | E03PASKL | E03RADRPAKKL | E03RADLPAKKL | E03AGE |
---|---|---|---|---|---|---|
72bb0a51-f020-11ed-b527-0a580a5f736a | Female | Moderate | None | Moderate OA | Moderate OA | 50-54 |
72bb0a76-f020-11ed-b527-0a580a5f736a | Female | Severe | None | Mild OA | No image to read | 60-64 |
72bb0a8a-f020-11ed-b527-0a580a5f736a | Female | Moderate | Moderate | Questionable OA | Mild OA | 65-70 |
72bb0a99-f020-11ed-b527-0a580a5f736a | Female | None | Mild | No OA | Questionable OA | 45-49 |
72bb0ab5-f020-11ed-b527-0a580a5f736a | Male | None | None | Questionable OA | Mild OA | 40-44 |
Variable descriptions
[
{
"name": "E03SUBJECTID",
"label": "A unique identifier for a subject in a study.",
"value": {
"format": "Universal Unique Identifier (UUID version 1) generated according to RFC 4122",
"dataType": "object",
"unique": "True"
},
"identifier": "True",
"ontology": {
"source": [
"https://ncit.nci.nih.gov/ncitbrowser/ConceptReport.jsp?dictionary=NCI_Thesaurus&ns=ncit&code=C69256"
]
}
},
{
"name": "E03GENDER",
"label": "What was your gender at birth?",
"value": {
"format": "categorical",
"dataType": "Int64",
"unique": "False",
"category": {
"1": "Male",
"2": "Female"
}
},
"ontology": {
"source": [
"https://ncit.nci.nih.gov/ncitbrowser/ConceptReport.jsp?dictionary=NCI_Thesaurus&ns=ncit&code=C164527"
]
}
},
{
"name": "E03PASKR",
"surveyQuestion": "On MOST days of ANY ONE MONTH in the LAST 12 MONTHS did you have pain, aching, or stiffness in any of the following? Please rate as none, mild, moderate, or severe. - Right - Knee",
"label": "Right Knee - pain, aching, or stiffness on MOST days of ANY ONE MONTH in the LAST 12 MONTHS",
"source": "Qualtrics survey (1h2J)",
"value": {
"format": "categorical",
"dataType": "Int64",
"unique": "False",
"category": {
"0": "None",
"1": "Mild",
"2": "Moderate",
"3": "Severe",
"888": "No response"
}
},
"ontology": {
"source": [
"https://ncit.nci.nih.gov/ncitbrowser/ConceptReport.jsp?dictionary=NCI_Thesaurus&ns=ncit&code=C125625",
"https://ncit.nci.nih.gov/ncitbrowser/ConceptReport.jsp?dictionary=NCI_Thesaurus&ns=ncit&code=C125623",
"https://ncit.nci.nih.gov/ncitbrowser/ConceptReport.jsp?dictionary=NCI_Thesaurus&ns=ncit&code=C105779",
"https://phinvads.cdc.gov/vads/ViewCodeSystemConcept.action?oid=2.16.840.1.113883.6.96&code=249913002",
"https://phinvads.cdc.gov/vads/ViewCodeSystemConcept.action?oid=2.16.840.1.113883.6.96&code=30989003"
]
}
},
{
"name": "E03PASKL",
"surveyQuestion": "On MOST days of ANY ONE MONTH in the LAST 12 MONTHS did you have pain, aching, or stiffness in any of the following? Please rate as none, mild, moderate, or severe. - Left - Knee",
"label": "Left Knee - pain, aching, or stiffness on MOST days of ANY ONE MONTH in the LAST 12 MONTHS",
"source": "Qualtrics survey (1h2J)",
"value": {
"format": "categorical",
"dataType": "Int64",
"unique": "False",
"category": {
"0": "None",
"1": "Mild",
"2": "Moderate",
"3": "Severe",
"888": "No response"
}
},
"ontology": {
"source": [
"https://ncit.nci.nih.gov/ncitbrowser/ConceptReport.jsp?dictionary=NCI_Thesaurus&ns=ncit&code=C125625",
"https://ncit.nci.nih.gov/ncitbrowser/ConceptReport.jsp?dictionary=NCI_Thesaurus&ns=ncit&code=C125623",
"https://ncit.nci.nih.gov/ncitbrowser/ConceptReport.jsp?dictionary=NCI_Thesaurus&ns=ncit&code=C105779",
"https://phinvads.cdc.gov/vads/ViewCodeSystemConcept.action?oid=2.16.840.1.113883.6.96&code=249913002",
"https://phinvads.cdc.gov/vads/ViewCodeSystemConcept.action?oid=2.16.840.1.113883.6.96&code=30989003"
]
}
},
{
"name": "E03RADRPAKKL",
"label": "Kellgren-Lawrence (KL) Grade read from right PA knee radiograph",
"value": {
"format": "categorical",
"dataType": "Int64",
"unique": "False",
"category": {
"0": "No OA",
"1": "Questionable OA",
"2": "Mild OA",
"3": "Moderate OA",
"4": "Severe OA",
"99": "Total joint replacement",
"-99": "No image to read"
}
},
"ontology": {
"source": [
"https://ncit.nci.nih.gov/ncitbrowser/ConceptReport.jsp?dictionary=NCI_Thesaurus&ns=ncit&code=C115514"
]
}
},
{
"name": "E03RADLPAKKL",
"label": "Kellgren-Lawrence (KL) Grade read from left PA knee radiograph",
"value": {
"format": "categorical",
"dataType": "Int64",
"unique": "False",
"category": {
"0": "No OA",
"1": "Questionable OA",
"2": "Mild OA",
"3": "Moderate OA",
"4": "Severe OA",
"99": "Total joint replacement",
"-99": "No image to read"
}
},
"ontology": {
"source": [
"https://ncit.nci.nih.gov/ncitbrowser/ConceptReport.jsp?dictionary=NCI_Thesaurus&ns=ncit&code=C115514"
]
}
},
{
"name": "E03AGE",
"label": "Age categorized in 5-year increments at time of JoCoHS enrollment",
"value": {
"format": "categorical",
"unitOfMeasure": "year",
"dataType": "Int64",
"unique": "False",
"category": {
"1": "35-39",
"2": "40-44",
"3": "45-49",
"4": "50-54",
"5": "55-59",
"6": "60-64",
"7": "65-70"
}
},
"ontology": {
"source": [
"https://ncit.nci.nih.gov/ncitbrowser/ConceptReport.jsp?dictionary=NCI_Thesaurus&ns=ncit&code=C69260"
]
}
}
]
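The variable descriptions above declare categorical variables as Int64 codes with a `category` map, while the preview shows the decoded labels. The following is a minimal sketch of building decoders from such a description and rejecting codes that are not listed. It assumes the description is available as a JSON array shaped like the one above (only an excerpt is embedded here), and the helper names `build_decoders` and `decode` are hypothetical.

```python
import json

# Excerpt of the variable descriptions, reduced to the fields used below.
VARIABLE_JSON = """
[
  {"name": "E03SUBJECTID", "value": {"dataType": "object", "unique": "True"}},
  {"name": "E03RADRPAKKL",
   "value": {"format": "categorical", "dataType": "Int64",
             "category": {"0": "No OA", "1": "Questionable OA", "2": "Mild OA",
                          "3": "Moderate OA", "4": "Severe OA",
                          "99": "Total joint replacement",
                          "-99": "No image to read"}}}
]
"""


def build_decoders(variables):
    """Map each categorical variable name to its code-to-label dictionary."""
    return {
        var["name"]: var["value"]["category"]
        for var in variables
        if "category" in var.get("value", {})
    }


def decode(decoders, name, code):
    """Translate one integer code to its label, rejecting undeclared codes."""
    key = str(code)  # category keys are stored as strings in the JSON
    categories = decoders[name]
    if key not in categories:
        raise ValueError(f"{name}: code {code!r} is not a declared category")
    return categories[key]


decoders = build_decoders(json.loads(VARIABLE_JSON))
print(decode(decoders, "E03RADRPAKKL", 3))
print(decode(decoders, "E03RADRPAKKL", -99))
```

Note that special values such as 888 (No response), 99 (Total joint replacement), and -99 (No image to read) are ordinary category codes in this scheme, so they should be handled explicitly before treating a column as an ordinal scale.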
General statistics
What was your gender at birth? (E03GENDER)
category | counts |
---|---|
Female | 587 |
Male | 294 |
Right Knee - pain, aching, or stiffness on MOST days of ANY ONE MONTH in the LAST 12 MONTHS (E03PASKR)
category | counts |
---|---|
None | 399 |
Mild | 196 |
Moderate | 181 |
Severe | 103 |
No response | 2 |
Left Knee - pain, aching, or stiffness on MOST days of ANY ONE MONTH in the LAST 12 MONTHS (E03PASKL)
category | counts |
---|---|
None | 448 |
Moderate | 168 |
Mild | 168 |
Severe | 95 |
No response | 2 |
Kellgren-Lawrence (KL) Grade read from right PA knee radiograph (E03RADRPAKKL)
category | counts |
---|---|
No OA | 303 |
Questionable OA | 290 |
Mild OA | 119 |
Moderate OA | 89 |
Severe OA | 63 |
No image to read | 15 |
Total joint replacement | 2 |
Kellgren-Lawrence (KL) Grade read from left PA knee radiograph (E03RADLPAKKL)
category | counts |
---|---|
No OA | 339 |
Questionable OA | 262 |
Mild OA | 120 |
Moderate OA | 87 |
Severe OA | 55 |
No image to read | 15 |
Total joint replacement | 3 |
Age categorized in 5-year increments at time of JoCoHS enrollment (E03AGE)
category | counts |
---|---|
65-70 | 171 |
60-64 | 153 |
55-59 | 149 |
50-54 | 144 |
40-44 | 104 |
45-49 | 102 |
35-39 | 58 |
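Because every subject has exactly one value per variable, the category counts for each variable should add up to the 881 rows reported for the SUBJECT table. The following is a small sketch of that consistency check, with the counts copied from the tables above. The function name `find_mismatches` is illustrative only.

```python
EXPECTED_ROWS = 881  # row count reported for dataTable.SUBJECT.tab

# Category counts copied from the general statistics tables.
COUNTS = {
    "E03GENDER": {"Female": 587, "Male": 294},
    "E03PASKR": {"None": 399, "Mild": 196, "Moderate": 181, "Severe": 103,
                 "No response": 2},
    "E03PASKL": {"None": 448, "Moderate": 168, "Mild": 168, "Severe": 95,
                 "No response": 2},
    "E03RADRPAKKL": {"No OA": 303, "Questionable OA": 290, "Mild OA": 119,
                     "Moderate OA": 89, "Severe OA": 63, "No image to read": 15,
                     "Total joint replacement": 2},
    "E03RADLPAKKL": {"No OA": 339, "Questionable OA": 262, "Mild OA": 120,
                     "Moderate OA": 87, "Severe OA": 55, "No image to read": 15,
                     "Total joint replacement": 3},
    "E03AGE": {"65-70": 171, "60-64": 153, "55-59": 149, "50-54": 144,
               "40-44": 104, "45-49": 102, "35-39": 58},
}


def find_mismatches(counts, expected_rows):
    """Return {variable: total} for variables whose counts do not sum to expected_rows."""
    totals = {name: sum(by_label.values()) for name, by_label in counts.items()}
    return {name: total for name, total in totals.items() if total != expected_rows}


mismatches = find_mismatches(COUNTS, EXPECTED_ROWS)
print(mismatches or "all variables sum to the expected row count")
```

The same check can be rerun against counts computed from a downloaded copy of the table to confirm that the file was read completely.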
File dataTable.IMAGE_REF.csv
Preview
E03SUBJECTID | E03USIMGT | E03USIMGF | E03USIMGZ | E03USIMGD |
---|---|---|---|---|
72bb0a51-f020-11ed-b527-0a580a5f736a | left anterior suprapatellar longitudinal | 72bb0a51-f020-11ed-b527-0a580a5f736a_21.png | 132460 | Left |
72bb0a51-f020-11ed-b527-0a580a5f736a | left anterior suprapatellar longitudinal with power Doppler | 72bb0a51-f020-11ed-b527-0a580a5f736a_22.png | 154179 | Left |
72bb0a51-f020-11ed-b527-0a580a5f736a | left anterior suprapatellar transverse in 30 degrees flexion | 72bb0a51-f020-11ed-b527-0a580a5f736a_23.png | 133222 | Left |
72bb0a51-f020-11ed-b527-0a580a5f736a | left medial longitudinal | 72bb0a51-f020-11ed-b527-0a580a5f736a_24.png | 121758 | Left |
72bb0a51-f020-11ed-b527-0a580a5f736a | left lateral longitudinal | 72bb0a51-f020-11ed-b527-0a580a5f736a_25.png | 108043 | Left |
Variable descriptions
[
{
"name": "E03SUBJECTID",
"label": "A unique identifier for a subject in a study.",
"value": {
"format": "Universal Unique Identifier (UUID version 1) generated according to RFC 4122",
"dataType": "object",
"unique": "True"
},
"identifier": "True",
"ontology": {
"source": [
"https://ncit.nci.nih.gov/ncitbrowser/ConceptReport.jsp?dictionary=NCI_Thesaurus&ns=ncit&code=C69256"
]
}
},
{
"name": "E03USIMGT",
"label": "Knee ultrasound image type",
"value": {
"format": "categorical",
"dataType": "Int64",
"unique": "False",
"category": {
"11": "right anterior suprapatellar longitudinal",
"12": "right anterior suprapatellar longitudinal with power Doppler",
"13": "right anterior suprapatellar transverse in 30 degrees flexion",
"14": "right medial longitudinal",
"15": "right lateral longitudinal",
"16": "right anterior suprapatellar transverse in maximal flexion",
"17": "right posterior medial transverse",
"21": "left anterior suprapatellar longitudinal",
"22": "left anterior suprapatellar longitudinal with power Doppler",
"23": "left anterior suprapatellar transverse in 30 degrees flexion",
"24": "left medial longitudinal",
"25": "left lateral longitudinal",
"26": "left anterior suprapatellar transverse in maximal flexion",
"27": "left posterior medial transverse"
}
},
"identifier": "False",
"ontology": {
"source": [
"https://ncithesaurus.nci.nih.gov/ncitbrowser/ConceptReport.jsp?dictionary=NCI_Thesaurus&ns=ncit&code=C32221"
]
}
},
{
"name": "E03USIMGF",
"label": "Ultrasound image file name",
"fileFormat": "PNG",
"value": {
"format": "Each filename contains the E03SUBJECTID and E03USIMGT value in the form of E03SUBJECTID+'_'+E03USIMGT+'.png'",
"dataType": "string (42 character)"
},
"identifier": "False",
"ontology": {
"source": [
"https://ncithesaurus.nci.nih.gov/ncitbrowser/ConceptReport.jsp?dictionary=NCI_Thesaurus&ns=ncit&code=C171191"
]
}
},
{
"name": "E03USIMGZ",
"label": "Ultrasound image file size in bytes",
"value": {
"format": "Int64",
"unitOfMeasure": "byte",
"dataType": "Int64"
},
"identifier": "False"
},
{
"name": "E03USIMGD",
"label": "Knee imaged - left or right knee",
"value": {
"format": "categorical",
"dataType": "Int64",
"unique": "False",
"category": {
"1": "Right",
"2": "Left"
}
},
"identifier": "False",
"ontology": {
"source": [
"https://ncithesaurus.nci.nih.gov/ncitbrowser/ConceptReport.jsp?dictionary=NCI_Thesaurus&ns=ncit&code=C25306"
]
}
}
]
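The E03USIMGF description above gives the file name rule as E03SUBJECTID+'_'+E03USIMGT+'.png'. The following is a minimal sketch of building and parsing file names under that rule, using a subject ID from the preview. The function names are hypothetical, and the parser assumes the subject ID is a well-formed UUID as stated in the E03SUBJECTID description.

```python
import uuid


def build_image_filename(subject_id: str, image_type: int) -> str:
    """Compose a file name as E03SUBJECTID + '_' + E03USIMGT + '.png'."""
    return f"{subject_id}_{image_type}.png"


def parse_image_filename(filename: str):
    """Split a file name back into (subject_id, image_type_code)."""
    stem, dot, extension = filename.rpartition(".")
    if dot != "." or extension != "png":
        raise ValueError(f"expected a .png file name, got {filename!r}")
    subject_id, sep, code = stem.rpartition("_")
    if sep != "_":
        raise ValueError(f"missing '_' separator in {filename!r}")
    uuid.UUID(subject_id)  # raises ValueError if the ID is not a valid UUID
    return subject_id, int(code)


example = build_image_filename("72bb0a51-f020-11ed-b527-0a580a5f736a", 21)
print(example)
print(parse_image_filename(example))
```

Parsing from the right keeps the hyphens inside the UUID intact. As a side observation, the file names in the preview are 43 characters long (36 for the UUID, 1 for the underscore, 2 for the type code, and 4 for `.png`), so the "42 character" note in the description may be worth double-checking.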
General statistics
Knee ultrasound image type (E03USIMGT)
category | counts |
---|---|
right anterior suprapatellar longitudinal | 867 |
right anterior suprapatellar longitudinal with power Doppler | 867 |
right lateral longitudinal | 867 |
right medial longitudinal | 867 |
left anterior suprapatellar longitudinal | 866 |
left anterior suprapatellar longitudinal with power Doppler | 866 |
left anterior suprapatellar transverse in 30 degrees flexion | 866 |
left medial longitudinal | 866 |
left lateral longitudinal | 866 |
right anterior suprapatellar transverse in 30 degrees flexion | 866 |
left anterior suprapatellar transverse in maximal flexion | 865 |
right anterior suprapatellar transverse in maximal flexion | 865 |
right posterior medial transverse | 835 |
left posterior medial transverse | 834 |
Knee imaged - left or right knee (E03USIMGD)
category | counts |
---|---|
Right | 6034 |
Left | 6029 |
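The two tables above can be cross-checked against each other: the per-type counts should add up to the 12,063 rows reported for the IMAGE_REF table, and grouping them by knee should reproduce the E03USIMGD counts. The following is a brief sketch of that check, with the counts keyed by E03USIMGT code. It assumes the tens digit of the code identifies the knee (1 = right, 2 = left), as the category labels suggest, and the function name `side_totals` is illustrative.

```python
EXPECTED_ROWS = 12063  # row count reported for dataTable.IMAGE_REF.tab

# Image counts keyed by E03USIMGT code, copied from the table above.
TYPE_COUNTS = {
    11: 867, 12: 867, 13: 866, 14: 867, 15: 867, 16: 865, 17: 835,
    21: 866, 22: 866, 23: 866, 24: 866, 25: 866, 26: 865, 27: 834,
}


def side_totals(type_counts):
    """Sum image counts per knee, assuming codes 1x are right and 2x are left."""
    totals = {"Right": 0, "Left": 0}
    for code, count in type_counts.items():
        side = "Right" if code // 10 == 1 else "Left"
        totals[side] += count
    return totals


totals = side_totals(TYPE_COUNTS)
print(totals)
print(sum(TYPE_COUNTS.values()) == EXPECTED_ROWS)
```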
This output was generated from the README.notebook.public.ipynb Jupyter Notebook associated with this dataset. The purpose of this output is to provide documentation for the dataset. The methods used to generate this output allow for an automated approach to generate human-readable documentation for a dataset during the data curation process.