README

This document provides a summary of the dataset contents:

Citation metadata

DOI - https://doi.org/10.7910/DVN/SKP9IB

title

Development of an AI/ML-ready knee ultrasound dataset in a population-based cohort

grantNumber

grantNumberValue: R01AR077060

grantNumberAgency: NIAMS

grantNumberValue: 3R01AR077060-03S1

dsDescription

dsDescriptionValue:

About the data

An ultrasound dataset for discovering ultrasound features associated with pain and radiographic change in knee osteoarthritis (KOA) is highly innovative and will be a major step forward for the field. These ultrasound images originate from the diverse and inclusive population-based Johnston County Health Study (JoCoHS). This dataset is designed to adhere to FAIR principles and was funded in part by an Administrative Supplement to Improve the AI/ML-Readiness of NIH-Supported Data (3R01AR077060-03S1). It includes a subset of JoCoHS participants who underwent ultrasound imaging and radiographic evaluations.

dsDescriptionDate: 2024-10-29

dsDescriptionValue:

To begin learning about the dataset, visit our User Guide for an all-in-one document containing statistics and other details to help you work with the data.

dsDescriptionDate: 2024-10-29

publication

publicationCitation: Yerich NV, Alvarez C, Schwartz TA, Savage-Guin S, Renner JB, Bakewell CJ, Kohler MJ, Lin J, Samuels J, Nelson AE. A Standardized, Pragmatic Approach to Knee Ultrasound for Clinical Research in Osteoarthritis: The Johnston County Osteoarthritis Project. ACR Open Rheumatol. 2020 Jul;2(7):438-448. doi: 10.1002/acr2.11159. PMID: 32597564; PMCID: PMC7368135.

publicationIDType: pmid

publicationIDNumber: 32597564

publicationURL: https://doi.org/10.1002/acr2.11159

author

authorName: Nelson, Amanda

authorAffiliation: University of North Carolina at Chapel Hill

authorIdentifierScheme: ORCID

authorIdentifier: 0000-0002-9344-7877

dateOfCollection

dateOfCollectionStart: 2019-03-14

dateOfCollectionEnd: 2024-06-01

subject

Medicine, Health and Life Sciences

Dataset files

25 files currently in this dataset

| filename | size | directory | categories | description |
|---|---|---|---|---|
| README.notebook.public.ipynb | 55.0 kB | code | ['code', 'Jupyter', 'notebook'] | This file provides documentation describing the dataset and a preview of the data. This notebook also includes code to use as a dataset documenting framework for your own projects. |
| _notebook.config.template.json | 3.9 kB | code | ['code', 'json'] | This file is used with the README.notebook.public.ipynb and contains the notebook variables. It also contains an embedded schema to describe/validate the JSON config. |
| _notebook.instructions.md | 3.5 kB | code | ['code', 'Jupyter', 'Markdown'] | This file is used with the README.notebook.public.ipynb and contains instructions on using the notebook. |
| _notebook_installer.py | 699 Bytes | code | ['code', 'Jupyter', 'notebook', 'Python script'] | This file installs any Python modules used by the Jupyter notebook that are not included in the Jupyter environment by default. This script is only useful to those wanting to run the code within the Jupyter notebook. |
| _notebook_worker.py | 18.2 kB | code | ['code', 'Jupyter', 'notebook', 'Python script'] | This file contains the scripts that help describe the dataset within the Jupyter Notebook. This script is ONLY useful to those wanting to run the code within the Jupyter Notebook. This code is kept separate from the Notebook to keep the Notebook from becoming bloated with code. |
| example.us.image.png | 127.5 kB | data/image/example | ['data', 'example'] | This is an example ultrasound file that represents one of the images from the image archives and should not be used for analysis purposes. Images in this dataset are saved in lossless .png format. |
| imageArchive.11.zip | 104.7 MB | data/image/ultrasound | ['data', 'image'] | This file contains an archive of right anterior suprapatellar longitudinal ultrasound images (file count: 867). There should only be one image per subject in this archive. Images are saved in lossless .png format. |
| imageArchive.12.zip | 133.0 MB | data/image/ultrasound | ['data', 'image'] | This file contains an archive of right anterior suprapatellar longitudinal with power Doppler ultrasound images (file count: 867). There should only be one image per subject in this archive. Images are saved in lossless .png format. |
| imageArchive.13.zip | 112.6 MB | data/image/ultrasound | ['data', 'image'] | This file contains an archive of right anterior suprapatellar transverse in 30 degrees flexion ultrasound images (file count: 866). There should only be one image per subject in this archive. Images are saved in lossless .png format. |
| imageArchive.14.zip | 83.6 MB | data/image/ultrasound | ['data', 'image'] | This file contains an archive of right medial longitudinal ultrasound images (file count: 867). There should only be one image per subject in this archive. Images are saved in lossless .png format. |
| imageArchive.15.zip | 81.8 MB | data/image/ultrasound | ['data', 'image'] | This file contains an archive of right lateral longitudinal ultrasound images (file count: 867). There should only be one image per subject in this archive. Images are saved in lossless .png format. |
| imageArchive.16.zip | 88.5 MB | data/image/ultrasound | ['data', 'image'] | This file contains an archive of right anterior suprapatellar transverse in maximal flexion ultrasound images (file count: 865). There should only be one image per subject in this archive. Images are saved in lossless .png format. |
| imageArchive.17.zip | 101.6 MB | data/image/ultrasound | ['data', 'image'] | This file contains an archive of right posterior medial transverse ultrasound images (file count: 835). There should only be one image per subject in this archive. Images are saved in lossless .png format. |
| imageArchive.21.zip | 104.2 MB | data/image/ultrasound | ['data', 'image'] | This file contains an archive of left anterior suprapatellar longitudinal ultrasound images (file count: 866). There should only be one image per subject in this archive. Images are saved in lossless .png format. |
| imageArchive.22.zip | 132.3 MB | data/image/ultrasound | ['data', 'image'] | This file contains an archive of left anterior suprapatellar longitudinal with power Doppler ultrasound images (file count: 866). There should only be one image per subject in this archive. Images are saved in lossless .png format. |
| imageArchive.23.zip | 112.6 MB | data/image/ultrasound | ['data', 'image'] | This file contains an archive of left anterior suprapatellar transverse in 30 degrees flexion ultrasound images (file count: 866). There should only be one image per subject in this archive. Images are saved in lossless .png format. |
| imageArchive.24.zip | 83.2 MB | data/image/ultrasound | ['data', 'image'] | This file contains an archive of left medial longitudinal ultrasound images (file count: 866). There should only be one image per subject in this archive. Images are saved in lossless .png format. |
| imageArchive.25.zip | 81.7 MB | data/image/ultrasound | ['data', 'image'] | This file contains an archive of left lateral longitudinal ultrasound images (file count: 866). There should only be one image per subject in this archive. Images are saved in lossless .png format. |
| imageArchive.26.zip | 89.3 MB | data/image/ultrasound | ['data', 'image'] | This file contains an archive of left anterior suprapatellar transverse in maximal flexion ultrasound images (file count: 865). There should only be one image per subject in this archive. Images are saved in lossless .png format. |
| imageArchive.27.zip | 101.2 MB | data/image/ultrasound | ['data', 'image'] | This file contains an archive of left posterior medial transverse ultrasound images (file count: 834). There should only be one image per subject in this archive. Images are saved in lossless .png format. |
| dataTable.IMAGE_REF.tab | 1.2 MB (row count: 12,063) | data/reference | ['data', 'reference'] | This file contains the ultrasound image metadata. Use this file to determine the number of ultrasound image types available, file sizes, and references to the files found in the ancillary image archives. |
| dataTable.SUBJECT.tab | 45.0 kB (row count: 881) | data/reference | ['data', 'reference'] | This is the first file generated for this dataset and contains the participant/subject reference IDs, basic subject demographics, PA knee KL grades, and reported pain in knees. |
| dvDatasetMetadata.json | 87.8 kB | data/reference | ['data', 'reference'] | This file contains metadata created BEFORE data was uploaded to the repository dataset to help describe the data we EXPECT to have uploaded to the dataset. This file was created because the repository does not contain built-in tools to describe data files and variables in depth, which is needed both to document the data and to perform validation. This file is used to validate and describe categorical variables within the datatable files. |
| README.md | 31.3 kB | documentation | ['README'] | This document contains an overview of the dataset and is a good starting point for understanding the data. |
| curation_log.md | 17.1 kB | documentation | ['curation'] | This document contains the dataset curation checklist and notes regarding the considerations and steps taken to curate the data. |

25 files EXPECTED in this dataset
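
Each image archive description above states that there should be only one image per subject. The following is a minimal sketch of how that could be checked after download. It assumes archive members are named like the E03USIMGF values (`<E03SUBJECTID>_<E03USIMGT>.png`); the helper name `subjects_with_duplicates` and the in-memory stand-in archive are illustrative and are not part of the dataset's own code.

```python
import io
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def subjects_with_duplicates(archive, type_code):
    """Return subject IDs that appear more than once for one image type.

    `archive` is a path or file-like object accepted by zipfile.ZipFile.
    Assumes members are named '<E03SUBJECTID>_<E03USIMGT>.png'.
    """
    suffix = f"_{type_code}.png"
    counts = Counter()
    with zipfile.ZipFile(archive) as zf:
        for member in zf.namelist():
            name = PurePosixPath(member).name  # ignore any folder prefix
            if name.endswith(suffix):
                counts[name[: -len(suffix)]] += 1
    return sorted(subject for subject, n in counts.items() if n > 1)


# Tiny in-memory stand-in for imageArchive.21.zip (empty placeholders, not real images).
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr("72bb0a51-f020-11ed-b527-0a580a5f736a_21.png", b"")
    zf.writestr("72bb0a76-f020-11ed-b527-0a580a5f736a_21.png", b"")
demo.seek(0)

duplicates = subjects_with_duplicates(demo, 21)
print(duplicates)  # an empty list means one image per subject
```

To run the check on a downloaded archive, pass its local path in place of `demo`.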

File dataTable.SUBJECT.csv

Preview

| E03SUBJECTID | E03GENDER | E03PASKR | E03PASKL | E03RADRPAKKL | E03RADLPAKKL | E03AGE |
|---|---|---|---|---|---|---|
| 72bb0a51-f020-11ed-b527-0a580a5f736a | Female | Moderate | None | Moderate OA | Moderate OA | 50-54 |
| 72bb0a76-f020-11ed-b527-0a580a5f736a | Female | Severe | None | Mild OA | No image to read | 60-64 |
| 72bb0a8a-f020-11ed-b527-0a580a5f736a | Female | Moderate | Moderate | Questionable OA | Mild OA | 65-70 |
| 72bb0a99-f020-11ed-b527-0a580a5f736a | Female | None | Mild | No OA | Questionable OA | 45-49 |
| 72bb0ab5-f020-11ed-b527-0a580a5f736a | Male | None | None | Questionable OA | Mild OA | 40-44 |

Variable descriptions

[
{
"name": "E03SUBJECTID",
"label": "A unique identifier for a subject in a study.",
"value": {
"format": "Universal Unique Identifier (UUID version 1) generated according to RFC 4122",
"dataType": "object",
"unique": "True"
},
"identifier": "True",
"ontology": {
"source": [
"https://ncit.nci.nih.gov/ncitbrowser/ConceptReport.jsp?dictionary=NCI_Thesaurus&ns=ncit&code=C69256"
]
}
},
{
"name": "E03GENDER",
"label": "What was your gender at birth?",
"value": {
"format": "categorical",
"dataType": "Int64",
"unique": "False",
"category": {
"1": "Male",
"2": "Female"
}
},
"ontology": {
"source": [
"https://ncit.nci.nih.gov/ncitbrowser/ConceptReport.jsp?dictionary=NCI_Thesaurus&ns=ncit&code=C164527"
]
}
},
{
"name": "E03PASKR",
"surveyQuestion": "On MOST days of ANY ONE MONTH in the LAST 12 MONTHS did you have pain, aching, or stiffness in any of the following? Please rate as none, mild, moderate, or severe. - Right - Knee",
"label": "Right Knee - pain, aching, or stiffness on MOST days of ANY ONE MONTH in the LAST 12 MONTHS",
"source": "Qualtrics survey (1h2J)",
"value": {
"format": "categorical",
"dataType": "Int64",
"unique": "False",
"category": {
"0": "None",
"1": "Mild",
"2": "Moderate",
"3": "Severe",
"888": "No response"
}
},
"ontology": {
"source": [
"https://ncit.nci.nih.gov/ncitbrowser/ConceptReport.jsp?dictionary=NCI_Thesaurus&ns=ncit&code=C125625",
"https://ncit.nci.nih.gov/ncitbrowser/ConceptReport.jsp?dictionary=NCI_Thesaurus&ns=ncit&code=C125623",
"https://ncit.nci.nih.gov/ncitbrowser/ConceptReport.jsp?dictionary=NCI_Thesaurus&ns=ncit&code=C105779",
"https://phinvads.cdc.gov/vads/ViewCodeSystemConcept.action?oid=2.16.840.1.113883.6.96&code=249913002",
"https://phinvads.cdc.gov/vads/ViewCodeSystemConcept.action?oid=2.16.840.1.113883.6.96&code=30989003"
]
}
},
{
"name": "E03PASKL",
"surveyQuestion": "On MOST days of ANY ONE MONTH in the LAST 12 MONTHS did you have pain, aching, or stiffness in any of the following? Please rate as none, mild, moderate, or severe. - Left - Knee",
"label": "Left Knee - pain, aching, or stiffness on MOST days of ANY ONE MONTH in the LAST 12 MONTHS",
"source": "Qualtrics survey (1h2J)",
"value": {
"format": "categorical",
"dataType": "Int64",
"unique": "False",
"category": {
"0": "None",
"1": "Mild",
"2": "Moderate",
"3": "Severe",
"888": "No response"
}
},
"ontology": {
"source": [
"https://ncit.nci.nih.gov/ncitbrowser/ConceptReport.jsp?dictionary=NCI_Thesaurus&ns=ncit&code=C125625",
"https://ncit.nci.nih.gov/ncitbrowser/ConceptReport.jsp?dictionary=NCI_Thesaurus&ns=ncit&code=C125623",
"https://ncit.nci.nih.gov/ncitbrowser/ConceptReport.jsp?dictionary=NCI_Thesaurus&ns=ncit&code=C105779",
"https://phinvads.cdc.gov/vads/ViewCodeSystemConcept.action?oid=2.16.840.1.113883.6.96&code=249913002",
"https://phinvads.cdc.gov/vads/ViewCodeSystemConcept.action?oid=2.16.840.1.113883.6.96&code=30989003"
]
}
},
{
"name": "E03RADRPAKKL",
"label": "Kellgren-Lawrence (KL) Grade read from right PA knee radiograph",
"value": {
"format": "categorical",
"dataType": "Int64",
"unique": "False",
"category": {
"0": "No OA",
"1": "Questionable OA",
"2": "Mild OA",
"3": "Moderate OA",
"4": "Severe OA",
"99": "Total joint replacement",
"-99": "No image to read"
}
},
"ontology": {
"source": [
"https://ncit.nci.nih.gov/ncitbrowser/ConceptReport.jsp?dictionary=NCI_Thesaurus&ns=ncit&code=C115514"
]
}
},
{
"name": "E03RADLPAKKL",
"label": "Kellgren-Lawrence (KL) Grade read from left PA knee radiograph",
"value": {
"format": "categorical",
"dataType": "Int64",
"unique": "False",
"category": {
"0": "No OA",
"1": "Questionable OA",
"2": "Mild OA",
"3": "Moderate OA",
"4": "Severe OA",
"99": "Total joint replacement",
"-99": "No image to read"
}
},
"ontology": {
"source": [
"https://ncit.nci.nih.gov/ncitbrowser/ConceptReport.jsp?dictionary=NCI_Thesaurus&ns=ncit&code=C115514"
]
}
},
{
"name": "E03AGE",
"label": "Age categorized in 5-year increments at time of JoCoHS enrollment",
"value": {
"format": "categorical",
"unitOfMeasure": "year",
"dataType": "Int64",
"unique": "False",
"category": {
"1": "35-39",
"2": "40-44",
"3": "45-49",
"4": "50-54",
"5": "55-59",
"6": "60-64",
"7": "65-70"
}
},
"ontology": {
"source": [
"https://ncit.nci.nih.gov/ncitbrowser/ConceptReport.jsp?dictionary=NCI_Thesaurus&ns=ncit&code=C69260"
]
}
}
]
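
The variable descriptions above declare the categorical columns as Int64 codes with a `category` map, while the preview shows labels. If the copy of the table you load stores the integer codes, a lookup like the one below can translate them. This is a sketch only: the two entries are trimmed copies of the descriptions above, the helper names `decode` and `as_ordinal` are hypothetical, and treating 888, 99, and -99 as non-measurements is an analysis choice rather than a rule stated by the dataset.

```python
import json

# Trimmed copy of two entries from the variable descriptions above.
VARIABLES = json.loads("""
[
  {"name": "E03PASKR",
   "value": {"category": {"0": "None", "1": "Mild", "2": "Moderate",
                          "3": "Severe", "888": "No response"}}},
  {"name": "E03RADRPAKKL",
   "value": {"category": {"0": "No OA", "1": "Questionable OA", "2": "Mild OA",
                          "3": "Moderate OA", "4": "Severe OA",
                          "99": "Total joint replacement",
                          "-99": "No image to read"}}}
]
""")

# Codes that label a non-measurement rather than a rating or grade.
# Assumption: excluding them from ordinal analysis is the analyst's decision.
NON_MEASUREMENT = {"E03PASKR": {888}, "E03RADRPAKKL": {99, -99}}

# JSON object keys are strings, so convert them to integers once.
LABELS = {
    var["name"]: {int(code): label for code, label in var["value"]["category"].items()}
    for var in VARIABLES
}


def decode(variable, code):
    """Return the category label for an integer code."""
    return LABELS[variable][code]


def as_ordinal(variable, code):
    """Return the code as an ordinal value, or None for non-measurement codes."""
    return None if code in NON_MEASUREMENT[variable] else code


print(decode("E03RADRPAKKL", 3))        # Moderate OA
print(as_ordinal("E03RADRPAKKL", -99))  # None
```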

General statistics

What was your gender at birth? (E03GENDER)

| category | counts |
|---|---|
| Female | 587 |
| Male | 294 |

Right Knee - pain, aching, or stiffness on MOST days of ANY ONE MONTH in the LAST 12 MONTHS (E03PASKR)

| category | counts |
|---|---|
| None | 399 |
| Mild | 196 |
| Moderate | 181 |
| Severe | 103 |
| No response | 2 |

Left Knee - pain, aching, or stiffness on MOST days of ANY ONE MONTH in the LAST 12 MONTHS (E03PASKL)

| category | counts |
|---|---|
| None | 448 |
| Moderate | 168 |
| Mild | 168 |
| Severe | 95 |
| No response | 2 |

Kellgren-Lawrence (KL) Grade read from right PA knee radiograph (E03RADRPAKKL)

| category | counts |
|---|---|
| No OA | 303 |
| Questionable OA | 290 |
| Mild OA | 119 |
| Moderate OA | 89 |
| Severe OA | 63 |
| No image to read | 15 |
| Total joint replacement | 2 |

Kellgren-Lawrence (KL) Grade read from left PA knee radiograph (E03RADLPAKKL)

| category | counts |
|---|---|
| No OA | 339 |
| Questionable OA | 262 |
| Mild OA | 120 |
| Moderate OA | 87 |
| Severe OA | 55 |
| No image to read | 15 |
| Total joint replacement | 3 |

Age categorized in 5-year increments at time of JoCoHS enrollment (E03AGE)

| category | counts |
|---|---|
| 65-70 | 171 |
| 60-64 | 153 |
| 55-59 | 149 |
| 50-54 | 144 |
| 40-44 | 104 |
| 45-49 | 102 |
| 35-39 | 58 |

File dataTable.IMAGE_REF.csv

Preview

| E03SUBJECTID | E03USIMGT | E03USIMGF | E03USIMGZ | E03USIMGD |
|---|---|---|---|---|
| 72bb0a51-f020-11ed-b527-0a580a5f736a | left anterior suprapatellar longitudinal | 72bb0a51-f020-11ed-b527-0a580a5f736a_21.png | 132460 | Left |
| 72bb0a51-f020-11ed-b527-0a580a5f736a | left anterior suprapatellar longitudinal with power Doppler | 72bb0a51-f020-11ed-b527-0a580a5f736a_22.png | 154179 | Left |
| 72bb0a51-f020-11ed-b527-0a580a5f736a | left anterior suprapatellar transverse in 30 degrees flexion | 72bb0a51-f020-11ed-b527-0a580a5f736a_23.png | 133222 | Left |
| 72bb0a51-f020-11ed-b527-0a580a5f736a | left medial longitudinal | 72bb0a51-f020-11ed-b527-0a580a5f736a_24.png | 121758 | Left |
| 72bb0a51-f020-11ed-b527-0a580a5f736a | left lateral longitudinal | 72bb0a51-f020-11ed-b527-0a580a5f736a_25.png | 108043 | Left |

Variable descriptions

[
{
"name": "E03SUBJECTID",
"label": "A unique identifier for a subject in a study.",
"value": {
"format": "Universal Unique Identifier (UUID version 1) generated according to RFC 4122",
"dataType": "object",
"unique": "True"
},
"identifier": "True",
"ontology": {
"source": [
"https://ncit.nci.nih.gov/ncitbrowser/ConceptReport.jsp?dictionary=NCI_Thesaurus&ns=ncit&code=C69256"
]
}
},
{
"name": "E03USIMGT",
"label": "Knee ultrasound image type",
"value": {
"format": "categorical",
"dataType": "Int64",
"unique": "False",
"category": {
"11": "right anterior suprapatellar longitudinal",
"12": "right anterior suprapatellar longitudinal with power Doppler",
"13": "right anterior suprapatellar transverse in 30 degrees flexion",
"14": "right medial longitudinal",
"15": "right lateral longitudinal",
"16": "right anterior suprapatellar transverse in maximal flexion",
"17": "right posterior medial transverse",
"21": "left anterior suprapatellar longitudinal",
"22": "left anterior suprapatellar longitudinal with power Doppler",
"23": "left anterior suprapatellar transverse in 30 degrees flexion",
"24": "left medial longitudinal",
"25": "left lateral longitudinal",
"26": "left anterior suprapatellar transverse in maximal flexion",
"27": "left posterior medial transverse"
}
},
"identifier": "False",
"ontology": {
"source": [
"https://ncithesaurus.nci.nih.gov/ncitbrowser/ConceptReport.jsp?dictionary=NCI_Thesaurus&ns=ncit&code=C32221"
]
}
},
{
"name": "E03USIMGF",
"label": "Ultrasound image file name",
"fileFormat": "PNG",
"value": {
"format": "Each filename contains the E03SUBJECTID and E03USIMGT value in the form of E03SUBJECTID+'_'+E03USIMGT+'.png'",
"dataType": "string (42 character)"
},
"identifier": "False",
"ontology": {
"source": [
"https://ncithesaurus.nci.nih.gov/ncitbrowser/ConceptReport.jsp?dictionary=NCI_Thesaurus&ns=ncit&code=C171191"
]
}
},
{
"name": "E03USIMGZ",
"label": "Ultrasound image file size in bytes",
"value": {
"format": "Int64",
"unitOfMeasure": "byte",
"dataType": "Int64"
},
"identifier": "False"
},
{
"name": "E03USIMGD",
"label": "Knee imaged - left or right knee",
"value": {
"format": "categorical",
"dataType": "Int64",
"unique": "False",
"category": {
"1": "Right",
"2": "Left"
}
},
"identifier": "False",
"ontology": {
"source": [
"https://ncithesaurus.nci.nih.gov/ncitbrowser/ConceptReport.jsp?dictionary=NCI_Thesaurus&ns=ncit&code=C25306"
]
}
}
]
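
The E03USIMGF description above gives the file name pattern as E03SUBJECTID+'_'+E03USIMGT+'.png'. A short sketch of building and parsing such names follows. The function names are hypothetical, and the `imageArchive.<code>.zip` pattern is inferred from the file list in this document (for example, imageArchive.21.zip is described as holding the left anterior suprapatellar longitudinal images, which is image type 21), so treat it as an assumption.

```python
import re

# UUID-shaped subject ID, underscore, two-digit image type code, '.png'.
FILENAME_RE = re.compile(
    r"^([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})_(\d{2})\.png$"
)


def image_filename(subject_id, type_code):
    """Build an E03USIMGF value from E03SUBJECTID and E03USIMGT."""
    return f"{subject_id}_{type_code}.png"


def archive_name(type_code):
    """Archive expected to hold this image type (pattern inferred from the file list)."""
    return f"imageArchive.{type_code}.zip"


def parse_filename(filename):
    """Split an E03USIMGF value back into (subject_id, type_code)."""
    match = FILENAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    return match.group(1), int(match.group(2))


# First row of the IMAGE_REF preview.
subject = "72bb0a51-f020-11ed-b527-0a580a5f736a"
name = image_filename(subject, 21)
print(name)                  # 72bb0a51-f020-11ed-b527-0a580a5f736a_21.png
print(archive_name(21))      # imageArchive.21.zip
print(parse_filename(name))  # ('72bb0a51-f020-11ed-b527-0a580a5f736a', 21)
```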

General statistics

Knee ultrasound image type (E03USIMGT)

| category | counts |
|---|---|
| right anterior suprapatellar longitudinal | 867 |
| right anterior suprapatellar longitudinal with power Doppler | 867 |
| right lateral longitudinal | 867 |
| right medial longitudinal | 867 |
| left anterior suprapatellar longitudinal | 866 |
| left anterior suprapatellar longitudinal with power Doppler | 866 |
| left anterior suprapatellar transverse in 30 degrees flexion | 866 |
| left medial longitudinal | 866 |
| left lateral longitudinal | 866 |
| right anterior suprapatellar transverse in 30 degrees flexion | 866 |
| left anterior suprapatellar transverse in maximal flexion | 865 |
| right anterior suprapatellar transverse in maximal flexion | 865 |
| right posterior medial transverse | 835 |
| left posterior medial transverse | 834 |

Knee imaged - left or right knee (E03USIMGD)

| category | counts |
|---|---|
| Right | 6034 |
| Left | 6029 |
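
The per-type counts and the per-knee counts above can be cross-checked against each other. In the E03USIMGT categories, codes 11 to 17 are right-knee views and 21 to 27 are left-knee views, so the tens digit matches the E03USIMGD code (1 = Right, 2 = Left). The sketch below relies on that observation, which is read off the category lists rather than stated as a rule by the dataset; `side_of` is a hypothetical helper.

```python
# Per-type counts copied from the E03USIMGT statistics above, keyed by image type code.
TYPE_COUNTS = {
    11: 867, 12: 867, 13: 866, 14: 867, 15: 867, 16: 865, 17: 835,
    21: 866, 22: 866, 23: 866, 24: 866, 25: 866, 26: 865, 27: 834,
}

SIDE_LABELS = {1: "Right", 2: "Left"}  # E03USIMGD categories


def side_of(type_code):
    """Tens digit of the image type code: 1 = right knee, 2 = left knee."""
    return type_code // 10


side_totals = {}
for code, count in TYPE_COUNTS.items():
    label = SIDE_LABELS[side_of(code)]
    side_totals[label] = side_totals.get(label, 0) + count

print(side_totals)                # {'Right': 6034, 'Left': 6029}
print(sum(side_totals.values()))  # 12063
```

The totals agree with the E03USIMGD counts and with the 12,063 rows reported for dataTable.IMAGE_REF.tab in the file list.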

This output was generated from the README.notebook.public.ipynb Jupyter Notebook associated with this dataset. Its purpose is to document the dataset. The methods used to generate it allow human-readable dataset documentation to be produced automatically during the data curation process.