Included Databases
International Soundscape Database (ISD)
Module for handling the International Soundscape Database (ISD).
This module provides functions for loading, validating, and analyzing data from the International Soundscape Database. It includes utilities for data retrieval, quality checks, and basic analysis operations.
Notes
The ISD is a large-scale database of soundscape surveys and recordings collected across multiple cities. This module is designed to work with the specific structure and content of the ISD.
Examples:
>>> import pandas as pd
>>> import soundscapy.databases.isd as isd
>>> df = isd.load()
>>> isinstance(df, pd.DataFrame)
True
>>> 'PAQ1' in df.columns
True
load
load()
Load the example ISD CSV file into a DataFrame.
RETURNS | DESCRIPTION |
---|---|
DataFrame | DataFrame containing ISD data. |
Notes
This function loads the ISD data from a local CSV file included with the soundscapy package.
References
Mitchell, A., Oberman, T., Aletta, F., Erfanian, M., Kachlicka, M., Lionello, M., & Kang, J. (2022). The International Soundscape Database: An integrated multimedia database of urban soundscape surveys -- questionnaires with acoustical and contextual information (0.2.4) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.6331810
Examples:
>>> import pandas as pd
>>> from soundscapy.databases.isd import load
>>> from soundscapy.surveys.survey_utils import PAQ_IDS
>>> df = load()
>>> isinstance(df, pd.DataFrame)
True
>>> set(PAQ_IDS).issubset(df.columns)
True
Source code in soundscapy/databases/isd.py (lines 48-84)
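Because `load` returns an ordinary pandas DataFrame, checking that the PAQ columns are present is a natural first step after loading. The sketch below is illustrative only: it reads a tiny in-memory CSV as a stand-in for the packaged file, and its two rows are made-up values, not ISD records. The `RecordID` column, the `PAQ_COLUMNS` list, and the `missing_paq_columns` helper are hypothetical names, and the code assumes the eight PAQ columns are called `PAQ1` to `PAQ8`, as in the examples on this page.

```python
import io

import pandas as pd

# Assumption: the eight PAQ columns are named PAQ1..PAQ8.
PAQ_COLUMNS = [f"PAQ{i}" for i in range(1, 9)]

# Made-up stand-in for the packaged CSV file; these are not real ISD records.
TOY_CSV = """RecordID,PAQ1,PAQ2,PAQ3,PAQ4,PAQ5,PAQ6,PAQ7,PAQ8
a,4,3,2,1,2,3,4,5
b,2,2,3,4,5,4,3,1
"""


def missing_paq_columns(df):
    """Return the PAQ column names absent from df (hypothetical helper)."""
    return [col for col in PAQ_COLUMNS if col not in df.columns]


toy_df = pd.read_csv(io.StringIO(TOY_CSV))
print(toy_df.shape)                 # (2, 9)
print(missing_paq_columns(toy_df))  # []
```

With the real package installed, the same check could be applied to the DataFrame returned by `load()`.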
validate
validate(df, paq_aliases=_PAQ_ALIASES, allow_paq_na=False, val_range=(1, 5))
Perform data quality checks and validate that the dataset fits the expected format.
PARAMETER | DESCRIPTION |
---|---|
df | ISD style dataframe, including PAQ data. |
paq_aliases | List of PAQ names (in order) or dict of PAQ names with new names as values. |
allow_paq_na | If True, allow NaN values in PAQ data, by default False. |
val_range | Min and max range of the PAQ response values, by default (1, 5). |

RETURNS | DESCRIPTION |
---|---|
Tuple[DataFrame, Optional[DataFrame]] | Tuple containing the cleaned dataframe and optionally a dataframe of excluded samples. |
Notes
This function renames PAQ columns, checks PAQ data quality, and optionally removes rows with invalid or missing PAQ values.
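The two accepted forms of `paq_aliases` can be pictured with a small renaming sketch. This is not the soundscapy implementation: `rename_paq_columns` is a hypothetical helper, the local `PAQ_IDS` list is a stand-in that assumes the target names are `PAQ1` to `PAQ8`, and the column names `pleasant` and `vibrant` are illustrative.

```python
import pandas as pd

# Assumption: the target names are PAQ1..PAQ8 (local stand-in list).
PAQ_IDS = [f"PAQ{i}" for i in range(1, 9)]


def rename_paq_columns(df, aliases):
    """Rename PAQ columns from a list (positional) or dict (old -> new).

    Hypothetical helper for illustration; not the soundscapy implementation.
    """
    if isinstance(aliases, dict):
        mapping = dict(aliases)
    else:
        # A list is read in order: first entry becomes PAQ1, second PAQ2, ...
        mapping = dict(zip(aliases, PAQ_IDS))
    return df.rename(columns=mapping)


toy = pd.DataFrame({"pleasant": [4], "vibrant": [2]})

by_list = rename_paq_columns(toy, ["pleasant", "vibrant"])
by_dict = rename_paq_columns(toy, {"pleasant": "PAQ1", "vibrant": "PAQ2"})
print(list(by_list.columns))  # ['PAQ1', 'PAQ2']
print(list(by_dict.columns))  # ['PAQ1', 'PAQ2']
```

A list relies on column order, while a dict states each mapping explicitly, which is the safer choice when a dataset's column order is not guaranteed.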
Examples:
>>> import pandas as pd
>>> import numpy as np
>>> from soundscapy.databases.isd import validate
>>> df = pd.DataFrame({
... 'PAQ1': [np.nan, 2, 3, 3], 'PAQ2': [3, 2, 6, 3], 'PAQ3': [2, 2, 3, 3],
... 'PAQ4': [1, 2, 3, 3], 'PAQ5': [5, 2, 3, 3], 'PAQ6': [3, 2, 3, 3],
... 'PAQ7': [4, 2, 3, 3], 'PAQ8': [2, 2, 3, 3]
... })
>>> clean_df, excl_df = validate(df, allow_paq_na=True)
>>> clean_df.shape[0]
2
>>> excl_df.shape[0]
2
Source code in soundscapy/databases/isd.py (lines 149-209)
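To make the roles of `val_range` and `allow_paq_na` concrete, here is a minimal sketch of a range and NaN check that splits a dataframe into kept and excluded rows. It is an assumption-laden illustration, not the library's code: `split_by_paq_range` is a hypothetical helper, and the real `validate` may apply further quality checks, so its row counts need not match this sketch.

```python
import numpy as np
import pandas as pd


def split_by_paq_range(df, paq_cols, val_range=(1, 5), allow_na=False):
    """Split df into (kept, excluded) rows using PAQ range and NaN checks.

    Hypothetical helper for illustration; the real `validate` may apply
    additional quality checks that are not reproduced here.
    """
    low, high = val_range
    paqs = df[paq_cols]
    # Comparisons against NaN evaluate to False, so NaN never counts as
    # out of range here and is handled by the separate isna() check.
    out_of_range = ((paqs < low) | (paqs > high)).any(axis=1)
    has_na = paqs.isna().any(axis=1)
    bad = out_of_range if allow_na else (out_of_range | has_na)
    return df[~bad], df[bad]


toy = pd.DataFrame({"PAQ1": [np.nan, 2, 3], "PAQ2": [3, 2, 6]})

kept, excluded = split_by_paq_range(toy, ["PAQ1", "PAQ2"], allow_na=True)
print(len(kept), len(excluded))  # 2 1

kept, excluded = split_by_paq_range(toy, ["PAQ1", "PAQ2"], allow_na=False)
print(len(kept), len(excluded))  # 1 2
```

Returning the excluded rows alongside the kept ones mirrors the documented return value and makes it easy to inspect why samples were dropped.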
Soundscape Attributes Translation Project (SATP)
Module for handling the Soundscape Attributes Translation Project (SATP) database.
This module provides functions for loading and processing data from the Soundscape Attributes Translation Project database. It includes utilities for data retrieval from Zenodo and basic data loading operations.
Examples:
>>> import pandas as pd
>>> import soundscapy.databases.satp as satp
>>> df = satp.load_zenodo()
>>> isinstance(df, pd.DataFrame)
True
>>> 'Language' in df.columns
True
>>> participants = satp.load_participants()
>>> isinstance(participants, pd.DataFrame)
True
>>> 'Country' in participants.columns
True
load_participants
load_participants(version='latest')
Load the SATP participants dataset from Zenodo.
PARAMETER | DESCRIPTION |
---|---|
version | Version of the dataset to load. The default is "latest". |

RETURNS | DESCRIPTION |
---|---|
DataFrame | DataFrame containing the SATP participants dataset. |
Source code in soundscapy/databases/satp.py (lines 89-107)
load_zenodo
load_zenodo(version='latest')
Load the SATP dataset from Zenodo.
PARAMETER | DESCRIPTION |
---|---|
version | Version of the dataset to load. The default is "latest". |

RETURNS | DESCRIPTION |
---|---|
DataFrame | DataFrame containing the SATP dataset. |
Source code in soundscapy/databases/satp.py (lines 69-86)