
Included Databases

International Soundscape Database (ISD)

load

load()

Load the example "ISD" [1] CSV file into a DataFrame

RETURNS DESCRIPTION
DataFrame

dataframe of ISD data

References

[1] Mitchell, Andrew, Oberman, Tin, Aletta, Francesco, Erfanian, Mercede, Kachlicka, Magdalena, Lionello, Matteo, & Kang, Jian. (2022). The International Soundscape Database: An integrated multimedia database of urban soundscape surveys -- questionnaires with acoustical and contextual information (0.2.4) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.6331810

Source code in soundscapy/databases/isd.py
def load():
    """Load example "ISD" [1]_ csv file to DataFrame

    Returns
    -------
    pd.DataFrame
        dataframe of ISD data

    References
    ----------
    .. [1] Mitchell, Andrew, Oberman, Tin, Aletta, Francesco, Erfanian, Mercede, Kachlicka, Magdalena, Lionello, Matteo, & Kang, Jian. (2022). The International Soundscape Database: An integrated multimedia database of urban soundscape surveys -- questionnaires with acoustical and contextual information (0.2.4) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.6331810
    """

    with resources.path("soundscapy.data", "ISD v1.0 Data.csv") as f:
        data = pd.read_csv(f)
    data = rename_paqs(data, _PAQ_ALIASES)

    return data
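
The `rename_paqs` call above maps the dataset's PAQ columns onto a common set of names. As a rough illustration only (this is not soundscapy's implementation), a helper of that kind could accept either an ordered list of existing column names or a dict of existing names to new names, which is how the `paq_aliases` parameter is described for `validate`. The function name `rename_by_alias`, the target names `PAQ1` to `PAQ8`, and the sample column names are assumptions made for this sketch.

```python
import pandas as pd

# Hypothetical target names; assumed for this sketch only.
TARGET_NAMES = [f"PAQ{i}" for i in range(1, 9)]


def rename_by_alias(df, aliases):
    """Return a copy of df with alias columns renamed.

    aliases is either a list of existing column names (in target order)
    or a dict mapping existing names to new names.
    """
    if isinstance(aliases, dict):
        mapping = dict(aliases)
    else:
        # Pair each listed column with the target name at the same position.
        mapping = dict(zip(aliases, TARGET_NAMES))
    return df.rename(columns=mapping)


# Toy data; the column names are invented for the example.
raw = pd.DataFrame({"pleasant": [4, 2], "vibrant": [3, 5], "site": ["A", "B"]})
by_list = rename_by_alias(raw, ["pleasant", "vibrant"])
by_dict = rename_by_alias(raw, {"pleasant": "PAQ1", "vibrant": "PAQ2"})
print(list(by_list.columns))  # ['PAQ1', 'PAQ2', 'site']
```

Columns not named in the aliases (here `site`) pass through unchanged, and the input frame is left untouched because `DataFrame.rename` returns a new object by default.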

validate

validate(df, paq_aliases=_PAQ_ALIASES, allow_paq_na=False, verbose=1, val_range=(1, 5))

Performs data quality checks and validates that the dataset fits the expected format

PARAMETER DESCRIPTION
df

ISD-style dataframe, including PAQ data

TYPE: DataFrame

paq_aliases

list of PAQ names (in order) or dict of PAQ names with new names as values, by default _PAQ_ALIASES

TYPE: list or dict DEFAULT: _PAQ_ALIASES

allow_lockdown

if True will keep Lockdown data in the df, by default True. Note: this parameter is listed in the docstring but is not accepted by the signature shown above.

TYPE: bool

allow_paq_na

if False, remove rows which have any missing PAQ values; otherwise remove only those with 50% missing, by default False

TYPE: bool DEFAULT: False

verbose

how much info to print while running, by default 1

TYPE: int DEFAULT: 1

val_range

min and max range of the PAQ response values, by default (1, 5)

TYPE: tuple DEFAULT: (1, 5)

RETURNS DESCRIPTION
tuple

cleaned dataframe, dataframe of excluded samples (None when likert_data_quality returns None)
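
To make `allow_paq_na` and `val_range` concrete, here is a minimal sketch of a row check along the lines described above. It is a stand-in written for illustration, not the library's `likert_data_quality`; the name `flag_bad_rows`, the exact handling of the 50% threshold, and the sample columns are all assumptions.

```python
import pandas as pd


def flag_bad_rows(df, paq_cols, allow_na=False, val_range=(1, 5)):
    """Return positional indices of rows failing the checks, or None if all pass."""
    lo, hi = val_range
    paqs = df[paq_cols]
    n_missing = paqs.isna().sum(axis=1)
    if allow_na:
        # Assumed reading: tolerate gaps, but flag rows with at least half missing.
        too_many_na = n_missing >= len(paq_cols) / 2
    else:
        too_many_na = n_missing > 0
    # NaN compares False on both sides, so missing values never count as out of range.
    out_of_range = ((paqs < lo) | (paqs > hi)).any(axis=1)
    bad = (too_many_na | out_of_range).to_numpy().nonzero()[0].tolist()
    return bad or None


cols = ["PAQ1", "PAQ2", "PAQ3"]
sample = pd.DataFrame(
    {
        "PAQ1": [1, 5, None, 7],  # row 2 has a gap, row 3 is out of range
        "PAQ2": [3, 2, 4, 3],
        "PAQ3": [2, 2, 2, 2],
    }
)
print(flag_bad_rows(sample, cols))                 # [2, 3]
print(flag_bad_rows(sample, cols, allow_na=True))  # [3]
```

With `allow_na=True`, row 2 survives because only one of its three values is missing; the out-of-range row is flagged in both cases.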

Source code in soundscapy/databases/isd.py
def validate(
    df: pd.DataFrame,
    paq_aliases: Union[List, Dict] = _PAQ_ALIASES,
    allow_paq_na: bool = False,
    verbose: int = 1,
    val_range: Tuple = (1, 5),
):
    """Performs data quality checks and validates that the dataset fits the expected format

    Parameters
    ----------
    df : pd.DataFrame
        ISD style dataframe, incl PAQ data
    paq_aliases : list or dict, optional
        list of PAQ names (in order)
        or dict of PAQ names with new names as values, by default _PAQ_ALIASES
    allow_lockdown : bool, optional
        if True will keep Lockdown data in the df, by default True
        (documented here but not accepted by the signature above)
    allow_paq_na : bool, optional
        if False, remove rows which have any missing PAQ values;
        otherwise remove only those with 50% missing, by default False
    verbose : int, optional
        how much info to print while running, by default 1
    val_range : tuple, optional
        min and max range of the PAQ response values, by default (1, 5)

    Returns
    -------
    tuple
        cleaned dataframe, dataframe of excluded samples
        (None when likert_data_quality returns None)
    """
    if verbose > 0:
        print("Renaming PAQ columns.")
    df = rename_paqs(df, paq_aliases)

    if verbose > 0:
        print("Checking PAQ data quality.")
    l = likert_data_quality(df, verbose, allow_paq_na, val_range)
    if l is None:
        excl_df = None
    else:
        excl_df = df.iloc[l, :]
        df = df.drop(df.index[l])
    return df, excl_df
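
The last few lines of `validate` split the frame using positional indices, so the same pattern works when the index is not a plain 0..n-1 range. The sketch below reproduces only that step on toy data; `split_by_position` and the sample values are hypothetical, and the positions stand in for whatever `likert_data_quality` returns.

```python
import pandas as pd


def split_by_position(df, positions):
    """Split df into (kept, excluded) given positional row indices or None."""
    if positions is None:
        # Mirrors validate: nothing excluded, so the second element is None.
        return df, None
    excluded = df.iloc[positions, :]
    # Translate positions to labels before dropping, as validate does.
    kept = df.drop(df.index[positions])
    return kept, excluded


toy = pd.DataFrame({"PAQ1": [4, 9, 2]}, index=["r10", "r11", "r12"])
kept, excluded = split_by_position(toy, [1])
print(list(kept.index))      # ['r10', 'r12']
print(list(excluded.index))  # ['r11']
```

Callers should check `excluded is None` before using the second element. Note also that `drop` works on labels, so if the index contains duplicate labels, every row carrying a flagged label is removed.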

Soundscape Attributes Translation Project (SATP)

load_participants

load_participants(version='latest')

Load the SATP participants dataset from Zenodo.

PARAMETER DESCRIPTION
version

Version of the dataset to load. The default is "latest".

TYPE: str DEFAULT: 'latest'

RETURNS DESCRIPTION
df

Dataframe containing the SATP participants dataset.

TYPE: DataFrame

Source code in soundscapy/databases/satp.py
def load_participants(version: str = "latest") -> pd.DataFrame:
    """
    Load the SATP participants dataset from Zenodo.

    Parameters
    ----------
    version : str, optional
        Version of the dataset to load. The default is "latest".

    Returns
    -------
    df : pandas.DataFrame
        Dataframe containing the SATP participants dataset.
    """
    url = _url_fetch(version)
    return pd.read_excel(url, engine="openpyxl", sheet_name="Participants").drop(
        columns=["Unnamed: 3", "Unnamed: 4"]
    )
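
`load_participants` drops two columns by fixed name, `Unnamed: 3` and `Unnamed: 4`; pandas assigns `Unnamed: N` names to columns whose header cell is empty. `DataFrame.drop` raises a `KeyError` by default when a named column is absent, so a sheet with a different layout would fail at this step. A more tolerant variant, offered only as a sketch, filters on the prefix instead. The helper `drop_unnamed` and the sample frame are assumptions and not part of soundscapy.

```python
import pandas as pd


def drop_unnamed(df):
    """Return df without columns whose name starts with 'Unnamed:'."""
    keep = [c for c in df.columns if not str(c).startswith("Unnamed:")]
    return df[keep]


# Toy frame standing in for a sheet with two empty header cells.
sheet = pd.DataFrame(
    {"col_a": ["x"], "col_b": [30], "Unnamed: 3": [None], "Unnamed: 4": [None]}
)
print(list(drop_unnamed(sheet).columns))  # ['col_a', 'col_b']
```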

load_zenodo

load_zenodo(version='latest')

Load the SATP dataset from Zenodo.

PARAMETER DESCRIPTION
version

Version of the dataset to load. The default is "latest".

TYPE: str DEFAULT: 'latest'

RETURNS DESCRIPTION
df

Dataframe containing the SATP dataset.

TYPE: DataFrame

Source code in soundscapy/databases/satp.py
def load_zenodo(version: str = "latest") -> pd.DataFrame:
    """
    Load the SATP dataset from Zenodo.

    Parameters
    ----------
    version : str, optional
        Version of the dataset to load. The default is "latest".

    Returns
    -------
    df : pandas.DataFrame
        Dataframe containing the SATP dataset.
    """
    url = _url_fetch(version)
    return pd.read_excel(url, engine="openpyxl", sheet_name="Main Merge")