Survey Analysis

This section provides an overview of the survey instruments used in soundscape research. It includes a brief description of each instrument, as well as information on how to access and use them.

surveys

The module containing functions for dealing with soundscape survey data.

Notes

The functions in this module are designed to be fairly general and can be used with any dataset in a similar format to the ISD. The key to this is using a simple dataframe/sheet with the following columns:

  • Index columns: e.g. LocationID, RecordID, GroupID, SessionID

  • Perceptual attributes: PAQ1, PAQ2, ..., PAQ8

  • Independent variables: e.g. Laeq, N5, Sharpness, etc.

The key functions of this module clean and validate datasets, calculate ISO coordinate values or SSM metrics, and filter on index columns. Functions and operations that are specific to a particular dataset are located in their own modules under soundscapy.databases.
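
As a rough illustration of the layout described above, the following sketch builds a tiny dataframe with index columns, the eight PAQ columns and one independent variable. The location names and all values are made up for the example and do not come from the ISD.

```python
import pandas as pd

# Made-up example rows in the column layout described above.
# The values are illustrative only, not real survey data.
data = pd.DataFrame(
    {
        # Index columns
        "LocationID": ["ParkA", "ParkA", "StreetB"],
        "RecordID": [1, 2, 3],
        # Perceptual attributes: PAQ1 to PAQ8 on a 1-5 Likert scale
        "PAQ1": [4, 5, 2],
        "PAQ2": [3, 4, 3],
        "PAQ3": [2, 3, 4],
        "PAQ4": [1, 2, 5],
        "PAQ5": [1, 1, 4],
        "PAQ6": [2, 2, 3],
        "PAQ7": [3, 2, 1],
        "PAQ8": [5, 4, 1],
        # Independent variable
        "Laeq": [55.2, 57.8, 71.4],
    }
)

paq_cols = [f"PAQ{i}" for i in range(1, 9)]
print(data[paq_cols].head())
```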

add_iso_coords

add_iso_coords(data, val_range=(1, 5), names=('ISOPleasant', 'ISOEventful'), overwrite=False, angles=(0, 45, 90, 135, 180, 225, 270, 315))

Calculate and add ISO coordinates as new columns in dataframe

Calls calculate_iso_coords()

PARAMETER DESCRIPTION
angles

Angle, in degrees, of each PAQ scale (PAQ1 to PAQ8)

DEFAULT: (0, 45, 90, 135, 180, 225, 270, 315)

data

ISD Dataframe

TYPE: DataFrame

val_range

(min, max) range of original PAQ responses, by default (1, 5)

DEFAULT: (1, 5)

names

Names for new coordinate columns, by default ["ISOPleasant", "ISOEventful"]

TYPE: list DEFAULT: ('ISOPleasant', 'ISOEventful')

RETURNS DESCRIPTION
DataFrame

Dataframe with new columns added

See Also

calculate_iso_coords (in this module)
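
The overwrite guard can be sketched with plain pandas. `add_columns` below is a hypothetical helper written for this example, not part of the module, and it raises `ValueError` where the source listing raises `Warning`. It relies on `DataFrame.assign`, which returns a new frame and replaces any column that shares a name with a keyword.

```python
import pandas as pd


def add_columns(data, new_cols, overwrite=False):
    """Hypothetical helper: attach new columns, refusing to clobber by default."""
    clashes = [name for name in new_cols if name in data.columns]
    if clashes and not overwrite:
        raise ValueError(f"{clashes} already in dataframe. Use `overwrite` to replace.")
    # DataFrame.assign returns a new frame and replaces same-named columns
    return data.assign(**new_cols)


df = pd.DataFrame({"PAQ1": [4, 2], "ISOPleasant": [0.0, 0.0]})
updated = add_columns(df, {"ISOPleasant": [0.5, -0.25]}, overwrite=True)
```

The original frame is left untouched, so the caller decides whether to keep both versions.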

Source code in soundscapy/utils/surveys.py
def add_iso_coords(
    data,
    val_range=(1, 5),
    names=("ISOPleasant", "ISOEventful"),
    overwrite=False,
    angles=(0, 45, 90, 135, 180, 225, 270, 315),
):
    """Calculate and add ISO coordinates as new columns in dataframe

    Calls `calculate_iso_coords()`

    Parameters
    ----------
    data : pd.DataFrame
        ISD Dataframe
    val_range : tuple, optional
        (min, max) range of original PAQ responses, by default (1, 5)
    names : tuple, optional
        Names for new coordinate columns, by default ("ISOPleasant", "ISOEventful")
    overwrite : bool, optional
        Whether to replace existing columns with the same names, by default False
    angles : tuple, optional
        Angle in degrees of each PAQ scale, by default (0, 45, 90, 135, 180, 225, 270, 315)

    Returns
    -------
    pd.DataFrame
        Dataframe with new columns added

    See Also
    --------
    calculate_iso_coords
    """
    if names[0] in data.columns:
        if overwrite:
            data = data.drop(names[0], axis=1)
        else:
            raise Warning(
                f"{names[0]} already in dataframe. Use `overwrite` to replace it."
            )
    if names[1] in data.columns:
        if overwrite:
            data = data.drop(names[1], axis=1)
        else:
            raise Warning(
                f"{names[1]} already in dataframe. Use `overwrite` to replace it."
            )
    isopl, isoev = calculate_iso_coords(data, val_range=val_range, angles=angles)
    data = data.assign(**{names[0]: isopl, names[1]: isoev})
    return data

adj_iso_ev

adj_iso_ev(values, angles, scale=None)

Calculate the adjusted ISOEventful value

This calculation is based on the formulae given in Aletta et al. (2024), adapted from ISO 12913-3. These formulae were developed to enable the use of adjusted angles and are as follows:

\[ P_{ISO} = \frac{1}{\lambda_{pl}} \sum_{i=1}^{8} \cos(\theta_i) \cdot \sigma_i \]

\[ E_{ISO} = \frac{1}{\lambda_{ev}} \sum_{i=1}^{8} \sin(\theta_i) \cdot \sigma_i \]

where \(i\) indexes each circumplex scale, \(\theta_i\) is the adjusted angle for the circumplex scale for the appropriate language, and \(\sigma_i\) is the response value for the circumplex scale. The \(\frac{1}{\lambda}\) provides a scaling factor (equivalent to the \(\frac{1}{(4 + \sqrt{32})}\) from ISO 12913-3) to bring the range of ISOPleasant, ISOEventful to (-1, +1):

\[ \lambda_{pl} = \frac{\rho}{2} \sum_{i=1}^{8} \left| \cos(\theta_i) \right| \]

where \(\rho\) is the range of the PAQ values (i.e. 5 - 1 = 4). \(\lambda_{ev}\) is calculated in the same way, but using \(\sin(\theta_i)\) in place of \(\cos(\theta_i)\).
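
A minimal worked example of the eventfulness formula for a single response, written with the standard library only. It is a sketch of the formulae above, not the module's implementation. The function name `iso_eventful` and the example response are assumptions made for illustration.

```python
import math

ANGLES = (0, 45, 90, 135, 180, 225, 270, 315)  # equally spaced default angles


def iso_eventful(values, angles=ANGLES, val_range=(1, 5)):
    """E_ISO for one response: projection onto sin(theta), scaled by lambda_ev."""
    rho = max(val_range) - min(val_range)
    projection = sum(math.sin(math.radians(a)) * v for a, v in zip(angles, values))
    lam_ev = rho / 2 * sum(abs(math.sin(math.radians(a))) for a in angles)
    return projection / lam_ev


# Made-up response: top score on the three scales with sin > 0 (45, 90, 135 degrees),
# bottom score on the three scales with sin < 0, neutral elsewhere.
response = (3, 5, 5, 5, 3, 1, 1, 1)
print(round(iso_eventful(response), 6))
```

With the default angles this response sits at the upper end of the eventful axis, and an all-neutral response lands at the origin.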

PARAMETER DESCRIPTION
values

angles

scale

The scale to use for the adjusted ISOEventful value, by default None

TYPE: float DEFAULT: None

RETURNS DESCRIPTION
float
Source code in soundscapy/utils/surveys.py
def adj_iso_ev(values, angles, scale=None):
    """
    Calculate the adjusted ISOEventful value

    This calculation is based on the formulae given in Aletta et al. (2024), adapted from ISO 12913-3.
    These formulae were developed to enable the use of adjusted angles and are as follows:

    $$
    P_{ISO} = \\frac{1}{\\lambda_{pl}} \\sum_{i=1}^{8} \\cos(\\theta_i) \\cdot \\sigma_i
    $$

    $$
    E_{ISO} = \\frac{1}{\\lambda_{ev}} \\sum_{i=1}^{8} \\sin(\\theta_i) \\cdot \\sigma_i
    $$
    where i indexes each circumplex scale, $\\theta_i$ is the adjusted angle for the circumplex scale for the
    appropriate language, and $\\sigma_i$ is the response value for the circumplex scale.
    The $\\frac{1}{\\lambda}$ provides a scaling factor (equivalent to the $\\frac{1}{(4 + \\sqrt{32})}$
    from ISO 12913-3) to bring the range of ISOPleasant, ISOEventful to (-1, +1):

    $$
    \\lambda_{pl} = \\frac{\\rho}{2} \\sum_{i=1}^{8} \\left| \\cos(\\theta_i) \\right|
    $$

    where $\\rho$ is the range of the PAQ values (i.e. 5 - 1 = 4). $\\lambda_{ev}$ is calculated in the same
    way, but using $\\sin(\\theta_i)$ in place of $\\cos(\\theta_i)$.

    Parameters
    ----------
    values : tuple or np.array
    angles : tuple
    scale : float, optional
        The scale to use for the adjusted ISOEventful value, by default None

    Returns
    -------
    float

    """
    iso_ev = np.sum(
        [np.sin(np.deg2rad(angle)) * values[i] for i, angle in enumerate(angles)]
    )
    if scale:
        iso_ev = iso_ev / (
            scale / 2 * np.sum(np.abs([np.sin(np.deg2rad(angle)) for angle in angles]))
        )

    return iso_ev

adj_iso_pl

adj_iso_pl(values, angles, scale=None)

Calculate the adjusted ISOPleasant value

This calculation is based on the formulae given in Aletta et al. (2024), adapted from ISO 12913-3. These formulae were developed to enable the use of adjusted angles and are as follows:

\[ P_{ISO} = \frac{1}{\lambda_{pl}} \sum_{i=1}^{8} \cos(\theta_i) \cdot \sigma_i \]

\[ E_{ISO} = \frac{1}{\lambda_{ev}} \sum_{i=1}^{8} \sin(\theta_i) \cdot \sigma_i \]

where \(i\) indexes each circumplex scale, \(\theta_i\) is the adjusted angle for the circumplex scale for the appropriate language, and \(\sigma_i\) is the response value for the circumplex scale. The \(\frac{1}{\lambda}\) provides a scaling factor (equivalent to the \(\frac{1}{(4 + \sqrt{32})}\) from ISO 12913-3) to bring the range of ISOPleasant, ISOEventful to (-1, +1):

\[ \lambda_{pl} = \frac{\rho}{2} \sum_{i=1}^{8} \left| \cos(\theta_i) \right| \]

where \(\rho\) is the range of the PAQ values (i.e. 5 - 1 = 4). \(\lambda_{ev}\) is calculated in the same way, but using \(\sin(\theta_i)\) in place of \(\cos(\theta_i)\).
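
As a check on the equivalence with ISO 12913-3 mentioned above, the scaling factor can be worked out by hand for the default equally spaced angles (0°, 45°, ..., 315°) and \(\rho = 4\). This is only the special case of equal spacing; with adjusted angles the sum has to be evaluated for the angles actually used.

```latex
\begin{aligned}
\sum_{i=1}^{8} \left| \cos(\theta_i) \right|
  &= |\cos 0^\circ| + |\cos 45^\circ| + |\cos 90^\circ| + |\cos 135^\circ| \\
  &\quad + |\cos 180^\circ| + |\cos 225^\circ| + |\cos 270^\circ| + |\cos 315^\circ| \\
  &= 1 + \tfrac{\sqrt{2}}{2} + 0 + \tfrac{\sqrt{2}}{2} + 1 + \tfrac{\sqrt{2}}{2} + 0 + \tfrac{\sqrt{2}}{2}
   = 2 + 2\sqrt{2} \\
\lambda_{pl} &= \frac{\rho}{2}\left(2 + 2\sqrt{2}\right)
   = \frac{4}{2}\left(2 + 2\sqrt{2}\right) = 4 + 4\sqrt{2} = 4 + \sqrt{32} \approx 9.657
\end{aligned}
```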

PARAMETER DESCRIPTION
values

TYPE: tuple

angles

TYPE: tuple

scale

The scale to use for the adjusted ISOPleasant value, by default None

TYPE: float DEFAULT: None

RETURNS DESCRIPTION
float
Source code in soundscapy/utils/surveys.py
def adj_iso_pl(values: tuple, angles: tuple, scale=None) -> float:
    """
    Calculate the adjusted ISOPleasant value

    This calculation is based on the formulae given in Aletta et al. (2024), adapted from ISO 12913-3.
    These formulae were developed to enable the use of adjusted angles and are as follows:

    $$
    P_{ISO} = \\frac{1}{\\lambda_{pl}} \\sum_{i=1}^{8} \\cos(\\theta_i) \\cdot \\sigma_i
    $$

    $$
    E_{ISO} = \\frac{1}{\\lambda_{ev}} \\sum_{i=1}^{8} \\sin(\\theta_i) \\cdot \\sigma_i
    $$
    where i indexes each circumplex scale, $\\theta_i$ is the adjusted angle for the circumplex scale for the
    appropriate language, and $\\sigma_i$ is the response value for the circumplex scale.
    The $\\frac{1}{\\lambda}$ provides a scaling factor (equivalent to the $\\frac{1}{(4 + \\sqrt{32})}$
    from ISO 12913-3) to bring the range of ISOPleasant, ISOEventful to (-1, +1):

    $$
    \\lambda_{pl} = \\frac{\\rho}{2} \\sum_{i=1}^{8} \\left| \\cos(\\theta_i) \\right|
    $$

    where $\\rho$ is the range of the PAQ values (i.e. 5 - 1 = 4). $\\lambda_{ev}$ is calculated in the same
    way, but using $\\sin(\\theta_i)$ in place of $\\cos(\\theta_i)$.

    Parameters
    ----------
    values : tuple or np.array
    angles : tuple
    scale : float, optional
        The scale to use for the adjusted ISOPleasant value, by default None

    Returns
    -------
    float

    """
    iso_pl = np.sum(
        [np.cos(np.deg2rad(angle)) * values[i] for i, angle in enumerate(angles)]
    )
    if scale:
        iso_pl = iso_pl / (
            scale / 2 * np.sum(np.abs([np.cos(np.deg2rad(angle)) for angle in angles]))
        )

    return iso_pl

calculate_iso_coords

calculate_iso_coords(results_df, val_range=(5, 1), angles=(0, 45, 90, 135, 180, 225, 270, 315))

Calculates the projected ISOPleasant and ISOEventful coordinates

If a value is missing, by default it is replaced with neutral (3). The raw PAQ values should be Likert data from 1 to 5 and the column names should match the PAQ IDs (PAQ1, PAQ2, ..., PAQ8).
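
The projection can also be sketched for a whole dataframe at once. The version below is a vectorised NumPy alternative to the row-wise `apply` used in the source listing, assuming the default equally spaced angles and complete data (no missing values). `iso_coords` and `PAQ_COLS` are names chosen for this example.

```python
import numpy as np
import pandas as pd

PAQ_COLS = [f"PAQ{i}" for i in range(1, 9)]
ANGLES = np.deg2rad([0, 45, 90, 135, 180, 225, 270, 315])


def iso_coords(df, val_range=(1, 5)):
    """Vectorised sketch: project every row of PAQ1..PAQ8 onto cos/sin axes."""
    rho = max(val_range) - min(val_range)
    paqs = df[PAQ_COLS].to_numpy(dtype=float)
    # Matrix-vector products give the weighted sums for all rows at once
    pleasant = paqs @ np.cos(ANGLES) / (rho / 2 * np.abs(np.cos(ANGLES)).sum())
    eventful = paqs @ np.sin(ANGLES) / (rho / 2 * np.abs(np.sin(ANGLES)).sum())
    return (
        pd.Series(pleasant, index=df.index, name="ISOPleasant"),
        pd.Series(eventful, index=df.index, name="ISOEventful"),
    )


# Two made-up responses: one strongly pleasant, one entirely neutral
df = pd.DataFrame([[5, 5, 3, 1, 1, 1, 3, 5], [3] * 8], columns=PAQ_COLS)
iso_pl, iso_ev = iso_coords(df)
```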

PARAMETER DESCRIPTION
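angles

Angle, in degrees, of each PAQ scale (PAQ1 to PAQ8)

DEFAULT: (0, 45, 90, 135, 180, 225, 270, 315)

val_range

Range of the original PAQ responses; only its maximum and minimum are used

DEFAULT: (5, 1)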

results_df

Dataframe containing ISD formatted data

TYPE: DataFrame

RETURNS DESCRIPTION
tuple

ISOPleasant and ISOEventful coordinate values

Source code in soundscapy/utils/surveys.py
def calculate_iso_coords(
    results_df: pd.DataFrame,
    val_range: tuple = (5, 1),
    angles=(0, 45, 90, 135, 180, 225, 270, 315),
):
    """Calculates the projected ISOPleasant and ISOEventful coordinates

    If a value is missing, by default it is replaced with neutral (3).
    The raw PAQ values should be Likert data from 1 to 5 and the column
    names should match the PAQ_cols given above.

    Parameters
    ----------
    angles
    results_df : pd.DataFrame
        Dataframe containing ISD formatted data

    Returns
    -------
    tuple
        ISOPleasant and ISOEventful coordinate values
    """
    # TODO: Add if statements for too much missing data
    scale = max(val_range) - min(val_range)

    ISOPleasant = return_paqs(results_df, incl_ids=False).apply(
        lambda row: adj_iso_pl(row, angles, scale), axis=1
    )
    ISOEventful = return_paqs(results_df, incl_ids=False).apply(
        lambda row: adj_iso_ev(row, angles, scale), axis=1
    )

    return ISOPleasant, ISOEventful

calculate_polar_coords

calculate_polar_coords(results_df, scaling='iso')

Calculates the polar coordinates

Based on the calculation given in Gurtman and Pincus (2003), pg. 416.

The raw PAQ values should be Likert data from 1 to 5 and the column names should match the PAQ IDs (PAQ1, PAQ2, ..., PAQ8).

PARAMETER DESCRIPTION
results_df

Dataframe containing ISD formatted data

TYPE: DataFrame

scaling

The scaling to use for the polar coordinates, by default 'iso'

Options are 'iso', 'gurtman', and 'none'

For 'iso', the cartesian coordinates are scaled to (-1, +1) according to the basic method given in ISO12913.

For 'gurtman', the polar coordinates are scaled according to the method given in Gurtman and Pincus (2003), pg. 416.

For 'none', no scaling is applied.

TYPE: str DEFAULT: 'iso'
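
The private helper that performs the conversion is not shown on this page, so the sketch below assumes the textbook Cartesian-to-polar conversion, with the angle reported in degrees and wrapped to [0, 360). The name `to_polar` and the wrapping convention are assumptions. The 0.25 factor for 'gurtman' is taken from the source listing.

```python
import math


def to_polar(iso_pleasant, iso_eventful):
    """Return (r, theta) with theta in degrees, wrapped to [0, 360)."""
    r = math.hypot(iso_pleasant, iso_eventful)
    theta = math.degrees(math.atan2(iso_eventful, iso_pleasant)) % 360
    return r, theta


# A made-up point in the pleasant and eventful quadrant
r, theta = to_polar(0.5, 0.5)

# 'gurtman' scaling multiplies both axes by 0.25 before converting
r_gurtman, theta_gurtman = to_polar(0.5 * 0.25, 0.5 * 0.25)
```

Scaling both axes by the same factor changes the radius but leaves the angle unchanged.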

RETURNS DESCRIPTION
tuple

Polar coordinates

Source code in soundscapy/utils/surveys.py
def calculate_polar_coords(results_df: pd.DataFrame, scaling: str = "iso"):
    """Calculates the polar coordinates

    Based on the calculation given in Gurtman and Pincus (2003), pg. 416.

    The raw PAQ values should be Likert data from 1 to 5 and the column
    names should match the PAQ_cols given above.

    Parameters
    ----------
    results_df : pd.DataFrame
        Dataframe containing ISD formatted data
    scaling : str, optional
        The scaling to use for the polar coordinates, by default 'iso'

        Options are 'iso', 'gurtman', and 'none'

        For 'iso', the cartesian coordinates are scaled to (-1, +1) according to the basic
        method given in ISO12913.

        For 'gurtman', the polar coordinates are scaled according to the method given in
        Gurtman and Pincus (2003), pg. 416.

        For 'none', no scaling is applied.

    Returns
    -------
    tuple
        Polar coordinates
    """
    # raise error if scaling is not one of the options
    if scaling not in ["iso", "gurtman", "none"]:
        raise ValueError(
            f"Scaling must be one of 'iso', 'gurtman', or 'none', not {scaling}"
        )

    scale_to_one = True if scaling == "iso" else False
    isopl, isoev = calculate_iso_coords(results_df)

    if scaling == "gurtman":
        isopl = isopl * 0.25
        isoev = isoev * 0.25

    r, theta = _convert_to_polar_coords(isopl, isoev)
    return r, theta

convert_column_to_index

convert_column_to_index(df, col, drop=False)

Reassign an existing column as the dataframe index
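
For comparison, pandas' own `DataFrame.set_index` covers similar ground. The sketch below is an alternative, not the function's implementation: `set_index` returns a new frame, whereas the source listing assigns to `df.index` on the frame that was passed in. The column names and values are made up.

```python
import pandas as pd

df = pd.DataFrame({"RecordID": ["r1", "r2"], "PAQ1": [4, 2]})

# drop=False keeps the column as well as using it for the index,
# mirroring the default of convert_column_to_index
indexed = df.set_index("RecordID", drop=False)

# drop=True moves the column into the index
indexed_dropped = df.set_index("RecordID", drop=True)
```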

Source code in soundscapy/utils/surveys.py
def convert_column_to_index(df, col: str, drop=False):
    """Reassign an existing column as the dataframe index"""
    assert col in df.columns, f"col: {col} not found in dataframe"
    df.index = df[col]
    if drop:
        df = df.drop(col, axis=1)
    return df

likert_data_quality

likert_data_quality(df, verbose=0, allow_na=False, val_range=(1, 5))

Basic check of PAQ data quality

The likert_data_quality function takes a DataFrame and returns a list of indices that should be dropped from the DataFrame. The function checks for:

  • Rows in which all eight PAQ values are identical (indicating no usable PAQ data)

  • Rows with more than 4 NaN values (indicating missing PAQ data)

  • Rows where any value falls outside val_range, by default greater than 5 or less than 1 (indicating invalid PAQ data)
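
The three checks can be sketched in plain Python on lists of answers, with `None` standing in for a missing value. This is an illustrative sketch rather than the module's code: `flag_rows` is a hypothetical name, and treating any row of eight identical answers as uninformative is an assumption about the intent of the first check.

```python
def flag_rows(rows, val_range=(1, 5), max_missing=4):
    """Return positions of rows that fail any of the three checks."""
    lo, hi = min(val_range), max(val_range)
    flagged = []
    for pos, row in enumerate(rows):
        present = [v for v in row if v is not None]
        n_missing = len(row) - len(present)
        if n_missing > max_missing:
            flagged.append(pos)  # too many missing answers
        elif any(v < lo or v > hi for v in present):
            flagged.append(pos)  # value outside the valid range
        elif n_missing == 0 and len(set(present)) == 1:
            flagged.append(pos)  # every answer identical
    return flagged


rows = [
    [1, 1, 1, 1, 1, 1, 1, 1],                 # all identical
    [3, 4, 2, 5, 1, 2, 3, 4],                 # fine
    [None, None, None, None, None, 3, 3, 3],  # five missing
    [3, 4, 2, 6, 1, 2, 3, 4],                 # 6 is out of range
]
print(flag_rows(rows))  # [0, 2, 3]
```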

RETURNS DESCRIPTION
A list of positional row indices that need to be removed from the dataframe, or None if no rows are flagged
Source code in soundscapy/utils/surveys.py
def likert_data_quality(
    df: pd.DataFrame,
    verbose: int = 0,
    allow_na: bool = False,
    val_range: tuple = (1, 5),
) -> Union[List, None]:
    """Basic check of PAQ data quality

    The likert_data_quality function takes a DataFrame and returns a list of indices that
    should be dropped from the DataFrame. The function checks for:

    - Rows in which all eight PAQ values are identical (indicating no usable PAQ data)

    - Rows with more than 4 NaN values (indicating missing PAQ data)

    - Rows where any value is greater than 5 or less than 1 (indicating invalid PAQ data)

    Parameters
    ----------
        df: pd.DataFrame
            Specify the dataframe to be evaluated
        verbose: int, optional
            Determine whether or not the function should print out information about the data quality check, by default 0
        allow_na: bool
            Whether rows with missing values are acceptable. If False, any row containing
            a missing value is flagged for removal, by default False
        val_range: tuple, optional
            Set the range of values that are considered to be valid, by default (1, 5).

    Returns
    -------

        A list of indices that need to be removed from the dataframe

    """

    paqs = return_paqs(df, incl_ids=False)
    l = []
    for i in range(len(paqs)):
        row = paqs.iloc[i]
        if allow_na is False and row.isna().sum() > 0:
            l.append(i)
            continue
        if row["PAQ1"] == row["PAQ2"] == row["PAQ3"] == row["PAQ4"] == row[
            "PAQ5"
        ] == row["PAQ6"] == row["PAQ7"] == row["PAQ8"] and row.sum() != np.mean(
            val_range
        ):
            l.append(i)
        elif row.isna().sum() > 4:
            l.append(i)
        elif row.max() > max(val_range) or row.min() < min(val_range):
            l.append(i)
    if l:
        if verbose > 0:
            print(f"Identified {len(l)} samples to remove.\n{l}")
        return l
    if verbose > 0:
        print("PAQ quality confirmed. No rows dropped.")
    return None

mean_responses

mean_responses(df, group)

Calculate the mean responses for each PAQ

PARAMETER DESCRIPTION
df

Dataframe containing ISD formatted data

TYPE: DataFrame

group

Name of the column to group the responses by

TYPE: str

RETURNS DESCRIPTION
DataFrame

Dataframe containing the mean responses for each PAQ
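
The grouping step amounts to a pandas `groupby` followed by `mean`. A small sketch with made-up data, using only two PAQ columns to keep it short:

```python
import pandas as pd

# Made-up responses from two locations (only two PAQs shown for brevity)
df = pd.DataFrame(
    {
        "LocationID": ["ParkA", "ParkA", "StreetB"],
        "PAQ1": [4, 5, 2],
        "PAQ2": [3, 4, 3],
    }
)

means = df.groupby("LocationID")[["PAQ1", "PAQ2"]].mean()
print(means)
```

The result is indexed by the grouping column, with one row per group.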

Source code in soundscapy/utils/surveys.py
def mean_responses(df: pd.DataFrame, group: str) -> pd.DataFrame:
    """Calculate the mean responses for each PAQ

    Parameters
    ----------
    df : pd.DataFrame
        Dataframe containing ISD formatted data
    group : str
        Name of the column to group the responses by

    Returns
    -------
    pd.DataFrame
        Dataframe containing the mean responses for each PAQ
    """
    df = return_paqs(df, incl_ids=False, other_cols=[group])
    return df.groupby(group).mean()

rename_paqs

rename_paqs(df, paq_aliases=None, verbose=0)

The rename_paqs function renames the PAQ columns in a dataframe.

Soundscapy works with PAQ IDs (PAQ1, PAQ2, etc), so if you use labels such as pleasant, vibrant, etc. these will need to be renamed.

It takes as input a pandas DataFrame and returns the same DataFrame with renamed columns. If no arguments are passed, it will attempt to rename all of the PAQs based on their column names.
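
The renaming itself comes down to a `DataFrame.rename` call with a label-to-ID mapping. In the sketch below the label order is an assumption: it follows the circumplex in 45-degree steps starting from "pleasant". The data row is made up.

```python
import pandas as pd

# Assumed label order, following the circumplex in 45-degree steps from "pleasant"
LABELS = ["pleasant", "vibrant", "eventful", "chaotic",
          "annoying", "monotonous", "uneventful", "calm"]
PAQ_IDS = [f"PAQ{i}" for i in range(1, 9)]

df = pd.DataFrame([[4, 3, 2, 1, 1, 2, 3, 5]], columns=LABELS)

# Pair each label with its PAQ ID and rename in one call
renamed = df.rename(columns=dict(zip(LABELS, PAQ_IDS)))
```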

RETURNS DESCRIPTION
A pandas dataframe with the paq_ids column names
Source code in soundscapy/utils/surveys.py
def rename_paqs(
    df: pd.DataFrame, paq_aliases: Union[Tuple, Dict] = None, verbose: int = 0
) -> pd.DataFrame:
    """
    The rename_paqs function renames the PAQ columns in a dataframe.

    Soundscapy works with PAQ IDs (PAQ1, PAQ2, etc), so if you use labels such as pleasant, vibrant, etc. these will
    need to be renamed.

    It takes as input a pandas DataFrame and returns the same DataFrame with renamed columns.
    If no arguments are passed, it will attempt to rename all of the PAQs based on their column names.

    Parameters
    ----------
        df: pd.DataFrame
            Specify the dataframe to be renamed
        paq_aliases: tuple or dict, optional
            Specify which paqs are to be renamed, by default None.

            If None, will check if the column names are in our pre-defined options (i.e. pleasant, vibrant, etc).

            If a tuple is passed, the order of the tuple must match the order of the PAQs in the dataframe.

            If a dict is passed, it is used directly as the mapping from existing column names to PAQ IDs.
        verbose: int, optional
            Print out a message if the paqs are already correctly named, by default 0

    Returns
    -------

        A pandas dataframe with the paq_ids column names

    """
    if paq_aliases is None:
        if any(i in b for i in PAQ_IDS for b in df.columns):
            if verbose > 0:
                print("PAQs already correctly named.")
            return df
        if any(i in b for i in PAQ_NAMES for b in df.columns):
            paq_aliases = PAQ_NAMES

    # Accept a tuple as well as a list, as the docstring promises
    if isinstance(paq_aliases, (list, tuple)):
        return df.rename(
            columns={
                paq_aliases[0]: PAQ_IDS[0],
                paq_aliases[1]: PAQ_IDS[1],
                paq_aliases[2]: PAQ_IDS[2],
                paq_aliases[3]: PAQ_IDS[3],
                paq_aliases[4]: PAQ_IDS[4],
                paq_aliases[5]: PAQ_IDS[5],
                paq_aliases[6]: PAQ_IDS[6],
                paq_aliases[7]: PAQ_IDS[7],
            }
        )
    elif type(paq_aliases) == dict:
        return df.rename(columns=paq_aliases)

return_paqs

return_paqs(df, incl_ids=True, other_cols=None)

Return only the PAQ columns

PARAMETER DESCRIPTION
incl_ids

whether to include ID cols too (i.e. RecordID, GroupID, etc), by default True

TYPE: bool DEFAULT: True

other_cols

other columns to also include, by default None

TYPE: list DEFAULT: None

RETURNS DESCRIPTION
DataFrame

dataframe containing only the PAQ columns
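
The column selection can be sketched with ordinary pandas indexing. `ID_COLS` and the example row are assumptions for illustration; the ID column names are the ones listed in the source.

```python
import pandas as pd

PAQ_IDS = [f"PAQ{i}" for i in range(1, 9)]
ID_COLS = ["RecordID", "GroupID", "SessionID", "LocationID"]

df = pd.DataFrame([["r1", "ParkA", 4, 3, 2, 1, 1, 2, 3, 5, 55.2]],
                  columns=["RecordID", "LocationID", *PAQ_IDS, "Laeq"])

# Keep only the ID columns that are actually present, then the PAQs
present_ids = [c for c in ID_COLS if c in df.columns]
with_ids = df[present_ids + PAQ_IDS]

# PAQs plus one extra column, as with other_cols=["Laeq"]
paqs_plus_level = df[PAQ_IDS + ["Laeq"]]
```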

Source code in soundscapy/utils/surveys.py
def return_paqs(df, incl_ids=True, other_cols=None):
    """Return only the PAQ columns

    Parameters
    ----------
    incl_ids : bool, optional
        whether to include ID cols too (i.e. RecordID, GroupID, etc), by default True
    other_cols : list, optional
        other columns to also include, by default None

    Returns
    -------
    pd.DataFrame
        dataframe containing only the PAQ columns
    """
    cols = PAQ_IDS
    if incl_ids:
        id_cols = [
            name
            for name in ["RecordID", "GroupID", "SessionID", "LocationID"]
            if name in df.columns
        ]

        cols = id_cols + cols
    if other_cols:
        cols = cols + other_cols
    return df[cols]

simulation

simulation(n=3000, val_range=(1, 5), add_iso_coords=False, **coord_kwargs)

Generate random PAQ responses

The PAQ responses will follow a uniform random distribution for each PAQ, meaning e.g. for calm either 1, 2, 3, 4, or 5 is equally likely.
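
A minimal sketch of the same idea using NumPy's `Generator` API instead of the legacy `np.random.seed` and `np.random.randint` calls in the source listing. `simulate_paqs` and its `seed` parameter are hypothetical and exist only in this example.

```python
import numpy as np
import pandas as pd

PAQ_IDS = [f"PAQ{i}" for i in range(1, 9)]


def simulate_paqs(n=5, val_range=(1, 5), seed=0):
    """Draw n rows of independent, uniformly distributed integer PAQ responses."""
    rng = np.random.default_rng(seed)
    # `high` is exclusive in Generator.integers, hence the + 1
    values = rng.integers(min(val_range), max(val_range) + 1, size=(n, 8))
    return pd.DataFrame(values, columns=PAQ_IDS)


sim = simulate_paqs(n=1000)
```

Passing the seed explicitly keeps runs reproducible while still letting the caller ask for a different sample.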

PARAMETER DESCRIPTION
n

number of samples to simulate, by default 3000

TYPE: int DEFAULT: 3000

add_iso_coords

should we also calculate the ISO coordinates, by default False

TYPE: bool DEFAULT: False

RETURNS DESCRIPTION
DataFrame

dataframe of randomly generated PAQ responses

Source code in soundscapy/utils/surveys.py
def simulation(n=3000, val_range=(1, 5), add_iso_coords=False, **coord_kwargs):
    """Generate random PAQ responses

    The PAQ responses will follow a uniform random distribution
    for each PAQ, meaning e.g. for calm either 1, 2, 3, 4, or 5
    is equally likely.

    Parameters
    ----------
    n : int, optional
        number of samples to simulate, by default 3000
    val_range : tuple, optional
        range of values to draw responses from, by default (1, 5)
    add_iso_coords : bool, optional
        should we also calculate the ISO coordinates, by default False
    **coord_kwargs
        keyword arguments passed on to calculate_iso_coords

    Returns
    -------
    pd.DataFrame
        dataframe of randomly generated PAQ responses
    """
    # Fixed seed: every call with the same arguments returns identical data
    np.random.seed(42)
    df = pd.DataFrame(
        np.random.randint(min(val_range), max(val_range) + 1, size=(n, 8)),
        columns=PAQ_IDS,
    )
    if add_iso_coords:
        isopl, isoev = calculate_iso_coords(df, **coord_kwargs)
        df = df.assign(ISOPleasant=isopl, ISOEventful=isoev)
    return df

ssm_cosine_fit

ssm_cosine_fit(y, angles=(0, 45, 90, 135, 180, 225, 270, 315), bounds=([0, 0, 0, -np.inf], [np.inf, 360, np.inf, np.inf]))

Fit a cosine model to the data

PARAMETER DESCRIPTION
angles

List of angles

TYPE: list DEFAULT: (0, 45, 90, 135, 180, 225, 270, 315)

y

List of y values

TYPE: list

bounds

Bounds for the parameters

TYPE: tuple DEFAULT: ([0, 0, 0, -inf], [inf, 360, inf, inf])

RETURNS DESCRIPTION
tuple

(amp, delta, elev, dev, r2)
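
The source listing fits the model numerically with `optimize.curve_fit`. As a lighter-weight illustration, the sketch below swaps in the closed-form least-squares solution for a cosine of the form elev + amp * cos(theta - delta), which holds only when the angles are equally spaced around the full circle. `cosine_profile` is a hypothetical name, and it omits the `dev` term and `r2` returned by the module's function.

```python
import math

ANGLES = (0, 45, 90, 135, 180, 225, 270, 315)


def cosine_profile(y, angles=ANGLES):
    """Closed-form fit of y = elev + amp * cos(theta - delta) for equally spaced angles."""
    n = len(y)
    elev = sum(y) / n
    x_comp = 2 / n * sum(v * math.cos(math.radians(a)) for v, a in zip(y, angles))
    y_comp = 2 / n * sum(v * math.sin(math.radians(a)) for v, a in zip(y, angles))
    amp = math.hypot(x_comp, y_comp)
    delta = math.degrees(math.atan2(y_comp, x_comp)) % 360
    return amp, delta, elev


# Synthetic profile generated from known parameters, so the fit can be checked
true_amp, true_delta, true_elev = 1.0, 45.0, 3.0
y = [true_elev + true_amp * math.cos(math.radians(a - true_delta)) for a in ANGLES]
amp, delta, elev = cosine_profile(y)
```

Because the profile was generated from the model itself, the recovered parameters should match the true ones up to floating-point error.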

Source code in soundscapy/utils/surveys.py
def ssm_cosine_fit(
    y,
    angles=(0, 45, 90, 135, 180, 225, 270, 315),
    bounds=([0, 0, 0, -np.inf], [np.inf, 360, np.inf, np.inf]),
):
    """Fit a cosine model to the data

    Parameters
    ----------
    angles : list
        List of angles
    y : list
        List of y values
    bounds : tuple
        Bounds for the parameters

    Returns
    -------
    tuple
        (amp, delta, elev, dev, r2)
    """

    def form(theta, amp, delta, elev, dev):
        return elev + amp * np.cos(np.radians(theta - delta)) + dev

    param, covariance = optimize.curve_fit(
        form,
        xdata=angles,
        ydata=y,
        bounds=bounds,
    )
    r2 = _r2_score(y, form(angles, *param))
    amp, delta, elev, dev = param
    return amp, delta, elev, dev, r2

ssm_metrics

ssm_metrics(df, paq_cols=PAQ_IDS, method='cosine', val_range=(5, 1), scale_to_one=True, angles=(0, 45, 90, 135, 180, 225, 270, 315))

Calculate the SSM metrics for each response

PARAMETER DESCRIPTION
df

Dataframe containing ISD formatted data

TYPE: DataFrame

paq_cols

List of PAQ columns, by default PAQ_IDS

TYPE: list DEFAULT: PAQ_IDS

method

Method by which to calculate the SSM, by default 'cosine'. 'cosine' fits a cosine model to the data, using the Structural Summary Method developed by Gurtman (1994; Gurtman & Balakrishnan, 1998). 'polar' directly converts the ISO coordinates to polar coordinates.

TYPE: str DEFAULT: 'cosine'

RETURNS DESCRIPTION
DataFrame

Dataframe containing the SSM metrics

Source code in soundscapy/utils/surveys.py
def ssm_metrics(
    df: pd.DataFrame,
    paq_cols: list = PAQ_IDS,
    method: str = "cosine",
    val_range: tuple = (5, 1),
    scale_to_one: bool = True,
    angles: Tuple = (0, 45, 90, 135, 180, 225, 270, 315),
):
    """Calculate the SSM metrics for each response

    Parameters
    ----------
    df : pd.DataFrame
        Dataframe containing ISD formatted data
    paq_cols : list, optional
        List of PAQ columns, by default PAQ_IDS
    method : str, optional
        Method by which to calculate the SSM, by default 'cosine'
        'cosine' fits a cosine model to the data, using the Structural Summary Method developed
        by Gurtman (1994; Gurtman & Balakrishnan, 1998).
        'polar' directly converts the ISO coordinates to polar coordinates.

    Returns
    -------
    pd.DataFrame
        Dataframe containing the SSM metrics
    """
    # Check that the PAQ columns are present
    if not set(paq_cols).issubset(df.columns):
        raise ValueError("PAQ columns are not present in the dataframe.")

    # # Check that the PAQ values are within the range
    # if not _check_paq_range(df, paq_cols, val_range, verbose):
    #     raise ValueError("PAQ values are not within the specified range.")

    if method == "polar":
        # Calculate the coordinates
        vl, theta = calculate_polar_coords(df)

        mean = np.mean(df[paq_cols], axis=1)
        mean = mean / abs(max(val_range) - min(val_range)) if scale_to_one else mean

        # Calculate the SSM metrics
        df = df.assign(
            vl=vl,
            theta=theta,
            mean_level=mean,
        )
        return df

    elif method == "cosine":
        ssm_df = df[paq_cols].apply(
            lambda y: ssm_cosine_fit(y, angles=angles), axis=1, result_type="expand"
        )

        df = df.assign(
            amp=ssm_df.iloc[:, 0],
            delta=ssm_df.iloc[:, 1],
            elev=ssm_df.iloc[:, 2],
            dev=ssm_df.iloc[:, 3],
            r2=ssm_df.iloc[:, 4],
        )
        return df

    else:
        raise ValueError("Method must be either 'polar' or 'cosine'.")