How to analyse and represent soundscape perception

Andrew Mitchell, Francesco Aletta, Jian Kang

This notebook provides examples for analysing and visualising soundscape assessment data from the International Soundscape Database (ISD). The custom functions created for this purpose are stored in the isd.py file.

The ISD contains survey and acoustic data collected in urban public spaces, with the goal of creating a unified dataset for the development of a predictive soundscape model and a set of soundscape indices. We have created a new visualisation method to analyse and examine the assessments of the locations included in the dataset. This method focuses on enabling sophisticated statistical analyses and ensuring that the variety of responses in a location is properly considered.

In this notebook we will walk you through both using the code itself and interpreting the soundscape perception of urban spaces.

In [1]:

# Add soundscapy to the Python path
# import sys
# sys.path.append('../..')
# imports
%matplotlib inline
import soundscapy as sspy
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

import warnings
# warnings.simplefilter('ignore')

%config InlineBackend.figure_format = 'svg'

The current ISO 12913 framework

Although different methods are proposed for data collection in ISO 12913 Part 2, in the context of this study we focus on the questionnaire-based soundscape assessment (Method A), because it is underpinned by a theoretical relationship among the items of the questionnaire. The core of this questionnaire is the 8 perceptual attributes (PA) originally derived in Axelsson et al. (2010): pleasant, vibrant (or exciting), eventful, chaotic, annoying, monotonous, uneventful, and calm. In the questionnaire procedure, these PAs are assessed independently of each other. Conceptually, however, they are considered to form a two-dimensional circumplex with Pleasantness and Eventfulness on the x- and y-axis, respectively, where all regions of the space are equally likely to accommodate a given soundscape assessment. In Axelsson et al. (2010) a third primary dimension, Familiarity, was also found, but it accounted for only 8% of the variance and is typically disregarded as part of the standard circumplex. As will be made clear throughout, the circumplex model has several aspects which make it useful for representing the soundscape perception of a space as a whole.

Coordinate transformation

To facilitate the analysis of the perceptual attribute (PA) responses, the Likert scale responses are coded from 1 (Strongly disagree) to 5 (Strongly agree) as ordinal variables. In order to reduce the 8 PA values to a pair of coordinates which can be plotted on the Pleasant-Eventful axes, Part 3 of ISO 12913 provides a trigonometric transformation, based on the $45^\circ$ relationship between the diagonal axes and the pleasant and eventful axes. This transformation projects the coded values from the individual PAs down onto the primary Pleasantness and Eventfulness dimensions, then adds them together to form a single coordinate pair. In theory, this coordinate pair then condenses information from all 8 PA dimensions into 2 dimensions that are easier to understand and analyse.

The ISO coordinates are thus calculated by:

$$ ISOPleasant = [(pleasant - annoying) + \cos{45^\circ} \times (calm - chaotic) + \cos{45^\circ} \times (vibrant - monotonous)] \times \frac{1}{(4 + \sqrt{32})} $$

$$ ISOEventful = [(eventful - uneventful) + \cos{45^\circ} \times (chaotic - calm) + \cos{45^\circ} \times (vibrant - monotonous)] \times \frac{1}{(4 + \sqrt{32})} $$

where the PAs are arranged around the circumplex as shown in Figure 1. The $\cos{45^\circ}$ term projects the diagonal terms down onto the x and y axes, and the $\frac{1}{4 + \sqrt{32}}$ factor scales the resulting coordinates to the range $[-1, 1]$. The result of this transformation is demonstrated in Figure 1.
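
To see where the scaling factor comes from, consider the largest value the unscaled sum can take. The short derivation below is an illustrative sketch rather than text from the standard. It assumes each PA is coded from 1 to 5 as described above, so every difference between two opposing attributes lies between -4 and 4:

$$
\begin{aligned}
\max\left[(pleasant - annoying) + \cos{45^\circ} \times (calm - chaotic) + \cos{45^\circ} \times (vibrant - monotonous)\right]
&= 4 + \tfrac{\sqrt{2}}{2} \times 4 + \tfrac{\sqrt{2}}{2} \times 4 \\
&= 4 + 4\sqrt{2} = 4 + \sqrt{32} \approx 9.657
\end{aligned}
$$

The minimum is the negative of this value, and the same bound holds for the eventful sum, so dividing by $4 + \sqrt{32}$ keeps both coordinates between -1 and 1.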

To give an example of this, we create two example survey responses, with different PAQ answers:

In [2]:

sample_transform = {
    "RecordID": ["EX1", "EX2"],
    "pleasant": [4, 2],
    "vibrant": [4, 3],
    "eventful": [4, 5],
    "chaotic": [2, 5],
    "annoying": [1, 5],
    "monotonous": [3, 5],
    "uneventful": [3, 3],
    "calm": [4, 1],
}
sample_transform = pd.DataFrame.from_dict(sample_transform)
sample_transform = sspy.surveys.convert_column_to_index(sample_transform, "RecordID")
sample_transform

Out[2]:

	RecordID	pleasant	vibrant	eventful	chaotic	annoying	monotonous	uneventful	calm
RecordID
EX1	EX1	4	4	4	2	1	3	3	4
EX2	EX2	2	3	5	5	5	5	3	1

We can visualise how these individual PAQ answers are arranged on the circumplex with a radar plot.

In [3]:

fig = plt.figure(figsize=(4, 4))
plt.rcParams["figure.dpi"] = 350
sspy.plotting.likert.paq_radar_plot(sample_transform)

Out[3]:

<PolarAxes: >

Now, we can apply the transform formula from above to calculate the ISOPleasant and ISOEventful values and add them to the dataframe.

In [4]:

sample_transform = sspy.surveys.rename_paqs(sample_transform)
sample_transform = sspy.surveys.add_iso_coords(sample_transform)
sample_transform

Out[4]:

	RecordID	PAQ1	PAQ2	PAQ3	PAQ4	PAQ5	PAQ6	PAQ7	PAQ8	ISOPleasant	ISOEventful
RecordID
EX1	EX1	4	4	4	2	1	3	3	4	0.53033	0.030330
EX2	EX2	2	3	5	5	5	5	3	1	-0.75000	0.353553
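
As a cross-check, the values in the table above can be reproduced directly from the two formulas. The snippet below is a minimal sketch that does not use soundscapy. The helper name `iso_coords` and the dictionary layout are assumptions made for illustration, and the response values are copied from the two example records.

```python
import numpy as np

# Example responses copied from the table above (Likert codes 1-5).
responses = {
    "EX1": dict(pleasant=4, vibrant=4, eventful=4, chaotic=2,
                annoying=1, monotonous=3, uneventful=3, calm=4),
    "EX2": dict(pleasant=2, vibrant=3, eventful=5, chaotic=5,
                annoying=5, monotonous=5, uneventful=3, calm=1),
}

COS45 = np.cos(np.deg2rad(45))
SCALE = 4 + np.sqrt(32)


def iso_coords(r):
    """Hypothetical helper: apply the projection formulas to one response."""
    pleasant = ((r["pleasant"] - r["annoying"])
                + COS45 * (r["calm"] - r["chaotic"])
                + COS45 * (r["vibrant"] - r["monotonous"])) / SCALE
    eventful = ((r["eventful"] - r["uneventful"])
                + COS45 * (r["chaotic"] - r["calm"])
                + COS45 * (r["vibrant"] - r["monotonous"])) / SCALE
    return float(pleasant), float(eventful)


manual = {record: iso_coords(r) for record, r in responses.items()}
for record, (p, e) in manual.items():
    print(f"{record}: ISOPleasant={p:.5f}, ISOEventful={e:.5f}")
```

The printed coordinates should agree with the ISOPleasant and ISOEventful columns above to the displayed precision.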

Finally, we can plot these values on a two-dimensional plane to visualise how the transform went from the 8 dimensions shown in the radar plot to the two ISO dimensions. This is done by calling the sspy.plotting.scatter() function, which creates a plotting axis with the appropriate circumplex grid and labels, then plots the ISOPleasant and ISOEventful values as the x and y coordinates.

This treatment of the 8 PAs makes several assumptions and inferences about the relationships between the dimensions. As stated in the standard:

"According to the two-dimensional model, vibrant soundscapes are both pleasant and eventful, chaotic soundscapes are both eventful and unpleasant, monotonous soundscapes are both unpleasant and uneventful, and finally calm soundscapes are both uneventful and pleasant."

In [5]:

colors = ["b", "r"]
palette = sns.color_palette(colors)
sspy.plotting.scatter(
    sample_transform,
    hue="RecordID",
    legend="brief",
    s=100,
    palette=palette,
    alpha=0.45,
    diagonal_lines=True
)

Out[5]:

<Axes: title={'center': 'Soundscape Scatter Plot'}, xlabel='ISOPleasant', ylabel='ISOEventful'>

The way forward: Probabilistic soundscape representation

Given the identified issues with the recommended methods for statistical analysis and their shortcomings in representing the variety in perception of the soundscape in a space, how then should we discuss or present the results of these soundscape assessments? Ideally the method will: 1) take advantage of the circumplex coordinates and their ability to be displayed on a scatter plot and be treated as continuous variables, 2) scale from a dataset of 10 responses to thousands of responses, 3) facilitate the comparison of the soundscapes of different locations and conditions, and 4) encapsulate the nuances and diversity of soundscape perception by representing the distribution of responses.

We therefore present a series of visualisations of the soundscape assessments of several urban spaces included in the International Soundscape Database (ISD) which reflect these goals. The specific locations selected from the ISD are chosen for demonstration only, and these methods can be applied to any location. Rather than attempting to represent a single individual's soundscape, or describing a location's soundscape as a single average assessment (as in Part 3 of the ISO technical specification), this representation shows the whole range of perception of the users of the space.

To begin, we can load the dataset directly from the ISD:

In [6]:

ssid = sspy.isd.load()
ssid, excl = sspy.isd.validate(ssid, allow_paq_na=False)
ssid = sspy.isd.add_iso_coords(ssid)
ssid.head()

Renaming PAQ columns.
Checking PAQ data quality.
Identified 109 samples to remove.
[6, 9, 13, 30, 32, 46, 190, 213, 229, 244, 296, 412, 413, 428, 464, 485, 655, 734, 739, 762, 766, 780, 1067, 1274, 1290, 1316, 1320, 1338, 1346, 1347, 1397, 1425, 1431, 1446, 1447, 1470, 1485, 1491, 1504, 1505, 1510, 1512, 1517, 1522, 1523, 1527, 1599, 1698, 1734, 1817, 1911, 1948, 2069, 2107, 2109, 2111, 2150, 2199, 2277, 2293, 2384, 2386, 2490, 2523, 2584, 2592, 2695, 2762, 2767, 2783, 2789, 2825, 2826, 2832, 2840, 2856, 2859, 2879, 2883, 2889, 2910, 2932, 2956, 2969, 3031, 3058, 3077, 3124, 3149, 3163, 3185, 3202, 3210, 3211, 3212, 3213, 3214, 3215, 3216, 3272, 3302, 3365, 3414, 3491, 3502, 3510, 3517, 3533, 3583]

Out[6]:

	LocationID	SessionID	GroupID	RecordID	start_time	end_time	latitude	longitude	Language	Survey_Version	...	THD_THD_Max	THD_Min_Max	THD_Max_Max	THD_L5_Max	THD_L10_Max	THD_L50_Max	THD_L90_Max	THD_L95_Max	ISOPleasant	ISOEventful
0	CarloV	CarloV2	2CV12	1434	2019-05-16 18:46:00	2019-05-16 18:56:00	37.17685	-3.590392	eng	engISO2018	...	-0.09	-11.76	54.18	34.82	26.53	5.57	-9.0	-10.29	0.219670	-0.133883
1	CarloV	CarloV2	2CV12	1435	2019-05-16 18:46:00	2019-05-16 18:56:00	37.17685	-3.590392	eng	engISO2018	...	-0.09	-11.76	54.18	34.82	26.53	5.57	-9.0	-10.29	-0.426777	0.530330
2	CarloV	CarloV2	2CV13	1430	2019-05-16 19:02:00	2019-05-16 19:12:00	37.17685	-3.590392	eng	engISO2018	...	-2.10	-19.32	72.52	32.33	24.52	0.25	-16.3	-17.33	0.676777	-0.073223
3	CarloV	CarloV2	2CV13	1431	2019-05-16 19:02:00	2019-05-16 19:12:00	37.17685	-3.590392	eng	engISO2018	...	-2.10	-19.32	72.52	32.33	24.52	0.25	-16.3	-17.33	0.603553	-0.146447
4	CarloV	CarloV2	2CV13	1432	2019-05-16 19:02:00	2019-05-16 19:12:00	37.17685	-3.590392	eng	engISO2018	...	-2.10	-19.32	72.52	32.33	24.52	0.25	-16.3	-17.33	0.457107	-0.146447

5 rows × 144 columns

First, rather than calculating the median response to each PA in the location and then calculating the circumplex coordinates, the coordinates for each individual response are calculated. This results in a vector of ISOPleasant and ISOEventful values, which are continuous variables from -1 to +1. These can be analysed statistically by calculating summary statistics (mean, standard deviation, quintiles, etc.) and through regression modelling, which can often be simpler and more familiar than the recommended methods of analysing ordinal data. This also enables each individual's response to be placed within the pleasant-eventful space. All of the responses for a location can then be plotted, giving an overall scatter plot for a location, as demonstrated in Fig (a).

In [7]:

sspy.plotting.jointplot(
    sspy.isd.select_location_ids(ssid, "RussellSq"),
    title="(a) Example distribution of the soundscape perception of an urban park",
    diagonal_lines=True,
    # hue="LocationID",
    # legend=True,
    alpha=0.75,
)

Out[7]:

<seaborn.axisgrid.JointGrid at 0x2a5513f90>

Once these individual responses are plotted, we then overlay a heatmap of the bivariate distribution (with color maps for each decile) and marginal distribution plots. In this way, three primary characteristics of the soundscape perception can be seen:

1. The distribution across both pleasantness and eventfulness, including the central tendency, the dispersion, and any skewness in the response;
2. The general shape of the soundscape within the space - in this case Russell Sq is almost entirely in the pleasant half, but is split relatively evenly across the eventfulness space, meaning while it is perceived as generally pleasant, it is not strongly calm or vibrant;
3. The degree of agreement about the soundscape perception - there appears to be a relatively high agreement about the character of Russell Sq, as demonstrated by the compactness of the distribution, but this is not the case for every location, as will be shown later.

Fig (a) includes several in-depth visualisations of the distribution of soundscape assessments; however, the detail included can make further analysis difficult. In particular, a decile heatmap is so visually busy that, in our experience, it is not possible to plot more than one soundscape distribution at a time without the figure becoming cluttered. It can also make it difficult to grasp point 2, the general shape of the soundscape. To address this, the same soundscape can be represented by its 50th percentile contour, as demonstrated in Fig (b), where the shaded portion contains 50% of the responses. This simplified view of the distribution presents several advantages, as will be demonstrated in Figs (c) and (d), and takes inspiration from the recommendation in the ISO standard to use the median as a summary statistic.
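
One common way to obtain a contour of this kind is to fit a kernel density estimate and find the density level whose enclosed region accounts for half of the total density. The sketch below illustrates that idea with SciPy on synthetic data. It is an assumption about the general technique, not a description of how soundscapy draws its contours, and the function name `density_level_for_mass` is hypothetical.

```python
import numpy as np
from scipy.stats import gaussian_kde


def density_level_for_mass(x, y, mass=0.5, grid_size=100):
    """Find the KDE density level whose enclosed region holds `mass` of the density."""
    kde = gaussian_kde(np.vstack([x, y]))
    axis = np.linspace(-1, 1, grid_size)
    grid_x, grid_y = np.meshgrid(axis, axis)
    density = kde(np.vstack([grid_x.ravel(), grid_y.ravel()]))

    # Accumulate grid cells from densest to sparsest until `mass` is reached.
    ranked = np.sort(density)[::-1]
    cumulative = np.cumsum(ranked) / ranked.sum()
    level = ranked[np.searchsorted(cumulative, mass)]
    return level, density.reshape(grid_x.shape)


# Synthetic responses standing in for one location (not ISD data).
rng = np.random.default_rng(0)
iso_pleasant = np.clip(rng.normal(0.4, 0.2, 200), -1, 1)
iso_eventful = np.clip(rng.normal(0.0, 0.25, 200), -1, 1)

level, density = density_level_for_mass(iso_pleasant, iso_eventful)
enclosed = density[density >= level].sum() / density.sum()
print(f"Share of density inside the contour: {enclosed:.2f}")
```

Because the contour is derived from a smoothed density rather than by counting points, the share of actual responses falling inside it is only approximately 50%.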

When visualised this way, it is possible to identify outliers and responses which are the result of anomalous sound events. For instance, if a fleet of helicopters flies overhead during a survey session at a calm park, driving the participants to respond that the soundscape is highly chaotic, we would see a group of scatter points in the chaotic quadrant which sit obviously outside the general pattern of responses. In a simpler analysis method, these responses would be handled in one of two ways. Either they would be discarded entirely as outliers (or the surveys and soundwalks would be halted), ignoring what is in fact a significant impact on that location, its soundscape, and how useful it may be for the community; or they would be included in the statistical analysis, significantly affecting the central tendency and dispersion metrics (i.e. median and range) without consideration for the context. This is the situation shown in Fig (b), where there is clearly strong agreement that Regents Park Fields is highly pleasant and calm, but numerous responses assessed it as highly chaotic when a series of military helicopter flyovers drastically changed the sound environment of the space for nearly 20 minutes.

In [8]:

location = "RegentsParkFields"
fig, ax = plt.subplots(1,1, figsize=(7, 7))

sspy.plotting.density(
    sspy.isd.select_location_ids(ssid, location),
    title="(b) Median perception contour and scatter plot of individual assessments\n\n",
    ax=ax,
    hue="LocationID",
    legend=True,
    density_type="simple",
    palette="dark:gray",
)

Out[8]:

<Axes: title={'center': '(b) Median perception contour and scatter plot of individual assessments\n\n'}, xlabel='ISOPleasant', ylabel='ISOEventful'>

Fig (c) demonstrates how this simplified representation makes it possible to compare the soundscapes of several locations in a sophisticated way. The soundscape assessments of three urban spaces, Camden Town, Pancras Lock, and Russell Square, are shown overlaid on each other. We can see that Camden Town, a busy and crowded street corner with high levels of traffic noise and amplified music, is generally perceived as chaotic, but the median contour shape which characterises it also crosses over into the vibrant quadrant. We can also see that, for part of the sample, Russell Square and Pancras Lock are perceived as similarly pleasant, although some portion of the responses perceived Pancras Lock as somewhat chaotic and annoying. This kind of visualisation highlights the similarities between the soundscapes of the locations and identifies how they differ. From here, further investigation could identify what led those people to perceive the location as unpleasant, and which similarities between the soundscapes of Pancras Lock and Russell Square could be enhanced to increase the proportion of people perceiving Pancras Lock as pleasant.

In [9]:

fig, ax = plt.subplots(1,1, figsize=(7,7))

sspy.plotting.density(
    sspy.isd.select_location_ids(ssid, ['CamdenTown', 'RussellSq', 'PancrasLock']),
    title="(c) Comparison of the soundscapes of three urban spaces\n\n",
    ax=ax,
    hue="LocationID",
    alpha=0.5,
    palette="husl",
    legend=True,
    density_type="simple"
)

Out[9]:

<Axes: title={'center': '(c) Comparison of the soundscapes of three urban spaces\n\n'}, xlabel='ISOPleasant', ylabel='ISOEventful'>

In addition to analysing the distributions of the perceptual responses themselves, this method can also be combined with other acoustic, environmental, and contextual data. The final example, in Fig (d), demonstrates how this method can better reveal the complex relationships between acoustic features of the sound environment and the soundscape perception. The data in the ISD includes binaural audio recordings, approximately 30 seconds long, taken while each participant was responding to the soundscape survey, providing an indication of the exact sound environment they were exposed to. For Fig (d), the responses with an associated sound level (1,433 in the version of the dataset loaded here) have been split according to the analysis of these recordings, giving a set below 63 dB LAeq and a set above 63 dB. The bivariate distributions of these two conditions are then plotted.

In [10]:

ssid["dBLevel"] = pd.cut(
    ssid["LAeq_L(A)_Max"],
    bins=(0, 63, 150),
    labels=("Under 63dB", "Over 63dB"),
    precision=1,
)

ssid.head()

Out[10]:

	LocationID	SessionID	GroupID	RecordID	start_time	end_time	latitude	longitude	Language	Survey_Version	...	THD_Min_Max	THD_Max_Max	THD_L5_Max	THD_L10_Max	THD_L50_Max	THD_L90_Max	THD_L95_Max	ISOPleasant	ISOEventful	dBLevel
0	CarloV	CarloV2	2CV12	1434	2019-05-16 18:46:00	2019-05-16 18:56:00	37.17685	-3.590392	eng	engISO2018	...	-11.76	54.18	34.82	26.53	5.57	-9.0	-10.29	0.219670	-0.133883	Under 63dB
1	CarloV	CarloV2	2CV12	1435	2019-05-16 18:46:00	2019-05-16 18:56:00	37.17685	-3.590392	eng	engISO2018	...	-11.76	54.18	34.82	26.53	5.57	-9.0	-10.29	-0.426777	0.530330	Under 63dB
2	CarloV	CarloV2	2CV13	1430	2019-05-16 19:02:00	2019-05-16 19:12:00	37.17685	-3.590392	eng	engISO2018	...	-19.32	72.52	32.33	24.52	0.25	-16.3	-17.33	0.676777	-0.073223	Under 63dB
3	CarloV	CarloV2	2CV13	1431	2019-05-16 19:02:00	2019-05-16 19:12:00	37.17685	-3.590392	eng	engISO2018	...	-19.32	72.52	32.33	24.52	0.25	-16.3	-17.33	0.603553	-0.146447	Under 63dB
4	CarloV	CarloV2	2CV13	1432	2019-05-16 19:02:00	2019-05-16 19:12:00	37.17685	-3.590392	eng	engISO2018	...	-19.32	72.52	32.33	24.52	0.25	-16.3	-17.33	0.457107	-0.146447	Under 63dB

5 rows × 145 columns

In [11]:

ssid["dBLevel"].describe()

Out[11]:

count          1433
unique            2
top       Over 63dB
freq            743
Name: dBLevel, dtype: object

In [12]:

sspy.plotting.jointplot(
    ssid,
    marginal_kind="kde",
    title="(d) Soundscape perception as a function of sound level",
    diagonal_lines=False,
    alpha=0.5,
    palette="colorblind",
    hue="dBLevel",
    density_type="simple",
    incl_scatter=False,
    legend=True,  #
    marginal_kws={"common_norm": False, 'fill': True},
)

Out[12]:

<seaborn.axisgrid.JointGrid at 0x2a6cc0990>

Other examples

In addition to the visualisation demonstrations given above which were included in the JASA Express Letters article, we present a few examples of the uses of this distributional shape approach.

The soundscape shape of each location:

In [13]:

len(ssid.LocationID.unique())

Out[13]:

In [14]:

fig, axes = plt.subplots(7, 4, figsize=(12, 21))
for i, location in enumerate(ssid.LocationID.unique()):
    sspy.plotting.density(
        sspy.isd.select_location_ids(ssid, location_ids=location),
        ax = axes.flatten()[i], title=location
    )

plt.tight_layout()

A comparison of two days in the same location:

In [15]:

fig, ax = plt.subplots(1,1,figsize=(5,5))
location='RegentsParkFields'

sspy.plotting.density(
    sspy.isd.select_location_ids(ssid, location_ids=location),
    ax=ax,
    title="Comparison of two days in Regents Park",
    density_type="simple",
    fill=False,
    incl_outline=True,
    hue="SessionID",
    legend=True
)

Out[15]:

<Axes: title={'center': 'Comparison of two days in Regents Park'}, xlabel='ISOPleasant', ylabel='ISOEventful'>

All of the survey days in every location:

In [16]:

fig, axes = plt.subplots(7, 4, figsize=(12, 21))
for i, location in enumerate(ssid["LocationID"].unique()):
    sspy.plotting.density(
        sspy.isd.select_location_ids(ssid, location),
        ax=axes.flatten()[i],
        title=location,
        fill=False,
        hue="SessionID",
        legend=True,
        density_type="simple",
        incl_scatter=True
    )

plt.tight_layout()

Statistical analysis of the ISD dataset

Although the visualisations shown in the above figures are a powerful tool for viewing, analysing, and discussing the multi-dimensional aspects of soundscape perception, there are certainly cases where simpler metrics are needed to aid discussion and to set design goals. Taking inspiration from noise annoyance, we propose a move toward discussing the "percent of people likely to perceive" a soundscape as pleasant, vibrant, etc. when it is necessary to use numerical descriptions. In this way, a numerical design goal could also be set as e.g. "the soundscape should be likely to be perceived as pleasant by at least 75% of users" or the results of an intervention presented as e.g. "the likelihood of the soundscape being perceived as calm increased from 30% to 55%". These numbers can be drawn from either actual surveys or from the results of predictive models.

Finally, although acknowledging the distribution of responses is crucial, it is sometimes necessary to summarise locations down to a single point to compare many different locations and to easily investigate how the soundscape assessment has generally changed over time. For this purpose, the mean of the ISOPleasant and ISOEventful values across all respondents is calculated to result in a single coordinate point per location. This clearly mirrors the original intent of the coordinate transformation presented in the ISO, but by applying the transformation first to each individual assessment then calculating the mean value, it maintains a direct link to the distributions shown above.

We have included a function for creating a numerical summary of each location. For this, we first calculate the mean ISOPleasant and ISOEventful values for the location, giving a single coordinate to describe the location in the circumplex. We then calculate the percentage of overall responses falling in the pleasant or eventful halves, or in the vibrant, chaotic, monotonous, or calm quadrants.

In [17]:

sspy.isd.describe_location(ssid, "CamdenTown")

Out[17]:

{'count': 105,
 'ISOPleasant': -0.103,
 'ISOEventful': 0.364,
 'pleasant': 0.352,
 'eventful': 0.895,
 'vibrant': 0.295,
 'chaotic': 0.6,
 'monotonous': 0.038,
 'calm': 0.057}
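
To make the meaning of these proportions concrete, the sketch below computes the same kind of shares from a small synthetic table of coordinates. It is an illustration only: the helper name `quadrant_shares` is hypothetical, and the use of strict inequalities (so a point lying exactly on an axis is not counted in any half or quadrant) is an assumption, not a statement about how soundscapy treats boundary cases.

```python
import pandas as pd


def quadrant_shares(df, x="ISOPleasant", y="ISOEventful"):
    """Hypothetical helper: share of responses in each half and quadrant."""
    p, e = df[x], df[y]
    shares = {
        "pleasant": (p > 0).mean(),
        "eventful": (e > 0).mean(),
        "vibrant": ((p > 0) & (e > 0)).mean(),     # pleasant and eventful
        "chaotic": ((p < 0) & (e > 0)).mean(),     # unpleasant and eventful
        "monotonous": ((p < 0) & (e < 0)).mean(),  # unpleasant and uneventful
        "calm": ((p > 0) & (e < 0)).mean(),        # pleasant and uneventful
    }
    return {name: round(float(value), 3) for name, value in shares.items()}


# Synthetic coordinates, not ISD data.
demo = pd.DataFrame({
    "ISOPleasant": [0.5, 0.2, -0.3, -0.4, 0.6],
    "ISOEventful": [0.4, -0.1, 0.3, -0.2, 0.1],
})
shares = quadrant_shares(demo)
print(shares)
```

Under this definition the vibrant and calm shares add up to the pleasant share, and the vibrant and chaotic shares add up to the eventful share, which is also the pattern seen in the CamdenTown summary above.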

In [18]:

sspy.isd.soundscapy_describe(ssid)

Out[18]:

	count	ISOPleasant	ISOEventful	pleasant	eventful	vibrant	chaotic	monotonous	calm
CarloV	116	0.518	-0.012	0.940	0.448	0.397	0.052	0.009	0.526
SanMarco	95	0.224	0.377	0.737	0.874	0.642	0.221	0.021	0.095
PlazaBibRambla	18	0.463	-0.023	0.889	0.556	0.500	0.056	0.056	0.389
CamdenTown	105	-0.103	0.364	0.352	0.895	0.295	0.600	0.038	0.057
EustonTap	96	-0.220	0.198	0.250	0.771	0.177	0.594	0.146	0.062
Noorderplantsoen	97	0.427	0.315	0.897	0.825	0.722	0.103	0.000	0.155
MarchmontGarden	104	0.279	-0.037	0.721	0.375	0.231	0.144	0.125	0.481
MonumentoGaribaldi	32	0.421	-0.007	0.906	0.438	0.375	0.062	0.031	0.500
TateModern	152	0.381	0.211	0.901	0.730	0.645	0.086	0.007	0.243
PancrasLock	93	0.257	0.079	0.710	0.613	0.376	0.237	0.043	0.323
TorringtonSq	113	0.085	0.200	0.584	0.699	0.363	0.336	0.071	0.204
RegentsParkFields	118	0.501	-0.050	0.873	0.415	0.297	0.119	0.008	0.559
RegentsParkJapan	91	0.653	-0.006	0.956	0.527	0.505	0.022	0.022	0.418
RussellSq	145	0.482	0.057	0.910	0.545	0.490	0.055	0.034	0.393
StPaulsCross	66	0.358	0.139	0.848	0.652	0.545	0.106	0.030	0.258
StPaulsRow	72	0.234	0.122	0.750	0.639	0.486	0.153	0.083	0.250
CampoPrincipe	110	0.505	-0.076	0.918	0.382	0.327	0.055	0.027	0.582
MiradorSanNicolas	28	0.395	0.126	0.929	0.607	0.536	0.071	0.000	0.393
LianhuashanParkEntrance	60	0.281	0.054	0.850	0.483	0.400	0.083	0.067	0.417
LianhuashanParkForest	59	0.292	0.046	0.797	0.559	0.373	0.186	0.017	0.373
PingshanPark	69	0.237	0.013	0.812	0.449	0.304	0.145	0.029	0.464
PingshanStreet	99	-0.055	0.047	0.343	0.576	0.101	0.475	0.172	0.222
ZhongshanPark	442	0.126	0.055	0.656	0.545	0.355	0.190	0.131	0.267
OlympicSquare	330	0.338	0.025	0.885	0.488	0.409	0.079	0.030	0.452
ZhongshanSquare	416	0.217	-0.032	0.774	0.380	0.281	0.099	0.108	0.418
DadongSquare	354	0.294	-0.005	0.862	0.438	0.356	0.082	0.056	0.466

The standard describe() method in pandas can also still be used to calculate other summary statistics.

In [19]:

ssid.describe()

Out[19]:

	latitude	longitude	ssi01	ssi02	ssi03	ssi04	PAQ1	PAQ4	PAQ2	PAQ7	...	THD_THD_Max	THD_Min_Max	THD_Max_Max	THD_L5_Max	THD_L10_Max	THD_L50_Max	THD_L90_Max	THD_L95_Max	ISOPleasant	ISOEventful
count	1554.000000	1554.000000	3474.000000	3474.000000	3475.000000	3475.000000	3480.000000	3480.000000	3480.000000	3480.000000	...	1433.000000	1433.000000	1433.000000	1433.000000	1433.000000	1433.000000	1433.000000	1433.000000	3480.000000	3480.000000
mean	48.514777	0.286628	2.490213	2.545481	2.819281	2.480576	3.667241	2.616092	3.509195	2.868103	...	-2.639156	-10.418095	64.076148	36.833517	29.390363	8.217334	-7.193992	-8.363992	0.268778	0.064362
std	5.480102	3.827394	1.080434	1.098295	1.012090	1.136286	1.049872	1.140478	0.994626	1.046004	...	2.503211	6.627678	7.183133	6.847635	6.946812	7.731246	6.917431	6.747693	0.361666	0.275440
min	37.172598	-3.600068	1.000000	1.000000	1.000000	1.000000	1.000000	1.000000	1.000000	1.000000	...	-18.910000	-21.440000	44.630000	17.230000	12.640000	-14.600000	-19.450000	-19.860000	-1.000000	-0.823223
25%	45.433663	-0.146595	2.000000	2.000000	2.000000	2.000000	3.000000	2.000000	3.000000	2.000000	...	-3.550000	-15.310000	58.520000	32.330000	24.720000	2.770000	-12.320000	-13.400000	0.042893	-0.103553
50%	51.521683	-0.129658	2.000000	3.000000	3.000000	2.000000	4.000000	2.000000	4.000000	3.000000	...	-1.880000	-11.520000	64.740000	37.220000	29.580000	7.580000	-8.300000	-9.450000	0.323223	0.042893
75%	51.526900	-0.100162	3.000000	3.000000	4.000000	3.000000	4.000000	4.000000	4.000000	4.000000	...	-0.890000	-7.580000	69.610000	41.790000	34.290000	12.310000	-3.710000	-5.260000	0.517005	0.219670
max	51.539196	12.354908	5.000000	5.000000	5.000000	5.000000	5.000000	5.000000	5.000000	5.000000	...	-0.050000	21.500000	89.650000	58.340000	54.990000	38.550000	24.340000	22.740000	1.000000	1.000000

8 rows × 125 columns
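
If summary statistics are wanted per location rather than for the whole dataset, a pandas groupby is one option. The sketch below uses a small made-up dataframe so that it runs on its own. With the real data, `demo` would be replaced by `ssid`, assuming the `LocationID`, `ISOPleasant`, and `ISOEventful` columns shown earlier.

```python
import pandas as pd

# Synthetic stand-in for the ISD dataframe; the values are made up.
demo = pd.DataFrame({
    "LocationID": ["A", "A", "A", "B", "B", "B"],
    "ISOPleasant": [0.5, 0.3, 0.4, -0.2, 0.0, 0.2],
    "ISOEventful": [0.1, -0.1, 0.0, 0.4, 0.6, 0.5],
})

# One row per location, with mean, standard deviation and median per coordinate.
summary = (
    demo.groupby("LocationID")[["ISOPleasant", "ISOEventful"]]
    .agg(["mean", "std", "median"])
)
print(summary.round(3))
```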

Finally, the mean coordinate values for each location can be plotted in the soundscape circumplex.

In [20]:

means = sspy.surveys.mean_responses(ssid, group="LocationID")
means = sspy.surveys.add_iso_coords(means, overwrite=True)
sspy.plotting.scatter(
    means,
    hue="LocationID",
    s=40,
    legend=False,
    xlim=(-0.25, 0.75),
    ylim=(-0.25, 0.75)
)

# means.sspy.scatter(hue="LocationID", s=40, legend=False, xlim=(-0.25, 0.75), ylim=(-0.25, 0.75))

Out[20]:

<Axes: title={'center': 'Soundscape Scatter Plot'}, xlabel='ISOPleasant', ylabel='ISOEventful'>
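
The cell above averages the PAQ responses per location and then applies the transformation, whereas the text describes transforming each response first and then averaging. Because the transformation is a linear combination of the PAQ values, the two orders give the same coordinates when every response has all 8 PAQs. The sketch below checks this numerically on synthetic Likert data. The helper `iso_coords` and the column ordering are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic Likert responses (1-5) for 50 respondents; columns ordered as
# pleasant, vibrant, eventful, chaotic, annoying, monotonous, uneventful, calm.
paqs = rng.integers(1, 6, size=(50, 8)).astype(float)

COS45 = np.cos(np.deg2rad(45))
SCALE = 4 + np.sqrt(32)


def iso_coords(values):
    """Apply the projection along the last axis; works for one response or many."""
    pl, vib, ev, ch, an, mono, unev, calm = np.moveaxis(values, -1, 0)
    pleasant = ((pl - an) + COS45 * (calm - ch) + COS45 * (vib - mono)) / SCALE
    eventful = ((ev - unev) + COS45 * (ch - calm) + COS45 * (vib - mono)) / SCALE
    return np.stack([pleasant, eventful], axis=-1)


mean_of_coords = iso_coords(paqs).mean(axis=0)  # transform each response, then average
coords_of_mean = iso_coords(paqs.mean(axis=0))  # average each PAQ, then transform
print(np.allclose(mean_of_coords, coords_of_mean))
```

If some responses had missing PAQ values, the per-column means would be taken over different sets of respondents and the two orders could diverge.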
