Date of Award

2023

Degree Type

Open Access Dissertation

Degree Name

Computational Science Joint PhD with San Diego State University, PhD

Program

Institute of Mathematical Sciences

Advisor/Supervisor/Committee Chair

Jérôme Gilles & Henry Schellhorn

Dissertation or Thesis Committee Member

Allon G. Percus

Dissertation or Thesis Committee Member

Peter Blomgren

Dissertation or Thesis Committee Member

Arjuna Flenner

Terms of Use & License Information

This work is licensed under a Creative Commons Attribution 4.0 License.

Rights Information

© 2023 Justin Y Sunu

Keywords

audio classification, graph-based algorithm, machine learning

Subject Categories

Applied Mathematics

Abstract

The rapid growth of audio data collection in various domains necessitates advanced techniques for efficient analysis and classification. This dissertation proposes new approaches for categorizing acoustic data, using both unsupervised and semi-supervised learning methods. Starting with raw audio, we preprocess the signal to segment it into time windows, each of which we consider as an independent data point. We use the short-time Fourier transform to describe the signal in a given time window as a set of Fourier coefficients. We interpret the resulting frequency signature as a high-dimensional feature description of each data point. We then develop a graph-based approach for analyzing these signals, representing the data using a similarity graph. Following methods used successfully in image processing and problems on networks, we apply a spectral embedding to project the high-dimensional graph data onto a low-dimensional subspace. We show how the Nyström extension can accelerate the calculation of the eigenvectors of the graph Laplacian, and how to adapt the method to accommodate streaming data. Using the low-dimensional representation of the audio signal, we consider several clustering methods for categorizing the data. We compare results of the conventional spectral clustering algorithm, which applies k-means to the eigenvectors of the Laplacian, with a semi-supervised implementation of k-nearest neighbors on these eigenvectors. We also use an incremental reseeding algorithm that diffuses cluster labels across a graph, showing how its output can construct a novel reduced-dimensionality representation of the data. Based on this, we propose a semi-supervised extension of the method. Finally, we evaluate the effectiveness of our methodology on problems of classifying vehicles based on roadside microphone recordings and of classifying songs according to musical genre.
We demonstrate the effects of spectral embedding on these problems, as well as the relative performance of both our unsupervised and semi-supervised algorithms. These results suggest that, even with little or no training data, graph-based methods can provide a powerful tool for acoustic analysis and for machine learning from acoustic signals.
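
As an illustration only, the unsupervised baseline described in the abstract (time windows, short-time Fourier features, similarity graph, spectral embedding, k-means on the Laplacian eigenvectors) might be sketched as follows. This is a minimal sketch under stated assumptions, not the dissertation's implementation: the function names, window length, hop size, Gaussian kernel with a mean-distance bandwidth, symmetric normalized Laplacian, farthest-point k-means initialization, and the synthetic two-tone signal are all hypothetical choices made for the example. It computes the full eigendecomposition directly and does not use the Nyström extension.

```python
import numpy as np

def stft_features(signal, win_len=256, hop=128):
    """Return one magnitude spectrum per overlapping, Hann-weighted time window."""
    n_frames = 1 + (len(signal) - win_len) // hop
    window = np.hanning(win_len)
    frames = np.stack([signal[i * hop:i * hop + win_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

def spectral_embedding(features, n_vecs=2):
    """Embed the windows with the lowest eigenvectors of a graph Laplacian."""
    sq = np.sum(features ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * features @ features.T, 0.0)
    # Gaussian similarity; the bandwidth (mean squared distance) is an assumed heuristic.
    weights = np.exp(-d2 / (2.0 * d2.mean()))
    np.fill_diagonal(weights, 0.0)
    inv_sqrt_deg = 1.0 / np.sqrt(weights.sum(axis=1))
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}.
    laplacian = (np.eye(len(weights))
                 - inv_sqrt_deg[:, None] * weights * inv_sqrt_deg[None, :])
    _, vecs = np.linalg.eigh(laplacian)  # eigenvalues are returned in ascending order
    return vecs[:, :n_vecs]

def kmeans(points, k, n_iter=50):
    """Plain Lloyd iterations with a deterministic farthest-point initialization."""
    centers = [points[0]]
    for _ in range(1, k):
        dist = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(dist)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave a center unchanged if its cluster is empty
                centers[j] = points[labels == j].mean(axis=0)
    return labels

# Synthetic stand-in for a recording: 0.5 s of a 440 Hz tone, then 0.5 s of a 2000 Hz tone.
fs = 8000
t = np.arange(fs // 2) / fs
signal = np.concatenate([np.sin(2 * np.pi * 440 * t), np.sin(2 * np.pi * 2000 * t)])

features = stft_features(signal)                     # one row per time window
labels = kmeans(spectral_embedding(features), k=2)   # one cluster label per window
```

In this toy setting, windows that lie entirely within one tone should receive the same label, while the few windows straddling the transition may fall in either cluster. A semi-supervised variant along the lines the abstract mentions would replace the k-means step with a nearest-neighbor vote over a small set of labeled windows in the same embedding.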

ISBN

9798381904406
