
deep-listening

Deep Learning experiments for audio classification

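License: Readme

Language: Python

Category: Neural Networks/Artificial Intelligence, Machine Learning/Deep Learning

Software type: Open source

Region: Unknown

Submitted by: 慕容念

Operating system: Cross-platform

Open source organization: None

Target audience: Unknown

Software Overview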

deep-listening

Deep learning experiments for audio classification

A full write-up, including technical explanations and design decisions, as well as a summary of results achieved, can be found within the associated Project Report.

This project consists of several Jupyter notebooks that implement deep learning audio classifiers.

1-us8k-ffn-extract-explore.ipynb

This notebook contains code to extract features from and visualise audio files in the UrbanSound8K data set.
The feature extraction process uses audio processing metrics from the librosa library, which reduce each recording to 193 data points.
Because the audio information is highly abstracted (successive frames cannot be processed using a receptive field), these features are intended to be fed into a feed-forward neural network (FFN).
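
As a rough illustration of how a recording can collapse to 193 numbers, the sketch below averages several per-frame feature matrices over time and concatenates the means. The breakdown used here (40 MFCCs, 12 chroma bins, 128 mel bands, 7 spectral-contrast bands, 6 tonnetz dimensions) is an assumption that happens to sum to 193; it is not confirmed by this README. The matrices are random stand-ins rather than real librosa output, and `summarise_clip` is a hypothetical helper name.

```python
import numpy as np

# Assumed per-frame feature sizes (not taken from the notebook):
# 40 MFCCs + 12 chroma + 128 mel bands + 7 spectral contrast + 6 tonnetz = 193.
FEATURE_ROWS = {"mfcc": 40, "chroma": 12, "mel": 128, "contrast": 7, "tonnetz": 6}


def summarise_clip(frame_features):
    """Average each (n_rows, n_frames) matrix over time and concatenate the means."""
    means = [np.mean(frame_features[name], axis=1) for name in FEATURE_ROWS]
    return np.concatenate(means)


# Stand-in data: random matrices in place of real audio features, 173 frames per clip.
rng = np.random.default_rng(0)
fake_clip = {name: rng.random((rows, 173)) for name, rows in FEATURE_ROWS.items()}

vector = summarise_clip(fake_clip)
print(vector.shape)  # (193,)
```

Averaging over the time axis discards frame order, which is why a vector like this suits an FFN rather than a network that slides a receptive field across successive frames.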

2-us8k-ffn-train-predict.ipynb

This notebook contains the code to load previously extracted features and feed them into a 3-layer FFN, implemented using TensorFlow and Keras.
It also includes code to evaluate model performance and to generate predictions from individual samples, demonstrating how a trained model would be used to identify the nature of live recordings.
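
To make the shape of such a network concrete, here is a minimal NumPy sketch of a forward pass through three dense layers, mapping a 193-value feature vector to probabilities over the 10 UrbanSound8K classes. It is a stand-in for illustration only: the notebook itself uses TensorFlow and Keras. The hidden-layer sizes, the ReLU activations, and the helper names `init_params` and `predict_proba` are all assumptions, and the weights are random rather than trained.

```python
import numpy as np

N_FEATURES = 193     # per-clip feature vector length, as described for notebook 1
N_CLASSES = 10       # UrbanSound8K has 10 sound classes
HIDDEN = (256, 128)  # hypothetical hidden-layer sizes


def init_params(rng):
    """Create random (weights, bias) pairs for the three dense layers."""
    sizes = (N_FEATURES, *HIDDEN, N_CLASSES)
    return [(rng.normal(0.0, 0.1, (n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]


def predict_proba(params, x):
    """Forward pass: ReLU on the hidden layers, softmax on the output layer."""
    h = x
    for w, b in params[:-1]:
        h = np.maximum(h @ w + b, 0.0)
    w, b = params[-1]
    logits = h @ w + b
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=-1, keepdims=True)


rng = np.random.default_rng(0)
params = init_params(rng)              # untrained weights, for shape checking only
sample = rng.random((1, N_FEATURES))   # one stand-in feature vector
probs = predict_proba(params, sample)
predicted_class = int(np.argmax(probs, axis=1)[0])
print(probs.shape)  # (1, 10)
```

With a trained model, the index in `predicted_class` would be mapped back to a class label to describe a single recording.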

3-us8k-cnn-extract-train.ipynb

This notebook extracts audio features suitable for input into a classic 2-layer Convolutional Neural Network (CNN).
Much more of the audio data is preserved in this approach. As the saved NumPy feature data is over 2GB, I haven't included it with this repository, but you can use the code in this notebook to extract it from the original UrbanSound8K data set.

4-us8k-cnn-salamon.ipynb

This notebook implements an alternative CNN, similar to one described by Salamon and Bello.

5-ffbird-cnn.ipynb

This notebook uses the Salamon and Bello CNN to process the FreeField1010 data set of field recordings, with the goal of recognising the presence of birdsong.
The data set is not part of this repository, so if you want to run this code you'll need to download the data yourself (see instructions in the notebook).

7-us8k-rnn-extract-train.ipynb

This notebook uses a Recurrent Neural Network (RNN) to classify Mel-frequency cepstral coefficient (MFCC) features.
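
Unlike the averaged feature vectors used for the FFN, an RNN consumes MFCC frames in order, so each clip is a sequence of shape (time steps, coefficients). Clips differ in length, and one common way to batch them is to pad or truncate every sequence to a fixed number of steps. The sketch below shows that step only; whether the notebook does this, the choice of 20 coefficients and 173 steps, and the helper name `pad_or_truncate` are all assumptions. The input data is random rather than real MFCCs.

```python
import numpy as np


def pad_or_truncate(mfcc, n_steps):
    """Return an (n_steps, n_mfcc) array: zero-padded or cut at the end."""
    n_frames, n_mfcc = mfcc.shape
    out = np.zeros((n_steps, n_mfcc), dtype=mfcc.dtype)
    keep = min(n_frames, n_steps)
    out[:keep] = mfcc[:keep]
    return out


# Stand-in MFCC sequences of different lengths, 20 coefficients per frame.
rng = np.random.default_rng(0)
clips = [rng.random((n_frames, 20)) for n_frames in (50, 173, 300)]

# Stack into one batch of shape (clips, time steps, coefficients).
batch = np.stack([pad_or_truncate(clip, 173) for clip in clips])
print(batch.shape)  # (3, 173, 20)
```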

Do get in touch if you've any questions (me @ jaroncollis . com).


