Identifying a speaker's voice from an audio clip at a mass gathering

imkiran.timsina

Jan 29, 2017
I've become interested in a personal project with the following features:
* The audio clip will be in mp3 format
* The audio will contain different people speaking at a mass gathering
* The system should strip out a particular person's voice
* The system will be provided with some sample voice clips of each person
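
As a rough illustration of the last feature, matching a clip against the sample clips might be sketched as below. This is only a sketch under stated assumptions, not a working solution: the audio is assumed to be already decoded from mp3 into a plain list of samples, the "voice print" is just an average STFT magnitude spectrum, and the names `voice_print`, `identify`, and `tone` are hypothetical helpers invented for this example. The synthetic sine tones stand in for real recordings. It covers only matching, not separating overlapping voices.

```python
import cmath
import math

def frame_spectrum(frame):
    """Magnitude spectrum of one Hann-windowed frame via a naive DFT."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    return [abs(sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]

def voice_print(samples, frame_len=64, hop=32):
    """Average magnitude spectrum over all frames (a crude STFT summary)."""
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, hop)]
    spectra = [frame_spectrum(f) for f in frames]
    return [sum(col) / len(spectra) for col in zip(*spectra)]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def identify(clip, enrolled):
    """Return the enrolled name whose voice print is closest to the clip's."""
    target = voice_print(clip)
    return max(enrolled, key=lambda name: cosine(target, enrolled[name]))

def tone(freq, rate=8000, n=512):
    """Synthetic stand-in for a recorded voice sample (hypothetical data)."""
    return [math.sin(2 * math.pi * freq * t / rate) for t in range(n)]

# Enrol one sample "clip" per speaker, then match an unseen clip.
enrolled = {"alice": voice_print(tone(300)), "bob": voice_print(tone(1200))}
print(identify(tone(320), enrolled))
```

The naive DFT is used only to keep the sketch dependency-free; a real project would use an FFT routine from whichever library it settles on, and a silent clip would need guarding because `cosine` divides by the vector norms.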

I understand the basics of digital signal processing (FT, FFT, DFT, STFT, etc.). The subject is a little intimidating, so I'd like to know specifically:
* Which theory should I focus on most deeply for this case? Any reference documents would be appreciated.
* Which tools should I use for this particular case? I'm thinking of doing this with the NAudio library for .NET, but I'm willing to change my programming platform if better libraries exist elsewhere.
 
