Determination of Likelihood Ratios for Forensic Voice Comparison Using Principal Component Analysis
Issued Date: 26 Jun 2014
Abstract
The likelihood ratio (LR) framework is gaining acceptance among forensic speech scientists undertaking forensic voice comparison. Multivariate Kernel Density (MVKD) is one approach that has been used to calculate LRs when only three or four parameters are involved. However, this approach may suffer from robustness issues when the number of parameters is larger. In this paper we present an alternative to MVKD, termed Principal Component Analysis Kernel Density Likelihood Ratio (PCAKLR), which takes account of within-segment correlations yet is computationally robust irrespective of the number of parameters used. We show that PCAKLR produces results comparable to MVKD for small numbers of parameters. Further, it can directly handle between-segment correlations and is thus an alternative to the logistic-regression fusion typically used to combine results from multiple segments.
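The general idea of combining PCA with kernel density estimation can be illustrated with a minimal sketch. It is not the authors' PCAKLR implementation, whose details the abstract does not give. The sketch assumes that features are rotated onto the principal axes of a background population so that they are decorrelated, that a univariate Gaussian kernel density is then estimated per component, and that the per-component log LRs are summed as if the components were independent. The function names, the Silverman bandwidth rule and the use of the offender sample mean are all illustrative assumptions.

```python
import numpy as np

def fit_pca(background):
    """Return the mean and principal axes (as columns) of the background data."""
    mean = background.mean(axis=0)
    # SVD of the centred data; the rows of vt are the principal axes.
    _, _, vt = np.linalg.svd(background - mean, full_matrices=False)
    return mean, vt.T

def kde_1d(samples, x, bandwidth):
    """Univariate Gaussian kernel density estimate of `samples`, evaluated at x."""
    z = (x - samples) / bandwidth
    return np.mean(np.exp(-0.5 * z ** 2)) / (bandwidth * np.sqrt(2 * np.pi))

def silverman(samples):
    """Rule-of-thumb bandwidth (an assumed choice, not taken from the paper)."""
    return 1.06 * samples.std(ddof=1) * len(samples) ** (-0.2)

def log10_lr(suspect, offender, background):
    """Sum of per-component log10 LRs after rotating onto the principal axes."""
    mean, axes = fit_pca(background)
    s = (suspect - mean) @ axes                   # suspect tokens, decorrelated
    o = ((offender - mean) @ axes).mean(axis=0)   # offender sample mean
    b = (background - mean) @ axes                # background tokens
    tiny = np.finfo(float).tiny                   # floor to avoid log10(0)
    total = 0.0
    for k in range(axes.shape[1]):
        # Numerator: how typical the offender value is of the suspect.
        num = kde_1d(s[:, k], o[k], silverman(s[:, k]))
        # Denominator: how typical it is of the background population.
        den = kde_1d(b[:, k], o[k], silverman(b[:, k]))
        total += np.log10(max(num, tiny)) - np.log10(max(den, tiny))
    return total
```

Because each component is handled by a one-dimensional density, no covariance matrix has to be inverted, which is one plausible reason such a scheme stays numerically stable as the number of parameters grows.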