International Journal of Speech Language and the Law, Vol 21, No 1 (2014)

Determination of Likelihood Ratios for Forensic Voice Comparison Using Principal Component Analysis

Balamurali Nair, Esam Alzqhoul, Bernard John Guillemin
Issued Date: 26 Jun 2014

Abstract


The likelihood ratio (LR) framework is gaining acceptance among forensic speech scientists undertaking forensic voice comparison. Multivariate Kernel Density (MVKD) is one approach that has been used to calculate LRs when the number of parameters is around three or four. However, this approach may suffer from robustness problems when the number of parameters is larger. In this paper we present an alternative to MVKD, termed Principal Component Analysis Kernel Density Likelihood Ratio (PCAKLR), which takes account of within-segment correlations yet is computationally robust irrespective of the number of parameters used. We show that PCAKLR produces results comparable to those of MVKD for small numbers of parameters. Further, it can directly handle between-segment correlations and is thus an alternative to the logistic-regression fusion typically used to combine results from multiple segments.
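The abstract does not give the PCAKLR algorithm, so the following is only a rough sketch of the general idea it names: decorrelate the parameters with principal component analysis, then form a kernel-density LR on each component and combine them. Everything specific here is an assumption made for illustration and is not taken from the paper: the restriction to two parameters (so the PCA rotation has a closed form), the fixed bandwidth of 0.5, evaluating the densities at the mean of the questioned tokens, summing log LRs as if the decorrelated components were independent, and the hypothetical names `gaussian_kde`, `pca_rotation_2d`, `project` and `log10_lr`.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function estimating density at x with Gaussian kernels."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        total = sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        return max(norm * total, 1e-300)  # floor avoids log10(0)

    return density


def pca_rotation_2d(points):
    """Return (mean, angle) of the principal axes of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    # Closed-form angle that diagonalises a 2x2 covariance matrix.
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (mx, my), angle


def project(points, mean, angle):
    """Centre the points and rotate them onto the principal axes."""
    c, s = math.cos(angle), math.sin(angle)
    return [((x - mean[0]) * c + (y - mean[1]) * s,
             -(x - mean[0]) * s + (y - mean[1]) * c) for x, y in points]


def log10_lr(suspect, questioned, background, bandwidth=0.5):
    """Sum per-component log10 LRs (assumed independent after PCA)."""
    mean, angle = pca_rotation_2d(background)
    s_pc = project(suspect, mean, angle)
    q_pc = project(questioned, mean, angle)
    b_pc = project(background, mean, angle)
    total = 0.0
    for k in range(2):
        # Assumption: summarise the questioned sample by its mean on each component.
        q_mean = sum(p[k] for p in q_pc) / len(q_pc)
        same = gaussian_kde([p[k] for p in s_pc], bandwidth)   # same-speaker model
        diff = gaussian_kde([p[k] for p in b_pc], bandwidth)   # population model
        total += math.log10(same(q_mean)) - math.log10(diff(q_mean))
    return total
```

Under these assumptions, a positive return value lends support to the same-speaker hypothesis and a negative value to the different-speaker hypothesis. A realistic system would handle more than two parameters with a general eigendecomposition and would choose the bandwidth from the data.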

DOI: 10.1558/ijsll.v21i1.83

