International Journal of Speech Language and the Law, Vol 22, No 1 (2015)


doi: 10.1558/ijsll.v22i1.17880

Impact of mobile phone usage on speech spectral features: some preliminary findings

Slobodan T. Jovičić, Nikola Jovanović, Miško Subotić and Đorđe Grozdić

Abstract

The manner of using a mobile phone in voice communications can significantly affect the spectral characteristics of the speech signal. This article presents preliminary results of the analysis of the long-term average speech spectrum (LTASS), the long-term formant distribution (LTF) in voiced sounds, and the vowel formants F1, F2 and F3 of six speakers in five modes of mobile phone usage. These modes are: normal holding of a mobile phone (NOR), with a bonbon (sweet) in the mouth (BON), with a cigarette between the lips (CIG), with the mobile phone between cheek and shoulder (SHO) and with the hand covering the mobile phone and mouth (HAN). The results show that each mode has an impact on spectral features and that the HAN and SHO modes have the greatest impact. The most striking results are the relative displacements of formant F1, which can reach 30% (e.g. vowel /a/ in HAN mode for males), of formant F2, nearly 15% (vowel /i/ in SHO mode for males), and of formant F3, about 5% (vowel /u/ in CIG mode for females). These findings suggest that forensic practitioners should exercise caution in interpreting formant measurements in speaker identification cases involving mobile phone transmission.

Introduction

The mobile phone as a personal device is now available to everyone and at almost every point on the globe. This means that a person can communicate orally in a vast array of situations, anytime and anywhere. It also means that the mobile phone is used in very different ways and conditions, which can significantly affect the quality of verbal communication (Guillemin and Watson 2008; Shewmaker, Hapner, Gilman, Klein and Johns 2010). The mobile phone is designed as a handheld device, but very often it is held between shoulder, cheek and ear, held in the hand but away from the speaker’s mouth, placed next to the speaker in speakerphone mode, and so on. In some situations a speaker does not want to be seen or heard talking and covers the mobile phone (and mouth) with a hand (Ito, Takeda and Itakura 2005). These various positions and distances between the talker’s mouth and the phone’s microphone affect the quality of the speech signal, especially in the spectral domain, which is important in forensic speaker identification (FSI).

Generally speaking, a speaker can be considered as the source of an acoustic (speech) signal. The microphone, as a transducer, transforms the acoustic signal into an electrical signal, and the telecommunication channel transmits it to the receiver via headphones or to a recorder which stores it as a speech signal. Each element of this transmission chain affects the quality of the recorded speech signal (Rose 2002). Modern headphones and recording devices are of very high quality and affect it minimally, if at all. However, the transfer characteristics of the communication channel may significantly affect the spectral characteristics of the speech signal. Several studies have examined the effects of landline and mobile phone transmission channels on the formants of the speech signal (Künzel 2001; Byrne and Foulkes 2004). For both landlines and mobiles the results show a marked artificial upward shift of the centre frequency of the first formant (F1), caused by the slope of the channels’ lower cut-off. Second and third formant frequencies (F2 and F3) were generally unaffected because both lie inside the band-pass of the channels. However, Künzel (2001) emphasised possible distortions in the speech spectrum as a consequence of a nonlinear transfer function of the transmission channel. These distortions could change formant shapes and positions in the speech spectrum and, in exceptional cases, could cause the appearance of a false formant or a missing formant.

However, there have been no published studies of the impact of mode of mobile phone usage on speech spectral features. No-one has investigated, for instance, speaking with the mobile phone between cheek and shoulder, keeping a cigarette between the lips while speaking, speaking with a sweet or food in the mouth, and conversing with a hand covering the mobile phone and mouth. In some of these situations the position of the mobile phone can influence the articulation of the speech signal. It is well known that variations in the shape of the vocal tract have a direct impact on spectral features of the speech signal, including the frequency of vowel formants. Further, if the distance between mouth and microphone is small, any change of the nearby acoustic environment and field can significantly affect the spectral features of the speech signal. These effects could increase or decrease the range of intra-speaker variability. In view of this, it was hypothesised that these effects in the articulator and acoustic domain could be important for forensic practice in speaker comparison, and an experimental investigation was performed with preliminary findings reported in this article.

The article is organised as follows: in the first section the experimental procedure is explained including the method, subjects, speech material and experimental setup. The next section presents the results and discussion of the analysis of the long-term average speech spectrum (LTASS), formant distribution for voiced sounds generally (LTF) and formant distributions for vowel sounds only. Conclusions are drawn and presented in the final section.

1. Experiment

The experiment was carried out to test the effect of the manner of using a mobile phone on the long-term average speech spectrum (LTASS), the long-term formant distributions (LTF) and the first three vowel formants (F1, F2 and F3).

1.1 Method

The following five modes of mobile phone usage were specified and are shown in Figure 1:

  • NOR mode – the normal way of holding a mobile phone between the ear and mouth of the speaker, so that the loudspeaker at the top of the phone rests against the auricle (closest to the eardrum) and the microphone of the phone is near the mouth. This situation is chosen as the reference, and the other four modes in the experiment are compared with it.
  • BON mode – in this mode a speaker holds a bonbon (a sweet) in the mouth between cheek and teeth, introducing small changes in the shape of the vocal tract.
  • CIG mode – in this mode a speaker holds a cigarette between the lips, introducing a restriction in the movement of the lower jaw and lips.
  • SHO mode – in this mode a speaker holds the mobile phone between cheek and shoulder while talking (as often happens when both hands are occupied), introducing the combined effect of changed shape of the vocal tract through the pressure of the phone on the cheek and the restriction of movement of the lower jaw.
  • HAN mode – in this mode the other hand is placed over the phone and mouth, in order to visually mask the conversation.

Figure 1: Modes of mobile phone usage in the experiment

1.2 Subjects and speech material

The experiment involved six volunteers (three male and three female) aged between 24 and 38 years. All were university-educated native speakers of Serbian, and none exhibited any speech, voice or hearing problems. The participants spoke the variety typical of Belgrade, without any marked regional accent. They were all instructed beforehand in the manner of using the mobile phone while reading a passage, in accordance with Figure 1. The reading was done in a neutral voice, at normal speed and loudness, in all modes of mobile phone usage.

The passage was part of the GEES speech corpus (Jovičić, Kašić, Đorđević and Rajković 2004) composed of 79 words lasting for about 32 seconds (see Appendix). The statistics of phoneme appearance in the passage are in agreement with the overall phoneme appearance statistics for the Serbian language. The passage contains five Serbian vowels, 159 instances in total, with the following distribution: /a/ – 39, /e/ – 38, /i/ – 34, /o/ – 31 and /u/ – 17, which, statistically, is sufficient for the experiment bearing in mind that there are six participants.
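The vowel counts quoted above can be checked directly against the passage text in the Appendix. The short sketch below is not part of the original study; it simply counts vowel letters, relying on the near-phonemic nature of Serbian Latin orthography, and assumes the passage has been saved to a hypothetical file gees_passage.txt.

```python
# Count vowel letters in the GEES passage (see Appendix) to verify the stated
# distribution. Serbian Latin spelling is close to one letter per phoneme, so
# letter counts approximate vowel-phoneme counts. The file name is hypothetical.
from collections import Counter

with open("gees_passage.txt", encoding="utf-8") as f:
    passage = f.read().lower()

vowel_counts = Counter(ch for ch in passage if ch in "aeiou")
for v in "aeiou":
    print(f"/{v}/: {vowel_counts[v]} instances")
print("total vowels:", sum(vowel_counts.values()))
```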

1.3 Experimental setup

The organisation of the experiment is shown in Figure 2. We used a combination of the GSM mobile phone network and the landline (PSTN) network. The reasons for this decision were as follows: i) this network combination is very frequent, and ii) the bandwidth of the landline network is normally 300–3400 Hz while in the case of the mobile network the low frequency end of its frequency response is approximately 100 Hz and its high frequency response could dynamically vary anywhere between 2800 and 3600 Hz, depending upon the source coding bit rate selected (Guillemin and Watson 2008). Therefore, the bandwidth of the landline network is more restricted and, because of that, of more interest for FSI. On the other hand, this network combination does not affect the primary goal of our experiment.


Figure 2: Experimental setup
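As a rough illustration of the band limitation imposed by the landline leg of this transmission chain, the sketch below applies a 300–3400 Hz band-pass filter to a clean recording. This is only an approximation of the channel described above: it models the bandwidth restriction, not GSM source coding or any channel nonlinearity, and the file names and filter order are illustrative.

```python
# A rough sketch of the landline band limitation (300-3400 Hz) applied to a
# clean 44.1 kHz, 16-bit mono recording. This models only the bandwidth
# restriction of the channel, not GSM coding; file names and the filter order
# are illustrative assumptions.
import numpy as np
from scipy.io import wavfile
from scipy.signal import butter, sosfiltfilt

fs, x = wavfile.read("speech_44k1.wav")          # 44.1 kHz, 16-bit mono recording
x = x.astype(np.float64) / 32768.0               # scale 16-bit PCM to [-1, 1]

sos = butter(4, [300, 3400], btype="bandpass", fs=fs, output="sos")
y = sosfiltfilt(sos, x)                          # zero-phase band-pass filtering

wavfile.write("speech_bandlimited.wav", fs, (y * 32767).astype(np.int16))
```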

The speakers used a Nokia 6600 Slide mobile phone, and the recording of the speech material was done in an office. The procedure in the experiment was as follows: using the mobile phone, the speaker dialled the number of a landline phone in the same room and, after the connection was established, the landline telephone handset was disconnected to avoid an echo between the two phones. Via a splitter on the phone cable, the voice signal from the mobile phone was recorded on a computer as a mono audio file in uncompressed PCM wav format with a sampling frequency of 44.1 kHz and a resolution of 16 bits.

For all measurements, Praat software (Boersma and Weenink 2011) was used.

2. Results

2.1 Long-term average speech spectrum (LTASS)

The LTASS was obtained by averaging short-time FFT spectra over the whole spoken passage. This means that the LTASS includes the spectra of all voiced and unvoiced speech sounds. Figure 3 shows separate spectra for males and females, averaged over the speakers. For a valid comparison, the RMS levels of all spectra are normalised to the NOR spectrum. This analysis was preparatory to the more specific formant analyses described below; as such, it shows the main trends in spectral magnitude changes, i.e. spectral distortions, relative to the NOR spectrum as reference.
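A minimal sketch of this kind of LTASS computation is given below, assuming the recordings of the passage are available as wav files. Welch averaging of short-time FFT power spectra stands in for the procedure described above, and the simple mean-level alignment is only an approximation of the RMS normalisation to the NOR spectrum; window length and file names are assumptions, not the authors’ settings.

```python
# Sketch of an LTASS: short-time power spectra averaged over the whole passage,
# expressed in dB, with a simple level alignment to the NOR spectrum.
import numpy as np
from scipy.io import wavfile
from scipy.signal import welch

def ltass_db(path, nperseg=1024):
    fs, x = wavfile.read(path)
    x = x.astype(np.float64) / 32768.0
    # Welch's method = averaged short-time FFT power spectra over the recording
    f, pxx = welch(x, fs=fs, nperseg=nperseg, noverlap=nperseg // 2)
    return f, 10.0 * np.log10(pxx + 1e-12)

f, ltass_nor = ltass_db("nor_passage.wav")       # reference (NOR) recording
_, ltass_han = ltass_db("han_passage.wav")       # e.g. HAN-mode recording

# Align overall level to the NOR spectrum so that only spectral shape differs
ltass_han += np.mean(ltass_nor) - np.mean(ltass_han)
print("max difference re NOR (dB):", np.max(np.abs(ltass_han - ltass_nor)))
```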


Figure 3: Averaged LTASS for a) male speakers and b) female speakers in all five modes of mobile phone usage

The results in Figure 3 show that the analysed modes of mobile phone communication modify the transfer function of the vocal tract, resulting in spectral changes of the speech signal. The results are similar for males and females, with slightly larger spectral magnitude differences for males. The BON and CIG spectra show maximum magnitude differences from the NOR spectrum of no more than +3 dB and −4.5 dB for both genders. The SHO and HAN spectra, on the other hand, show larger maximum magnitude differences: +7 dB and −8 dB for males, reaching −10 dB at about 3500 Hz in HAN mode (marked with an arrow in Figure 3a), and +7 dB and −6 dB for females. Another observation for the SHO and HAN spectra is the existence of a low-frequency band between 300 Hz and 1500 Hz, where the SHO spectrum lies above the NOR spectrum and the HAN spectrum below it, and a high-frequency band between 1500 Hz and 3000 Hz (both marked with circles in Figure 3), where the SHO and HAN spectra swap positions. This is a consequence of the flattened HAN spectrum and the increased slope of the SHO spectrum.

The results in Figure 3 support the main hypothesis of this investigation, i.e. that the manner of usage of a mobile phone in voice communications can affect the spectral characteristics of the speech signal. The spectral distortions demonstrated in Figure 3 could change the intensity of spectral concentrates or even shift their position on the frequency scale, which is important for unvoiced speech sounds and especially for the formants of voiced sounds. The latter is the subject of the analysis in the next sections.

2.2 The long-term formant distributions in voiced sounds

The long-term formant distributions (LTF) of the first three formants F1, F2 and F3 were measured over all voiced segments in the speech passage, following Nolan and Grigoras (2005). The data presented in Table 1 are the mean values of F1, F2 and F3 in the five modes of mobile phone usage, averaged across the three male and the three female speakers, together with the proportional differences relative to the NOR mode and the significance levels of t-tests between the NOR mode and each of the other four modes. In interpreting these data, we must keep in mind that this analysis encompasses all vowels and all voiced consonants and that it is an interim step between the LTASS analysis and the vowel formant distribution analysis in the next section.
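The sketch below outlines how such an LTF measurement can be made, assuming the Praat analyses are driven from Python through the praat-parselmouth bridge; the analysis settings (10 ms step, 5 kHz formant ceiling) are illustrative rather than the authors’ actual values, and voicing is approximated by the presence of an F0 estimate.

```python
# Sketch of a long-term formant (LTF) measurement in the spirit of Nolan and
# Grigoras (2005): F1-F3 are sampled in every voiced frame of the passage and
# averaged. The praat-parselmouth bridge and all settings are assumptions.
import numpy as np
import parselmouth

snd = parselmouth.Sound("nor_passage.wav")
pitch = snd.to_pitch(time_step=0.01)                 # voicing decision via F0
formant = snd.to_formant_burg(time_step=0.01, max_number_of_formants=5,
                              maximum_formant=5000)  # ~5000 Hz ceiling for male voices

times = pitch.xs()
f0 = pitch.selected_array["frequency"]               # 0 where Praat finds no voicing
voiced = times[f0 > 0]

ltf = {n: np.array([formant.get_value_at_time(n, t) for t in voiced])
       for n in (1, 2, 3)}
for n in (1, 2, 3):
    vals = ltf[n][~np.isnan(ltf[n])]
    print(f"LTF F{n}: mean {vals.mean():.0f} Hz over {len(vals)} voiced frames")
```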


Table 1: Average formant frequencies in all experimental modes, the proportional differences relative to the NOR mode and the significance levels of t-tests between the NOR mode and the other four modes

Mode        F1 (Hz)   % diff.   Sign.     F2 (Hz)   % diff.   Sign.     F3 (Hz)   % diff.   Sign.

Males
NOR           630        –        –         1522       –        –         2484       –        –
BON           637      101.1      *         1470      96.6     ***        2452      98.7     ***
CIG           617       97.9     ***        1462      96.1     ***        2536     102.1     ***
SHO           598       94.9     ***        1349      88.6     ***        2470      99.4      **
HAN           882      140.0     ***        1603     105.3     ***        2391      96.3     ***

Females
NOR           695        –        –         1722       –        –         2589       –        –
BON           718      103.3     ***        1702      98.8      **        2636     101.8     ***
CIG           718      103.3     ***        1625      94.4     ***        2472      95.5     ***
SHO           652       93.8     ***        1624      94.3     ***        2677     103.4     ***
HAN            –       127.5               –         105.1               –

Significance levels: * < 0.05, ** < 0.01 and *** < 0.001. Number of data points N = 9300±400.


All formants in all modes differ significantly in their position on the frequency scale from the corresponding formants in the NOR mode. These results convincingly confirm the importance of the LTASS distortions observed in the analysed modes of mobile phone usage. In particular, the proportional differences for F1 show the highest increases of all the data in Table 1 in HAN mode for both genders (140% and 127.5%). Other patterns consistent across the genders are the lowest F1 (94.9% and 93.8%) and F2 (88.6% and 94.3%) values in SHO mode and a raised F2 (105.3% and 105.1%) only in HAN mode. F3 does not behave consistently across the genders, but the proportional differences still indicate significant shifts in its position on the frequency scale.
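To make the statistics behind Table 1 concrete, the sketch below reproduces the form of the comparison under the assumption that the t-test was run on the pooled frame-level formant values of a mode against those of the NOR mode (the table note gives N of roughly 9300 per mode). The arrays here are synthetic, drawn around the male F1 means from Table 1 purely for illustration; in practice they would come from an LTF measurement such as the one sketched above.

```python
# Illustrative recreation of one Table 1 comparison: proportional difference
# relative to NOR and a two-sample t-test on pooled frame-level F1 values.
# The data are synthetic (means taken from Table 1, spreads assumed).
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
f1_nor = rng.normal(630, 120, 9300)   # frame-level F1 values, NOR mode (synthetic)
f1_han = rng.normal(882, 160, 9300)   # frame-level F1 values, HAN mode (synthetic)

prop = 100.0 * f1_han.mean() / f1_nor.mean()          # proportional difference re NOR
t, p = ttest_ind(f1_han, f1_nor)
stars = "***" if p < 0.001 else "**" if p < 0.01 else "*" if p < 0.05 else "n.s."
print(f"HAN F1: {f1_han.mean():.0f} Hz, {prop:.1f}% of NOR, significance {stars}")
```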

Since neither the LTASS nor the LTF analysis isolates the vowel formants (the LTASS includes all sounds and the LTF all voiced sounds), they do not provide a complete picture of formant behaviour in the analysed modes of mobile phone usage. We therefore took the further step of measuring only the vowel formants and undertaking a scatter distribution analysis.

2.3 Formant distributions in vowels

In the following analysis, the formant frequencies of the first three formants of the five Serbian vowels /i/, /e/, /a/, /o/ and /u/ were measured in the five modes for all six subjects. A total of 14,310 measurements were made (159 vowels × 6 subjects × 5 modes × 3 formants). Each formant was measured at a central point if the formant contour changed rapidly, or otherwise at the mid-point of the most stable part of the contour. Figures 4 and 5 show the scatter distributions of the vowels in the F1–F2 and F2–F3 planes for the male and female subjects respectively. The results are averaged across the subjects. The points in the figures represent the centroids of the scatter distribution of each vowel. The points for one mode are connected into a vowel polygon ‘/i/–/e/–/a/–/o/–/u/’. The polygons summarise the general behaviour of the formants in the different modes, much as the LTFs summarise the general behaviour of the formants across all voiced phonemes.
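Given a table of the individual measurements (one row per vowel token with its speaker, mode and F1–F3 values), the centroids plotted in Figures 4 and 5 are simple per-vowel means. The sketch below shows this step; the file and column names are assumptions for illustration, not part of the original study.

```python
# Compute per-vowel centroids (mean F1-F3 per sex, mode and vowel) from a
# hypothetical CSV of individual vowel-token measurements.
import pandas as pd

df = pd.read_csv("vowel_formants.csv")     # assumed columns: speaker, sex, mode, vowel, F1, F2, F3
centroids = (df.groupby(["sex", "mode", "vowel"])[["F1", "F2", "F3"]]
               .mean()
               .round(0))
print(centroids.loc[("male", "HAN")])      # e.g. centroid of each vowel in HAN mode, males
```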


Figure 4: Scatter diagrams F1–F2 and F2–F3 for the vowels of male subjects

Figure 5: Scatter diagrams F1–F2 and F2–F3 for the vowels of female subjects

Comparing the vowels along the F1, F2 and F3 dimensions in Figures 4 and 5, the maximum relative displacements are about 30% for F1 (vowel /a/ in HAN mode for males), nearly 15% for F2 (vowel /i/ in SHO mode for males), and around 5% for F3 (vowel /u/ in CIG mode for females). However, a statistical significance analysis gives a more precise indication of the spectral distortions of individual vowels. We applied two-tailed Wilcoxon matched-pairs signed rank tests to the differences in vowel formants between the HAN, SHO, CIG and BON modes and the reference NOR mode. Table 2 lists only the vowel, formant and mode combinations that show statistically significant deviations; all other differences were not statistically significant.
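The vowel-level test reported in Table 2 can be reproduced along the lines sketched below, pairing each vowel token’s formant value in a test mode with the same token’s value in the NOR mode. The pairing by token position in the read passage is our assumption about the matching, and the numbers are synthetic, chosen only to mirror the direction of the HAN-mode F1 shift.

```python
# Two-tailed Wilcoxon matched-pairs signed-rank test on paired vowel-token
# formant values (test mode vs. NOR). Data are synthetic for illustration.
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(1)
f1_a_nor = rng.normal(660, 60, 39 * 3)                    # 39 /a/ tokens x 3 male speakers (synthetic)
f1_a_han = f1_a_nor + rng.normal(200, 50, f1_a_nor.size)  # same tokens, shifted as in HAN mode

stat, p = wilcoxon(f1_a_han, f1_a_nor, alternative="two-sided")
print(f"/a/ F1, HAN vs NOR: W = {stat:.0f}, p = {p:.2g}")
```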

Table 2 shows that males exhibit more formant deviations than females in the HAN mode (11 versus 6 vowels), while females take the lead in the CIG mode (8 versus 6 vowels). Another finding from the comparison of males and females, also visible in Table 2, is that all five vowels deviate significantly in F1 in HAN mode for the males and in F3 in CIG mode for the females. Similarly, significant deviations appear in the SHO mode for all five vowels for both genders, but in F2 for males and in F3 for females. Finally, it appears that for females F3 is the formant most sensitive to the modes of mobile phone usage, while for males it is F1 and F2. The discussion below considers these results.


Table 2: Results of two-tailed Wilcoxon matched-pairs signed rank tests for differences in vowel formants in the HAN, SHO, CIG and BON modes in comparison with the NOR mode. Only statistically significant cases are listed.

Males                                          Females
Mode   Vowel   Formant   Significance          Mode   Vowel   Formant   Significance
HAN    /i/     F1        ***                   HAN    /e/     F1        ***
HAN    /e/     F1        ***                   HAN    /a/     F1        ***
HAN    /a/     F1        ***                   HAN    /o/     F1        ***
HAN    /o/     F1        ***                   HAN    /u/     F1        ***
HAN    /u/     F1        ***                   HAN    /o/     F2        ***
HAN    /a/     F2        ***                   HAN    /u/     F2        ***
HAN    /o/     F2        **                    SHO    /e/     F2        **
HAN    /i/     F3        ***                   SHO    /a/     F2        ***
HAN    /e/     F3        ***                   SHO    /i/     F3        ***
HAN    /a/     F3        ***                   SHO    /e/     F3        **
HAN    /o/     F3        ***                   SHO    /a/     F3        **
SHO    /a/     F1        *                     SHO    /o/     F3        ***
SHO    /u/     F1        *                     SHO    /u/     F3        *
SHO    /i/     F2        ***                   CIG    /i/     F2        ***
SHO    /e/     F2        ***                   CIG    /e/     F2        ***
SHO    /a/     F2        ***                   CIG    /a/     F2        ***
SHO    /o/     F2        ***                   CIG    /i/     F3        ***
SHO    /u/     F2        **                    CIG    /e/     F3        ***
CIG    /a/     F1        **                    CIG    /a/     F3        **
CIG    /i/     F2        **                    CIG    /o/     F3        ***
CIG    /e/     F2        ***                   CIG    /u/     F3        ***
CIG    /a/     F2        ***                   BON    /i/     F3        **
CIG    /a/     F3        *                     BON    /e/     F3        *
CIG    /o/     F3        ***                   BON    /a/     F3        *
BON    /e/     F3        ***

Significance levels: * < 0.05, ** < 0.01 and *** < 0.001

2.4 Discussion

All results show that the analysed modes of mobile phone communication modify the transfer function of the vocal tract, resulting in spectral changes of the speech signal. It is common knowledge in phonetic theory (e.g. within the source-filter theory of speech production (Fant 1970) or quantal theory of speech (Stevens and Keyser 2010)) that even minor modifications to the articulatory gestures during speech production may alter the acoustic signal.

In our case, in BON mode, the bonbon was small in diameter (about 1 cm) but still affected the resonant characteristics of the oral cavity. Similar spectral changes occur in CIG mode because holding a cigarette between the lips limits the movements of the articulators, especially the lower jaw and the mouth opening. The slight repositioning of the talker’s articulators in both the BON and CIG modes (Figure 3) causes spectral changes above 800 Hz, with possible repercussions for F2 and F3 (Table 2).

However, strong individual effects are apparent, despite subjects having been instructed at the beginning of the experiment on how to use the mobile phone. For example, in CIG mode all the females held the cigarette with protruded lips, possibly to preserve their lipstick by minimising the mouth opening (Figure 6). The consequence is a lengthening of the vocal tract, which results in the lowering of F2 and F3 (Fant 1970, and our results in Figures 4 and 5 and Table 2). The males, on the other hand, held the cigarette stuck to the lower lip, with little impact on the shape and disposition of the lips, so the formant lowering is evident mostly in F2 (Table 2). Another possible individual effect is the position of the sweet inside the mouth (BON mode), between cheek and teeth or between teeth and tongue, or even a change of its position during the act of speaking. The effects of sweet position on F3 and possibly F2 need further investigation. In any case, a sweet in the mouth causes small changes to F3 in the front vowels /i/, /e/ and /a/ (Table 2).

The SHO and HAN modes have a stronger effect on the spectral characteristics of the speech signal. When a mobile phone is held between the cheek and shoulder, in the SHO mode, a combination of articulation constraints arises in the production of the speech sounds. When the head is bent to the shoulder, the mobile phone exerts pressure on the cheek, restricting the movement of the articulators around the mouth (forming the shapes of the lips and setting them in motion back and forth) and the movement of the lower jaw (limiting the downward movement of the tongue and changes in the shape and volume of the oral cavity). Again there are differences between males and females. The results in Figures 4 and 5 and Table 2 indicate a firmer holding of the mobile phone by the males, resulting in a lower F2 as a consequence of changes in the shape of the oral cavity. The females do not hold the phone so firmly between the cheek and shoulder, and the main spectral changes are evident in F3.

The biggest spectral changes are evident in the HAN mode, caused by the positioning of the hand over the mobile phone and mouth. The positioning of the hand in front of the mouth has several effects. First, it acts as a screen that changes the ‘radiation impedance’ (acoustic impedance) at the lips (Flanagan 1972). In normal, unobstructed speech the radiation impedance consists of a radiation resistance and a radiation inductance connected in parallel. A hand over the mouth changes the values of these components and introduces another component: a radiation capacitance. This complex radiation impedance causes many spectral variations, as is evident in Figure 3, including the loss of sound energy from the lower spectral components and a certain amplitude increase at the higher frequencies around F3. Other effects are turbulence in the propagation of sound waves from mouth to hand, with consequences for the quality of the speech sounds: the talker’s voice sounds slightly ‘muted’, ‘muffled’ or ‘dull’ (Fecher 2014). All these effects depend on how the hand is positioned in front of the mouth, i.e. its distance from the mouth and the hand shape adopted.

For males, the HAN action may inhibit breathing: holding a hand in front of the mouth makes it more difficult to breathe and, as compensation, the jaw is lowered in order to take in more air. This articulatory adjustment is similar to the jaw lowering observed in whispered speech (Jovičić 1998), resulting in a significant increase in F1 and a moderate increase in F2 (Jovičić 1998; Ito, Takeda and Itakura 2005). In both cases we have an unusual or unnatural speech action with a clear increase in F1 when the lower jaw is lowered, as Lindblom and Sundberg (1971) demonstrated. Unlike the males, the females do not hold the hand firmly over the mobile phone and mouth but position it more like a screen in front of the phone and mouth (Figure 6). As a result we see some minor changes in F3 and less striking changes in F1 (Figure 5). An unexpected effect is the increase in F2 from about 1320 Hz to 1520 Hz for the vowels /o/ and /u/. However, the LTASS for females shows a spectral concentrate at about 1780 Hz (marked by an arrow in Figure 3b). The rising part of this concentrate (marked by a rectangle) covers the spectral segment where the F2 of the vowels /o/ and /u/ is located and causes the frequency increase (an effect noted in the Introduction). The likely cause of this spectral concentrate is the spectral flattening in HAN mode (see the LTASS results in Section 2.1), when the talker’s voice sounds slightly ‘muted’, ‘muffled’ or ‘dull’ (Fecher 2014).


Figure 6: Examples of different forms in manner of mobile phone usage with female subjects

3. Conclusions

Our experiment investigated and established the manner of mobile phone usage as another source of potential intra-speaker variability, especially in the spectral domain. As noted in the Introduction, this phenomenon had not previously been investigated experimentally or documented. Byrne and Foulkes (2004) suggested that variations in their experimental results may have been attributable to ‘mobile users [exercising] considerably more flexibility in both the distance between mouth and receiver, and also the angle between them’ and called for the acoustic effects of mouth-to-phone orientation and distance to be investigated experimentally. Our findings, although preliminary, strongly support Byrne and Foulkes’s observations. They also support our hypothesis concerning the impact of the manner of mobile phone usage on speech spectral features, with the consequence of increased intra-speaker variability. Our results show that the relative displacement of F1 can reach 30% (vowel /a/ in HAN mode for males), F2 nearly 15% (vowel /i/ in SHO mode for males), and F3 about 5% (vowel /u/ in CIG mode for females). These findings provide grounds for extending the discussion of the interpretation of formant measurements started by Nolan (2002) and Künzel (2002). Further, the statistically significant displacement of formants in the various modes of mobile phone usage indicates that forensic practitioners should exercise caution in interpreting formant values if speech recordings contain such effects.

The questions remain of how forensic practitioners can recognise this type of spectral distortion in recordings and how formant measures should then be interpreted. For now, these questions have no definite answers. Our only recommendation to practitioners is to exclude speech segments from further analysis if they suspect that the segments contain these types of spectral distortion. Further investigations are needed, especially in the perceptual domain, to answer the first question. The second question requires investigation into the quantification of spectral distortions and their correlation with the manner of mobile phone usage. Future studies might usefully investigate other modes of mobile phone usage, individual variations within these modes, methods for identifying the various modes, the limits of the spectral distortions, spectral distortions of sounds other than vowels, and the mathematical modelling of the articulators under such modes.

Acknowledgment

This work was supported by the Ministry of Education, Science and Technological Development of the Republic of Serbia under Grant No. 178027. We would like to record our special thanks to the reviewers for their helpful advice on drafts of this article.

About the authors

Slobodan T. Jovičić PhD is professor at the Department of Telecommunications, School of Electrical Engineering, University of Belgrade, Serbia. He is a scientific adviser at the Life Activities Advancement Center and a forensic expert at the Serbian Ministry of Justice. He founded the Laboratory for Forensic Acoustics and Phonetics at the Life Activities Advancement Center, Belgrade. His research interests include speech communication, speech technology, speech disorders and forensic science.

Nikola Jovanović received his MSc in telecommunications (speech communication) from the School of Electrical Engineering, University of Belgrade, Serbia. He now works as a researcher at the Military Technical Institute, Belgrade. His main research interests lie within the field of speech communication quality enhancement.

Miško Subotić received his PhD in telecommunications (speech communication) from the School of Electrical Engineering, University of Belgrade, Serbia. He is now director of the Life Activities Advancement Center, an innovative center that covers interdisciplinary aspects in speech communication investigation. His research interests include speech production and perception, bioinformatics bases of speech, speech and hearing disorders, and cognitive speech and e-medicine applications.

Đorđe Grozdić received his MSc in telecommunications (speech communication) from the School of Electrical Engineering, University of Belgrade, Serbia. He is now studying for his doctorate at the School of Electrical Engineering and works in the Laboratory for Forensic Acoustics and Phonetics at the Life Activities Advancement Center, Belgrade. His research interests lie within the field of speech perception, recognition and quality enhancement.

Appendix: Speech material (passage) used in the experiment

Zdravo Marko! Dođi! Hoću da ti ispričam šta se dogodilo danas! Čekajući Petra na stanici, videh jednog starijeg čoveka kako nosi buket cveća. Na glavi je imao šešir. Prelazio je ulicu naspram one tek renovirane buregdžinice. Zamisli, baš u tom trenutku je dunuo vrlo jak vetar pa mu je šešir poleteo s glave. On je, da bi zadržao šešir, raširio ruke i ispustio cveće koje je vetar rezneo po ulici. Dok se saginjao, umalo ga ne zgazi Ljubin Ford!

References

Boersma, P. and Weenink, D. (2011) Praat: Doing Phonetics by Computer (Version 5.2.26). Retrieved from http://praat.org/.

Byrne, C. and Foulkes, P. (2004) The ‘mobile phone effect’ on vowel formants. International Journal of Speech, Language and the Law 11(1): 83–102. http://dx.doi.org/10.1558/sll.2004.11.1.83

Fant, G. (1970) Acoustic Theory of Speech Production (2nd edn). The Hague: Mouton & Co.

Fecher, N. (2014) Effects of forensically-relevant facial concealment on acoustic and perceptual properties of consonants. Dissertation, Department of Language and Linguistic Science, University of York.

Flanagan, J. L. (1972) Speech Analysis, Synthesis and Perception (2nd edn). New York: Springer-Verlag. http://dx.doi.org/10.1007/978-3-662-01562-9

Guillemin, B. J. and Watson, C. (2008) Impact of the GSM mobile phone network on the speech signal: some preliminary findings. International Journal of Speech, Language and the Law 15(2): 193–218.

Ito, T., Takeda, K. and Itakura, F. (2005) Analysis and recognition of whispered speech. Speech Communication 45: 129–152. http://dx.doi.org/10.1016/j.specom.2003.10.005

Jovičić, S. T. (1998) Formant feature differences between whispered and voiced sustained vowels. Acustica – Acta Acustica 84: 739–743.

Jovičić, S. T., Kašić, Z., Đorđević, M. and Rajković, M. (2004) Serbian emotional speech database: design, processing and evaluation. Proceedings of the Conference SPECOM-2004, St Petersburg, Russia: 77–81.

Künzel, H. J. (2001) Beware of the ‘telephone effect’: the influence of telephone transmission on the measurement of formant frequencies. Forensic Linguistics 8(1): 80–99.

Künzel, H. J. (2002) Rejoinder to Francis Nolan’s ‘The “telephone effect” on formants’: a response. Forensic Linguistics 9(1): 83–86.

Lindblom, B. and Sundberg, J. (1971) Acoustical consequences of lip, tongue, jaw, and larynx movement. Journal of the Acoustical Society of America 50(4): 1166–1179. http://dx.doi.org/10.1121/1.1912750

Nolan, F. (2002) The ‘telephone effect’ on formants: a response. Forensic Linguistics 9(1): 74–82. http://dx.doi.org/10.1558/sll.2002.9.1.74

Nolan, F. and Grigoras, C. (2005) A case for formant analysis in forensic speaker identification. International Journal of Speech Language and the Law 12(2): 143–173. http://dx.doi.org/10.1558/sll.2005.12.2.143

Rose, P. J. (2002) Forensic Speaker Identification. London: Taylor & Francis. http://dx.doi.org/10.1201/9780203166369

Shewmaker, M. B., Hapner, E. R., Gilman, M., Klein, A. M. and Johns, M. M. (2010) Analysis of voice change during cellular phone use: a blinded controlled study. Journal of Voice 24(3): 308–313. http://dx.doi.org/10.1016/j.jvoice.2008.09.002

Stevens, K. N. and Keyser, S. J. (2010) Quantal theory, enhancement and overlap. Journal of Phonetics 38(1): 10–19. http://dx.doi.org/10.1016/j.wocn.2008.10.004
