Predicting American Movie Genre Categories from Linguistic Characteristics

Authors

  • Tony Berber Sardinha Sao Paulo Catholic University
  • Marcia Veirano Pinto Sao Paulo Catholic University

DOI:

https://doi.org/10.1558/jrds.v2i1.27515

Keywords:

Multidimensional Analysis, American movies, genre prediction

Abstract

The goal of the current study is to explore the possibility of correctly classifying movie transcripts into movie genres by means of a Discriminant Function Analysis (DFA) based on a previous comprehensive multidimensional (MD) analysis of American cinema. MD analysis is a framework for describing the salient characteristics of text varieties by means of multivariate statistical techniques, notably factor analysis. Traditionally, MD analysis has been restricted to the study of register variation, being largely ignored in text classification research. In the MD analysis reported, a large genre-diversified movie corpus was tagged for lexico-grammatical features with the Biber tagger and the resulting factor scores were used as input for the DFA. The results showed that particular movie genres could be successfully predicted from the MD analysis, thereby lending credence to movie genre distinctions, while at the same time stressing the robustness of MD factor scores as reliable predictors of genre distinctions.
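The classification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the genre labels, the number of dimensions, the centroids and the synthetic factor scores below are all hypothetical, and the discriminant analysis is a plain linear version with a pooled within-class covariance matrix, written with NumPy.

```python
import numpy as np


def fit_lda(X, y):
    """Fit linear discriminant functions using a pooled within-class covariance."""
    classes = np.unique(y)
    n, p = X.shape
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    priors = np.array([(y == c).mean() for c in classes])
    # Pooled within-class scatter, divided by (n - number of classes).
    pooled = np.zeros((p, p))
    for c, m in zip(classes, means):
        d = X[y == c] - m
        pooled += d.T @ d
    pooled /= n - len(classes)
    inv = np.linalg.inv(pooled)
    # Score for class k: x . coef_k + intercept_k
    coefs = means @ inv
    intercepts = -0.5 * np.sum(coefs * means, axis=1) + np.log(priors)
    return classes, coefs, intercepts


def predict_lda(model, X):
    """Assign each row of X to the class with the highest discriminant score."""
    classes, coefs, intercepts = model
    scores = X @ coefs.T + intercepts
    return classes[np.argmax(scores, axis=1)]


# Hypothetical data: 3 genres, 3 MD dimensions, 60 transcripts per genre.
rng = np.random.default_rng(0)
genres = ["comedy", "drama", "horror"]
centroids = np.array([[3.0, 0.0, 0.0], [0.0, 3.0, 0.0], [0.0, 0.0, 3.0]])
X = np.vstack([rng.normal(c, 1.0, size=(60, 3)) for c in centroids])
y = np.repeat(genres, 60)

# Simple hold-out split: even rows for training, odd rows for testing.
X_train, y_train = X[0::2], y[0::2]
X_test, y_test = X[1::2], y[1::2]

model = fit_lda(X_train, y_train)
predicted = predict_lda(model, X_test)
accuracy = float(np.mean(predicted == y_test))
print(f"Hold-out accuracy on synthetic factor scores: {accuracy:.2f}")
```

In this sketch each row stands for one transcript and each column for its score on one dimension. With real data, the factor scores from the MD analysis would replace the synthetic matrix, and a cross-validated accuracy estimate would be preferable to a single hold-out split.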

Author Biographies

  • Tony Berber Sardinha, Sao Paulo Catholic University

    Tony Berber Sardinha is Associate Professor, Dept. of Linguistics and Graduate Program in Applied Linguistics, Sao Paulo Catholic University, Brazil.

  • Marcia Veirano Pinto, Sao Paulo Catholic University

    Marcia Veirano Pinto holds a PhD in Applied Linguistics from Sao Paulo Catholic University and has several years' experience teaching English as a foreign language and working as a teacher trainer. She is currently a post-doctoral researcher at Sao Paulo Catholic University, working alongside Prof. Berber Sardinha.

Published

2016-02-16

Section

Articles

How to Cite

Berber Sardinha, T., & Veirano Pinto, M. (2016). Predicting American Movie Genre Categories from Linguistic Characteristics. Journal of Research Design and Statistics in Linguistics and Communication Science, 2(1), 75-102. https://doi.org/10.1558/jrds.v2i1.27515