The backlash against “big data” studies is well underway. Nowhere more so than in humanities and creative arts research.
If I had a dollar for every person who has told me over the past year or so that you can’t learn anything about art/culture/society from studying “the numbers”, I would be my own independently financed research institute by now. And yet the current and important debate about private telecommunications data, prompted by proposed revisions to national security laws, shows just how revealing large-scale transactional metadata can be.
Since the end of 2012, a team of researchers at Deakin and RMIT universities has been gathering global film business data to determine and measure the critical factors affecting film industry performance in a period of transition. The project, called Kinomatics, is also an opportunity for us to assess the usefulness of metadata-driven research for a field like cinema studies.
What is Kinomatics?
The term “kinomatics” is derived from the word “kinematics”, the study of the geometry of motion, and “kino”, a term for cinema. We like to think of Kinomatics as the study of the industrial geometry of motion pictures. As such it is not at all like the typical approach to university-based cinema research.
Kinomatics is part of a broader, emerging disciplinary shift in the humanities, away from a traditional focus on measuring the value and meaning of cultural artefacts (such as films) to recognising instead the significance of cultural flows and transactions.
Kinomatics began as the offshoot of an Australian Research Council-funded project examining the contemporary business of cinema distribution and exhibition in Australia. The film business is, and has been from the outset, a global enterprise. So in order to really understand Australia’s film economy we quickly realised we needed a more global perspective, one that could account for the cinema’s international, interdependent, overlapping and uneven networks.
Our focus had shifted from trying to understand which particular aspects of which particular movies might explain film industry performance – Was it the depressing script? Was it the inadequate budget? – to a more systemic overview.
This adjustment in turn required us to find different sources of evidence to answer our questions. (What is the impact of environmental conditions on film diffusion? What are the spatial and temporal dimensions of cinema circulation?). Rather than the content of the films themselves we were much more interested in the metadata that describes the cinema’s social, institutional and commercial transactions.
Working with Big Cultural Data: three supersize takeaways
The Kinomatics dataset that forms the basis for our research is a unique collection of more than 250 million “showtime” records, capturing information about all film screenings in 48 countries over a 30-month period (2012-2015).
Kinomatics adopts many of the techniques that have been developed in recent years within the Digital Humanities. These include the use of APIs, visualisation techniques and data analysis approaches based around the making and mining of the horizontal relationships between data (rather than the vertical relationships favoured by archivists, for example).
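To make the “horizontal relationships” idea concrete, here is a minimal, hypothetical sketch of that join-based approach: linking transactional showtime records to venue metadata by a shared key, rather than navigating a fixed archival hierarchy. All field names and records below are invented for illustration and are not the project’s actual schema.

```python
# Hypothetical showtime records: one row per screening transaction.
showtimes = [
    {"cinema_id": "c1", "film": "The Hobbit", "time": "2012-12-26T19:00"},
    {"cinema_id": "c2", "film": "The Hobbit", "time": "2012-12-26T20:30"},
]

# Hypothetical venue metadata, keyed by the same cinema_id.
venues = {
    "c1": {"city": "Melbourne", "country": "AU"},
    "c2": {"city": "Auckland", "country": "NZ"},
}

# The "horizontal" move: enrich each transaction by joining it with
# venue metadata through the shared key, so spatial questions can be
# asked of what began as flat transactional records.
enriched = [{**s, **venues[s["cinema_id"]]} for s in showtimes]

for row in enriched:
    print(row["film"], row["city"], row["country"])
```

Once records are enriched this way, the same join pattern extends to any other keyed dataset (demographic, climatic, economic), which is what allows the evidence base to grow sideways rather than deeper.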
As one of the first “big cultural data” projects of its kind, Kinomatics has been a massive learning curve for the research team. Here are our top three takeaways as we’ve come to grips with our supersized database:
1. It’s not how big your data is, it’s what you do with it that counts. Big Data doesn’t necessarily refer to the size of the data (after all, today’s “big” data is tomorrow’s iota), but it does mean that the size of the data is one of the problems that needs to be resolved.
In fact, in Kinomatics the data just keeps growing. Through further integrating different types of data (demographic data, social media data, technical infrastructure data, economic and financial data and climatic data, for example) we have been able to explore the value of an “expanded” approach to cultural data, rather than simply focusing on the idea of one “big dataset” per se.
2. Big Cinema Data requires a “big” range of expertise. We are a collaborative team with a wide range of skill sets. And yet we could always do with more. Big data studies involve accumulating your evidence base before you begin to develop your approach to it. This iterative and exploratory approach differs from conventional academic methodologies.
We often encounter colleagues who believe that working with data involves proposing a query and, hey presto!, an answer magically appears on the screen. In reality, working with big data involves careful consideration, multiple iterations, experiments and interpretations. It’s a process of constant decision-making.
In a sense big data studies are always in beta. In our experience the bigger the team, the more considered the approach can be.
3. Big Data does not always lead to Big Breakthroughs. In part, the backlash against big data represents a classic Gartner hype curve trough. But data scepticism is important. No matter how large your dataset, it is never comprehensive and it is never unbiased.
In our experience, we have been able to make sound claims about specifically defined research problems, provided our analysis accounts for the type of data and the way it was gathered.
For example, part of our interest in the diffusion of film across cinemas around the globe is to better understand the factors that drive diffusion and diversity in domestic film markets. We’ve explored this through a range of case studies: the spread of High Frame Rate technologies via the release of The Hobbit movies; the volume of transactions between cinema nations expressed in the form of dyadic networks; the relationship between the flow of remittances (the money sent home by migrants) and the movement of Indian films around the globe; and ways of engaging everyday cinemagoers with our dataset through the CinemaCities Ranking index.
Working with big data requires a changed mindset. Researchers need to be able to work at scale and be comfortable answering questions with probabilities rather than definitive conclusions. Perhaps the critical question now is not what we can do with the data as much as what the data is doing to us as researchers.