Is Big Data the next big thing?
Last week I attended the Legaltech conference in New York. As has been the case for the last few years, much of the conference focused on e-discovery (as it is still called stateside) and ‘predictive coding’, reflecting the heavy emphasis on large-scale litigation in the US legal market. However, there were some sessions on other topics, including a series on ‘Big Data’ which was of particular interest.
What is Big Data?
There isn’t a very satisfactory definition. The one used in the sessions was the McKinsey definition:
“Big data refers to data sets whose size is beyond the ability of typical database software to capture, store, manage and analyse”
But this raises the question of what is ‘typical’. It was generally agreed at the sessions that, whereas organisations had so far focused on extracting business intelligence from structured data, there could be real potential in being able to run analytics across huge volumes of unstructured content. This may even include recordings of voicemails or telephone conversations.
The conference also talked about ‘dark data’, i.e. data the firm may be holding without any real idea of its value or importance – often data that should have been deleted, but which firms are reluctant to delete because they are unsure what it contains and therefore of the consequences of deleting it. Dark data was compared to the clutter you store in your garage against the possibility you might need it one day, but you couldn’t readily say exactly what was in there.
Whilst we might argue about the definitions, I think it is recognised almost universally that organisations now hold a great deal of data which they are unable to make use of. Often those organisations don’t really know what data they are holding.
For some time, the supermarkets have been using the information gathered from their EPOS (electronic point of sale) systems, enhanced by data from their loyalty schemes, in order to analyse the buying habits of their customers, leading them in some cases to change the way products are laid out in their stores at different times of the day in order to catch the eye of different types of customer. It is perhaps this type of use that has led some to suggest that in 10 years’ time organisations’ IT spend will be controlled by their marketing departments. Other examples of the use of big data are organisations gathering information from social networks in order to monitor consumers’ sentiments in relation to their brands. Another use is the ‘black box’ in the car, used by some insurance companies to monitor young drivers’ habits in order to assess the risk they pose and therefore set their premiums appropriately.
Although in general law firms are relatively small organisations, they still have more data than they are able to manage and utilise effectively. Some law firms are now starting to talk about using big data themselves.
Is Big Data a problem or an opportunity?
It is a problem in the sense that:
Storing the data is becoming an increasing problem. Even with cheap storage (either on-premise or in the cloud) the volume of data being generated is going to give rise to significant costs. Gartner predicts that big data will require $34bn of IT expenditure in 2013. This is probably an underestimate.
Having to search through the data and provide relevant content to a third party (in the context of litigation, regulatory enquiries or data protection requests) is a bigger and more costly exercise the more data you have.
Privacy is a concern. It is hard to be sure that an organisation is complying with its obligations in respect of data about individuals (only storing what’s necessary, keeping the information up to date etc) when such data may be hidden in an impenetrable mass.
For law firms in particular, having data in multiple places and systems makes it more difficult to comply with large clients’ stipulations about confidentiality and the segregation of their data.
Big data presents a potential opportunity, however, in the sense that it may be able to provide insight which firms have found it hard to deliver manually. Already, some firms are using enterprise search tools to locate expertise within their organisations, not by creating and maintaining expertise databases (which are notoriously hard to keep up to date) but by using a search engine to look at the authorship of documents, the narratives in time entries, matter descriptions and CVs/pitches in order to work out who in the organisation has considered a particular issue before. This requires some careful tuning, but can produce remarkably accurate results.
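By way of illustration only, the idea of inferring expertise from existing records can be sketched in a few lines of Python. This is a simplified sketch under stated assumptions, not a description of how any particular enterprise search product works: the source types, the weights given to them, the sample records and the function name rank_expertise are all hypothetical, and a real search engine would use far more sophisticated relevance ranking than plain term matching.

```python
from collections import defaultdict

# Hypothetical weights: how strongly each type of record is assumed to
# signal that its author has real experience of the topic.
SOURCE_WEIGHTS = {
    "document": 3.0,            # authored documents
    "matter_description": 2.0,  # matter descriptions
    "cv": 1.5,                  # CVs and pitches
    "time_entry": 1.0,          # time entry narratives
}


def rank_expertise(records, query):
    """Rank people by the weighted number of their records that
    mention every term in the query (case-insensitive)."""
    terms = query.lower().split()
    scores = defaultdict(float)
    for person, source, text in records:
        lowered = text.lower()
        if all(term in lowered for term in terms):
            # Unknown source types contribute nothing to the score.
            scores[person] += SOURCE_WEIGHTS.get(source, 0.0)
    # Highest score first; ties broken alphabetically by name.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Invented sample records: (person, source type, text).
records = [
    ("A. Patel", "document", "Advice note on cross-border insolvency of a retail group"),
    ("A. Patel", "time_entry", "Reviewing cross-border insolvency filings"),
    ("B. Jones", "cv", "Experience includes cross-border insolvency and restructuring"),
    ("C. Smith", "time_entry", "Drafting lease for retail premises"),
]

ranking = rank_expertise(records, "cross-border insolvency")
for person, score in ranking:
    print(person, score)
```

The weights are where the ‘careful tuning’ comes in: whether an authored document should count for more than a time entry narrative, for example, is a judgement each firm would need to make and test against its own data.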
On a broader scale, analysis of big data may be able to help with questions such as ‘what tasks are generally involved in this type of matter and who handles them?’ in order for firms to analyse their processes and drive efficiencies. It may help firms to analyse the types of circumstances which make the delivery of their services more time-consuming (and therefore costly), enabling lawyers to ask the right questions at the beginning of a matter so that their cost estimate can take account of those circumstances. Further, firms might use their data to assess more broadly the effectiveness of individual lawyers and partners in delivering work and developing client relationships. There may, however, be significant potential for argument over using a large mass of data in this way, since lawyers may be highly suspicious of that data.
There is potential for the use of big data to go deeper, enabling firms to analyse how often a particular argument has been successful in a particular type of case and what the factors are that make such success more or less likely. We are not there yet, and much work – and expense – will be needed in order to realise these ideas.
Parallels with e-discovery/e-disclosure
There are, however, significant parallels with the e-discovery processes which were the theme of much of the conference. The challenge is making sense of large volumes of data, being able to identify the themes that emerge from them and finding a way of presenting this analysis which lawyers can work with. There was indeed a comment from a speaker in one of the conference’s plenary sessions that if ‘predictive coding’ and ‘computer assisted review’ had been the buzzwords of the conference over the last couple of years, ‘information governance’ was likely to be the next buzzword.
For any large organisation, records management and control (weeding) of its data will become a necessity. Just as we have seen new roles appear in the C-suite, such as Chief Information Security Officer (CISO), there is potential for Chief Information Governance Officers (CIGOs) to appear, taking records management to the next level, so that firms:
Understand what data they have
Are able to dispose of unnecessary data
Are able to protect the remainder (in terms of security and data protection)
Can ensure legal and regulatory compliance, and
Can extract valuable insights from the data they have kept.
In our view at 3Kites, there are opportunities to use existing search and litigation support technologies to help extract value from firms’ data, but it will be important to be very clear about what firms want to achieve and to recognise issues about the reliability of the underlying information. More generally, the increasing need for proper information governance will force firms to be more rigorous in how they capture, classify and update their data, especially in relation to their matters. This in turn will enable them to extract much greater value from such data in order to assist in delivering legal work efficiently in the future.
Melanie Farquharson, 3Kites Consulting, February 2013.
3Kites Consulting is a limited company registered in England and Wales. Registered number: 5644909. Registered office: 1 High Street, Knaphill, Woking, Surrey, GU21 2PG. www.3kites.com