The NATO-led Libyan campaign has increased the monitoring of Twitter and other social media in its mission planning, according to today’s Financial Times. Because there are “too few special forces on the ground”, NATO “will take information from every source we can”, according to RAF Wing Commander Mike Bracken, the Libyan operation’s military spokesman. The article even quotes Twitter user @4libya, who on Tuesday tweeted to @NATO what she claimed were coordinates for Gaddafi forces. Without discussing whether this information was used or not, the article goes on to discuss both the advantages and potential pitfalls of using social media in open-source analysis.
Of the many potential pitfalls of canvassing social media, one barely touched on in the article is the simple fact that there is just too much of it. The firehose of information continues to gush, and any attempt to drink it all will quickly overwhelm a system. An analyst therefore needs a tool that highlights the relevant nuggets and points to what should be read before anything else. This is where Recorded Future comes in.
We chose thirty Libya-based Twitter users and collected the past three months of their tweets. We then spun up a local instance of the Recorded Future platform, which extracts not just entities and events but statements about time. By harvesting this dataset, we were able to see not just who and what the Twitter users were talking about, but when those things were going to happen. This produced some interesting results.
The initial dataset consisted of about 30,000 tweets. Even at only 140 characters each, that’s still a lot to read. Of those tweets, 6,840 mentioned a point in time. The overwhelming majority of those time mentions reported on things that had already happened: last night, last week, last Tuesday, etc. However, 144 of those tweets, or about 2% of the tweets that mentioned a time (under 0.5% of the full dataset), mentioned a time point in the future: tomorrow, next week, next Tuesday, etc. In effect, they were reporting on something that was going to happen. That giant 30,000-tweet dataset has been triaged to a subset that can be read in minutes.
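To make the triage idea concrete, here is a minimal sketch in Python. It is not how the Recorded Future platform extracts time references; it is a toy keyword filter, and the phrase lists, function names, and sample tweets are all invented for illustration.

```python
import re

# Hypothetical weekday alternation reused by both patterns below.
_DAYS = r"monday|tuesday|wednesday|thursday|friday|saturday|sunday"

# Deliberately small phrase lists (an assumption, not a real extractor):
# a production system recognizes far more expressions and resolves them
# to actual calendar dates.
FUTURE = re.compile(rf"\b(tomorrow|next (week|month|{_DAYS}))\b", re.IGNORECASE)
PAST = re.compile(rf"\b(yesterday|last (night|week|month|{_DAYS}))\b", re.IGNORECASE)


def classify(text):
    """Return 'future', 'past', or None when no known time phrase appears."""
    if FUTURE.search(text):
        return "future"
    if PAST.search(text):
        return "past"
    return None


def triage(tweets):
    """Keep only the tweets that mention a future time point."""
    return [t for t in tweets if classify(t) == "future"]


# Made-up sample tweets, not taken from the dataset described above.
sample = [
    "Heavy shelling reported last night",
    "Protest planned for next Tuesday in the main square",
    "No electricity in the city",
    "Aid convoy expected tomorrow",
]
forward_looking = triage(sample)
```

Even a crude filter like this shows the shape of the workflow: most tweets drop out, and the analyst reads the small forward-looking remainder first.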
Recorded Future also allows a deeper dive into the dataset. It’s easy enough to determine, for example, which users are tweeting the most often, and which are tweeting the most often about events to happen in the future. The following graphic, created in Spotfire, shows tweets per day by user, with future-looking tweets in pink:
Some people tweeted more as time went along, while others fell off. Some people started making more predictions over time, while others made fewer. Recorded Future lets the analyst quickly visualize both, as well as, of course, allowing them to drill down to the actual underlying data.
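The per-user, per-day breakdown described here can be approximated with a simple count. The sketch below assumes each tweet has already been reduced to a (user, day, is_future) record; that record layout, the function name, and the sample values are assumptions made for illustration, not the platform's data model.

```python
from collections import Counter
from datetime import date


def daily_counts(records):
    """Count all tweets and future-looking tweets per (user, day) pair."""
    total = Counter()
    future = Counter()
    for user, day, is_future in records:
        total[(user, day)] += 1
        if is_future:
            future[(user, day)] += 1
    return total, future


# Made-up records: (user, day, mentions a future time point?)
records = [
    ("user_a", date(2011, 6, 1), False),
    ("user_a", date(2011, 6, 1), True),
    ("user_a", date(2011, 6, 2), False),
    ("user_b", date(2011, 6, 1), True),
]
total, future = daily_counts(records)
```

Plotting `total` per day with `future` overlaid in a second color would give the kind of chart described above, and filtering the records to one user gives that user's forward-looking statements.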
For instance, suppose an analyst wants to focus specifically on one user’s forward-looking predictions. It is easy enough in Recorded Future to pull that data:
By harvesting a single source and using the Recorded Future technology to extract time points, we can easily look at all future-looking statements made by a single user. This can assist in model building, in credibility scoring, and in mission planning.
The analysis of open-source social media can be a great tool, and the NATO mission in Libya shows that forces are attempting to use it. The first problem anyone will face is how to deal with the huge amount of data. The Recorded Future platform, which government customers can install behind a private firewall, can take this mass of unstructured text and make sense of it, aligning events and entities across time.