It’s always exciting to see a smart new social media monitoring product with jazzy new visuals and clever ideas like WYSIWYG geofencing. But, as with many data visualisations, the buzz might be a lot greater than the substance. The true business value of these features depends on the purpose for which you intend to use them, and on the details. In the case of Twitter geofiltering the value is often misunderstood. One of the two most common questions* we are asked concerns Twitter geofiltering and geofencing.
Image: Geofencing map
Size matters — geofiltering creates a sample
The single big idea to take away is that geofiltering creates a sample of all possible relevant tweets.
You cannot get blood from a stone, and your jazzy new social media monitoring package cannot extract location data from tweets that do not carry it within the 65 fields of metadata which accompany every single tweet.
The bottom line is that by geofiltering at a finer level than “country” you will end up with about 10–20 per cent of the tweets that are actually relevant to your keywords. Filtering by country, you will end up with about 20–50 per cent of relevant tweets.
Twitter doesn’t really know where you are
For most people, most of the time, Twitter, unlike Google, does not know where they are. For example:
- A lot of users nominate nonsensical or illogical addresses in their profile location;
- Tens of millions of fake Twitter accounts self-nominate realistic addresses in their profile location;
- Twitter location services are opt-in, and most people do not opt in;
- Adding an actual location to a tweet requires that a user deliberately touch the location function before tweeting, but most people do not; and,
- The generic location added by touching the location function is a broad radius; details are added by toggling “add precise location”, which is scary enough that very few people use it.
As a result, Twitter has only a general idea of where most people are. For the majority of people tweeting it has little idea at all, and at best something generic such as a self-declared country.
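A minimal sketch in Python may make the consequence concrete. The `Tweet` record and its field names are hypothetical simplifications rather than Twitter's actual schema, and the five sample tweets are invented purely for illustration. The point is that a geofence can only be tested against tweets which carry a tweet-level location; everything else falls out of the sample without any warning.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Tweet:
    """Hypothetical, simplified tweet record (not Twitter's real schema)."""
    text: str
    profile_location: Optional[str] = None              # free text typed by the user
    place_name: Optional[str] = None                    # broad place attached to the tweet
    coordinates: Optional[Tuple[float, float]] = None   # (lat, lon) from "add precise location"


def location_signal(tweet: Tweet) -> str:
    """Return the strongest location signal the tweet carries."""
    if tweet.coordinates is not None:
        return "precise"
    if tweet.place_name is not None:
        return "place"
    if tweet.profile_location:
        return "profile_only"
    return "none"


def geofilterable(tweets: List[Tweet]) -> List[Tweet]:
    """Keep only tweets that carry a tweet-level location a geofence could test."""
    return [t for t in tweets if location_signal(t) in ("precise", "place")]


# Invented sample data, for illustration only.
sample = [
    Tweet("Traffic is terrible", profile_location="Melbourne"),
    Tweet("Lunch time", profile_location="somewhere over the rainbow"),
    Tweet("At the market", place_name="Bendigo, Victoria"),
    Tweet("Great view", coordinates=(-36.757, 144.279)),
    Tweet("Hello world"),
]

kept = geofilterable(sample)
print(f"{len(kept)} of {len(sample)} tweets can be tested against a geofence")
```

In this toy data set the three tweets without a tweet-level location are simply never seen by the geofilter, however relevant their text might be.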
Think about it — simply
If you step back from the impressive geofencing feature for a moment and think about someone you know, possibly yourself, you can see why it’s mostly a gimmick. It’s a gimmick because it disguises the underlying technical limitations instead of highlighting them as you search for more accurate and more easily obtained insights. It leaves you vulnerable to forming false conclusions, and to being accused of incompetence if serious information is missed, as can happen in emergency response.
Think about this:
If you have “Melbourne” in your Twitter profile as your location, what does Twitter make of that? Which Melbourne is it: Melbourne, Florida?
If you have “Australia” then it is going to be hard to identify that you are currently in Bendigo and tweeting from Bendigo.
If you have nominated “Kalgoorlie” in your profile does Twitter know that Kalgoorlie is part of the City of Kalgoorlie-Boulder in Australia and “an exciting prospect for tourists”, especially those interested in mining museums? I doubt it.
Even if you deliberately have location services activated and then deliberately touch the location function before sending a tweet, the location added by Twitter is generic in the sense that the designation “Melbourne” covers an area almost as big as Los Angeles in the absence of you further deliberately selecting “add precise location”.
Therefore, the only chance that Twitter will be able to geotag your tweet is if you deliberately add a location to the tweet, as mentioned above.
It is almost common sense that only a very small percentage of people do this.
Once you stop and realise that (1) location services are opt-in, (2) most people do not opt in, and (3) location data, after opting in to the service, has to be deliberately added to every tweet before sending, then you can clearly appreciate that the proportion of tweets capable of being found by geofiltering is a small fraction indeed.
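The profile-location problem described above can be sketched in a few lines of Python. The `GAZETTEER` dictionary and the helper functions are hypothetical and deliberately tiny; this is not how Twitter or any particular monitoring platform resolves locations, only an illustration of why free-text profile fields are so hard to pin down.

```python
# Toy gazetteer: lower-cased place name -> candidate (region, country) pairs.
# Deliberately tiny and incomplete, for illustration only.
GAZETTEER = {
    "melbourne": [("Victoria", "Australia"), ("Florida", "United States")],
    "bendigo": [("Victoria", "Australia")],
    "australia": [(None, "Australia")],
}


def resolve_profile_location(text):
    """Return every candidate match for a free-text profile location."""
    return GAZETTEER.get(text.strip().lower(), [])


def describe(text):
    """Summarise how well a profile location can be resolved."""
    candidates = resolve_profile_location(text)
    if not candidates:
        return "unresolved"      # not in the gazetteer at all
    if len(candidates) > 1:
        return "ambiguous"       # more than one place shares the name
    region, _country = candidates[0]
    return "country only" if region is None else "resolved"


for profile in ["Melbourne", "Australia", "Bendigo", "Kalgoorlie", "over the rainbow"]:
    print(f"{profile!r}: {describe(profile)}")
```

Even in this toy version, only one of the five profile strings resolves cleanly to a town. “Melbourne” is ambiguous, “Australia” is too coarse to place anyone in Bendigo, and anything missing from the gazetteer cannot be placed at all.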
Example shows the results of geofiltering
A search for tweets containing Bendigo and mosque (or mosques) illustrates the point:
- A search without any geofiltering returns 10,256 tweets over the last 12 months; versus,
- A search with “Australia” as the location of the tweets returns 2,693 tweets.
We know that “Bendigo” searched as a topic in conjunction with mosques is very likely to be about Bendigo in Victoria and the “mosque controversy”, so the global non-filtered search is unlikely to contain many irrelevant tweets boosting the count.
In fact, without geofiltering activated, Twitter estimates that 81.1 per cent of such tweets can be identified as coming from Australia. All of which means we can be quite confident that most of the 10,256 tweets found without geofiltering are relevant to the search topic. The number of 10,256 is not inflated.
Therefore, by activating a search using a geofilter we can see that we are discarding roughly three-quarters of the data: only 2,693 of the 10,256 tweets, about 26 per cent, survive the filter.
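For anyone who wants to check the arithmetic, here is a short Python sketch using only the figures quoted in this example. The variable and function names are ours; the second calculation assumes that the 81.1 per cent estimate can be applied directly to the unfiltered count, which is a simplification.

```python
# Figures quoted in the Bendigo example.
total_unfiltered = 10_256      # tweets found without any geofilter
total_geofiltered = 2_693      # tweets found with "Australia" as the location
share_from_australia = 0.811   # Twitter's estimate for the unfiltered set


def retained_share(filtered: float, baseline: float) -> float:
    """Fraction of the baseline that survives the geofilter."""
    return filtered / baseline


# Share of all matching tweets that the geofilter keeps.
retained_of_all = retained_share(total_geofiltered, total_unfiltered)

# Share of the tweets estimated to come from Australia that the geofilter keeps.
# Assumption: the 81.1 per cent estimate applies directly to the 10,256 count.
estimated_australian = total_unfiltered * share_from_australia
retained_of_australian = retained_share(total_geofiltered, estimated_australian)

print(f"Kept of all tweets:        {retained_of_all:.1%}")
print(f"Discarded of all tweets:   {1 - retained_of_all:.1%}")
print(f"Kept of Australian tweets: {retained_of_australian:.1%}")
```

On these figures the geofilter keeps about 26 per cent of all matching tweets and discards about 74 per cent. Measured only against the tweets estimated to come from Australia, it still keeps only about 32 per cent.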
And that is when using an Australia-wide filter. If we narrow down to a town — Bendigo — or even a large city — Melbourne — or we draw fun lines on a fancy map to create geofences, then we obviously will lose most of the data.
By “most” we mean that, very likely, ten per cent or less of the relevant data will surface.
Is working with ten per cent of the data acceptable?
Let’s be very clear. For 90 per cent of use cases of social media monitoring, discarding 90 per cent of the incoming data is high-risk and not acceptable.
We know that because most use cases are about monitoring in real time, and not about applying social data analytics to back-data.
When doing analytics with back-data you might find a sample acceptable, depending on the purpose. For example, if you wish to analyse brand comments or sentiment over a year, then a geofiltered sample of 100,000 tweets will most likely give you similar results to analysing three, four or five times that number.
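A rough way to see why sample size matters so little at this scale is the standard margin of error for a proportion. The Python sketch below assumes a simple random sample and a 95 per cent confidence level (z = 1.96). That is a generous assumption: a geofiltered sample is self-selected, since the people who geotag their tweets may differ from those who do not, so treat the figures as a best case rather than a guarantee.

```python
import math


def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Approximate margin of error for a proportion p measured on n items.

    Assumes a simple random sample; p = 0.5 is the worst case.
    """
    return z * math.sqrt(p * (1 - p) / n)


# 100,000 tweets versus three and five times that number.
for n in (100_000, 300_000, 500_000):
    print(f"n={n:>7,}: +/- {margin_of_error(n):.2%}")
```

Under these assumptions the margin of error is already well below one percentage point at 100,000 tweets, so analysing several times as many changes an aggregate sentiment figure very little.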
However, if you are monitoring for customer service or for emergency response, for example, then losing 90 per cent of tweets through geofiltering is likely to be a very risky solution — very risky indeed.
Furthermore, if you are monitoring for policy formulation and analysis and the tweet volumes are not very high, then you will find it very difficult to draw conclusions from the ten per cent of the data which gets through the geofilter.
Be clear about your use cases. Then, if you require the most effective social media monitoring for use cases such as customer service and emergency service monitoring, invest in a platform which offers the best possible search, internal filtering and tagging, inbound rule formulation, and persistent social profile validation.
Managing the trade-off between the loss of information through geofiltering and the sophisticated use of search and post-search techniques without geofiltering is an art. It comes with practice and expertise in both social media monitoring and the specific tools and platforms. In the future it may be as simple as drawing lines on a map, but at the moment it is a long way from being such a convenient answer.
*The other most common question is about how to monitor Facebook data now that Facebook Topic Data has been introduced and previous access to public personal posts has been terminated.