April 28, 2015 9:12 AM UTC

We received a tip about a gambling site that may be falsely advertising its abilities and wrongfully netting bitcoin in the process.

The subject of the complaint was a site called BETwitter, which is unique in its space. The game allows users to bet on the odds of various words appearing in tweets during a certain window. It’s very easy to sign up and play, but there’s a fatal problem with the site: it uses Twitter’s public API, which may show as little as 1% of the actual tweets using a given keyword. Here is the video that BETwitter has used to introduce the game to the world:

Now, we don’t like to write about what we don’t know, so, for scientific purposes, this writer went ahead and made a BETwitter bet on a Friday night. “For scientific purposes.” Here is a video of the last few minutes of the bet he placed:

He wagered 0.005 BTC that the word “public” would win over “secret.” The author had looked at the past several plays of this same game and noticed that “public” almost always wins. As you can see in the video, he was right, and for this he won 0.0033 BTC.

If the fundamental question about this game is whether it can be fun and rewarding, the answer is yes. But the actual fundamental question about this game is: is it fair? When answering that question, no matter the outcome of one’s bet, one must consider the possibility that literally every game result is misreported by BETwitter.

Firehose Vs. Public API

The Twitter Firehose is the entire stream of Twitter’s data, a massive pipe that requires huge resources to utilize. The person who wrote to us, Joe Murray, was also the first to notice this discrepancy about BETwitter. Joe, a former Data Scientist for a social media analytics company, had this to say to the game’s creator about the BETwitter system, in the game’s own Reddit post:

The public streaming API contains an EXTREME minority of all tweets actually made. You are running a gambling site using (at worst) only ONE PERCENT of all tweets actually being made. Can you honestly say that this is a fair gambling site if the outcomes are determined by only 1% of the full data? Any bet that comes even remotely close to breaking even could be completely false.

The creator, who has admitted to having no significant experience with social media APIs, responded diffidently:

[…] My guess is that 1% is about the maximum rate they will provide to someone using the API. You could still sample just a few keywords (not 400 which is the maximum limit for the public API – we’re using a LOT less!).

This public streaming API will drop tweets if there are too many coming in at once – that is true. And it’s mentioned in the BETwitter FAQ that it may not always have all tweets. But in practice I haven’t been able to produce this.

Some games containing the words ‘LOL’ and ‘love’ and ‘fuck’ have observed over 50k tweets / hour. Most other games work with 1-2k tweets / hour / word, so it’s far from needing to drop anything.

I don’t claim that BETwitter is provably fair, but provably unfair is also not true IMHO.

Well, if it’s not provably fair, what is it? If the metric used to determine the winner and the loser is not mathematically guaranteed, how can this be a fair way to gamble? In dice, you win if your roll is within the pre-defined winning range. In cards, you win if you have the best hand. In roulette, you win if the ball lands on one of the squares you’ve covered. With BETwitter, however, it appears that one could bet on the word that actually won and still lose, because that word’s tweets did not all appear in the public API. An easy fix for the developer would be to contract the services of one of the companies that deal with the actual, full stream of Twitter data, such as Gnip.
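To make the concern concrete, here is a rough simulation sketch. It assumes the worst case Murray describes, in which each tweet independently has a 1% chance of appearing in the public stream, and it assumes the winner is simply the word with more observed tweets. Neither assumption is confirmed as BETwitter’s actual mechanism; the function names and tweet counts are made up for illustration.

```python
import random

def sampled_count(true_count, rate, rng):
    """Count how many of `true_count` tweets survive sampling,
    keeping each tweet independently with probability `rate`."""
    return sum(1 for _ in range(true_count) if rng.random() < rate)

def wrong_winner_rate(count_a, count_b, rate=0.01, trials=500, seed=1):
    """Fraction of simulated games in which the sampled stream fails
    to show the true leader strictly ahead (ties count as failures)."""
    rng = random.Random(seed)
    a_truly_leads = count_a > count_b
    wrong = 0
    for _ in range(trials):
        a = sampled_count(count_a, rate, rng)
        b = sampled_count(count_b, rate, rng)
        if (a > b) != a_truly_leads:
            wrong += 1
    return wrong / trials

if __name__ == "__main__":
    # Hypothetical close game: 2,000 vs 1,900 real tweets in the window.
    print("close game:", wrong_winner_rate(2000, 1900))
    # Hypothetical lopsided game: 5,000 vs 500 real tweets.
    print("lopsided game:", wrong_winner_rate(5000, 500, trials=50))
```

Under these assumptions, a lopsided game is almost never called wrong, while a close one is called wrong a large share of the time, which matches Murray’s point that near-even bets are the ones at risk.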

Other Potential Problems

It seems obvious that this system could be gamed rather easily by someone with a lot of followers, real or purchased, or through other means. It’s like betting that people will pass by a certain storefront on a certain day wearing red shirts: there is a chance that someone could pass out red shirts just up the street on condition that people wear them. In this case, someone could simply tweet the word they bet on a few times, or pay others to tweet it, dramatically increasing the word’s average follower count. Then it wouldn’t matter how many times the other side had done it.

There are services out there on the Internet that let users purchase Twitter followers. Using the followers as the metric, rather than the number of times the word appears, seems fallible; then again, the whole system seems fallible.
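The arithmetic behind that worry can be sketched in a few lines. This assumes a word is scored by the mean follower count of the accounts tweeting it, which is how the metric reads from the description above but is not confirmed; the follower numbers are invented.

```python
def average_followers(follower_counts):
    """Mean follower count across all tweets containing a word."""
    return sum(follower_counts) / len(follower_counts)

# Hypothetical organic traffic: 100 tweets from accounts with 300 followers each.
organic = [300] * 100

# The same traffic plus three tweets from one account with 1,000,000 followers.
boosted = organic + [1_000_000] * 3

if __name__ == "__main__":
    print("organic average:", average_followers(organic))
    print("boosted average:", average_followers(boosted))
```

With these made-up numbers, three tweets out of 103 are enough to lift the average by roughly two orders of magnitude, so the volume of ordinary tweets on the other side barely matters.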

In the end, this is an interesting project that has its downsides. All gambling is hugely risky, but this has the added risk of perhaps not being accurate at all. Use with caution.

