Funding Open Data With Gittip

to make the revenue fit the expense

Paul Houle, creator of database animals and bayesian brains
March 19, 2014

Past as prelude

The early 2000s were a heady time in the library world. Granting agencies were in love with "digital libraries", and you could get $40,000 just like that to put cool stuff from the archive online or directly help departments with their academic missions.

It's not that expensive to maintain a digital library. Sometimes you can go for years without thinking about one, but someday you're going to replace a server and have to pick up every piece of the system and put it back into place. I'm optimistic because I'm a good maintenance programmer, and I was in a crack Library IT group that kept things running like a top on quality iron, but each year you're going to spend at least 5% of what it cost to build a site to keep it running. Quite possibly a lot more. In the long term it adds up, and the grant-writers saw that each digital library project dug a deeper and deeper hole in the budgets, so digital libraries have gone out of style.
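To see how that 5% floor adds up, here is a minimal sketch. It assumes the $40,000 grant is the whole build cost and that upkeep is a flat 5% of that cost every year; both are simplifications for illustration, not figures from any real budget.

```python
def cumulative_maintenance(build_cost, annual_rate, years):
    """Total upkeep after `years`, assuming a flat fraction of build cost per year."""
    return build_cost * annual_rate * years


build_cost = 40_000   # assumption: the grant covers the entire build
annual_rate = 0.05    # the "at least 5%" floor, treated as constant

for years in (1, 5, 10, 20):
    total = cumulative_maintenance(build_cost, annual_rate, years)
    print(f"after {years:2d} years: ${total:,.0f} spent on upkeep")
```

Under these assumptions the upkeep matches the original grant after 20 years, and no project grant ever paid for it.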

What I saw at that time was that project-based funding doesn't address the predictable recurring costs caused by projects. We needed something different.

The traditional funding landscape

Many forms of funding, such as private and government grants, as well as Kickstarters, are project-oriented. Large amounts are also spent on projects within and between commercial organizations. There is a beginning and an end to a project, and (at the end) a definite amount of time and money spent. There's a science of project management, which is why PMI Certification is worth something.

There are many activities, however, that don't have a beginning or an end. For instance, a restaurant serves customers on a regular basis, with a progression towards profit and satiety rather than an endpoint in time. A hackerspace needs cash each month to pay the rent. These types of activities need a recurring stream of income to support them; project-based funding can fail even the most successful projects if it fails to keep their products alive.

Introducing Gittip

Gittip helps patrons donate small amounts of cash (anything from $0.25 to $100) a week to other Gittip users on an automatic basis. The system automatically charges your credit card for donations and deposits donations directly into bank accounts.

Trust is going to be a big issue for any service like this, but because Gittip makes transactions through Balanced Payments, and thus through the conventional credit card and ACH system (unless you insist on using Bitcoins through Coinbase), this is a system that is relatively safe. Because the system doesn't depend on borrowing, and doesn't accumulate large fund balances, there isn't the risk of massive value destruction possible with either Mt. Gox or a conventional bank.

Is Gittip a viable business?

It's natural that people who benefit from Gittip would want to support it, and that is how Gittip gets revenue.

As of March 2014, founder Chad Whitacre earns about 4% of all gifts on Gittip and gives back about 20% of what he receives. The official Gittip account earns about 3% of all gifts and gives back a bit less than 10% of what it earns. Perhaps some other contributors get gifted as well, but we could estimate that the operators get about 5-8% of the cash that goes through. That amounts to enough to support 1 or perhaps 2 FTE in the US, although not at prime programming rates. If the cash flow through the system doubles or triples and people continue to donate at the same rate, it should be comfortably self-supporting.
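The arithmetic behind that estimate can be sketched as follows. It reads each "gives back" percentage as a fraction of what that account receives (an interpretation), rounds "a bit less than 10%" up to 10%, and ignores gifts to any other contributors.

```python
def retained_share(received_share, give_back_fraction):
    """Share of all gifts an account keeps after re-gifting part of what it receives."""
    return received_share * (1 - give_back_fraction)


founder = retained_share(0.04, 0.20)    # receives ~4% of gifts, gives back ~20%
official = retained_share(0.03, 0.10)   # receives ~3% of gifts, gives back just under 10%
total = founder + official

print(f"founder retains:  {founder:.1%} of all gifts")
print(f"official account: {official:.1%} of all gifts")
print(f"combined:         {total:.1%} of all gifts")
print(f"per $1,000 of gifts, about ${1000 * total:,.2f} stays with the operators")
```

The combined figure comes out near 5.9%, at the low end of the 5-8% range; gifts to other contributors would push it higher.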

My Plan

I've spent a few years developing methods to get information out of large generic databases such as Freebase, which is the public face of the Google Knowledge Graph. Something fundamental about this is openness. Freebase lets us refer to entities such as Garrett Morgan with unique identifiers, and this kind of Linked Data is a universal language that can be used between computers. This kind of data is already accepted by the Internet giants that have backed the organization. The software tools and data sets I'm about to describe give individuals, academics and smaller organizations greater ability to participate in the global knowledge graph.

I've used primitive tools in the past to create web sites like NY Pictures and Ookaboo.

Every week, Freebase releases a file of more than 20 gigabytes that my open source Infovore software converts into a refined product called :BaseKB. By removing unnecessary, uninteresting and invalid data, it cuts the size of Freebase in half or more. :BaseKB and Infovore are useful starting points for anyone wanting to build their own knowledge graph, because the data cleaning functions alone can mean the difference between failure and success in a project.

Because of the limits of the server, it took me a whole month to download the Wikipedia PageCounts to Amazon S3. This 4 terabyte data set contains page view statistics at hourly granularity for every Wikimedia project from 2008 to the present. Inside Amazon S3, it is easy to work on this data with my open source Telepath software.

The first product I've developed is :SubjectiveEye 3D, which averages popularity information for concepts from 2008 to the present. This sort of product is essential for web sites like Ookaboo, where users can select topics using type-ahead search, and also essential for text analysis programs which require an estimate of how frequently different topics come up.

This data provides a detailed history of how people, living in all the world's language zones, have been interested in various topics at moments in time. There are so many exciting applications, there's no way I can develop or conceive them all, so I'd like to make this data available to everyone in the S3 cloud. The Telepath software, which consumes this data, is a starting point for anyone who wants to explore it.

Although a limited version of this data already exists as an AWS public data set, this can be a complete resource that is automatically kept up to date.

This operation is remarkably cheap to run, but it does cost money. If I receive $125 a week in donations on Gittip, that will pay most of the storage, transfer, and processing charges for this data. When I reach that goal, I will make my copy of the complete PageCounts data freely available inside Amazon S3.

Getting these expenses paid will free my mind to think about using the tools I've developed to create more web sites and solve problems for customers. If I think I can find more in donations, I expect to set higher goals in line with increasing commitments. I don't expect Gittip to pay my living expenses; however, I do need a cash stream to pay for recurring expenses connected with Open Data, and I'd like Gittip to be one part of a diversified income stream.

Please make a recurring donation now at my Gittip page.

The Big Picture

So far as I know, all religions teach their followers to give or tithe a certain amount of their income. Some people give to a church, some people give to the United Way, and there are all kinds of fundraising programs such as P.T.A., Girl Scout Cookies, and the Race for the Cure. Some see it as not just an obligation, but something that contributes to general prosperity.

That's particularly the case for the open source and open data communities. Commercial software sold by the month is all the rage these days; we see everything from Office 365 to Adobe Creative Cloud to Basecamp. That's a good thing, and we can extend it to the open source movement. Personally, I've seen open source products contribute thousands to hundreds of thousands of dollars of value for me and the people I've worked for, and it's just logical that if I give something back I'll get more of the software that I love.

If you agree, please make a contribution at my Gittip page.
