Some of the innovative products/companies that I've had the privilege to be a part of include:
There are MANY others – which is good database innovation mojo – especially compared to seven years ago when Mike Stonebraker and I started Vertica. At that point, the standard response from people when I said I was working with Mike on a new database company was "Why does the world need another database engine? Who could possibly compete with the likes of Microsoft, IBM and Oracle?" But the reality was that Oracle and the other large RDBMS vendors had significantly stifled innovation in database systems for 20+ years.
Jit Saxena and the team at Netezza deserve huge kudos for proving that, starting in the early 2000s, the time was right for innovation in large-scale commercial database system architectures. Companies were starved for database systems that were built for analytical purposes. I'm not a fan of using proprietary hardware to solve database problems (amazing how quickly people forgot about the Britton Lee experiment with "database machines"). But putting the proprietary hardware debate aside, thanks to innovators like Mike Stonebraker, Dave DeWitt, Stan Zdonik, Mitch Cherniack, Sam Madden, Dan Abadi, Jit Saxena and many others, we're now well on our way to making up for lost time.
Some other database start-ups of note include:
- NuoDB, Jim Starkey's company
- Akiban Technologies
- ParElastic
- Hadapt, Dan Abadi's company
- Basho, the makers of Riak
- 10gen, the MongoDB company
- Cassandra, not really a company (yet, I think) but a viable key-value store
The challenge now for most commercial IT and database professionals is matching the right new tools with the appropriate workloads. If, as Mike and his team say in their seminal paper, "one size does not fit all for database systems," then one of the hardest next steps is figuring out which database system is right for which workload (a topic for another blog post). This problem is exacerbated by the tendency to over-promote the potential applications for any one of these new systems, but hey, that's what marketing people get paid to do ;)
However, there are still missing pieces. I believe we need:
- Large-scale, multi-tenant analytic database as a service, similar to Cloudant and Dynamo but tuned/configured specifically for analytical workloads with the appropriate network infrastructure to support large loads
- Large-scale, multi-tenant statistics as a service – equivalent functionality to SPSS, R, SAS, but hosted and available as an affordable Web service. The best example of this right now is probably Revolution. I guess the acronym would be Statistics as a Service – or Statistics as a Utility
- Radically better visualization tools and services: I think that HTML5 has clearly enabled this and is making tools like Ben Fry's Processing more accessible so that the masses can do "artful analytics"
It’s very cool to see a bunch of chemists working together to design compounds or libraries of compounds that they wouldn’t otherwise have created. Modern chemists use their remarkable intuition along with incredibly powerful computational models running on high-performance cloud infrastructure. They analyze how active or greasy a potential compound could be, how soluble, big, dense or heavy it is, and how synthetically tractable it might be. Teams of chemists spread across the globe use this data to make better decisions about which compounds are worth synthesizing and which are not, as they seek to discover therapies that make a difference in the lives of patients.