
Tom Chikoore speaks…

This is my first post here on the Filtrbox blog. Since March, I have been so busy writing code to get Filtrbox from vision to product that I have not had enough time to come up for a breather and compose a coherent blog post. However, I am now confident that the ship is steadying and on course to make a significant contribution to restoring information efficiency. For those of you whom I have not met, I am Tom Chikoore, the CTO and co-founder of Filtrbox. As the Filtrbox CTO, my responsibilities include product vision, product planning and product development. I would like to use this first blog post to give you a rundown of where we are today and how we got here.

Today we have released a version of the Filtrbox Dashboard to our community of users that I think best mirrors our vision of how the seemingly incoherent data on the web can be collected, transformed and packaged into useful information for consumption by Filtrbox users. Although this is a milestone for us, it is only a glimpse into our vision of the future of information efficiency at Filtrbox. The most satisfying aspect of where we are today is how we got here.

From day one, we have made it a goal at Filtrbox to be scrappy, resourceful and smart. Being a team of two, for the most part, we were determined to solve the information efficiency problem while making the most of our scarce resources. To address the problem effectively, we split it into two stages that reflect our final product design, which is based on a complete separation of concerns. The first stage is data collection; the second is data consumption. The reasoning was that before we could present the data for consumption by the user, we needed a solid foundation of data collection algorithms.

This past summer, during Techstars, we set out to build a solid, scalable and extensible data collection foundation based on a framework that we built and code-named “Carnivore”. For a small team like ours, the decision to build a framework is now paying off because our development turnaround is very short; we can build substantial features into the product without breaking any code or breaking a sweat (this software engineering approach will be the subject of a future blog post). “Carnivore” consists of distributable, autonomous data collection agents that run 24 hours a day (“hard at work scouring the universe for new content for you”), cataloging and storing the data in the Indexer database. Engineering and building “Carnivore” has not been without its challenges, not the least of which is building efficient algorithms for our parser, code-named “Gormandizer”, to process data off the Internet, which by nature does not fit any consistent pattern, a.k.a. “dirty” data (this will be the subject of a future blog post). One of the biggest benefits the “Carnivore” framework gave us early on was the ability to plug in data generators. We leveraged this capability to generate the “Filtrbox: XX new articles from XX topic(s)” e-mails that our users receive each morning. The ability to generate e-mail was huge for us because it gave us the opportunity to share the benefits of Filtrbox with some of our closest friends (however, it also came with that ugly Topic/Keyword setup interface; please forgive me for that, I had to give you something :-) ). The slow addiction to Filtrbox daily e-mails has resulted in a large amount of valuable feedback that we are leveraging to improve Filtrbox on a daily basis.

With “Carnivore” in place by the end of summer, we set out on our second stage, data consumption, in early Fall. The data consumption framework allows data collected by “Carnivore” to be distilled into meaningful information and packaged for consumption by the user via various data consumer tools. We cheated a little and started on the data consumption framework before its time when we added the e-mail delivery. E-mail was the harbinger for the data consumer. We learned a great deal from that, and most of what we learned we incorporated into the designs of our second data consumer, the Filtrbox Dashboard. Armed with the design and prototype of the Filtrbox Dashboard, we needed someone to join the team to work on not just the Dashboard but the whole data consumer framework. So, we set out to hire the third member of the team. At Filtrbox we have the fundamental belief that a great product is built on the foundation of a more than excellent team. Excellence is what we were after and excellence is what we sought to find. After reading tons of resumes and conducting numerous phone interviews, a few “get to know you” lunches and a couple of in-person interviews, we finally found the person who fit our mission best, Bruce Deen. Bruce, the “Senior Code Monkey” (ask for Bruce’s business card the next time you run into him), is a Flex (choice of Flex will be the subject of a future blog post) ROCKSTAR who has stepped up to the plate, “owned” the whole data consumption framework and is responsible for the Filtrbox Dashboard that we have today... as far as we can tell he has not even started to dig deep into his bag of Flex tricks. With Bruce working on the data consumer side and me working on data collection, I believe we are finally on course to make a great product that reflects our vision.

That is a “short” chronology of how we got to where we are today. From now on, I will be keeping you informed through this blog on how we are progressing on our quest to take the complex unstructured data on the Internet today and make it simple and consumable by you and me. Lastly, I would like to express my gratitude to all the users who have been sending us feedback. You are definitely making a great contribution to making Filtrbox a better source of information. Keep the feedback coming.

- Tom