28 February 2015

Side Projects

I started my professional career in July 2008, fresh out of college I was really excited about working on real projects. After two months of training I was assigned to the Call Handling (IVR) team, a .Net based project. Soon I realized that the project did not require much coding, whole day went into writing test cases and IVR workflows in XML. This motivated me to work on the following side projects:

Indews.in (2008)
Soon after 2008 Mumbai attacks, I decided to create my own news aggregation site for India, hence the name Indews (Indian News). I was not happy how Google News clubbed the news and sometimes ended up showing stale information as the main headline. I implemented some part of the site, wrote a very basic crawler in C#, hosted the site on a local IIS server at home. It had a really bad interface and did not survive more than 3 months. I had lost all my interest in a general news aggregator, now I was mainly interested in the tech/programming news.

9AM.in (2009-2010)
After shutting down indews.in, I started a project called 9am under http://www.natmac.org/9am/. I got bored with ASP.Net technologies, there was a lot of abstraction and many times I did not understand how things worked underneath, for example ajax implementation of ASP.Net. It was all magic and lots of dlls. And there were a very few open source projects in C#. So decided to move away from .Net and migrated all my code to Java. Having worked on Java and J2EE in college projects, it was easy.
9am was again an RSS/ATOM aggregator, but this time I was crawling the whole web for the tech related stuff and some news sites for general news (kept general news anyway from indews!). You can find an internet archive snapshot here.
9am had a many features, like: finding top keywords, grouping similar items, inbuilt search, categorizing a feed item into one of these categories:
DBs, UI, .Net, S/W Engg, Languages, Mobile, Java, XML, OS
Categorization was based on bag of words and worked fairly well. It was also hosted at my home computer using a static IP and had a 384 Kbps network connection. Crawler used to crawl a few thousand URLs everyday based on a Revisit Policy. The database had around 60,000 feed (RSS/ATOM) URLs and everyday it used to discover new ones. Some of the website owners got pissed with the crawling and asked me to remove the URLs. Since everything was automated, I had no control over URL discovery.
9am was really fun, it used to discover really good articles on the web everyday and I always had something amazing to read in my office. Following is an internet archive screenshot of the Language category under TECH tab:

The whole setup had many issues, day time power cuts, internet outage, slow internet, slow machine, poor MySQL full text search. Despite of all, it used to get ~5000 visits/day from Google.
When I was moving to another city, I had to shut it down. For unknown reasons it remained that way forever. The crawler code can be found at https://code.google.com/p/crowl/ and https://github.com/vikasing/crowl

Mozvo.com (2011-2013)
Mozvo analyzed the sentiments of tweets, reviews and blogs to create a Mozvo score for a movie. It had many other cool features like: movie recommendations, actor profiles, friends' tweets about a movie, movie explorer based on many attributes etc. This was the most ambitious side project I ever did. It also involved two more guys from the same company I was working at. We worked after office, almost everyday, initially it felt like it might end up evolving in a startup. I mainly worked on the back-end part of it, which had MongoDB as its database and a data layer written in Java. It was fun building the core parts. I ended up learning lots of new stuff.

We kept on adding many features without asking our users whether they really wanted them or not. It was like a playground for us, whatever we (or any one of us) thought was cool, we ended up implementing that ignoring the outcome. We did not analyze whether any feature was helping us in retaining the users. Google brought all the traffic and that was not really enough, ~200 visits/day. Gradually we lost our interest and in April 2013 we altogether stopped working on it. It is still alive at mozvo.com but in a dormant state.

GizmoAge (2012)
This was an Android app built on top of PhoneGap, main aim was to collect latest gadget news and group it to remove ambiguity.  The first version of the app was ready to use and did look much better than many apps in the Play Store. I published the app in Play Store, but removed it after a couple of months, don't remember why :), I guess there were some server issues.

Cryptocurrency Mining (2014)
This was my first hardware hacking project. I ended up investing around $1000 in this project, bought 2 top end graphics cards, a 850 watt SMPS and lots of hacky stuff like PCI risers, power buttons from Hong Kong etc.
The rig mined all the popular alt coin currencies at that time from Dogcoin to Coinocoin. I also did some trading at various exchanges. After three months of mining all the fun was gone so I stopped my rig and decided to sell the hardware. But before that following happened:
I had to RMA one of the graphics card and the motherboard short circuited (no RMA). Also lost around 0.1 bitcoin in trading. Sometime later I sold my 0.42 bitcoin and stopped the crypto currency madness all together.
Nevertheless it was fun, got to learn many things about crypto currencies eg. bitcoin, blockchain, ASIC, primecoin, mintcoin, CPU only coins, and there were these crazy ideas of coin drops, also country specific coins like Auroracoin for Iceland.

There were some other small projects here and there:
  • Crowl (2009- ): The web crawler which powered 9am, still working on it, its much more powerful now.
  • NiceText (2012-): This is a very small library I wrote to extract the text from a webpage. Other libs boilerpipe and readability port did not work that well on many pages. This is a part of crowl project. I wrote a post about it and here is the github link.
  • jaLSA (2014-): A lib I wrote to do Latent Semantic Analysis. It was needed for a project I was working on in my previous company.
  • velocityplus (2011): An eclipse plugin for Apache velocity templating engine. Worked on it when I was working in my first company. It is unfinished, got really bored while developing it.
  • Fing.in (2012): A Bollywood news portal, it was supposed to be a sub project of mozvo.com. Finished it locally but did not publish it anywhere.
  • Paltan.org (2008): I briefly worked on creating a social networking website for my college group, it was based on Wordpress based Buddypress. But the Buddypress itself was in beta development and lacked many obvious features. It was all PHP, lost my interest very soon, did not go anywhere, shut it down after sometime.
  • letsj.com (2012): An aggregator for Java related articles. Intention was to use Lucene as a database as well as indexing engine, got into many issues, abandoned.