Kaggle – Data Science Competition

Why you should enter a competition

According to Yanir Seroussi, if you’re a data scientist (or want to become one), participating in competitions is a great way of honing your skills, building reputation, and potentially winning some cash.

Introduction to Kaggle

Kaggle is the world’s largest community of data scientists. They compete with each other to solve complex data science problems, and the top competitors are invited to work on the most interesting and sensitive business problems from some of the world’s biggest companies through Masters competitions.

Kaggle describes itself as providing cutting-edge data science results to companies of all sizes, with a proven track record of solving real-world problems across a diverse array of industries, including life sciences, financial services, energy, information technology, and retail.

Profiling top Kagglers

How to compete [1]

  1. Read the manual
  2. Understand the performance measure
  3. Know your data
  4. Understand what you want to achieve before worrying about the how
  5. Set up a local validation environment
  6. Monitor the forum
  7. Do your research
  8. Apply the basics rigorously
  9. Ensemble all the things
  10. Win
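
Steps 2, 5, and 9 above can be made concrete with a small local validation harness. The following is a minimal sketch, not a recommended pipeline: it assumes a regression task scored by RMSE, and the two constant-prediction "models" and all function names are hypothetical placeholders chosen so that the example runs with the Python standard library alone. In practice you would substitute the competition's actual metric and real models.

```python
import math
import random
from statistics import mean, median


def rmse(y_true, y_pred):
    """Root mean squared error; stands in for the competition's metric (assumption)."""
    return math.sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def k_fold_indices(n_rows, k, seed=0):
    """Shuffle the row indices once and deal them into k disjoint folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(y, k=5, seed=0):
    """Score two placeholder models and their average on k held-out folds."""
    scores = {"mean": [], "median": [], "ensemble": []}
    for held_out in k_fold_indices(len(y), k, seed):
        held_out_set = set(held_out)
        # Fit only on rows outside the held-out fold, as on the real leaderboard.
        y_train = [v for i, v in enumerate(y) if i not in held_out_set]
        y_valid = [y[i] for i in held_out]

        # Hypothetical models: each predicts a single constant for every row.
        predictions = {"mean": mean(y_train), "median": median(y_train)}
        # Simplest possible ensemble: average the two models' predictions.
        predictions["ensemble"] = (predictions["mean"] + predictions["median"]) / 2

        for name, constant in predictions.items():
            scores[name].append(rmse(y_valid, [constant] * len(y_valid)))

    # Report the average validation score per model across folds.
    return {name: mean(fold_scores) for name, fold_scores in scores.items()}


if __name__ == "__main__":
    rng = random.Random(42)
    y = [rng.gauss(10.0, 2.0) for _ in range(200)]  # synthetic targets
    for name, score in cross_validate(y).items():
        print(f"{name}: {score:.4f}")
```

The point of the design is that every candidate, including the ensemble, is scored the same way on data it did not see during fitting, so local scores can be compared with each other before spending a leaderboard submission.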


Exploring Data


Machine learning libraries

Python: scikit-learn, XGBoost, Vowpal Wabbit, cuda-convnet2

R [2]: gbm, randomForest, e1071, glmnet, tau, Matrix, SOAR, foreach, doMC, data.table

