Why you should enter a competition
According to Yanir Seroussi, if you’re a data scientist (or want to become one), participating in competitions is a great way of honing your skills, building reputation, and potentially winning some cash.
Introduction to Kaggle
Kaggle is the world’s largest community of data scientists. They compete with each other to solve complex data science problems, and the top competitors are invited to work on the most interesting and sensitive business problems from some of the world’s biggest companies through Masters competitions.
Kaggle provides cutting-edge data science results to companies of all sizes. It has a proven track record of solving real-world problems across a diverse array of industries, including life sciences, financial services, energy, information technology, and retail.
Profiling top Kagglers
- Profiling Top Kagglers: KazAnova, Currently #2 in the World
- Profiling Top Kagglers: Owen Zhang, Currently #1 in the World
How to compete (Seroussi, 2015)
- Read the manual
- Understand the performance measure
- Know your data
- Understand what you want to achieve before worrying about the how
- Set up a local validation environment
- Monitor the forum
- Do your research
- Apply the basics rigorously
- Ensemble all the things
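Several of the steps above (understanding the performance measure, setting up local validation, applying the basics, and ensembling) can be illustrated together. The following is a minimal sketch using only the Python standard library, not a recipe from the cited article. Everything in it is an assumption made for illustration: RMSE stands in for whatever metric a competition actually uses, the data is synthetic, and the helper names (`k_fold_indices`, `fit_mean`, `fit_line`, `fit_ensemble`, `cross_validate`) are hypothetical.

```python
import math
import random
from statistics import mean


def rmse(y_true, y_pred):
    """Root mean squared error; a stand-in for the competition's metric."""
    return math.sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, valid) index lists for k shuffled, disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, valid in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, valid


def fit_mean(xs, ys):
    """Baseline: always predict the mean of the training targets."""
    centre = mean(ys)
    return lambda x: centre


def fit_line(xs, ys):
    """Ordinary least squares for a single feature."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return lambda x: y_bar + slope * (x - x_bar)


def fit_ensemble(xs, ys):
    """Average the predictions of the two models above."""
    models = [fit_mean(xs, ys), fit_line(xs, ys)]
    return lambda x: mean(m(x) for m in models)


def cross_validate(fit, xs, ys, k=5):
    """Mean validation RMSE of `fit` across k folds."""
    scores = []
    for train, valid in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        preds = [model(xs[i]) for i in valid]
        scores.append(rmse([ys[i] for i in valid], preds))
    return mean(scores)


# Toy data: y = 2x + 1 plus Gaussian noise (made up for illustration).
rng = random.Random(42)
xs = [float(x) for x in range(40)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 1.0) for x in xs]

candidates = [("mean", fit_mean), ("line", fit_line), ("ensemble", fit_ensemble)]
for name, fit in candidates:
    print(f"{name:>8}: CV RMSE = {cross_validate(fit, xs, ys):.3f}")
```

On this toy data the naive average of a weak baseline and a good model scores worse than the good model alone. That is the kind of thing a local validation setup is meant to reveal before a submission is spent: averaging tends to help only when the combined models are of comparable strength and make different errors.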
Machine learning libraries
Python: scikit-learn, XGBoost, Vowpal Wabbit, cuda-convnet2
R (DataRobot, 2015): gbm, randomForest, e1071, glmnet, tau, Matrix, SOAR, foreach, doMC, data.table
- Yanir Seroussi, 2015. 10 Steps to Success in Kaggle Data Science Competitions. [ONLINE] Available at: http://www.kdnuggets.com/2015/03/10-steps-success-kaggle-data-science-competitions.html. [Accessed 10 March 2015].
- DataRobot, 2015. 10 R Packages to Win Kaggle Competitions. [ONLINE] Available at: http://www.slideshare.net/DataRobot/final-10-r-xc-36610234. [Accessed 03 July 2014].