Kaggle is an online platform and community for data analysis, machine learning (ML), data mining and Big Data. It can be reached at www.kaggle.com and lets its members post and take part in competitions, exchange knowledge, and educate themselves. Kaggle's target audience includes data scientists as well as companies and organizations from a wide range of industries and fields. Members can, for example, announce competitions: they post data to be analyzed according to a given specification, and other members of the platform then develop data models that analyze the posted data as well as possible against that specification.

The winner of a competition receives the prize offered in advance. In some cases, the prize money reaches five-, six- or seven-figure US dollar amounts. Beyond running competitions, the platform has continuously expanded its range of services around data science topics. Data scientists use Kaggle to exchange knowledge, educate themselves, search for or publish data sets, and discuss or develop data models.

Founded in 2010 by Anthony Goldbloom and Ben Hamner, Kaggle was acquired by Google in 2017. At that time, more than one million users were already registered on Kaggle. Many of the world's largest companies, such as Microsoft or Facebook, are active on www.kaggle.com. Numerous competitions have been run successfully and have attracted considerable attention from institutions and the press. Their topics have included medical research, the optimization of traffic flows, and handwriting and image recognition.

Target groups of the Kaggle platform

Users, scientists, companies and organizations from various fields and industries are active on the Kaggle platform. In principle, the platform's services are aimed at anyone who deals with topics such as Big Data, data mining, machine learning, data analysis and artificial intelligence (AI). Kaggle's target groups can post competitions, participate in them, receive further training or exchange knowledge. Often, data scientists or companies use Kaggle to find solutions to specific problems based on existing datasets.

Kaggle competition

The competitions announced on Kaggle are usually about developing data models to analyze existing data sets as effectively as possible according to a problem definition. Some of the data models developed may subsequently be used for other problems. The data and questions are provided by the company or the organization that announces the competition. In principle, anyone can participate in a competition. The prerequisite is a free account on Kaggle. Contests on Kaggle are public or private. The basic procedure of a contest is as follows:

  • the initiator of a contest describes the problem or question and prepares the data that will be used to solve it
  • the data and questions are published on Kaggle together with a contest deadline. At the same time, the initiator can offer a prize for the best solution
  • the participants of the contest use the data and develop techniques or data models that they think answer the question most effectively. Ideas can be made publicly available and discussed
  • once a good model is found, it can be submitted
  • the submitted data model is then evaluated. Evaluation criteria include, for example, prediction accuracy, which is measured against solution data known only to the competition initiator
  • the submitted data models are published with a short summary in a live ranking list
  • participants have the opportunity to revise and optimize their models, which can then be resubmitted and evaluated again
  • once the deadline has passed, the participant with the best solution receives the prize offered
  • the contest initiator may use the solution
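
The evaluation and ranking steps above can be illustrated with a minimal Python sketch. It is not Kaggle's actual scoring code: the names `accuracy`, `Leaderboard`, `submit` and `ranking` are hypothetical, plain classification accuracy is assumed as the evaluation criterion, and keeping only each participant's best score is likewise an assumption made for illustration.

```python
from dataclasses import dataclass, field


def accuracy(predictions: dict[str, int], solution: dict[str, int]) -> float:
    """Share of solution rows whose predicted label matches the hidden label.

    Rows missing from the submission count as wrong.
    """
    if not solution:
        raise ValueError("solution must not be empty")
    correct = sum(
        1 for row_id, label in solution.items()
        if predictions.get(row_id) == label
    )
    return correct / len(solution)


@dataclass
class Leaderboard:
    # Hidden labels, known only to the competition initiator (hypothetical data).
    solution: dict[str, int]
    # Best score achieved so far by each participant.
    best: dict[str, float] = field(default_factory=dict)

    def submit(self, participant: str, predictions: dict[str, int]) -> float:
        """Score one submission and keep the participant's best result."""
        score = accuracy(predictions, self.solution)
        self.best[participant] = max(score, self.best.get(participant, 0.0))
        return score

    def ranking(self) -> list[tuple[str, float]]:
        """Participants sorted by best score (highest first), then by name."""
        return sorted(self.best.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    board = Leaderboard(solution={"a": 1, "b": 0, "c": 1, "d": 0})
    board.submit("alice", {"a": 1, "b": 0, "c": 0, "d": 0})  # 3 of 4 correct
    board.submit("bob", {"a": 1, "b": 1, "c": 0, "d": 0})    # 2 of 4 correct
    board.submit("bob", {"a": 1, "b": 0, "c": 1, "d": 0})    # revised model, 4 of 4
    print(board.ranking())  # [('bob', 1.0), ('alice', 0.75)]
```

Because the solution data stays with the `Leaderboard` object, participants only ever see their scores, which mirrors the idea that the solution data is known only to the competition initiator.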

Examples of contests on Kaggle

Since Kaggle was founded in 2010, many contests from different fields have been held. Some of the prizes reached five-, six- or seven-figure US dollar amounts. Topics have included medical research, the optimization of traffic flows, and handwriting and image recognition. Examples of Kaggle competitions include:

  • U.S. space agency NASA: detecting dark matter using galaxy images
  • HIV/AIDS research
  • Handwriting recognition
  • Image analysis for early detection of lung cancer
  • Fish detection and identification

Other Kaggle offerings and services

In addition to running data science competitions, Kaggle offers other services such as:

  • Kaggle Kernels: cloud-based workbench for writing and running Python or R code.
  • Public platform for publishing and sharing datasets
  • Kaggle Learn: Online learning platform and courses for AI education.
  • Job platform for machine learning and AI specialists