Kaggle titanic challenge with torch7

Kaggle Titanic Challenge

Kaggle titanic challenge is a famous knowledge competition which many new Kaggler will try their first Kaggle competition. Since there are currently no tutorial to solve this challenge with artificial neural network, I decided to use torch7 to compete in this competition. FYI, click here to get the data.

Why Torch7

Deep learning is state of the art machine learning algorithm in learning image, video, sound and natural language. Torch7 is one of the famous deep learning framework, which is already used within Facebook, Google, Twitter, NYU, IDIAP, Purdue and several other companies and research labs.

Model

Learning Model

Validation

Validation is using back the training dataset since the dataset will be too small if I separate it. As I am using tanh as transfer function , for those model output bigger than 0, I classified it as survived. Using model after training for 20000 epoch, the result are:

  • False Positive : 31
  • True Positive : 276
  • False Negative : 66
  • True Negative : 518

Submission Result

The score is the percentage of passengers we correctly predict. The submission result using model on epoch 20000 is 0.74641. From the result, we know that our model is slightly overfitted because training set accuracy is 89.113% while test set is 74.646%.

Some Thought

I have showed that torch7 works in Kaggle titanic challenge too. With this method, I can easily get the probability that a passenger is survived based on the input, and this Bayesian behavior is really important in real world use case.

Some Picture

Torch7

Written on June 20, 2015