Introducing FawkesJs

It has been a while since I was last active in the open-source world. Although the current trends are VR, AI and IoT, API development is still needed.

FawkesJs is a JavaScript framework built on top of Express and TypeScript with an MVC structure. Inspired by Laravel and Loopback, the goal of the framework is to make JavaScript development even easier.

Built-in structure in this project

  • Express
  • Sequelize
  • Typescript
  • Swagger: use fawkesjs -s ./swagger/swagger.json to generate the Swagger document
  • Express REST parameter validation: integrated with Swagger document generation
  • Acl (inside fawkesjs-starter/src/module)
  • AccessToken (inside fawkesjs-starter/src/module)

Why FawkesJs

  • NodeJS has good async support, which PHP lacks
  • Laravel is the clear first choice in PHP; NodeJS, however, is still fragmented across many framework choices
  • Express in NodeJS is good, however it is too minimalist
  • Loopback is good, however I personally think Laravel's structure is better organized than Loopback's
  • With TypeScript, we get better type checking at development time, which is convenient when developing with Atom
  • A name is just a symbol; it is so hard to come up with good naming
  • Rust seems promising, however I’m still waiting for Hyper to implement async IO.

Usage

  • git clone https://github.com/fawkesjs/fawkesjs-starter
  • follow the README
Read More

Lubuntu to Replace Windows

lubuntu

Why Lubuntu

As a developer, you must have heard of Ubuntu. FYI, Lubuntu is a lightweight version of Ubuntu. Using Lubuntu, you have more control over your PC.

Advantages of replacing Windows with Lubuntu

  • A combination of the Windows 7 feel with the command line
  • Applications can run directly in Docker containers instead of VirtualBox, which is faster.
  • Most programs run faster, for example android-studio. This might be because fewer background applications run than in Windows (and those applications are hard to disable).
  • More control over the system, and more debug messages can be seen.

Disadvantages of using Lubuntu

  • Most gaming software runs natively only on Windows. Worse, I installed the Lubuntu Xenial 16.04 LTS version and it has dropped support for my AMD Radeon graphics card model.
  • The UI is not as beautiful as Windows

Some Bug Fixes

  • Fix no sound: install restrict-ubuntu
  • Fix frequent crashes with the Dell Inspiron graphics card: upgrade from kernel 4.4 to kernel 4.6
  • Locale support: go to Start > Preferences > Language Support
  • Unable to start docker: sudo docker daemon -D -s vfs

Some examples

  • Run this blog locally
cd /e/nghenglim.github.io/ # contain the repo
sudo docker run -d -v "$PWD:/src" -p 4000:4000 grahamc/jekyll serve -H 0.0.0.0

Conclusion

If your graphics card is not supported by Lubuntu, the best setup is to dual boot Lubuntu and Windows: play games and watch movies in Windows, but do development in Lubuntu.

Read More

Google Tensor Processing Unit

If you have seen Google I/O 2016, you might already know about the Google Tensor Processing Unit, or TPU.

Why TPU is important

As machine learning people will know, the current industry trend is to do machine learning, especially image processing, on GPUs. A well-known example is doing image convolution with CUDA. However, a GPU is designed mainly for graphics processing rather than machine learning, so it does not perform optimally for this workload.

IMO this is the reason Google is releasing the TPU. The TPU will be optimized for machine learning training, and I foresee it will come with first-class support for TensorFlow.

Forecasting

Google is aiming big on A.I., and the TPU is one of the strategies to keep them in the mainstream. It is also symbolic of the maker spirit being on the rise - traditionally a company would ask a vendor such as Intel to produce these chips.

Read More

My Blog Post Frequency

A note on my blog post frequency: I have been busy with projects recently and most of the work is not appropriate to share. Therefore the blog post frequency will change from weekly to random - I will try to have a blog post at least once a month.

Read More

TensorFlow Udacity 1_notmnist - Part 6

tensorflow-udacity

Summary of 1_notmnist

Basically, 1_notmnist teaches how to display data in a Jupyter Notebook. It also introduces sklearn - a Python machine learning library - so that we can later compare it with TensorFlow. This is the exact ipynb file at the TensorFlow GitHub repo.

Notice

This is shared as a way to discuss better approaches to the 1_notmnist problems. Do not copy and paste directly, as that does not help you improve + the answers are not optimized.

The entire series of TensorFlow Udacity posts can be found here

Solving Problem 6

import pickle
import numpy as np
from sklearn.linear_model import LogisticRegression

def reshape(a):
    return a.reshape(a.shape[0], a.shape[1]*a.shape[2])

t = pickle.load(open("notMNIST.pickle", "r"))
y = t['train_labels']
X = reshape(t['train_dataset']) # reshape it to a 2d array
del(t) # this should free up more memory space
# train on 0:10000 only because there is not enough memory in the docker container
# there is probably a way to do batch learning with scikit-learn
# http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html
# http://scikit-learn.org/stable/auto_examples/classification/plot_classification_probability.html
C = 1.0
classifier = LogisticRegression(C=C, penalty='l1')
classifier.fit(X[0:10000], y[0:10000])
y_pred = classifier.predict(X)
classif_rate = np.mean(y_pred.ravel() == y.ravel()) * 100
print("classif_rate for %f " % (classif_rate))
# now inspect predictions on samples 10001 to 20000, which were not used for training
# ideally the accuracy should be computed on this held-out slice as a percentage
print(y[10001:20000])
print(y_pred[10001:20000])

png

Comment

This time we have learnt how to use scikit-learn to do logistic regression on notMNIST. As we can see, the classification accuracy is not bad.

I expect TensorFlow to be faster and more structured than this solution (otherwise Google probably would not use this as an example).
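As a small follow-up (a sketch only, reusing the X, y and fitted classifier from the cell above), the accuracy can be computed directly on a held-out slice instead of eyeballing the printed labels:

# mean accuracy on samples the classifier never saw during training
held_out_acc = classifier.score(X[10000:20000], y[10000:20000]) * 100
print("held-out accuracy: %.2f%%" % held_out_acc)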

Read More

TensorFlow Udacity 1_notmnist - Part 5

tensorflow-udacity

Summary of 1_notmnist

Basically, 1_notmnist teaches how to display data in a Jupyter Notebook. It also introduces sklearn - a Python machine learning library - so that we can later compare it with TensorFlow. This is the exact ipynb file at the TensorFlow GitHub repo.

Notice

This is shared as a way to discuss better approaches to the 1_notmnist problems. Do not copy and paste directly, as that does not help you improve + the answers are not optimized.

The entire series of TensorFlow Udacity posts can be found here

Solving Problem 5

def reshape(a):
    return a.reshape(a.shape[0],a.shape[1]*a.shape[2])
t = pickle.load(open("notMNIST.pickle", "r"))
unique_td = unique_rows(reshape(t['train_dataset']))  # unique_rows: see the helper sketch below
duplicate_rows = len(t['train_dataset']) - len(unique_td)
print(duplicate_rows)

png
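unique_rows is not a numpy built-in, so the notebook presumably defines it in an earlier cell. One possible implementation (a sketch only) deduplicates whole rows by viewing each row as a single record:

import numpy as np

def unique_rows(a):
    # view each 2d row as one opaque record so np.unique can deduplicate entire rows
    a = np.ascontiguousarray(a)
    view = a.view(np.dtype((np.void, a.dtype.itemsize * a.shape[1])))
    _, idx = np.unique(view, return_index=True)
    return a[idx]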

Comment

There are still 2 optional questions that I have not yet answered. I will get back to them in the future.

Read More

TensorFlow Udacity 1_notmnist - Part 4

tensorflow-udacity

Summary of 1_notmnist

Basically, 1_notmnist teaches how to display data in a Jupyter Notebook. It also introduces sklearn - a Python machine learning library - so that we can later compare it with TensorFlow. This is the exact ipynb file at the TensorFlow GitHub repo.

Notice

This is shared as a way to discuss better approaches to the 1_notmnist problems. Do not copy and paste directly, as that does not help you improve + the answers are not optimized.

The entire series of TensorFlow Udacity posts can be found here

Solving Problem 4

print(np.unique(train_labels))
print(np.unique(test_labels))
print(np.bincount(train_labels))
print(np.bincount(test_labels))

png

Comment

Since the image count for each label in train_labels and test_labels stays the same after shuffling, we can say that the data is still good after shuffling!

Read More

TensorFlow Udacity 1_notmnist - Part 3

tensorflow-udacity

Summary of 1_notmnist

Basically, 1_notmnist teaches how to display data in a Jupyter Notebook. It also introduces sklearn - a Python machine learning library - so that we can later compare it with TensorFlow. This is the exact ipynb file at the TensorFlow GitHub repo.

Notice

This is shared as a way to discuss better approaches to the 1_notmnist problems. Do not copy and paste directly, as that does not help you improve + the answers are not optimized.

The entire series of TensorFlow Udacity posts can be found here

Solving Problem 3

len_dict = {}
for folder in folder_names:
    t = pickle.load(open(dir_name + "/" + folder + ".pickle", "r"))
    len_dict[folder] = len(t)
print(len_dict)

png

Comment

Since the image count for each class is roughly 52911, we can say that the data is balanced. A more appropriate way, though, would be to calculate the standard deviation :)
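For example (a small sketch, reusing the len_dict computed above), the spread of the class counts can be checked numerically:

import numpy as np
counts = np.array(list(len_dict.values()))
# a standard deviation that is tiny relative to the mean suggests balanced classes
print(counts.mean(), counts.std())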

Read More

TensorFlow Udacity 1_notmnist - Part 2

tensorflow-udacity

Summary of 1_notmnist

Basically, 1_notmnist teaches how to display data in a Jupyter Notebook. It also introduces sklearn - a Python machine learning library - so that we can later compare it with TensorFlow. This is the exact ipynb file at the TensorFlow GitHub repo.

Notice

This is shared as a way to discuss better approaches to the 1_notmnist problems. Do not copy and paste directly, as that does not help you improve + the answers are not optimized.

The entire series of TensorFlow Udacity posts can be found here

Solving Problem 2

# first load the pickle file, loading one file for illustration purposes
t = pickle.load(open("notMNIST_large/A.pickle", "r"))
# %matplotlib inline is needed to show plots inside the Jupyter Notebook
%matplotlib inline
# plot one of the images, the 5th one to be exact (index 4)
plt.imshow(t[4], interpolation='nearest')
# show the image
plt.show()

png

Read More

TensorFlow Udacity 1_notmnist - Part 1

tensorflow-udacity

Summary of 1_notmnist

Basically, 1_notmnist teaches how to display data in a Jupyter Notebook. It also introduces sklearn - a Python machine learning library - so that we can later compare it with TensorFlow. This is the exact ipynb file at the TensorFlow GitHub repo.

Notice

This is shared as a way to discuss better approaches to the 1_notmnist problems. Do not copy and paste directly, as that does not help you improve + the answers are not optimized.

The entire series of TensorFlow Udacity posts can be found here

Preparation

# start a docker container
docker run -p 8888:8888 -it b.gcr.io/tensorflow-udacity/assignments:latest

Problem 1

Let’s take a peek at some of the data to make sure it looks sensible. Each exemplar should be an image of a character A through J rendered in a different font. Display a sample of the images that we just downloaded. Hint: you can use the package IPython.display.

Solving Problem 1

import os, random
dir_name = "notMNIST_large"
folder_names = ["A", "B", "C", "D", "E", "F", "G", "H", "I", "J"]
for folder in folder_names:
    im_name = random.choice(os.listdir(dir_name + "/" + folder))
    im_file = dir_name + "/" + folder + "/" + im_name
    display(Image(filename=im_file))

png

Comment

Before getting to Problem 1, I spent a lot of time downloading notMNIST_large due to the low RAM I had given to the VM that runs the docker container.

For the problem itself, it teaches us to use display(Image(filename=im_file)), which is a very useful way of showing an image file in a Jupyter Notebook.

Read More

AlphaGo Wins Against Lee Sedol, a New Milestone for AI

AlphaGo

AlphaGo

Quoting from Wikipedia, as of 12 Mar 2016:

AlphaGo is a computer program developed by Google DeepMind in London to play the board game Go. In October 2015, it became the first computer Go program to beat a professional human Go player without handicaps on a full-sized 19×19 board. In March 2016, it beat Lee Sedol in the first three games in a five-game match, the first time a computer Go program has beaten a 9-dan professional without handicaps.

AlphaGo’s algorithm uses a combination of machine learning and tree search techniques, combined with extensive training, both from human and computer play.

Why AlphaGo Wins Matter - Comparing to Watson

  • Watson
    • Brute force + NLP + the advantage of not needing a hand to press the button
    • Does not actually represent what a human mind is doing
    • Most probably requires a large amount of computing resources when operating
  • AlphaGo
    • Deep learning + Monte Carlo tree search + uses a human hand to place stones on the board and stop the timer
    • Deep learning is like the human mind doing long short-term memory + image recognition (which is quite important in Go)
    • Monte Carlo tree search is like the human mind doing some calculation after thinking of a good spot
    • It requires more resources during training (for the deep learning part), however the resources needed when playing against a human are significantly lower.

New Milestone For AI

AlphaGo's win is a new milestone for AI, and it will also bring new changes to our world.

  • Businesses will accelerate the adoption of AI applications and robotics.
  • Amazed by AlphaGo, people will more easily accept that current AI is starting to outperform them in many areas.

I can see the second machine age coming, and it will come a lot sooner than we think - I would guess within the next 5 years.

Read More

Fixing Local Jekyll after Upgrade to 3.0

jekyll-logo

Github Deprecating Redcarpet and Pygments

You can see that in my repo commit:

And to have the same Jekyll version as GitHub Pages, my local install has been updated to Jekyll 3.0 with gem update jekyll too.

Upgrading to Jekyll 3.0 breaks my local build

Image of broken local Jekyll:

image breaked jekyll

All the Fixes I Tried That Failed

  • conda install Pygments
  • gem uninstall jekyll; gem install jekyll

Image of Fixed Local Jekyll

image fixed jekyll

Solutions

After all the above failed, I thought it might be because the markdown engine had changed, so I went to read the kramdown documentation.

Then I found out:

  • ``` is not supported as a code block marker in kramdown
  • ~~~ is the supported code block marker
Read More

Sharing - Meatier - a meteor alternative

Meatier

Meatier

Meteor is awesome! But after 3 years, it’s starting to show its age. This project is designed to showcase the exact same functionality as Meteor, but without the monolithic structure. It trades a little simplicity for a lot of flexibility.

Some of the Meatier Author's Thoughts

  • Built on Node 0.10, and that ain’t changing anytime soon
  • Build system doesn’t allow for code splitting (the opposite, in fact)
  • Global scope (namespacing doesn’t count)
  • Goes Oprah-Christmas-special with websockets (not every person/page needs one)
  • Can’t handle css-modules (CSS is all handled behind the scenes)
  • Tied to MongoDB for official support

My Opinion

I have not tested the code yet, however I somewhat agree with what the author thinks.

I do hope this project succeeds; it would be good news for the nodejs community.

Project Repo

The project repo is at https://github.com/mattkrick/meatier.

Read More

Golang Rest API

Gin

Installation on Golang

We will use gvm to install Go for the current user. Note that go1.4 is needed to compile go1.5 and later, because from go1.5 onwards the compiler is written in Go itself, so an existing Go installation is required.

Note that gvm is mainly for development use; IMHO a tar install or docker is a better option for a production server.

bash < <(curl -s -S -L https://raw.githubusercontent.com/moovweb/gvm/master/binscripts/gvm-installer)
gvm install go1.4
gvm use go1.4
gvm install go1.6
gvm use go1.6 --default

Gin-Gonic/Gin

Gin is a web framework written in Go (Golang). It features a martini-like API with much better performance, up to 40 times faster thanks to httprouter. If you need performance and good productivity, you will love Gin.

go get github.com/gin-gonic/gin
#~/golang/server.go
package main

import "github.com/gin-gonic/gin"

func main() {
    r := gin.Default()
    r.GET("/ping", func(c *gin.Context) {
        c.JSON(200, gin.H{
            "message": "pong",
        })
    })
    r.Run() // listen and serve on 0.0.0.0:8080
}

Execute it with go run server.go, access localhost:8080/ping and you should see {"message":"pong"}

Live Reloading with codegangsta/gin

For development, however, it is important to have a recompile-on-file-change workflow. There is a Go package for that. Note: do not start any other app on the proxy port used by gin (default 3000) or on the application port.

Running the commands below will autostart the Go app.

go get github.com/codegangsta/gin
cd ~/golang/
gin -i -a 8080

Now change the code without terminating gin:

#~/golang/server.go
package main

import "github.com/gin-gonic/gin"

func main() {
    r := gin.Default()
    r.GET("/ping", func(c *gin.Context) {
        c.JSON(200, gin.H{
            "message": "pong pong",
        })
    })
    r.Run() // listen and serve on 0.0.0.0:8080
}

Access localhost:8080/ping and you should see {"message":"pong pong"}

This is It

With this, we have successfully set up a simple Golang REST API server. Golang is a promising solution, giving near-C performance + dynamic-language development speed + goroutines.

Read More

Udacity Deep Learning Course By Google

deep-learning-google

Deep Learning By Google @ udacity

FYI, the course link is https://www.udacity.com/course/deep-learning--ud730. This course takes approximately 3 months, assuming 6 hrs/wk (work at your own pace).

Knowledge Needed

  • Python (using jupyter notebook)

Assumption On Achievement

  • Getting good at using TensorFlow: this should be the main reason Google built this course - to promote their open source machine learning framework. IMHO, TensorFlow really lacks documentation/examples due to its young age.
  • More deep learning knowledge: Andrew Ng's Machine Learning course only briefly covered neural networks; I hope to learn something new about deep learning here

Syllabus

This course consists of four lessons:

  • Lesson 1: From Machine Learning to Deep Learning
  • Lesson 2: Deep Neural Networks
  • Lesson 3: Convolutional Neural Networks
  • Lesson 4: Deep Models for Text and Sequences

So Far on Lesson 1

Nothing new if you have attended other machine learning courses. Basically:

  • Prepare your Docker environment, which contains the course materials with Jupyter Notebook for the future lessons
  • A walkthrough of basic machine learning concepts such as learning rate, overfitting, etc.

Comment

This is really a course that machine learning people shouldn't miss. Deep learning is the state-of-the-art machine learning technique and has been used to solve many real-world problems.

You probably know that AlphaGo - which is mainly built on deep learning - has beaten the three-time European Go champion Fan Hui 5-0.

Read More

SaltStack Vagrant Part 3

saltStack

Last week we talked about how to make use of pillar to create users on our master-less Salt server. Today will be about how to use salt formulas and grains. All SaltStack articles

What are Salt Formulas

Formulas are pre-written Salt States. All official Salt Formulas are found as separate Git repositories in the “saltstack-formulas” organization on GitHub: https://github.com/saltstack-formulas

Objective

This week's objective is to install NodeJS 5.4.0 from binary with a salt formula.

The standard way of installing a salt formula can be found at https://docs.saltstack.com/en/latest/topics/development/conventions/formulas.html. However, I will use my own way here.

Server Structure

# vagrant sync folder
config.vm.synced_folder "salt/root/", "/srv/salt/"
config.vm.synced_folder "salt/pillar/", "/srv/pillar/"
- /srv/
  - pillar/  # Unlike state tree, pillar data is only available for the targeted minion specified by the matcher type.
  - salt/  # All the configuration for the minion to run
    - node/ # node formula folder

SVN Export from GitHub

My favourite GitHub hack (replace tree/master with trunk):

cd salt/root
svn export https://github.com/saltstack-formulas/node-formula/trunk/node
#salt/root/top.sls
base:
  '*':
    - node
# pillar/node/init.sls
node:
  version: 5.4.0
  checksum: f037e2734f52b9de63e6d4a4e80756477b843e6f106e0be05591a16b71ec2bd0
  install_from_binary: True
# pillar/top.sls
base:
  '*':
    - node

If you try salt-call --local state.highstate you probably will get an error now. Why?

Grains

Salt comes with an interface to derive information about the underlying system. This is called the grains interface, because it presents salt with grains of information. Grains are collected for the operating system, domain name, IP address, kernel, OS type, memory, and many other system properties.

As of the time of writing, https://github.com/saltstack-formulas/node-formula/blob/master/node/map.jinja lines 19-28 are missing the CentOS config for grains.os (most SaltStack formulas ship with Ubuntu support), so add the CentOS config:

'CentOS': {
    'node_pkg': 'nodejs',
    'npm_pkg': 'nodejs' if pillar_get('node:install_from_ppa') else 'npm',
},

How do I see my os value in grains? Just run the command below and look for os; for me the value is CentOS.

sudo salt-call --local grains.items

Grains can be customized in the minion config file or at /etc/salt/grains; however, the default values are normally enough.

End Result

It should successfully install node version 5.4.0 and verify against checksum f037e2734f52b9de6.......

Now run salt-call --local state.highstate and run node -v to verify, it should output v5.4.0.

This is it

The basic concepts of using SaltStack with masterless Vagrant are covered in parts 1 to 3. Knowing the basics, it should be relatively easy to search for Salt-related articles and create a salt script that fits your needs.

Read More

SaltStack Vagrant Part 2

saltStack

Last week we talked about creating a simple httpd service when we spin up our vagrant VM. This week I will continue to talk more about Salt.

Inside The Vagrant Virtual Machine

Salt runs from the /srv/ folder in our VM, so our folder structure in the virtual machine will look like this:

- /srv/
  - pillar/  # Unlike state tree, pillar data is only available for the targeted minion specified by the matcher type.
  - salt/  # All the configuration for the minion to run
- /etc/salt/
    - minion  #minion configuration
    - master  #master configuration

Running in Master-less vs Master mode

  • Master-less mode
salt-call --local state.highstate
  • Master mode: the salt-master needs to accept the minion's salt-key first
salt '*' state.highstate

For Vagrant we run in master-less mode; the difference between master mode and master-less mode shouldn't be large.

Objective

To create user(s) based on pillar files, uninstall httpd and install nginx.

Coding

  • Vagrantfile: mount the desired folder to your VM
config.vm.synced_folder "salt/root/", "/srv/salt/"
config.vm.synced_folder "salt/pillar/", "/srv/pillar/"
  • /srv/pillar/users/init.sls: Sensitive user data
users:
    henglim.ng:
        uid: 1000
        fullname: Heng Lim Ng
        groups:
            - wheel
  • /srv/pillar/top.sls: Include any new pillar data here
base:
  '*':
    - users
  • /srv/salt/top.sls: the list of actions we want to perform in the 'base' env, according to our objective
base:
  '*':
    - httpd.absent
    - user
    - nginx
  • /srv/salt/user/init.sls: make use of our pillar users

{% for username, user in pillar.get('users', {}).items() %}
{{username}}:
  user.present:
    - fullname: {{user.fullname}}
    - shell: /bin/bash
    - home: /home/{{username}}
    - groups: {{user.groups}}
{% endfor %}

  • /srv/salt/httpd/absent.sls: make sure we uninstall the httpd that we installed previously
httpd:               # ID declaration
  pkg.removed: []

/var/www/html/index.html:
  file.absent: []
  • /srv/salt/nginx/init.sls: yum searches for nginx; if found, install nginx and enable its service.
nginx:               # ID declaration
  pkg.installed: []
  service.running:
    - enable: True

To Be Continued

In this post I mainly described how to make use of pillar to manage our VM users, and how to uninstall and install services using yum packages. Stay tuned for SaltStack Vagrant Part 3.

Read More

SaltStack Vagrant Part 1

saltStack

What is SaltStack

  • SaltStack software orchestrates the build and ongoing management of any modern infrastructure.
  • SaltStack is also the most scalable and flexible configuration management software for event-driven automation of CloudOps, ITOps and DevOps.
  • SaltStack is one of the top configuration management frameworks, alongside Chef, Puppet and Ansible.

Why SaltStack with Vagrant

  • Salt execution routines can be written as plain Python modules. (I am the Python guy)
  • A SaltStack script for Vagrant will be mostly the same for a cloud server
    • Achievement development server == production server Unlocked

Vagrant Folder Structure

- salt/
  - root/
    - top.sls
    - webserver.sls
  - minion
- Vagrantfile

Vagrantfile Setup

Vagrant.configure("2") do |config|

    # load up the box for CentOS 7
    # This will take a while to download the OS, hold on
    # Or go for a coffee
    config.vm.box = "CentOS_7_x64"
    ## For masterless, mount your salt file root
    config.vm.synced_folder "salt/root/", "/srv/salt/"
    config.vm.network "private_network", ip: "10.0.0.201"

    ## Use all the defaults:
    config.vm.provision :salt do |salt|
        salt.bootstrap_options = "-P -c /tmp" # to solve the minion_config-not-copied issue
        salt.masterless = true
        salt.minion_config = "salt/minion"
        salt.run_highstate = true

    end
end

top.sls

base:
  '*':
    - webserver

webserver.sls

httpd:               # ID declaration
  pkg:                # state declaration
    - installed       # function declaration
  service.running:
    - enable: True
    - require:
        - pkg: httpd

minion

file_client: local

Ending

Run vagrant up and you should get a CentOS_7_x64 box with the httpd server installed!

By the end of this article, you should know the basics of how Salt works.

Read More

Git Merging or Combining Multiple Commits

git

What's The Problem

  • many commits inside a feature branch
    • small enhancement or defect fix
    • not using git stash

The Solution: Git Rebase

Since my commit messages use issue numbers such as #143 fix abc, and # is git's default comment character, I set the comment character to ";" so the rebase does not treat those lines as comments.

git config core.commentchar ";"
git log --pretty=oneline
git rebase --interactive HEAD~2

in this example, this is what you might see

pick abc2345 testing
pick def1234 latest commit

edit it to

pick abc2345 testing
s def1234 latest commit

Save it and execute:

git log --pretty=oneline

and you will find that the latest 2 commits have been merged into one.

Conclusion

This is helpful if you create a lot of dummy commits due to minor bug fixes or not using git stash.

Read More

XMind Mind Mapping Software

XMind

What is XMind

XMind is one of the most popular mind mapping software.

Advantages of XMind

  • XMind is an open source project, which means it’s free to download and free to use forever.
  • Professional mind mapping, good for showing in presentations.
  • XMind Cloud (coming early 2016)
    • It automatically synchronizes your XMind files across multiple Mac/PCs. Fast, secure, and free.
    • Very useful for a small team.

Trying XMind

I have found that XMind is quite popular among Chinese software companies, so I plan to give it a try.

  • Comes with easy-to-use templates

Templating

  • Easy file dragging: as a link or a hard copy

File_copy

  • Nice recent history UI

History_tab

  • Export as images

Export_images

  • Built-in markers

Markers

Conclusion

XMind is powerful; I should start using it in my daily work.

Read More

Seek Asia Hackathon 2015 website

seekasia-hackathon

Designing the website

It all began the day I took on the role of a SeekAsia hackathon committee member. We decided to create a single page application for our co-workers to view the latest information. After considering all the options, I decided to put the website at seekasia-hackathon-2015.github.io; once the review is done, we will move the files to the SeekAsia website.

Benefits of hosting in a GitHub repo

FYI, the repo is at github.

The good part of GitHub is that if you create a repo named {username}.github.io, the page is live at {username}.github.io within 5 minutes.

The structure of the single page application

As the initial committer, I will be in charge of structuring the single page application.

  • Built with Bootstrap
  • I prefer Bootswatch Paper, since its appearance follows material design
  • Using GitHub Flow - committing directly to the master branch, since it is a small application and the only contributors are me, Alfred, Aaron and Foo.

So Far so Good

The SeekAsia Hackathon 2015 ended successfully; it will remain a good memory for our committee for having held the hackathon successfully.

Read More

Laravel5 Productive Tips

Laravel

Laravel 5

Laravel currently is the most popular PHP framework.

With the news that PHP 7 was released on December 3rd, I expect a spike in Laravel usage. PHP 7 is definitely the best tool for developing web applications quickly.

Some Productive tips

  • Create migration scripts directly from the database. At the time of writing, the composer package defaults to supporting Laravel 4; with the settings below we can easily generate a migration script with php artisan migrate:generate flights
"require-dev": {
    "fzaninotto/faker": "~1.4",
    "mockery/mockery": "0.9.*",
    "phpunit/phpunit": "~4.0",
    "phpspec/phpspec": "~2.1",
    "xethron/migrations-generator": "dev-l5",
    "way/generators": "dev-feature/laravel-five-stable"
},
"repositories": [
    {
        "type": "git",
        "url": "https://github.com/jamisonvalenta/Laravel-4-Generators.git"
    }
],

# config/app.php
'Way\Generators\GeneratorsServiceProvider',
'Xethron\MigrationsGenerator\MigrationsGeneratorServiceProvider',
#composer.json
"require": {
    "laravelcollective/html": "5.1.*"
}

#config/app.php
'providers' => [
  // ...
  Collective\Html\HtmlServiceProvider::class,
  // ...
],

#config/app.php
'aliases' => [
  // ...
    'Form' => Collective\Html\FormFacade::class,
    'Html' => Collective\Html\HtmlFacade::class,
  // ...
],
  • Get a good open source IDE: Atom
    • no more Sublime purchase prompt when saving (ignore this if you have never used Sublime Text)
Read More

Laravel Git CentOS 7 Setup

Laravel

Laravel

Laravel is currently the most popular PHP framework. With the news that PHP 7 is going to be released on December 3rd, I expect a spike in Laravel usage. PHP 7 is definitely the best tool for developing web applications quickly.

Setup

The setup below uses the Apache + Laravel approach, as nginx + PHP-FPM seems to have a potential memory leak problem.

at your remote server

rpm -Uvh https://mirror.webtatic.com/yum/el7/webtatic-release.rpm
yum install httpd php56w php56w-mysqlnd mariadb-server php56w-mcrypt php56w-dom php56w-mbstring
curl  -k -sS https://getcomposer.org/installer | php

mv composer.phar /usr/local/bin/composer

composer create-project laravel/laravel=5.1 /var/www/laravel --prefer-dist

sudo vim /vagrant/provisioner_utils/laravel.conf
  NameVirtualHost *:80
  <VirtualHost *:80>
    ServerAdmin webmaster@example.com
    ServerName your.localhost.com
    ServerAlias your.localhost.com
    DocumentRoot /var/www/laravel/public/
    <Directory /var/www/laravel>
    	AllowOverride All
    </Directory>
  </VirtualHost>

cd /var
mkdir repo && cd repo
mkdir site.git && cd site.git
git init --bare

cat > hooks/post-receive
#!/bin/sh
git --work-tree=/var/www/laravel --git-dir=/var/repo/site.git checkout -f
chmod +x hooks/post-receive

at your local desktop

git init
git remote add live ssh://username@your.localhost.com/var/repo/site.git
git add .
git commit -m "Initial commit"
git push live master

Conclusion

With this setup, we can easily test our code inside our VM.

Read More

More About My Technology Stack

computer-guy

PHP MVC Programmer

My main job at JobStreet is working as a programmer using a PHP MVC framework. In more detail, it is development on the Linux, Apache, MySQL stack with a PHP MVC (model view controller) framework.

What To expect From A LAMP Stack Programmer Using MVC Framework

To my surprise, some recruiters still ask questions like "do you know jQuery/CSS/HTML". Nowadays, when a programmer's daily job is using a PHP MVC framework, the technology stack normally includes jQuery, CSS, HTML, MySQL, Javascript and PHP.

More Than Just PHP

PHP has already become as simple as English to me. As a person who likes challenges, I am always thrilled to solve bigger problems, which sometimes go beyond what PHP can do.

For example, I have written many Python scripts to automate daily work. I have also worked with Solr and Elasticsearch in my daily job. There are a lot of things that I like to do and I cannot be limited to a specific scope - I am a full stack software engineer.

Skill Set In Details As Of 2015-11-21

  • PHP MVC framework ( with proven working experience )
    • View = Jquery, CSS, HTML, Javascript, CSS3, HTML5
    • Model = Mysql (Database indexes, writing queries, etc)
    • Controller = PHP
  • Full Stack Web Developer ( with proven working experience )
    • Twitter Bootstrap, various javascript plugin such as datatable
    • UX + UI (without the Photoshop part; that's what a graphic designer does)
  • Linux ( with proven working experience )
    • Setting up vagrant, jenkins, LAMP, ELK and others
  • Data Science (hobby)
    • Python (data collecting + machine learning)
    • Statistics (such as overfitting, anomalies)
    • Deep Learning (Tensorflow)
  • Mobile Development (hobby)
    • iOS (reactJS)
    • Android (Java)
  • Miscellaneous Languages (hobby)
    • NodeJS, GoLang: for web applications, IMO better than PHP
    • C++, C, Scala + Rust: learning for learning's sake
Read More

Tensorflow (Machine learning toolset Open Source by Google)

tensors-flowing

Tensorflow

On 10th November, I saw the news that Google open-sourced TensorFlow. As a programmer who is passionate about AI, this is something I must try out.

Setup

FYI, the setup instructions can be found here.

Tensorflow with mnist

Put the file at /home/vagrant/notebook/

  • Download fully_connected_feed.py
    • replace from tensorflow.g3doc.tutorials.mnist import input_data with import input_data
    • replace from tensorflow.g3doc.tutorials.mnist import mnist with import mnist
  • Download input_data.py
  • Download mnist.py

Execute python fully_connected_feed.py; it should run and give you results like this:

Step 1000: loss = 0.40 (0.007 sec)
Step 1100: loss = 0.52 (0.087 sec)
Step 1200: loss = 0.46 (0.005 sec)
Step 1300: loss = 0.49 (0.005 sec)
Step 1400: loss = 0.48 (0.006 sec)
Step 1500: loss = 0.37 (0.029 sec)
Step 1600: loss = 0.45 (0.005 sec)
Step 1700: loss = 0.40 (0.005 sec)
Step 1800: loss = 0.39 (0.005 sec)
Step 1900: loss = 0.44 (0.006 sec)
Training Data Eval:
  Num examples: 55000  Num correct: 49219  Precision @ 1: 0.8949
Validation Data Eval:
  Num examples: 5000  Num correct: 4508  Precision @ 1: 0.9016
Test Data Eval:
  Num examples: 10000  Num correct: 8978  Precision @ 1: 0.8978

Graph visualization

Now run tensorboard --logdir=/home/vagrant/notebook/data and open a browser at localhost:6006 to view the graph. You should be able to see something like this:

tensorboard

My Review

It does not really stand out from torch/caffe/etc. I thought it would have the ability to modify code via drag and drop like Pentaho; however, it only has the ability to view the summary graph.

  • Overall - good to use.
  • Did it surprise me? Not really.
Read More

Markdown Enhancer JS

mardown-enhancer-JS

Markdown Enhancer JS

Markdown Enhancer JS is a plugin to customize your markdown, especially for static site generators such as Jekyll or Hugo. Its main purpose is to let the markdown support GitHub checkboxes - [ ] or - [x], which are not officially supported.

Repo

The repo is at nghenglim/markdown-enhancer. This is my first open source JavaScript plugin; I hope it solves people's problems well.

Read More

Kaggle titanic challenge with Julia commentary

Kaggle Titanic Challenge

The Kaggle Titanic challenge is a famous knowledge competition in which many new Kagglers try their first Kaggle competition. The commentary below is based on the nbviewer.

FYI

There are also Jupyter docker images out there, which are suitable if there is no GPU involved in your machine learning application.

Recently Julia has been trending, due to its goal of being an easy-to-use scripting language while giving near-C performance. I see it as a combination of Python + R + C, while some might see it as Python + Matlab + C.

Commentary

using Gadfly
using DataFrames
df=readtable("train.csv")
describe(df)
  • Gadfly is a popular Julia package for creating graphs, equivalent to Python's matplotlib
  • DataFrames is a useful package to read and store tabular data, equivalent to Python's pandas
typeof(df)
df[1,:]
df[:Name]
  • I will use dump(df) though :)
pool!(df,[:Sex])
pool!(df,[:Survived])
pool!(df,[:Pclass])
  • Using pool! makes df[:Sex], df[:Survived], df[:Pclass] become factors, a bit similar to a dictionary.
  • By doing this, df[:Sex] becomes DataArrays.PooledDataArray{UTF8String,UInt8,1} instead of DataArrays.DataArray{UTF8String,1}
plot(df,x="Sex",color="Survived",Geom.histogram)
  • Generates the graph; however, it is not working on my local setup - it seems something is broken in Gadfly
df[!isna(df[:Age]),:]
averageAge=mean(df[!isna(df[:Age]),:Age])
df[:Age]=array(df[:Age],averageAge)
  • From describe(df) we can see that there are 177 NAs, so it is important to replace the NA values with the average age
  • array(da::DataArray{T}, replacement::Any) is deprecated (the author ran this a long time ago)
typeof(df[:Sex])
plot(x=df[!isna(df[:Embarked]),:Embarked],Geom.histogram)
df[:Embarked]=array(df[:Embarked],utf8("S"))
pool!(df,[:Embarked])
typeof(df[:Embarked])
  • Due to NAs in Embarked, one option is to replace the NAs with the most frequent Embarked value, based on the plot above
newdata=df[:,[:Pclass,:Age,:Sex,:SibSp,:Parch,:Fare,:Embarked]]
describe(newdata)
  • The author decided to make a prediction based on the columns above: Pclass, Age, Sex, SibSp, Parch, Fare, Embarked.
using DecisionTree
xTrain=newdata
yTrain=df[:Survived]
yTrain=array(yTrain)
accuracy = nfoldCV_forest(yTrain, xTrain, 5, 20, 4, 0.7)
  • The DecisionTree package is similar to Python's sklearn.ensemble.RandomForestClassifier; a rough scikit-learn equivalent is sketched below
  • Testing on my local setup, nfoldCV_forest fails with "no method matching nfoldCV_forest", probably due to an upgraded version of DataFrames
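For comparison, here is a rough Python sketch of a similar pipeline with scikit-learn (assuming the same train.csv from Kaggle; the column handling is simplified):

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

df = pd.read_csv("train.csv")
df["Age"] = df["Age"].fillna(df["Age"].mean())       # replace NA ages with the average age
df["Embarked"] = df["Embarked"].fillna("S")          # replace NA embarkation with the most frequent value
features = df[["Pclass", "Age", "Sex", "SibSp", "Parch", "Fare", "Embarked"]]
X = pd.get_dummies(features)                         # one-hot encode Sex and Embarked
y = df["Survived"]
scores = cross_val_score(RandomForestClassifier(n_estimators=20), X, y, cv=5)
print(scores.mean())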
Read More

PHPStorm Xdebug Vagrant

PHPStorm

Xdebug

Xdebug is a PHP extension which provides debugging and profiling capabilities. Most PHP IDEs have built-in integration with Xdebug; it works as below:

  • set breakpoints in the IDE
  • click debug
  • at a breakpoint, variable details are shown and the web application is paused Xdebug-Breakpoint
  • resume and continue to the next breakpoint

At vagrant

yum install php-devel
yum install php-pear
yum install gcc gcc-c++ autoconf automake
pecl install Xdebug
#/etc/php.d/xdebug.ini
[xhprof]
zend_extension="/usr/lib64/php/modules/xdebug.so"
xdebug.remote_enable = 1
xdebug.remote_connect_back = on
xdebug.idekey = "PHPSTORM"
xdebug.remote_handler=dbgp
xdebug.remote_host=10.0.2.2
xdebug.remote_port=9001
service httpd restart

At PHPStorm

  • File > Settings > Language & Frameworks > PHP > Servers: setup PHPStorm-servers
  • File > Settings > Language & Frameworks > PHP > Debug
    • debug port: 9001
    • check “Can accept external connections” PHPStorm-debug
  • Run > Start Listening to PHP Debug Connections

At Chrome (or other browser)

  • Install Xdebug Helper
  • Right click the new red bug in the browser > options: change IDE key to PhpStorm xdebug-helper
  • Go to your development website
  • Left click the red bug > Debug
  • Refresh the website and you shall see the web application paused; switch to the IDE and you shall see the debug details there.
Read More

Atom Tips Extension Base Setting

Atom

Previously when coding with Atom, one thing was really irritating - when I code PHP I have to set the tab length to 4, but for Python or Jade I need a tab length of 2.

Normally we set the tab length at File > Settings.

Problem Solving

So how do we solve it? Just go to "File > Open Your Config" to open config.cson, then paste the config below.

"*":
  "exception-reporting":
    userId: ":)"
  welcome:
    showOnStartup: false
  core: {}
  editor:
    invisibles: {}
    showInvisibles: true
    fontSize: 13
    zoomFontWhenCtrlScrolling: false
    showIndentGuide: true
    softWrap: true
  "tabs-to-spaces": {}
  whitespace: {}
".php":
  editor:
    tabLength: 4

Explanation

  • “*” is a wildcard matching all file types; this section comes by default after Atom is installed
  • “.php” applies to every file with the php extension, and there I set tabLength to 4 (the default is 2)
Read More

Unconditional Basic Income

UBI

A UBI - unconditional basic income (also called basic income, basic income guarantee, universal basic income, universal demogrant, or citizen's income) - is a form of social security system in which all citizens or residents of a country regularly receive an unconditional sum of money, either from a government or some other public institution.

Giving Money Away? Communist?

It is not communist; in fact, it leans more toward capitalism. The core of basic income is that it gives only a livable amount of money - you cannot afford luxury with it. If you want a better lifestyle, you still have to go to work.

The pro side is that you can now follow your passion without worrying about food or shelter. More startups can be expected, and people can chase their dreams without worrying about their parents or children.

For those in poorly paid jobs such as garbage collection, workers can now say no to their boss when they are paid pennies compared to their poor working conditions. Currently they are unable to bargain because they are threatened by hunger. With the implementation of UBI, the economics will finally correct itself - employers will have to raise salaries to keep people working in such poor conditions, or make the working environment better.

My Opinion

I'm a supporter of UBI; most of the reasons I support UBI can be found at reddit/basicincome.

The evolution of deep learning methods has caused a spike in automation performance - such as self-driving cars, robot-written news, etc. One famous video describing this technological evolution is Humans Need Not Apply.

The solution to this problem is to introduce UBI, so that most people still have an income, while keeping the motivation to work in order to have more.

Read More

Autoclicker with python

Python

Why Create This Autoclicker

This autoclicker is mainly for the game Realm Grinder; however, it should be easily applicable to other clicking games as well.

Feature

  • Protects your mouse
  • Saves your time from repetitive clicking
  • Learn some Python code

Code

I have uploaded the code to the python autoclicker github repo.

As I am running on Windows 8, there are some Python modules that need to be installed through Windows binaries - for example pywin32.

import win32api
import win32con #for the VK keycodes
import time
import msvcrt as m
import signal
import sys

def mouseClick(timer):
    if not check_off_pos():
        print("Click!")
        x,y = win32api.GetCursorPos()
        win32api.SetCursorPos((x, y))
        win32api.mouse_event(win32con.MOUSEEVENTF_LEFTDOWN,x,y,0,0)
        time.sleep(timer)
        win32api.mouse_event(win32con.MOUSEEVENTF_LEFTUP,x,y,0,0)
        time.sleep(timer)
        global count
        count = count + 1
        if count >= 3 / (timer * 2):
            cast_spell(timer)
            count = 0

def cast_spell(timer):
    print("Cast Spell!")
    global spell_x
    global spell_y
    global tx
    global ty
    x = spell_x
    y = spell_y
    win32api.SetCursorPos((x, y))
    win32api.mouse_event(win32con.MOUSEEVENTF_LEFTDOWN,x,y,0,0)
    time.sleep(timer)
    win32api.mouse_event(win32con.MOUSEEVENTF_LEFTUP,x,y,0,0)
    time.sleep(timer)
    win32api.SetCursorPos((tx, ty))
    time.sleep(timer)


def getPos():
    x,y = win32api.GetCursorPos()
    return x, y

def wait():
    m.getch()

def signal_handler(signal, frame):
    print('You pressed Ctrl+C!')
    sys.exit(0)

def check_off_pos():
    global tx
    global ty
    a, b = getPos()
    if abs(a - tx) > 100 or abs(b - ty) > 100:
        return 1
    return 0

input("Press Enter to capture of chest...")
tx, ty = getPos()
input("Press Enter to capture of spell...")
spell_x, spell_y = getPos()
count = 0
options = []

signal.signal(signal.SIGINT, signal_handler)
print("Press Ctrl+C")
sleep = 0
while True:
    mouseClick(0.03)
    a, b = getPos()
    if check_off_pos():
        print('sleeping')
        time.sleep(3)
        sleep = sleep + 1
        if sleep == 5:
            input("Press Enter to restart...")
    else:
        sleep = 0
Read More

PHPStorm 9.0 Review

PHPStorm

Installing PHPStorm (License from company)

The installation of PHPStorm is very easy, provided we have the license.

  1. First, create a new account at JetBrains with the email the company has assigned the license to.
  2. Download PHPStorm from the download page
  3. Run PHPStorm; when prompted for a username and password, enter your username/email and password

Some Other Function

  • Help » Productivity Guide: gives new PHPStorm users quick knowledge of useful shortcuts
  • Import Code Style: File » Import Settings » choose your settings.jar
    • This imports the code style shared across the company; minor customization can be done afterwards to suit our needs
  • Wrap and unwrap code with a control statement:
    • ctrl + alt + T: wrap the selected code with a chosen control statement (e.g. if/else)
    • ctrl + shift + del: remove the selected control statement
  • Auto indent lines:
    • ctrl + alt + l: good for HTML files mixed with PHP code.
  • Some shortcuts are good: the full default keymap is at Help » Default Keymap Preferences

Function Looking Forward

There are some functions I hoped PHPStorm would have, based on my experience with Eclipse, Atom and Sublime Text.

  • .gitignored file/folder coloring: yes, it has it
  • Code auto navigation: same as Eclipse, just ctrl + click for code navigation
  • Show UNIX/DOS format of file: ???
  • Git integration: VCS > Git
  • Show hidden whitespace (after converting tabs to whitespace): File » Settings » Editor » General » Appearance » tick Show Whitespaces
  • Search next while keeping the previous selection: Alt + J
  • Nice line comment (ctrl + /)
    • not as good as Atom: the line comment starts from the line start, whereas Atom starts from the indentation, which is nicer

First Review Conclusion

At first glance, I still like the Atom UI more. However, PHPStorm has some nicer features such as code navigation, a better debugging tool and fewer crashes (though PHPStorm uses more memory).

Anyway, for developing PHP code I still have to go with PHPStorm since the license has been bought; let's see whether I change my mind after using it a little longer.

Read More

Fix vagrant connect to wrong box - Windows

Vagrant

Scenario

There is a strange behavior when using Vagrant for my work: when I start Vagrant before starting VirtualBox, it does not find the correct path to the Vagrant box and thus creates a new VM.

The reason is that my default drive is the C drive, and when I join the company network it mounts a Z drive to my laptop, so my home drive becomes the Z drive. When VirtualBox is already open, Vagrant finds my VirtualBox VM on the C drive; without opening it first, Vagrant assumes the VM lives on the Z drive and is thus unable to find it.

So now Vagrant points to the wrong VM; how can I get back the previous Vagrant box? Rerunning the Vagrant setup would take about 30 minutes, and there is database data inside my VM. Therefore, the solution is to point Vagrant back to the correct machine.

Scripts

cd "C:\Program Files\Oracle\VirtualBox"
VBoxManage.exe list vms
"vagrant_box" {d055f9f7-6b67-4080-a4b0-2b4f149cac4d}

touch vagrant/tools/.vagrant/machines/default/virtualbox/id

Paste the above id d055f9f7-6b67-4080-a4b0-2b4f149cac4d into the id file just created, then run vagrant up, and this should solve the issue.

This refers to this GitHub issue

Read More

Coursera course review - Algorithms

Coursera

Algorithms: Design and Analysis, Part 1 by Tim Roughgarden @ STANFORD UNIVERSITY.

Course Syllabus

  • Week 1: Introduction. Asymptotic analysis including big-oh notation. Divide-and-conquer algorithms for sorting, counting inversions, matrix multiplication, and closest pair.

  • Week 2: Running time analysis of divide-and-conquer algorithms. The master method. Introduction to randomized algorithms, with a probability review. QuickSort.

  • Week 3: More on randomized algorithms and probability. Computing the median in linear time. A randomized algorithm for the minimum graph cut problem.

  • Week 4: Graph primitives. Depth- and breadth-first search. Connected components in undirected graphs. Topological sort in directed acyclic graphs. Strongly connected components in directed graphs.

  • Week 5: Dijkstra’s shortest-path algorithm. Introduction to data structures. Heaps and applications.

  • Week 6: Further data structures. Hash tables and applications. Balanced binary search trees

Comments On The Course

I quite like this course, as I did not take a standard algorithms course at my university. In this course, I've learned how the algorithms work and the theory behind them.

The course coding assignments are not bad. I've been using Python to solve the 6 weeks of coding assignments.

About the lectures: they are not easy to understand without some programming background. Even while listening to the course, I still had to check Wikipedia or Stack Overflow to learn more about the lecture material.

Today is week 6 of this course, so the statement of accomplishment should be coming out in the next few weeks. This is something I also like about Stanford University: usually their courses provide a statement of accomplishment at the end of the course.

Overall, I’ll give a 4 out of 5 star rating to this course.

Read More

Atom Review After 2 Weeks

Atom

You can look back at my previous related article, the Atom first use review.

Configuration needed to make Atom feel better

  • Settings > show invisible (checked) : make whitespace show as ·
  • Plugins > tabs-to-spaces : convert leading tabs to spaces or spaces to tabs
  • Plugins > line-ending-converter : convert line endings (EOL) to Windows/Unix/Old Mac, and show the current line ending in the status bar

Useful hotkey

The full hotkey list is at File > Settings > Keybindings; below are some nice features that other IDEs sometimes lack but are useful

  • ctrl + d : find and replace (while remain selection of previous found word)
  • ctrl + / : toggle comment according to file extension, IMO the commenting style is way better than eclipse
  • shift + tab : remove 1 tab in selected lines
  • ctrl + shift + \ : reveal active file in tree view
  • ctrl k + ctrl u : text upper case

Useful shortcut

  • right click + filename on pane > Reveal in Tree View: show the file location in tree view
  • right click + file in tree view > duplicate : duplicate the file and change to desired path
  • right click + md file > markdown preview: preview markdown in atom
Read More

84 days transformation

84days Transformation

The Idea

After working in KL for 1.5 years, my weight went up to 72kg and I had a big belly. Therefore I decided to cut down my weight and get a six pack.

Inspiration

Inspired by the results of Kris Gethin's 84-day transformation, I will also try to achieve huge progress in 84 days.

The result

After 84 days, my weight has gone down from 72kg to 62.3kg. Also, the six pack is visible (the lower 2 abs are not so obvious though).

What I have learnt

Discipline, discipline and discipline. I had to control my calories strictly and attend the gym regularly. Even when I did not feel like going to the gym, I would do chin ups or leg raises at home. As long as we keep making small achievements every single day, we can have a big achievement one day.

Program

  • Meals:
    • Main meal is rice, chicken breast, vegetable especially broccoli
    • Sunday can have a bit more calories such as 1/5 watermelon, 0.5kg papaya etc.
    • Search “calories BMI” in google to know how much calories to take
  • Exercise:
    • Focus on lower abs: chin up bar leg raises, side leg raises, roll wheel
    • Chin up, bicycle machine, row machine
Read More

Atom First Use Review

Atom

Installing Atom

The installer can be found on GitHub. For me, I just had to download the exe installer and execute it on Windows.

The UI

The default UI is quite similar to Sublime Text. Those who come from Sublime Text will be pretty happy with it.

Initialization time

The startup time is slower than Notepad++ and Sublime Text; however, it is faster than Eclipse.

Default Package

It comes with lots of nice packages by default, for example PHP support, git source control, etc. It also comes with spell checking and word autocomplete for writing normal blog posts. It has increased my productivity when working on small projects or blog posts.

Packages it currently lacks

For coding PHP applications in my daily professional job, it lacks code navigation functionality. When working on a PHP application managed by a big team, the code base is really large, and code navigation greatly increases productivity. Eclipse does well here: just Ctrl + click the class and it normally navigates to the correct file.

Overall feeling

For me, it is worth installing Atom. It makes me feel more pro to write my blog posts in Atom than in Notepad++.

Read More

Trying to use Torch7 CUDA

NVIDIA CUDA

What is CUDA

CUDA® is a parallel computing platform and programming model invented by NVIDIA. It enables dramatic increases in computing performance by harnessing the power of the graphics processing unit.

Installing CUDA with torch7

how to install

lspci | grep 'VGA\|NVIDIA'
uname -m && cat /etc/*release
wget http://developer.download.nvidia.com/compute/cuda/repos/rhel6/x86_64/cuda-repo-rhel6-7.0-28.x86_64.rpm
sudo rpm --install cuda-repo-rhel6-7.0-28.x86_64.rpm
sudo yum clean expire-cache
sudo yum install cuda

luarocks install cutorch

Oops, something goes wrong

I forgot to check the web for whether my AMD graphics card can run CUDA.

Lesson learned: do more research on the graphics card before buying a laptop.

It seems the best alternative I have is to use OpenCL with cltorch. However, cltorch is still under development, which means I can't utilise my GPU power to run machine learning jobs with torch7. This was really unexpected for me.

Read More

Coursera course review - Machine Learning by Andrew Ng

Machine Learning

Machine Learning by Andrew Ng @ STANFORD UNIVERSITY.

Andrew Ng is Associate Professor of Computer Science at Stanford; Chief Scientist of Baidu; and Chairman and Co-founder of Coursera. His machine learning course is the MOOC that had led to the founding of Coursera!

Course Summary

This course provides a broad introduction to machine learning, datamining, and statistical pattern recognition. Topics include:

  • Supervised learning (parametric/non-parametric algorithms, support vector machines, kernels, neural networks).
  • Unsupervised learning (clustering, dimensionality reduction, recommender systems, deep learning).
  • Best practices in machine learning (bias/variance theory; innovation process in machine learning and AI).

The course will also draw from numerous case studies and applications, so that you’ll also learn how to apply learning algorithms to building smart robots (perception, control), text understanding (web search, anti-spam), computer vision, medical informatics, audio, database mining, and other areas.

Comments On The Course

Stanford University has done a good job managing its courses on Coursera; all the courses are of good quality. This machine learning course was my first course on Coursera, and also the first course that gave me basic knowledge about machine learning.

In this course, Andrew Ng gave us a lot of programming assignments to solve in Octave, which included coding support vector machines, clustering and neural networks.

I wish Andrew Ng had given more lectures on machine learning. His lectures are clear and meaningful; definitely a 5-star course.

Read More

Real time chat with Mongodb Express AngularJS Node.js

MEAN STACK

What I am going to write about is a real-time chat proof of concept which I built in 2014. It is now on GitHub. A demo can be found at http://nodejslim.herokuapp.com/

MEAN Stack

MEAN stack is a software bundle consisting of MongoDB, Express, AngularJS and Node.js.

What this repo does

  • It uses Node.js, a web server runtime with an event-driven, non-blocking I/O model that makes it lightweight and efficient.
  • To make use of this non-blocking model, we use an asynchronous, non-blocking NoSQL database, MongoDB.
    • FYI, MySQL and PostgreSQL still had no good asynchronous support at that point in time.
    • As a real-time chat system it has to store a lot of data; a relational database would be too hard to scale and to partition easily.
  • To set up this server quickly while maintaining a good architecture, we use Express as a web framework for Node.js.
  • For the frontend, we use AngularJS as the JavaScript MVC framework. Everything is served RESTfully from Node.js to AngularJS.

Some comments

  • Node.js uses ECMAScript 5 (the current JavaScript syntax), so the code is not so clean. As ECMAScript 6 becomes mainstream, I believe it will be easier to write code for Node.js.
  • Make sure everything is asynchronous, and use promises when needed.

Read More

Kaggle titanic challenge with torch7

Kaggle Titanic Challenge

The Kaggle Titanic challenge is a famous knowledge competition which many new Kagglers try as their first Kaggle competition. Since there is currently no tutorial on solving this challenge with an artificial neural network, I decided to use torch7 to compete in this competition. FYI, click here to get the data.

Why Torch7

Deep learning is the state-of-the-art machine learning approach for learning from images, video, sound and natural language. Torch7 is one of the well-known deep learning frameworks, and it is already used within Facebook, Google, Twitter, NYU, IDIAP, Purdue and several other companies and research labs.

Model

Learning Model

Validation

Validation reuses the training dataset, since the dataset would be too small if I separated it. As I am using tanh as the transfer function, any model output bigger than 0 is classified as survived. Using the model after training for 20000 epochs, the results are:

  • False Positive : 31
  • True Positive : 276
  • False Negative : 66
  • True Negative : 518

Submission Result

The score is the percentage of passengers we correctly predict. The submission result using the model at epoch 20000 is 0.74641. From the result, we know that our model is slightly overfitted, because the training set accuracy is 89.113% while the test set accuracy is 74.641%.
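
As a quick sanity check on the numbers above, here is a small Python snippet that recomputes the training set accuracy from the confusion counts listed earlier (the variable names are mine, not from the torch7 code):

# confusion counts from validating on the training set
tp, fp, fn, tn = 276, 31, 66, 518

total = tp + fp + fn + tn            # 891 passengers in the training set
train_accuracy = (tp + tn) / total   # correctly predicted / total
print(f"training accuracy: {train_accuracy:.3%}")  # -> 89.113%

submission_score = 0.74641           # Kaggle score = test set accuracy
print(f"train/test gap: {train_accuracy - submission_score:.3%}")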

Some Thought

I have shown that torch7 works for the Kaggle Titanic challenge too. With this method, I can easily get a score for how likely a passenger is to have survived given the input, and this probabilistic behavior is really important in real world use cases.

Some Pictures

Torch7

Read More

Paper reading - ADADELTA: AN ADAPTIVE LEARNING RATE METHOD

SGD vs ADADELTA vs ADAGRAD vs MOMENTUM

This paper was written by Matthew D. Zeiler while he was an intern at Google.

Introduction

The aim of many machine learning methods is to update a set of parameters $x$ in order to optimize an objective function $f(x)$. This often involves some iterative procedure which applies changes to the parameters, $\Delta{x}$, at each iteration of the algorithm. Denoting the parameters at the t-th iteration as $x_t$, this simple update rule is written out below, after the symbol definitions:

  • $g_t$ is the gradient of the parameters at the t-th iteration
  • $η$ is a learning rate which controls how large of a step to take in the direction of the negative gradient
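
Written out (the original post showed this as an image, so this is my reconstruction from the definitions above), the plain gradient descent update is:

$x_{t+1} = x_t + \Delta{x}_t, \quad \Delta{x}_t = -\eta \, g_t$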

Purpose

The idea presented in this paper was derived from ADAGRAD in order to improve upon the two main drawbacks of the method:

  1. the continual decay of learning rates throughout training
  2. the need for a manually selected global learning rate.

SGD vs ADAGRAD vs ADADELTA

  • SGD (the update rules for these methods are reproduced after this list):
    • where $\rho$ is a constant controlling the decay of the previous parameter updates
  • ADAGRAD:
  • ADADELTA:
    • where a constant $\epsilon$ is added to better condition the denominator
    • where $E[g^2]_t$ is the expected value of the squared gradient at time t
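
The formulas themselves appeared as images in the original post. Reconstructed from the paper (from memory, so check the paper for the exact forms), the per-iteration updates are:

  • SGD: $\Delta{x}_t = -\eta \, g_t$
  • Momentum: $\Delta{x}_t = \rho \, \Delta{x}_{t-1} - \eta \, g_t$
  • ADAGRAD: $\Delta{x}_t = -\frac{\eta}{\sqrt{\sum_{\tau=1}^{t} g_\tau^2}} \, g_t$
  • ADADELTA: $E[g^2]_t = \rho \, E[g^2]_{t-1} + (1-\rho) \, g_t^2$ and $\Delta{x}_t = -\frac{\sqrt{E[\Delta{x}^2]_{t-1} + \epsilon}}{\sqrt{E[g^2]_t + \epsilon}} \, g_t$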

Result

Compared with SGD, ADAGRAD and MOMENTUM, ADADELTA generally converges faster and reaches a lower error rate.

Personal Thought

I have tried ADADELTA and SGD. Although each epoch of ADADELTA takes longer to compute, we just have to use the default values $\rho = 0.95$ and $\epsilon = 10^{-6}$ and it will learn very well. With SGD we have to fine-tune the learning rate, and the error rate is often bigger than with ADADELTA.
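
To make the update concrete, here is a minimal NumPy sketch of a single ADADELTA step using those default values. This is my own illustration of the rule above, not code from the paper or from any particular library:

import numpy as np

def adadelta_step(x, grad, state, rho=0.95, eps=1e-6):
    # accumulate the decaying average of squared gradients, E[g^2]
    state["Eg2"] = rho * state["Eg2"] + (1 - rho) * grad ** 2
    # step size is the ratio of the two running RMS values
    dx = -np.sqrt(state["Edx2"] + eps) / np.sqrt(state["Eg2"] + eps) * grad
    # accumulate the decaying average of squared updates, E[dx^2]
    state["Edx2"] = rho * state["Edx2"] + (1 - rho) * dx ** 2
    return x + dx

# toy usage: minimise f(x) = x^2 starting from x = 3
x = np.array(3.0)
state = {"Eg2": np.zeros_like(x), "Edx2": np.zeros_like(x)}
for _ in range(2000):
    x = adadelta_step(x, 2 * x, state)  # gradient of x^2 is 2x
print(x)  # x should now be much closer to the minimum at 0 than the starting point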

Read More

Kaggle contest review - Bike Sharing Demand

Bike sharing demand

This Kaggle bike sharing demand challenge is to forecast the use of a city bikeshare system.

Summary

Bike sharing systems are a means of renting bicycles where the process of obtaining membership, rental, and bike return is automated via a network of kiosk locations throughout a city. Using these systems, people are able to rent a bike from one location and return it to a different place on an as-needed basis. Currently, there are over 500 bike-sharing programs around the world.

The data generated by these systems makes them attractive for researchers because the duration of travel, departure location, arrival location, and time elapsed is explicitly recorded. Bike sharing systems therefore function as a sensor network, which can be used for studying mobility in a city. In this competition, participants are asked to combine historical usage patterns with weather data in order to forecast bike rental demand in the Capital Bikeshare program in Washington, D.C.

Evaluation

Submissions are evaluated on the Root Mean Squared Logarithmic Error (RMSLE), which is calculated as shown below.

Bike sharing demand evaluation

Where:

  • $n$ is the number of hours in the test set
  • $p_i$ is your predicted count
  • $a_i$ is the actual count
  • $\log(x)$ is the natural logarithm
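
The formula itself appeared as an image in the original post; in terms of the symbols above, the standard Kaggle RMSLE is:

$\text{RMSLE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \log(p_i + 1) - \log(a_i + 1) \right)^2}$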

Method

Data preprocessing was done with Python, and the model is a random forest with the number of trees = 50. Result:

Bike sharing demand result
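
For illustration, a minimal version of that setup might look like the sketch below, using scikit-learn's random forest with 50 trees. The feature engineering here is my guess at a typical preprocessing step, not the exact code used for the submission:

import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# train.csv and test.csv come from the Kaggle bike sharing demand competition page
train = pd.read_csv("train.csv", parse_dates=["datetime"])
test = pd.read_csv("test.csv", parse_dates=["datetime"])

# simple preprocessing: pull hour, weekday and month out of the timestamp
for df in (train, test):
    df["hour"] = df["datetime"].dt.hour
    df["weekday"] = df["datetime"].dt.weekday
    df["month"] = df["datetime"].dt.month

features = ["season", "holiday", "workingday", "weather", "temp", "atemp",
            "humidity", "windspeed", "hour", "weekday", "month"]

# random forest with 50 trees, as in the post
model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(train[features], train["count"])

submission = pd.DataFrame({
    "datetime": test["datetime"],
    "count": model.predict(test[features]),
})
submission.to_csv("submission.csv", index=False)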

Thought

This is the first Kaggle challenge I participated in, because of the Coursera course Introduction to Data Science. During the process of competing, I have improved quite a lot. This is my first Kaggle competition, but it will not be the last: I am currently working on a challenge with a prize pool, and I hope to get a good result with a state-of-the-art machine learning technique, deep learning.

It is quite interesting working on Kaggle projects, as I am competing with data scientists from around the world. By the way, this is my Kaggle profile.

Read More

Coursera course review - From Nand to Tetris

Nand to Tetris

This course is From Nand to Tetris / Part I by Shimon Schocken and Noam Nisan.

Course Summary

Build a modern computer system, starting from first principles. The course consists of six weekly hands-on projects that take you from constructing elementary logic gates all the way to building a fully functioning general purpose computer. In the process, you will learn – in the most direct and intimate way – how computers work, and how they are designed.

Rating

I think this course is great for programmers with no electrical engineering background. After taking this course, when I code I can now picture the background work done by the laptop: logic gates, ALU, RAM, buses, machine language and assembly language.

Notes

  • I did not do the assignments for this course, as they require installing software that Chrome flags as harmful.
  • This course requires no hardware to start; it uses a software program to simulate and write logic gates. From the logic gates we build a 16-bit PC (see the small sketch after this list).
    • Nowadays, we are building a computer from a computer.
  • There is no Part 2 yet; the book “From Nand to Tetris” should cover the assembly-to-Tetris part.
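
The course builds everything in its own hardware description language inside the supplied simulator; as a rough Python illustration of the same idea (my own sketch, not course material), every basic gate can be derived from NAND alone:

def nand(a: int, b: int) -> int:
    # NAND is the only primitive; everything else below is built from it
    return 0 if (a and b) else 1

def not_(a: int) -> int:
    return nand(a, a)

def and_(a: int, b: int) -> int:
    return not_(nand(a, b))

def or_(a: int, b: int) -> int:
    return nand(not_(a), not_(b))

def xor_(a: int, b: int) -> int:
    return and_(or_(a, b), nand(a, b))

# truth table for XOR built purely out of NAND gates
for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_(a, b))
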
Read More

Coursera course review - Android

Android

This course's full name is Programming Mobile Applications for Android Handheld Systems, and it consists of Part 1 and Part 2. It is taught by Dr. Adam Porter from the University of Maryland.

Favourite Part

The peer assessment in this course is great! For the peer assessment, I had to build an Android app from scratch. The community is great; here is the peer assessment feedback for Android Part 2.

Android Part 2 peer assessment feedback

What I have learnt

In these 2 courses, I have learnt the basics of the Android handheld system. All the functionality is shown in the screencast below. The app shown in the screencast was built by me from scratch for the peer assessment.

Personal thought

Android programming is quite important in the future IoT world, as most IoT devices could be monitored through an Android app. A lot of valuable data will come from them.

With this course, I have learnt the basics of Android programming and gained quite some understanding of the Android architecture.

Read More

Paper reading - Weight Uncertainty in Neural Networks.

nn vs Bayes by Backprop

This paper is published by Google DeepMind.

Background

Backpropagation is a well known learning algorithm for neural networks. In the algorithm, the weights are updated based on the error computed from the network's output. To prevent overfitting, it is often combined with L1 or L2 regularization.

Weights with greater uncertainty introduce more variability into the decisions made by the network, leading naturally to exploration

Reading

This paper introduces a new regularization method called Bayes by Backprop.

Instead of giving each weight a single fixed value, they view the neural network as a probabilistic model in which every weight is represented by a probability distribution.

No-Drop vs DropOut vs DropConnect

In Dropout or DropConnect, randomly selected activations or weights are set to zero. In Bayes by Backprop, each weight is instead sampled from its learned probability distribution. When the dataset is big enough, it behaves similarly to the usual backpropagation algorithm, with more regularization.
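
As a rough sketch of the idea (my own illustration, not the paper's code): each weight has a learned mean and a learned standard deviation, and a concrete weight value is sampled on every forward pass:

import numpy as np

rng = np.random.default_rng(0)

# learned variational parameters for a single weight (the values here are made up)
mu = 0.3    # mean of the weight's distribution
rho = -3.0  # unconstrained parameter; softplus(rho) gives the standard deviation

sigma = np.log1p(np.exp(rho))  # softplus keeps the standard deviation positive

# sample a concrete weight for this forward pass (reparameterization trick)
eps = rng.standard_normal()
w = mu + sigma * eps
print(w)

Training then backpropagates through the mean and standard deviation parameters as usual, which is where the "Backprop" in the name comes from.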

Result

  1. When classifying MNIST digits, the performance of Bayes by Backprop (1.34% test error) is comparable to that of Dropout (about 1.3%), although each iteration of Bayes by Backprop is more expensive than Dropout (around two times slower).
  2. On MNIST digits, DropConnect (1.2% test error) performs better than Bayes by Backprop.

Personal Thought

The comparison in this paper based on MNIST test error alone is not accurate enough; we should compare the misclassified digits against human classification, since some of the MNIST labelling is arguable.

Bayes by Backprop might achieve higher performance in specific situations.

Read More