Recent Posts
-
July 10, 2017
NLP: Get hands dirty with Word2Vec
Last time I wrote a post about the word co-occurrence matrix: why we need it, how to create it, and how to use SVD to reduce its dimensions. In this quick post, I will go directly to the implementations of Word2Vec in the ...
-
July 09, 2017
NLP: Words co-occurrences matrix
Every now and then I will add to a series of posts with very basic NLP demonstrations in IPython notebooks, just for my own purposes: keeping track of my learning progress and connecting the dots. This is my very first post about NLP where mos...
-
February 25, 2017
Linear regression with ARMA errors
Recently I have become more and more interested in time series prediction, which seems somewhat neglected by the machine learning community. Yet this topic deserves massive attention: who wouldn’t wish to know (even get a bit of...
-
April 19, 2016
Scraping IMDB top 250 movies in Python
Web crawling is much easier than it sounds. I started using Python only about 3 weeks ago, and now, with the help of a few modules, I’m able to scrape IMDB (static) pages. So … it’s not that hard. Why static pages? You will find it m...
-
April 10, 2016
Python Simulation Practice -- The Monty Hall Problem
Recently I have been following the Harvard CS109 online course, which is definitely an awesome one among the many data science MOOCs. I came across a very interesting statistics problem, the Monty Hall Problem, in hw0, where we were trying to solve the problem ...