Honglei Xie

Recent Posts

  • April 19, 2016

    Scraping IMDB top 250 movies in Python

    Web crawling is much easier than it sounds. I started using Python only about 3 weeks ago and now, with the help of a few modules, I’m able to scrape (static) IMDB pages. So … it’s not that hard. Why static pages? You will find it m...

  • April 10, 2016

    Python Simulation Practice -- The Monty Hall Problem

    Recently I’ve been following the Harvard CS109 online course, which is definitely an awesome one among the many data science MOOCs. I came across a very interesting statistics problem, the Monty Hall Problem, in hw0, where we were trying to solve the problem ...

  • January 11, 2016

    Computational Inference in Logistic Regression

    Logistic regression is one of the most commonly used techniques for analyzing binary data. The classical method for estimating the parameters is Newton-Raphson. Here I demonstrate an alternative, the Bayesian method (MCMC), and make a compar...

  • December 23, 2015

    Linsanity

    Last Saturday night I watched a documentary film called Linsanity, starring Jeremy Lin, the model Asian American NBA player. There are too many articles analyzing the Linsanity phenomenon, and I don’t think I’m the right person to talk much about so...

  • December 07, 2015

    Learning from Imbalanced Data

    I once gave a short talk at the lab’s Machine Learning seminar on classification algorithms for imbalanced data. Technically speaking, any data set that exhibits an unequal distribution between its classes can be considered imbalanced. (He H,...