December 07, 2015
Learning from Imbalanced Data
I once gave a short talk at my lab's Machine Learning seminar on classification algorithms for imbalanced data. Technically speaking, any data set that exhibits an unequal distribution between its classes can be considered imbalanced. (He H,...
December 04, 2015
DREAM Big Data Challenge
Last summer, my colleagues at the Ontario Institute for Cancer Research and I teamed up as Chipmunks and participated in one of the DREAM big data challenges: the Acute Myeloid Leukemia Outcome Prediction Challenge (AML). Similar to Kaggle, we were asked...
December 01, 2015
Adaptive Thresholding
Motivated by applications in a wide range of fields, including signal processing, social science, finance, and genetics, statistical inference in high-dimensional data is a problem of great interest. The covariance matrix plays an important role in many fundam...
November 30, 2015
SAS tricks: Assign the Same Value within Group
We often use NODUPKEY or NODUP with a BY statement to filter out duplicates with respect to the variables specified in BY. However, the problem I dealt with today turned out to be a bit tricky, and I was surprised to discover how powerful the RETAIN statement is! Let ...
November 29, 2015
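I came across a few poems that suit my taste very well. Sojourn in the North (旅次朔方), by Liu Zao (Tang dynasty): I have lodged in Bingzhou for ten frosts already; day and night my homesick heart longs for Xianyang. For no reason I cross the Sanggan River once more, and looking back, it is Bingzhou that seems like home. A Visitor from Poland (波兰来客, excerpt), by Bei Dao: Back then we had dreams, about literature, about love, about journeys across the world. Now we drink late into the night, and when our glasses touch, all we hear is the sound of dreams breaking.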