Python Levenshtein distance – Choose Python package wisely

Brad and I were working on some text similarity computation. One of the most popular string distance functions is the Levenshtein distance, also called the edit distance. We use Python for its brevity and wide library support

More Guidelines Than Rules: CSRF Vulnerabilities from Noncompliant OAuth 2.0 Implementations

Our paper, as titled, has been accepted by DIMVA 2015 – Milano, Italy.

Python Internals – Integer object pool (PyIntObject)

Starting with this post, I will try to write a series of posts on Python internals, in which the Python object mechanism, Python bytecode (pyc), and the Python VM will be discussed. We will also talk about the limited resources online about

Python Hacking – urlopen timeout issue

Recently, playing with Python's urllib2 revealed an interesting fact: the timeout parameter of urlopen() sometimes does not work. This issue pushed me deep into the Python source code for debugging. The final debugging, unsurprisingly, shows

Book Recommendation – Building Probabilistic Graphical Models with Python

"Building Probabilistic Graphical Models with Python" is the third book I have reviewed from Packt Publishing. It was released this June. Compared to the classic PGM book, "PGM: Principles and Techniques" by Dr. Koller, this

Python hacking – make ElementTree support line number

An easy way to parse XML in Python is to use xml.etree.ElementTree, which parses the XML document or data into a tree structure in which each node is an Element object. Within only a few lines of code, one can extract all the XML

top or glances – trust on /proc/meminfo

You may notice that top and glances report different memory usage (if you run them at the same time). This post discloses some details about how glances and top collect memory information. And the

