## Data Processing  

1. pandas  
pandas is a package providing fast, flexible, and expressive data structures designed to make working with "relational" or "labeled" data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real-world data analysis in Python.  
Project Source: https://github.com/pydata/pandas  
Project Homepage: http://pandas.pydata.org/ 

1. Faker  
Faker is a package that generates fake data for you. Whether you need to bootstrap your database, create good-looking XML documents, fill your persistence layer to stress-test it, or anonymize data taken from a production service, Faker is for you.  
Project Source: https://github.com/joke2k/faker  
Project Documentation: http://fake-factory.readthedocs.org/en/latest/

1. tablib  
Tablib is a format-agnostic tabular dataset library, written in Python.  
Project Source: https://github.com/kennethreitz/tablib  
Project Documentation: http://docs.python-tablib.org/en/latest/

1. data_hacks  
Command-line utilities for data analysis.  
Project Source: https://github.com/bitly/data_hacks  

1. fuzzywuzzy  
Fuzzy string matching like a boss.  
Project Source: https://github.com/seatgeek/fuzzywuzzy  

1. snownlp   
Python library for processing Chinese text.   
Project Source: https://github.com/isnowfy/snownlp   

1. jieba   
Chinese text segmentation.  
Project Source: https://github.com/fxsjy/jieba  
Online Demo: http://jiebademo.ap01.aws.af.cm/

1. cubes   
Lightweight Python OLAP framework for multi-dimensional data analysis.  
Project Source: https://github.com/Stiivi/cubes   
Project Homepage: http://cubes.databrewery.org/
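
For readers unfamiliar with the term, the "fuzzy string matching" mentioned in the fuzzywuzzy entry above can be illustrated with a minimal sketch built only on Python's standard-library `difflib`. This is not fuzzywuzzy's API: the helper names `similarity` and `best_match` are hypothetical, and the 0-100 scoring scale is an assumption chosen for readability.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> int:
    """Return a case-insensitive similarity score between 0 and 100.

    Hypothetical helper for illustration; not part of any library listed here.
    """
    ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
    return round(100 * ratio)


def best_match(query: str, choices: list[str]) -> str:
    """Return the choice that scores highest against the query."""
    return max(choices, key=lambda choice: similarity(query, choice))


if __name__ == "__main__":
    teams = ["Atlanta Braves", "New York Mets", "New York Yankees"]
    # A misspelled query still resolves to the closest candidate.
    print(best_match("new york mets", teams))
    print(similarity("new york mets", "new york meats"))
```

A dedicated library is usually the better choice for real workloads, since it typically adds token-based comparisons and faster scoring than this sketch provides.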