# BL-data-manipulation

**Repository Path**: cai_ya_jun/BL-data-manipulation

## Basic Information

- **Project Name**: BL-data-manipulation
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2020-11-24
- **Last Updated**: 2020-12-19

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Data manipulation

Data manipulation in Python

- [numpy](https://numpy.org/doc/) - n-dimensional arrays
- [pandas](https://pandas.pydata.org/docs/) - dataframes
- [vaex](https://docs.vaex.io/en/latest/index.html) - out-of-core dataframes
- [dask](https://docs.dask.org/en/latest/) - multi-core and distributed parallel execution
- [koalas](https://koalas.readthedocs.io/en/latest/) - pandas API on Apache Spark
- [darts](https://unit8co.github.io/darts/) - time series manipulation
- [faker](https://faker.readthedocs.io/en/stable/) - generate fake data
- [missingno](https://github.com/ResidentMario/missingno) - visualize missing data
- [impyute](https://github.com/eltonlaw/impyute) - missing data imputation
- [imbalanced-learn](https://imbalanced-learn.org/stable/) - re-sampling for imbalanced data
- [flanker](https://github.com/mailgun/flanker) - email address data parser
- [email-validator](https://github.com/JoshData/python-email-validator) - validate email addresses
- [sendgrid](https://sendgrid.com/solutions/email-api/email-address-validation-api/) - SendGrid email validation API
- [pandas-profiling](https://github.com/pandas-profiling/pandas-profiling) - profile reports from a pandas DataFrame
- [geopandas](https://github.com/geopandas/geopandas) - pandas for geographic data
- [geopy](https://github.com/geopy/geopy) - geocoding data

Articles:

- [dask vs vaex](https://towardsdatascience.com/dask-vs-vaex-for-big-data-38cb66728747)
- [vaex](https://towardsdatascience.com/how-to-process-a-dataframe-with-billions-of-rows-in-seconds-c8212580f447)
- [dask](https://towardsdatascience.com/are-you-still-using-pandas-for-big-data-12788018ba1a)

![data](https://github.com/boyuan-li/BL-data-manipulation/blob/master/photos/1.png)

![data](https://github.com/boyuan-li/BL-data-manipulation/blob/master/photos/2.png)

![data](https://github.com/boyuan-li/BL-data-manipulation/blob/master/photos/3.png)
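
As a rough illustration of the kind of work the numpy and pandas entries in the list cover, here is a minimal sketch that counts missing values, fills them, and aggregates. The column names and values are made up for this example and are not part of this repository; the group-mean fill is only a simple stand-in for what a dedicated imputation library would offer.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: the names and values are invented for illustration.
df = pd.DataFrame(
    {
        "city": ["Oslo", "Oslo", "Bergen", "Bergen"],
        "temp_c": [3.0, np.nan, 7.0, 9.0],
    }
)

# Count missing values per column.
missing_per_column = df.isna().sum()

# Fill each missing temperature with the mean of its own city.
# groupby(...).transform("mean") returns a Series aligned with df's index,
# so it can be passed straight to fillna.
df["temp_filled"] = df["temp_c"].fillna(
    df.groupby("city")["temp_c"].transform("mean")
)

# Aggregate to one value per city.
city_means = df.groupby("city")["temp_filled"].mean()

print(missing_per_column)
print(city_means)
```

The same groupby-and-aggregate pattern is what dask and koalas aim to reproduce on larger-than-memory or distributed data, though their APIs differ in places, so check each library's documentation before porting code.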