Tags

keep hungry keep foolish
R

TOP -- Data Science Knowledge Complete Summary

This is a continuously updated data science concept post, edited and organized by me, that summarizes and sorts out CS, statistics, machine learning, and other related topics from my past learning, all in one place.


Validation of XML Files using R

How R enables automatic XML file validation and error log collection


Handling Big Data in R

A step-by-step guide to handling a large (30 GB) dataset in R


Citadel West Coast Regional Data Open 2020

Bikeshare Market Analysis in NYC at Citadel Data Open


Text Analytics and Neural Network

Predict the influential factors in college based on students' comments on college life.


Sleep Quality Analysis using R

A follow-up to the previous Python project, rewritten in R.


Natural Language Processing for Entity (name, place, etc.) Extraction using R

Apply NLP techniques in R to annotate people and places in text files and extract them into a clean table.


Jupyter Lab Customization with Python, Java, C++, R, Matlab Environments and SQL, Diagram, Markdown Interface

A step-by-step guide to customizing your Jupyter project


HopSkipDrive Driver Marketplace Analysis

A marketplace analysis of 27k supplier and customer records, including cohort analysis, concentration, take rate, conversion rate, power users, etc., using Excel and R.


EY NextWave Data Science Challenge 2019

Local/regional finalist, ranked top 10 in the US, and regional finalist in China among 2,936 participants.


Zipline Unmanned Aerial Vehicle Data Exploration & Analysis

An unstructured, independent exploratory data analysis and visualization assignment using 450+ flight dataset CSV files to discern details and find patterns, business insights, engineering risks, or anomalies.


UCLA Data Fest 2019 -- Sports Analytics for Athlete's Fatigue Levels

Effects of Acute and Chronic Fatigue on a Rugby Player’s Performance and Advice for Coaches.


Demographic Analysis of People in City of Seattle

Tableau Desktop can be a powerful tool for studying census data statistically and displaying plots that demonstrate business insights and other interesting findings.


Textbook Resources for Data Science (Copyright Owned by the Authors)

Textbooks from the Public Internet


Data Mining & Machine Learning applied in Predictive Analysis

Exploring the most influential of 79 potential variables for predicting affordability, and the most effective model, by applying different classification methods including Logistic Regression, K-Nearest Neighbors, and Random Forest


Experiment Design -- The Effects of Emotion and Alcohol Consumption on Short-Term Memory

A Randomized Complete Block Design (RCBD) is chosen for our research. Time spent playing a memory game serves as the variable of interest. A shorter time to finish the memory game indicates better memory ability in the participant.


Regression Analysis on Happiness Level Project

The first regression analysis of happiness level and other dependent variables on survey data.


Project

Citadel West Coast Regional Data Open 2020

Bikeshare Market Analysis in NYC at Citadel Data Open


Text Analytics and Neural Network

Predict the influential factors in college based on students' comments on college life.


Natural Language Processing for Entity (name, place, etc.) Extraction using R

Apply NLP techniques in R to annotate people and places in text files and extract them into a clean table.


Machine Learning Application on Heart Disease Prediction

Preventing heart disease is important. Good data-driven systems for predicting heart disease can improve the entire research and prevention process, making sure that more people can live healthy lives.


Web-browser Automation with Selenium

With Selenium, Python lets users enter, search, scrape, and manipulate information from any source in a single script, with one click to run the code and get the result.


EY NextWave Data Science Challenge 2019

Local/regional finalist, ranked top 10 in the US, and regional finalist in China among 2,936 participants.


Zipline Unmanned Aerial Vehicle Data Exploration & Analysis

An unstructured, independent exploratory data analysis and visualization assignment using 450+ flight dataset CSV files to discern details and find patterns, business insights, engineering risks, or anomalies.


UCLA Data Fest 2019 -- Sports Analytics for Athlete's Fatigue Levels

Effects of Acute and Chronic Fatigue on a Rugby Player’s Performance and Advice for Coaches.


Demographic Analysis of People in City of Seattle

Tableau Desktop can be a powerful tool for studying census data statistically and displaying plots that demonstrate business insights and other interesting findings.


Data Mining & Machine Learning applied in Predictive Analysis

Exploring the most influential of 79 potential variables for predicting affordability, and the most effective model, by applying different classification methods including Logistic Regression, K-Nearest Neighbors, and Random Forest


Experiment Design -- The Effects of Emotion and Alcohol Consumption on Short-Term Memory

A Randomized Complete Block Design (RCBD) is chosen for our research. Time spent playing a memory game serves as the variable of interest. A shorter time to finish the memory game indicates better memory ability in the participant.


Regression Analysis on Happiness Level Project

The first regression analysis of happiness level and other dependent variables on survey data.


Python

My Notes for Online Experiments -- A/B Testing

Practical applications and concepts of A/B testing.


Python String Pattern Process

Using Python to reshape a string column into one long string and perform pattern matching


Energy Consumption Calculation Based on Different Time and Price (Tariff) Using Python

Python Calculation and Time Manipulation


Validation of XML Files using R

How R enables automatic XML file validation and error log collection


Citadel West Coast Regional Data Open 2020

Bikeshare Market Analysis in NYC at Citadel Data Open


Sleep Quality Analysis using Python

Final project demo for the Python data analysis course at UCLA, fall 2019.


Jupyter Lab Customization with Python, Java, C++, R, Matlab Environments and SQL, Diagram, Markdown Interface

A step-by-step guide to customizing your Jupyter project


Machine Learning Application on Heart Disease Prediction

Preventing heart disease is important. Good data-driven systems for predicting heart disease can improve the entire research and prevention process, making sure that more people can live healthy lives.


Web-browser Automation with Selenium

With Selenium, Python lets users enter, search, scrape, and manipulate information from any source in a single script, with one click to run the code and get the result.


SPE JupyterHub & Python on remote Linux/Unix servers

A presentation to 45 related/interested fellows at Sony Pictures, summer 2019 -- architecture for R, Python, and Julia environments for corporate data science project initiatives.


EY NextWave Data Science Challenge 2019

Local/regional finalist, ranked top 10 in the US, and regional finalist in China among 2,936 participants.


Zipline Unmanned Aerial Vehicle Data Exploration & Analysis

An unstructured, independent exploratory data analysis and visualization assignment using 450+ flight dataset CSV files to discern details and find patterns, business insights, engineering risks, or anomalies.


UCLA Data Fest 2019 -- Sports Analytics for Athlete's Fatigue Levels

Effects of Acute and Chronic Fatigue on a Rugby Player’s Performance and Advice for Coaches.


Textbook Resources for Data Science (Copyright Owned by the Authors)

Textbooks from the Public Internet


Data Mining & Machine Learning applied in Predictive Analysis

Exploring the most influential of 79 potential variables for predicting affordability, and the most effective model, by applying different classification methods including Logistic Regression, K-Nearest Neighbors, and Random Forest