Hands-On Tutorial On Lens: Python Tool For Swift Statistical Analysis

Whenever we work with a dataset, the first step is usually to understand what the data is about. We do this through Exploratory Data Analysis (EDA): analyzing the data with statistical techniques and visualizations to get a clear idea of what we are dealing with. In EDA we examine the different attributes and their statistical properties, and we visualize the data using different graphs and plots.

EDA is an essential step that we cannot skip, but it is often time-consuming because we have to write separate code for the statistical properties and for each type of visualization. Several Python libraries and modules reduce the time and effort EDA takes by offering simple, easy-to-use code. Lens is one such library.

Lens is an open-source Python library used for fast calculation of summary statistics and correlation in a dataset. It helps us explore the properties of the different attributes of the dataset in just a single line of code. It creates different types of visualizations for all the attributes in the data, and it works on both numerical and categorical data. It is fast and easy to use.



In this article, we will explore how to perform EDA using Lens and save time and effort.

Implementation:

We will start by installing Lens using pip install lens.
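Before running the rest of the tutorial, it can help to confirm that the packages are importable. The snippet below is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical, and it assumes the importable module names match the pip package names (`pandas`, `lens`).

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


# Report which of the tutorial's dependencies are importable here.
for name in ("pandas", "lens"):
    if is_installed(name):
        print(f"{name}: available")
    else:
        print(f"{name}: missing (try: pip install {name})")
```

This only checks that a module can be located; it does not verify the installed version.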

  1. Importing Required Libraries

We will load the dataset using pandas, so we import pandas, and we import lens for data analysis and visualization.

import pandas as pd

import lens

  1. Loading the Dataset

The dataset we will use here is an advertising dataset of an MNC, which contains different attributes such as ‘Sales’, ‘TV’, etc. We will load this dataset using pandas.

df = pd.read_csv('Advertising.csv')

df

Dataset Used
  1. Statistical Analysis of Data

Now that we have loaded the dataset, we will display its statistical properties. We will use the summarise and explore functions to do this.

information = lens.summarise(df)

exp = lens.explore(information)

exp.describe()

Dataset Summary

Similarly, we can use these functions to display the properties of a single column.

exp.column_details('Sales')

Column Summary
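
To make concrete what a per-column summary contains, here is a minimal standard-library sketch. It is not Lens's implementation, and the exact set of statistics Lens reports may differ. The helper `summarise_column` and the sample values are made up for illustration and are not taken from Advertising.csv.

```python
import statistics


def summarise_column(values):
    """Return basic summary statistics for a list of numbers."""
    ordered = sorted(values)
    return {
        "count": len(ordered),
        "min": ordered[0],
        "max": ordered[-1],
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "std": statistics.stdev(ordered),  # sample standard deviation
    }


# Made-up illustrative values, NOT real rows from the advertising dataset.
sales = [10.0, 12.0, 14.0, 16.0, 18.0]
summary = summarise_column(sales)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Writing this by hand for every column is the repetitive work that a summary library is meant to remove.
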
  1. Correlation in Dataset

Analyzing and visualizing correlation is easy in Lens; we just need to write a single line of code.

exp.correlation()

Correlation Matrix

exp.correlation_plot()

Correlation Plot
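
For readers who want to see what a correlation coefficient measures, the sketch below computes two common variants by hand: Pearson (linear) and Spearman (rank-based). The article does not state which measure Lens reports, so treat this as an illustration only. The helper names and the sample values are assumptions made for the example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Made-up illustrative values: sales grows monotonically but not linearly.
tv = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [1.0, 4.0, 9.0, 16.0, 25.0]
print("pearson :", round(pearson(tv, sales), 4))
print("spearman:", round(spearman(tv, sales), 4))
```

The two measures differ here because the relationship is monotonic but curved: the rank-based coefficient captures it fully, while the linear one does not.
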
  1. Visualization

We can easily visualize different attributes of the dataset using the plots that are already defined in Lens. Let us look at some of the visualizations.

exp.distribution_plot('Sales')

Distribution Plot

exp.cdf_plot('Newspaper')

CDF Plot
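
A CDF plot shows, for each value, the fraction of observations that are less than or equal to it. The sketch below computes an empirical CDF with plain Python to show the idea; it is not how Lens builds its plot, and the helper `ecdf` and the sample values are made up for illustration.

```python
def ecdf(values):
    """Return the sorted values and the cumulative fraction at each one."""
    ordered = sorted(values)
    n = len(ordered)
    # The i-th smallest value (0-based) has (i + 1) observations <= it.
    # With tied values, the last entry of each tie holds the CDF value.
    fractions = [(i + 1) / n for i in range(n)]
    return ordered, fractions


# Made-up illustrative values, NOT real rows from the advertising dataset.
newspaper = [30.0, 10.0, 50.0, 20.0, 40.0]
xs, fs = ecdf(newspaper)
for x, f in zip(xs, fs):
    print(f"{x:>5.1f}  {f:.2f}")
```

Plotting `xs` against `fs` as a step line gives the familiar staircase shape of an empirical CDF.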

Lens has an attractive function named ‘interactive_explore’ which creates a user interface where users can select different attributes and different types of plots. Let us visualize this interface.

lens.interactive_explore(information)

Distribution Plot

Here you can clearly see that we can select different attributes and visualize different types of plots and graphs for those attributes. Let us look at some other plots as well.

Density Plot
CDF Plot

Conclusion:

In this article, we learned about Lens, which helps in fast calculation of summary statistics and correlation. We saw how to use Lens to analyze the statistical properties of a dataset as well as of single columns. We also saw the different types of visualizations that Lens provides and created some of the plots. Finally, we saw the interactive function, which creates a user interface for selecting different graphs and plots for different attributes. Lens makes the process of data analysis and visualization simpler and easier.



Himanshu Sharma

An aspiring Data Scientist currently pursuing an MBA in Applied Data Science, with an interest in the financial markets. I have experience in Data Analytics, Data Visualization, Machine Learning, creating dashboards, and writing articles related to Data Science.
