Data science is the practice of deriving knowledge and insights from large and diverse data sets by organizing, processing, and analyzing the data. It involves mathematical and statistical modeling, extracting data from its sources, and applying data visualization techniques. It also includes working with big data technologies to gather both structured and unstructured data.
Consider a recommendation system as an example. As online shopping has become prevalent, e-commerce platforms capture users' shopping preferences as well as the performance of various products in the market. This makes it possible to build recommendation systems: models that predict a shopper's needs and show the products the shopper is most likely to buy.
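To make the recommendation idea concrete, here is a minimal sketch that suggests products frequently bought together. The order data, product names, and both function names are hypothetical, and real recommendation systems use far richer models than simple co-purchase counts.

```python
from collections import Counter
from itertools import permutations

# Hypothetical purchase history: one set of products per order.
orders = [
    {"laptop", "mouse", "laptop bag"},
    {"laptop", "mouse"},
    {"phone", "phone case"},
    {"laptop", "laptop bag"},
    {"phone", "charger", "phone case"},
]


def build_co_purchase_counts(orders):
    """Count how often each product appears in the same order as another."""
    counts = {}
    for order in orders:
        # permutations yields every ordered pair (bought, other) within one order
        for bought, other in permutations(order, 2):
            counts.setdefault(bought, Counter())[other] += 1
    return counts


def recommend(product, counts, top_n=2):
    """Return up to top_n products most often bought together with `product`."""
    return [item for item, _ in counts.get(product, Counter()).most_common(top_n)]


counts = build_co_purchase_counts(orders)
print(recommend("laptop", counts))
print(recommend("phone", counts, top_n=1))
```

A shopper who buys a laptop is shown the items that most often appeared in the same orders as a laptop. A product with no purchase history simply yields an empty list.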
Python Data Science:
The programming requirements of data science demand a versatile yet flexible language that is simple to write but can handle highly complex mathematical processing. Python is well suited to these requirements, as it has already established itself as a language for both general and scientific computing. The features of Python that make it a preferred language for data science are:
- The language is simple and easy to learn. It often takes fewer lines of code than similar languages such as R. This simplicity makes it easier to handle complex scenarios with minimal code and less confusion about the general flow of the program.
- It is cross-platform. The same code can generally run in multiple environments without changes, provided a Python interpreter and any required libraries are installed.
- It is often described as executing faster than similar data analysis languages such as R and MATLAB, although actual performance depends on the workload and the libraries used.
- It has good memory management, including automatic garbage collection, which helps it gracefully handle large volumes of data during transformation, slicing, dicing, and visualization.
- Python has a very large collection of libraries that serve as special-purpose analysis tools. For example, the NumPy package deals with scientific computing, and its array needs less memory than a conventional Python list for numeric data.
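The memory claim about NumPy arrays can be checked with a rough sketch like the one below. It assumes NumPy is installed (`pip install numpy`). As a labeled fallback it uses the standard-library `array` module, which packs numbers in the same way. The helper names `list_bytes` and `packed_bytes` are illustrative, and the list figure is only an approximation based on `sys.getsizeof`.

```python
import sys
from array import array

try:
    import numpy as np  # third-party package, assumed to be installed
except ImportError:
    np = None  # fall back to the standard-library array type below


def list_bytes(values):
    """Approximate memory of a list: the container plus every int object in it."""
    return sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)


def packed_bytes(values):
    """Memory of the same numbers stored as packed 64-bit integers."""
    if np is not None:
        return np.array(values, dtype=np.int64).nbytes
    packed = array("q", values)  # "q" is a signed 64-bit integer
    return packed.itemsize * len(packed)


numbers = list(range(100_000))
print("list  :", list_bytes(numbers), "bytes")
print("packed:", packed_bytes(numbers), "bytes")
```

The list stores a pointer to a separate integer object for each element, while the packed array stores the raw 8-byte values side by side. That difference is where the saving comes from.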
The most up-to-date source code, binaries, documentation, and news are available on the official Python website: http://www.python.org
Installing Python:
Python distributions are available for a wide variety of platforms. We need to download only the binary code applicable to our platform and install Python. If binary code for our platform is not available, we need a C compiler to compile the source code manually. Compiling the source code also offers more flexibility in choosing the features that we require in our installation.
Unix and Linux installation
The steps to install Python on a Unix/Linux machine are:
- Open a web browser and go to https://www.python.org/downloads
- Follow the link to download the zipped source code available for Unix/Linux.
- Download and extract the files.
- Edit the Modules/Setup file if we need to customize some options.
- Run the ./configure script.
- Run make.
- Run make install.
Windows installation
The steps to install Python on a Windows machine are:
- Open a web browser and go to https://www.python.org/downloads/
- Follow the link for the Windows installer file python-XYZ.msi, where XYZ is the version that we need to install.
- Run the downloaded file to complete the installation.
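Once installation finishes on either platform, a short script can confirm which interpreter is actually being run. This is a minimal sketch using only the standard library, and the function name `describe_installation` is illustrative.

```python
import platform
import sys


def describe_installation():
    """Collect basic facts about the Python interpreter running this script."""
    return {
        "version": platform.python_version(),  # for example "3.x.y"
        "executable": sys.executable,          # full path of the interpreter
        "system": platform.system(),           # for example "Linux" or "Windows"
    }


for name, value in describe_installation().items():
    print(f"{name:>10}: {value}")
```

If the reported version or executable path is not the one just installed, another Python installation is probably earlier on the path.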
Setting the path
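The path lists the directories that the operating system searches for executables. It is stored in an environment variable, which is a named string maintained by the OS. This variable contains information that is available to the command shell and other programs. The path variable is named PATH in Unix and Path in Windows (Unix is case-sensitive; Windows is not).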
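The path variable can also be inspected from Python itself. The sketch below uses only the standard library. The helper name `path_entries` is illustrative, and the executable names `python` and `python3` are assumptions about how the interpreter is named on a given system.

```python
import os
import shutil


def path_entries():
    """Split the PATH environment variable into its individual directories."""
    raw = os.environ.get("PATH", "")
    # os.pathsep is ":" on Unix and ";" on Windows
    return [entry for entry in raw.split(os.pathsep) if entry]


for directory in path_entries():
    print(directory)

# shutil.which searches the PATH directories for an executable
# and returns None when nothing matching is found.
print("python found at:", shutil.which("python") or shutil.which("python3"))
```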
Setting the path in Windows
To add the Python directory to the path for a particular session in Windows, type the following at the command prompt and press Enter:
path %path%;C:\Python
Here C:\Python is the path of the Python directory.
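A similar session-only change can be sketched from inside a Python program. The change lasts only for the running process and any child processes it starts. The function name `append_to_path` is hypothetical, and C:\Python is the example directory from the text, not necessarily where Python is installed on a given machine.

```python
import os


def append_to_path(directory, env=None):
    """Append `directory` to PATH in `env` (default: this process) if it is missing."""
    env = os.environ if env is None else env
    entries = [p for p in env.get("PATH", "").split(os.pathsep) if p]
    if directory not in entries:
        entries.append(directory)
    env["PATH"] = os.pathsep.join(entries)
    return env["PATH"]


# Only meaningful on Windows, where C:\Python is a valid directory name.
if os.name == "nt":
    append_to_path(r"C:\Python")
    print(os.environ["PATH"])
```

As with the command prompt version, nothing is saved permanently: once the process exits, the original path is back in effect.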
Questions:
- What is Python Data Science?
- What are the features of Python Data Science?