Posts

Introduction to Data Visualization

In a world where raw data arrives in huge chunks, it is difficult for us humans to quickly grasp the necessary data (or information)... yeah, you got me... information, because information is raw data that has been processed, interpreted, handled, and structured to give it meaning. We as humans would rather look at graphical visualizations than at 2-D data, because a visualization gives clear insight into what needs to be done and what does not.

Size of Data

Based on a graphic from 2015 by Ben Walker, 2.5 quintillion b...

Data Scraping

What is Data Scraping?

Data scraping, also known as web scraping, is the process of importing information from a website into a spreadsheet or a local file saved on your computer. The data can be static or dynamic; either works, as long as the website you specify is the right one.

Some domains of web scraping

Web scraping can be done by writing code or by using existing software tools. Those tools have to be paid for, so if we want to work without being charged, programming is the way to go. The programming language widely used for scraping is Python. Let us look at the code below: ...
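To make the idea of pulling website content into spreadsheet-like rows concrete, here is a minimal sketch. It is not the code from the post: it uses only Python's standard-library `html.parser` and a hard-coded HTML snippet instead of a live page, and the names `SAMPLE_HTML` and `TableParser` are made up for illustration.

```python
from html.parser import HTMLParser

# Hypothetical page content; a real scraper would download this from a website.
SAMPLE_HTML = """
<table>
  <tr><th>Name</th><th>Role</th></tr>
  <tr><td>A</td><td>Web Developer</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every table cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []         # finished rows
        self._row = None       # row currently being built
        self._in_cell = False  # True while inside <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        # Only keep text that sits inside a cell.
        if self._in_cell:
            self._row[-1] += data

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


parser = TableParser()
parser.feed(SAMPLE_HTML)
rows = parser.rows
```

After this runs, `rows` holds one list per table row, which is the shape a spreadsheet or CSV file expects.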

How to use User Agents while scraping data

User Agents

As I mentioned in the previous post, a browser's user agent is a string that helps identify which browser is being used, which version it is, and which operating system it is running on. So if you scrape data intensively with the same User Agent every time, a site's automated defenses may notice and block you from scraping data.

Let us consider the code below:

    from bs4 import BeautifulSoup as soup
    import requests
    import pandas as pd
    from pandas import ExcelWriter
    headers = { 'User-Agent' : 'Mozilla/5.0 (Windo...
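One common way to avoid sending the same User Agent on every request is to pick one at random from a small pool. The sketch below illustrates that idea and is not the code from the post: the strings in `USER_AGENTS` are illustrative examples, `build_headers` is a hypothetical helper, and no request is actually sent.

```python
import random

# Illustrative User-Agent strings (assumptions, not taken from the post).
# For real use, replace them with current strings from real browsers.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.0 Safari/605.1.15",
    "Mozilla/5.0 (X11; Linux x86_64; rv:120.0) Gecko/20100101 Firefox/120.0",
]


def build_headers(rng=random):
    """Return request headers with a randomly chosen User-Agent."""
    return {"User-Agent": rng.choice(USER_AGENTS)}


headers = build_headers()
# With the requests library, the headers would be passed like this:
#   response = requests.get(url, headers=headers)
```

Calling `build_headers()` before each request means consecutive requests do not all carry the same identifying string.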

How to store data after scraping it?

Methods of storing data after scraping

You might be wondering where and how to store data after getting it. Fortunately, there are many ways, such as a traditional relational database or a Python dataframe, which does not require any connections (but does require package installation).

Formats:

1. CSV: As the name suggests, the data is stored in comma-separated values format, where the first row generally holds the column names.

    Name,Age,Gender,Role             // First row depicting column names
    A,28,Male,Web Developer     ...
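As a small illustration of the CSV format described above, the sketch below writes the same header and sample row with Python's built-in `csv` module and reads them back. It writes to an in-memory buffer rather than a file so that it runs anywhere; the variable names are chosen only for this example.

```python
import csv
import io

# The header row and the sample record from the CSV example above.
rows = [
    ["Name", "Age", "Gender", "Role"],
    ["A", "28", "Male", "Web Developer"],
]

# Write the rows as CSV text. A real script would usually write to a file
# opened with open("output.csv", "w", newline="") instead of a buffer.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerows(rows)
csv_text = buffer.getvalue()

# Read the text back; DictReader uses the first row as the column names.
records = list(csv.DictReader(io.StringIO(csv_text)))
```

Each entry in `records` is a dictionary keyed by column name, so `records[0]["Role"]` gives the role stored in the first data row.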

How to adjust output size in PyCharm

By default, the PyCharm IDE generally shows only five columns of a DataFrame in its output. This becomes an issue whenever the user wants to see most or all of the columns. It can be avoided by adding the few lines of code given below:

    import numpy as np
    import pandas as pd

    desired_width = 320
    number_of_columns = 30
    pd.set_option('display.width', desired_width)
    np.set_printoptions(linewidth=desired_width)
    pd.set_option('display.max_columns', number_of_columns)

Here the desired width is measured in characters per output line (not pixels), and the number of columns can be set according to the user's needs. display.width and display.max_columns are option names that are passed to the set_option function of pandas.
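To see what these options change, the sketch below prints the same 30-column DataFrame twice. It is only an illustration under stated assumptions: the DataFrame and its `col_0` to `col_29` column names are made up, and the truncated view is simulated by temporarily limiting pandas to 5 columns with `pd.option_context`, rather than relying on whatever defaults a particular IDE applies.

```python
import pandas as pd

# Hypothetical wide DataFrame with 30 columns and 3 rows.
df = pd.DataFrame({f"col_{i}": [i, i + 1, i + 2] for i in range(30)})

# Simulate a truncated view: at most 5 columns, 80 characters per line.
with pd.option_context("display.max_columns", 5, "display.width", 80):
    narrow_repr = repr(df)

# Apply the settings from the post: wider lines and up to 30 columns.
pd.set_option("display.width", 320)
pd.set_option("display.max_columns", 30)
wide_repr = repr(df)

print(narrow_repr)  # middle columns are replaced by "..."
print(wide_repr)    # every column is printed
```

Unlike `pd.set_option`, the `pd.option_context` block restores the previous settings when it ends, which is handy when the wider output is needed only in one place.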