To create a DataFrame
A DataFrame contains an ordered collection of columns and rows whose values can be of different types (numeric, string, Boolean, etc.). It can be created from lists, a dictionary, a list of dictionaries, and so on.
In practice, a Pandas DataFrame is often created by loading a dataset from existing storage, such as an SQL table, a CSV file or an Excel file, which may contain time series data, stock exchange data, employee personnel details, etc. Although a DataFrame is physically two-dimensional, hierarchical indexing allows higher-dimensional data to be represented in a tabular style.
A data frame in Python may also be created from the existing data structures used in a program and may contain values of different data types such as integer, float, character, string and many more. We detail below several different methods that are commonly used for constructing a data frame.
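As a minimal sketch of loading from storage, the snippet below reads CSV data with `pd.read_csv`. To keep it self-contained, it assumes the CSV text lives in an in-memory string rather than a real file; the column names and values are hypothetical sample data.

```python
import io

import pandas as pd

# Hypothetical CSV content; in practice this would be a file on disk.
csv_text = "Name,Marks\nAsha,91\nRavi,85\n"

# read_csv accepts a path or any file-like object, so StringIO stands in for a file.
df = pd.read_csv(io.StringIO(csv_text))
print(df)
```

Passing a file path instead of the `StringIO` object would read the same data from disk.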
Creating a data frame using the list data structure.
A DataFrame may be created using a single list or a list of lists. A single list produces one column, while in a list of lists each inner list becomes one row.
In the above code, a data frame is created with the default column label 0, as no column name is explicitly provided while constructing the data frame df. Also, rows are labelled with default index values starting from 0.
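A minimal sketch of the single-list case follows. The list contents are hypothetical sample data; only the `pd.DataFrame` constructor call matters here.

```python
import pandas as pd

# Hypothetical sample data: one flat list.
fruits = ["apple", "banana", "cherry"]

# No column names are given, so pandas assigns the default label 0.
df = pd.DataFrame(fruits)
print(df)
```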
Creating a DataFrame from a dictionary.
Each value component in the dictionary must be of the same length. In the following example, each value component of the dictionary is a list.
In the above code, a data frame is created with two columns, 'Name' and 'Class', and rows are labelled with default index values starting from 0.
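A minimal sketch of the dictionary case, using the column names 'Name' and 'Class'; the values themselves are hypothetical sample data.

```python
import pandas as pd

# Hypothetical sample data: each key becomes a column, each list its values.
data = {
    "Name": ["Asha", "Ravi", "Meena"],
    "Class": [10, 11, 12],
}

# Both lists have the same length, as the constructor requires.
df = pd.DataFrame(data)
print(df)
```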
Creating a DataFrame using an N-dimensional array.
Each nested array must be of the same length. If an index is passed, its length should be equal to the array length (the number of rows). If no index is passed, then by default the index will be range(n), where n is the array length.
In the above example, narray is a 2D array where row 0 corresponds to the name of the student, row 1 corresponds to the marks obtained and row 2 corresponds to the rank obtained by the student. Note that we have given names to the columns in the data frame explicitly, as there is no name associated with each array in narray. Hence, the data frame will have four columns, one for each student.
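A minimal sketch of the array case is given here, assuming a 2D NumPy array named `narray` with one row each for names, marks and ranks. The student data and the column labels are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 0 = names, row 1 = marks, row 2 = ranks.
# A NumPy array holds a single dtype, so the numbers are stored as strings here.
narray = np.array([
    ["Asha", "Ravi", "Meena", "Kabir"],
    [91, 85, 78, 88],
    [1, 3, 4, 2],
])

# Column names are supplied explicitly; the index defaults to range(3).
df = pd.DataFrame(narray, columns=["Student1", "Student2", "Student3", "Student4"])
print(df)
```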
Creating a DataFrame from a list of dictionaries.
A list of dictionaries can be passed as input data to create a pandas DataFrame. By default, the dictionary keys are taken as the column names.
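A minimal sketch of the list-of-dictionaries case; the records are hypothetical sample data. Each dictionary becomes one row, and a key missing from a record shows up as NaN in that row.

```python
import pandas as pd

# Hypothetical sample data: one dictionary per row.
records = [
    {"Name": "Asha", "Marks": 91},
    {"Name": "Ravi", "Marks": 85, "Rank": 2},
]

# Keys become columns; the first record has no "Rank", so that cell is NaN.
df = pd.DataFrame(records)
print(df)
```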
Creating DataFrame from lists using zip() function.
In this method, the user can create a DataFrame using Python's built-in zip() function. zip() pairs up the corresponding elements of two lists, and the resulting list of tuples is passed to the DataFrame constructor, one tuple per row.
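A minimal sketch of the zip() approach; the two lists and the column names are hypothetical sample data.

```python
import pandas as pd

# Hypothetical sample data: two parallel lists.
names = ["Asha", "Ravi", "Meena"]
marks = [91, 85, 78]

# zip() yields (name, mark) tuples; list() materialises them as rows.
rows = list(zip(names, marks))
df = pd.DataFrame(rows, columns=["Name", "Marks"])
print(df)
```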
Creating DataFrame from a dictionary of pandas series.
Earlier, a DataFrame was created using a dictionary of lists. Now we will first create a dictionary of pandas Series and then use it to create a data frame. The resultant data frame will have three columns, one for each key, and the number of rows will be the same as the size of each Series in the value part.
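A minimal sketch of the dictionary-of-Series case with three keys; the key names and values are hypothetical sample data.

```python
import pandas as pd

# Hypothetical sample data: each value is a pandas Series of the same size.
series_dict = {
    "Name": pd.Series(["Asha", "Ravi", "Meena"]),
    "Marks": pd.Series([91, 85, 78]),
    "Rank": pd.Series([1, 2, 3]),
}

# Each key becomes a column; rows are aligned on the Series index (0, 1, 2).
df = pd.DataFrame(series_dict)
print(df)
```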
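The steps below trace how each element of a list of three sub-lists is placed in a DataFrame with columns 'One', 'Two' and 'Three':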
Row Index | Column 'One' | Column 'Two' | Column 'Three' |
0 | 1 | 2 | 3 |
1 | 4 | 5 | 6 |
2 | 7 | 8 | 9 |
The start of row 0
The element 1 of sub-list 1 is assigned to [ 0 , 0 ] in the DataFrame
The element 2 of sub-list 1 is assigned to [ 0 , 1 ] in the DataFrame
The element 3 of sub-list 1 is assigned to [ 0 , 2 ] in the DataFrame
The end of row 0
The start of row 1
The element 4 of sub-list 2 is assigned to [ 1 , 0 ] in the DataFrame
The element 5 of sub-list 2 is assigned to [ 1 , 1 ] in the DataFrame
The element 6 of sub-list 2 is assigned to [ 1 , 2 ] in the DataFrame
The end of row 1
The start of row 2
The element 7 of sub-list 3 is assigned to [ 2 , 0 ] in the DataFrame
The element 8 of sub-list 3 is assigned to [ 2 , 1 ] in the DataFrame
The element 9 of sub-list 3 is assigned to [ 2 , 2 ] in the DataFrame
The end of row 2
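The element-by-element placement traced above can be reproduced with a short sketch. It assumes the three sub-lists are [1, 2, 3], [4, 5, 6] and [7, 8, 9], as the table suggests, and uses `iloc` to read back a value at a given [row, column] position.

```python
import pandas as pd

# Each inner list becomes one row of the DataFrame.
sub_lists = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
df = pd.DataFrame(sub_lists, columns=["One", "Two", "Three"])

# Walk every [row, column] position and show which element landed there.
for row in range(len(sub_lists)):
    for col in range(len(sub_lists[row])):
        print(f"[{row}, {col}] holds {df.iloc[row, col]}")
```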
Hence, a DataFrame can be created by any of these methods. Representing data as a table is often preferred over linear arrays.
Mr. Sushant Sahrma, B.Sc Physical Sciences with Computer Science, III year,
Ms. Alia, B.Sc Physical Sciences with Computer Science, II year.
Mentor:
Prof. Sharanjit Kaur,
Ms. Gunjan Rani