codingstuff.io
Pandas Tutorial

Updated 2026-05-15

Introduction

Welcome to the Pandas tutorial! Pandas is a powerful open-source library in Python that provides high-performance, easy-to-use data structures and data analysis tools. It's an essential tool for anyone working with structured data, especially in fields like data science and machine learning.

In this tutorial, we'll cover the basics of creating Series and DataFrames, reading data from CSV and Excel files, selecting and filtering data using loc and iloc, handling missing values, performing groupby operations, merging/joining datasets, and conducting basic data analysis. By the end of this tutorial, you'll have a solid understanding of how to use Pandas for your data manipulation needs.

Core Content

1. Series and DataFrame Creation

A Series is a one-dimensional array-like object containing a sequence of values and an associated array of data labels, called its index. A DataFrame is a two-dimensional labeled data structure with columns of potentially different types.

Creating a Series

Python
import numpy as np
import pandas as pd

# Create a Series from a list; np.nan marks a missing value
s = pd.Series([1, 3, 5, np.nan, 6, 8])
print(s)
Output
0    1.0
1    3.0
2    5.0
3    NaN
4    6.0
5    8.0
dtype: float64
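
The index does not have to be the default 0, 1, 2, and so on. As a minimal sketch, the Series below uses string labels; the labels, values, and the name `temp_c` are made up for illustration.

```python
import pandas as pd

# Hypothetical labels and values, chosen only for illustration
temps = pd.Series([21.5, 19.0, 23.2], index=["mon", "tue", "wed"], name="temp_c")

# Look up a value by its label rather than by its position
tue_temp = temps["tue"]

# The index is available as its own attribute
labels = list(temps.index)

print(tue_temp)
print(labels)
```

Label-based lookup is what distinguishes a Series from a plain list or NumPy array.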

Creating a DataFrame

Python
# Create a DataFrame from a dictionary
data = {
    'Name': ['John', 'Anna', 'James'],
    'Age': [28, 24, 35],
    'City': ['New York', 'Paris', 'London']
}
df = pd.DataFrame(data)
print(df)
Output
    Name  Age      City
0   John   28  New York
1   Anna   24     Paris
2  James   35    London
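
A DataFrame can also be built from a list of row dictionaries, which is often the shape data arrives in from an API or a JSON file. The sketch below reuses the same three people; the variable names `records` and `people` are only illustrative.

```python
import pandas as pd

# Each dictionary is one row; the keys become column names
records = [
    {"Name": "John", "Age": 28, "City": "New York"},
    {"Name": "Anna", "Age": 24, "City": "Paris"},
    {"Name": "James", "Age": 35, "City": "London"},
]
people = pd.DataFrame(records)

print(people.shape)   # (number of rows, number of columns)
print(people.dtypes)  # one dtype per column
```

Checking `shape` and `dtypes` right after construction is a quick way to confirm that each column was inferred as the type you expect.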

2. Reading CSV/Excel Files

Pandas makes it easy to read data from various file formats, including CSV and Excel.

Reading a CSV File

Python
# Read a CSV file into a DataFrame ('data.csv' is a placeholder path)
df = pd.read_csv('data.csv')
print(df.head())  # head() shows the first five rows
Output
  Column1 Column2
0       A       B
1       C       D
2       E       F
3       G       H
4       I       J
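
To try `read_csv` without a file on disk, you can pass it any file-like object. The sketch below wraps a made-up CSV string in `io.StringIO`; the contents of `csv_text` are an assumption for illustration only.

```python
import io

import pandas as pd

# Stand-in for a file on disk; the contents are made up for illustration
csv_text = """Column1,Column2
A,B
C,D
E,F
"""

# read_csv accepts file-like objects as well as paths
df_csv = pd.read_csv(io.StringIO(csv_text))
print(df_csv.head())

# usecols limits which columns are loaded
first_col = pd.read_csv(io.StringIO(csv_text), usecols=["Column1"])
print(first_col)
```

Loading only the columns you need with `usecols` can reduce memory use on wide files.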

Reading an Excel File

Python
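# Read an Excel file into a DataFrame ('data.xlsx' is a placeholder path)
# Reading .xlsx files requires an Excel engine such as openpyxl to be installed
df = pd.read_excel('data.xlsx')
print(df.head())
Output
  Column1 Column2
0       A       B
1       C       D
2       E       F
3       G       H
4       I       J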

3. Selecting/Filtering Data (loc, iloc)

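Pandas provides two primary indexers: loc selects by label and iloc selects by integer position. Their slices behave differently: a loc slice includes its end label, while an iloc slice excludes its end position, as in standard Python slicing. The examples in this section use the Name/Age/City DataFrame created in section 1.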

Using loc

Python
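# Select rows by index label and columns by name
# (df is the Name/Age/City DataFrame from section 1)
# A loc slice includes the end label, so 0:2 returns labels 0, 1, and 2
filtered_df = df.loc[0:2, ['Name', 'Age']]
print(filtered_df)
Output
    Name  Age
0   John   28
1   Anna   24
2  James   35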

Using iloc

Python
# Select rows by position and columns by position
# An iloc slice excludes the end position, so 0:3 returns positions 0, 1, and 2
filtered_df = df.iloc[0:3, [0, 1]]
print(filtered_df)
Output
    Name  Age
0   John   28
1   Anna   24
2  James   35

4. Handling Missing Values

Missing data is common in real datasets. Pandas marks missing entries as NaN (or NaT for datetimes) and provides methods to detect, drop, or fill them.

Checking for Missing Values

Python
# Count the missing values in each column of the DataFrame
print(df.isnull().sum())
Output
Name    0
Age     0
City    0
dtype: int64

Filling Missing Values

Python
# Fill missing values with a specific value
# (this DataFrame has no missing values, so the result is unchanged)
df_filled = df.fillna(value=0)
print(df_filled)
Output
    Name  Age      City
0   John   28  New York
1   Anna   24     Paris
2  James   35    London
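
Because the example DataFrame has no gaps, the calls above leave it unchanged. The sketch below uses a small made-up `scores` DataFrame with one missing entry to show detection, dropping, and filling with the column mean instead of a constant.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one gap, since the example DataFrame has none
scores = pd.DataFrame({
    "Name": ["John", "Anna", "James"],
    "Score": [80.0, np.nan, 90.0],
})

# Count missing entries per column
missing_per_column = scores.isnull().sum()

# Option 1: drop any row that contains a missing value
dropped = scores.dropna()

# Option 2: fill the gap with the column mean (mean() skips NaN by default)
filled = scores.fillna({"Score": scores["Score"].mean()})

print(missing_per_column)
print(dropped)
print(filled)
```

Whether to drop or fill depends on the analysis: dropping loses rows, while filling introduces values that were never observed.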

5. Groupby Operations

The groupby() method splits the rows into groups based on the values in one or more columns, applies an aggregation to each group, and combines the results.

Python
# Group by the 'City' column and calculate the mean age
grouped = df.groupby('City')['Age'].mean()
print(grouped)
Output
City
London      35.0
New York    28.0
Paris       24.0
Name: Age, dtype: float64
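
With one row per city, the grouping above has little to aggregate. As a sketch on made-up data, the example below groups several rows per region and computes more than one statistic at once using named aggregation; the `sales` DataFrame and the output column names are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sales figures for illustration
sales = pd.DataFrame({
    "Region": ["East", "East", "West", "West", "West"],
    "Sales": [100, 150, 200, 50, 250],
})

# Named aggregation: each keyword becomes a column in the result
summary = sales.groupby("Region")["Sales"].agg(
    total="sum",
    average="mean",
    orders="count",
)
print(summary)
```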

6. Merging/Joining Datasets

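pd.merge() combines two DataFrames by matching rows on one or more key columns, much like a SQL join. By default it performs an inner join, which keeps only the keys present in both inputs.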

Merging DataFrames

Python
# Create two DataFrames
df1 = pd.DataFrame({'Key': ['A', 'B', 'C'], 'Value1': [1, 2, 3]})
df2 = pd.DataFrame({'Key': ['B', 'C', 'D'], 'Value2': [4, 5, 6]})

# Merge the DataFrames on the 'Key' column
# (the default inner join drops 'A' and 'D', which appear on only one side)
merged_df = pd.merge(df1, df2, on='Key')
print(merged_df)
Output
  Key  Value1  Value2
0   B       2       4
1   C       3       5
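
The `how` parameter controls which keys survive the merge. The sketch below reuses the same two DataFrames to show a left join and an outer join; `indicator=True` adds a `_merge` column recording where each row came from.

```python
import pandas as pd

df1 = pd.DataFrame({"Key": ["A", "B", "C"], "Value1": [1, 2, 3]})
df2 = pd.DataFrame({"Key": ["B", "C", "D"], "Value2": [4, 5, 6]})

# Left join: keep every row of df1; unmatched rows get NaN in Value2
left_joined = pd.merge(df1, df2, on="Key", how="left")

# Outer join: keep keys from both sides; _merge shows each row's origin
outer_joined = pd.merge(df1, df2, on="Key", how="outer", indicator=True)

print(left_joined)
print(outer_joined)
```

Inspecting the `_merge` column is a convenient way to find keys that failed to match before deciding which join type to use.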

7. Basic Data Analysis

Pandas provides a variety of methods for basic data analysis.

Descriptive Statistics

Python
# Get descriptive statistics for the numeric columns of the DataFrame
print(df.describe())
Output
             Age
count   3.000000
mean   29.000000
std     5.567764
min    24.000000
25%    26.000000
50%    28.000000
75%    31.500000
max    35.000000
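
Each row of describe() corresponds to a method you can call on its own. As a minimal sketch, the example below computes a few of them directly on the three ages used throughout this tutorial.

```python
import pandas as pd

ages = pd.Series([28, 24, 35], name="Age")

mean_age = ages.mean()            # arithmetic mean
std_age = ages.std()              # sample standard deviation (ddof=1)
median_age = ages.median()        # the 50% row of describe()
upper_quartile = ages.quantile(0.75)  # the 75% row of describe()

print(mean_age, std_age, median_age, upper_quartile)
```

Note that std() uses the sample formula (dividing by n - 1) by default, which differs from NumPy's default of dividing by n.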

Practical Example

Let's create a complete example that demonstrates reading a CSV file, filtering data, handling missing values, performing a groupby operation, and conducting basic analysis.

Python
import pandas as pd

# Read the dataset
df = pd.read_csv('sales_data.csv')

# Filter data for a specific year
filtered_df = df.loc[df['Year'] == 2020]

# Handle missing values by filling them with 0
cleaned_df = filtered_df.fillna(value=0)

# Group by 'Region' and calculate total sales
grouped_sales = cleaned_df.groupby('Region')['Sales'].sum()

# Print the results
print(grouped_sales)
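
The example above depends on a sales_data.csv file that is not provided. The sketch below runs the same pipeline on an in-memory stand-in; the rows in `csv_text` are made up, and only the column names Year, Region, and Sales are taken from the example.

```python
import io

import pandas as pd

# Made-up stand-in for sales_data.csv; column names follow the example above
csv_text = """Year,Region,Sales
2020,East,100
2020,West,
2020,East,50
2019,West,300
"""

df = pd.read_csv(io.StringIO(csv_text))

# Keep only 2020; .copy() makes the result independent of df
sales_2020 = df.loc[df["Year"] == 2020].copy()

# Treat a missing sales figure as 0
sales_2020["Sales"] = sales_2020["Sales"].fillna(0)

# Total sales per region
totals = sales_2020.groupby("Region")["Sales"].sum()
print(totals)
```

Filling with 0 is reasonable for a sum, but it would distort a mean, so the right fill value depends on the aggregation that follows.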

Summary

Concept                   Description
Series                    One-dimensional array-like object with labels.
DataFrame                 Two-dimensional labeled data structure with columns of potentially different types.
Reading Files             Use pd.read_csv() for CSV and pd.read_excel() for Excel files.
Selecting/Filtering       Use loc for label-based indexing and iloc for position-based indexing.
Handling Missing Values   Use isnull(), fillna(), etc., to manage missing data.
Groupby                   Aggregate data using the groupby() method.
Merging/Joining           Combine datasets using pd.merge() or join().
Basic Data Analysis       Use methods like describe() for summary statistics.

What's Next?

Now that you have a solid understanding of Pandas, the next step is to explore more advanced topics such as time series analysis, pivot tables, and more complex data manipulation techniques. You can continue your learning with the "SciPy Tutorial," where we'll dive into scientific computing in Python.

Stay tuned for more tutorials and happy coding!

