Data Structures

Data structures are containers that organize and store data in specific formats. Python provides several built-in data structures that are essential for efficient programming. In this lesson, we'll explore the most commonly used data structures in Python.

Lists

Lists are ordered, mutable (changeable) collections of items. They can contain items of different types.

Creating Lists

# Empty list
empty_list = []

# List with items
fruits = ["apple", "banana", "cherry"]

# List with mixed types
mixed = [1, "hello", 3.14, True]

# List using the list() constructor
numbers = list(range(1, 6))  # [1, 2, 3, 4, 5]

Accessing List Items

fruits = ["apple", "banana", "cherry"]

# Indexing (0-based)
print(fruits[0])  # "apple"
print(fruits[1])  # "banana"
print(fruits[2])  # "cherry"

# Negative indexing (counts from the end)
print(fruits[-1])  # "cherry"
print(fruits[-2])  # "banana"

# Slicing
print(fruits[0:2])  # ["apple", "banana"]
print(fruits[:2])   # ["apple", "banana"]
print(fruits[1:])   # ["banana", "cherry"]

Modifying Lists

fruits = ["apple", "banana", "cherry"]

# Change an item
fruits[1] = "blueberry"
print(fruits)  # ["apple", "blueberry", "cherry"]

# Add items
fruits.append("orange")  # Add to the end
print(fruits)  # ["apple", "blueberry", "cherry", "orange"]

fruits.insert(1, "mango")  # Insert at specific position
print(fruits)  # ["apple", "mango", "blueberry", "cherry", "orange"]

# Remove items
fruits.remove("blueberry")  # Remove by value
print(fruits)  # ["apple", "mango", "cherry", "orange"]

popped = fruits.pop()  # Remove and return the last item
print(popped)  # "orange"
print(fruits)  # ["apple", "mango", "cherry"]

popped_index = fruits.pop(1)  # Remove and return item at index
print(popped_index)  # "mango"
print(fruits)  # ["apple", "cherry"]

# Clear the list
fruits.clear()
print(fruits)  # []

List Operations

# Concatenation
list1 = [1, 2, 3]
list2 = [4, 5, 6]
combined = list1 + list2
print(combined)  # [1, 2, 3, 4, 5, 6]

# Repetition
repeated = list1 * 3
print(repeated)  # [1, 2, 3, 1, 2, 3, 1, 2, 3]

# Length
print(len(list1))  # 3

# Membership
print(2 in list1)  # True
print(5 in list1)  # False

# Iteration
for item in list1:
    print(item)

# List comprehension
squares = [x**2 for x in range(1, 6)]
print(squares)  # [1, 4, 9, 16, 25]

# Sorting
numbers = [3, 1, 4, 1, 5, 9, 2]
numbers.sort()
print(numbers)  # [1, 1, 2, 3, 4, 5, 9]

# Reversing
numbers.reverse()
print(numbers)  # [9, 5, 4, 3, 2, 1, 1]

Tuples

Tuples are ordered, immutable (unchangeable) collections of items. They are similar to lists but cannot be modified after creation.

Creating Tuples

# Empty tuple
empty_tuple = ()

# Tuple with items
fruits = ("apple", "banana", "cherry")

# Tuple with a single item (note the comma)
single_item = ("apple",)

# Tuple with mixed types
mixed = (1, "hello", 3.14, True)

# Tuple using the tuple() constructor
numbers = tuple(range(1, 6))  # (1, 2, 3, 4, 5)

Accessing Tuple Items

fruits = ("apple", "banana", "cherry")

# Indexing (0-based)
print(fruits[0])  # "apple"
print(fruits[1])  # "banana"
print(fruits[2])  # "cherry"

# Negative indexing (counts from the end)
print(fruits[-1])  # "cherry"
print(fruits[-2])  # "banana"

# Slicing
print(fruits[0:2])  # ("apple", "banana")
print(fruits[:2])   # ("apple", "banana")
print(fruits[1:])   # ("banana", "cherry")

Tuple Operations

# Concatenation
tuple1 = (1, 2, 3)
tuple2 = (4, 5, 6)
combined = tuple1 + tuple2
print(combined)  # (1, 2, 3, 4, 5, 6)

# Repetition
repeated = tuple1 * 3
print(repeated)  # (1, 2, 3, 1, 2, 3, 1, 2, 3)

# Length
print(len(tuple1))  # 3

# Membership
print(2 in tuple1)  # True
print(5 in tuple1)  # False

# Iteration
for item in tuple1:
    print(item)

# Unpacking
x, y, z = tuple1
print(x, y, z)  # 1 2 3

Why Use Tuples?

  • Tuples are faster than lists
  • Tuples protect data from accidental modification
  • Tuples can be used as keys in dictionaries (lists cannot)
  • Tuples are used for multiple return values from functions

Dictionaries

Dictionaries are unordered, mutable collections of key-value pairs. They are optimized for retrieving data when you know the key.

Creating Dictionaries

# Empty dictionary
empty_dict = {}

# Dictionary with items
person = {
    "name": "Alice",
    "age": 30,
    "city": "New York"
}

# Dictionary using the dict() constructor
person2 = dict(name="Bob", age=25, city="London")

# Dictionary from a list of tuples
items = [("name", "Charlie"), ("age", 35), ("city", "Paris")]
person3 = dict(items)

Accessing Dictionary Items

person = {
    "name": "Alice",
    "age": 30,
    "city": "New York"
}

# Using keys
print(person["name"])  # "Alice"
print(person["age"])   # 30

# Using get() method (safer, returns None if key doesn't exist)
print(person.get("name"))  # "Alice"
print(person.get("email"))  # None
print(person.get("email", "Not found"))  # "Not found" (default value)

Modifying Dictionaries

person = {
    "name": "Alice",
    "age": 30,
    "city": "New York"
}

# Add or update items
person["email"] = "alice@example.com"
person["age"] = 31
print(person)  # {'name': 'Alice', 'age': 31, 'city': 'New York', 'email': 'alice@example.com'}

# Remove items
removed = person.pop("city")
print(removed)  # "New York"
print(person)  # {'name': 'Alice', 'age': 31, 'email': 'alice@example.com'}

# Remove and return the last inserted item
last_item = person.popitem()
print(last_item)  # ('email', 'alice@example.com')
print(person)  # {'name': 'Alice', 'age': 31}

# Clear the dictionary
person.clear()
print(person)  # {}

Dictionary Operations

person = {
    "name": "Alice",
    "age": 30,
    "city": "New York"
}

# Get all keys
keys = person.keys()
print(keys)  # dict_keys(['name', 'age', 'city'])

# Get all values
values = person.values()
print(values)  # dict_values(['Alice', 30, 'New York'])

# Get all key-value pairs as tuples
items = person.items()
print(items)  # dict_items([('name', 'Alice'), ('age', 30), ('city', 'New York')])

# Length
print(len(person))  # 3

# Membership (checks keys)
print("name" in person)  # True
print("email" in person)  # False

# Iteration (iterates over keys)
for key in person:
    print(key, person[key])

# Dictionary comprehension
squares = {x: x**2 for x in range(1, 6)}
print(squares)  # {1: 1, 2: 4, 3: 9, 4: 16, 5: 25}

Sets

Sets are unordered collections of unique items. They are useful for membership testing and eliminating duplicate entries.

Creating Sets

# Empty set (note: {} creates an empty dictionary, not a set)
empty_set = set()

# Set with items
fruits = {"apple", "banana", "cherry"}

# Set from a list (removes duplicates)
numbers = set([1, 2, 2, 3, 3, 3, 4, 5])
print(numbers)  # {1, 2, 3, 4, 5}

Set Operations

# Add and remove items
fruits = {"apple", "banana", "cherry"}
fruits.add("orange")
print(fruits)  # {'apple', 'banana', 'cherry', 'orange'}

fruits.remove("banana")  # Raises an error if the item doesn't exist
print(fruits)  # {'apple', 'cherry', 'orange'}

fruits.discard("mango")  # No error if the item doesn't exist
print(fruits)  # {'apple', 'cherry', 'orange'}

popped = fruits.pop()  # Remove and return an arbitrary item
print(popped)  # Could be any item from the set
print(fruits)  # Set with one item removed

# Clear the set
fruits.clear()
print(fruits)  # set()

Mathematical Set Operations

set1 = {1, 2, 3, 4, 5}
set2 = {4, 5, 6, 7, 8}

# Union (all items from both sets, no duplicates)
union = set1 | set2  # or set1.union(set2)
print(union)  # {1, 2, 3, 4, 5, 6, 7, 8}

# Intersection (items that are in both sets)
intersection = set1 & set2  # or set1.intersection(set2)
print(intersection)  # {4, 5}

# Difference (items in set1 but not in set2)
difference = set1 - set2  # or set1.difference(set2)
print(difference)  # {1, 2, 3}

# Symmetric difference (items in either set, but not in both)
sym_diff = set1 ^ set2  # or set1.symmetric_difference(set2)
print(sym_diff)  # {1, 2, 3, 6, 7, 8}

# Subset and superset
set3 = {1, 2}
print(set3.issubset(set1))  # True (all items in set3 are also in set1)
print(set1.issuperset(set3))  # True (set1 contains all items from set3)

Advanced Data Structures

Collections Module

The collections module provides specialized container datatypes:

from collections import Counter, defaultdict, namedtuple, deque

# Counter: counts occurrences of elements
words = ["apple", "banana", "apple", "cherry", "apple", "banana"]
word_counts = Counter(words)
print(word_counts)  # Counter({'apple': 3, 'banana': 2, 'cherry': 1})
print(word_counts["apple"])  # 3
print(word_counts.most_common(2))  # [('apple', 3), ('banana', 2)]

# defaultdict: dictionary with default values for missing keys
fruit_colors = defaultdict(list)
fruit_colors["apple"].append("red")
fruit_colors["apple"].append("green")
fruit_colors["banana"].append("yellow")
print(fruit_colors)  # defaultdict(, {'apple': ['red', 'green'], 'banana': ['yellow']})
print(fruit_colors["cherry"])  # [] (default value for missing key)

# namedtuple: tuple with named fields
Person = namedtuple("Person", ["name", "age", "city"])
alice = Person("Alice", 30, "New York")
print(alice.name)  # "Alice"
print(alice.age)   # 30
print(alice.city)  # "New York"

# deque: double-ended queue
queue = deque(["apple", "banana", "cherry"])
queue.append("orange")  # Add to the right
queue.appendleft("mango")  # Add to the left
print(queue)  # deque(['mango', 'apple', 'banana', 'cherry', 'orange'])
print(queue.pop())  # "orange" (remove from the right)
print(queue.popleft())  # "mango" (remove from the left)
print(queue)  # deque(['apple', 'banana', 'cherry'])

Choosing the Right Data Structure

Here's a quick guide to help you choose the right data structure:

  • List: Use when you need an ordered collection that might change
  • Tuple: Use when you need an ordered collection that won't change
  • Dictionary: Use when you need to associate values with keys for fast lookup
  • Set: Use when you need to store unique values and check membership
  • Counter: Use when you need to count occurrences of items
  • defaultdict: Use when you need a dictionary with default values
  • namedtuple: Use when you need a lightweight object type
  • deque: Use when you need to add/remove items from both ends efficiently

Try experimenting with different data structures in the code playground below!

Quick Quiz

Which of the following data structures is mutable (can be changed after creation)?

Code Playground

Code output will appear here...