Greedy Algorithms

This note is part of my learning notes on Data Structures & Algorithms.

Greedy Algorithms

A greedy algorithm is an algorithmic approach that builds up a solution piece by piece. It makes a locally optimal choice in the hope that this choice will lead to a globally optimal solution (hoping the next choice will be best for all further choices). Greedy algorithms are used to solve optimization problems, where the goal is to find the best solution from all feasible solutions.

Key Characteristics

Greedy Choice Property: A global optimum can be arrived at by selecting a local optimum.
Optimal Substructure: An optimal solution to the problem contains an optimal solution to subproblems.

Prove the Correctness of a Greedy Algorithm Using Mathematical Induction

Steps

Base Case: Show that the algorithm works for the smallest instance of the problem (usually when there is only one choice or a trivial solution).
Inductive Hypothesis: Assume that the algorithm works optimally for all instances of size $n$ or less.
Inductive Step: Show that if the algorithm works for instances of size $n$ , it also works for instances of size $n+1$ .

Example: Activity Selection Problem

Given a set of activities $S= {a_1, a_2, …, a_n }, each with a start time $s_i$ and a finish time $f_i$ , select the maximum number of non-overlapping activities.

Activities $a_i$ and $a_j$ are compatible if the intervals $[s_i,f_i)$ and $[s_j,f_j)$ do not overlap.

(): open intervals, do not include the number
[]: closed intervals, include the number

Assume the activities are sorted by their finish times $f_i$ :

f_1 \leq f_2 \leq ... \leq f_{n-1} \leq f(n)

Greedy Choice

Always select the activity that finishes first and then recursively select from the remaining activities that start after the current activity finishes.

Proof of Correctness Using Mathematical Induction

Base case
- If there is only one activity, the algorithm trivially selects it, which is obviously optimal. $A_1 = \{(s_1,f_1)\}$
- If there are no activities, the solution is also trivially optimal (selecting zero activities).
Inductive hypothesis
- Assume that for a set of $n$ activities, the greedy algorithm produces an optimal solution.
Inductive step
- For a set of $n+1$ activities $A = \{(s_1,f_1), (s_2,f_2),...,(s_{n+1},f_{n+1})\}$
- Select the activity $a_1 (s_1,f_1)$ with the earliest finish time
- Remove all activities that overlap with $a_1$ , resulting in a reduced problem with a set of $A'$ of non-overlapping activities.