CSC4020: Algorithms and Data Structures

CSC4020 is the course where computer science students learn to think about efficiency. Choosing the right data structure and algorithm can mean the difference between a program that processes millions of records in seconds and one that takes hours to handle a few thousand. This course develops the analytical mindset required to evaluate algorithmic trade-offs and select the best approach for each computational problem.

Sorting algorithm comparison

Algorithm	Best Case	Average Case	Worst Case	Space
Merge sort	O(n log n)	O(n log n)	O(n log n)	O(n) auxiliary
Quicksort	O(n log n)	O(n log n)	O(n²)	O(log n) stack (average)
Heapsort	O(n log n)	O(n log n)	O(n log n)	O(1) in-place
Insertion sort	O(n)	O(n²)	O(n²)	O(1) in-place
Radix sort	O(nk)	O(nk)	O(nk)	O(n + b) auxiliary (k = digits per key, b = radix)
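The gap between the best-case and worst-case columns can be checked by counting key comparisons directly. The following is a minimal sketch, not a reference implementation: the function names and the input size of 1,000 are illustrative choices, and both functions return a sorted copy together with the number of comparisons they made.

```python
def insertion_sort(items):
    """Return (sorted copy, number of key comparisons)."""
    a = list(items)
    comparisons = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1
            if a[j] <= key:
                break          # key is already in place relative to a[j]
            a[j + 1] = a[j]    # shift the larger element one slot right
            j -= 1
        a[j + 1] = key
    return a, comparisons


def merge_sort(items):
    """Return (sorted copy, number of key comparisons)."""
    if len(items) <= 1:
        return list(items), 0
    mid = len(items) // 2
    left, left_cmp = merge_sort(items[:mid])
    right, right_cmp = merge_sort(items[mid:])
    merged = []
    comparisons = left_cmp + right_cmp
    i = j = 0
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])    # at most one of these two is non-empty
    merged.extend(right[j:])
    return merged, comparisons


n = 1000
sorted_input = list(range(n))
reversed_input = list(range(n, 0, -1))

_, best = insertion_sort(sorted_input)        # n - 1 comparisons
_, worst = insertion_sort(reversed_input)     # n(n - 1) / 2 comparisons
merged_output, merge_cmp = merge_sort(reversed_input)
```

On already-sorted input, insertion sort makes one comparison per element, which is the O(n) best case in the table. On reversed input it makes n(n - 1)/2 comparisons, while merge sort stays within n log n on the same input.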

What CSC4020 covers

The course opens with fundamental data structures: arrays, linked lists, stacks, and queues. Each structure embodies a different set of trade-offs. Arrays provide O(1) random access but O(n) insertion in the middle. Linked lists offer O(1) insertion at known positions but O(n) access to arbitrary elements. Stacks enforce last-in-first-out (LIFO) ordering, which makes them essential for function call management, expression evaluation, and backtracking algorithms. Queues enforce first-in-first-out (FIFO) ordering, making them the natural choice for breadth-first search, task scheduling, and buffer management. Hash tables provide average-case O(1) lookup, insertion, and deletion by mapping keys to array indices through hash functions, but students must understand collision resolution strategies (chaining vs. open addressing), load factors, and the conditions under which hash table performance degrades to O(n). Trees, particularly binary search trees (BSTs), balanced trees (AVL trees, red-black trees), and B-trees, provide O(log n) search, insertion, and deletion while maintaining sorted order. Cormen et al. (2022) demonstrate that the choice between these variants depends on the application: AVL trees guarantee strict balance for read-heavy workloads, red-black trees accept slightly less balance for faster insertions, and B-trees minimize disk I/O for database indexing.
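To make the hash table trade-offs concrete, here is a small sketch of separate chaining with load-factor-driven rehashing. The class name `ChainedHashTable`, the initial capacity of 8, and the 0.75 load-factor threshold are assumptions chosen for illustration; real libraries tune these differently.

```python
class ChainedHashTable:
    """Hash table using separate chaining (a list of (key, value) pairs per bucket)."""

    def __init__(self, capacity=8, max_load=0.75):
        self._buckets = [[] for _ in range(capacity)]
        self._size = 0
        self._max_load = max_load

    def _bucket(self, key):
        # Map the key to a bucket through Python's built-in hash function.
        return self._buckets[hash(key) % len(self._buckets)]

    def load_factor(self):
        return self._size / len(self._buckets)

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)   # overwrite an existing key
                return
        bucket.append((key, value))
        self._size += 1
        if self.load_factor() > self._max_load:
            self._resize(2 * len(self._buckets))

    def get(self, key, default=None):
        # Cost is proportional to the chain length: O(1) on average,
        # O(n) if every key lands in the same bucket.
        for existing_key, value in self._bucket(key):
            if existing_key == key:
                return value
        return default

    def _resize(self, new_capacity):
        pairs = [pair for bucket in self._buckets for pair in bucket]
        self._buckets = [[] for _ in range(new_capacity)]
        for key, value in pairs:
            self._bucket(key).append((key, value))

    def __len__(self):
        return self._size


table = ChainedHashTable()
for i in range(100):
    table.put(f"key{i}", i)
table.put("key42", "updated")
```

Doubling the bucket array whenever the load factor passes the threshold keeps chains short on average, which is what preserves the average-case O(1) behavior described above.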

Algorithm design strategies form the second major pillar of CSC4020. Divide-and-conquer algorithms (merge sort, quicksort, binary search, Strassen's matrix multiplication) break problems into smaller subproblems, solve each recursively, and combine results. The Master Theorem provides a formula for determining the time complexity of divide-and-conquer recurrences without unrolling them manually. Greedy algorithms (Dijkstra's shortest path, Kruskal's minimum spanning tree, Huffman coding) make locally optimal choices at each step and work correctly when the problem exhibits the greedy-choice property and optimal substructure. Dynamic programming (Bellman-Ford shortest path, longest common subsequence, knapsack problem, edit distance) solves problems with overlapping subproblems by storing intermediate results in a table (memoization or bottom-up tabulation) rather than recomputing them. Recognizing whether a problem has overlapping subproblems and optimal substructure is the key skill that separates students who can solve dynamic programming problems from those who cannot. Graph algorithms tie these strategies together: breadth-first search (BFS) and depth-first search (DFS) are the fundamental traversal methods, and from them flow topological sorting, cycle detection, connected components, shortest-path algorithms (Dijkstra, Bellman-Ford, Floyd-Warshall), and minimum spanning tree algorithms (Kruskal, Prim) (Kleinberg & Tardos, 2006).
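As one illustration of bottom-up tabulation, the sketch below computes the length of the longest common subsequence of two strings. The function name `lcs_length` and the table layout are illustrative assumptions; the recurrence itself is the standard one.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    # table[i][j] holds the LCS length of the prefixes a[:i] and b[:j].
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Matching characters extend the LCS of the shorter prefixes.
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                # Otherwise drop one character from either string.
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[len(a)][len(b)]


example = lcs_length("ABCBDAB", "BDCABA")
```

Each table cell is filled once from at most three neighbors, so the running time is O(mn) for strings of length m and n, instead of the exponential time a naive recursion would take on the same overlapping subproblems.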

Key topics in CSC4020

Linear data structures: arrays, singly and doubly linked lists, stacks (LIFO), queues (FIFO), deques, circular buffers
Hash-based structures: hash tables, hash functions, collision resolution (separate chaining, linear probing, quadratic probing, double hashing), load factor and rehashing
Tree structures: binary search trees, AVL trees, red-black trees, B-trees, heaps (min-heap, max-heap), tries (prefix trees)
Graph representations and algorithms: adjacency lists vs. adjacency matrices, BFS, DFS, topological sort, Dijkstra, Bellman-Ford, Floyd-Warshall, Kruskal, Prim
Sorting algorithms: comparison-based (merge sort, quicksort, heapsort, insertion sort) and non-comparison (counting sort, radix sort, bucket sort)
Algorithm design paradigms: brute force, divide-and-conquer, greedy, dynamic programming, backtracking
Complexity analysis: Big-O, Big-Omega, Big-Theta notation; best, average, and worst-case analysis; amortized analysis; the Master Theorem for recurrence relations
Recursion and memoization: recursive problem decomposition, tail recursion, memoization vs. tabulation in dynamic programming
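For reference, the Master Theorem named in the complexity analysis topic can be stated as follows. This is the common textbook form for recurrences with constants a ≥ 1 and b > 1; the two worked recurrences at the end (merge sort and binary search) are added here as illustrations.

```latex
\begin{align*}
T(n) &= a\,T(n/b) + f(n), \qquad a \ge 1,\ b > 1 \\[4pt]
\text{Case 1: } & f(n) = O\!\left(n^{\log_b a - \varepsilon}\right) \text{ for some } \varepsilon > 0
  &&\Rightarrow\ T(n) = \Theta\!\left(n^{\log_b a}\right) \\
\text{Case 2: } & f(n) = \Theta\!\left(n^{\log_b a}\right)
  &&\Rightarrow\ T(n) = \Theta\!\left(n^{\log_b a} \log n\right) \\
\text{Case 3: } & f(n) = \Omega\!\left(n^{\log_b a + \varepsilon}\right) \text{ and } a\,f(n/b) \le c\,f(n) \text{ for some } c < 1
  &&\Rightarrow\ T(n) = \Theta\!\left(f(n)\right) \\[4pt]
\text{Merge sort: } & T(n) = 2\,T(n/2) + \Theta(n),\ n^{\log_2 2} = n
  &&\Rightarrow\ \text{Case 2: } T(n) = \Theta(n \log n) \\
\text{Binary search: } & T(n) = T(n/2) + \Theta(1),\ n^{\log_2 1} = 1
  &&\Rightarrow\ \text{Case 2: } T(n) = \Theta(\log n)
\end{align*}
```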

Essential complexity classes every CS student must know

O(1) constant: hash table lookup (average), array index access, stack push/pop
O(log n) logarithmic: binary search on a sorted array, balanced BST operations, binary heap insertion and extraction
O(n) linear: linear search, traversing a linked list, counting elements, single-pass array processing
O(n log n) linearithmic: merge sort, heapsort, quicksort (average); also the lower bound for any comparison-based sort
O(n²) quadratic: insertion sort (worst), selection sort, bubble sort, nested loop comparisons of all pairs
O(2ⁿ) exponential: brute-force subset enumeration, recursive Fibonacci without memoization, backtracking without pruning
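The jump from exponential to linear work can be seen by counting calls in the Fibonacci example from the list above. This is a small sketch: the `counter` list is a hypothetical helper used only to tally calls, and the memoized version relies on the standard library's `functools.lru_cache`.

```python
from functools import lru_cache


def fib_naive(n, counter):
    """Plain recursion; counter[0] tallies every call made."""
    counter[0] += 1
    if n < 2:
        return n
    return fib_naive(n - 1, counter) + fib_naive(n - 2, counter)


@lru_cache(maxsize=None)
def fib_memo(n):
    """Same recurrence, but each distinct n is computed only once."""
    if n < 2:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)


naive_counter = [0]
naive_result = fib_naive(20, naive_counter)
memo_result = fib_memo(20)
memo_misses = fib_memo.cache_info().misses   # number of times the body actually ran
```

For n = 20 the naive version makes 21,891 calls, while the memoized version runs its body only 21 times, once for each value from 0 to 20.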

Frequently asked questions

What is Big-O notation and why is it important?

Big-O notation describes an asymptotic upper bound on an algorithm's time or space requirements as the input size grows. It captures the growth rate of resource consumption while ignoring constant factors and lower-order terms. For example, an algorithm that performs 3n² + 5n + 7 operations is O(n²) because the quadratic term dominates as n grows large. Big-O matters because it lets developers predict how an algorithm will scale. An O(n log n) sorting algorithm like merge sort can sort 10 million elements in seconds, while an O(n²) algorithm like insertion sort needs tens of trillions of steps in its worst case and would take hours on the same input. In CSC4020, students must analyze the complexity of the algorithms they implement and justify their design choices by showing that the chosen approach performs acceptably for the expected input size. Big-O is also essential for technical interviews and for making informed decisions about which library functions or data structures to use in production code.
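The claim that 3n² + 5n + 7 is O(n²) can be verified against the formal definition by exhibiting constants c and n₀. The choice c = 15 and n₀ = 1 below is one convenient pair; any larger constants work as well.

```latex
\begin{align*}
f(n) = O(g(n)) \iff\ & \exists\, c > 0,\ n_0 \ge 1 \text{ such that } f(n) \le c\, g(n) \text{ for all } n \ge n_0 \\[4pt]
3n^2 + 5n + 7 \ &\le\ 3n^2 + 5n^2 + 7n^2 \qquad (\text{since } n \le n^2 \text{ and } 1 \le n^2 \text{ for } n \ge 1) \\
&=\ 15n^2 \\[4pt]
\text{so } 3n^2 + 5n + 7 &= O(n^2) \text{ with } c = 15,\ n_0 = 1 .
\end{align*}
```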

When should you use dynamic programming instead of a greedy algorithm?

The choice depends on the problem's structure. Greedy algorithms work when the problem has the greedy-choice property: making the locally optimal choice at each step leads to a globally optimal solution. Classic greedy problems include activity selection (always pick the activity that finishes earliest), Huffman coding, and Dijkstra's shortest path (with non-negative edge weights). Dynamic programming is needed when a problem has overlapping subproblems (the same smaller problems are solved repeatedly) and optimal substructure (an optimal solution contains optimal solutions to subproblems), but the greedy-choice property does not hold. The 0/1 knapsack problem is the classic example: a greedy approach (always take the item with the best value-to-weight ratio) does not guarantee the optimal solution because you cannot take fractions of items. Dynamic programming instead solves each subproblem (a set of items considered so far and a remaining capacity) exactly once and builds the optimal answer from those stored results. The practical test is: if you can prove the greedy-choice property holds, use greedy (simpler, faster). If you cannot, and the problem has overlapping subproblems, use dynamic programming.
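The 0/1 knapsack contrast can be reproduced on a three-item instance. The sketch below is illustrative only: the item values, weights, and capacity are made-up numbers, items are (value, weight) pairs, and the dynamic programming version assumes positive integer weights.

```python
def greedy_knapsack(items, capacity):
    """Take whole items in decreasing value-to-weight order while they fit."""
    total_value = 0
    remaining = capacity
    by_ratio = sorted(items, key=lambda item: item[0] / item[1], reverse=True)
    for value, weight in by_ratio:
        if weight <= remaining:
            total_value += value
            remaining -= weight
    return total_value


def dp_knapsack(items, capacity):
    """Bottom-up 0/1 knapsack; assumes positive integer weights."""
    # best[c] = best value achievable with capacity c using the items seen so far.
    best = [0] * (capacity + 1)
    for value, weight in items:
        # Iterate capacities downward so each item is used at most once.
        for c in range(capacity, weight - 1, -1):
            best[c] = max(best[c], best[c - weight] + value)
    return best[capacity]


items = [(60, 10), (100, 20), (120, 30)]   # (value, weight), hypothetical data
capacity = 50

greedy_value = greedy_knapsack(items, capacity)
optimal_value = dp_knapsack(items, capacity)
```

Greedy takes the two items with the best ratios for a total value of 160 and then cannot fit the third. The table-based version finds the better combination of the second and third items, worth 220.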

What is the difference between a hash table and a balanced binary search tree?

Both support efficient lookup, insertion, and deletion, but they make different guarantees. A hash table provides average-case O(1) operations by computing a hash function to map keys directly to array positions. However, worst-case performance degrades to O(n) if many keys hash to the same position (hash collisions), and hash tables do not maintain any ordering among keys. A balanced BST (AVL tree or red-black tree) provides guaranteed O(log n) operations for lookup, insertion, and deletion regardless of the input distribution, and it maintains keys in sorted order. This means a BST can efficiently answer range queries ("find all keys between 10 and 50"), find the minimum or maximum, and iterate elements in sorted order. Hash tables cannot do any of these efficiently. Use a hash table when you need the fastest possible lookup and insertion and do not need sorted access. Use a balanced BST when you need guaranteed worst-case performance or ordered traversal. In practice, many standard libraries offer both: Java provides HashMap (hash table) and TreeMap (red-black tree); Python's dict is a hash table, while the sortedcontainers library provides a sorted equivalent.
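The range-query difference can be sketched with the standard library alone. Note the assumption: a sorted Python list searched with `bisect` stands in here for a balanced BST, since both keep keys ordered. Unlike a real balanced tree, inserting into a sorted list costs O(n), so this only illustrates the query side.

```python
from bisect import bisect_left, bisect_right

keys = [3, 8, 10, 17, 23, 42, 50, 61, 75]        # kept in sorted order
table = {key: f"value-{key}" for key in keys}    # hash table over the same keys


def range_query_sorted(sorted_keys, low, high):
    """Keys in [low, high] via two binary searches: O(log n + number of matches)."""
    start = bisect_left(sorted_keys, low)
    end = bisect_right(sorted_keys, high)
    return sorted_keys[start:end]


def range_query_hash(mapping, low, high):
    """A hash table has no key order, so every key must be inspected: O(n)."""
    return sorted(key for key in mapping if low <= key <= high)


ordered_result = range_query_sorted(keys, 10, 50)
hashed_result = range_query_hash(table, 10, 50)
```

Both calls return the same keys, but the ordered structure locates the range boundaries in logarithmic time, while the hash table must scan everything and sort the matches afterwards.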

How does graph traversal work and when do you use BFS versus DFS?

Graph traversal systematically visits every vertex reachable from a starting vertex. Breadth-first search (BFS) uses a queue to explore all neighbors of the current vertex before moving to the next level. It guarantees finding the shortest path (in terms of edge count) in unweighted graphs and is the foundation for level-order traversal and finding connected components. Depth-first search (DFS) uses a stack (or recursion) to explore as far down one path as possible before backtracking. DFS is the foundation for topological sorting (ordering tasks with dependencies), cycle detection, finding strongly connected components (Tarjan's algorithm), and solving maze and puzzle problems through backtracking. The practical rule: use BFS when you need the shortest path or need to explore level by level (social network friend-distance, web crawling with depth limits). Use DFS when you need to explore all paths, detect cycles, or perform topological ordering. Both run in O(V + E) time, where V is the number of vertices and E is the number of edges, but they differ in space usage: BFS may hold an entire level of the graph in its queue (up to O(V)), while DFS holds only the current path (also O(V) in the worst case, such as a long chain, but often much less).
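Both traversals can be sketched on a small directed graph stored as an adjacency list. The graph below is made-up example data, the function names are illustrative, and the code assumes every vertex appears as a key in the dictionary.

```python
from collections import deque

graph = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
    "D": ["F"],
    "E": ["F"],
    "F": [],
}


def bfs_distances(graph, start):
    """Edge-count distance from start to every reachable vertex."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in distance:          # first visit is along a shortest path
                distance[v] = distance[u] + 1
                queue.append(v)
    return distance


def topological_order(graph):
    """DFS-based topological sort; raises ValueError if a cycle is found."""
    UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
    state = {u: UNVISITED for u in graph}
    finished = []

    def visit(u):
        state[u] = IN_PROGRESS
        for v in graph[u]:
            if state[v] == IN_PROGRESS:    # back edge to the current path
                raise ValueError("cycle detected")
            if state[v] == UNVISITED:
                visit(v)
        state[u] = DONE
        finished.append(u)

    for u in graph:
        if state[u] == UNVISITED:
            visit(u)
    return finished[::-1]                  # reverse finishing order


distances = bfs_distances(graph, "A")
order = topological_order(graph)
```

The recursive `visit` keeps the sketch short. For very deep graphs an explicit stack would avoid hitting Python's recursion limit.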