A Common-Sense Guide to Data Structures and Algorithms – Part 9

This week we covered chapter 17 of A Common-Sense Guide to Data Structures and Algorithms, 2nd ed. by Jay Wengrow (link). I’ve been coding along in Python on my Github.

Our topic today was tries.

  • Wikipedia says trie can be pronounced like “tree” (the person who coined the term does this) because it’s like reTRIEval. Others say to pronounce it like “try” to distinguish it from “tree.” However, (1) “try” is a loaded software term, too (think “try-catch”), and (2) it’s a a tree structure.
  • None of us had studied this in computer science classes.
  • Jamie used a trie to solve an interview problem: Build an application where you type a word and the app helps to find that word in a performant manner (similar to the example covered in this chapter).
  • As Houston was reading through this, it reminded him of when he wrote an anagram solver.
  • I thought it was clever to add the asterisk as a delimiter to indicate when parts of words are also words themselves (e.g., cat*, catnip).
  • We thought about some other examples of where you’d use tries. Suffix trees and finite automata come to mind; you could use it to generate text based on learned sequences and frequencies. Dennis thought this could be used in a gaming skill tree to figure out what skills are needed to make it to a specific destination skill.
  • We talked about data structures that are built into languages (e.g., Ruby, Python, C#, Java). An LRU cache is built into Python. This could be a difference in problem spaces — e.g., parsing command-line args, file globbing. The more you have in the base framework, the less you have to rely on external packages.
  • Issues with the code in this chapter
    • In autocomplete the book calls collectAllWords(currentNode) with no prefix and no collection of existing words. But if collectAllWords has been called before, somehow the previous set of collected words persists, meaning you get too many values returned. Sending in an empty array fixed it: collectAllWords(currentNode, prefix, []).
    • The author included a traversal method, but I found it unhelpful in determining whether the trie was initialized correctly.
    • For Exercise 4, there’s a bug in the code provided for autocorrect that doesn’t meet the requirements of the exercise and also returns wrong results. See my solution.