• @makenowjust

Ph.D. student at SOKENDAI (NII, National Institute of Informatics). Researcher in information security and formal languages. Ruby committer. I am the author of the Regexp optimization that prevents ReDoS, introduced in Ruby 3.2.0.

Make Your Own Regex Engine!

Regular expressions (regex) are a critical feature in Ruby. However, developers often say, "Regex is complicated and causes bugs," and so they avoid using it. This seems to stem from a misunderstanding of how regex matching works. The behavior of regex matching is actually simple, and a small regex engine (kantan-regex) can be implemented in fewer than 300 lines of code.

In this talk, I will explain how regex matching behaves by walking through the implementation of kantan-regex. I will also show that, with a few modifications, extensions such as look-around and optimizations such as memoization are easy to implement. I believe this talk will help deepen our understanding of regex implementations and make using regex in everyday life more enjoyable.
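To give a feel for how small a backtracking matcher can be, here is a minimal sketch. It is not the kantan-regex code: the `TinyRegex` module, the array-based node shapes (`[:lit, "a"]`, `[:cat, x, y]`, `[:alt, x, y]`, `[:star, x]`, `[:lookahead, x]`) and the method names are all assumptions made up for this illustration. There is no parser, so patterns are written directly as trees, and memoization is left out.

```ruby
# Hypothetical mini-matcher for illustration only (not kantan-regex).
module TinyRegex
  module_function

  # Try to match `node` against `str` starting at `pos`. On success, call the
  # continuation `k` with the position after the match. A falsy result from
  # `k` makes the matcher backtrack and try the next alternative.
  def m(node, str, pos, &k)
    case node[0]
    when :empty
      k.call(pos)
    when :lit
      # str[pos] is nil at the end of the string, so this fails safely.
      str[pos] == node[1] && k.call(pos + 1)
    when :cat
      m(node[1], str, pos) { |p| m(node[2], str, p, &k) }
    when :alt
      m(node[1], str, pos, &k) || m(node[2], str, pos, &k)
    when :star
      # Greedy: try one more iteration first, then fall back to zero.
      # The `p > pos` check stops empty iterations from looping forever.
      m(node[1], str, pos) { |p| p > pos && m(node, str, p, &k) } || k.call(pos)
    when :lookahead
      # Positive look-ahead: the body must match, but no input is consumed.
      m(node[1], str, pos) { |_| true } && k.call(pos)
    end
  end

  # True when the whole string matches the pattern tree.
  def full_match?(node, str)
    m(node, str, 0) { |p| p == str.size } ? true : false
  end
end

# (a|b)*c written as a tree
pattern = [:cat, [:star, [:alt, [:lit, "a"], [:lit, "b"]]], [:lit, "c"]]
puts TinyRegex.full_match?(pattern, "abac")
puts TinyRegex.full_match?(pattern, "abca")
```

In this sketch, look-ahead costs one extra `when` branch, which is the kind of small modification the talk refers to. The continuation-passing style is just one convenient way to express backtracking in Ruby; a real engine may well be organized differently.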

A tutorial on implementing kantan-regex is available on the web (Japanese only).