A bit over a year ago, the Trojan Source attacks created quite a scare. This talk looks at what can and should be done for Ruby.
Ruby has embraced Unicode in the form of UTF-8 for source code, so that identifiers as well as comments can use non-ASCII characters. This is very convenient, but it can also be dangerous.
We will explain the dangers: Bidirectional attacks use special Unicode formatting characters to reorder how source text is displayed, so that the code appears to do one thing but actually does another. Homoglyph attacks use lookalike characters to confuse code reviewers. Invisible characters and special spaces can be even more difficult to detect.
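As a rough illustration of how such characters could be found, the following sketch scans Ruby source text for bidirectional formatting characters, invisible characters, and non-ASCII spaces. The method name `suspicious_characters` and the three character sets are assumptions made for this example, not a complete or authoritative list and not taken from the talk.

```ruby
# Hypothetical helper: report characters that are hard to see in a code review.
# The character sets below are illustrative and deliberately incomplete.

# Bidirectional formatting characters: embeddings and overrides (U+202A..U+202E),
# isolates (U+2066..U+2069), and the marks U+200E, U+200F, U+061C.
BIDI_CONTROLS = /[\u202A-\u202E\u2066-\u2069\u200E\u200F\u061C]/

# Zero-width characters, word joiner, and the byte order mark.
INVISIBLES = /[\u200B-\u200D\u2060\uFEFF]/

# Unicode space separators other than the ASCII space.
SPECIAL_SPACES = /[\u00A0\u1680\u2000-\u200A\u202F\u205F\u3000]/

def suspicious_characters(source)
  findings = []
  source.each_line.with_index(1) do |line, lineno|
    line.each_char.with_index(1) do |ch, column|
      kind = case ch
             when BIDI_CONTROLS  then :bidi_control
             when INVISIBLES     then :invisible
             when SPECIAL_SPACES then :special_space
             end
      next unless kind

      findings << { line: lineno, column: column,
                    codepoint: format('U+%04X', ch.ord), kind: kind }
    end
  end
  findings
end
```

Calling `suspicious_characters(File.read(path, encoding: 'UTF-8'))` returns one hash per finding, which a reviewer or a pre-commit hook could print. Note that some of these characters are legitimate in comments and string literals written in right-to-left scripts, so a real tool would need to allow for such uses rather than reject them outright.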
Remedies include better Ruby parsing, new checks in editors, IDEs, and code management sites such as GitHub, and stronger linters such as RuboCop.
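One check that a linter might perform against homoglyphs is to flag identifiers that mix writing systems. The sketch below is a standalone approximation under stated assumptions: `mixed_script_identifiers` is a hypothetical name, only three scripts are considered, and the scan is purely textual (it also looks inside strings and comments). It is not a RuboCop cop and does not use RuboCop's API.

```ruby
# Hypothetical check: find identifier-like words that mix scripts,
# e.g. a Latin name containing a Cyrillic lookalike letter.
# Only three scripts are checked here; a real tool would cover more.
SCRIPTS = {
  latin:    /\p{Latin}/,
  cyrillic: /\p{Cyrillic}/,
  greek:    /\p{Greek}/
}.freeze

# Rough approximation of an identifier: a letter or underscore
# followed by letters, digits, or underscores.
IDENTIFIER = /[\p{Alpha}_][\p{Alnum}_]*/

def mixed_script_identifiers(source)
  source.scan(IDENTIFIER).uniq.filter_map do |name|
    scripts = SCRIPTS.select { |_, pattern| name.match?(pattern) }.keys
    [name, scripts] if scripts.size > 1
  end
end
```

An identifier written entirely in one script, such as a purely Cyrillic or purely Latin name, is not reported; only mixtures are. This keeps legitimate non-ASCII identifiers usable while drawing attention to the cases most likely to mislead a reviewer.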
Presentation slides at https://www.sw.it.aoyama.ac.jp/2023/pub/RubyꝩduЯ