I have recently decided to create a parser of SMTP addresses (RFC 5321) in ragel. Therefore, I have implemented the basic grammar using it in rspamd: smtp_address.rl.
Ragel has three modes of code generation:
Initially, I thought that goto driven mode is not very friendly for the modern CPU. However, I decided to perform some performance tests. I've used the ragel generated code to parse 100K of email addresses all in the form of '<@domain1,@domain2:addr%d@example.com>'. Yes, this is a valid SMTP address (counting that %d is replaced with some number) and this enforces the most of state machine to be used.
Here are results from 30 measurements using -O0 optimization level:
And with heavy optimizations - -O3 -march=native
Tests were performed on my macbook with Haswell CPU. Compiler - clang-3.8.
As you can see, T mode is the slowest and G mode is the winner. F mode has shown the intermediate results. However, the generated code size for G mode is the largest among these options and F mode is the winner in this case. On the other hand, F mode is not suitable for wide input alphabet mode.
Ragel has three modes of code generation:
- Table driven (T mode) - all states are pushed into one table and transitions are calculated using the next state and the input character performing states table lookup depending on the current state
- Alphabet driven (F mode) - same as previous but states are searched using the current character as index
- Goto driven mode - no tables are created but there are really many goto statements in the code
Initially, I thought that goto driven mode is not very friendly for the modern CPU. However, I decided to perform some performance tests. I've used the ragel generated code to parse 100K of email addresses all in the form of '<@domain1,@domain2:addr%d@example.com>'. Yes, this is a valid SMTP address (counting that %d is replaced with some number) and this enforces the most of state machine to be used.
Here are results from 30 measurements using -O0 optimization level:
And with heavy optimizations - -O3 -march=native
Tests were performed on my macbook with Haswell CPU. Compiler - clang-3.8.
As you can see, T mode is the slowest and G mode is the winner. F mode has shown the intermediate results. However, the generated code size for G mode is the largest among these options and F mode is the winner in this case. On the other hand, F mode is not suitable for wide input alphabet mode.