I recently decided to write a parser for SMTP addresses (RFC 5321) in Ragel. I have implemented the basic grammar with it for rspamd: smtp_address.rl.
Ragel has three modes of code generation:
- Table driven (T mode) - all states and transitions are packed into tables, and the next state is found by looking up the input character among the keys of the current state
- Alphabet driven (F mode, flat tables) - same as the previous one, but the transition is found by using the current character as a direct index
- Goto driven (G mode) - no tables are created; the state machine is encoded directly in the code as a very large number of goto statements
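To make the difference between the table-based and goto-based styles concrete, here is a hand-written sketch of the same tiny recognizer in both forms. It is not Ragel output: the toy grammar ([a-z]+ '@' [a-z]+), the state names, and the functions match_table and match_goto are all assumptions made for illustration, and the real generated code is considerably more involved.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy grammar (an assumption for illustration): [a-z]+ '@' [a-z]+ */
enum { ST_ERR = 0, ST_START, ST_LOCAL, ST_AT, ST_DOMAIN, ST_COUNT };

/* Map a byte to a character class: 0 = other, 1 = lowercase letter, 2 = '@'.
 * Assumes an ASCII-compatible character set. */
static int char_class(unsigned char c)
{
    if (c >= 'a' && c <= 'z') return 1;
    if (c == '@') return 2;
    return 0;
}

/* Table style: the next state is read from a table indexed by
 * (current state, character class). */
static const unsigned char next_state[ST_COUNT][3] = {
    /*               other   letter     '@'    */
    [ST_ERR]    = { ST_ERR, ST_ERR,    ST_ERR },
    [ST_START]  = { ST_ERR, ST_LOCAL,  ST_ERR },
    [ST_LOCAL]  = { ST_ERR, ST_LOCAL,  ST_AT  },
    [ST_AT]     = { ST_ERR, ST_DOMAIN, ST_ERR },
    [ST_DOMAIN] = { ST_ERR, ST_DOMAIN, ST_ERR },
};

bool match_table(const char *p, size_t len)
{
    int st = ST_START;

    for (size_t i = 0; i < len && st != ST_ERR; i++)
        st = next_state[st][char_class((unsigned char)p[i])];

    return st == ST_DOMAIN; /* the only accepting state */
}

/* Goto style: every state is a label and every transition is a jump,
 * so there is no table and no explicit state variable. */
bool match_goto(const char *p, size_t len)
{
    const char *pe = p + len;

    /* start state */
    if (p == pe) return false;
    if (*p >= 'a' && *p <= 'z') { p++; goto local; }
    return false;
local:
    if (p == pe) return false;
    if (*p >= 'a' && *p <= 'z') { p++; goto local; }
    if (*p == '@') { p++; goto at; }
    return false;
at:
    if (p == pe) return false;
    if (*p >= 'a' && *p <= 'z') { p++; goto domain; }
    return false;
domain:
    if (p == pe) return true; /* input ended in the accepting state */
    if (*p >= 'a' && *p <= 'z') { p++; goto domain; }
    return false;
}
```

Both functions accept the same inputs. The table version keeps the code small and moves the machine into data, while the goto version grows with the number of states, which is the same trade-off the generation modes make.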
Initially, I thought that the goto driven mode would not be very friendly to modern CPUs. However, I decided to run some performance tests. I used the Ragel generated code to parse 100K email addresses, all in the form '<@domain1,@domain2:addr%d@example.com>'. Yes, this is a valid SMTP address (given that %d is replaced with some number), and it forces most of the state machine to be exercised.
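A test of this kind could be driven by something like the sketch below. It is not the harness used for the measurements: make_address, run_benchmark, the parse_fn callback type, and stub_parse (a stand-in for the Ragel generated parser, which is not shown in this post) are all hypothetical names. The addresses are generated before the clock starts so that only the parsing is timed.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

enum { ADDR_SLOT = 64 }; /* bytes reserved per generated address */

/* Hypothetical parser signature: returns non-zero if the address is accepted. */
typedef int (*parse_fn)(const char *addr, size_t len);

/* Format the n-th test address into buf; returns its length, or 0 on error. */
size_t make_address(char *buf, size_t buflen, int n)
{
    int r = snprintf(buf, buflen, "<@domain1,@domain2:addr%d@example.com>", n);

    return (r < 0 || (size_t)r >= buflen) ? 0 : (size_t)r;
}

/* Stand-in for the generated parser: only checks the angle brackets. */
int stub_parse(const char *addr, size_t len)
{
    return len >= 2 && addr[0] == '<' && addr[len - 1] == '>';
}

/* Parse `count` addresses and return the elapsed CPU time in seconds,
 * or -1.0 if memory could not be allocated. The number of accepted
 * addresses is stored in *accepted. */
double run_benchmark(parse_fn parse, int count, int *accepted)
{
    char *bufs = malloc((size_t)count * ADDR_SLOT);
    size_t *lens = malloc((size_t)count * sizeof(*lens));
    double elapsed = -1.0;

    *accepted = 0;
    if (bufs != NULL && lens != NULL) {
        /* Generate all inputs up front, outside the timed region. */
        for (int i = 0; i < count; i++)
            lens[i] = make_address(bufs + (size_t)i * ADDR_SLOT, ADDR_SLOT, i);

        clock_t begin = clock();
        for (int i = 0; i < count; i++)
            if (parse(bufs + (size_t)i * ADDR_SLOT, lens[i]))
                (*accepted)++;
        elapsed = (double)(clock() - begin) / CLOCKS_PER_SEC;
    }

    free(bufs);
    free(lens);
    return elapsed;
}
```

Calling run_benchmark(stub_parse, 100000, &accepted) repeatedly, once per generated parser variant, would give one sample per run. The accepted counter also serves as a sanity check that every input really went through the parser.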
Here are the results from 30 measurements at the -O0 optimization level:
And with heavy optimizations (-O3 -march=native):
The tests were performed on my MacBook with a Haswell CPU. Compiler: clang-3.8.
As you can see, T mode is the slowest and G mode is the fastest, with F mode in between. However, G mode also produces the largest generated code of the three, while F mode produces the smallest. On the other hand, F mode is not suitable for wide input alphabets.