Item2603: BlackListPlugin takes ages to process a topic with very many http links
Priority: Normal
Current State: Confirmed
Released In:
Target Release: n/a
From Item2577, where Paul noted this:
I sat there with top open, watching the CPU time count up on the view script. I got distracted at about the 02:00 minute mark (it didn't take much longer than that). Wow!
The content I'm trying to save (which also triggers spamword detection) can be found at
http://wiki.trin.org.au/Mangroves/Bibliography?raw=on
An equivalent quantity of "The quick brown fox" text takes only a couple of seconds. It can be found at
http://wiki.trin.org.au/Sandbox/TestTopic127?raw=on
--
PaulHarvey - 05 Jan 2010
I have tried with copies of the two topics.
At first I could not reproduce a 2-minute delay or anything like it. The quick brown fox topic saves in 7 seconds.
After updating to the checkin you did for this item, I could still save the topic in 7 seconds.
The problem comes when I try the Bibliography topic. I now get a save time of 7 seconds with BLP disabled and 37 seconds with it enabled, with the spam regex removed. It makes no difference if I also add all the Quick Brown Fox text. It is the number of http links that determines the delay.
I am sure the AntiWikiSpamPlugin will have the same performance, as this feature is exactly the same in the two plugins. Both run through the saved topic text with the massive spam regex from the common spam regex site.
I do not see how we can speed that up unless the regexes we use can be optimized.
I tried removing all the http strings, replacing them with "hej", and then it took 7 seconds to save the topic.
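The experiment above can be approximated outside the wiki with a small timing sketch. It is written in Python rather than the plugin's Perl, and everything in it is an assumption made for illustration: the spam word list, the shape of the combined regex (a URL prefix followed by any spam term), and the generated link text. It is not the plugin's actual code, and the timings it prints will not match the figures quoted here.

```python
import re
import time

# Hypothetical stand-in for the shared spam list; the real list is far larger.
SPAM_WORDS = [f"spamword{i}" for i in range(200)]

# Assumed shape of the check: a URL prefix, URL characters, then any spam term.
SPAM_RE = re.compile(r"https?://[\w.\-:@/]*?(?:" + "|".join(SPAM_WORDS) + ")")


def scan_seconds(text):
    """Return the wall-clock time taken by one full scan of the text."""
    start = time.perf_counter()
    SPAM_RE.search(text)
    return time.perf_counter() - start


def strip_links(text):
    """Replace every http/https prefix with 'hej', as in the experiment."""
    return re.sub(r"https?://", "hej", text)


# Made-up topic text consisting only of links, none of which is spam.
links = "\n".join(f"http://example.org/paper/{i}/index.html" for i in range(1000))
plain = strip_links(links)

print(f"with links:    {scan_seconds(links):.4f}s")
print(f"without links: {scan_seconds(plain):.4f}s")
```

With the link prefixes removed, the regex has no position where it can start a match attempt, which is one plausible reading of why the save time dropped back to 7 seconds.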
I have checked in a change so that we only check for http and https. There is no need to also look for gopher, telnet, etc., as they are not used for spam. I could not see much difference whether I ran with http or https? in the regex, so this is OK to add to the plugin. The question is what we can do for speed.
For normal topics with only a few http links, the delay is a second or two when you save.
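A minimal sketch of the scheme restriction described above, in Python rather than the plugin's Perl. The scheme list, the helper build_spam_regex and the spam terms are all hypothetical and only illustrate the idea; the plugin's real pattern may be built differently.

```python
import re

# Hypothetical broader scheme list, in the spirit of what was checked before.
ALL_SCHEMES = r"(?:https?|ftp|gopher|telnet|news|file)"
# After the change: only http and https.
WEB_SCHEMES = r"https?"


def build_spam_regex(scheme_pattern, spam_terms):
    """Build one pattern: scheme, '://', URL characters, then any spam term."""
    terms = "|".join(re.escape(t) for t in spam_terms)
    return re.compile(
        scheme_pattern + r"://[\w.\-:@/]*?(?:" + terms + ")",
        re.IGNORECASE,
    )


spam_terms = ["cheap-pills", "casino-bonus"]  # made-up examples
old_re = build_spam_regex(ALL_SCHEMES, spam_terms)
new_re = build_spam_regex(WEB_SCHEMES, spam_terms)

# The narrowed pattern still catches http and https spam links,
# but no longer looks at other schemes such as gopher.
print(bool(new_re.search("see http://example.org/cheap-pills/now")))
print(bool(new_re.search("see https://example.org/casino-bonus")))
print(bool(new_re.search("see gopher://example.org/cheap-pills")))
print(bool(old_re.search("see gopher://example.org/cheap-pills")))
```

Narrowing the schemes only reduces what the pattern can match. As noted above, it made little measurable difference to speed, since the cost is driven by the number of http links.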
--
KennethLavrsen - 05 Jan 2010
Opened this bug for the performance issue so that I could close Item2577.
--
KennethLavrsen - 07 Jan 2010