[Mycroft] [Statistics] December overall
Ricky
ricky at linuxbourg.ch
Thu Jan 1 20:46:22 EST 2004
Hello folks
First of all, Happy New Year and long live the Mycroft project! By the
way, did you know the project is already 3 years old? It started on
December 15, 2000, and was called "Sherlock" at that time.
Ok, here are the stats. I am still trying to figure out what is really
interesting, but here we go. I hope to hear some comments from you on how
to improve them. Don't forget, statistics are not for the love of
numbers, but for improving our service.
These stats cover the last week of November and data up to Dec. 28.
General summary
From            24.11.2003 14:32:43
To              28.12.2003 23:59:53
Duration        825:27:10
Searches/Hour   91.88
                    Searches   % of all   % of name searches
Total Searches         75840       100%
0 result                7674        10%
1-15 results           15284        20%
16-30 results           1559         2%
31+ results             4935         7%
QuickLink              46276        61%
Search for a name      70806        93%                 100%
10 most frequent       49964        66%                  71%
Preprocessed            1900         3%                   3%
Cache                  11146        15%                  16%
Top 75% searches       53199                             75%
1. google 10351 15%
2. dictionary 9478 13%
3. yahoo 5591 8%
4. imdb 4375 6%
5. astalavista 3835 5%
6. ebay 3764 5%
7. altavista 3659 5%
8. amazon 3385 5%
9. alltheweb 3196 5%
10. leo 1554 2%
11. wikipedia 916 1%
12. dogpile 915 1%
13. labourstart 462 1%
14. java 268 0%
15. google.de 253 0%
16. php 216 0%
17. freshmeat 215 0%
18. teoma 192 0%
19. msn 150 0%
20. flash 149 0%
21. sourceforge 139 0%
22. webster 136 0%
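For anyone who wants to double-check the figures, here is a small Python sketch (not the script that produced these stats) that recomputes the Duration and Searches/Hour values from the From/To timestamps, and confirms that the 22 entries listed above add up to the "Top 75% searches" line. It assumes the percentages in the list are relative to the "Search for a name" count, which is what the numbers suggest.

```python
from datetime import datetime

# Figures copied from the general summary.
start = datetime(2003, 11, 24, 14, 32, 43)
end = datetime(2003, 12, 28, 23, 59, 53)
total_searches = 75840
name_searches = 70806

# Duration as hours:minutes:seconds, the format used in the summary.
total_seconds = int((end - start).total_seconds())
hours, rem = divmod(total_seconds, 3600)
minutes, seconds = divmod(rem, 60)
duration = f"{hours}:{minutes:02d}:{seconds:02d}"

searches_per_hour = round(total_searches / (total_seconds / 3600), 2)

# Counts of the 22 entries in the top list, in rank order.
top22 = [10351, 9478, 5591, 4375, 3835, 3764, 3659, 3385, 3196, 1554,
         916, 915, 462, 268, 253, 216, 215, 192, 150, 149, 139, 136]
top22_total = sum(top22)

# Assumption: the share is taken against name searches, not all searches.
top22_share = round(100 * top22_total / name_searches)

print(duration, searches_per_hour, top22_total, top22_share)
# prints: 825:27:10 91.88 53199 75
```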
This top 75% can also be expressed in the following way. All the different
ways of searching for one engine (for example "google", "google.de",
"google.com") have been grouped together, even obvious misspellings like
"goggle". All 22 entries above have been analyzed this way.
1. google 12338 17%
2. dictionary 9648 14%
3. yahoo 5928 8%
4. imdb 4420 6%
5. astalavista 3895 6%
6. ebay 3837 5%
7. altavista 3805 5%
8. amazon 3514 5%
9. alltheweb 3280 5%
10. leo 1706 2%
11. wikipedia 980 1%
12. dogpile 972 1%
13. labourstart 467 1%
14. java 342 0%
15. webster 298 0%
16. php 257 0%
17. freshmeat 231 0%
18. teoma 227 0%
19. msn 210 0%
20. flash 184 0%
21. sourceforge 170 0%
We can see from these figures that webster had enough alternate
spellings (merriam-webster, m-w) to gain several ranks (from 22nd to 15th).
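The grouping of alternate spellings could be done along the following lines. This is only a sketch in Python: the alias table, the function names and the sample counts are all made up for illustration, and it is not how the figures above were actually produced.

```python
from collections import Counter

# Hypothetical alias table: each alternate spelling maps to the name
# it should be counted under. The real list would be much longer.
ALIASES = {
    "google.de": "google",
    "google.com": "google",
    "goggle": "google",
    "merriam-webster": "webster",
    "m-w": "webster",
}

def canonical(query):
    """Normalize case and whitespace, then fold known aliases."""
    key = query.strip().lower()
    return ALIASES.get(key, key)

def group_counts(raw_counts):
    """Sum the counts of all query strings sharing a canonical name."""
    grouped = Counter()
    for query, count in raw_counts.items():
        grouped[canonical(query)] += count
    return grouped

# Invented sample counts, NOT taken from the December logs.
sample = {"google": 100, "Google.de": 5, "goggle": 2,
          "webster": 10, "m-w": 4, "imdb": 7}

ranked = group_counts(sample).most_common()
print(ranked)
# prints: [('google', 107), ('webster', 14), ('imdb', 7)]
```

A plain lookup table keeps the grouping easy to review by hand, at the cost of having to add every new misspelling explicitly.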
Top 10 bad query strings
1. shockwave 47 0.06%
metager 47 0.06%
3. google.co.jp 30 0.04%
4. quicktime 29 0.04%
gamefaqs 29 0.04%
6. ask.com 27 0.04%
mycroft 27 0.04%
8. google 23 0.03%
webcrawler 23 0.03%
10. all the web 21 0.03%
realplayer 21 0.03%
apple 21 0.03%
m-w.com 21 0.03%
metager.de 21 0.03%
cracks 21 0.03%
There is still a high ratio of strings that shouldn't be here in the
first place: "shockwave", "quicktime", "realplayer", "apple" (referring
to "Apple quicktime"?). Then come the metager requests (in first place if
we add "metager" and "metager.de" together: 68) and the gamefaqs requests,
which seem to be the most relevant ones, but unfortunately, their site
uses POST...