Friday, January 7, 2011

Master's Thesis: Improving the Quality of Web Spam Filtering by Using Seed Refinement

I finally defended my Master's thesis, titled "Improving the Quality of Web Spam Filtering by Using Seed Refinement", on 16th December, 2010. The thesis deals with Web spam, a significant problem on today's World Wide Web and a nuisance for search engines.

The thesis proposes seed refinement techniques for four well-known web spam filtering algorithms: TrustRank, Anti-TrustRank, Spam Mass, and Link Farm Spam. The input seed set is refined by maintaining an exception list for it, which helps decrease false positives while increasing true positives. The thesis also proposes a strategy for the order in which the modified algorithms are applied: a seed refiner followed by a spam detector. The modified algorithms fall into these two classes: Modified TrustRank (MTR) and Modified Anti-TrustRank (MATR) are seed refiners, while Modified Spam Mass (MSM) and Modified Link Farm Spam (MLFS) are spam detectors.
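To make the idea of a seed set with an exception list a little more concrete, here is a minimal Python sketch. It is not the MTR algorithm from the thesis. It only shows plain TrustRank-style propagation (a PageRank biased towards the seed pages), with one assumed refinement step: seeds that appear on an exception list are dropped before trust is propagated. The function name `trust_rank`, the `exceptions` parameter, the toy graph, and the page names are all hypothetical and chosen for illustration only.

```python
def trust_rank(graph, seeds, exceptions=frozenset(), damping=0.85, iterations=50):
    """Propagate trust from seed pages along out-links (biased PageRank).

    graph      -- dict mapping a page to the list of pages it links to
    seeds      -- pages assumed to be trustworthy
    exceptions -- seeds to exclude (an assumed form of seed refinement)
    """
    # Assumed refinement step: drop every seed that is on the exception list.
    refined = sorted(set(seeds) - set(exceptions))
    if not refined:
        raise ValueError("seed set is empty after refinement")

    # Collect every page that appears as a source, a target, or a seed.
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    nodes.update(refined)

    # The bias vector spreads the initial trust uniformly over the refined seeds.
    bias = {page: 0.0 for page in nodes}
    for seed in refined:
        bias[seed] = 1.0 / len(refined)

    trust = dict(bias)
    for _ in range(iterations):
        updated = {page: (1.0 - damping) * bias[page] for page in nodes}
        for page in nodes:
            targets = graph.get(page, [])
            if not targets:
                continue  # dangling page: its trust is not passed on
            share = damping * trust[page] / len(targets)
            for target in targets:
                updated[target] += share
        trust = updated
    return trust


# Toy graph: two good pages link to each other, and a page that was
# wrongly placed in the seed set links to a spam page.
graph = {
    "good1": ["good2"],
    "good2": ["good1"],
    "hijacked": ["spam"],
    "spam": ["hijacked"],
}
seeds = ["good1", "hijacked"]

plain = trust_rank(graph, seeds)
refined = trust_rank(graph, seeds, exceptions={"hijacked"})
```

In this toy graph the unrefined seed set lets trust leak into the spam page through the bad seed, while the refined seed set leaves the spam page with no trust at all. How the thesis actually builds and uses its exception list is described in the full text, not in this sketch.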

Below is my Master's thesis defense presentation, which I am sharing for those interested:

The full text of the thesis can be downloaded from here or here (the journal copy), or requested via email. Interested students and researchers may contact me with any questions, comments, or feedback.

For Citation:
  title={Improving the Quality of Web Spam Filtering by Using Seed Refinement},
  author={Qureshi, Muhammad Atif and Yun, Tae-Seob and Lee, Jeong-Hoon and Whang, Kyu-Young},
  journal={Journal of the Institute of Electronics Engineers of Korea},
