Efficient Distributed Web Crawler Using Hefty and Enhanced Bandwidth Algorithms for Drug Website Search

Authors

  • Ramachadran A, Department of CSE, University College of Engineering, Panruti, Tamilnadu, India
  • Arun Prakash R, Department of CSE, University College of Engineering, Ariyalur, Tamilnadu, India
  • Aghila R, Department of IT, Sethu Institute of Technology, Kariapatti, Tamilnadu, India
  • Manju Khari, Ambedkar Institute of Advanced Communication Technologies and Research, New Delhi, India

Keywords:

Distributed crawler, Web crawler, Bandwidth

Abstract

 

Building an efficient search structure is very important given the current scale of the web. Search engines mine information from the web using a program called a web crawler, which surfs the web in an efficient manner. A distributed crawler is a variant of the web crawler that uses a distributed computation method. In this paper, we design and implement an Efficient Distributed Web Crawler using the Enhanced Bandwidth and Hefty algorithms. Most web crawlers do not have a distributed cluster performance system or any implemented algorithm. We combine a novel Hefty algorithm with an Enhanced Bandwidth algorithm to obtain a better distributed crawling system. The Hefty algorithm is implemented to provide robust and efficient surfing results when applied to drug website search, and the Enhanced Bandwidth algorithm is implemented to improve the efficiency of the proposed crawler.

Published

2020-08-17

How to Cite

A, R., R, A. P., R, A., & Khari, M. (2020). Efficient Distributed Web Crawler Using Hefty and Enhanced Bandwidth Algorithms for Drug Website Search. International Journal of Machine Learning and Networked Collaborative Engineering, 4(01), 01–11. Retrieved from https://mlnce.net/index.php/Home/article/view/124