Hybrid Algorithm for Enhancing Focused Web Crawling Using Block Segmentation
by Niti Saxena

Overview

A search engine is software that helps locate information stored on the World Wide Web; the term usually refers to the actual search performed through databases of HTML documents. The purpose of partitioning a web page into blocks is that, once the page has been divided, URLs are extracted only from the relevant blocks, and URLs that belong to irrelevant blocks are not extracted. A problem faced by focused crawlers is that they measure relevancy and calculate the URL score for a page as a whole, even though a web page usually contains both relevant and irrelevant topics. Page segmentation transforms a multi-topic web page into many single-topic content blocks and hence improves crawling performance. Blocks such as navigation panels, copyright and privacy notices, unnecessary images, and advertisements distract a user from the actual content and reduce performance. In this thesis, we present a method to divide web pages into content blocks. The method uses an algorithm that partitions a web page into content blocks with a hierarchical structure, based on the page's pre-defined structure, i.e. the HTML tags. Partitioning on the basis of headings gives the proposed method an advantage over conventional block partitioning: each block covers a complete topic. The heading, content, images, links, tables, and sub-tables of a particular topic are all included in one complete block.
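
The heading-based idea described above can be illustrated with a minimal sketch. This is not the author's algorithm, which the overview does not spell out. It assumes that a new block starts at every h1 to h6 tag, that blocks are kept as a flat list with a heading level rather than a full hierarchy, and that relevance is the fraction of topic terms found in a block. The names HeadingBlockParser, block_relevance, relevant_urls, and the 0.5 threshold are all hypothetical.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingBlockParser(HTMLParser):
    """Split a page into blocks, starting a new block at every heading tag."""

    def __init__(self):
        super().__init__()
        # Content that appears before the first heading goes into an
        # untitled level-0 block.
        self.blocks = [{"level": 0, "heading": "", "text": [], "links": []}]
        self._in_heading = False

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            self.blocks.append(
                {"level": int(tag[1]), "heading": "", "text": [], "links": []}
            )
            self._in_heading = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                # A link belongs to the block that is currently open.
                self.blocks[-1]["links"].append(href)

    def handle_endtag(self, tag):
        if tag in HEADINGS:
            self._in_heading = False

    def handle_data(self, data):
        words = data.split()
        if not words:
            return
        block = self.blocks[-1]
        if self._in_heading:
            block["heading"] = (block["heading"] + " " + " ".join(words)).strip()
        # Heading words also count as block text for scoring.
        block["text"].extend(w.lower().strip(".,;:!?()") for w in words)


def block_relevance(block, topic_terms):
    """Assumed stand-in score: fraction of topic terms present in the block."""
    if not topic_terms:
        return 0.0
    words = set(block["text"])
    return sum(1 for term in topic_terms if term in words) / len(topic_terms)


def relevant_urls(html, topic_terms, threshold=0.5):
    """Return only the URLs found in blocks that score at or above threshold."""
    parser = HeadingBlockParser()
    parser.feed(html)
    parser.close()
    urls = []
    for block in parser.blocks:
        if block_relevance(block, topic_terms) >= threshold:
            urls.extend(block["links"])
    return urls


page = (
    "<div><a href='/privacy'>Privacy notice</a></div>"
    "<h2>Focused web crawling</h2>"
    "<p>A focused crawler follows <a href='/crawler'>topical links</a>.</p>"
    "<h2>Advertisements</h2>"
    "<p>Buy <a href='/ads'>shoes</a> now.</p>"
)
print(relevant_urls(page, {"focused", "crawler"}))  # prints ['/crawler']
```

In this example the privacy link and the advertisement link are skipped because their blocks contain none of the topic terms, while the link under the on-topic heading is kept. A whole-page score would have treated all three links alike.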

Details

  • ISBN-13: 9798590387021
  • Publisher: Independently Published
  • Publish Date: January 2021
  • Dimensions: 9 x 6 x 0.1 inches
  • Shipping Weight: 0.17 pounds
  • Page Count: 48
