Search Engine Indexing

> I've always found the spidering of one's own site to be highly
> ineffective and prone to error.  Some of the problems I've seen (and
> I've done 10 page to 2+ million page sites):
>
>   - navigation/copyright/boilerplate text frequently ends up prominent
>     in the result or used in query evaluation ranking (such content
>     should be excluded)

Better search engines give you a way around this. For example, Ultraseek
has stop/start tags you can place around such content so it doesn't index
it. Heck, even Fluid Dynamics Search Engine, a $40 Perl app, does that.
Somebody who's serious about search needs to spend some time looking at
the architecture of their individual documents to ensure users get to just
the information they need. This goes from excluding boilerplate to making
sure documents have meaningful titles and descriptions (how many sites
have you gone to where a search results in 400 documents all of which seem
to have the same title and generic description?).

>   - heavy load needs to be managed tightly so as to not conflict with
>     customers of the site (how many customers pull every page daily?);

Solution: Run the search engine on a separate box. CPUs are cheap ...

>   - spiders generally cannot tell when a page is no longer reachable via
>     the site navigation (forgotten content still searchable);

Then maybe this content should be pulled off the site, in which case the
search engine would drop it after X number of attempts.

> You could put an event driven indexer into your content publishing
> system.  Base it on the Verity, Autonomy, or Lucene engines (or
> something else), and construct the 'document' that they see to contain
> the specific content minus all of the wrapper.  You will only be
> indexing/removing the content that has changed moments after it has
> been changed.

This sort of leads into the old dynamic/static publishing question - this
model only works if all your content is dynamic. Although we've never
played with it, Ultraseek does have an API that, theoretically, would let
your CMS notify the search engine of a new document as soon as it's
published (rather than waiting for, say, your nightly crawl).

> Think of it the same way as publishing your site.  Do you publish _all_
> the contents every night (and _only_ every night) throwing out the old
> and completely replacing it?  Or, do you publish only the documents
> that have changed whenever you need to?

Good point. How you implement search should probably be dependent on how
you publish your pages. But Ultraseek doesn't re-index every page every
night - it only grabs files with a new or changed timestamp.

No, I don't work for Verity - I just like my Ultraseek :-).

Adam Gaffin
Executive Editor, Network World Fusion
[EMAIL-REMOVED] / (508) 490-6433 /
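
For illustration only, two of the ideas discussed above (excluding
boilerplate wrapped in stop/start markers, and re-indexing only files
whose timestamp has changed) might be sketched in Python roughly as
follows. The marker strings and the function names strip_unindexed and
changed_documents are hypothetical, chosen for this sketch; they are not
the actual syntax or API of Ultraseek or of any other engine named in
this thread.

```python
from __future__ import annotations

import re

# Hypothetical marker comments (an assumption for this sketch);
# each real search engine defines its own exclusion syntax.
STOP = "<!-- stopindex -->"
START = "<!-- startindex -->"

# Non-greedy match from a STOP marker to the next START marker.
_EXCLUDED = re.compile(re.escape(STOP) + r".*?" + re.escape(START), re.DOTALL)


def strip_unindexed(html: str) -> str:
    """Drop everything between a STOP marker and the next START marker."""
    text = _EXCLUDED.sub("", html)
    # Treat an unmatched STOP as excluding the rest of the document.
    cut = text.find(STOP)
    return text if cut == -1 else text[:cut]


def changed_documents(previous: dict[str, float],
                      current: dict[str, float]) -> list[str]:
    """Return paths that are new, or whose timestamp differs from the
    value recorded at the last crawl."""
    return sorted(
        path for path, mtime in current.items()
        if previous.get(path) != mtime
    )
```

An indexer built along these lines would pass each fetched page through
strip_unindexed before handing it to the engine, and would call
changed_documents with the timestamps saved from the previous crawl to
decide which pages to fetch again.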