Proposed Search Engines Enhancement
(2007 Jan 14 blog post)
INTRODUCTION: A couple of years ago (2005 Mar), I tried to propose a major enhancement to Google's search engine. I got an automated reply --- essentially a non-reply. The image above indicates the suggestion --- a maximum distance apart, in words, for the search words, which the user can specify. Many web pages are huge and contain sections on many different topics. If this 'words-distance-apart' suggestion were implemented, as outlined further below, it would drastically reduce the number of useless 'hits' from large pages [such as Google blogspot.com pages] in most of my web searches. I found what I thought was an appropriate email address --- suggestions@google.com. But their reply said to "register" at a Google "posting" web site and submit the suggestion there. Interesting --- the email address suggestions@google.com does not accept suggestions. As Spock would say, it is not logical. I did not have the time or energy to go through their registration dance to post the suggestion.
The dance: So I let the suggestion to Google go, for the time being. I am still, years later, just as frustrated by the massive number of non-pertinent pages that I get --- on doing almost any multi-word search, with any search engine.
More background info: So I am posting the suggestion openly now --- hoping that ANY search engine organization will take up this challenge. Are you listening AltaVista, A9, AOL, Ask.com, Clusty, Exalead, Gigablast, Google, Lycos, MSN, WiseNut, Yahoo, and others? Readers, please alert them. I may periodically snail-mail this suggestion to Google and other search engine developers.
[Actually, there have been a couple of attempts, around 2005, at implementing a search-engine enhancement like this. One was made by an essentially one-person web-searcher development operation in the Netherlands --- walhello.com (Web+valhalla+hello). They/he had neither a very big database of web documents to search nor the huge server farm of an organization like Google.
The other attempt was limited to two options --- a fixed word span of 16 words, OR no limit on word span (the current, lamentable state of affairs). This (preliminary?) attempt was by a major search engine organization in France, exalead.com. With Exalead, you could use the word NEAR between words in a search query --- to do a "proximity search". "The NEAR operator finds documents where the query terms are within 16 words of each other."
Note that the French and Dutch are not willing to resign themselves to using Google for all their searches. They know they can do better. Hopefully, these two, and other searcher development organizations, are still working on this feature.]
The image at the top of this page (for a hypothetical search engine called 'Hoogle') gives the gist of the suggestion in a readily assimilable visual form. If it is technically more convenient to express the 'maximum span' between the search words in 'characters' rather than 'words', consider changing 'Word Span' to 'Characters Span'.
To give some details of the suggestion, here is the text of the original proposal that I e-mailed to suggestions@google.com on March 13, 2005.
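As an illustration only (it is not part of the 2005 proposal, and it is not how any real search engine is known to do it), the 'Word Span' test on a single page can be sketched in Python. The function names, the simple tokenizer, and the choice to count the span inclusively (first matched word through last matched word) are all assumptions made for this sketch.

```python
import re

def tokenize(text):
    """Split text into lowercase words (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())

def min_word_span(text, query_words):
    """Return the size, in words, of the smallest window of `text` that
    contains every query word, or None if some query word is absent."""
    wanted = {w.lower() for w in query_words}
    tokens = tokenize(text)
    counts = {}   # query word -> occurrences inside the current window
    best = None
    left = 0
    for right, tok in enumerate(tokens):
        if tok not in wanted:
            continue
        counts[tok] = counts.get(tok, 0) + 1
        # Shrink the window from the left while it still covers all words.
        while len(counts) == len(wanted):
            span = right - left + 1
            if best is None or span < best:
                best = span
            ltok = tokens[left]
            if ltok in counts:
                counts[ltok] -= 1
                if counts[ltok] == 0:
                    del counts[ltok]
            left += 1
    return best

def within_span(text, query_words, max_span):
    """True if all query words occur within `max_span` words of each other."""
    span = min_word_span(text, query_words)
    return span is not None and span <= max_span
```

A page would count as a 'hit' only when `within_span(page_text, words, max_span)` is true. A 'Characters Span' variant would presumably compare character offsets instead of word positions.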
A 2005 Communication to Google:
Subject:
Dear Google Developers,
In doing searches on multiple keywords, I am continually getting many pages that do not apply --- because they are long pages (like pages with hundreds of mail responses, or a lot of information on many different subjects).
Suggestion:
Implementation:
Data gathering (word location) considerations:
Storage overhead: Although the 4 bytes for each keyword might increase the size of Google's database(s) by about 20%, the pay-back would be well worth it.
Cheers,
2013 UPDATE: I recently (2013 April) bought a book called "9 Algorithms That Changed the Future" by John MacCormick. That book points out, in the first chapter, on web search algorithms, that the position of words within web pages IS SAVED and accessible to search engines like Google. So there is no reason why they could not provide the facility suggested here --- if not on the main search page, then via the 'Advanced Search' link. That chapter even points out that search engines like Google use the 'near' capability very heavily for their own purposes. Why they do not make that ability available to users is puzzling --- especially when it could cut searches that return millions of pages down to thousands of pages instead. A situation devoutly to be wished --- especially as the databases of web pages explode in size.
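To make the point about saved word positions concrete, here is a minimal Python sketch of a positional index and a span-limited query against it. It is a toy under stated assumptions: the function names and the in-memory dictionary layout are hypothetical, words are split on whitespace only, and the brute-force check over position combinations would not scale to a real index.

```python
from collections import defaultdict
from itertools import product

def build_positional_index(pages):
    """pages: dict of page_id -> text.
    Returns a mapping word -> {page_id: [word positions in that page]}."""
    index = defaultdict(dict)
    for page_id, text in pages.items():
        for pos, word in enumerate(text.lower().split()):
            index[word].setdefault(page_id, []).append(pos)
    return index

def search_within_span(index, query_words, max_span):
    """Return ids of pages where all query words occur inside some
    window of at most `max_span` words."""
    words = [w.lower() for w in query_words]
    if not words or any(w not in index for w in words):
        return []
    # Pages that contain every query word at least once.
    candidates = set(index[words[0]])
    for w in words[1:]:
        candidates &= set(index[w])
    hits = []
    for page_id in sorted(candidates):
        position_lists = [index[w][page_id] for w in words]
        # Brute force: try one position per word and measure the window.
        if any(max(combo) - min(combo) + 1 <= max_span
               for combo in product(*position_lists)):
            hits.append(page_id)
    return hits
```

For example, with one short page and one long page that both contain "google" and "search", a small `max_span` keeps only the page where the two words sit close together, which is the filtering effect the suggestion is after.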
Here is a page of web searcher sites for examples of web searchers --- and to search for more information on this topic.
Page history:
Page was posted 2007 Jan 14.