Halley recently wrote about the lack of search engine transparency in "Search Products On The Supermarket Shelf" and said:

If I made an analogy to this lack of transparency in search engines with a supermarket shelf full of 50 types of yogurt, or cookies, or soup, there is one striking difference with search products. I really don't know what's inside, but I gobble them right up anyway. When it comes to the foods I mentioned, I could most likely pick out the one with the lowest fat content, the one that has low or no carbs, or the highest salt content. I could turn the package around and verify my assumptions. I'd never pick one without looking at the contents.
That search engines just sit out there dumbly with no "Nutrition Facts" label seems insane to me. We really don't know what we're buying when we make a decision to search in Google or MSN or Amazon's A9. (Maybe all the geeks do, but "regular folks" don't.) I'm not saying I want to know the mind-numbing details of the search algorithms, but I would like some sort of product description of what I'm actually searching through to help me know what search engine I need.

However, I'm left wondering what she's really after. I understand that she wants more information about what's going on behind the scenes, but I still don't know what she wants to know.

So, I'll ask the question here. Aside from simply publishing a ranking algorithm, what are the bits you'd like to know more about?

Posted by jzawodn at April 20, 2005 03:57 PM

Reader Comments
# Graeme Williams said:

I guess I'd like to see some sort of measure of freshness, both on individual results and the page as a whole. Occasionally, I do a search on something that I know isn't very old (like some research I've heard about) but isn't appropriate for a news search.

I'm not sure exactly what I'm asking for, or how I'd use it, but it feels like it would be helpful.

And didn't search engines of yore use to give you counts on individual search terms?

And since you ask, I wish I could trade time for more detail: "this is an important search, please lower my execution priority but provide more detail in the results".

(Graeme underscore Williams at Yahoo dot com)

on April 20, 2005 05:27 PM
# Marshall said:

Questions I'd like to ask:

Does PubSub not index Blogger hosted blogs?

How long has it been since Mamma was only buying the top 5 Google search results? Should they disclose that fact, rather than just saying "we search Google"?

Does the fact that Mamma is being sued by their own shareholders mean that they are going down in flames instead of improving their services?

Why does Wikipedia not make a bigger deal about Wikiwax?

Why do more search engines not add a link to search The WayBack Machine at archive.org?

on April 20, 2005 06:51 PM
# Andres said:

OK, it is a good point. But if you want to read the nutrition facts, you do not need to know the whole manufacturing process of the product. You want to know whether the search engine will value this or that more when sorting the results, but that does not mean you have to know the whole algorithm it uses on the back end.

on April 20, 2005 09:08 PM
# Adrian Lee said:

Heh, I don't think I'd like food type labelling on search results, they scam them as much as anything. Very economical about the truth of what is in food products. Holding that up as an example of what you want isn't saying much!

What worries me about Google's Search History thing, and I've not read much about it yet, is that I wonder how long they've been tracking it anyway. Do they have search histories for millions of people going back a couple of years anyway?

It would probably be nice to get more information about some of the stuff SEs are up to. That's primarily aimed at Google, who seem to be overly private about everything and only release small amounts of detail at any given time.

I think a lot of people would like to know what is being tracked when they use search engines, or just about any web site. A kind of publicly viewable policy that says "We are logging these details about you when you use us" kind of thing.

Also, when things are changed, I'd like to know why they are changed, not just get some announcement saying this is how it is. Or, even more commonly, no announcement, confirmation, denial or anything, just threads in a lot of SEO forums and blogs about the changes people are noticing.

on April 21, 2005 01:36 AM
# Adarsh Bhat said:

I don't get it either. We already have the text snippet in each result. That could give us a lot of information.

on April 21, 2005 01:47 AM
# Brien said:

Things that are relevant to searchers:

* When did you last look at this page? (I find many "not here anymore" pages, also sometimes
* When did you first find this page? (How potentially fresh is this content?)
* Do you consider this domain or this page to be part of one or several topics?
* Is this a popular click-through? (This one is tough because it is immediately self-corrupting. I know it is available in a community feature like Eurekster; I just hope that there would be some natural "horse's mouth" that would be more relevant.)

Working both sides of the engine I find myself questioning the results quality and the total results pool corruption factor. The internet is not the best venue for trust, but I retain utopian dreams of the engines labeling "friend of the engine" after some unpaid valuation is made. Silly, I know. Cheers.

on April 21, 2005 05:35 AM
# Roger said:

Jeremy - Jon Udell has already pointed out some of the bits I'd like to know more about here:

I'd like to know which items on the results page people bookmarked or blogged about.

For example, it took me a fair amount of digging to find this:

Which extended your Linux Magazine article about mod_log_sql quite nicely.

Now when I do a search for "apache logs sql" on Yahoo, your story is the top hit. But I had to do *lots* of digging to come up with the blog entry that extended your article so nicely.

I think search results need to focus more on returning a bundle of useful links, rather than trying to deliver that *one* page that answers your immediate question.

on April 21, 2005 05:50 AM
# Cap'n Ken said:

How about a Billboard-style "movers" sort of column that shows the current ranking number of each result and what the ranking was of this result for this query in the last index?

Or they could give a score for broad buckets of the factors in their rankings. Site A gets a 5.5 for link ranking, 6.5 for URL content, 7.0 for overall page content.

Those would mostly be bits for entertainment, I guess. I think users figure out pretty quickly if an engine is giving them good results, so that's probably more valuable than any kind of "ingredient" disclosure.

As far as privacy and other kinds of disclosure (referenced above), engines are really no different than any other kind of content site. It would be nice if everybody's privacy policy included disclosures on what kind of usage activity is tracked and what the site can do with usage data in aggregate and with yours as an individual user.

But the same truths hold re: Google search history - the more information you give a site about you, the more they will know about what you are doing. If you don't want much known about you, don't register. If you don't want anything known about you, buy newspapers and use Yellow Pages instead of the Internet.

on April 21, 2005 05:10 PM
# said:

Check out www.foogro.com. I think this is what you are looking for.

on November 13, 2007 05:38 PM