Adam Rice

My life and the world around me

Is Google too big?

Google’s recent buyout of Pyra set the whole blogosphere abuzz, but it also seems to have prodded some people to wonder whether we should worry about Google being too important, too big, too valuable, too secretive.

At Austin’s blogger meetup the other night, Prentiss asserted that private projects like Google and archive.org were too important to leave in private hands (archive.org is basically a hobby of Brewster Kahle’s). He suggested that the Library of Congress should be given funding to develop and maintain resources equivalent to these.

Citing privacy concerns, the BBC’s Bill Thompson suggests that Google is “a public utility that must be regulated in the public interest,” and that the British Government should establish an “Office of Search Engines” (or to use his Orwellian term, OfSearch).

Both points have some merit, although both have weaknesses. Regulating a search engine strikes me as potentially heavy-handed. And if privacy is an issue, I’d be especially unwilling to see the U.S. Government in its current form operating a popular and all-encompassing search engine, since that could easily be a back door to Poindexter’s Total Information Awareness.

So what’s the solution? I’m not sure. But I think that if Google (or to be exact, the services it offers) is too important to leave to Google, it’s too important to leave to any one entity. Better to seed the technology widely. The open-source community might be able to come to the rescue, if it could develop and disseminate smart search-engine code, and license it under strict terms that permitted a nonprofit organization to inspect licensees’ books to make sure they weren’t misusing data they captured, etc. Result-rigging could be caught by setting up a meta-search engine that compared results from different installations of the same engine.
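To make the meta-search idea a bit more concrete, here is a minimal sketch in Python. Everything in it is my own assumption for illustration: the function names, the use of Jaccard overlap on result URLs, and the 0.5 threshold. It sends nothing over the network; it just takes the top results that each installation returned for the same query and flags any installation that disagrees with the rest on average. Honest installations with different crawl schedules will differ somewhat too, so a real check would need a carefully tuned threshold and many queries.

```python
def jaccard(a, b):
    """Overlap between two result lists, ignoring order (1.0 = same set)."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def flag_outliers(results_by_install, threshold=0.5):
    """Return installations whose results diverge from everyone else's.

    results_by_install maps a (hypothetical) installation name to the list
    of result URLs it returned for one query. The threshold is an assumed
    value, not something derived from real data.
    """
    names = list(results_by_install)
    if len(names) < 3:
        # With only two installations there is no way to tell which one is odd.
        return []
    flagged = []
    for name in names:
        others = [n for n in names if n != name]
        mean_overlap = sum(
            jaccard(results_by_install[name], results_by_install[o])
            for o in others
        ) / len(others)
        if mean_overlap < threshold:
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    sample = {
        "install_a": ["a.com", "b.com", "c.com", "d.com"],
        "install_b": ["a.com", "b.com", "c.com", "d.com"],
        "install_c": ["a.com", "x.com", "y.com", "z.com"],
    }
    print(flag_outliers(sample))
```

Comparing sets rather than rankings is the crudest possible measure; a rank-aware comparison would catch subtler rigging, such as quietly demoting one result.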

[Later] So how do you come up with a good search engine? Obviously part of the problem is having the bandwidth to crawl the Net frequently and thoroughly. Part of it no doubt comes down to efficient indexing. But perhaps the trickiest is results ranking. I was speculating on ways to refine the matching algorithm, and perhaps a tournament approach would be the way to go.

Here’s what I mean: Develop a bunch of matching algorithms. By default, site users would just see whichever is the preferred algorithm du jour. But willing users could see a “tournament view” where results from two different engines were presented side-by-side. They could then express their preference as to which set of results seemed more useful. With N algorithms, there would be N²-N possible tournament combinations (counting left/right placement separately; half that many if placement doesn’t matter). With a large user base, it shouldn’t be hard to generate meaningful results. This could also be part of the feedback loop in a genetic-algorithm approach, although I don’t understand genetic algorithms well enough to really develop that angle any further.
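A rough sketch of the bookkeeping behind such a tournament, again in Python and again with made-up names (`pairings`, `rank_by_win_rate`) and a deliberately simple scoring rule that I am assuming for illustration: each vote is recorded as a (winner, loser) pair, and algorithms are ranked by the share of their match-ups they won. Treating each side-by-side layout as an ordered pair is what gives the N²-N count.

```python
from collections import Counter
from itertools import permutations


def pairings(algorithms):
    """All ordered (left, right) match-ups: N*N - N of them for N algorithms."""
    return list(permutations(algorithms, 2))


def rank_by_win_rate(votes):
    """Rank algorithms by the fraction of their match-ups that they won.

    votes is an iterable of (winner, loser) pairs, one per user preference.
    Ties in win rate are broken alphabetically so the output is stable.
    """
    wins = Counter()
    shown = Counter()
    for winner, loser in votes:
        wins[winner] += 1
        shown[winner] += 1
        shown[loser] += 1
    return sorted(shown, key=lambda a: (-wins[a] / shown[a], a))


if __name__ == "__main__":
    algos = ["alpha", "beta", "gamma"]
    print(len(pairings(algos)))
    votes = [
        ("alpha", "beta"),
        ("alpha", "gamma"),
        ("beta", "gamma"),
        ("alpha", "beta"),
    ]
    print(rank_by_win_rate(votes))
```

A plain win rate ignores who each algorithm was matched against, so a fairer tournament would probably want a rating scheme that accounts for the strength of the opponent.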

5 Comments

  1. I wouldn’t worry too much about sinister applications of a hypothetical Library of Congress Google clone. If the NSA is living up to half of its mandate (or a tenth of its budget), it already has search engines which go much further than Google in terms of invasion of privacy. (Like, f’rinstance, Googling much of the world’s e-mail traffic. Or “find me all the blogs of people who mentioned Al Qaeda in an overseas phone call last week.”)

  2. Is Google too big to be left to Google?

    Well, they’re doing a darn good job so far. If it ain’t broke, don’t fix it. I guess that if it starts to deteriorate in quality, some interested party could buy it from them and restore the original level of service.

  3. BTW the libertarian in me is really allergic to the line of thinking that goes, “This privately run institution/service is really excellent and valuable. In fact, it’s so excellent and valuable we need to turn it into a government institution!”

  4. Jenny–I sympathize with that libertarian (lowercase-L in my case) impulse, although I also support the idea that it is dangerous to rely on any one entity for critical infrastructure.

    Google does seem to be a well-run company, but there’s no guarantee they always will be. Microsoft could probably buy them for pocket change.

    The spirit of the Internet is to decentralize. Google is, if nothing else, a highly centralized resource. If that functionality could be distributed widely, it would be more in keeping with the Internet spirit, more fault-tolerant, and less likely to arouse the ire of BBC journalists.

    Prentiss–point taken, although if the government had its hooks right in the most popular search engine, that’d be scarier. “Oh, this IP address is searching for bomb-making instructions. Hmm, appears to be a dial-up account in Hot Springs, Arkansas…let’s round him up.” And by the end of the day, some disaffected teen is in Guantanamo.

    OK, I exaggerate, a bit. But I’d be surprised if this were possible with the NSA’s current technologies.

  5. True, there is no guarantee that Google will always be well-run or public-spirited if left to the management of one private interest. I guess in that vein, the “multiple installations” idea might be the best way to address the public infrastructure angle while appeasing libertarian instincts (small “l” for me too!).

© 2017 Adam Rice
