Friday, February 8, 2013

Global search for the typosphere? Help!

Jasper Lindell asked me a question which I'm posting here with his permission:

I'm sure I'm not the only one who has been totally lost for hours looking through old posts on all the blogs of said Typosphere, but when you're looking for something specific it's difficult. Is there any way we could make a searchable archive of all the posts from all our blogs, so we can search with ease? I haven't had much experience with Google Custom Searches, but surely they would be powerful enough to search the URLs of the blogs of the Typosphere and bring up the posts relevant to our queries.
Just a little idea, really, that would make the typewriter information we constantly unearth easy to find again.

My reply was:


I don't know of a way to make a massive searchable archive. One thing you can try is limiting your Google search by adding site:blogspot.com, which restricts the results to blogs hosted on Blogger's blogspot.com domain. Of course, that includes lots of non-typewriter blogs, and excludes typospherians using other platforms (WordPress, etc.).
Another problem is that typecasts aren't searchable. I try to make up for this to some extent by adding labels to my posts, but I'm sure there are many items on my blog that are very hard to find because they are in non-searchable typecasts and my titles aren't revealing.

Jasper's followup:


After I sent my first email I had a look at what Google Custom Searches can do, and they might be useful. I think you can have a list of domains, which may be powerful enough to have the subdomains of Blogger, and then the domains of WordPress and whatever else people are using, too.

I didn't even think about the typecast situation. But if there were a way to search, most of the titles of the posts give a lot away about what will follow in the typecast. But some of the titles are ambiguous, like you said, and that would add another complication for searching. But surely it's worth having something. Besides, most of the "how-to" posts, seemingly, don't have a typecast for the instructions, and they're the ones that I know I'd be searching for the most. And many of the long-ish history sort of ones (like Robert Messenger's) are mostly computer-typed, so they would be easy to find if we had a search function.
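
For anyone who wants to try the list-of-domains idea without setting up a Custom Search, here is a minimal Python sketch that combines keywords with one site: filter per blog, joined by OR, and turns the result into an ordinary Google search link. The blog addresses in BLOGROLL are made-up placeholders, and the function names are my own, not part of any Google tool. A long blogroll would probably need to be split across several queries, since a single query can only hold so many terms.

```python
from urllib.parse import urlencode, urlparse


def build_site_query(keywords, blog_urls):
    """Combine keywords with one site: filter per blog, joined by OR."""
    hosts = []
    for url in blog_urls:
        # Accept either a full URL or a bare host name.
        parsed = urlparse(url if "//" in url else "//" + url)
        host = parsed.netloc
        if host and host not in hosts:
            hosts.append(host)
    site_filter = " OR ".join(f"site:{host}" for host in hosts)
    if len(hosts) > 1:
        # Parentheses keep the OR group separate from the keywords.
        site_filter = f"({site_filter})"
    return f"{keywords} {site_filter}".strip()


def google_search_url(query):
    """Turn a query string into a regular Google search link."""
    return "https://www.google.com/search?" + urlencode({"q": query})


# Hypothetical blogroll: replace with real typosphere blog addresses.
BLOGROLL = [
    "http://example-typecaster.blogspot.com/",
    "example-typist.wordpress.com",
]

if __name__ == "__main__":
    query = build_site_query("Olivetti Lexikon", BLOGROLL)
    print(query)
    print(google_search_url(query))
```

Because the filter is built from whatever list you pass in, Blogger and WordPress blogs can sit side by side, which gets around the blogspot-only limitation mentioned above. Typecasts, of course, stay invisible either way.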

Can anyone help with this technical question? Any suggestions?

6 comments:

  1. This can be quite a challenge, because, for starters, I don't think we even know of all the blogs and writers that form the Typosphere, let alone the platforms they all use. And since this movement was never meant to be structured or have a visible leader, per the Typewriter Insurgency Manifesto, we can't easily establish standards for titling, labeling, and composing typosphere-related entries that would make them easily indexable by the likes of Google. Why, we don't even have a standard language for the entries, which can appear in English, German, Spanish, and who knows what other languages!

    There are, however, a couple of things we could do as a group:

    - Include the label "Typosphere" in all our entries, regardless of subject, and

    - Use the advanced search feature in Google to find entries with that label.

    The other possible option is the equivalent of a Library of Congress indexing service: having someone index each and every entry in the (listed) typosphere-related blogs and then making that database available to search engines, complete with links to the proper entries. Think of it as the modern-day version of yesteryear's library catalogues and index cards. Needless to say, this would be one heck of a job.

    Maybe the easiest way to locate a specific entry in a blog is to use as complete a search term as possible in Google. For example, if you search for "Olivetti, lexikon site:blogspot.com", chances are you'll find the entries Richard Polt, Ton, and I have posted in our blogs about this typewriter; but if you look exclusively for, say, "remington site:blogspot.com" you'll find not only typewriter-related entries, but also any kind of content featuring the word "Remington". You can narrow that search by adding one or more extra keywords, as in "Remington, noiseless site:blogspot.com".

  2. pshaw. it was simplicity itself.
    http://typewriterdatabase.com now has a "Search The Typosphere" searchbox on the front page, below the newest galleries. It searches pretty much all of the Typosphere blogs listed in the blogroll.

    Enjoy!

  3. Ted - that just shows that I'm behind the times, doesn't it?

  4. Ted's search is only as good as the blogroll, of course. As you find new blogs, please send them this way! People tend to post in the "Roll Call" topic, which works for me, since I get email for every comment here (even the spammers, *sigh*).

    Similarly, help me keep the map up to date! I'm always pleasantly surprised to find typists in new places around the world. A comment in the Roll Call (or anywhere, really) can kick me into updating that map.

  5. Thanks all for working on searching the typosphere. Part of me likes that my posts aren't as legible by machines, but part of me wants to be found!

    I've been using the HTML alt attribute to add searchable keywords for my scanned images (e.g. "a typewritten page discussing typewriters in Curtis Hanson's film The Wonder Boys"). A quick test of Ted's search gives me about a 75% success rate for topics I know I've written on, but I think I may need to do better tagging.

  6. Wow.

    "Simplicity itself" for someone who knows what he's doing! Thanks, Ted.

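
Following up on the alt-text idea in comment 5: if you want to check which of your scanned typecasts already carry a searchable description, a small script can do the looking for you. This is only a sketch using Python's built-in HTML parser. The AltTextAudit class, the audit function, and the file names in SAMPLE_POST are invented for illustration, and you would need to feed it the saved HTML of one of your own posts.

```python
from html.parser import HTMLParser


class AltTextAudit(HTMLParser):
    """Sort the images in a post by whether they have alt text."""

    def __init__(self):
        super().__init__()
        self.described = []  # (src, alt) pairs for images with alt text
        self.missing = []    # src of images with no alt text, or an empty one

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src") or ""
        # A missing alt, alt="" and a bare alt all count as undescribed.
        alt = (attributes.get("alt") or "").strip()
        if alt:
            self.described.append((src, alt))
        else:
            self.missing.append(src)


def audit(html):
    """Parse one post's HTML and return the finished audit."""
    parser = AltTextAudit()
    parser.feed(html)
    parser.close()
    return parser


# Hypothetical post body with one described typecast and one bare scan.
SAMPLE_POST = (
    '<p><img src="typecast1.jpg" alt="a typewritten page discussing '
    'typewriters in film"></p>'
    '<p><img src="typecast2.jpg"></p>'
)

if __name__ == "__main__":
    result = audit(SAMPLE_POST)
    for src in result.missing:
        print("No alt text:", src)
```

Anything it lists as missing is a typecast that a text search has nothing to grab onto, apart from the post title and labels.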