Tuesday, November 27, 2007
voyager-hc/1.0
38.113.234.181 - - [27/Nov/2007:13:21:24 +0100] "GET /robots.txt HTTP/1.0" 200 612 "-" "voyager-hc/1.0"
38.113.234.181 - - [27/Nov/2007:13:21:35 +0100] "GET /path/to/some/file.html HTTP/1.0" 301 363 "-" "voyager-hc/1.0"
38.113.234.181 resolves to crawl1.cosmixcorp.com, and
cosmixcorp.com redirects to kosmix.com - a California, USA-based
outfit which appears to be legit in a "we're a cool California start-up" kind of way. Not quite sure
what they're doing (hey - it's Web 2.0), but it evidently involves crawling with a UA that gives no
hint of who runs the bot: no URL, no contact address, just "voyager-hc/1.0".
Our secret sauce (all Web 2.0 companies need one) is our categorization engine that crawls billions of Web pages in a unique manner to create algo-generated home pages…more on this later.
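For anyone wanting to repeat this kind of check on their own logs, here is a minimal sketch in Python. It assumes the entries are in Apache's "combined" log format, as the two above appear to be. The names LOG_PATTERN, parse_entries and reverse_dns are made up for this illustration, and the reverse lookup only reports whatever DNS returns at the time you run it, which may no longer match what is described above.

```python
import re
import socket

# Apache "combined" format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_entries(text):
    """Return one dict per combined-format entry found anywhere in text."""
    return [m.groupdict() for m in LOG_PATTERN.finditer(text)]


def reverse_dns(ip):
    """Return the PTR hostname for ip, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return None


# The two entries from this post.
SAMPLE = (
    '38.113.234.181 - - [27/Nov/2007:13:21:24 +0100] '
    '"GET /robots.txt HTTP/1.0" 200 612 "-" "voyager-hc/1.0"\n'
    '38.113.234.181 - - [27/Nov/2007:13:21:35 +0100] '
    '"GET /path/to/some/file.html HTTP/1.0" 301 363 "-" "voyager-hc/1.0"\n'
)

entries = parse_entries(SAMPLE)
hosts = sorted({e["host"] for e in entries})
agents = sorted({e["agent"] for e in entries})
```

Calling reverse_dns(hosts[0]) performs the actual PTR lookup; it needs network access, so it is left out of the snippet's top level. A UA string with no URL or contact details in it, like the one in agents here, is what prompts the lookup in the first place.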
Posted at 12:30 PM


--------------------------------------------------------------------
School of Engineering and Applied Sciences