
Google searches for Windows Live Writer fixed! September 4, 2006

Posted by spikew in Google, WindowsLiveWriter.

The issue with Google searches for Windows Live Writer has been resolved by a fix at Google.  Apparently, the issue was related to Windows Live Spaces in general, not just the Writer Zone blog.  The conversion of URLs from the spaces.msn.com domain to the spaces.live.com domain last month caused Google’s search engine to think something fishy was going on, so spaces.live.com URLs carried less weight in search results.

After the issue was picked up by Channel 9, and was then Scoblized, there was a lot of speculation about possible causes of the problem.  I’m actually very happy that the theories about this being an SEO issue turned out to be bogus.  While I can understand that SEO techniques have their uses for ranking one link above another in generalized topic searches (say something like “WYSIWYG Blog Editor”), the idea that inadequate SEO could prevent a link as strongly tied to “Windows Live Writer” as the Writer Zone’s introductory blog post from even showing up in the results would make me seriously doubt the integrity of a search engine.

Matt Cutts from Google has provided some specific details about the fix on his blog: 

By the way, it looks like the primary issue with the Windows Live Writer blog was the large-scale migration from spaces.msn.com to spaces.live.com about a month ago. We saw so many urls suddenly showing up on spaces.live.com that it triggered a flag in our system which requires more trust in individual urls in order for them to rank (this is despite the crawl guys trying to increase our hostload thresholds and taking similar measures to make the migration go smoothly for Spaces). We cleared that flag, and things look much better now.

For a search like [windows live writer], I see the Windows Live Writer blog at number one, and the Windows Live Writer Beta product download page at number 2. Going forward, I’ll keep an eye on the spaces.msn.com to spaces.live.com migration with the crawl folks to make sure that it continues to be smooth. It also looks like Mike Torres is #1 for searches like [torres talking], so overall things look pretty good now.

I’m glad Google was so supportive in investigating and resolving the issue, and that spaces on Windows Live Spaces are now getting their proper rankings in the results.  Now, what’s up with Ask’s results? 🙂


Theories on Google results for Windows Live Writer August 30, 2006

Posted by spikew in Google, WindowsLiveWriter.

As the “Windows Live Writer blog banned from Google?” question makes its way around, the theories on the unexpected Writer search result rankings are starting to roll in.

Here’s a quick summary of the running theories:

  • The Writer Zone is Irrelevant Theory – Google’s search result is right and the Writer Zone is actually not the most relevant link about Windows Live Writer.  There have been several references to Google weighing the <title> element very heavily, so since the title of the blog isn’t “Windows Live Writer”, it doesn’t get ranked as highly as all of the other pages that do.
  • Link-bomb Theory – The sudden and dramatic increase in the number of inbound links to the Writer blog when the product launched triggered some kind of link-bomb safeguard.  This safeguard caused the search engine to exclude the detected link-bomb target in search results.
  • Trashy HTML Theory – The HTML generated by Windows Live Spaces is so bad that Google’s search indexer won’t include it.
  • Google is Evil Theory – Google is mean and forcefully removed the Writer blog.
  • Blogs are Hot Air Theory – Google’s algorithm weighs inbound links from blogs very lightly.  Since most links to Writer Zone are from blogs, this makes the Writer Zone ranking more volatile as more non-blog sources link to other websites (like the Writer download URL or http://ideas.live.com).

I don’t subscribe to the Writer Zone is Irrelevant Theory since the Writer blog was the top link for the first week after the product launched, and then suddenly disappeared completely from the results. We’ve been diligently monitoring all references to Windows Live Writer since the product launched and it seems to me that a very large percentage of them include a link back to the Writer Zone blog. A quick search on Technorati will give you some idea about the comparative relevance (at least measured in inbound links) of the Writer blog versus the current top search result:

I don’t subscribe to the Trashy HTML Theory because the HTML generated by Spaces is not that bad. Any good HTML parser can deal with far worse.

I already weighed in on the Google is Evil Theory, and I think it’s bogus.  Call me naive, but it’s silly to think Google would waste its time (and more importantly money!) on adjusting its search rankings to drop the Writer Blog.  Also, given that Writer is a tool used by bloggers, it’s not worth the risk of the firestorm it would cause if they ever got caught.

I’m most intrigued by the Link-bomb Theory.  It makes a lot of sense that Google would have safeguards in place to catch attempts to influence page ranks.  Google has done good work to prevent comment spammers from influencing page ranks. I don’t know what kind of link-bomb detectors they have in place, but if there are any, then just as junk mail filters sometimes flag real mail as junk, it’s easy to see how a flood of new links to the Writer blog could be seen as a link-bomb.

I’d like to add that I have a contact at Google who is being very responsive about investigating this, and it looks like Scoble is going to ask around on his visit to Google tomorrow too.  So I think we’ll soon have some kind of answer to the mystery.  I’ll post the final answer when we know.

Whatever the answer, I look forward to finding a way for our customers to get an informative overview when searching for the product rather than the “File Download – Security Warning” dialog they are currently getting when clicking the first result. 🙂

Windows Live Writer blog banned from Google? August 30, 2006

Posted by spikew in Google, WindowsLiveWriter.

For almost a week now, I’ve noticed that searching for “Windows Live Writer” on Google no longer brings up the Writer Zone blog on Windows Live Spaces.  I know for a fact that there are way more inbound links to the Writer Zone blog in reference to Windows Live Writer than to anything else.

Here’s a sampling of various search engine results for “Windows Live Writer”.

I’m having a devil of a time getting Google to return the link to the Windows Live Writer blog in almost any search.  Here’s a smattering of searches that pass and fail to bring up the Writer Zone blog:

Now, I’m no conspiracy theorist, so I don’t really believe Google yanked our Writer blog on purpose.  I would certainly never accuse them of trying to bury Writer just because they recently re-announced Writely :-). However, it definitely seems like something happened that caused Writer’s blog to get de-emphasized in their rankings.  If anyone can figure out how that happened, you may have discovered a powerful weapon for de-listing your competitors in Google search results.

Update: Check out the latest theories on Google results for Windows Live Writer.

Update 2: Google searches for Windows Live Writer are fixed!  I’m happy to report that all of the “Windows Live Writer” searches linked above are now successfully returning the Writer Zone blog as the top result.

xmlrpc problems with WordPress installations August 30, 2006

Posted by spikew in WindowsLiveWriter, WordPress.

Dear “Worst episode ever”,

Posting comments on your blog is generating permission denied errors, so I guess I’ll have to comment via a trackback. I hope those are working better 🙂

I’m glad to hear you got Windows Live Writer working. I’ve encountered a few other users who are having problems getting Writer working because their WordPress xmlrpc.php interface is generating bad responses. Can you tell me what you changed just so I can pass the word on to other WordPress users?

—————-

For anyone else encountering XML-RPC errors when connecting Windows Live Writer to a custom WordPress installation, we’ve found a few instances where WordPress plugins installed on the server were causing invalid XML-RPC responses to be generated.  If you are having problems, try disabling all of your WordPress plugins, and see if Writer can connect.  If this fixes Writer, you can try re-enabling each plugin one at a time to find the one that is breaking your XML-RPC interface.  If you find a WordPress plugin that is breaking Writer’s ability to connect, please post a comment here so I can follow up with the plugin developers to resolve the problem.
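If you can capture the raw response body from xmlrpc.php, one quick way to see whether it is well-formed is to hand it to an XML-RPC parser. The following is only a rough sketch in Python, assuming the failure mode is stray output (say, a plugin printing a notice) ahead of the XML. The function name and the sample payloads are made up for illustration, and this is not how Writer itself validates responses.

```python
import xmlrpc.client
from xml.parsers.expat import ExpatError


def is_valid_xmlrpc_response(body: str) -> bool:
    """Return True if body parses as an XML-RPC method response."""
    try:
        xmlrpc.client.loads(body)
    except xmlrpc.client.Fault:
        # A fault is an error report, but it is still well-formed XML-RPC.
        return True
    except (ExpatError, xmlrpc.client.ResponseError):
        # Not parseable as XML, or not shaped like an XML-RPC response.
        return False
    return True


# A well-formed response, generated locally for the example.
GOOD = xmlrpc.client.dumps(("ok",), methodresponse=True)

# Hypothetical breakage: a plugin prints a notice before the XML payload.
BAD = "<br />Notice: something went wrong in a plugin\n" + GOOD
```

Running the check against the response with all plugins disabled, and then again after re-enabling each plugin, would show which one starts corrupting the output.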

Also, any WordPress experts out there with ideas about how to solve Michael’s WordPress problem, please help.

Got "Temporary Post Used For Style Detection"? August 16, 2006

Posted by spikew in WindowsLiveWriter.

As the dust is settling from our launch of Windows Live Writer this week, I’m starting to see a lot of these Writer temporary posts falling out of the blogosphere.

Marc Orchant recently posted a funny (but sadly true) comment about Writer being a litter bug, and Anil Dash commented that it provides a fun way to see evidence of users trying out Writer. Rick Segal is more annoyed and less delicately refers to them as “turds”.

If you don’t know what I’m talking about, take a look:

So what exactly is this temporary post?

One of the really cool features about Windows Live Writer is its ability to edit your blog posts in the same styles that are used on your blog.  In general, this provides a much better WYSIWYG authoring experience since you get a more realistic presentation of what your post will actually look like when it is published.  This makes dealing with the laying out of text and images much easier.

As part of the process of configuring Writer to connect to your blog, Writer creates this temporary post so that it can detect the style-related HTML and CSS associated with your blog posts.  After the style detection is completed, Writer attempts to delete the temporary blog post.  If the blog server responds with a success message, then Writer assumes the post was deleted, and continues with setting up the blog.  If Writer receives a failure message from the blog server, then Writer assumes that the deletion operation was unsuccessful and pops up a warning dialog to let the user know that the temporary post is still on the blog and needs to be manually deleted.
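To make that sequence concrete, here is a toy sketch in Python of the create, detect, delete, warn flow described above. Everything in it is hypothetical: `FakeBlog` is an in-memory stand-in for a blog server, and the "style detection" step is a placeholder, since the real analysis of the blog's HTML and CSS is not covered here.

```python
from dataclasses import dataclass, field

TEMP_TITLE = "Temporary Post Used For Style Detection"


@dataclass
class FakeBlog:
    """In-memory stand-in for a blog server (hypothetical, for illustration)."""
    posts: dict = field(default_factory=dict)
    delete_succeeds: bool = True
    next_id: int = 1

    def new_post(self, title: str, body: str) -> int:
        post_id = self.next_id
        self.next_id += 1
        self.posts[post_id] = (title, body)
        return post_id

    def delete_post(self, post_id: int) -> bool:
        # Report failure without deleting when the server is set to fail.
        if not self.delete_succeeds:
            return False
        self.posts.pop(post_id, None)
        return True


def detect_styles(blog: FakeBlog):
    """Create a temporary post, inspect it, then try to delete it.

    Returns (styles, warning); warning is None when the server reports success.
    """
    post_id = blog.new_post(TEMP_TITLE, "placeholder body")
    styles = {"detected_from_post": post_id}  # placeholder for the real analysis
    if blog.delete_post(post_id):
        return styles, None
    return styles, "The temporary post is still on the blog; please delete it manually."
```

Note that the flow trusts the server's answer: a success reply is taken to mean the post is gone, which is exactly the assumption that breaks down in the cases discussed below.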

So what’s going wrong?

When designing this style-detection feature, we expected there would be some cases where the deletion of the temporary post would fail and wanted to make sure users were clearly warned about the fact that the temporary post was still on their blog.  We also figured there would be some isolated cases where a blog server would respond with a message acknowledging the post had been successfully deleted, but then fail to actually perform the deletion correctly. Unfortunately, the rate of failure for this latter case is higher than we expected to encounter based on testing we did before Writer was released.

We have identified a few ideas about what is causing the problem:

The delayed file-generation effect

Any user who has used Blogger or MovableType is very familiar with the process of regenerating their site when a new post is added.  This is necessary because these blog servers actually update/generate a set of static files for each page of the blog.  Whenever a change is made to the blog (adding/modifying/deleting an entry, changing the blog’s theme, etc), the blog server needs to execute an update of the files that are affected by this change.  In working with customers on this temporary post issue, we have encountered numerous cases where the temporary post was successfully deleted from the blog server’s internal list of entries, but the cached set of static files had not been regenerated.  In those cases, the temporary post will stay on the site until another operation occurs that forces the server to regenerate the site (like publishing a new entry).

This particular manifestation of the problem has been very confusing for users because they see the temporary post on their blog’s main page, but when they go to delete the entry using their blog admin interface (or using the Writer open post dialog), they don’t see the temporary post listed.  To resolve this issue, you just need to log in to your blog admin and regenerate your site.  Since the temporary post is no longer in the list of posts known to the server, it will not be included in the regenerated site.
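The effect is easy to reproduce in a toy model. The sketch below, in Python, keeps a server-side list of entries separate from the "published" static pages; the class and its `rebuild` flag are invented for the example and do not correspond to any particular blog engine.

```python
class StaticBlog:
    """Toy model of a blog engine that serves pre-generated static pages."""

    def __init__(self):
        self.entries = []    # the server's internal list of entries
        self.published = []  # what visitors see (the static files)

    def regenerate(self):
        # Rebuild the static pages from the internal list of entries.
        self.published = list(self.entries)

    def add_entry(self, title):
        self.entries.append(title)
        self.regenerate()

    def delete_entry(self, title, rebuild=True):
        self.entries.remove(title)
        # If the rebuild is skipped or delayed, the static pages go stale.
        if rebuild:
            self.regenerate()
```

Deleting with `rebuild=False` leaves the entry out of `entries` (so the admin interface no longer lists it) but still in `published` until `regenerate()` runs, which matches what users were seeing.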

The ping effect

Ping servers provide a nice mechanism for telling search engines (and other interested services) that a blog has been updated.  When a blog is updated, some blog servers have support for automatically sending a ping notification to these ping servers so they can examine the updated blog content.  Search engines like Technorati will use the ping broadcast as a hint that they should update their search indexes to include the new content that has been posted to your blog.  If your blog is hosted on a blog server that has any kind of delay between the temporary post being deleted and its disappearance from the main site (like the “delayed file-generation effect” described above), the ping effect will increase the chances that your temporary post will get indexed by a search engine.
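For readers who have not seen one, a ping is typically a tiny XML-RPC call. The sketch below builds (but does not send) such a request in Python, assuming the commonly used `weblogUpdates.ping` method, which takes the blog's name and URL. Which ping servers and which method a given blog engine actually uses is an assumption here, and the blog name and URL are placeholders.

```python
import xmlrpc.client


def build_ping_request(blog_name: str, blog_url: str) -> str:
    """Return the XML-RPC request body for a weblogUpdates.ping call."""
    return xmlrpc.client.dumps(
        (blog_name, blog_url),
        methodname="weblogUpdates.ping",
    )


# Placeholder values for illustration only.
payload = build_ping_request("Example Blog", "http://example.com/")
```

The ping carries no content of its own. It only invites the recipient to come and crawl the blog, which is why a post that exists for just a few moments can still end up indexed.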

The FeedBurner effect

FeedBurner is a very cool service that will act as a proxy for your blog’s RSS feed so that it can gather and report statistics about the usage patterns of your feed.  I don’t have exact insights into the way FeedBurner works, but in trying to simulate some of the temporary post scenarios, it seems like FeedBurner will cache your RSS feed until it detects that a new entry has appeared in your feed.  In my ad-hoc experimentation, I found that if a request was made for my FeedBurner feed while Writer was processing my blog styles (meaning the temporary post was still present), FeedBurner actually managed to detect and cache the RSS entry about my temporary post in its RSS feed.  Unfortunately, after the temporary post was then deleted, the FeedBurner RSS feed still contained the entry for the temporary post.  The bummer here is that this creates a window of time where the temporary post can then appear in the RSS feed even though it no longer exists on the weblog.  Again, my ad-hoc experimentation seemed to indicate that this temporary post would remain in my feed until the next new post was made to my blog.  FeedBurner is just one example of a service taking advantage of a blog’s RSS feed that might cache feed data and inadvertently propagate these temporary posts after they have been deleted. This caching behavior is generally not a problem for normal blog usage; however, it creates an unfortunate amplification effect for the temporary posts that Writer uses for style detection.
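The behavior I observed can be modeled with a very small cache. This Python sketch is purely a guess at the observable behavior (refresh only when a previously unseen entry shows up) and says nothing about how FeedBurner is really implemented; the class name and the `origin` callable are invented for the example.

```python
class CachingFeedProxy:
    """Toy feed proxy that refreshes its cache only when a new entry appears."""

    def __init__(self, origin):
        self.origin = origin  # callable returning the current list of entry titles
        self.cached = []

    def fetch(self):
        current = self.origin()
        # Refresh only when the origin shows an entry we have not seen yet;
        # a deletion on its own does not invalidate the cache.
        if any(entry not in self.cached for entry in current):
            self.cached = list(current)
        return list(self.cached)
```

With this model, a fetch that happens while the temporary post exists caches it, later fetches keep returning it after the deletion, and it only drops out once a genuinely new post forces a refresh.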

So what’s the plan going forward?

We have been following up with users who have expressed concerns about these temporary posts, but unfortunately, we don’t have a better approach up our sleeves for automatic style detection at this time. 

For the next beta, we plan to continue improving the robustness of our detection of orphaned temporary posts so that we can more accurately notify users if the posts are still visible on their blogs after a successful deletion. We are also planning to include a prompt in the blog setup wizard before performing the style detection to allow users to opt out of style detection (and therefore avoid the temporary post). Finally, we will continue digging into the most common environments where this problem is occurring to see what we can do to reduce the occurrence.
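As a rough illustration of what checking for an orphaned post could look like, the Python sketch below re-reads a set of public pages after a deletion was reported as successful and looks for the temporary post's title. This is only one possible approach, not a description of what Writer does or will do; `fetch_page` is a hypothetical callable that returns a page's text for a URL, and the URLs are placeholders.

```python
TEMP_POST_TITLE = "Temporary Post Used For Style Detection"


def temp_post_still_visible(fetch_page, urls):
    """Return the URLs whose content still mentions the temporary post."""
    return [url for url in urls if TEMP_POST_TITLE in fetch_page(url)]


# Placeholder pages standing in for a blog's home page and feed.
pages = {
    "http://example.com/": "<h2>Temporary Post Used For Style Detection</h2>",
    "http://example.com/feed": "<item><title>Hello</title></item>",
}
stale = temp_post_still_visible(pages.get, list(pages))
```

Any URL that comes back in the list is a place where the user could be warned that the post is still showing, even though the server said it was deleted.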

Given the wide range of hosted and custom-installed blogging engines in use these days, it’s unrealistic to expect that this will ever be completely resolved as long as it is necessary for Writer to make a temporary post in order to accurately detect the styles of a blog.  It would be great if, over time, we could help with new initiatives for evolving the discovery and publishing hooks exposed in blogging engines for tools like Writer and BlogJet to use to improve end-user authoring experiences and reduce these types of unfortunate annoyances.

That concludes this detailed lesson in the mysterious temporary posts. Anyone with further insights, suggestions, or questions, please feel free to let me know here or on the Writer blog.

