<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Stuff Yaron Finds Interesting &#187; SOA/Web/Etc.</title>
	<atom:link href="http://www.goland.org/category/technology/soawebetc/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.goland.org</link>
	<description>Technology, Politics, Food, Finance, etc.</description>
	<lastBuildDate>Fri, 05 Feb 2010 22:34:21 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>The CAP theorem and modern data centers &#8211; for now, choose consistency!</title>
		<link>http://www.goland.org/cap/</link>
		<comments>http://www.goland.org/cap/#comments</comments>
		<pubDate>Fri, 05 Feb 2010 22:33:39 +0000</pubDate>
		<dc:creator>Administrator</dc:creator>
				<category><![CDATA[SOA/Web/Etc.]]></category>
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.goland.org/?p=726</guid>
		<description><![CDATA[The dominance of the commodity machine model for data centers is
so complete that one forgets that there was ever any other viable
choice. But IBM, for one, is still selling lots of mainframes.
Nevertheless the world I live in is built on top of data centers that
contain a lot of commodity class machines. These machines have a
nasty [...]]]></description>
			<content:encoded><![CDATA[<p>The dominance of the commodity machine model for data centers is
so complete that one forgets that there was ever any other viable
choice. But IBM, for one, is still selling lots of mainframes.
Nevertheless the world I live in is built on top of data centers that
contain a lot of commodity class machines. These machines have a
nasty habit of failing on a fairly regular basis. So when I think
about the CAP theorem I think about it in the context of a data
center filled with a bunch of not completely reliable boxes.</p>
<p>In that case partition tolerance (which, as I explain below, ends
up meaning tolerance of machine failure) is a requirement.  So in
designing frameworks for the data centers I work with, the CAP theorem
leaves me with exactly two choices: do I want consistency
or availability?</p>
<p>My belief is that the vast majority of developers, at least
for the immediate future, need to choose consistency.</p>
<span id="more-726"></span>
<h2>Um... what's the CAP theorem?</h2>
<p>The CAP theorem is explained and proven in this <a HREF="http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.20.1495&amp;rep=rep1&amp;type=pdf">theorem
paper</a> which is reasonably approachable. There are plenty of
articles online that summarize the CAP theorem so I'm not going to
write another one. I do however want to point out that the terms used
in the CAP theorem (consistency, availability and partition tolerance)
don't mean what the plain meaning of the words implies.</p>
<p>Consistency is closest to its normal meaning but as the theorem
paper points out it might be easier to think of this as meaning
atomic and consistent in the AC part of ACID sense.</p>
<p>Availability, as I explore later, doesn't mean availability of the
entire system but rather availability of a particular piece of
information to be read or written to.</p>
<p>Partition tolerance is a tiny bit tricky. Its plain meaning, i.e.
dealing with what happens when machines can't talk to each other, is
part of the definition. But it also encompasses what happens if a
machine fails. After all, if machine A is trying to talk to machine B
and machine A isn't getting a response it's irrelevant if the
response didn't come because machine B failed or because there is a
network partition. The message didn't get where it was supposed to,
therefore the communication failure, from the perspective of the CAP
theorem, is modeled as a network partition.</p>
<p>The CAP theorem says that of the previous three system qualities,
consistency, availability and partition tolerance, we only get to
choose two. (Wait, did I just summarize the CAP theorem? D'oh!)
Therefore when designing distributed systems I have three choices:
consistent/available, consistent/partition tolerant or
available/partition tolerant. I explore all three choices below.</p>
<h2>Consistent and Available - Not an option</h2>
<p>In an ideal world I would like all my service's data to be
consistent and available. But CAP says I only get that if I'm willing
to essentially fail if there is a network partition and as previously
discussed a network partition also includes machine failure.</p>
<p>And to be fair the consistent/available option is actually pretty
common. Anyone who is running a single box that hosts their database
is choosing this option. So long as the box is up their data is
consistent and available but if it (or its network tap) goes down
then that's that until the box gets fixed.</p>
<p>But as I mentioned above I come into this situation with a
dependency on data centers filled with commodity machines that tend
to fail on a pretty frequent basis. So wishing away machine failures
(or even network failures which, although rarer, do happen not
infrequently) is a non-starter. 
</p>
<p>So I have no choice: whatever design I use must be partition
tolerant (read: keep working in the face of machine failure). So
choosing consistency and availability over partition tolerance isn't
a choice available to me.</p>
<h2>Consistent and Partition Tolerant - Easy to program to</h2>
<p>I'm primarily in the development platform business. I build
platforms that other people use to build their software. So I spend a
lot of time worrying about abstractions that my customers can easily
understand and live with. The model most programmers are most
familiar with is one in which the world is 'consistent'. By which I
mean that when one wants to change system state one can do so and
either all the changes happen or they don't. Furthermore when someone
comes along to read values they will see the changes that have been
made. This is a world that is pretty easy to reason about.</p>
<p>But if I want to offer consistency and if, as I have previously
argued, I must have partition tolerance, then CAP says I have to give
up availability. Which might seem nuts. Who the heck wants a system
that isn't available? But remember, availability is not about the
global state of the system, it's about pieces of state that have to
be mutually consistent.</p>
<p>Imagine you are building a website for your car rental company.
You want to host the website in the cloud to save money and reduce
time to market. You come to me looking for storage infrastructure and
I say &quot;Hey, look, 99.99% of the time when your customer comes to
the website they will be able to access their data, put in an order
for a rental car, see what cars they have rented, etc. but 0.01% of
the time the customer request will fail and btw, typically that
failure will resolve itself within a minute or two.&quot;</p>
<p>To most businesses this is a fine trade off. Programming and
maintaining programs that expect consistency is an order of magnitude
less work than dealing with the lack of consistency (see the next
section). So a small number of failures that are quickly resolved is
probably an acceptable trade off.</p>
<h2>Available and Partition Tolerant - Takes a licking and keeps on
ticking</h2>
<p>Still, some companies, most famously Amazon with its <a HREF="http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf">Dynamo</a> system,
take availability very seriously. They don't ever want a customer
coming to their website and being told 'sorry, we can't help you right now,
try again later.' They have done the math and figured out that even
rare failures, due to the enormous number of customers Amazon deals
with, were costing them real money. So Amazon was willing to deal
with the implementation headaches of reducing consistency in return
for getting higher availability.</p>
<p>Imagine we are using the previous system which chooses consistency
over availability. Let's say that Andres wants to rent a car. He
comes to the car rental site. The front end machine tries to access
Andres's rental records which are kept on machine Alpha (in reality
this would probably be a group of machines using a quorum protocol
with an elected master). But Alpha isn't available (i.e. the master
has died and the system is in the process of elevating another member
of the quorum to master, or enough machines have died or been
partitioned that quorum is lost). So the website has no choice but
to say &quot;please try again later&quot; until Alpha (or really the
quorum) can be brought back online.</p>
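<p>The behavior of the consistent system can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the <code>QuorumStore</code> class, its replica names and its single in-memory dictionary are made up for illustration, and real quorum protocols (master election, replication) are not modeled. The only point is that once a majority of replicas is unreachable the store refuses requests rather than risk handing out inconsistent answers.</p>

```python
class QuorumStore:
    """Toy consistent store: a request succeeds only while a majority of replicas is reachable."""

    def __init__(self, replica_names):
        self.up = {name: True for name in replica_names}
        self.data = {}  # one logical copy; replication itself is not modeled

    def has_quorum(self):
        reachable = sum(1 for is_up in self.up.values() if is_up)
        return reachable * 2 > len(self.up)

    def write(self, key, value):
        if not self.has_quorum():
            raise RuntimeError("please try again later")
        self.data[key] = value

    def read(self, key):
        if not self.has_quorum():
            raise RuntimeError("please try again later")
        return self.data.get(key)


store = QuorumStore(["a1", "a2", "a3"])
store.write("andres", ["order-1"])

store.up["a1"] = False                  # one replica down: 2 of 3 is still a majority
still_readable = store.read("andres")

store.up["a2"] = False                  # quorum lost: the store refuses instead of guessing
try:
    store.read("andres")
    refused = False
except RuntimeError:
    refused = True
```

<p>With one replica down Andres's record is still served. With two down every request for it fails until quorum returns, which is exactly the small slice of unavailability being traded for consistency.</p>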
<p>Now let's look at a world with a lower level of consistency.
Andres comes to the website and the front end machine tries to get to
machine Alpha and fails. But rather than sending Andres away the
front end machine looks for another back end machine, let's call it
Beta, and asks it to handle Andres's rental records. Beta agrees. So
Andres goes through the rental process and rents a car. All this
information is recorded on machine Beta.</p>
<p>In the meantime it turns out that Beta failed before Alpha came
back up so the information that Beta had about Andres is currently
offline. Andres navigates back to the website to check on something
about his order. The front end machine goes looking for machines that
know about Andres and finds Alpha, which is now back up. Much to his
surprise Andres is now shown that he has no rental order! After all,
Beta never told Alpha about the order and Beta is currently down. We
have a data inconsistency.</p>
<p>Andres, frustrated by this, puts in a second order and leaves.
Meanwhile Beta comes back up and finds Alpha. Now there is some
confusion. Both Beta and Alpha have orders from Andres for a car. Did
Andres mean to rent two cars? Is the newer order a replacement for
the older order? What should the system do?</p>
<p>All of these problems are solvable. It just takes very careful
thought about all the possible failure states and code that can
identify and resolve those failure states. The process of taking
inconsistent data and making it consistent over time as failed
systems come back on-line and share what they know is called
'<a HREF="http://queue.acm.org/detail.cfm?id=1466448">eventual
consistency</a>'.</p>
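<p>The Alpha/Beta story above can be replayed as a small simulation. Everything here is an illustrative assumption rather than how any real system works: the <code>Replica</code> class, the "first reachable replica wins" routing and the <code>reconcile</code> function that simply unions the two sets of orders are all hypothetical. The sketch only shows how choosing availability lets the two machines diverge and why somebody still has to decide what the merged result means.</p>

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.orders = {}  # customer -> set of order ids


def pick_replica(replicas):
    """Return the first reachable replica: availability is chosen over consistency."""
    for replica in replicas:
        if replica.up:
            return replica
    raise RuntimeError("no replica reachable")


def place_order(replicas, customer, order_id):
    replica = pick_replica(replicas)
    replica.orders.setdefault(customer, set()).add(order_id)
    return replica.name


def view_orders(replicas, customer):
    return set(pick_replica(replicas).orders.get(customer, set()))


def reconcile(a, b):
    """Union both replicas' orders; return the customers whose records had diverged."""
    diverged = {}
    for customer in set(a.orders) | set(b.orders):
        left = a.orders.get(customer, set())
        right = b.orders.get(customer, set())
        merged = left | right
        if left != right:
            diverged[customer] = merged  # needs a business rule or a human to resolve
        a.orders[customer] = set(merged)
        b.orders[customer] = set(merged)
    return diverged


alpha, beta = Replica("Alpha"), Replica("Beta")
replicas = [alpha, beta]

alpha.up = False
first = place_order(replicas, "andres", "order-1")    # Alpha is down, so Beta takes it
beta.up, alpha.up = False, True
seen = view_orders(replicas, "andres")                 # Alpha knows nothing about order-1
second = place_order(replicas, "andres", "order-2")   # the frustrated second order
beta.up = True
conflicts = reconcile(alpha, beta)
```

<p>After reconciliation both machines agree that Andres has two orders, but the code cannot tell whether he wanted two cars or one. That decision is the part that takes careful thought.</p>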
<p>Eventual consistency is an incredibly powerful mechanism for
making services more resilient but it isn't free. Modeling and
dealing with the potential problems are non-trivial. Much like the
inappropriate optimism around <a HREF="http://www.goland.org/optimisticconcurrency/">optimistic
concurrency</a> I suspect that in practice most implementers would do
well to stay away from eventual consistency frameworks. At least
until they can be reduced to well understood design patterns (a la
Amazon's shopping cart example). 
</p>]]></content:encoded>
			<wfw:commentRss>http://www.goland.org/cap/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Recovering from self inflicted data corruption &#8211; a summary</title>
		<link>http://www.goland.org/recovering-from-self-inflicted-data-corruption-a-summary/</link>
		<comments>http://www.goland.org/recovering-from-self-inflicted-data-corruption-a-summary/#comments</comments>
		<pubDate>Sat, 02 Jan 2010 06:03:23 +0000</pubDate>
		<dc:creator>Administrator</dc:creator>
				<category><![CDATA[SOA/Web/Etc.]]></category>

		<guid isPermaLink="false">http://www.goland.org/?p=687</guid>
		<description><![CDATA[ Of late I have been torturing myself about the question of - even if I build on top
of a highly reliable storage service like Windows Azure Table Service do I still
need to worry about backups, versioning, journals and such? The answer
would seem to be, yes, I do. Mostly because even if the table store [...]]]></description>
			<content:encoded><![CDATA[ Of late I have been torturing myself about the question of - even if I build on top
of a highly reliable storage service like Windows Azure Table Service do I still
need to worry about backups, versioning, journals and such? The answer
would seem to be, <a href="http://www.goland.org/do-i-need-to-backupjournal-my-windows-azure-table-store/" >yes, I do</a>. Mostly because even if the table store works
perfectly, I&#8217;m still going to have bugs I introduced that are going to hork my
data.
<!--l. 38--><p class="indent" >   In fact what I specifically need to do is:
     <ol class="enumerate1" >
     <li class="enumerate" id="x1-3x1">Lobby the Windows Azure Table Storage team to add <a href="http://www.goland.org/techniques-to-ease-recovering-from-self-inflicted-data-corruption/#x1-80002" >undelete for tables</a>
     so if I accidentally blow away one of my tables I have some hope (oh and
     ACL&#8217;s would be nice too)
     </li>
     <li class="enumerate" id="x1-5x2">Be <a href="http://www.goland.org/techniques-to-ease-recovering-from-self-inflicted-data-corruption/#x1-90003" >very careful</a> about how I update my schemas
     </li>
     <li class="enumerate" id="x1-7x3">Implement a <a href="http://www.goland.org/buildingacommandjournal/" >command journal</a> (and be clear about its <a href="http://www.goland.org/thelimitsofcommandjournals/" >limitations</a>)
     </li>
     <li class="enumerate" id="x1-9x4">If time permits <a href="http://www.goland.org/tombstoneazuretablestore/" >implement tombstoning</a>
     </li>
     <li class="enumerate" id="x1-11x5">If I&#8217;m feeling really wacko implement my own <a href="http://www.goland.org/aversionedwindowsazuretablestore/#x1-50004" >versioning system</a> on top of
     the table store (or just <a href="http://www.goland.org/aversionedwindowsazuretablestore/#x1-30002" >backups</a> if I&#8217;m feeling only slightly wacko)
     </li>
     <li class="enumerate" id="x1-13x6">Put into place a <a href="http://www.goland.org/techniques-to-ease-recovering-from-self-inflicted-data-corruption/" >realistic plan</a> to take advantage of all these features while
     keeping in mind the <a href="http://www.goland.org/thelimitsofrecovery/" >limitations</a> of these techniques.</li></ol>
<!--l. 56--></p><p class="noindent" >The links in the previous text are to the other articles in this series that I wrote for my
blog. Those articles are:
                                                                  

                                                                  
     <ul class="itemize1">
     <li class="itemize"><a href="http://www.goland.org/do-i-need-to-backupjournal-my-windows-azure-table-store/" >Do I need to backup/journal my Windows Azure Table Store?</a>
     </li>
     <li class="itemize"><a href="http://www.goland.org/thelimitsofcommandjournals/" >The Limits of Command Journals</a>
     </li>
     <li class="itemize"><a href="http://www.goland.org/techniques-to-ease-recovering-from-self-inflicted-data-corruption/" >Techniques to Ease Recovering from Self Inflicted Data Corruption</a>
     </li>
     <li class="itemize"><a href="http://www.goland.org/buildingacommandjournal/" >Thoughts on implementing a command journal</a>
     </li>
     <li class="itemize"><a href="http://www.goland.org/tombstoneazuretablestore/" >Tombstoning on top of Windows Azure Table Store</a>
     </li>
     <li class="itemize"><a href="http://www.goland.org/thelimitsofrecovery/" >The limits of recovering from application logic failures</a>
     </li>
     <li class="itemize"><a href="http://www.goland.org/aversionedwindowsazuretablestore/" >Implementing Versioning in Windows Azure Table Store</a></li></ul>
    </p>]]></content:encoded>
			<wfw:commentRss>http://www.goland.org/recovering-from-self-inflicted-data-corruption-a-summary/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Implementing Versioning in Windows Azure Table Store</title>
		<link>http://www.goland.org/aversionedwindowsazuretablestore/</link>
		<comments>http://www.goland.org/aversionedwindowsazuretablestore/#comments</comments>
		<pubDate>Sat, 02 Jan 2010 05:43:57 +0000</pubDate>
		<dc:creator>Administrator</dc:creator>
				<category><![CDATA[SOA/Web/Etc.]]></category>

		<guid isPermaLink="false">http://www.goland.org/?p=683</guid>
		<description><![CDATA[In a previous article I argued that I needed some kind of
journaling/backup for my Windows Azure Tables in order [...]]]></description>
			<content:encoded><![CDATA[     <!--l. 31--><p class="indent" >    <span class="aer-9">In a </span><a href="http://www.goland.org/do-i-need-to-backupjournal-my-windows-azure-table-store/" ><span class="aer-9">previous article</span></a> <span class="aer-9">I argued that I needed some kind of</span>
     <span class="aer-9">journaling/backup for my Windows Azure Tables in order to handle my</span>
     <span class="aer-9">own screw ups. In this article I re-examine the value of versioning for</span>
     <span class="aer-9">recovering from self inflicted data corruption. I also discuss backups as a possible</span>
     <span class="aer-9">substitute for versioning, look at what versioning might look like if added</span>
     <span class="aer-9">as a native feature of Windows Azure Table Store, and finish up by</span>
     <span class="aer-9">proposing a design that would let me implement versioning on top of</span>
     <span class="aer-9">Windows Azure Table Store.</span>
</p><p>This article is part of a series. Click <a href="http://www.goland.org/recovering-from-self-inflicted-data-corruption-a-summary/">here</a> to see a summary and the complete list of articles in the series.</p>
<span id="more-683"></span>
       <h3 class="likesectionHead"><a id="x1-1000"></a><span class="aer-9">Contents</span></h3>
       <div class="tableofcontents">
       <span class="sectionToc" ><span class="aer-9">1 </span><a href="#x1-20001" id="QQ2-1-2"><span class="aer-9">The value of versioning</span></a></span>
     <br />     <span class="sectionToc" ><span class="aer-9">2 </span><a href="#x1-30002" id="QQ2-1-3"><span class="aer-9">What about backups?</span></a></span>
     <br />     <span class="sectionToc" ><span class="aer-9">3 </span><a href="#x1-40003" id="QQ2-1-4"><span class="aer-9">Imagining a versioned Windows Azure Table Store</span></a></span>
     <br />     <span class="sectionToc" ><span class="aer-9">4 </span><a href="#x1-50004" id="QQ2-1-5"><span class="aer-9">In place versioning on top of the table store</span></a></span>
       </div>

                                                                  

                                                                  
<!--l. 43--><p class="noindent" >
   <h3 class="sectionHead"><span class="titlemark">1   </span> <a id="x1-20001"></a>The value of versioning</h3>
<!--l. 45--></p><p class="noindent" >The value of versioning in recovering from application errors has already been
covered <a href="http://www.goland.org/techniques-to-ease-recovering-from-self-inflicted-data-corruption/" >here </a>and <a href="http://www.goland.org/thelimitsofrecovery/" >here</a>. But to summarize - when it hits the fan versioning
can help one figure out if the original damage has been compounded by
subsequent changes. Furthermore versioning, by providing the outcome of a
command, lets one examine what happened in the past with less baggage
than the &#8217;replay&#8217; issues that plague <a href="http://www.goland.org/thelimitsofcommandjournals/" >command
journals</a>.
<!--l. 54--></p><p class="indent" >   Versioning is also useful as a last ditch &#8217;go back in time&#8217; mechanism where
if the damage is just too great to repair at least the system can provide
the option of turning back the clock to some better state. Although one
shouldn&#8217;t overstate the utility of this feature. In non-trivial cases there will
be a variety of side effects of &#8217;turning back the clock&#8217; that will be hard to
control and the clock can&#8217;t go too far back or issues with schema changes,
functionality changes, etc. come into play. Many of the same issues with replaying
command journals apply to using versioning as an emergency escape hatch to the
past.
<!--l. 64--></p><p class="indent" >   So while versioning is useful, I suspect that command journals and tombstones in
the average case probably provide the most bang for the buck. My real hope is that
systems like Windows Azure Table Store will offer versioning as a feature
so the cost and complexity of taking advantage of versioning will go way
down.
<!--l. 71--></p><p class="noindent" >
   <h3 class="sectionHead"><span class="titlemark">2   </span> <a id="x1-30002"></a>What about backups?</h3>
<!--l. 73--></p><p class="noindent" >As discussed below implementing versioning on top of the Windows Azure
Table Store, while not brain surgery, isn&#8217;t trivial either. A much simpler
technique would be to regularly backup tables. This can be done in the
background without having to interfere with normal operations. So it&#8217;s less
risky.
<!--l. 79--></p><p class="indent" >   Backups work using snapshots. At regular intervals the table is read in (typically
with a filter that ignores values that haven&#8217;t changed since the last snapshot) and a
snapshot created. Unfortunately snapshots miss things. If a value is changed
multiple times between snapshots then the intermediate values will not be
recorded.
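</p><p class="indent" >   The gap can be made concrete with a small simulation. This is only a sketch under assumed names: the <code>run</code> function, the integer timestamps and the sample values are all hypothetical, and a single value stands in for a whole table.</p>

```python
def run(writes, snapshot_times):
    """Apply (time, value) writes in order and record the current value at each snapshot time."""
    snapshots = {}
    current = None
    pending = sorted(snapshot_times)
    for time, value in sorted(writes):
        # Take every snapshot that is scheduled before this write happens.
        while pending and time > pending[0]:
            snapshots[pending.pop(0)] = current
        current = value
    for snapshot_time in pending:
        snapshots[snapshot_time] = current
    return snapshots


writes = [(1, "good"), (12, "buggy"), (18, "overwritten")]
snapshots = run(writes, snapshot_times=[10, 20])
captured = set(snapshots.values())
```

<p class="indent" >   The snapshots at times 10 and 20 capture &#8217;good&#8217; and &#8217;overwritten&#8217;. The &#8217;buggy&#8217; value written at time 12 never appears in any backup.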
<!--l. 85--></p><p class="indent" >   This leads to situations where if a buggy command is given between
snapshots and then the buggy value is overwritten just before the snapshot I
have no way of knowing what the original value was unless I can replay the
command (which is tricky and assumes that the value produced by the bug
is predictable). This makes it more or less impossible to handle the <a href="http://www.goland.org/thelimitsofrecovery/" >put
syndrome</a> since I can&#8217;t see if the same value was written twice or a new value
written.
<!--l. 94--></p><p class="indent" >   For similar reasons backups are not useful when dealing with the <a href="http://www.goland.org/thelimitsofrecovery/" >etag syndrome</a>
                                                                  

                                                                  
since it&#8217;s at best just luck if the snapshot happens to have captured the correct
system state at the time the command was executed.
<!--l. 98--></p><p class="indent" >   Also backups don&#8217;t deal at all well with deletes. Unless one copies the
entire table during every snapshot (a rather expensive proposition) then any
deleted records will be missed. So if one is going to implement delta based
snapshots (e.g. just copying things changed since last snapshot) then one
also needs to implement <a href="http://www.goland.org/tombstoneazuretablestore/" >tombstones</a> and backup the tombstone table as
well.
<!--l. 105--></p><p class="indent" >   If a transaction is in progress during a snapshot then only the parts of the
transaction that occurred before the snapshot will be captured, those coming after
will be missed until the next snapshot. So restoring from the most recent snapshot
means restoring the system to an inconsistent state. While inconsistency
happens anyway in loosely coupled systems, it&#8217;s one thing for a user to issue
a command that fails in a bad way, something the user is generally told
about. It&#8217;s another thing for the system at some point to just &#8217;shift state&#8217; to
some previous, inconsistent, point and users are then told to pick up the
mess.
<!--l. 115--></p><p class="indent" >   Still, for all of that, at least backups offer some hope of turning back the
clock in the case of hopeless data corruption so perhaps they do have some
value.
<!--l. 120--></p><p class="noindent" >
   <h3 class="sectionHead"><span class="titlemark">3   </span> <a id="x1-40003"></a>Imagining a versioned Windows Azure Table Store</h3>
<!--l. 122--></p><p class="noindent" >Versioning tends to come in two flavors, linear and non-linear. My belief is that
Windows Azure Table Store only needs linear versioning. My reasoning is that if one
looks at the table store using the lens of the <a href="http://www.julianbrowne.com/article/viewer/brewers-cap-theorem" >CAP Theorem</a> then one notices that
Windows Azure Table Store focuses on availability and consistency. If one is willing
to give up partition tolerance (as Azure Table Store has) then most of the use
cases for non-linear versioning go away. It is possible in a consistent system to enforce
an order, even without locking, thanks to optimistic concurrency which the table
store supports.
<!--l. 133--></p><p class="indent" >   So if the table store supported linear versioning then the experience would be that
every write would cause a new version of a particular row to come into existence, I&#8217;ll
call that the tip version.
<!--l. 137--></p><p class="indent" >   All existing store commands would work exactly as they do now but would only
apply to the tip version and in the case of POST and PUT would create new tip
versions. The delete command would create a tombstone entry stating that the row
was deleted. The tombstone entry would be invisible to all the existing Windows
Azure Table commands.
<!--l. 143--></p><p class="indent" >   I don&#8217;t think that check-in/check-out semantics are appropriate to a highly
distributed system like the table store so the commands available to a versioning
aware client would actually be quite limited. I would add a way to specify a version
in the URL of a row (say with a query parameter) as well as a query to
include versions in the output of a table query. Finally I would support
                                                                  

                                                                  
the ability to destroy (as in delete without trace) a row. I don&#8217;t know that
much more than that is really needed in terms of interacting with older
versions.
<!--l. 153--></p><p class="noindent" >
   <h3 class="sectionHead"><span class="titlemark">4   </span> <a id="x1-50004"></a>In place versioning on top of the table store</h3>
<!--l. 155--></p><p class="noindent" >[Disclaimer: The following is more of a mental exercise. I haven&#8217;t had time to actually
mock this up and make sure all the details are right.]
<!--l. 159--></p><p class="indent" >   Right now Windows Azure Table Store doesn&#8217;t offer versioning so I&#8217;ve given
some thought to how I might implement versioning myself on top of the table store.
The services I work on that use the table store tend to be high read and low write. So
I want an approach to adding versioning to the table store that places the cost of
versioning more on writes than reads. I also want an approach that is more or
less guaranteed to produce consistent output. That is, I don&#8217;t want to end
up in a situation where the state of my production tables and my version
history are out of whack. The whole point of introducing versioning is that
it&#8217;s correct and complete so I can reason about certain things that would
otherwise be hard to do. If I can&#8217;t get consistency in my version store I might as
well use backups which at least are simpler to implement. Thankfully the
table store provides the features to meet all of my requirements including
consistency.
<!--l. 174--></p><p class="indent" >   The approach I would use is in place versioning. That is, the most current
version of a row (referred to as the tip) and its previous versions all live in the
same partition in the same table. This is the opposite of the approach I used
with tombstones because in the case of tombstones consistency wasn&#8217;t a
problem.
<!--l. 180--></p><p class="indent" >   In the in place versioning approach the tip version of any row will have whatever
partition key/row key it is supposed to have plus the prefix &#8221;tip&#8221; on the row key. This
means that anytime I want to interact with the tip version of a row I just generate
the expected partition key/row key and add in &#8221;tip&#8221; as a prefix on the row key. This
makes reads fast.
<!--l. 187--></p><p class="indent" >   Every row I&#8217;m versioning will contain a version ID which is a monotonically
increasing integer. The first time I create the &#8221;tip&#8221; version of a row (i.e. when the row
is first created) I will give it the version number 0. When updating a row I
will copy the old value and give it the prefix &#8221;old&#8221;. Then I&#8217;ll update the tip
version and increment its version number. The key to consistency with an in
place versioning approach is that it&#8217;s possible to both create the old version
and update the tip atomically. The table store&#8217;s <a href="http://msdn.microsoft.com/en-us/library/dd894038.aspx" >entity group transaction</a>
mechanism is guaranteed to be atomic and so can be used to solve exactly this
problem.
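</p><p class="indent" >   A minimal sketch of that update step follows. It does not use the real table store API: <code>FakePartition</code> is a hypothetical in-memory stand-in whose <code>batch</code> method mimics the all-or-nothing behavior of an entity group transaction. The naming scheme for old rows (&#8217;old&#8217; plus the row key plus a zero padded version number) is also an assumption, since each superseded version needs its own row key. Etag/if-match checks are left out to keep the example short.</p>

```python
import copy


class FakePartition:
    """In-memory stand-in for one table partition; NOT the Azure SDK."""

    def __init__(self):
        self.rows = {}  # row key -> dict of columns

    def batch(self, operations):
        """Apply every (row_key, columns) upsert or none of them."""
        staged = copy.deepcopy(self.rows)
        for row_key, columns in operations:
            if columns is None:
                raise ValueError("bad operation for " + row_key)
            staged[row_key] = dict(columns)
        self.rows = staged  # commit only after every operation succeeded


def create_row(partition, row_key, columns):
    # The first tip version of a row gets version number 0.
    partition.batch([("tip" + row_key, dict(columns, version=0))])


def update_row(partition, row_key, new_columns):
    tip_key = "tip" + row_key
    tip = partition.rows[tip_key]
    # Hypothetical naming scheme: one "old" row per superseded version.
    old_key = "old" + row_key + "_" + format(tip["version"], "010d")
    new_tip = dict(new_columns, version=tip["version"] + 1)
    # Copying the old value and replacing the tip happen in one atomic batch.
    partition.batch([(old_key, tip), (tip_key, new_tip)])


partition = FakePartition()
create_row(partition, "car42", {"status": "available"})
update_row(partition, "car42", {"status": "rented"})
update_row(partition, "car42", {"status": "returned"})
keys = sorted(partition.rows)
```

<p class="indent" >   After the two updates the partition holds two &#8217;old&#8217; rows (versions 0 and 1) and one tip row at version 2. Because both writes of an update go through a single batch, the history and the tip cannot drift apart.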
<!--l. 198--></p><p class="indent" >   To version enable a table store table I need either to build a proxy or a library.
My guess is that I would use a library to save the processing and network time of a
proxy but what is really nice about a proxy is that I can use the proxy as a lock
down mechanism. I can make sure nobody but the proxy has the key to the table so
                                                                  

                                                                  
if someone doesn&#8217;t go through the proxy then they don&#8217;t get access to the
data. That alone will prevent tons of bugs. By having a single proxy I can
also more easily control issues like versioning of the proxy code which deals
with a whole other set of bugs. But proxies do demand both a processing
and latency cost so I have to consider that in deciding between proxies and
libraries.
<!--l. 210--></p><p class="indent" >   The following goes through the standard methods in their non-version aware form
and explains how their behavior would change if one was using a version aware
library/proxy to interact with the table store using an in place versioning
approach.
     <dl class="description"><dt class="description">
<span class="aebx-10">GET</span> </dt><dd class="description">If the query contains a filter that specifies a rowkey then prefix the rowkey
     value(s) with &#8221;tip&#8221;. In all cases add the filter argument (if it doesn&#8217;t already
     exist) of &#8221;rowkey ge &#8217;tip&#8217;&#8221;. This will filter out everything but tip versions
     of rows since every key with the &#8217;old&#8217; prefix sorts before &#8217;tip&#8217;. (A filter of
     &#8221;rowkey gt &#8217;old&#8217;&#8221; is not enough because a key such as &#8217;oldfoo&#8217; also sorts after &#8217;old&#8217;.)
     </dd><dt class="description">
<span class="aebx-10">DELETE</span> </dt><dd class="description">A GET is needed to retrieve the current &#8217;tip&#8217; version. If none exists
     then the request should fail since there is nothing to delete. If the tip
     version does exist then create an entity group transaction that includes
     creating a new row to act as the tombstone with a column &#8217;tombstone&#8217; set
     to true as well as a delete command for the current tip that includes the
     etag from the GET in an if-match header.
     </dd><dt class="description">
<span class="aebx-10">PUT</span> </dt><dd class="description">First retrieve the existing &#8217;tip&#8217; version (using an etag if one was provided
     in an if-match or equivalent header). If there isn&#8217;t one then the resource
     doesn&#8217;t exist or has been deleted and so the request should fail. If the &#8217;tip&#8217;
     version exists then an entity group transaction is needed to update the tip
     version as previously described but use if-match with the etag retrieved
     from the original GET.
     </dd><dt class="description">
<span class="aebx-10">MERGE</span> </dt><dd class="description">The logic is the same as PUT for all intents and purposes. It&#8217;s just
     that values not specified in the MERGE request have to be retrieved from
     the soon to be replaced &#8217;tip&#8217; version to create the &#8217;old&#8217; prefixed copy.
     </dd><dt class="description">
<span class="aebx-10">POST</span> </dt><dd class="description">Check to see if a tip version exists. If so, then fail. If not then check to
     see if there is a tombstone. If so then issue the POST request with the
     version number set to an increment of the number in the tombstone. If
     there is no tombstone then the version number is 0 and row key will have
     &#8217;tip&#8217; added as a prefix.
     </dd><dt class="description">
<span class="aebx-10">Entity</span><span class="aebx-10">&#x00A0;group</span><span class="aebx-10">&#x00A0;transaction</span> </dt><dd class="description">In essence just glue together the instructions for
     the individual methods mentioned above and apply to the contents of the
     entity group transaction. Entity group transactions even support if-match
     headers.</dd></dl>
<a id="Q1-1-6"></a></p>]]></content:encoded>
			<wfw:commentRss>http://www.goland.org/aversionedwindowsazuretablestore/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The limits of recovering from application logic failures</title>
		<link>http://www.goland.org/thelimitsofrecovery/</link>
		<comments>http://www.goland.org/thelimitsofrecovery/#comments</comments>
		<pubDate>Sat, 02 Jan 2010 04:29:06 +0000</pubDate>
		<dc:creator>Administrator</dc:creator>
				<category><![CDATA[SOA/Web/Etc.]]></category>

		<guid isPermaLink="false">http://www.goland.org/?p=679</guid>
		<description><![CDATA[         I have been blathering on all week about how to prepare for application
     logic failures in services and how to potentially recover from the damage
     those errors cause. I have yammered on about command journals (twice),
   [...]]]></description>
			<content:encoded><![CDATA[     <!--l. 31--><p class="indent" >    <span class="aer-9">I have been </span><a href="http://www.goland.org/do-i-need-to-backupjournal-my-windows-azure-table-store/" ><span class="aer-9">blathering</span></a> <span class="aer-9">on all week about how to prepare for application</span>
     <span class="aer-9">logic failures in services and how to potentially </span><a href="http://www.goland.org/techniques-to-ease-recovering-from-self-inflicted-data-corruption/" ><span class="aer-9">recover</span></a> <span class="aer-9">from the damage</span>
     <span class="aer-9">those errors cause. I have yammered on about </span><a href="http://www.goland.org/buildingacommandjournal/" ><span class="aer-9">command journals</span></a> <span class="aer-9">(</span><a href="http://www.goland.org/thelimitsofcommandjournals/" ><span class="aer-9">twice</span></a><span class="aer-9">),</span>
     <a href="http://www.goland.org/tombstoneazuretablestore/" ><span class="aer-9">tombstones</span></a><span class="aer-9">, versioning etc. But none of these techniques is magical. They</span>
     <span class="aer-9">all have very serious limits that mean in most non-trivial cases the best</span>
     <span class="aer-9">one can really do is say to the user &#8221;Here is the command I screwed up,</span>
     <span class="aer-9">here are the specific mistakes made, here is what the values should have</span>
     <span class="aer-9">been, do you want to repair this damage?&#8221; Below I explore three specific</span>
     <span class="aer-9">examples of those limits that I call: read syndrome, put syndrome and</span>
     <span class="aer-9">e-tag effect.</span>
</p><p>This article is part of a series. Click <a href="http://www.goland.org/recovering-from-self-inflicted-data-corruption-a-summary/">here</a> to see summary and complete list of articles in the series.</p>
<span id="more-679"></span>
       <h3 class="likesectionHead"><a id="x1-1000"></a><span class="aer-9">Contents</span></h3>
       <div class="tableofcontents">
       <span class="sectionToc" ><span class="aer-9">1 </span><a href="#x1-20001" id="QQ2-1-2"><span class="aer-9">Read syndrome </span></a></span>
     <br />     <span class="sectionToc" ><span class="aer-9">2 </span><a href="#x1-30002" id="QQ2-1-3"><span class="aer-9">Put syndrome </span></a></span>
     <br />     <span class="sectionToc" ><span class="aer-9">3 </span><a href="#x1-40003" id="QQ2-1-4"><span class="aer-9">Etag effect </span></a></span>
       </div>

<!--l. 48--><p class="noindent" >
   <h3 class="sectionHead"><span class="titlemark">1   </span> <a id="x1-20001"></a>Read syndrome </h3>
<!--l. 50--></p><p class="noindent" >Let&#8217;s say there is a buggy command that incorrectly changes some piece of data. Later
the bug is discovered and so now there is a desire to fix the damage. However
between the time that the buggy command was issued and the time the
bug was discovered someone may have read the incorrect data and made
decisions based on that incorrect data. Those decisions could then be used to
update other parts of the system or systems external to the one with the
bug.
<!--l. 58--></p><p class="indent" >   For example, let&#8217;s say that due to a bug the title of some object, recorded as a
column in a row, was written out as &#8221;bar&#8221; when it should have been &#8221;foo&#8221;. After the
buggy command someone else comes along, reads the title, sees that it is &#8221;bar&#8221; and
starts writing out &#8221;bar&#8221; in other locations as a pointer to the object. If I now change
the column value from &#8221;bar&#8221; back to the intended value &#8221;foo&#8221; I will break those links
and cause damage. In essence the wrong value, to a certain extent, has become the
&#8217;right&#8217; value.
<!--l. 67--></p><p class="indent" >   To detect the possibility of read syndrome I need to, at the very least, record the
last time someone read a particular value. I also need to track down reads on any
commands that could leak, directly or indirectly, the incorrect state. But even if I
had all of this data I cannot, in the general case, tell the difference between a read
that led to action and a read that did not. So once again the best I can do in
the general case is lay the facts before the user and let them decide how to
compensate.
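</p><p class="indent" >   As a rough sketch of what that record keeping could look like, the Python below assumes a hypothetical read log that the service writes itself (the table store does not provide one). The names ReadEvent and reads_at_risk are made up for illustration. The function lists the reads that happened while the bad value was live. It cannot say which of those reads led to action, which is exactly the limit described above.
</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadEvent:
    """One entry in a hypothetical read log kept by the service itself."""
    reader: str
    row_key: str
    timestamp: float  # seconds, as recorded by the service


def reads_at_risk(read_log, row_key, corrupted_at, repaired_at):
    """Return reads of row_key made while the bad value was live.

    These reads *may* have spread the bad value. The log cannot tell whether
    the reader acted on it, so the result is something to show the user, not
    a list of things to fix automatically.
    """
    return [
        event for event in read_log
        if event.row_key == row_key
        and corrupted_at <= event.timestamp <= repaired_at
    ]
```

<p class="indent" >   For a bug that wrote at time 10 and was repaired at time 20, reads_at_risk(log, "row1", 10.0, 20.0) returns only the reads of row1 inside that window. Those are the facts to lay before the user.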
<!--l. 77--></p><p class="noindent" >
   <h3 class="sectionHead"><span class="titlemark">2   </span> <a id="x1-30002"></a>Put syndrome </h3>
<!--l. 79--></p><p class="noindent" >It is common for clients who wish to update the state of a system to first
read in the state of the system, locally change the parts they want updated
and then upload the entire state back to the system. A classic example of
this is using PUT against the table store. PUT replaces everything in the
row it is pointed at. So if one uses PUT (rather than MERGE) then one
has to read in the entire state of the row and then upload the entire state
including desired changes. This behavior complicates recovering from data
corruption.
<!--l. 88--></p><p class="indent" >   If, after a buggy command is issued, a row is updated again with the same buggy
value, there is an ambiguity. Did the user intend the buggy value (perhaps in the
sense of the read syndrome defined above) to now be the correct value or was the
user just using replacement style PUT logic? Generally one can&#8217;t tell the difference
and so one has to ask the user to clarify intent.
<!--l. 95--></p><p class="indent" >   Note that particularly with bugs that produce unpredictable output, versioning
can be very useful here. With versioning one can see exactly what value the
buggy command actually outputted and then see if the next update used the
same value or not. If the value is not the same then one more or less has to
assume the update is meaningful. If the value is the same then the ambiguity
exists. But at least with versioning one can reduce the number of ambiguous
cases.
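</p><p class="indent" >   A minimal sketch of that comparison in Python, assuming a hypothetical versioning layer that can hand back both the columns the buggy command wrote and the next version of the row. The function name and the dictionary shapes are illustrative only.
</p>

```python
def classify_followup(buggy_columns, next_version):
    """Label each column the buggy command wrote, based on the next update.

    buggy_columns: column name -> value the buggy command actually wrote.
    next_version:  column name -> value in the next version of the row.

    A different value is treated as a meaningful update. An identical value
    is ambiguous: it may be intent, or just a replacement style PUT echoing
    back what the client read.
    """
    return {
        column: "ambiguous" if next_version.get(column) == bad_value else "meaningful"
        for column, bad_value in buggy_columns.items()
    }
```

<p class="indent" >   Only the columns labelled ambiguous need to go back to the user for clarification. Columns the buggy command never touched are ignored.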
<!--l. 104--></p><p class="noindent" >
   <h3 class="sectionHead"><span class="titlemark">3   </span> <a id="x1-40003"></a>Etag effect </h3>
<!--l. 106--></p><p class="noindent" >A request is sent to a service to update some values. There is a logic bug and the
update is mangled. This is later discovered and the suspect command identified from
the command journal. But the command included an if-match or similar
header that predicated the command&#8217;s execution on a specific e-tag or other
condition. It&#8217;s tempting to argue that the error should just be fixed but this
can only apply if the system conditions are the same as those described
in the e-tag. Otherwise the change could just end up causing even more
damage.
<!--l. 115--></p><p class="indent" >   So in theory in order to fix the damage first one must determine if the system
state is the same as the one in the etag. But even assuming that the etag is directly
taken from the table store (unlikely for all but the most trivial systems) the put
syndrome means it&#8217;s easy to be fooled into thinking the system state has changed
when it has not. At best what one can do is use a versioning system to see what state
the etag represented and then determine if the system is still in the same state.
If so then perhaps the fix can be applied. Assuming, of course, that the
user still wants the value set that way at this point in time. So again, even
assuming a versioning system is available, at best what one can do is explain the
situation to the user, provide the context information and let the user make a
decision.
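</p><p class="indent" >   One way to picture the check, as a sketch only: the Python below assumes a hypothetical version history that maps each etag to the column values it stood for. It compares values rather than etags, since a replacement style PUT can produce a new etag without changing any value.
</p>

```python
def same_state_as_etag(version_history, etag, current_columns):
    """Report whether the row still holds the values the etag stood for.

    version_history is a hypothetical mapping of etag -> column values kept
    by a versioning layer. Returns True or False when the etag is known and
    None when the history has no record of it.
    """
    recorded = version_history.get(etag)
    if recorded is None:
        return None
    return recorded == current_columns
```

<p class="indent" >   A True result only means the fix may still make sense. Whether the user still wants that value is a question the code cannot answer.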
<!--l. 128--></p><p class="indent" >   Note that the etag effect applies even if there is no etag. When a user
issues a command they do so in a certain context whose parts may involve
information outside the knowledge of the service itself. Undoing an erroneous
command without knowledge of that context can potentially cause more
damage than it fixes so once again, in the general case, one must consult the
user.
<a id="Q1-1-5"></a></p>]]></content:encoded>
			<wfw:commentRss>http://www.goland.org/thelimitsofrecovery/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Tombstoning on top of Windows Azure Table Store</title>
		<link>http://www.goland.org/tombstoneazuretablestore/</link>
		<comments>http://www.goland.org/tombstoneazuretablestore/#comments</comments>
		<pubDate>Fri, 01 Jan 2010 00:32:09 +0000</pubDate>
		<dc:creator>Administrator</dc:creator>
				<category><![CDATA[SOA/Web/Etc.]]></category>

		<guid isPermaLink="false">http://www.goland.org/?p=675</guid>
		<description><![CDATA[         After command journaling probably the next most effective protection
     against application logic errors is tombstoning (keeping a copy of the last
     version of a deleted row). In this article I propose a design for adding
    [...]]]></description>
			<content:encoded><![CDATA[     <!--l. 31--><p class="indent" >    <span class="aer-9">After </span><a href="http://www.goland.org/buildingacommandjournal/" ><span class="aer-9">command journaling</span></a> <span class="aer-9">probably the next most effective protection</span>
     <span class="aer-9">against </span><a href="http://www.goland.org/do-i-need-to-backupjournal-my-windows-azure-table-store/" ><span class="aer-9">application logic errors</span></a> <span class="aer-9">is tombstoning (keeping a copy of the last</span>
     <span class="aer-9">version of a deleted row). In this article I propose a design for adding</span>
     <span class="aer-9">tombstoning to Windows Azure Table Store using two tables, a main table</span>
     <span class="aer-9">and a tombstone table.</span>
</p><p>This article is part of a series. Click <a href="http://www.goland.org/recovering-from-self-inflicted-data-corruption-a-summary/">here</a> to see summary and complete list of articles in the series.</p>
<span id="more-675"></span>
       <h3 class="likesectionHead"><a id="x1-1000"></a><span class="aer-9">Contents</span></h3>
       <div class="tableofcontents">
       <span class="sectionToc" ><span class="aer-9">1 </span><a href="#x1-20001" id="QQ2-1-2"><span class="aer-9">In place - the traditional tombstone design and why I don&#8217;t like it</span></a></span>
     <br />     <span class="sectionToc" ><span class="aer-9">2 </span><a href="#x1-30002" id="QQ2-1-3"><span class="aer-9">An alternative approach - two tables</span></a></span>
     <br />     <span class="sectionToc" ><span class="aer-9">3 </span><a href="#x1-40003" id="QQ2-1-4"><span class="aer-9">Don&#8217;t forget command journal IDs</span></a></span>
     <br />     <span class="sectionToc" ><span class="aer-9">4 </span><a href="#x1-50004" id="QQ2-1-5"><span class="aer-9">Isn&#8217;t tombstoning just soft deletes?</span></a></span>
     <br />     <span class="sectionToc" ><span class="aer-9">5 </span><a href="#x1-60005" id="QQ2-1-6"><span class="aer-9">Is tombstoning worth the effort?</span></a></span>
     <br />     <span class="sectionToc" ><span class="aer-9">A </span><a href="#x1-7000A" id="QQ2-1-8"><span class="aer-9">Racing Deletes</span></a></span>
     <br />     <span class="aer-9">&#x00A0;</span><span class="subsectionToc" ><span class="aer-9">A.1 </span><a href="#x1-8000A.1" id="QQ2-1-9"><span class="aer-9">The problem</span></a></span>
     <br />     <span class="aer-9">&#x00A0;</span><span class="subsectionToc" ><span class="aer-9">A.2 </span><a href="#x1-9000A.2" id="QQ2-1-10"><span class="aer-9">Why I can live with this</span></a></span>
     <br />     <span class="sectionToc" ><span class="aer-9">B </span><a href="#x1-10000B" id="QQ2-1-11"><span class="aer-9">Racing row keys</span></a></span>
     <br />     <span class="aer-9">&#x00A0;</span><span class="subsectionToc" ><span class="aer-9">B.1 </span><a href="#x1-11000B.1" id="QQ2-1-12"><span class="aer-9">The problem</span></a></span>
     <br />     <span class="aer-9">&#x00A0;</span><span class="subsectionToc" ><span class="aer-9">B.2 </span><a href="#x1-12000B.2" id="QQ2-1-13"><span class="aer-9">Why I can live with this</span></a></span>
       </div>

<!--l. 41--><p class="noindent" >
   <h3 class="sectionHead"><span class="titlemark">1   </span> <a id="x1-20001"></a>In place - the traditional tombstone design and why I don&#8217;t like
it</h3>
<!--l. 43--></p><p class="noindent" >Traditionally tombstones are implemented by adding a column to a table whose
semantic is &#8221;If I&#8217;m marked true then this row has been deleted.&#8221; But if I implement
tombstoning using this approach on top of Windows Azure Table Store then I have to
intercept all methods going to the table to make sure they interact properly with
tombstoned rows. For example should POST&#8217;ing to a row that is tombstoned fail
because the row &#8217;exists&#8217;? Probably not. But that won&#8217;t be the native behavior of the
table store. All GETs in particular also have to be intercepted to make sure that they
don&#8217;t return tombstones unless they were explicitly marked as wanting to do so.
This level of intervention convinces me that implementing tombstones the
traditional way on top of the Windows Azure Table Store where I have multiple
parallel writers and no equivalent of stored procedures is probably not a great
idea.
<!--l. 58--></p><p class="noindent" >
   <h3 class="sectionHead"><span class="titlemark">2   </span> <a id="x1-30002"></a>An alternative approach - two tables</h3>
<!--l. 60--></p><p class="noindent" >In the two table approach I have my main table and a separate tombstone table.
When a row is deleted it is first copied over to a tombstone table and only then
deleted from the main table. This approach only requires that the DELETE method
be intercepted. Other methods can be left alone since they will only hit the main
table.
<!--l. 66--></p><p class="indent" >   Although there are a few ways to implement the DELETE logic in the face of
tombstoning probably the most robust is a simple serial implementation
where:
     <ol class="enumerate1" >
     <li class="enumerate" id="x1-3002x1">A GET is issued to the main table row (use the if-match header if an etag
     was going to be used on the original delete)
     </li>
     <li class="enumerate" id="x1-3004x2">A PUT is then issued to the tombstone table with if-match set to &#8221;*&#8221;. The
     contents of the PUT should be identical to the GET including partition
     and row keys.
     </li>
     <li class="enumerate" id="x1-3006x3">A DELETE is then issued to the row in the main table (using the if-match
     header if an etag was going to be used on the original delete). If the
     DELETE fails then issue a GET to the main table to see if the row is still
     there (failures come in many flavors). If the row is gone then declare success
     and go home. Otherwise it would be nice (but not strictly necessary) to
     delete the row created in the PUT in step 2 and retry the three steps.</li></ol>
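</p><p class="indent" >   The three steps above can be sketched in Python against an in-memory stand-in for the two tables. Everything here is hypothetical: FakeTable is not the table store API, its put is a plain unconditional write standing in for the PUT in step 2, and returning False when there is nothing to delete is an assumption of the sketch.
</p>

```python
import itertools


class PreconditionFailed(Exception):
    """Raised when an if-match etag does not match or the row is missing."""


class FakeTable:
    """In-memory stand-in for a table; rows keyed by (partition, row) key."""

    _etags = itertools.count(1)

    def __init__(self):
        self.rows = {}  # key -> (etag, columns)

    def get(self, key):
        return self.rows.get(key)  # (etag, columns) or None

    def put(self, key, columns):
        # Unconditional write, standing in for the PUT in step 2.
        self.rows[key] = (next(self._etags), dict(columns))

    def delete(self, key, if_match=None):
        current = self.rows.get(key)
        if current is None or (if_match is not None and current[0] != if_match):
            raise PreconditionFailed(key)
        del self.rows[key]


def tombstoning_delete(main, tombstones, key, etag=None, retries=3):
    for _ in range(retries):
        # Step 1: GET the row from the main table.
        current = main.get(key)
        if current is None:
            return False  # assumption: nothing to delete is reported as False
        current_etag, columns = current
        if etag is not None and current_etag != etag:
            raise PreconditionFailed(key)
        # Step 2: copy the row to the tombstone table under the same keys.
        tombstones.put(key, columns)
        # Step 3: delete the row from the main table.
        try:
            main.delete(key, if_match=etag)
            return True
        except PreconditionFailed:
            if main.get(key) is None:
                return True  # the row is gone, declare success
            tombstones.delete(key)  # optional clean up before retrying
    return False
```

<p class="indent" >   After a successful call the row is absent from the main table and present, with the same keys and columns, in the tombstone table.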
<!--l. 83--></p><p class="noindent" >It&#8217;s worth pointing out that the tombstone table is not a versioning table. It&#8217;s not there
to record every time a row was deleted. In practice it will sometimes record deletion
information about rows that have been subsequently recreated but that is just a side
effect and not a feature.
<!--l. 89--></p><p class="indent" >   While the three step approach is reasonably robust it can still have consistency
issues if the PUT succeeds but the DELETE doesn&#8217;t, for example if the machine
crashes after issuing the PUT but before the DELETE.
<!--l. 93--></p><p class="indent" >   But the damage from this kind of situation is minimal since the main
table is the &#8217;truth&#8217;. If the main table says a row exists then it exists, end of
story.
<!--l. 97--></p><p class="indent" >   If I was feeling brave (or bug prone) I could intercept MERGE/POST/PUT in
addition to DELETE via my own library so that when I create a row I look
to see if it has a tombstone and if so I delete the tombstone. This could
even be done as a lazy process that doesn&#8217;t block the main action. But it
isn&#8217;t clear to me that it&#8217;s worth the effort. So long as I always check the
main table before the tombstone table when doing clean up operations then
having a row both in the main table and the tombstone table won&#8217;t cause
confusion.
<!--l. 106--></p><p class="indent" >   Cleaning up tombstones if they start eating up too much disk space is also a low
risk since a screw up in deleting tombstone entries will harm the tombstone table but
leave the main table untouched.
<!--l. 111--></p><p class="noindent" >
   <h3 class="sectionHead"><span class="titlemark">3   </span> <a id="x1-40003"></a>Don&#8217;t forget command journal IDs</h3>
<!--l. 113--></p><p class="noindent" >In the simplest application recovery scenario I think a particular row was
accidentally deleted. I look the row up and it doesn&#8217;t exist in the main table. So now
I go to the tombstone table where I find an entry for the row and so undelete
it.
<!--l. 118--></p><p class="indent" >   But the undelete might have been an error. For example, let&#8217;s say that after the
row was accidentally deleted the user noticed the incorrect deletion and recreated the
row themselves and then later deleted the row for their own reasons. If I implemented
the logic in the previous paragraph I would be bringing a row back from the dead
that the user didn&#8217;t want.
<!--l. 125--></p><p class="indent" >   To prevent this kind of scenario I need to mark the entries in my command
journal with IDs and then use those IDs in the tombstone table. That way I
can see if the last delete was caused by the buggy command. If not, I do
nothing.
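</p><p class="indent" >   As a sketch of that check in Python, assuming each tombstone entry carries a hypothetical command_id field holding the journal ID of the delete that created it. The dictionary shapes and the function name are illustrative.
</p>

```python
def should_propose_undelete(main_rows, tombstones, key, buggy_command_id):
    """Decide whether restoring `key` is worth proposing.

    main_rows:  key -> columns for rows currently in the main table.
    tombstones: key -> {"command_id": ..., "columns": ...} (hypothetical).

    The main table is the truth, so an existing row is never touched. A
    tombstone left by any command other than the buggy one means a later
    delete happened, and the answer is to do nothing.
    """
    if key in main_rows:
        return False
    tombstone = tombstones.get(key)
    if tombstone is None:
        return False
    return tombstone["command_id"] == buggy_command_id
```

<p class="indent" >   In the scenario above, the user&#8217;s own later delete leaves a tombstone with a different command ID, so the function returns False and the row stays deleted.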
<!--l. 131--></p><p class="noindent" >
   <h3 class="sectionHead"><span class="titlemark">4   </span> <a id="x1-50004"></a>Isn&#8217;t tombstoning just soft deletes?</h3>
<!--l. 133--></p><p class="noindent" >A soft delete is a delete in which one marks an entry as deleted rather than actually
deleting it. Sound familiar? Dare Obasanjo recently posted a blog entry about <a href="http://www.25hoursaday.com/weblog/2009/11/23/BuildingScalableDatabasesPerspectivesOnTheWarOnSoftDeletes.aspx" >soft
deletes</a> in which he mostly talked against them. I actually agree with his general
argument and believe that keeping data around forever (cost and privacy policy
permitting) is a good thing but as Dare&#8217;s article points out it&#8217;s probably better to
handle keeping data around forever by design instead of &#8217;hiding&#8217; things behind soft
deletes.
<!--l. 142--></p><p class="indent" >   However the soft delete example and tombstoning, while identical in effect, are
different in intention. Tombstones are meant exclusively as a recovery mechanism for
application logic failures. They are not intended as a way to keep data around
forever. In other words, if a user accidentally deletes something they should
not have, my system will not be set up to undo the damage using the tombstone
table. Rather the tombstone table will only be called upon if my application
logic is faulty and I incorrectly deleted something that should not have been
deleted.
<!--l. 153--></p><p class="noindent" >
   <h3 class="sectionHead"><span class="titlemark">5   </span> <a id="x1-60005"></a>Is tombstoning worth the effort?</h3>
<!--l. 155--></p><p class="noindent" >As a service implementer I have an almost primal fear of deletion because there is no
way back. And I can&#8217;t and don&#8217;t want to get rid of deletion. Even if I had no
economic reason to ever delete things there are extremely good privacy and security
reasons to delete things. So I want to stay in the deletion business, I&#8217;d just like it to
be a bit less dangerous. A tombstone table buys me some breathing room if I screw
up. Sorta.
<!--l. 163--></p><p class="indent" >   The &#8217;sorta&#8217; is because in most interesting cases just undeleting a value isn&#8217;t
enough. Interesting rows tend to point at each other which brings up fun problems if
I just randomly undelete some row which contains links to other rows. Are those links
still correct? How do I validate them? It&#8217;s not enough to just implement a
tombstone system. One also has to think carefully about how rows can be safely
undeleted without creating more chaos. Unfortunately it&#8217;s hard to generalize
about how difficult the &#8217;undelete&#8217; process will be without getting into the
specifics of particular services. But even so, in the worst case, if I have the
tombstone table at least I have something to undelete. Without it, I have
nothing.
<!--l. 175--></p><p class="indent" >   Ideally I would rather the Windows Azure Table Store implement tombstoning (or
better yet, full versioning). But they don&#8217;t do that currently and building my own
tombstoning system isn&#8217;t too hard (think it would make a good open source project?)
and isn&#8217;t too dangerous to the integrity of my system. So if after implementing a
command journal I&#8217;m still feeling paranoid then the next thing I would look at is
tombstoning.
<a id="Q1-1-7"></a>
<!--l. 187--></p><p class="noindent" >
   <h3 class="sectionHead"><span class="titlemark">A   </span> <a id="x1-7000A"></a>Racing Deletes</h3>
<!--l. 190--></p><p class="noindent" >
   <h4 class="subsectionHead"><span class="titlemark">A.1   </span> <a id="x1-8000A.1"></a>The problem</h4>
<!--l. 192--></p><p class="noindent" >The three step process outlined above still suffers from race conditions beyond those
previously described. For example, imagine that two people are simultaneously trying
to delete the same row. Let&#8217;s call them command A and command B who are both
trying to delete row alpha.
     <ol class="enumerate1" >
     <li class="enumerate" id="x1-8002x1">Command A does a GET on row alpha in the main table
     </li>
     <li class="enumerate" id="x1-8004x2">Command B does a GET on row alpha in the main table
     </li>
     <li class="enumerate" id="x1-8006x3">Command A does a PUT on row alpha in the tombstone table and puts
     in command ID A
     </li>
     <li class="enumerate" id="x1-8008x4">Command B does a PUT on row alpha in the tombstone table and puts
     in command ID B
     </li>
     <li class="enumerate" id="x1-8010x5">Command A deletes row alpha in the main table
     </li>
     <li class="enumerate" id="x1-8012x6">Command B tries to delete row alpha in the main table but fails because it&#8217;s
     already deleted
     </li>
     <li class="enumerate" id="x1-8014x7">Command B issues a GET for row alpha in the main table, sees it&#8217;s gone and
     declares victory</li></ol>
<!--l. 209--></p><p class="noindent" >At some time later I determine that command B was buggy. It actually shouldn&#8217;t have
deleted row alpha. But due to the ordering of the above the tombstone is going to
show command B as the last command and so I will undelete row alpha. But in fact
there was a separate, non-buggy, command, command A that also wanted the row
deleted and actually got there first! So the correct action is to not undelete the row.
But I have no record of command A in the tombstone and so I don&#8217;t recognize what
happened.
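</p><p class="indent" >   The interleaving can be replayed deterministically. The Python below is only an illustration: plain dictionaries stand in for the main and tombstone tables, and the command_id field is the hypothetical journal ID stored on each tombstone.
</p>

```python
def replay_racing_deletes():
    """Replay the seven steps in order and report what the tables record."""
    main = {"alpha": {"title": "foo"}}
    tombstones = {}

    # Steps 1 and 2: commands A and B each GET row alpha from the main table.
    seen_by_a = dict(main["alpha"])
    seen_by_b = dict(main["alpha"])

    # Steps 3 and 4: each PUTs a tombstone; B writes last and overwrites A.
    tombstones["alpha"] = {"command_id": "A", "columns": seen_by_a}
    tombstones["alpha"] = {"command_id": "B", "columns": seen_by_b}

    # Step 5: A deletes row alpha from the main table.
    del main["alpha"]
    actually_deleted_by = "A"

    # Step 6: B's DELETE fails because the row is already gone.
    b_delete_succeeded = main.pop("alpha", None) is not None

    # Step 7: B re-reads the main table, sees no row and declares victory.
    b_declares_success = "alpha" not in main

    return {
        "tombstone_says": tombstones["alpha"]["command_id"],
        "actually_deleted_by": actually_deleted_by,
        "b_delete_succeeded": b_delete_succeeded,
        "b_declares_success": b_declares_success,
    }
```

<p class="indent" >   The tombstone names command B while the row was actually removed by command A, which is the mismatch that makes an automatic undelete unsafe.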
<!--l. 218--></p><p class="indent" >   Note, BTW, that even if both commands used etags the race condition would still
happen exactly as described.
<!--l. 222--></p><p class="noindent" >
   <h4 class="subsectionHead"><span class="titlemark">A.2   </span> <a id="x1-9000A.2"></a>Why I can live with this</h4>
<!--l. 224--></p><p class="noindent" >Race conditions are the bane of all loosely coupled systems. It goes with
the territory. And in most cases I just live with them. Stuff is going to go
wrong. Not because the code is necessarily wrong but because the cost of
preventing these errors exceeds the benefit therein derived. For example, to
prevent the previously described race condition in a robust way I would
have to introduce some form of locking or at least serialization using two
phase commit. Both techniques have significant downsides. So sometimes I
just accept that stuff is going to break. This is one of those examples and a
good reason why I probably should never automatically recover data but
rather should notify users and present them with the data I think is wrong
and what changes I think would fix it but let them make the final call on
recovery.
<!--l. 238--></p><p class="noindent" >
   <h3 class="sectionHead"><span class="titlemark">B   </span> <a id="x1-10000B"></a>Racing row keys</h3>
<!--l. 241--></p><p class="noindent" >
   <h4 class="subsectionHead"><span class="titlemark">B.1   </span> <a id="x1-11000B.1"></a>The problem</h4>
<!--l. 243--></p><p class="noindent" >Let&#8217;s imagine I have a row that has a randomly generated row key but can be
uniquely identified by several of its columns. If the user causes the row to be deleted
then a tombstone entry will be created with the row instance&#8217;s random row key.
Later on I realize that the delete was a mistake due to an application bug. To track
down the delete I check the command journal, but the journal isn&#8217;t likely to tell me
the row key (which is random anyway). Row keys aren&#8217;t typically the kind of thing I
tell my users, so they wouldn&#8217;t have the row key in their command and it therefore
wouldn&#8217;t be in the journal. So now I need to do a search on the unique columns to
see if the row (whose row key I don&#8217;t know) exists. If I don&#8217;t find it in the
main table then I need to search the tombstone table to see who deleted
it.
<!--l. 256--></p><p class="indent" >   When I do the search I get back two results that both contain the column values
I&#8217;m looking for. The reason being that the row was created and deleted
twice. Once by accident due to my bug and once intentionally and correctly.
The question I now have to answer is - what order did the deletes occur
in?
<!--l. 262--></p><p class="indent" >   If the buggy delete happened first then no further action is needed.
<!--l. 264--></p><p class="indent" >   If the buggy delete happened second then I may need to recreate the tombstoned
row.
<!--l. 267--></p><p class="indent" >   But how the heck do I figure out the ordering? I could look at the time stamp on
each of the two tombstones (which is actually <a href="http://msdn.microsoft.com/en-us/library/dd179338.aspx" >prohibited</a> by Azure) and see their
ordering but that is misleading since the timestamps represent when the tombstones
were created not when the original entries were created and thus the entries could
have been created/deleted in a different order than appears in the tombstone
table.
<!--l. 275--></p><p class="indent" >   Another approach is to look at the command journal ID on the tombstone entries
and then check the command journal to see what order those IDs appear in. But that
is also potentially misleading, especially if the two command journal entries are in
different partitions. This can mean that the entries were created on different
machines in Azure and thanks to the wonders of clock skew it&#8217;s possible for
there to be small differences in clocks that can cause the real ordering to be
flipped.
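</p><p class="indent" >   A toy illustration of the flip in Python. The machine names, the millisecond timestamps and the size of the skew are all invented for the example; the point is only that sorting by locally recorded time can disagree with the real order.
</p>

```python
def recorded_order(events, skew_ms):
    """Order command IDs by the timestamp each machine would have recorded.

    events:  list of (command_id, machine, true_time_ms) tuples.
    skew_ms: machine -> offset of that machine's clock from true time.
    """
    stamped = [
        (true_time + skew_ms[machine], command_id)
        for command_id, machine, true_time in events
    ]
    return [command_id for _, command_id in sorted(stamped)]
```

<p class="indent" >   With events [("buggy", "m1", 1000), ("intended", "m2", 1010)] and m2&#8217;s clock running 50 milliseconds behind, the recorded order puts the intended delete first even though the buggy delete really happened first.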
<!--l. 285--></p><p class="noindent" >
   <h4 class="subsectionHead"><span class="titlemark">B.2   </span> <a id="x1-12000B.2"></a>Why I can live with this</h4>
<!--l. 287--></p><p class="noindent" >For this scenario to happen I have to have a row that uses a random row key. If a row
used a non-random row key then if the same row exists two times it will have the
same row key both times and a single tombstone entry (since tombstone entries are
to have the same partition key and row key as their main table entries). But if
I have a way of uniquely identifying a row and seeing if it was repeated
then it has columns that can be used together to create a unique key and I
should have used that unique key rather than a random key for the row key.
In other words if this situation comes up then I have a design flaw in my
system.
<!--l. 298--></p><p class="indent" >   Generally speaking if I&#8217;m using a random row key it&#8217;s because I have cases where
two rows can have absolutely identical values but need separate identities. In that
case the only way I would find a row is not via query but rather because some other
row is pointing at that row. So disaster recovery would only apply if I had some row
that did exist that pointed at the row with the random row key and the
pointed at row should exist but doesn&#8217;t. In that case I know exactly where to
look in the tombstone table and the problem described above doesn&#8217;t apply.</p>]]></content:encoded>
			<wfw:commentRss>http://www.goland.org/tombstoneazuretablestore/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Thoughts on implementing a command journal</title>
		<link>http://www.goland.org/buildingacommandjournal/</link>
		<comments>http://www.goland.org/buildingacommandjournal/#comments</comments>
		<pubDate>Wed, 30 Dec 2009 22:39:50 +0000</pubDate>
		<dc:creator>Administrator</dc:creator>
				<category><![CDATA[SOA/Web/Etc.]]></category>

		<guid isPermaLink="false">http://www.goland.org/?p=671</guid>
		<description><![CDATA[         I  had  previously  concluded  that  command  journaling  (creating  a
     journal  of  all  the  external  user  commands  and  internal  maintenance
     commands I [...]]]></description>
			<content:encoded><![CDATA[     <!--l. 31--><p class="indent" >    <span class="aer-9">I  had  </span><a href="http://www.goland.org/techniques-to-ease-recovering-from-self-inflicted-data-corruption/" ><span class="aer-9">previously  concluded</span></a>  <span class="aer-9">that  command  journaling  (creating  a</span>
     <span class="aer-9">journal  of  all  the  external  user  commands  and  internal  maintenance</span>
     <span class="aer-9">commands I issue) is really useful for recovering from self inflicted data</span>
     <span class="aer-9">corruption. In this article I look into the various techniques I can use</span>
     <span class="aer-9">to  implement  a  command  journal  so  as  to  trade  off  between  system</span>
     <span class="aer-9">performance and the journal&#8217;s utility in recovery.</span>
</p><p>This article is part of a series. Click <a href="http://www.goland.org/recovering-from-self-inflicted-data-corruption-a-summary/">here</a> to see a summary and complete list of articles in the series.</p>
<span id="more-671"></span>
       <h3 class="likesectionHead"><a id="x1-1000"></a><span class="aer-9">Contents</span></h3>
       <div class="tableofcontents">
       <span class="sectionToc" ><span class="aer-9">1 </span><a href="#x1-20001" id="QQ2-1-2"><span class="aer-9">Command Journal? Isn&#8217;t that just a log?</span></a></span>
     <br />     <span class="sectionToc" ><span class="aer-9">2 </span><a href="#x1-30002" id="QQ2-1-3"><span class="aer-9">Performance versus accuracy</span></a></span>
     <br />     <span class="aer-9">&#x00A0;</span><span class="subsectionToc" ><span class="aer-9">2.1 </span><a href="#x1-40002.1" id="QQ2-1-4"><span class="aer-9">Single parallel write</span></a></span>
     <br />     <span class="aer-9">&#x00A0;</span><span class="subsectionToc" ><span class="aer-9">2.2 </span><a href="#x1-50002.2" id="QQ2-1-5"><span class="aer-9">Single serial write</span></a></span>
     <br />     <span class="aer-9">&#x00A0;</span><span class="subsectionToc" ><span class="aer-9">2.3 </span><a href="#x1-60002.3" id="QQ2-1-6"><span class="aer-9">Adding a second write</span></a></span>
     <br />     <span class="aer-9">&#x00A0;</span><span class="subsectionToc" ><span class="aer-9">2.4 </span><a href="#x1-70002.4" id="QQ2-1-7"><span class="aer-9">Adding command IDs and versioning</span></a></span>
     <br />     <span class="sectionToc" ><span class="aer-9">3 </span><a href="#x1-80003" id="QQ2-1-8"><span class="aer-9">Should I include reads in the command journal?</span></a></span>
     <br />     <span class="sectionToc" ><span class="aer-9">4 </span><a href="#x1-90004" id="QQ2-1-9"><span class="aer-9">Retrieving data from the command journal</span></a></span>
       </div>

                                                                  

                                                                  
<!--l. 42--><p class="noindent" >
   <h3 class="sectionHead"><span class="titlemark">1   </span> <a id="x1-20001"></a>Command Journal? Isn&#8217;t that just a log?</h3>
<!--l. 44--></p><p class="noindent" >You betcha. All I&#8217;m suggesting is that we can use those logs to help <a href="http://www.goland.org/techniques-to-ease-recovering-from-self-inflicted-data-corruption/" >recover
from self inflicted data corruption</a>. In this article I talk about specific ways
of generating that log that make the log more useful for data corruption
recovery.
<!--l. 50--></p><p class="noindent" >
   <h3 class="sectionHead"><span class="titlemark">2   </span> <a id="x1-30002"></a>Performance versus accuracy</h3>
<!--l. 52--></p><p class="noindent" >In deciding how to design the command journal I basically have to trade off
between the performance overhead of the command journal and accuracy of its
contents. Below I walk through a number of designs/features that set the
performance/accuracy slider to different settings.
<!--l. 58--></p><p class="noindent" >
   <h4 class="subsectionHead"><span class="titlemark">2.1   </span> <a id="x1-40002.1"></a>Single parallel write</h4>
<!--l. 60--></p><p class="noindent" >The most trivial command journal I can imagine would work in the following way: A
request is received and an immediate asynchronous write request is made to the
command journal to record the request. Meanwhile, in parallel, the actual command
is executed.
<!--l. 65--></p><p class="indent" >   The main downside to this approach is false negatives, e.g. saying someone wasn&#8217;t
affected by a bug who really was. It&#8217;s quite possible that something happened to
the write to the command journal but the command was still executed. In that case
there would be no record of the command in the journal, so if the lost record
described a command that triggered the bug we wouldn&#8217;t know to notify the affected
entity.
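</p><p class="indent" >   As a rough illustration of that failure mode, here is a minimal Python sketch. The in-memory journal and store, and the names write_journal, execute and handle_parallel, are hypothetical stand-ins rather than any Azure API; the only point is that a lost journal write does not stop the command from running.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-ins for the journal and the production store.
journal = []
store = {}

def write_journal(command, fail=False):
    """Record the command; `fail` simulates a lost journal write."""
    if fail:
        raise IOError("journal write lost")
    journal.append(command)

def execute(command):
    """Apply the command to the production store."""
    store[command["key"]] = command["value"]

def handle_parallel(command, journal_fails=False):
    """Start the journal write and the command at the same time."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        journal_future = pool.submit(write_journal, command, journal_fails)
        command_future = pool.submit(execute, command)
        command_future.result()  # the caller only waits on the command itself
        # A journal failure is observed here but never blocks the command.
        return journal_future.exception() is None

journaled = handle_parallel({"key": "a", "value": 1})
lost = handle_parallel({"key": "b", "value": 2}, journal_fails=True)
```

After the second call the store contains the row for key b but the journal has no record of it, which is exactly the false negative described above.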
<!--l. 73--></p><p class="noindent" >
   <h4 class="subsectionHead"><span class="titlemark">2.2   </span> <a id="x1-50002.2"></a>Single serial write</h4>
<!--l. 75--></p><p class="noindent" >I&#8217;m generally more concerned with false negatives (telling people they are o.k. when
they aren&#8217;t) than with false positives (telling someone a bug hit them when it didn&#8217;t).
My guess is that in any real world scenario where I&#8217;m trying to recover from a self
inflicted bug the people affected by the bug are going to have to check if the &#8217;fix&#8217;
makes sense or not and that will give them a chance to realize that they weren&#8217;t
really affected by the bug.
<!--l. 83--></p><p class="indent" >   One way to reduce my false negative rate is to serialize the front end
interaction with the command journal. When a command comes in I would
first issue a write to the command journal and only once I got confirmation
that the write succeeded would I then proceed to process the rest of the
command.
<!--l. 89--></p><p class="indent" >   This approach is more expensive than the previous one in terms of time. The time
to process all write commands will now increase by the round trip time needed to
write to the command journal. But on the positive side I&#8217;ll have far fewer false
negatives since I&#8217;m guaranteed that a record of the command exists. Of course I&#8217;ll
also have false positives since the command may exist in the journal but never have
been executed in real life.
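</p><p class="indent" >   A minimal Python sketch of the serial approach, under the assumption that the journal write either confirms or raises. The names handle_serial and JournalUnavailable and the crash_after_journal flag are made up for illustration; the flag simply simulates a machine dying between the journal write and the command.

```python
# Hypothetical in-memory stand-ins for the journal and the production store.
journal = []
store = {}

class JournalUnavailable(Exception):
    """Raised when the journal write cannot be confirmed."""

def write_journal(command, fail=False):
    if fail:
        raise JournalUnavailable("no confirmation from the journal")
    journal.append(command)

def handle_serial(command, journal_fails=False, crash_after_journal=False):
    # Step 1: block until the journal confirms the write (or raise).
    write_journal(command, journal_fails)
    # Simulated crash: the command is journaled but never executed.
    if crash_after_journal:
        return False
    # Step 2: only now touch production data.
    store[command["key"]] = command["value"]
    return True

handle_serial({"key": "a", "value": 1})
handle_serial({"key": "b", "value": 2}, crash_after_journal=True)
journaled_keys = [c["key"] for c in journal]
```

Here key b shows up in the journal but not in the store, which is the false positive case; a failed journal write, by contrast, stops the command before it can change anything.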
<!--l. 98--></p><p class="noindent" >
   <h4 class="subsectionHead"><span class="titlemark">2.3   </span> <a id="x1-60002.3"></a>Adding a second write</h4>
<!--l. 100--></p><p class="noindent" >One way to reduce false positives is to add a second write where we record the
outcome of the operation. This will only help in cases where a bug is directly related
to how the command completed (e.g. if a bug only shows up in operations
that succeed or only ones that fail in a certain way). By recording data
about how a command completed we can then filter down the list of affected
users.
<!--l. 107--></p><p class="indent" >   But this mechanism, besides adding another round trip and additional expense, is
far from perfect. After all, a machine could simply drop dead in the middle of
processing a command and unless a mechanism like Azure&#8217;s Queues is being used
there will be no recovery and we essentially end up back at a single write system in
terms of that command.
<!--l. 114--></p><p class="indent" >   Still, this approach won&#8217;t increase false negatives over the single write systems
(i.e. it does no harm) and it can reduce false positives.
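</p><p class="indent" >   To make the second write concrete, here is a small Python sketch. The entry layout (id, phase, outcome) and the function names are assumptions made for this example, not a prescribed journal format.

```python
journal = []  # hypothetical append-only journal

def handle_with_outcome(command_id, action):
    # First write: the command was received.
    journal.append({"id": command_id, "phase": "received"})
    try:
        action()
        outcome = "succeeded"
    except Exception as error:
        outcome = "failed: " + type(error).__name__
    # Second write: how the command completed.
    journal.append({"id": command_id, "phase": "completed", "outcome": outcome})

def commands_with_outcome(wanted):
    """Filter the journal down to commands that completed in a given way."""
    return [e["id"] for e in journal if e.get("outcome") == wanted]

def commands_without_outcome():
    """Commands whose second write never arrived, e.g. the machine died."""
    received = {e["id"] for e in journal if e["phase"] == "received"}
    completed = {e["id"] for e in journal if e["phase"] == "completed"}
    return sorted(received - completed)

handle_with_outcome("c1", lambda: None)
handle_with_outcome("c2", lambda: 1 / 0)
journal.append({"id": "c3", "phase": "received"})  # simulated crash mid-command
```

If a bug only affects commands that succeed, commands_with_outcome narrows the list to c1. Command c3 has no outcome at all, so for that command we know no more than a single write system would tell us.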
<!--l. 118--></p><p class="noindent" >
   <h4 class="subsectionHead"><span class="titlemark">2.4   </span> <a id="x1-70002.4"></a>Adding command IDs and versioning</h4>
<!--l. 120--></p><p class="noindent" >Another technique to reduce false positives is to assign an ID to each command as it
comes in. That ID is recorded in the command journal and every write we do to our
production data systems will also include that ID. When we investigate the outcome
of a command we can look for the ID in our production storage and piece together
what actually happened to the command.
<!--l. 127--></p><p class="indent" >   In the general case we will need to know not just if the write happened but also
what value it wrote (for diagnosing intermittent bugs). So if our production stores
aren&#8217;t versioned then the IDs won&#8217;t be terribly useful. I hope to get to an
article on how to implement versioning on top of Windows Azure Table
Store.
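</p><p class="indent" >   As a sketch only, and assuming a store that keeps every version of a row (which Windows Azure Table Store does not do on its own), command IDs might be threaded through like this in Python. The dictionaries and function names are hypothetical.

```python
import uuid

journal = {}   # command ID -> the command as received
versions = {}  # row key -> list of (command ID, value), oldest first

def handle(row_key, value):
    """Assign an ID, journal the command, then stamp the write with the ID."""
    command_id = str(uuid.uuid4())
    journal[command_id] = {"row": row_key, "value": value}
    versions.setdefault(row_key, []).append((command_id, value))
    return command_id

def what_happened(command_id):
    """Return the value this command actually wrote, or None if it never wrote."""
    command = journal[command_id]
    for written_by, value in versions.get(command["row"], []):
        if written_by == command_id:
            return value
    return None

first = handle("row-1", 10)
second = handle("row-1", 20)
# A command that was journaled but died before reaching the store.
journal["orphan"] = {"row": "row-1", "value": 30}
```

Because every version is kept, what_happened(first) still reports 10 even though a later command overwrote the row. Without versioning only the latest write would be visible and the ID on it would say nothing about earlier commands.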
<!--l. 134--></p><p class="noindent" >
   <h3 class="sectionHead"><span class="titlemark">3   </span> <a id="x1-80003"></a>Should I include reads in the command journal?</h3>
<!--l. 136--></p><p class="noindent" >In theory I should record everything. Every read. Every write. It&#8217;s tempting to argue
that the reads, in particular, don&#8217;t matter. After all I am worried about cases where I
am corrupting data and reads shouldn&#8217;t cause data to change. Of course,
code shouldn&#8217;t have bugs and I shouldn&#8217;t have to worry about corrupted
data either. But in reality code does have bugs and my reads could have
potentially done something stupid. Or more likely, caused something stupid to
happen.
<!--l. 144--></p><p class="indent" >   The simplest example is that there is a bug in the code that reads out values
and every once in a while I return the wrong value. If I logged both reads
and writes I could potentially figure out which reads were likely to have
run into the bug and let specific customers know if they could have been
affected.
<!--l. 150--></p><p class="indent" >   If I don&#8217;t log reads, then I can&#8217;t provide extra notice to users most likely to have
been affected. Of course I also have to pay the bills and while cold data isn&#8217;t
terribly expensive it isn&#8217;t free. For example, Azure has <a href="http://www.microsoft.com/windowsazure/pricing/" >stated</a> they are going to
charge $0.15/GB stored per month and $0.01 per 10,000 storage transactions. Assuming
I&#8217;m using Windows Azure compute as my front end I don&#8217;t have to pay
anything to move the journal commands from my front end machines to
storage.
<!--l. 159--></p><p class="indent" >   Let&#8217;s say I get 1000 read requests/second. That works out to 1000 * 60 * 60 *
0.01/10,000 = $3.60/hour for transaction costs for writing to the journal (assuming a
single write model). Let&#8217;s further assume that every read requires 2048 bytes of
journal space. I&#8217;m assuming that I only record the fact that the read happened and
don&#8217;t actually record the response body, hence the 2048 bytes guesstimate for size. In
that case we need 1000 * 60 * 60 * 2048 / 1024 / 1024 / 1024 * 0.15 = $1.03 to
store one hour&#8217;s worth of journal data for one month. Let&#8217;s assume I&#8217;ll keep the data for six
months. So the real storage cost is $1.03 * 6 = $6.18 accrued each hour. Adding the
hourly transaction cost, running for a single day would cost ($3.60 + $6.18) * 24 =
$234.72/day. That figure covers the full cost of writing a day&#8217;s worth of journal
entries and keeping them stored for six
months.
<!--l. 172--></p><p class="indent" >   Which actually isn&#8217;t too bad all things considered.
<!--l. 174--></p><p class="indent" >   So maybe I can just journal everything?
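</p><p class="indent" >   For anyone who wants to plug in their own numbers, the estimate can be reproduced with a few lines of Python. The constants are the assumptions stated above (1000 reads/second, 2048 bytes per entry, six months of retention and the quoted Azure prices); the daily figure is taken here as the hourly transaction charge plus the six month storage charge, times 24.

```python
REQUESTS_PER_SECOND = 1000
BYTES_PER_ENTRY = 2048
PRICE_PER_GB_MONTH = 0.15          # dollars per GB stored per month
PRICE_PER_10K_TRANSACTIONS = 0.01  # dollars per 10,000 storage transactions
RETENTION_MONTHS = 6

requests_per_hour = REQUESTS_PER_SECOND * 60 * 60

# One journal write per read (single write model).
transactions_per_hour = requests_per_hour * PRICE_PER_10K_TRANSACTIONS / 10_000

# Journal growth per hour, and what it costs to keep that hour for six months.
gb_per_hour = requests_per_hour * BYTES_PER_ENTRY / 1024 ** 3
storage_per_hour = gb_per_hour * PRICE_PER_GB_MONTH * RETENTION_MONTHS

cost_per_day = (transactions_per_hour + storage_per_hour) * 24

# Steady-state journal size once six months (30-day months) have accumulated.
journal_size_gb = gb_per_hour * 24 * 30 * RETENTION_MONTHS

print(f"transactions: ${transactions_per_hour:.2f}/hour")
print(f"journal growth: {gb_per_hour:.2f} GB/hour")
print(f"total: ${cost_per_day:.2f}/day")
print(f"journal size: {journal_size_gb / 1024:.0f} TB")
```

Doubling the entry size or the retention period only changes the storage term, so the transaction charge quickly stops being the dominant cost.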
<!--l. 177--></p><p class="noindent" >
   <h3 class="sectionHead"><span class="titlemark">4   </span> <a id="x1-90004"></a>Retrieving data from the command journal</h3>
<!--l. 179--></p><p class="noindent" >Writing to the command journal is reasonably straightforward. Interrogating the
command journal isn&#8217;t necessarily as easy. The amount of data in the command
journal can very quickly grow to stupendous heights. The previous example I used of
journaling reads was writing out 6.87 gigs or so an hour. That means if we are
storing six months&#8217; worth of data then our journal will have a size in the
neighborhood (just for reads!) of 6.87 * 24 * 30 * 6 = 29678.4 gigs or 29
terabytes.
<!--l. 187--></p><p class="indent" >   This is pretty typical for this kind of usage log. When it&#8217;s time to grub through
the journal to find people affected by a bug we need a search platform. Even using
SQL Azure won&#8217;t magically solve the problem for us: since the largest size for a
database in SQL Azure is currently 10 Gig, we&#8217;ll need a lot of databases and a
framework to query and aggregate across those databases. And in Windows Azure
Table Store we could issue a single query that would go through the entire command
journal but it would take so long to run that we would all but certainly need to
break the query up into sections (say by partition key) and then combine
the results together. So in both cases we need a way to fan out to make
sub-queries and then fan in the results. This is of course the classic Map/Reduce
pattern.
<!--l. 200--></p><p class="indent" >   This is a pretty trivial version of map/reduce. Essentially we need to partition the
search space and then, to use a trivial example, fire off all the queries into
a Windows Azure Queue and have different Windows Azure Compute
instances hit the queue until it&#8217;s empty, writing out the results to some result
store.
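</p><p class="indent" >   The fan out/fan in shape can be sketched in Python, with an in-process queue standing in for a Windows Azure Queue and threads standing in for compute instances. The journal contents, the is_suspect predicate and the function names are all invented for the example.

```python
import queue
import threading

# Hypothetical journal, already split by partition key.
journal = {
    "p1": [{"account": "alice", "body": "ok"}, {"account": "bob", "body": "bad"}],
    "p2": [{"account": "carol", "body": "bad"}],
    "p3": [{"account": "dave", "body": "ok"}],
}

def is_suspect(entry):
    """Stand-in for 'this command would have triggered the bug'."""
    return entry["body"] == "bad"

def worker(work, results):
    """Map step: pull partitions off the queue until it is empty."""
    while True:
        try:
            partition = work.get_nowait()
        except queue.Empty:
            return
        hits = [e["account"] for e in journal[partition] if is_suspect(e)]
        results.put(hits)

def find_affected(worker_count=2):
    work, results = queue.Queue(), queue.Queue()
    for partition in journal:  # fan out: one sub-query per partition
        work.put(partition)
    threads = [threading.Thread(target=worker, args=(work, results))
               for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    affected = set()           # fan in: combine the partial results
    while not results.empty():
        affected.update(results.get())
    return sorted(affected)

affected_accounts = find_affected()
```

All the work is queued before any worker starts, so an empty queue really does mean the search is finished. A real deployment would also have to deal with workers that die part way through a partition.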
<!--l. 206--></p><p class="indent" >   Still, it would be nice if we could use an engine like <a href="http://hadoop.apache.org/" >Hadoop</a> or <a href="http://research.microsoft.com/en-us/projects/Dryad/" >Dryad</a> to handle
all the dirty work for us. Or better yet, maybe Microsoft could make <a href="http://www.goland.org/whatiscosmos/" >Cosmos</a>
available as a public service.
<a id="Q1-1-10"></a></p>]]></content:encoded>
			<wfw:commentRss>http://www.goland.org/buildingacommandjournal/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Techniques to Ease Recovering from Self Inflicted Data Corruption</title>
		<link>http://www.goland.org/techniques-to-ease-recovering-from-self-inflicted-data-corruption/</link>
		<comments>http://www.goland.org/techniques-to-ease-recovering-from-self-inflicted-data-corruption/#comments</comments>
		<pubDate>Mon, 28 Dec 2009 20:02:49 +0000</pubDate>
		<dc:creator>Administrator</dc:creator>
				<category><![CDATA[SOA/Web/Etc.]]></category>

		<guid isPermaLink="false">http://www.goland.org/?p=666</guid>
		<description><![CDATA[         In a previous article I argued that even with the protections Windows
     Azure Table Store provides for my data I can still screw things up myself
     and so need to put in place protections against my own mistakes. [...]]]></description>
			<content:encoded><![CDATA[     <!--l. 31--><p class="indent" >    <span class="aer-9">In a </span><a href="http://www.goland.org/do-i-need-to-backupjournal-my-windows-azure-table-store/" ><span class="aer-9">previous article</span></a> <span class="aer-9">I argued that even with the protections Windows</span>
     <span class="aer-9">Azure Table Store provides for my data I can still screw things up myself</span>
     <span class="aer-9">and so need to put in place protections against my own mistakes. Below</span>
     <span class="aer-9">I walk through the three scenarios I previously listed and explain how</span>
     <span class="aer-9">command journaling, tombstoning and versioning could make recovering</span>
     <span class="aer-9">from my errors much easier.</span>
</p><p>This article is part of a series. Click <a href="http://www.goland.org/recovering-from-self-inflicted-data-corruption-a-summary/">here</a> to see a summary and complete list of articles in the series.</p>
<span id="more-666"></span> 
       <h3 class="likesectionHead"><a id="x1-1000"></a><span class="aer-9">Contents</span></h3>
       <div class="tableofcontents">
       <span class="sectionToc" ><span class="aer-9">1 </span><a href="#x1-20001" id="QQ2-1-2"><span class="aer-9">Application logic failure</span></a></span>
     <br />     <span class="aer-9">&#x00A0;</span><span class="subsectionToc" ><span class="aer-9">1.1 </span><a href="#x1-30001.1" id="QQ2-1-3"><span class="aer-9">Do nothing extra</span></a></span>
     <br />     <span class="aer-9">&#x00A0;</span><span class="subsectionToc" ><span class="aer-9">1.2 </span><a href="#x1-40001.2" id="QQ2-1-4"><span class="aer-9">Add command journaling</span></a></span>
     <br />     <span class="aer-9">&#x00A0;</span><span class="subsectionToc" ><span class="aer-9">1.3 </span><a href="#x1-50001.3" id="QQ2-1-5"><span class="aer-9">Add tombstoning</span></a></span>
     <br />     <span class="aer-9">&#x00A0;</span><span class="subsectionToc" ><span class="aer-9">1.4 </span><a href="#x1-60001.4" id="QQ2-1-6"><span class="aer-9">Add versioning</span></a></span>
     <br />     <span class="aer-9">&#x00A0;</span><span class="subsectionToc" ><span class="aer-9">1.5 </span><a href="#x1-70001.5" id="QQ2-1-7"><span class="aer-9">There&#8217;s no such thing as a free lunch</span></a></span>
     <br />     <span class="sectionToc" ><span class="aer-9">2 </span><a href="#x1-80002" id="QQ2-1-8"><span class="aer-9">Table deletion</span></a></span>
     <br />     <span class="sectionToc" ><span class="aer-9">3 </span><a href="#x1-90003" id="QQ2-1-9"><span class="aer-9">Schema update failure</span></a></span>
       </div>

                                                                  

                                                                  
<!--l. 42--><p class="noindent" >
   <h3 class="sectionHead"><span class="titlemark">1   </span> <a id="x1-20001"></a>Application logic failure</h3>
<!--l. 44--></p><p class="noindent" >An application logic failure means my own application logic was supposed to perform
some action, screwed it up and ended up corrupting (which I define in the most
generic sense of - didn&#8217;t do what it should have done) my Windows Azure Table
Store. I generally model application logic failures as my service receiving request X
(which might even have been generated internally) and doing the wrong
thing. So let&#8217;s say that I find out that I have an application logic failure.
What should I have done beforehand to make recovering from the error
easier?
<!--l. 55--></p><p class="noindent" >
   <h4 class="subsectionHead"><span class="titlemark">1.1   </span> <a id="x1-30001.1"></a>Do nothing extra</h4>
<!--l. 57--></p><p class="noindent" >The bulk of my testing is focused on this kind of error so I&#8217;m already spending lots of
time and money trying to prevent application logic failures. Is it really cost effective
for me to do more? Below I&#8217;ll discuss strategies like journaling, tombstoning and
versioning but those techniques require time and money to implement and maintain.
So it&#8217;s not unrealistic, especially if the data my service is managing is either derived
from someplace else and thus recoverable through external means or if it&#8217;s
reasonably low value, to do nothing more than I do now which is test like
crazy.
<!--l. 68--></p><p class="noindent" >
   <h4 class="subsectionHead"><span class="titlemark">1.2   </span> <a id="x1-40001.2"></a>Add command journaling</h4>
<!--l. 70--></p><p class="noindent" >I get a bug report that my system is corrupting data in my Windows Azure Tables. I
investigate and determine that a POST containing a certain kind of JSON is going to
do the wrong thing. I need to notify users who were affected by this bug so they can
begin the process of dealing with the consequences.
<!--l. 76--></p><p class="indent" >   Today the best I can do in this situation is send out a notice to all of my users
and wish them the best of luck. I have no idea who issued the POST with the JSON
body that could trigger the bug.
<!--l. 80--></p><p class="indent" >   But a fairly simple feature to implement that could help me out is command
journaling. A command journal is a log of every command given to my system. Most
of these commands will come from my users but some will also be generated
internally as part of maintenance operations.
<!--l. 85--></p><p class="indent" >   If I had a command journal then I could do a search through the journal looking
for commands that would trigger the bug, see which account issued that command
and then notify that account. With a bit of extra effort (depending on the nature of
the bug) I might even be able to suggest what the proper fix is. But I don&#8217;t want to
oversell the capabilities of a command journal. As I discussed in a <a href="http://www.goland.org/thelimitsofcommandjournals/" >dedicated article</a>
on the topic there are significant limits to how a command journal can be used in real
world situations.
<!--l. 94--></p><p class="indent" >   So I tend to think of command journals on their own as a way to identify
potentially problematic commands and so hopefully winnow down the users and the
data that needs to be examined in order to recover from the bug.
<!--l. 100--></p><p class="noindent" >
   <h4 class="subsectionHead"><span class="titlemark">1.3   </span> <a id="x1-50001.3"></a>Add tombstoning</h4>
<!--l. 102--></p><p class="noindent" >The <a href="http://www.goland.org/thelimitsofcommandjournals/" >previously discussed</a> limitations of command journaling make recovering from
some fairly simple bugs harder than it needs to be. For example, let&#8217;s say I have a
bug that ends up deleting a row instead of updating it. Trying to recover the
lost data using a command journal would, in the general case, require replaying some
or all of the journal. For the reasons explained in the previously linked article I don&#8217;t
think that&#8217;s realistic.
<!--l. 111--></p><p class="indent" >   So this means once data is deleted from my Windows Azure Table Store that&#8217;s it,
it&#8217;s gone, which makes recovering from a typical delete bug pretty much impossible.
A reasonably simple solution to this problem is called tombstoning. This
is a technique whereby information isn&#8217;t deleted; instead it is marked as
deleted. And what is marked as deleted can always be unmarked later if
necessary.
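</p><p class="indent" >   In its simplest form tombstoning might look like the Python sketch below. It assumes a separate tombstone table keyed by the same partition key and row key as the main table; the dictionaries and function names are hypothetical and nothing here is an Azure API.

```python
main_table = {}       # (partition key, row key) -> row
tombstone_table = {}  # same keys, holds rows that are marked as deleted

def delete(key):
    """Mark a row as deleted by moving it to the tombstone table."""
    tombstone_table[key] = main_table.pop(key)

def undelete(key):
    """Unmark a row by moving it back to the main table."""
    main_table[key] = tombstone_table.pop(key)

key = ("customers", "42")
main_table[key] = {"name": "Ada"}

delete(key)
gone_from_main = key not in main_table
kept_in_tombstones = key in tombstone_table

undelete(key)
```

A buggy delete now costs a move rather than the data itself. The price is extra storage plus some policy for when tombstones are finally purged.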
<!--l. 119--></p><p class="noindent" >
   <h4 class="subsectionHead"><span class="titlemark">1.4   </span> <a id="x1-60001.4"></a>Add versioning</h4>
<!--l. 121--></p><p class="noindent" >Now let&#8217;s say I have a really nasty bug where it turns out that in some cases I used
an object across threads that wasn&#8217;t actually thread safe. Rather than just crashing
nicely the shared object caused data corruption. I might have a shot at identifying
which commands could run into the problem and then looking in the data store to
see if the value written in the data store is the value I would expect from the
command.
<!--l. 129--></p><p class="indent" >   Except if I have a disconnect between the value I was expecting and the value
that was present I have a conundrum. The disconnect could be because the bug
manifested itself or it could simply be that someone later came along and overwrote
the potentially wrong value. How can I tell the difference? In theory I could replay
the command journal and see if anyone ever issued any command that would alter
the suspect value(s) but as <a href="http://www.goland.org/thelimitsofcommandjournals/" >previously discussed</a> I don&#8217;t think that&#8217;s realistic in the
general case.
<!--l. 138--></p><p class="indent" >   Another technique would be to use some kind of command ID for each command
in the command journal and then mark any updates with that ID. But that wouldn&#8217;t
handle the case where someone just blindly wrote back the same value that they
previously read in. This would look like an update (since the command ID would be
different) but in fact it isn&#8217;t.
<!--l. 145--></p><p class="indent" >   Another alternative is table versioning. Imagine if every row in my Windows
Azure Table Store was versioned. I could find the version in the table store that
contained the value written by the command and see if it matches what the
command should have done. If it doesn&#8217;t then I can look to see if there are any
subsequent updates to that row. If not then I know I have an error condition
and can either fix it or at least tell the user which data in which location is
problematic.
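</p><p class="indent" >   That check can be sketched in Python, assuming each row keeps a history of (command ID, value) pairs with the oldest first. The history layout, the diagnose function and its result strings are assumptions made for this example.

```python
def diagnose(history, command_id, expected):
    """Classify what a command did to one versioned row.

    history is a list of (command ID, value) versions, oldest first.
    """
    for index, (written_by, value) in enumerate(history):
        if written_by == command_id:
            if value == expected:
                return "ok"
            # Wrong value written: was it replaced by a later update?
            later = history[index + 1:]
            return "overwritten later" if later else "still corrupt"
    return "no write found"

history = [("c1", 5), ("c2", 7)]

results = [
    diagnose(history, "c1", 5),  # c1 wrote what it should have
    diagnose(history, "c1", 6),  # c1 wrote a bad value, c2 replaced it
    diagnose(history, "c2", 8),  # c2 wrote a bad value and it is still live
    diagnose(history, "c9", 1),  # c9 never wrote to this row
]
```

The still corrupt case is the one that needs a fix or a notice to the user. The overwritten case is murkier, since someone may have read the bad value before it was replaced.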
<!--l. 154--></p><p class="noindent" >
   <h4 class="subsectionHead"><span class="titlemark">1.5   </span> <a id="x1-70001.5"></a>There&#8217;s no such thing as a free lunch</h4>
<!--l. 156--></p><p class="noindent" >All of the previous features can be implemented, today, over the Windows Azure
Table Store. If time permits I hope to write a few articles explaining how to do so in
ways that are scalable and don&#8217;t have huge performance penalties. But for now if I
want these features I have to implement them myself. So I&#8217;ll have to make a service
by service call on whether it&#8217;s worth the effort.
<!--l. 163--></p><p class="indent" >   In the long run however I hope to see these features available on top of
Windows Azure Table Store. Once these are off the shelf functionality the math
on which ones to use changes significantly over having to implement them
myself.
<!--l. 169--></p><p class="noindent" >
   <h3 class="sectionHead"><span class="titlemark">2   </span> <a id="x1-80002"></a>Table deletion</h3>
<!--l. 171--></p><p class="noindent" >Another self inflicted wound I discussed in my <a href="http://www.goland.org/do-i-need-to-backupjournal-my-windows-azure-table-store/" >previous article</a> was accidentally
deleting one of my Windows Azure Tables. This is incredibly easy to do in Windows
Azure Tables, just one HTTP DELETE will do it. As <a href="http://www.goland.org/thelimitsofcommandjournals/" >previously explained</a> I don&#8217;t
believe that Command Journals can be relied upon to recover from total data loss
because I don&#8217;t feel comfortable that I can replay the journal (and even if I could I
doubt I could afford the time and resources necessary to do so). So the only strategy
that might help me here is versioning.
<!--l. 181--></p><p class="indent" >   But really I don&#8217;t think that versioning is the right strategy here either. I think
the right strategy is talking to the Windows Azure Table Store team and getting
them to do two things:
     <ol class="enumerate1" >
     <li class="enumerate" id="x1-8002x1">Implement Undelete - We need an undelete command along with some
     kind of guarantee about how long a deleted table will remain
     recoverable before being garbage collected.
     </li>
     <li class="enumerate" id="x1-8004x2">Add ACLs - Right now every component I have that has any reason to
     interact with my Windows Azure Table Store can do everything up to and
     including deleting the table. I would love to have an ACL system so I can
     lock down components to just the features they need to do their job so
     the scope of their screw ups is reduced.</li></ol>
<!--l. 194--></p><p class="noindent" >If these are features you would like to see in Windows Azure Table Store then you
need to let Microsoft know. I believe one way to do that is to go vote on
<a href="http://www.mygreatwindowsazureidea.com/" >www.mygreatwindowsazureidea.com</a>. Jamie Thomson started <a href="http://www.mygreatwindowsazureidea.com/pages/34192-windows-azure-feature-voting/suggestions/426220-build-a-journaling-system-for-azure-table-storage?ref=title" >a vote</a> to ask for
journaling for Azure Table Store. Personally I&#8217;d rather see that modified to ask for a
versioning interface. In theory it&#8217;s really easy to replay a table journal (which unlike a
command journal just contains simple CRUD commands limited to a single table)
but in practice there are versioning and other issues that can get in the way (e.g. if
Windows Azure Table Store changes/enhances its logic in any way over time). If we
had a versioning store instead then we wouldn&#8217;t care. The difference is between
recording &#8217;before&#8217; (a table journal) and &#8217;after&#8217; (a versioning store). &#8217;after&#8217; is easier to
deal with. But whatever, this can all get figured out if the basic idea of having some
kind of versioning/journal story gets adopted by Azure. So if you believe,
vote!
<!--l. 212--></p><p class="noindent" >
   <h3 class="sectionHead"><span class="titlemark">3   </span> <a id="x1-90003"></a>Schema update failure</h3>
<!--l. 214--></p><p class="noindent" >The last self inflicted wound from my <a href="http://www.goland.org/do-i-need-to-backupjournal-my-windows-azure-table-store/" >previous article</a> was screwing up a schema
update. This is when I change the structure and meaning of my tables in a
non-backwards compatible way. This is an area rich in potential for data
corruption.
<!--l. 219--></p><p class="indent" >   I generally won&#8217;t do a schema update in place. In other words if I need to make a
non-backwards compatible change to the schema of one or more tables the way I&#8217;m going to
do it is to create a completely new set of tables that are set up using the
new schema. Then typically I&#8217;m going to tell my users &#8220;I&#8217;ll support the old
service on the old tables for X months then retire it, if you want to be on
the new system you need to move your data to the new system.&#8221; I will, of
course, provide tools to help with the transfer but this is one of those things
that I think has to be left to the end user. But even if I&#8217;m forced to handle
moving the data myself I will still use a model where a user is required to
say they want to move because once they do move their old data won&#8217;t be
available in the V1 system any longer. They and all their users will have to
move.
<!--l. 233--></p><p class="indent" >   So at that point moving the user is really just an application logic scenario where
the initial command is &#8216;move data from table A to table B&#8217;. Now I can model error
recovery using the same techniques I previously discussed for application logic
failure.
<!--l. 238--></p><p class="indent" >   If, on the other hand, I have to support accounts on both V1 and V2
simultaneously then I doubt a breaking schema change is feasible. See my article
on <a href="http://www.goland.org/webservicesn1versioning/" >versioning Web Services</a> for more information on my thinking in this
area.
<a id="Q1-1-10"></a>
</p>]]></content:encoded>
			<wfw:commentRss>http://www.goland.org/techniques-to-ease-recovering-from-self-inflicted-data-corruption/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>The Limits of Command Journals</title>
		<link>http://www.goland.org/thelimitsofcommandjournals/</link>
		<comments>http://www.goland.org/thelimitsofcommandjournals/#comments</comments>
		<pubDate>Thu, 24 Dec 2009 02:32:08 +0000</pubDate>
		<dc:creator>Administrator</dc:creator>
				<category><![CDATA[SOA/Web/Etc.]]></category>

		<guid isPermaLink="false">http://www.goland.org/?p=656</guid>
		<description><![CDATA[In a previous article I argued that I needed some kind of
     journaling/backup for my Windows [...]]]></description>
			<content:encoded><![CDATA[     <!--l. 31--><p class="indent" >    <span class="aer-9">In a </span><a href="http://www.goland.org/do-i-need-to-backupjournal-my-windows-azure-table-store/" ><span class="aer-9">previous article</span></a> <span class="aer-9">I argued that I needed some kind of</span>
     <span class="aer-9">journaling/backup for my Windows Azure Tables in order to make it</span>
     <span class="aer-9">easier for me to recover from my own screw ups. One type of journaling</span>
     <span class="aer-9">I suggested was command journaling. In this article I look at the</span>
     <span class="aer-9">practical limitations of command journals and conclude that while they</span>
     <span class="aer-9">are (somewhat) useful for notifying users who might have been affected</span>
     <span class="aer-9">by data corruption they aren&#8217;t likely in the general case to be re-playable</span>
     <span class="aer-9">so their real value is probably less than it might appear.</span>
</p><p>This article is part of a series. Click <a href="http://www.goland.org/recovering-from-self-inflicted-data-corruption-a-summary/">here</a> to see a summary and the complete list of articles in the series.</p>
<span id="more-656"></span> 
       <h3 class="likesectionHead"><a id="x1-1000"></a><span class="aer-9">Contents</span></h3>
       <div class="tableofcontents">
       <span class="sectionToc" ><span class="aer-9">1 </span><a href="#x1-20001" id="QQ2-1-2"><span class="aer-9">Defining Command Journaling</span></a></span>
     <br />     <span class="sectionToc" ><span class="aer-9">2 </span><a href="#x1-30002" id="QQ2-1-3"><span class="aer-9">It costs money to implement a command journal replay facility</span></a></span>
     <br />     <span class="sectionToc" ><span class="aer-9">3 </span><a href="#x1-40003" id="QQ2-1-4"><span class="aer-9">It costs serious money to implement a command journal replay facility correctly</span></a></span>
     <br />     <span class="sectionToc" ><span class="aer-9">4 </span><a href="#x1-50004" id="QQ2-1-5"><span class="aer-9">So what good is a command journal if we can&#8217;t replay it?</span></a></span>
     <br />     <span class="sectionToc" ><span class="aer-9">A </span><a href="#x1-6000A" id="QQ2-1-7"><span class="aer-9">But what if the command journal contained no failed commands?</span></a></span>
       </div>

                                                                  

                                                                  
<!--l. 44--><p class="noindent" >
   <h3 class="sectionHead"><span class="titlemark">1   </span> <a id="x1-20001"></a>Defining Command Journaling</h3>
<!--l. 46--></p><p class="noindent" >A command journal is a log of all the commands a service receives from its
customers. Command journals as I think about them typically don&#8217;t include any
information about the response to the command (although this isn&#8217;t a requirement as
we&#8217;ll see below). It&#8217;s also worth keeping in mind that command journals record the
command (or a representation of the command) actually sent by the user. This
means that a single user command could cause state changes and side effects in more
than one place.
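</p><p class="indent" >   As a rough illustration of that definition, a journal entry might look something like the Python sketch below. The names (<code>JournalEntry</code>, <code>CommandJournal</code>) and the field layout are my own assumptions for the example, and the journal is an in-memory stand-in for whatever durable store would really hold it.
</p>
<pre>
```python
import json
import time
from dataclasses import dataclass, field, asdict
from typing import Optional


@dataclass
class JournalEntry:
    """One command exactly as the user sent it (hypothetical layout)."""
    sequence: int                 # position in the journal, assigned on append
    user: str                     # who issued the command
    command: str                  # e.g. "transfer"
    arguments: dict               # the payload the user supplied
    received_at: float = field(default_factory=time.time)
    response: Optional[dict] = None   # optional: typically left empty


class CommandJournal:
    """Append-only, in-memory stand-in for a durable journal."""

    def __init__(self):
        self._entries = []

    def append(self, user, command, arguments):
        entry = JournalEntry(len(self._entries), user, command, dict(arguments))
        self._entries.append(entry)
        return entry

    def entries(self):
        return list(self._entries)

    def dump(self):
        # One JSON document per line, a common shape for an append-only log.
        return "\n".join(json.dumps(asdict(e), sort_keys=True) for e in self._entries)


journal = CommandJournal()
journal.append("alice", "transfer", {"from": "checking", "to": "savings", "amount": 50})
journal.append("bob", "rename_account", {"old": "misc", "new": "travel"})
```
</pre>
<p class="indent" >   Note that the single <code>transfer</code> entry says nothing about the two account rows it will end up touching; the journal only knows what the user asked for.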
<!--l. 56--></p><p class="noindent" >
   <h3 class="sectionHead"><span class="titlemark">2   </span> <a id="x1-30002"></a>It costs money to implement a command journal replay facility</h3>
<!--l. 58--></p><p class="noindent" >If a situation arises where in order to recover from a disaster I need to replay my
command journal I&#8217;ll first have to write software that can handle such a replay. The
command journal isn&#8217;t going to contain all the authentication and authorization
information (or at least it shouldn&#8217;t) so I&#8217;ll need to create a separate command
pathway that can execute the contents of the command journal but bypass
authentication and authorization. This isn&#8217;t too big a deal though because this is
probably just a layer on top of my existing command processing system. I&#8217;ll also,
however, need to either figure out how to disable billing (since I shouldn&#8217;t be charging
for my own recovery) or compensate for any billable events re-running the command
journal causes. And then of course there are side effects of issuing commands
like system alerts, e-mails, etc. I&#8217;ll need to identify all of those and either
disable or compensate for them as well. If I plan on replaying more than a few
commands I&#8217;ll also need to think about how to run the replay in parallel
in such a way that system state isn&#8217;t corrupted and nothing is run out of
order.
<!--l. 76--></p><p class="indent" >   All of this is completely doable but it isn&#8217;t free and it&#8217;s an ongoing expense since
every change in the system&#8217;s command functionality will have to be evaluated
and potentially compensated for in the context of replaying the command
journal.
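</p><p class="indent" >   To give a feel for the moving parts, here is a deliberately small Python sketch of a replay pathway. Everything in it is hypothetical (the <code>Service</code> class, the <code>deposit</code> command, the <code>ReplayContext</code> switch); it just shows a replay mode that skips authentication and suppresses billing and e-mail while still applying state changes. It replays serially, so it sidesteps the parallelism problem entirely.
</p>
<pre>
```python
from dataclasses import dataclass, field


@dataclass
class ReplayContext:
    """Switches a (hypothetical) command processor between live and replay mode."""
    replaying: bool = False
    suppressed: list = field(default_factory=list)   # side effects we skipped


class Service:
    def __init__(self):
        self.balances = {}
        self.emails_sent = []
        self.billable_events = 0

    def handle(self, user, command, args, ctx=None, authenticated=False):
        ctx = ctx or ReplayContext()
        # Replay bypasses authentication: the journal holds no credentials.
        if not ctx.replaying and not authenticated:
            raise PermissionError("live commands must be authenticated")
        if command == "deposit":
            self.balances[user] = self.balances.get(user, 0) + args["amount"]
            self._side_effect(ctx, "bill", user)
            self._side_effect(ctx, "email", user)
        else:
            raise ValueError("unknown command: " + command)

    def _side_effect(self, ctx, kind, user):
        if ctx.replaying:
            ctx.suppressed.append((kind, user))   # record, but do not perform
        elif kind == "bill":
            self.billable_events += 1
        elif kind == "email":
            self.emails_sent.append(user)


def replay(service, journal):
    """Re-run journaled commands strictly in order, with side effects off."""
    ctx = ReplayContext(replaying=True)
    for user, command, args in journal:
        service.handle(user, command, args, ctx=ctx)
    return ctx


journal = [("alice", "deposit", {"amount": 10}), ("alice", "deposit", {"amount": 5})]
restored = Service()
ctx = replay(restored, journal)
```
</pre>
<p class="indent" >   Even in this toy, every new kind of side effect has to be routed through <code>_side_effect</code> or the replay will quietly bill or e-mail someone, which is exactly the ongoing maintenance cost I&#8217;m worried about.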
<!--l. 83--></p><p class="noindent" >
   <h3 class="sectionHead"><span class="titlemark">3   </span> <a id="x1-40003"></a>It costs serious money to implement a command journal replay facility
correctly</h3>
<!--l. 85--></p><p class="noindent" >Let&#8217;s imagine that command X is received and that processing it caused some
unknown number of state updates, at which point the machine that was processing
command X failed.
<!--l. 89--></p><p class="indent" >   Presumably the command journal was updated with command X before any of
the other processing could start (which means the command journal now adds an
extra internal round trip to at least all write commands). So we know that
command X was submitted. What we don&#8217;t know, especially since the machine
                                                                  

                                                                  
processing command X crashed, is what part of command X got implemented.
So when we replay command X what should we do? Let it succeed? Skip
it?
<!--l. 97--></p><p class="indent" >   In theory one could argue it doesn&#8217;t matter. After all, if no one knows what state
the system is in aren&#8217;t we free to put the system in whatever set of states the
processing of command X could potentially have generated? The problem however is
what happens if someone either directly (by doing say a GET) or indirectly (by
issuing a command which depends on the values that were potentially affected by
command X) determined the state of the system? In that case the external actor is
making decisions based on the state the system is in as a consequence of command
X. So if we just replay command X without exactly replicating its actual
(as opposed to theoretical) output then the state the system is in and the
state people think the system is in will not be the same, introducing more
bugs.
<!--l. 110--></p><p class="indent" >   This challenge could also be overcome if we, for example, included not just all
commands in the command journal but also all responses to all commands in the
command journal. Then we could write a simulator that ran through all the
responses and determined if any of them directly or indirectly could tell us the values
that command X created. I&#8217;m trying hard not to think too much about the expense
of writing and maintaining that simulator as well as the cost of storing all that
data.
<!--l. 119--></p><p class="indent" >   All of this having been said it is completely possible to build a correct command
journal. In certain restricted scenarios it might not even be that painful and could
potentially be extremely useful. But in the general case it looks like a nightmare to
me.
<!--l. 125--></p><p class="noindent" >
   <h3 class="sectionHead"><span class="titlemark">4   </span> <a id="x1-50004"></a>So what good is a command journal if we can&#8217;t replay it?</h3>
<!--l. 127--></p><p class="noindent" >The scenario where I invoked the need for a command journal was when my service
incorrectly executed commands it received from users and caused data to be
corrupted. I wanted the command journal so I could look through the commands I
received and potentially identify commands that could have tripped the data
corruption bug. This would allow me to alert users who were most at risk from the
bug and give them pointers on what might have been damaged. But in the general
case when there is a bug all users need to be notified so we are just arguing if there is
a one size fits all notification or if certain users may get an additional notification
with more targeted information. How useful this facility is depends, of course, on the
context.
<a id="Q1-1-6"></a>
<!--l. 144--></p><p class="noindent" >
   <h3 class="sectionHead"><span class="titlemark">A   </span> <a id="x1-6000A"></a>But what if the command journal contained no failed commands?</h3>
                                                                  

                                                                  
<!--l. 146--></p><p class="noindent" >Using two phase commit techniques it is theoretically possible to create a
system where if a command fails any state changes would roll back. And
no, two phase commit is not a sin in a distributed system. I&#8217;ve been on
this <a href="http://www.goland.org/nonacidtwophase/" >hobby horse before</a>, but the bottom line is that one can reasonably
implement a two phase commit in a fully distributed system. But let&#8217;s say
I did that. Let&#8217;s say I have a full 2PC system so that all my failures are
well defined. I am still not sure that replaying the journal would actually
work.
<!--l. 155--></p><p class="indent" >   The main reason is that over any non-trivial period of time I probably deployed
multiple versions of my website. So to fully replicate the actual behavior I would not
just have to replay the commands. I would have to replay the commands with the
exact version of the software that was used when the command was issued. In simple
cases where all aspects of the command were run on a single box then I could
just record the version the box was running in the command journal. But
in non-trivial cases multiple different boxes potentially running different
versions of the software (especially if I use a rolling upgrade system) could have
been involved. I would need as part of the command journaling process to
take a census of all of their versions and which parts of the command they
handled.
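</p><p class="indent" >   A version census could be recorded along these lines. This is only a sketch in Python under my own assumptions: <code>CensusedEntry</code>, the step names and the box names are invented for illustration, and a real system would have to gather this information from every machine as part of handling the command.
</p>
<pre>
```python
from dataclasses import dataclass, field


@dataclass
class CensusedEntry:
    """A journal entry plus the software version of every box that touched it."""
    command: str
    census: dict = field(default_factory=dict)   # step name -> (box, version)

    def record(self, step, box, version):
        self.census[step] = (box, version)

    def versions(self):
        return {version for _box, version in self.census.values()}

    def is_single_version(self):
        return len(self.versions()) == 1


entry = CensusedEntry("place_order")
# During a rolling upgrade the front end is already on 2.1 while storage is on 2.0.
entry.record("validate", box="web-07", version="2.1")
entry.record("write_order", box="store-03", version="2.0")
entry.record("update_inventory", box="store-11", version="2.0")
```
</pre>
<p class="indent" >   Here <code>entry.is_single_version()</code> is false, so a faithful replay of this one command would need both 2.0 and 2.1 running side by side.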
<!--l. 168--></p><p class="indent" >   Again, this is all doable. But my guess is that by the time I&#8217;ve implemented the
2PC logic with roll back, the version census system and built some kind of framework
to host multiple simultaneous versions of my software I&#8217;ve probably already gone out
of business from cost overruns.
<!--l. 173--></p><p class="indent" >   So while, again, I think there are certain limited scenarios where a re-playable
command journal is conceivable I don&#8217;t think I&#8217;ll be building one any time soon for
any of my services.</p>]]></content:encoded>
			<wfw:commentRss>http://www.goland.org/thelimitsofcommandjournals/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Do I need to backup/journal my Windows Azure Table Store?</title>
		<link>http://www.goland.org/do-i-need-to-backupjournal-my-windows-azure-table-store/</link>
		<comments>http://www.goland.org/do-i-need-to-backupjournal-my-windows-azure-table-store/#comments</comments>
		<pubDate>Wed, 23 Dec 2009 01:25:30 +0000</pubDate>
		<dc:creator>Administrator</dc:creator>
				<category><![CDATA[SOA/Web/Etc.]]></category>

		<guid isPermaLink="false">http://www.goland.org/?p=650</guid>
		<description><![CDATA[Windows Azure provides a highly scalable, reliable, fault resistant table
     store. So in theory my service can dump data into the table store and
     walk away secure in the knowledge that I&#8217;ll get back what I put in and
     that the data will [...]]]></description>
			<content:encoded><![CDATA[Windows Azure provides a highly scalable, reliable, fault resistant table
     <span class="aer-9">store. So in theory my service can dump data into the table store and</span>
     <span class="aer-9">walk away secure in the knowledge that I&#8217;ll get back what I put in and</span>
     <span class="aer-9">that the data will be there when I need it. So is there any reason I should</span>
     <span class="aer-9">care about backing up or journaling my Windows Azure Tables? As I</span>
     <span class="aer-9">argue below the answer is - yes. But the reason isn&#8217;t to protect me against</span>
     <span class="aer-9">Azure&#8217;s mistakes, it&#8217;s to protect me from myself.</span>
<p>This article is part of a series. Click <a href="http://www.goland.org/recovering-from-self-inflicted-data-corruption-a-summary/">here</a> to see a summary and the complete list of articles in the series.</p>
<span id="more-650"></span> 
       <h3 class="likesectionHead"><a id="x1-1000"></a><span class="aer-9">Contents</span></h3>
       <div class="tableofcontents">
       <span class="sectionToc" ><span class="aer-9">1 </span><a href="#x1-20001" id="QQ2-1-2"><span class="aer-9">D&#8217;oh! Deleting myself</span></a></span>
     <br />     <span class="sectionToc" ><span class="aer-9">2 </span><a href="#x1-30002" id="QQ2-1-3"><span class="aer-9">Data Migration Failure</span></a></span>
     <br />     <span class="sectionToc" ><span class="aer-9">3 </span><a href="#x1-40003" id="QQ2-1-4"><span class="aer-9">Application Logic Failure</span></a></span>
       </div>
 In building a service based on Windows Azure Table Store I see three classes of
mistakes I can make that are likely going to make me wish I had some kind of
backup/journaling for my Windows Azure Table Store.
                                                                  

                                                                  
     <ol class="enumerate1" >
     <li class="enumerate" id="x1-1002x1">Deleting my own tables in production
     </li>
     <li class="enumerate" id="x1-1004x2">Screwing up a schema change
     </li>
     <li class="enumerate" id="x1-1006x3">Screwing up my application logic</li></ol>
<!--l. 51--><p class="noindent" >
   <h3 class="sectionHead"><span class="titlemark">1   </span> <a id="x1-20001"></a>D&#8217;oh! Deleting myself</h3>
<!--l. 53--></p><p class="noindent" >The Windows Azure Table Service has a nice REST API that contains a nice
DELETE method that can be <a href="http://msdn.microsoft.com/en-us/library/dd179387.aspx" >applied to an entire table</a>. In other words, in a single
REST command I can nuke a production table. I&#8217;m only one misconfigured
maintenance script away from severely hurting myself. I suppose one can argue that
running an Internet service is a job for adults and that adults shouldn&#8217;t have
problems like accidentally deleting production tables but hey I&#8217;m basically a big
scaredy cat and I&#8217;d like a bit more cover.
<!--l. 62--></p><p class="indent" >   Here is where my first request to Azure comes in. Could we please have undelete?
The webpage I previously linked to says:
     <div class="quote">
     <!--l. 65--><p class="noindent" >When a table is successfully deleted, it is immediately marked for
     deletion and is no longer accessible to clients. The table is later
     removed from the Table service during garbage collection.</p></div>
<!--l. 69--></p><p class="noindent" >So perhaps we could let folks set a policy specifying how long a table is guaranteed to
stick around before being deleted and then add in an undelete method?
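</p><p class="indent" >   What I have in mind is roughly the behavior sketched below in Python. To be clear, this is a toy model of the request, not the Azure Table service API: <code>SoftDeleteStore</code>, <code>undelete</code> and the retention policy are all hypothetical, and time is passed in explicitly to keep the example deterministic.
</p>
<pre>
```python
class SoftDeleteStore:
    """Toy table store where DELETE only marks a table; not the Azure API."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self._tables = {}      # name -> rows
        self._deleted = {}     # name -> (rows, time of deletion)

    def create(self, name, rows):
        self._tables[name] = list(rows)

    def delete(self, name, now):
        # Marked for deletion: immediately invisible to clients, not yet gone.
        self._deleted[name] = (self._tables.pop(name), now)

    def undelete(self, name, now):
        rows, deleted_at = self._deleted[name]
        if now - deleted_at > self.retention_seconds:
            raise KeyError(name + " is past its retention window")
        del self._deleted[name]
        self._tables[name] = rows

    def collect_garbage(self, now):
        expired = [n for n, (_rows, t) in self._deleted.items()
                   if now - t > self.retention_seconds]
        for name in expired:
            del self._deleted[name]
        return expired

    def tables(self):
        return sorted(self._tables)


store = SoftDeleteStore(retention_seconds=7 * 24 * 3600)
store.create("orders", [{"id": 1}])
store.delete("orders", now=0)
store.collect_garbage(now=3600)      # one hour later: still recoverable
store.undelete("orders", now=3600)
```
</pre>
<p class="indent" >   With a seven day retention policy the misconfigured maintenance script costs me an hour of downtime rather than the table.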
<!--l. 73--></p><p class="indent" >   While I&#8217;m asking for things, currently all access to the table store is handled via a
single key. So any part of my service that needs access to the table store has a
key that lets it do anything, including things it has no business doing, like
deleting tables. Again, adults should run services securely and although I can
grumble about some defense in depth issues, this single key shouldn&#8217;t really
matter for data integrity issues. After all, why should there be any code
running around issuing DELETEs if it doesn&#8217;t need to? Still, see previous
comment about being a big scaredy cat, I wouldn&#8217;t mind if there were some
more fine grained access control so that parts of my service that need to
interact with the store could only do the things they were supposed to do.
(And yes, I know Blobs have <a href="http://msdn.microsoft.com/en-us/library/dd179391.aspx" >basic ACLs</a>, but I&#8217;m talking about the table
service)
<!--l. 87--></p><p class="indent" >   But who&#8217;s kidding who? Given that I spent most of the last year working on the
<a href="http://msdn.microsoft.com/en-us/library/ee732536.aspx" >AppFabric Access Control Service</a> of course I&#8217;m going to whine about Access
Control.
                                                                  

                                                                  
<!--l. 91--></p><p class="indent" >   In any case, for now, the situation is that if I do something stupid I can seriously
hurt myself. Now, admittedly, that&#8217;s always true, but I really have no objections to a
few safety measures if Azure sees fit to introduce them. But until then I wouldn&#8217;t
mind some kind of backup to keep me from completely screwing myself if I
accidentally nuke a table.
<!--l. 99--></p><p class="noindent" >
   <h3 class="sectionHead"><span class="titlemark">2   </span> <a id="x1-30002"></a>Data Migration Failure</h3>
<!--l. 101--></p><p class="noindent" >When laying out tables in Azure Table Store one makes lots of fun trade offs between
things like referential integrity and performance. Over time the world behind those
trade offs will change and the tables will need to be redesigned. But that introduces a
really rich source of screw ups. Any time data gets moved from one form to another,
especially when large bodies of data are involved, a screw up is all but guaranteed. So
I&#8217;m really going to want to have some kind of backup so that when I get the
inevitable user escalation about data corruption I can at least go back to where I was
before I got into this mess and maybe bring the user back to some reasonable
state.
<!--l. 113--></p><p class="noindent" >
   <h3 class="sectionHead"><span class="titlemark">3   </span> <a id="x1-40003"></a>Application Logic Failure</h3>
<!--l. 115--></p><p class="noindent" >My service receives a command. Based on that command my service performs some
series of actions on our underlying Windows Azure Table Store. All is fine and good
so long as we don&#8217;t screw anything up. Screw ups tend to come in a few basic
flavors:
     <ol class="enumerate1" >
     <li class="enumerate" id="x1-4002x1">We deleted something we shouldn&#8217;t have
     </li>
     <li class="enumerate" id="x1-4004x2">We didn&#8217;t delete something we should have
     </li>
     <li class="enumerate" id="x1-4006x3">We transformed the state of the right row in the wrong way
     </li>
     <li class="enumerate" id="x1-4008x4">We transformed the wrong row</li></ol>
<!--l. 125--></p><p class="noindent" >To be fair when building a service the bulk of the testing is focused on detecting any
logic screw ups that could lead to the previous failures. But any non-trivial Internet
scale service is going to deal with an enormous variety of data input and it&#8217;s
highly unlikely that our tests could ever catch everything. As we say in the
Internet business &#8220;When your data set is large enough there are no edge
cases&#8221;.
                                                                  

                                                                  
<!--l. 132--></p><p class="indent" >   So when we figure out that we have a data corruption bug, what do we do? How
do we know who was affected? What can we repair ourselves? Do we just throw up
our hands and tell our users &#8220;Oh, um... well... you see... we have a problem and you,
dear user, are on your own?&#8221;
<!--l. 137--></p><p class="indent" >   At an absolute minimum I would like to have a command journal that records
every command issued against the system. In my ideal world I would journal data
retrieval as well as manipulation but in practical terms I can probably only afford to
journal commands that change data. If I could build this journal then when I find out
about a data corruption bug caused by the logic in my front end I could at least try
to figure out which of my users were likely to be affected by reviewing the
journal looking for commands (or combinations of commands) that would
trigger the bug. It&#8217;s not great protection but at least it gives me some potential to give my users guidance when I screw up.
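</p><p class="indent" >   The kind of review I&#8217;m imagining is sketched below in Python. The journal layout, the <code>affected_users</code> helper and the example bug (renames containing a slash) are all hypothetical; the idea is simply that once the bug is understood well enough to write a predicate for it, the journal can be scanned for users worth a targeted warning.
</p>
<pre>
```python
from collections import namedtuple

Entry = namedtuple("Entry", "sequence user command arguments")


def affected_users(journal, could_trigger_bug):
    """Return users who issued at least one command matching the predicate."""
    hits = {}
    for entry in journal:
        if could_trigger_bug(entry):
            hits.setdefault(entry.user, []).append(entry.sequence)
    return hits


# Hypothetical bug: renames to a name containing "/" corrupted the row key.
def rename_with_slash(entry):
    return entry.command == "rename" and "/" in entry.arguments.get("new", "")


journal = [
    Entry(0, "alice", "rename", {"old": "a", "new": "docs/2009"}),
    Entry(1, "bob", "rename", {"old": "b", "new": "photos"}),
    Entry(2, "carol", "delete", {"name": "tmp"}),
    Entry(3, "alice", "rename", {"old": "c", "new": "x/y"}),
]
report = affected_users(journal, rename_with_slash)
```
</pre>
<p class="indent" >   The report maps each at-risk user to the journal positions of the suspect commands, which is enough to tell them which of their data to double check.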
<a id="Q1-1-5"></a>
</p>]]></content:encoded>
			<wfw:commentRss>http://www.goland.org/do-i-need-to-backupjournal-my-windows-azure-table-store/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>What do program managers on the Cosmos team do anyway?</title>
		<link>http://www.goland.org/hiringpmsforcosmos/</link>
		<comments>http://www.goland.org/hiringpmsforcosmos/#comments</comments>
		<pubDate>Fri, 18 Jul 2008 00:00:00 +0000</pubDate>
		<dc:creator>Administrator</dc:creator>
				<category><![CDATA[SOA/Web/Etc.]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[In previous articles (here and here) I have talked about what software  program managers do. And in another previous article I talked about Cosmos. In  this article I bring the two topics together and talk about what  Cosmos program managers actually do. (For those just joining us  Cosmos is Microsoft&#39;s internal [...]]]></description>
			<content:encoded><![CDATA[<p>In previous articles (<a href="/whatdoesaprogrammanagerdo" shape="rect">here</a> and <a href="/toolsofthepmtrade" shape="rect">here</a>) I have talked about what software  program managers do. And in another <a href="/whatiscosmos" shape="rect">previous article</a> I talked about Cosmos. In  this article I bring the two topics together and talk about what  Cosmos program managers actually do. (For those just joining us  Cosmos is Microsoft&#39;s internal platform for reliably storing  and processing petabytes of information such as all of  Microsoft&#39;s log data from its various websites.) The issue of  what PMs on the Cosmos team do is near and dear to my heart  because I&#39;m the lead program manager for Cosmos and we are  <a href="http://members.microsoft.com/careers/search/results.aspx?FromCP=Y&amp;JobCategoryCodeID=10015&amp;JobLocationCodeID=&amp;JobProductCodeID=&amp;JobTitleCodeID=&amp;Divisions=&amp;TargetLevels=&amp;Keywords=cosmos&amp;JobCode=&amp;ManagerAlias=&amp;Interval=10" shape="rect">  hiring</a>!</p>
<p>  <span id="more-625"></span>
<p>I talk at length about Cosmos <a href="/whatiscosmos" shape="rect">here</a>  but the general overview is that Cosmos provides a service for  use by internal Microsoft groups to enable them to store and  analyze petabytes worth of data. My team is looking for a few  good technical program managers. Following the structure I used  in my discussion about <a href="/whatdoesaprogrammanagerdo" shape="rect">what</a> PMs do (and make sure to  also check out my article on <a href="/toolsofthepmtrade" shape="rect">how</a>  they do it) here is an outline of what a Cosmos technical PM  does:</p>
<ul>
<li>
<p><b>Dev &amp; Test</b> &#8211; As a Cosmos PM your job is to deeply understand the nitty gritty details of exactly how Cosmos works so that you can make sure we are building the product that our customers actually need. This means that you need to partner with Dev and Test in driving Cosmos&#39;s technical architecture to make sure that we architect our system so as to meet our customers&#39; needs without over investing. You need to be extremely comfortable debating technical architecture and holding your own against some of the smartest engineers around in discussing exactly how Cosmos should evolve. You also need to be very comfortable writing the occasional bit of code. No, you won&#39;t be a programmer, but if we need to do some performance analysis to help us estimate customer needs we expect you to be happy to roll up your sleeves and write Scope code.</p>
</li>
<li>
<p><b>Customers</b> &#8211; Our customers are internal so they are very smart and extremely demanding. You will need to use strong customer empathy to understand not just what customers want but also what they need, and not just today but into the future. One of your most powerful tools in working with Dev and Test and driving Cosmos will be using the detailed information you have about our customers&#39; needs to drive Cosmos&#39;s design and direction.</p>
</li>
<li>
<p><b>Development Dependencies</b> &#8211; We are blessed to be      near the bottom of the technology dependency stack and really      only have a single development dependency, <a href="http://research.microsoft.com/users/misard/abstracts/osr2007.html" shape="rect">      autopilot</a>. We have extremely good relations with them so      this isn&#39;t likely to be a huge part of your job.</p>
</li>
<li>
<p><b>Operations</b> &#8211; We are a service so we have lots of      interactions with operations. Thankfully we have great      relations with them and they know our service very well so we      tend to partner with operations in driving solutions to our      rather unique issues.</p>
</li>
<li>
<p><b>Business Development</b> &#8211; This is the most technical      of technical PM roles so you will find yourself doing      basically no business development. Your focus will be on      solving incredibly hard distributed system problems.</p>
</li>
<li>
<p><b>Marketing/Evangelism/PR</b> &#8211; This is not a big focus for us. Not just because we are an internal only service but more because we already have more customers than we can handle. &#8220;If you build it, they will come&#8221; turns out to be true and we are drowning in requests to use Cosmos.</p>
</li>
<li>
<p><b>Legal/Privacy</b> &#8211; Given the incredible sensitivity of      the data we store privacy in particular is a huge issue. Our      job is to make sure that Cosmos gives its users the tools      they need to do the right thing in terms of protecting user      privacy.</p>
</li>
<li>
<p><b>Vendors</b> &#8211; We certainly do use vendors for certain      ancillary tasks so depending on your seniority and particular      area of focus it is quite possible that you might be asked to      help us manage different kinds of vendors.</p>
</li>
</ul>
<p>In terms of qualifications see the job postings but the real summary is that we are looking for technical people with strong computer science backgrounds who are extremely intelligent. We don&#39;t really care if you have previous experience with distributed systems. We are running on the cutting edge so we expect to train anyone we hire. Also don&#39;t particularly worry if you don&#39;t have previous PM experience. If you are a rock star developer or tester who has read my articles on <a href="/whatdoesaprogrammanagerdo" shape="rect">what</a> PMs do and <a href="/toolsofthepmtrade" shape="rect">how</a> they do it and think you would be a good fit then drop me a line.</p>
<p>The easiest way to apply is to send me your resume (and when  writing your resume please review this <a href="/SolidResumes" shape="rect">article</a>) to yarong@[insert the bloody obvious  mega corporation name here].com. Or you can submit your resume  through our job site. Just go to any of the Cosmos PM job ads  (<a href="http://members.microsoft.com/careers/search/details.aspx?JobID=9B44AA1A-74F4-48F0-AF01-8A0D3AD8BAAF&amp;start=1&amp;interval=10&amp;SortCol=DatePosted" shape="rect">  1</a>, <a href="http://members.microsoft.com/careers/search/details.aspx?JobID=2480D55C-4027-4FF1-A465-EB8A3D7BB2A1&amp;start=1&amp;interval=10&amp;SortCol=DatePosted" shape="rect">  2</a> &amp; <a href="http://members.microsoft.com/careers/search/details.aspx?JobID=E2BC0C24-8D81-42E1-A70A-CAE173C86442&amp;start=1&amp;interval=10&amp;SortCol=DatePosted" shape="rect">  3</a>) and hit &#39;Submit Resume&#39;.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.goland.org/hiringpmsforcosmos/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
