One of the buzz phrases presently doing the rounds is ‘Big Data’. ThoughtWorks’ Ken Collier gives a great introduction for CIOs and the like. Implicit in it is that “Small Data” means traditional RDBMSs.

Very Small Data

I wonder whether SCMs as data stores could be slotted in as Very Small Data.

Primary Criteria

Considering DB-centric criteria, SCM could be a choice if:

  • Records are written infrequently compared to reads
    • A cache will make reads ‘faster’, and writes being ‘slow’ isn’t a problem
  • Not a lot of ‘records’ by any definition, and the same records are typically overwritten
  • The SCM server’s storage capacity is never going to be remotely threatened
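
As a rough illustration of the first criterion, here is a minimal sketch of a read-through cache sitting in front of a slow store. Everything in it is hypothetical: `CachedDocumentStore` and `InMemoryBackend` are made-up names, and the backend merely stands in for an SCM server rather than talking to a real one.

```python
class InMemoryBackend:
    """Hypothetical stand-in for an SCM server: slow, but authoritative."""

    def __init__(self):
        self.docs = {}
        self.reads = 0  # counts how often the 'slow' store is actually hit

    def read(self, path):
        self.reads += 1
        return self.docs[path]

    def write(self, path, text):
        self.docs[path] = text


class CachedDocumentStore:
    """Read-through cache: reads are served from memory when possible."""

    def __init__(self, backend):
        self._backend = backend
        self._cache = {}

    def read(self, path):
        # Only go to the backend on a cache miss.
        if path not in self._cache:
            self._cache[path] = self._backend.read(path)
        return self._cache[path]

    def write(self, path, text):
        # Writes go straight to the backend (the 'slow' part),
        # then refresh the cached copy so later reads stay fast.
        self._backend.write(path, text)
        self._cache[path] = text
```

With rare writes and frequent reads, nearly every read is answered from the cache, so the cost of a commit-style write matters much less.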

Secondary Criteria

There are some aspects that the use of an SCM suggests that aren’t on the data read/write/size spectrum above:

  • Should really be textual documents (unicode representation)
    • Documents should really make use of line breaks, so that line-based diffs and merges stay meaningful
    • Documents are pretty/indented/regular
  • Typically schema-less
    • Document formats can vary over time
  • Change history needs to be retained
  • Changes might be better as a set
    • Comments pertaining to a change-set are appropriate
  • If two people try to update the same document, then the following are OK resolutions:
    • Accept ‘theirs’ (potentially start over on ‘your’ changes)
    • Assert yours/mine (essentially undo ‘their’ change)
    • Merge automagically (yay!) or manually (gulp)
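
The three resolutions above can be sketched in a few lines. This is only an illustration under loud assumptions: `resolve`, `merge_lines` and `MergeConflict` are hypothetical names, and the merge is a deliberately naive line-by-line three-way merge that gives up whenever line counts differ. A real SCM does far better.

```python
class MergeConflict(Exception):
    """Raised when the naive merge cannot decide and a human must step in."""


def merge_lines(base, theirs, mine):
    """Naive three-way merge; only handles documents with equal line counts."""
    b, t, m = base.splitlines(), theirs.splitlines(), mine.splitlines()
    if not (len(b) == len(t) == len(m)):
        raise MergeConflict("line counts differ; merge manually")
    merged = []
    for number, (b_line, t_line, m_line) in enumerate(zip(b, t, m), start=1):
        if t_line == m_line:
            merged.append(t_line)   # both sides agree
        elif t_line == b_line:
            merged.append(m_line)   # only 'mine' changed this line
        elif m_line == b_line:
            merged.append(t_line)   # only 'theirs' changed this line
        else:
            raise MergeConflict(f"both sides changed line {number}")
    return "\n".join(merged)


def resolve(base, theirs, mine, strategy):
    """Pick one of the three resolutions for a concurrent update."""
    if strategy == "theirs":
        return theirs               # accept theirs, start over on mine
    if strategy == "mine":
        return mine                 # assert mine, undoing their change
    if strategy == "merge":
        return merge_lines(base, theirs, mine)
    raise ValueError(f"unknown strategy: {strategy}")
```

For example, with a base of `"a\nb\nc"`, theirs as `"A\nb\nc"` and mine as `"a\nb\nC"`, the `"merge"` strategy yields `"A\nb\nC"`. If both sides had changed the first line, `MergeConflict` would be raised instead. Note that the merge only has a chance of working automatically because the documents are split across lines.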

Tertiary Criteria

  • Multiple branches could be important. If so:
    • Maintained divergence could be a factor
    • Merges would be meaningful
    • Different read/write permissions per branch (this rules out Git, sadly)

Maintained Divergence is an interesting concept worth its own blog entry.

A Continuing Obsession

Of course, this is part of my general source control obsession, so feel free to ignore me. Logan McGrath, foolishly, asked me what to do on his first day in the Dallas ThoughtWorks office (he wasn’t assigned to a project at the time), so I inducted him into that obsession. I’m ‘product owner’ for an app that uses Perforce, Sinatra and AngularJS, and he’s coding it :)