bjc's blog

Let's walk through my daily workflow, up to the point where I'm ready to commit back to my upstream SVN repository.

Since I work on a fair number of branches and I don't always know what I worked on last, I run git-status to see which branch I'm in, and any uncommitted work. This command shows my current branch, uncommitted files, and files in the index cache [1].

I like to keep my tree as fresh as possible, so I'll next run git-svn rebase to update my tree with the latest from SVN. You can't run a rebase, however, if your working tree is dirty, so I'll clean up what I can using git-commit and git-add --patch [2]. Any code I can't fold into a commit, I put aside for a moment with git-stash.

When the rebase is completed, I'll put any stashed code back with git-stash apply. Now my tree is up-to-date, and I have all my local changes worked into it.

The rest of the day proceeds pretty normally. I hack on code, and I'll commit whenever I see fit (which is often - as I've mentioned, I'm a Machine Gun Committer).

We haven't committed any of this back to the SVN repository yet, but in the next article we'll cover that, along with how I fool my coworkers into thinking I magically do everything in one commit, even though I've actually committed very frequently.

  1. Git's index cache (a.k.a. Staging Area) is an intermediate location between the Git repository and the current working copy. Coming from SVN, this may seem a little baffling, but it exists as a way to shape commits before they are recorded. You add files to the cache using git-add, and git-commit commits everything in the cache to the repository.
  2. git-add --patch, along with git-rebase -i branch, are some of the most useful and world-shaking things that come with git. In a future article I'll take a look at both of these commands to highlight some of their utility.

The other day, I made a mistake while committing some work I'd done via Git to our SVN repository. Since this series of articles is about working with Git and SVN, I wanted to take a short time to look over what happened as well as a small change in my workflow that will prevent it from happening in the future.

The mistake was that I accidentally pushed a file I was working on to our main repository. There was no damage caused, as the file wasn't in use by anything in production yet, but I'd rather not have pushed it at all.

The solution is to diff my work against the current repository and make sure only the changes I want are going to be pushed. But since Git doesn't work in a centralized fashion, you can't run a straight diff: you first have to know which revision to use as the left hand side. In Subversion, this is straightforward as it's centralized; the left hand side is the SVN repository HEAD, and the right hand side is your uncommitted changes.

Well, with Git, it's only a tiny bit harder: all you need to know is that remote repositories have special "tracking" branches, in Git's "remotes" branch hierarchy. When you initialized your repository with git-svn, it also created a "remotes/trunk" reference that you can use. Therefore:

% git-diff remotes/trunk
...

will show me a diff of exactly what will be committed. As I roll this into my workflow, I should be making fewer and fewer of these kinds of errors.

I've been using Git for my personal projects for about six months now, but I've only recently taken the plunge into using it for work projects as well.

There are many good articles and talks out there on what Git is and what makes it different from "normal" version control systems like CVS and Subversion (SVN). In the interests of brevity, I will only summarize: with Git, your entire repository is cloned onto every developer's computer, and this opens up a huge range of possibilities for repository management. In addition, each Git revision is identified by a cryptographic hash that depends on all previous revisions, so corruption of a repository's history, intentional or not, is readily detected.

We use Subversion here as our main source repository, and Subversion has worked very well for us in most ways. However, we like to use branches a lot, to segregate our work until it's ready to be merged, and this is one area where SVN just doesn't cut the mustard. To work around SVN's limitations, we tag our merge commits with the left hand and right hand revisions in the commit messages. This is passable, but error-prone and brittle.

In addition, when I'm coding, I tend to be a Machine Gun Committer. I commit constantly so I can always get a fine-grained picture of how I've been working and what I've been doing. With a little bit of legwork, this also allows me to more accurately pinpoint where my bugs showed up. However, it annoys my coworkers to no end, and I can see their point. It is pretty obnoxious.

Git offers the solutions to all these problems and more, all while integrating cleanly with our existing Subversion repository.

Let's start by first cloning your existing SVN repository into Git with git-svn:

% git-svn clone http://localhost/mysvnproj -T trunk -b branches -t tags
...

Now you have a private copy of mysvnproj (and its entire history) in Git. A quick note: your "trunk" branch in SVN is called "master" in Git. You can change this behavior, but by default this is what you'll get.

Any changes you make in mysvnproj can be pushed back to the SVN repository when you're finished with "git-svn dcommit":

% git-svn dcommit
...

Pulling updates from the SVN repository into your Git repository is done with "git-svn rebase":

% git-svn rebase
...

This will not commit any of your local changes, but "rebase" them onto the new SVN head. In git terminology, "rebasing" takes a set of changes and applies them to a new head. I find it's useful to imagine this as a graph:

          /------ B
         /
A ------/-------- master

If I rebase B to master, the new change graph looks like this:

                 /------ B
                /
A -------------- master

This changes my local commit history to look like B inherited from the current master, rather than at some point in its history.
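If the graphs are hard to picture, a toy model may help. The sketch below is only an illustration in plain JavaScript, not how Git stores or rewrites history: the makeCommit, ancestry, and rebase helpers and the commit ids are all made up for the example. It replays the commits that exist only on a branch on top of the upstream tip, giving each replayed commit a new id.

```javascript
// Toy model only: a commit is an id, a parent pointer, and a description.
function makeCommit(id, parent, change) {
  return { id: id, parent: parent, change: change };
}

// Walk parent pointers and return the ids from the tip back to the root.
function ancestry(tip) {
  var ids = [];
  var c;
  for (c = tip; c; c = c.parent) {
    ids.push(c.id);
  }
  return ids;
}

// Replay the commits that are on `branch` but not on `upstream` on top
// of upstream's tip, oldest first. Replayed commits get new ids because
// their parent has changed.
function rebase(branch, upstream) {
  var upstreamIds = ancestry(upstream);
  var toReplay = [];
  var c, tip, i;

  for (c = branch; c && upstreamIds.indexOf(c.id) === -1; c = c.parent) {
    toReplay.unshift(c);
  }

  tip = upstream;
  for (i = 0; i < toReplay.length; i++) {
    tip = makeCommit(toReplay[i].id + "'", tip, toReplay[i].change);
  }
  return tip;
}

var a = makeCommit('a', null, 'initial import');
var m1 = makeCommit('m1', a, 'upstream work');    // master moves ahead
var b1 = makeCommit('b1', a, 'my first change');  // B forked from a
var b2 = makeCommit('b2', b1, 'my second change');

var rebased = rebase(b2, m1);
var history = ancestry(rebased).join(' -> ');
```

After the rebase, the branch's history runs through the upstream tip (m1) instead of forking off before it, which is the second graph above.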

Now, the only other thing we have to cover in this installment is using an SVN-hosted branch. This is subtly different from a normal Git branch, because it backends into SVN and thus can't use much of Git's history analysis for branching and merging. Because merging isn't made any easier (and in some ways is made harder) by using SVN branches, I prefer not to use them. Nevertheless, sometimes you might need to.

The good news is that once a branch has been created in SVN, you can use that branch in Git fairly easily:

% git-checkout mysvnbranch
% git-checkout -b local/mysvnbranch

This will tie your SVN branch to local/mysvnbranch so "git-svn dcommit" will Do The Right Thing™.

Over the next few days we'll dive into some workflows that will help show the power of Git, as well as solve the problems I've outlined above. Stay tuned!

There are plenty of libraries out there for attaching scripted behavior to HTML elements. Unfortunately, we're using Prototype, which lacks support for this. While we could integrate one of these libraries, in theory, I believe it's not worth the side-effects: possible destabilization due to conflicts, learning a new library, and additional download time for our clients.

Instead, I wrote one up myself. It doesn't have to do very much: scan the DOM when the page loads and monitor the DOM for dynamic updates. I've chosen to do strict class-based behavior. To enable behavior on an element, you must do three things:

  1. On the JavaScript side, you code up the actual behavior (for instance, submitting an Ajax request for a link, instead of following it normally).
  2. On the HTML side of things, you have to add a class to your element so our DOM watcher can know where to attach it. For instance, you would specify a class of "async_link" on the A tags where you want an Ajax request submitted.
  3. Finally, you have to inform your DOM watcher that a given CSS class has a given JS behavior. With the module I've written, this is as simple as:
    DOMWatcher.EventHandlers.async_link = AsyncLink.Watcher;

Well, that's the theory anyway. On to the code.

First, let's create a module to contain all this code, along with the public API methods:

var DOMWatcher = function () {
  return {
    EventHandlers: {},
 
    scanDocument: function () {
      attachBehavior();
    },
 
    addWatcher: function (klass, watcher) {
      DOMWatcher.EventHandlers[klass] = watcher;
      if (document.body) {
        attachBehavior();
      }
    },
 
    removeWatcher: function (klass, watcher) {
      delete DOMWatcher.EventHandlers[klass];
    }
  };
}();

Simple enough: a method to scan the document and attach behavior, and a couple of accessors to add and remove behavior after the document has been loaded.

To define the element behaviors themselves, I've gone with a simple map of CSS classes to behavior objects, which is stored in DOMWatcher.EventHandlers. The format of this object is straightforward: a behavior object may contain a setup method, which is called with this set to the element being initialized, and methods starting with 'on', each of which names the event on that element to attach behavior to.

Here's a simple example:

var AlertLink = {
  setup: function () {
    this._alert = 'Hello World!';
  },
 
  onclick: function (event) {
    alert(this._alert);
  }
};
DOMWatcher.EventHandlers.alert_link = AlertLink;

During execution of behavior methods, I want the this object to be set to the element to which the behavior applies. It's not terribly important (we could just pass it in), but I like this way better.

So now that we know what we want the code to look like, let's have a go at the attachBehavior function which makes all this possible:

var DOMWatcher = function () {
  var ATTRIBUTE_BOUND = '_DOMWatcher_bound';
 
  function attachBehavior(target) {
    var elements, elt, klass, i, length, handler, method;
 
    target = target || document.body;
    elements = target.getElementsByTagName('*');
    for (i = 0, length = elements.length; i < length; i++) {
      elt = $(elements[i]);
      for (klass in DOMWatcher.EventHandlers) {
        if (DOMWatcher.EventHandlers.hasOwnProperty(klass) &&
            !elt[ATTRIBUTE_BOUND] && elt.hasClassName(klass)) {
          elt[ATTRIBUTE_BOUND] = true;
 
          handler = DOMWatcher.EventHandlers[klass];
          if (handler.setup) {
            handler.setup.call(elt);
          }
 
          for (method in handler) {
            if (method.substring(0, 2) == 'on') {
              Event.observe(elt, method.substring(2, method.length),
                            handler[method].bindAsEventListener(elt));
            }
          }
        }
      }
    }
  }
 
  ...
 
}();

Let's break that down: first we grab the node from which to begin scanning, defaulting to document.body if it wasn't passed in, and grab all of its descendant elements:

target = target || document.body;
elements = target.getElementsByTagName('*');

Once we have all the descendant elements, we can iterate over them, looking for classes with behavior defined:

for (i = 0, length = elements.length; i < length; i++) {
  elt = $(elements[i]);
  for (klass in DOMWatcher.EventHandlers) {
    if (DOMWatcher.EventHandlers.hasOwnProperty(klass) &&
        !elt[ATTRIBUTE_BOUND] && elt.hasClassName(klass)) {
      elt[ATTRIBUTE_BOUND] = true;
 
      ...
 
    }
  }
}

Note that we're using elt[ATTRIBUTE_BOUND] in order to store whether or not we've already attached behavior to this element, as an optimization to prevent reattaching behavior.

With the element stored in our temporary elt variable and a behavior class in klass, we can now run the setup routine and attach event handlers from DOMWatcher.EventHandlers:

...
 
handler = DOMWatcher.EventHandlers[klass];
if (handler.setup) {
  handler.setup.call(elt);
}
 
for (method in handler) {
  if (method.substring(0, 2) == 'on') {
    Event.observe(elt, method.substring(2, method.length),
                  handler[method].bindAsEventListener(elt));
  }
}
 
...
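To see the naming convention in isolation, here is a minimal sketch that runs without a browser or Prototype. The makeFakeElement and attachHandler helpers are hypothetical stand-ins for a DOM element and for the inner part of attachBehavior (with the element's observe method playing the role of Event.observe); only the 'on' prefix rule and the use of call to set this are taken from the code above.

```javascript
// Hypothetical stand-in for a DOM element: it only records listeners.
function makeFakeElement(className) {
  return {
    className: className,
    listeners: {},
    observe: function (eventName, fn) {
      this.listeners[eventName] = fn;
    }
  };
}

// Attach one behavior object to one element: run setup with `this` set
// to the element, then register every method whose name starts with 'on'.
function attachHandler(elt, handler) {
  var method;

  if (handler.setup) {
    handler.setup.call(elt);
  }

  for (method in handler) {
    if (handler.hasOwnProperty(method) && method.substring(0, 2) === 'on') {
      // 'onclick' becomes the event name 'click'. The inner function
      // captures fn so each listener calls its own handler method.
      elt.observe(method.substring(2), (function (fn) {
        return function (event) {
          return fn.call(elt, event);
        };
      })(handler[method]));
    }
  }
}

var greeter = {
  setup: function () {
    this._greeting = 'Hello World!';
  },
  onclick: function (event) {
    return this._greeting;
  }
};

var link = makeFakeElement('alert_link');
attachHandler(link, greeter);
var clickResult = link.listeners.click({});
```

Because both setup and onclick run with this bound to the element, state stored during setup (here _greeting) is visible to the event handler without any extra plumbing.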

Now let's set up the scan to happen when the document is finished loading, so our behavior will get attached when the page is ready:

Event.observe(window, 'load', DOMWatcher.scanDocument);
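Since the scan may end up running more than once per page, it is worth checking that repeated scans are harmless. The following is a rough sketch with plain objects in place of DOM elements; the scan function, the BOUND flag name, and the classNames arrays are assumptions made for the example, standing in for attachBehavior, ATTRIBUTE_BOUND, and hasClassName.

```javascript
var BOUND = '_bound'; // hypothetical flag name, playing the role of ATTRIBUTE_BOUND

// Scan a list of element-like objects and return how many were newly
// bound to a behavior during this pass.
function scan(elements, handlers) {
  var attached = 0;
  var i, klass, elt;

  for (i = 0; i < elements.length; i++) {
    elt = elements[i];
    for (klass in handlers) {
      if (handlers.hasOwnProperty(klass) &&
          !elt[BOUND] && elt.classNames.indexOf(klass) !== -1) {
        elt[BOUND] = true;
        handlers[klass].setup.call(elt);
        attached++;
      }
    }
  }
  return attached;
}

var handlers = {
  async_link: { setup: function () { this.ready = true; } }
};
var elements = [
  { classNames: ['async_link'] },
  { classNames: ['plain'] }
];

var firstScan = scan(elements, handlers);       // binds the one matching element
elements.push({ classNames: ['async_link'] });  // simulate newly inserted content
var secondScan = scan(elements, handlers);      // binds only the new element
```

The second pass skips the element that was already flagged, so rescanning the whole document after an update only costs the walk itself.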

And we're almost done. We also need to handle the case of Ajax updates to the DOM. Unfortunately, there's no good cross-browser way to do this, as only a few browsers support DOMNodeInserted. Notably, IE does not, and has no equivalent that we could use in its stead. Also, Prototype has no facilities to support this, and in fact makes it quite painful to try and do it cleanly.

Luckily, JavaScript is incredibly dynamic, and has no real security model, so what we can do instead of events is scan the document when certain Prototype functions are called. Since we use Prototype exclusively, this is only a matter of figuring out which calls update the DOM, and wrapping them to add a call to DOMWatcher.scanDocument:

(function () {
  var oldReplace = Element.Methods.replace;
  var oldUpdate = Element.Methods.update;
  var oldInsertion = Abstract.Insertion.prototype.initialize;
 
  Element.Methods.replace = function (element, html) {
    // Hand back the original result so callers that rely on the
    // return value keep working.
    var result = oldReplace(element, html);
    DOMWatcher.scanDocument();
    return result;
  };
  Element.replace = Element.Methods.replace;
 
  Element.Methods.update = function (element, html) {
    var result = oldUpdate(element, html);
    DOMWatcher.scanDocument();
    return result;
  };
  Element.update = Element.Methods.update;
 
  Abstract.Insertion.prototype.initialize = function (element, content) {
    oldInsertion.call(this, element, content);
    DOMWatcher.scanDocument();
  };
})();
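The wrapping trick itself is independent of Prototype. As a rough sketch, here it is as a small helper that runs on its own; wrapWithHook and FakeLibrary are hypothetical names invented for this example, and the fake update method merely imitates a DOM-updating call.

```javascript
// Wrap obj[name] so that afterHook runs after every call. The wrapper
// forwards `this` and all arguments, and hands back the original
// return value so callers that depend on it keep working.
function wrapWithHook(obj, name, afterHook) {
  var original = obj[name];

  obj[name] = function () {
    var result = original.apply(this, arguments);
    afterHook();
    return result;
  };
}

// Hypothetical stand-in for a library with a DOM-updating method.
var FakeLibrary = {
  update: function (element, html) {
    element.innerHTML = html;
    return element;
  }
};

var scans = 0;
wrapWithHook(FakeLibrary, 'update', function () { scans++; });

var target = { innerHTML: '' };
var returned = FakeLibrary.update(target, '<a class="async_link">go</a>');
```

Each call to the helper gets its own original variable, so several methods can be wrapped without their saved references getting mixed up.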

It's worth noting that I used three temporary variables, because IE had issues when I tried to use a single variable inside an iterative function. Oh well. People would probably find this version easier to read anyway.

And that's all there is to it. Really. With this foundation in place, we now have the ability to add behaviors to elements fairly cleanly, which encourages a nice separation of code from HTML and from other code by use of the module pattern.

Coming up: using DOMWatcher to automatically enable and disable links and forms.
