The Commit Bubbles thing I did a few months ago sits half complete, with no data from real commits to shove into it. Here’s some bash to scrape numbers out of a Subversion repo:

#!/bin/bash

if  [ ! -d .svn ]; then echo "no .svn folder - checkout the trunk of some subversion repo"; exit 10; fi

svn log | grep '^r[0-9]* ' | cut -d' ' -f 1 | cut -d'r' -f 2 | sort -n > svn_revisions.txt

echo "[ {}" > sequence.json

# Install the signal handler once, before the loop, rather than on every iteration.
trap "echo Exited!; exit;" SIGINT SIGTERM

while read -r rev; do

    svn log -v -r $rev > svn_to_git_revision.txt

    revisionLine=$(cat svn_to_git_revision.txt | grep '^r[0-9]* ')
    author=$(echo $revisionLine | cut -d'|' -f2 | sed 's/(no author)/none/' | cut -d' ' -f2 | sed "s/^$/none/")
    date=$(echo $revisionLine | cut -d'|' -f3 | cut -d'(' -f1 | cut -d' ' -f 2-4)
    messageText=$(cat svn_to_git_revision.txt | awk '/^$/ {do_print=1} do_print==1 {print} NF==3 {do_print=0}' | sed '/------/d' | sed 's/\"/\\\"/g' | tr -d '\n')

    # Score = number of lines in the diff; svn errors for paths missing at this revision are filtered out.
    # (wc -l can pad its output with spaces, hence the tr.)
    mainScore=$(svn diff -c "$rev" "src/main" 2>&1 | sed '/^svn: E155010: The node/d' | sed '/^svn: E160013: Diff target/d' | wc -l | tr -d ' ')
    testScore=$(svn diff -c "$rev" "src/test" 2>&1 | sed '/^svn: E155010: The node/d' | sed '/^svn: E160013: Diff target/d' | wc -l | tr -d ' ')

    echo "Revision ${rev}. Main: $mainScore Test: $testScore"

    echo ", { \"date\": \"$date\", \"author\": \"$author\", \"rev\": $rev, \"message\": \"$messageText\", \"mainScore\": \"$mainScore\", \"testScore\": \"$testScore\"}"  >> sequence.json

done < svn_revisions.txt

echo "]" >> sequence.json
jshon < sequence.json | sed '/^ {},$/d' | sponge sequence.json

The trouble is that this requires knowledge of the repo. I’m assuming src/main and src/test are the only places in an entire tree that have prod and test source. For LambdaJ that’s true, but not every project is like that. Here’s the JSON output (first nine commits to the trunk):

[
 {
  "rev": 1,
  "date": "2009-01-07 08:32:40 -0500",
  "author": "none",
  "mainScore": "0",
  "message": "Initial directory structure.",
  "testScore": "0"
 },
 {
  "rev": 2,
  "date": "2009-01-07 08:36:33 -0500",
  "author": "luca.marrocco",
  "mainScore": "33121",
  "message": "initial project startup",
  "testScore": "12381"
 },
 {
  "rev": 11,
  "date": "2009-01-07 17:26:35 -0500",
  "author": "luca.marrocco",
  "mainScore": "23832",
  "message": "update remove or er suffix;update using assertThat in lambda test",
  "testScore": "11340"
 },
 {
  "rev": 12,
  "date": "2009-01-07 17:48:42 -0500",
  "author": "luca.marrocco",
  "mainScore": "1534",
  "message": "adding some test to join",
  "testScore": "2062"
 },
 {
  "rev": 13,
  "date": "2009-01-08 04:13:58 -0500",
  "author": "luca.marrocco",
  "mainScore": "0",
  "message": "adding html folder; adding first version of lambda logo",
  "testScore": "0"
 },
 {
  "rev": 14,
  "date": "2009-01-08 04:44:45 -0500",
  "author": "luca.marrocco",
  "mainScore": "2241",
  "message": "update refactor rename from to forEach",
  "testScore": "2695"
 },
 {
  "rev": 15,
  "date": "2009-01-08 05:55:35 -0500",
  "author": "luca.marrocco",
  "mainScore": "10318",
  "message": "adding test to aggregate; update refactor rename aggregate function using nouns or verb",
  "testScore": "6198"
 },
 {
  "rev": 16,
  "date": "2009-01-08 06:15:09 -0500",
  "author": "luca.marrocco",
  "mainScore": "692",
  "message": "update sum return 0.0 as empty item",
  "testScore": "429"
 },
 {
  "rev": 17,
  "date": "2009-01-08 11:25:13 -0500",
  "author": "mario.fusco",
  "mainScore": "2632",
  "message": "remove Groups",
  "testScore": "0"
 }
]
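
On the repo-knowledge problem mentioned above, one small step would be to stop hard-coding the two paths. This is only a sketch: the `MAIN_PATH` and `TEST_PATH` variables and the `count_diff_lines` and `score_path` functions are names I'm making up here, and it still assumes a project has exactly one prod tree and one test tree.

```shell
#!/bin/bash
# Hypothetical knobs: override from the environment, default to the layout used above.
MAIN_PATH=${MAIN_PATH:-src/main}
TEST_PATH=${TEST_PATH:-src/test}

# Reads diff text on stdin, drops the two svn error lines the main script
# already ignores, and prints the number of remaining lines.
count_diff_lines() {
    sed -e '/^svn: E155010: The node/d' -e '/^svn: E160013: Diff target/d' | wc -l | tr -d ' '
}

# Diff line count for one revision ($1) under one path ($2).
# Assumes svn is on the PATH and the current directory is a working copy.
score_path() {
    svn diff -c "$1" "$2" 2>&1 < /dev/null | count_diff_lines
}
```

Inside the loop that would become something like `mainScore=$(score_path "$rev" "$MAIN_PATH")`, and a project with a different layout could run the script as `MAIN_PATH=lib TEST_PATH=spec ./scrape.sh` (the script name is also made up).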

What’d be nice is a way to ask Subversion direct questions, where I could even score changes that are neither prod source nor test source:

select size(diff("src/main")) as prodScore, size(diff("src/test")) as testScore, size(diff(not("src/main", "src/test"))) as nonProdNonTestScore, author, date, message
from commits
where date between 2015-01-01 and 2015-02-01
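
Until something like that exists, the query can be roughly approximated in bash. A hedged sketch: `svn log` does accept date ranges as `-r '{DATE}:{DATE}'`, which stands in for the where clause, and the "neither prod nor test" score is estimated as the whole-revision diff minus the two known paths. The `diff_lines` and `other_score` functions are hypothetical names, and the subtraction is only an approximation (property changes and the like could skew it).

```shell
#!/bin/bash
# Hypothetical helper: diff lines that are neither prod nor test,
# given total, main, and test line counts.
other_score() {
    local total=$1 main=$2 tests=$3
    echo $(( total - main - tests ))
}

# Diff line count for revision $1, optionally limited to path $2.
# Errors for paths missing at that revision are discarded, giving 0.
diff_lines() {
    svn diff -c "$1" ${2:+"$2"} 2>/dev/null < /dev/null | wc -l | tr -d ' '
}

# Only attempt the query inside a Subversion working copy.
if [ -d .svn ]; then
    svn log -q -r '{2015-01-01}:{2015-02-01}' | grep '^r[0-9]* ' | cut -d' ' -f1 | cut -d'r' -f2 |
    while read -r rev; do
        total=$(diff_lines "$rev")
        main=$(diff_lines "$rev" src/main)
        tests=$(diff_lines "$rev" src/test)
        echo "r$rev prodScore=$main testScore=$tests nonProdNonTestScore=$(other_score "$total" "$main" "$tests")"
    done
fi
```

It costs three `svn diff` calls per revision, which is part of why a real query language over commits would be nicer.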

While Jim Webber might suggest that everything is a graph, I’m still stuck focussing on sets, with graphs as transformations of them.

Update / Jun 28th, 2016

Added a line-count command as it is more tangible (thanks Kohlman). Also GraphQL - Jim Webber was right.



Published

February 14th, 2015