Wednesday, December 26, 2012

Nerding for no reason

I've recently started puttering with Brainfuck as an amusingly nerdy way to encode little snippets of text. I'd seen this done in mailing list signatures, and liked the idea.

I wrote the following by hand:

++++++++[>++++++++++<-]>.<++[>----<-]>-.<++[>++++<-]>+.<++[>-----------
<-]>.<++[>-------------<-]>.<++++[>++++<-]>.<++++++++[>+++++++++<-]>.<+
+++++++[>---------<-]>.<++[>+++++++++++<-]>.<+++[>-----<-]>--.<++[>++++
+++<-]>.<++[>-----<-]>.---.----.-.
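
For anyone who'd rather not decode that by hand, here is a minimal interpreter sketch. It's in Python (not a language used elsewhere in this post), the name run_bf is made up for illustration, and it assumes 8-bit wrapping cells and a program that never reads input or moves left of the first cell. Fed the first three chunks of the blob above, it prints the first three characters:

```python
def run_bf(program, tape_len=30000):
    """Run a Brainfuck program that takes no input; return its output as text."""
    # Pre-compute the position of each bracket's partner.
    stack, match = [], {}
    for pos, ch in enumerate(program):
        if ch == "[":
            stack.append(pos)
        elif ch == "]":
            start = stack.pop()
            match[start], match[pos] = pos, start

    tape, ptr, pc, out = [0] * tape_len, 0, 0, []
    while pc < len(program):
        ch = program[pc]
        if ch == ">":
            ptr += 1
        elif ch == "<":
            ptr -= 1
        elif ch == "+":
            tape[ptr] = (tape[ptr] + 1) % 256   # assumed: 8-bit wrapping cells
        elif ch == "-":
            tape[ptr] = (tape[ptr] - 1) % 256
        elif ch == ".":
            out.append(chr(tape[ptr]))
        elif ch == "[" and tape[ptr] == 0:
            pc = match[pc]                      # skip the loop body
        elif ch == "]" and tape[ptr] != 0:
            pc = match[pc]                      # jump back to the loop start
        # Anything else (newlines, ',') is ignored.
        pc += 1
    return "".join(out)


print(run_bf("++++++++[>++++++++++<-]>.<++[>----<-]>-.<++[>++++<-]>+."))  # prints PGP
```

Since newlines are ignored, a whole wrapped blob can be pasted in as one multi-line string.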


After initially being quite satisfied with myself, I began to feel a bit sheepish.  What's the utility in obfuscating my PGP keyid in that way?  It certainly doesn't make discovering my keyid any easier for anyone than just checking the keyservers.

After that, I looked at encoding my Geek Code block... but it's (a) too obscure, and (b) too verbose.  Encoded, it looks like:

+++++++[>>++++++++++<<-]>>+.<<++[>>--<<-]>>.<<++++[>>++++<<-]>>.<<++[>
>>+++++++++++++++++++++++<<<-]>>>+.<<<+++[>>----<<-]>>.--.>.<++.<<+++[
>>++++<<-]>>.<<++[>>>-------<<<-]>>>-.<<<++++++++++[>++++++++++<-]>.>>
.<<<++[>+++++++<-]>+.<++[>>>+++++<<<-]>>>+.<<<++[>>>+++++++<<<-]>>>+.<
<<++[>>>-------<<<-]>>>-.<<<++[>>>-----<<<-]>>>-.<<<+++[>------<-]>.>>
.<<<++++[>>----<<-]>>.<<++[>>>+++++<<<-]>>>+..<<<++[>>>---<<<-]>>>-.<<
<++[>>>--<<<-]>>>.<<<+++[>>++++++<<-]>>.<<+++[>>------<<-]>>-.<<++[>>>
+++++<<<-]>>>+..<<<++[>>>-----<<<-]>>>-.<<<++[>>+++++++<<-]>>.<<++[>>>
+++++<<<-]>>>+...<<<++[>>>-----<<<-]>>>-.+.<<<++[>>--<<-]>>.>-.<<<++[>
>---<<-]>>-.<<++[>>>+++++<<<-]>>>+..<<<++[>>>-----<<<-]>>>-.<<<+++[>>+
+++++<<-]>>.<<++[>>>+++++<<<-]>>>+...<<<++[>>>---<<<-]>>>-.<<<++[>>>--
<<<-]>>>.<<<++[>>----<<-]>>-.<<++[>>>+++++<<<-]>>>+..<<<++[>>>-----<<<
-]>>>-.<<<++[>+++++++<-]>.<+++++[>>>++++++<<<-]>>>+.<<<+++++[>>>------
<<<-]>>>-.<---.<<+++++[>>>++++++<<<-]>>>+.<<<+++++[>>>------<<<-]>>>-.
+.<<<++[>++++<-]>.>>-.+.<<<++[>>++<<-]>>.>-.<--.<<++[>>>+++++<<<-]>>>+
..<<<++[>>>-----<<<-]>>>-.<<<++[>>++++<<-]>>+.<<+++[>>>++++<<<-]>>>+.<
<<++[>>>-----------------<<<-]>>>-.<<<++[>>>+++++++++++<<<-]>>>.<<<++[
>>---<<-]>>.+++.<<++[>>>+++++<<<-]>>>+..<<<++[>>>-----<<<-]>>>-.<---.<
<++[>>-----<<-]>>-.<<+++[>>>++++<<<-]>>>+.<<<+++[>>>----<<<-]>>>-.<<<+
+++[>>+++++<<-]>>.<<++[>>>+++++<<<-]>>>+.<<<+++[>>>++++++<<<-]>>>+.<<<
+++[>>>------<<<-]>>>-.<<<++[>>>-----<<<-]>>>-.<<<++[>>----<<-]>>-.<<+
+[>>----<<-]>>-.<<++[>>++++<<-]>>+.<<++[>>>+++++<<<-]>>>+..<<<+++[>>>+
+++++<<<-]>>>+.<<<+++[>>>------<<<-]>>>-.<<<++[>>>-----<<<-]>>>-.<<---
.<++[>>>+++++<<<-]>>>+..<<<+++[>>>++++++<<<-]>>>+.<<<+++[>>>------<<<-
]>>>-.<<<++[>>>-----<<<-]>>>-.<<<++++[>>>+++++<<<-]>>>+.<<<++[>>>-----
<<<-]>>>..<<<++[>>>-----<<<-]>>>-.<<<++[>>++++<<-]>>.>.<<<++[>>---<<-]
>>.>.<<.++.<+++[>>>++++<<<-]>>>+.<<<+++[>>>----<<<-]>>>-.<<<++++[>----
-<-]>.<++[>>>+++++<<<-]>>>+..<<<++[>>>-----<<<-]>>>-.<<<++[>>-------<<
-]>>.<<++[>>++<<-]>>+.>.<<<++[>>--<<-]>>-.<<++[>>>+++++<<<-]>>>+..<<<+
+[>>>-----<<<-]>>>-.<+++.<<++[>>>+++++<<<-]>>>+..<<<++[>>>-----<<<-]>>
>-.<<+++.<++[>>>+++++<<<-]>>>+...---.+++.--.<<<++[>>>----<<<-]>>>-.<<+
++.<+++[>>>++++<<<-]>>>+....<<<++[>>>-----------------<<<-]>>>-.<<<++[
>>>+++++++++++<<<-]>>>.<<<++[>+++++<-]>.<++[>>>+++++<<<-]>>>+...<<<++[
>>>-----<<<-]>>>-.<<<++[>--<-]>-.<+++++[>>>++++++<<<-]>>>+.<<<+++++[>>
>------<<<-]>>>-.


In the process of trying to write that out, I became somewhat frustrated, and did what most programmers do.  I wrote a program to write my program:

do_factor = function(x){
  tmp = c()

  if (x>1){
    for (i in 1:x){
      if ( floor(x/i) == x/i){
        tmp = rbind( tmp, c(i, x/i, i+x/i))
      }
    }
    ix = head(which(tmp[,3] == min(tmp[,3])),1)
    sort(tmp[ ix, c(1,2) ])
  } else {
    c(1, 1)
  }
}

prep = function(x,n){ paste(rep(x,n), collapse='') }

x = as.numeric(charToRaw(
  "String to 'compile' goes here"))
cur = c(0,0,0)

output=""

for (i in 1:length(x)){
       if (x[i]>=97) { ctype=1 }
  else if (x[i]>=65) { ctype=2 }
  else               { ctype=3 }

  delta = x[i] - cur[ctype]

  R = paste(rep('>', ctype),collapse='')
  L = paste(rep('<', ctype),collapse='')

  if (delta==0){
    output=paste(output,R,'.',L,sep='')
  } else {
    if ( delta >= 0 ) { pch = '+' }
    else              { pch = '-'; delta = -delta }

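    # Odd deltas tend to factor badly: factor delta-1 instead, and
    # apply the one remaining step after the loop.
    if (delta%%2==1 && delta >= 2){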
      fx = do_factor(delta-1); add=pch
    } else {
      fx = do_factor(delta); add=''
    }

    if (fx[1]==1){
      output=paste(output, R, prep(pch,fx[2]),add,'.',L,sep='')
    } else {
      output=paste(output, prep('+',fx[1]),'[',R,prep(pch,fx[2]),L,'-]',R,add,'.',L,
                   sep='')
    }
  }
  cur[ctype] = x[i]
}

# Adjacent '<>' pairs cancel out; strip them until nothing changes
n_prev = nchar(output)
repeat {
  output = gsub('<>','',output)
  n_cur = nchar(output)
  if (n_cur == n_prev) break
  n_prev = n_cur
}

cat(output, '\n')
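
The factoring step is what keeps the loops short, so here it is in isolation. This is a rough Python sketch that mirrors the logic of do_factor and the odd-delta handling above. The names best_factor_pair and adjust_cell are invented for illustration, and to stay small it only handles a single target cell (cell 1, with cell 0 as the loop counter), not the three-cell scheme the R code uses:

```python
def best_factor_pair(n):
    """Return the factor pair (a, b) of n, a <= b, whose sum a + b is smallest."""
    if n <= 1:
        return (1, 1)
    pairs = [(i, n // i) for i in range(1, n + 1) if n % i == 0]
    a, b = min(pairs, key=lambda p: p[0] + p[1])
    return (min(a, b), max(a, b))


def adjust_cell(delta):
    """Brainfuck that changes cell 1 by delta, using cell 0 as the loop counter.

    Assumes the pointer starts on cell 0, that cell 0 holds zero,
    and leaves the pointer on cell 1.
    """
    if delta == 0:
        return ">"
    sign = "+" if delta > 0 else "-"
    size = abs(delta)
    extra = ""
    if size % 2 == 1 and size >= 2:
        # Odd sizes: factor size - 1 and apply the last step after the loop.
        size -= 1
        extra = sign
    a, b = best_factor_pair(size)
    if a == 1:
        return ">" + sign * b + extra           # no loop worth writing
    return "+" * a + "[>" + sign * b + "<-]>" + extra


print(best_factor_pair(80))   # (8, 10)
print(adjust_cell(80))        # ++++++++[>++++++++++<-]>
print(adjust_cell(-9))        # ++[>----<-]>-
```

Those last two strings are the first two chunks of the hand-written keyid blob at the top of this post, minus the "." and the "<" that steps back to the counter.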



Now, I just need to find a sufficiently pithy quote such that the blob of BF is closer to the length of my PGP keyid, and farther from the insanely long geek code block.  The closest I've gotten thus far is:

+++++++[>>++++++++++++<<-]>>.<<++++++++[>+++++++++++++<-]>.---.<++++[>
>>++++++++<<<-]>>>.<<<++[>+++++++<-]>+.--.+++.-.<+++[>----<-]>.>>.<<+.
<++[>+++++<-]>.>>.<<-.<++++[>----<-]>-.<++++[>++++<-]>+.<+++[>----<-]>
-.<++[>+++<-]>+.<+++[>++++<-]>+.>>.<<<++[>----<-]>-.<++[>++<-]>+.---.<
+++[>----<-]>-.>>.<<<++[>--<-]>.<+++[>++++<-]>+.<++[>-----<-]>.>>.<<<+
+[>+++++<-]>.<++[>----<-]>-.<++++[>++++<-]>+.<++++[>----<-]>-.<+++[>++
++<-]>+.>>.<<+.<++[>-----<-]>.<++[>++<-]>.+++.<++[>--<-]>.<++[>---<-]>
-.<++[>>>+++++++<<<-]>>>.<<<++++++[>>>------<<<-]>>>.<<<++[>>>++++++++
+++<<<-]>>>..<<<+++[>>>++++<<<-]>>>+..<<<+++[>>>----<<<-]>>>-.<<<++[>>
--<<-]>>-.<<++[>+++++++<-]>.<++++[>----<-]>.--.<++++[>++++<-]>+.>>.<<<
++[>>++++<<-]>>.<<++[>----<-]>-.+++.<++[>----<-]>.+.<

Monday, October 15, 2012

A display_filter for mutt to (among other things) localize dates

I'm neurotic about my email.  It's probably clinically significant.

Recently, I was looking for a way to have mutt display dates in my local timezone. I happened upon this stackexchange thread, which suggests the use of a display_filter.

I'm already using the excellent t-prot filter, to decrease my genocidal urges.  So, I'll need to chain that onto the end of whatever new display_filter I use. Furthermore, I'm not terribly thrilled that the stackexchange example relies on formail (because I'd rather avoid using the unmaintained procmail).  I'd also rather avoid creating tempfiles, if I can.

So, here's what I came up with.  (gdate is GNU date from the coreutils package).

#!/usr/bin/perl
open (FH, "|-",
   "t-prot -acelmtkwS -Mmutt -L ~/dot/mutt/mlfooters -A ~/dot/mutt/adfooters")
  or die "Can't run $!";

$found_date=0;

while(<>){
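  if ($found_date==0 && /^Date: (.*)$/) {
    # List-form open runs gdate without a shell, so a hostile Date:
    # header can't inject commands.
    my $local;
    if (open(my $gd, "-|", "gdate", "-R", "-d", $1)) {
      $local = <$gd>;
      close($gd);
    }
    # Fall back to the original line if gdate produced nothing.
    print FH (defined $local ? "Date: $local" : $_);
    $found_date = 1;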
  } else {
    print FH;
  }
}

close(FH);
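
To show what the gdate -R -d step is doing to the header, here is a small sketch of the same conversion using only Python's standard library. It isn't part of the filter above; the function name localize_date_header and its tz parameter are invented for illustration. Passing tz=None asks for the system's local timezone:

```python
from datetime import timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def localize_date_header(value, tz=None):
    """Re-express an RFC 2822 date string in another timezone.

    tz=None means "whatever the system's local timezone is".
    """
    parsed = parsedate_to_datetime(value)
    return format_datetime(parsed.astimezone(tz))


# A fixed UTC+2 zone, so the result doesn't depend on where this runs.
cest = timezone(timedelta(hours=2))
print(localize_date_header("Mon, 15 Oct 2012 12:30:00 +0000", cest))
# Mon, 15 Oct 2012 14:30:00 +0200
```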

Wednesday, June 13, 2012

Is it weird that...

A friend recently posted "Is it weird that at least 3 of my friends have their 3rd anniversary today?". I was curious.  So, I constructed a Monte Carlo simulation to determine if it was, in fact, "weird".  Short answer: no (p ≈ 0.21).

If we assume that about 15% of weddings happen in June, 75% of those on a Saturday, and that relationship lengths are exponentially distributed with a mean of 8 years, then a frequency distribution for the number of independent couples out of a population of 649 having 2009-06-13 as their anniversary is:

1: 35.5%
2: 23.2%
3: 14.5%
4: 4.7%
5: 1.9%
6: 0.3%

The R code I used for this was:

# Probabilities of weddings in each month
# Data for weddings in the Netherlands in 1995 taken from
# http://g2.cvs.sourceforge.net/viewvc/g2/g2/g2/demo/bargraph/graphdata.py
pr_month = c(
  2947, 3121, 4514,
  5708, 9794, 12427,
  6825, 8630, 12108,
  6411, 4045, 4939 )
pr_month = pr_month/sum(pr_month)

# Unfortunately, I'm not able to find data on which week days are most common.
# All I've found is that Saturday is 'by far the most common', followed by
# Sunday, with weekday weddings being less common.
pr_day = list()
pr_day[["Monday"   ]] = 0.01
pr_day[["Tuesday"  ]] = 0.01
pr_day[["Wednesday"]] = 0.01
pr_day[["Thursday" ]] = 0.01
pr_day[["Friday"   ]] = 0.01
pr_day[["Saturday" ]] = 0.75
pr_day[["Sunday"   ]] = 0.20

# First marriages which end in divorce last an average of 8 years
# (see http://www.census.gov/prod/2005pubs/p70-97.pdf )
# I'll assume marriage lengths follow an exponential distribution with mean 8

# For performance, cache things we will re-use
month_days = list()
day_probs = list()

# Define a function which can generate a specified number of wedding dates,
# following the probabilities given above:
random_dates = function(n){
  dates = rep(as.Date(NA), n)
  ages = sort(floor(rexp(n, 1/8)),decreasing=T) # Sort to get good cache performance
  month = sample(1:12, n, prob=pr_month, replace=T)

  for (i in 1:length(ages)){
    m_ix = sprintf('%i-%i-1',2012-ages[i],month[i])
   
    if (is.null(month_days[[m_ix]])){
      # Get a list of the days in the month
      start = as.Date(m_ix)
      stop = as.Date( ifelse(month[i]==12,
        sprintf('%i-1-1',  2012-ages[i]+1),
        sprintf('%i-%i-1', 2012-ages[i], month[i]+1)) )
      month_days[[m_ix]] <<- head(seq(start, stop, by='day'),-1)
    }
   
    if (is.null(day_probs[[m_ix]])){
      # Figure probabilities for each day
      day_prob = rep(NA, length(month_days[[m_ix]]))
      for (j in 1:length(day_prob)){
        day_prob[j] = pr_day[[ weekdays(month_days[[m_ix]][j]) ]]
      }
      day_probs[[m_ix]] <<- day_prob/sum(day_prob)
    }
   
    # Pick a day out of this month
    dates[i] = sample(month_days[[m_ix]], 1, prob=day_probs[[m_ix]] )
  }
  dates
}

# Generate 1000 sets of 649 independent dates
#  For each simulation, count how many of those dates
#  fall on 13 June 2009.
count=rep(0,1000)
for (i in 1:1000){
  # Progress meter:
  if (i%%10==0) cat (i)
  else if (i%%2==0) cat ('.')
  if (i%%50==0) cat('\n')

  count[i] = sum( random_dates(649) == as.Date('2009-06-13'))
}
cat('\n')



# Generate a frequency distribution
for( i in 1:max(count)){
  cat(sprintf('%i: %.1f\n', i, 100*sum(count==i)/length(count)))
}
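
As a rough cross-check on the simulation, the same assumptions can be worked through without any random numbers: each couple independently lands on 2009-06-13 with some small probability p, so the number of matches among 649 couples is binomial. The sketch below is in Python rather than R, its function names are invented for illustration, and it copies the month counts, weekday weights and the floor-of-an-exponential age rule from the code above:

```python
import calendar
import math

# Dutch wedding counts per month (January..December), as in the R code above.
MONTH_COUNTS = [2947, 3121, 4514, 5708, 9794, 12427,
                6825, 8630, 12108, 6411, 4045, 4939]
# Relative weights for Monday..Sunday, as in pr_day above.
DAY_WEIGHTS = [0.01, 0.01, 0.01, 0.01, 0.01, 0.75, 0.20]


def p_single_date(year, month, day, now_year=2012, mean_years=8.0):
    """Probability that one simulated couple lands on exactly this date."""
    age = now_year - year
    # floor() of an exponential with the given mean equals `age`
    p_age = math.exp(-age / mean_years) - math.exp(-(age + 1) / mean_years)
    p_month = MONTH_COUNTS[month - 1] / sum(MONTH_COUNTS)
    n_days = calendar.monthrange(year, month)[1]
    weights = [DAY_WEIGHTS[calendar.weekday(year, month, d)]
               for d in range(1, n_days + 1)]
    p_day = weights[day - 1] / sum(weights)
    return p_age * p_month * p_day


def p_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return 1.0 - sum(math.comb(n, j) * p**j * (1 - p)**(n - j)
                     for j in range(k))


p = p_single_date(2009, 6, 13)
print(p, p_at_least(3, 649, p))
```

Under these assumptions the chance of at least three matches comes out at roughly 0.19, which is in the same neighbourhood as the simulated 0.21 given that the simulation only used 1000 runs.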

Wednesday, May 30, 2012

Autocommit II

And yet, sometimes one *doesn't* want every last keystroke autocommitted.  Like when one is accessing repositories at work via TRAMP.  For those cases, I've tweaked my after-save-hook so that it only commits when the repository root contains a file named .autocommit:


(add-hook 'after-save-hook '(lambda ()
  (if (vc-git-registered (buffer-file-name))
    (if (file-exists-p (concat (vc-git-root (buffer-file-name)) "/.autocommit"))
      (vc-git-checkin (buffer-file-name) nil (format-time-string "Autocommit %F %T"))))))

Much better.

Thursday, April 26, 2012

git autocommit ftw

After an unfortunate incident involving "R --slave > analysis.r" rather than "R --slave < analysis.r", I've added the following to my .emacs:

(require 'log-edit)
(add-hook 'after-save-hook '(lambda ()
  (if (vc-git-registered (buffer-file-name))
    (vc-git-checkin (buffer-file-name) nil (format-time-string "Autocommit %F %T")))))

Such that if I'm in a git repo, every save gets committed. So, I'll need to find a new stupid mistake to destroy my work.
 
I also learned that a list of unpushed commits can be viewed with "git log origin/master..HEAD".

Update: See Autocommit II for an improved version.