My Talk at the GoSF Meetup

24 Jan 2013

I gave a talk at the San Francisco Go Meetup about how we fend off the big bad garbage collector for MemCachier.

With a naïve implementation, our benchmarks were showing GC pauses on the order of seconds for heavy (but realistic) workloads. By using the unsafe package and a few simple tricks we were able to reduce worst case GC pauses to a couple dozen milliseconds. GC pauses are now also totally independent of the size of the cache!

I’ve posted my slides here.


What's wrong with discrimination law?

08 Nov 2012

Dean Spade, Normal Life:

Discrimination law primarily conceptualizes the harm of racism through the perpetrator/victim dyad, imagining that the fundamental scene is that of a perpetrator who irrationally hates people on the basis of their race and fires or denies service to or beats or kills the victim based on that hatred. The law’s adoption of this conception of racism does several things that make it ineffective at eradicating racism. First, it individualizes racism. It says that racism is about bad individuals who intentionally make discriminatory choices and must be punished. In this (mis)understanding, structural or systemic racism is rendered invisible. Through this function the law can only attend to disparities that come from the behavior of a perpetrator who intentionally considered the category that must not be considered (e.g. race, gender, disability) in the decision she was making (e.g. hiring, firing, admission, expulsion). Conditions like living in a district with underfunded schools that “happens to be” 96 percent students of color, or having to take an admissions test that has a number of disparities in life conditions (access to adequate food, health care, employment, housing, clean air and water) that we know stem from and reflect long-term patterns of exclusion and exploitation cannot be understood as “violations” under the discrimination principle, and thus remedies cannot be won. This narrow reading of what constitutes a violation and can be recognized as discrimination serves to naturalize and affirm the status quo of maldistribution. Anti-discrimination law seeks out aberrant individuals with overly biased intentions. Meanwhile, all the daily disparities in life chances that shape our world along lines of race, class, indigeneity, disability, national origin, sex, and gender remain untouchable and affirmed as non-discriminatory or even as fair.


Two TED Talks About Perception

19 Dec 2011

How well do we actually understand what we desire? Can real value be derived from artificially changing our desires?

Dan Ariely: Are we in control of our own decisions?

Rory Sutherland: Life lessons from an ad man


Coypond: Semantic grep for Ruby

10 Aug 2011

For the impatient: I wrote a tool to search through Ruby code. It’s available at http://github.com/alevy/coypond

I recently made some patches to the Paperclip Ruby gem for a photo hosting web application I’ve been building (called PicAcs - my gallery is here). Paperclip handles file uploads in Rails apps, and specifically makes resizing images and uploading to Amazon S3 very easy. The problem was that PicAcs allows users to use their own S3 accounts to store photos, and Paperclip doesn’t support using dynamic S3 credentials out of the box, which I definitely need. Turns out the patch isn’t too complicated (only 3 lines of code or so), but figuring out where to patch was a pretty painful task.

This is not the fault of the Paperclip code-base, which as far as I can tell, is generally very good. The reason, I think, is a combination of Ruby’s flexibility as a language and a lack of good tools for navigating Ruby source code:

  • Ruby’s flexibility means that code for a particular module or class may be defined in multiple places (this is important for a framework like Paperclip where it needs to dynamically load code based on runtime variables)

  • As opposed to languages like Java which have heavy-weight mature IDEs like Eclipse, there’s no equivalent for Ruby where I can just F3 my way around a code base. I end up having to hold a lot more information in my short term memory with Ruby, so everything is much messier.

There was (I assume still is) an internal tool at Google that functioned essentially like Google Code Search (which is pretty nice for searching open source C/C++/Java code, but sucks for other things like Ruby). I remember becoming so reliant on it when I was at Google, and I wished for a similar tool to file through my Ruby projects and all of their dependencies.

A couple years ago during a hackathon, David Balatero and I wrote a web based tool to do this over a particular codebase. Unfortunately, neither of us can find the code… So it may as well not have happened.

Introducing Coypond

Over the last couple days, I built Coypond - a very basic grep-like tool to tackle this task. It indexes the class, module and method names in a Ruby code base, noting the files they were found in and the locations within those files. It can search through specific files, source code directory trees, or through locally installed gems. And all this through a simple command line interface! (Yeah, I know… it’s awesome).

Check it out on GitHub: http://github.com/alevy/coypond Or you can just install it:

$ gem install coypond

And run:

$ coypond -h

for instructions.

Coypond uses ripper (a built-in library as of Ruby 1.9) to generate parse trees from Ruby source files. These parse trees are then used to create an inverted index of the code, annotated with semantic information like whether the definition is a class, module or method.
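To make the idea concrete, here is a minimal sketch of that kind of index. It is not Coypond’s actual code: the names name_token, collect_definitions and build_index are made up for illustration, and it only looks at plain class, module and def nodes (no singleton methods, for example).

```ruby
require 'ripper'

# Parse-tree node types treated as definitions, and the label stored for each.
DEFINITION_KINDS = { class: :class, module: :module, def: :method }.freeze

# Return the last scanner token (e.g. [:@const, "Foo", [line, column]])
# found under a node, which is where the defined name lives.
def name_token(node)
  return nil unless node.is_a?(Array)
  return node if node[0].is_a?(Symbol) && node[0].to_s.start_with?('@')

  node.reverse_each do |child|
    token = name_token(child)
    return token if token
  end
  nil
end

# Walk the s-expression and record every class, module and method definition.
def collect_definitions(node, file, index)
  return unless node.is_a?(Array)

  kind = DEFINITION_KINDS[node[0]]
  if kind
    _, name, (line, column) = name_token(node[1])
    index[name] << { kind: kind, file: file, line: line, column: column }
  end
  node.each { |child| collect_definitions(child, file, index) }
end

# sources maps a file name to its Ruby source text.
# The result maps each defined name to the places it was found.
def build_index(sources)
  index = Hash.new { |hash, name| hash[name] = [] }
  sources.each do |file, code|
    tree = Ripper.sexp(code) # nil when the source has a syntax error
    collect_definitions(tree, file, index) if tree
  end
  index
end

example = <<~RUBY
  module Zoo
    class Animal
      def speak
      end
    end
  end
RUBY

build_index('zoo.rb' => example).each do |name, hits|
  hits.each { |hit| puts "#{name}: #{hit[:kind]} at #{hit[:file]}:#{hit[:line]}" }
end
```

Looking a name up is then just a hash access, which is what makes the index "inverted": you go from a name to its locations rather than from a file to its contents.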

To be continued…

I’d like to use this library to build something with a nicer interface (perhaps web based?). I especially want it to be very convenient to move between search and looking at code (hyperlinks tend to be good for that, no?). It would be nice to integrate somehow with rubygems.org, or GitHub, in order to allow the user to search through gems or project source they don’t have on their local machine.

Please check out the gem if you program in Ruby and you’ve ever run into this sort of problem. And please give me some feedback - creating issues on the GitHub project page is probably the best.


A Case Study in Lying Statistics

19 Jul 2011

Today, Alex Loddengaard shared a graph by Steve Jurvetson Tory Piro titled “Belief in Evolution vs. National Wealth”.

From looking at the graph, one might be inclined to believe that only 39% of Americans accept evolution. If this were true, it would be a major fail for American cultural intelligence and the American education system.

I think this chart is actually quite misleading. Looking at the study on which this graph is based (PDF, originally published in Science, Aug 2006), the picture is actually not so bleak.

To measure belief in evolution, the chart uses the question “Human beings developed from earlier species of animals? (true/false/not sure)”. In fact a large minority of Americans (39%) answered false to this question. The rest answered true (40%) and not sure (21%). This question, however, is closely tied to religion. Another, more direct, question in the survey asked: “Humans were created by God as whole persons and did not evolve from earlier forms of life?”. 62% answered true to this question. Only 2% answered not sure to this question. About the same proportion of people answered false as in the other question (36%). This difference highlights how people who may understand and generally accept evolution might see a conflict when it comes to the evolution of humans because of whichever religious dogma they identify with.

Conversely, looking at a broader question about evolution from the same survey tells a different story: “Over periods of millions of years, some species of plants and animals adjust and survive while other species die and become extinct.” An overwhelming 78% of Americans answered true (meaning they accept evolution in principle). A mere 6% answered false. This question more clearly asks about the process of natural selection, as well as avoids using the term evolution - which is highly politicized in the US.

It is well established that generally speaking, richer countries tend to be less religious, and that within that generalization the US is an outlier. This graph, after a better examination of what it’s actually showing, somewhat reinforces that claim. However, it does not tell us very much about acceptance of the science of evolution.

Hand drawn graphs lie. Graphs generated by gnuplot lie somewhat less. Raw data lies even less. Jeff Dean never lies.


Google+ or Meditations on Software Patents

09 Jul 2011

Programming is an art form. Not because it’s a marvelous expression of humanity (it’s not), or because I think there is some sort of amazing talent to it. Programming is an art form because of how software evolves within the community. Even though the software community tends to be hung up on questions of the sort: “How is this different than Facebook?”, software itself is actually an area where imitation, especially when it involves experimenting with different ways of thinking about old technology, is a value.

Google+ is a current example of how what is essentially a re-implementation of existing software (Facebook), with very small differences and a fresh start, can be hugely impactful.

  1. Circles are just another name for Facebook’s friend lists. However, the semantics are slightly different, they are more fun to create, and they’re there from the beginning. Who the hell actually retroactively puts their Facebook friends into lists? I have one list - “Restricted” and it’s for not-friends whose friendships I feel obligated to accept.
  2. Asymmetric friendships are stolen from Twitter, but in the context of Google+ they seem to be taking on a new meaning.
  3. Privacy has gotten a makeover with Google+. But, again, there is nothing really technical that seems new here.

If the Google+ team were standing in a room full of VCs a month ago, they would just get the same “How is this different from Facebook?” (of course, in their case they could just answer “We’re fucking Google!” and money would rain from the heavens). Ok, “Hangout”s are new and are the freshest thing since watermelon. But that’s a minor setback to my argument. Point is, maybe we put too much emphasis on innovation per se. Does our software have to be “the next best thing”? Or maybe we should focus more on playing around with existing ideas, copying and remixing other people’s software and experimenting?

And, of course, Google+ isn’t unique in being merely incremental. Social networks are all just “copies in plaster of copies in marble of copies in wood” anyway. Today’s Facebook News Feed is a rip-off of Twitter, which is itself just simpler blogging (which existed a million times over - blogger, wordpress etc), which was ironically a beefed-up version of online journals like LiveJournal, which had social networking features! (In the beginning I need LiveJournal, I can explain everything else with science…)


An All New Website

29 May 2011

I took Saturday to completely revamp my website. I’ve been wanting to give it a facelift for a while - both the content and design were stale. After seeing some CSS tricks in the new Railscasts design that I wanted to figure out how to do, I decided I had procrastinated long enough. I also took the opportunity to convert all the content to Markdown to make it easier to edit in the future.

Design

I was originally using a modified WordPress theme as a basis for my terribly designed site. Wiping everything and starting from scratch actually proved to be no more complicated than trying to retrofit HTML to CSS themed for different content.

I stole the banner with gradient and border from Railscasts, as well as the social media icons.

Markdownify

My website content has remained stale for a while mainly because I’ve found it to be too big of a hassle to edit. When I built the site originally, I decided to put everything in one HTML file and use Javascript to hide pages that aren’t in the foreground. I still like this design, because it means I don’t have to deal with managing consistent layouts across different pages, but finding specific content to edit in a big HTML file is annoying as f*$#. So I separated the site into individual source files for each page, and use a small Ruby script to combine them (using ERB in the layout template to place pages).

This was great! Once I had the pages separated like this, and was already running a script to recombine everything for each update, it became too easy not to throw a markup language parser on top of it all. I chose Markdown because I’ve been using it a lot recently and because it transparently lets you fall back on HTML. Finally, I factored out all of the links to a common references file so I can manage URLs centrally and I’m less likely to make mistakes typing them in if I’ve already used them once in the site. Rewriting the pages in Markdown was a little tedious but only took an hour or so (mostly to organize links).
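For the curious, the combining step looks roughly like the sketch below. This is an illustration rather than the script itself: the LAYOUT markup, the pages directory layout and the names load_pages and build_site are all assumptions, and the Markdown conversion is left out because it needs a third-party gem (the pages are assumed to be HTML fragments already).

```ruby
require 'erb'

# Hypothetical layout: every page becomes one <div> inside a single HTML file,
# so Javascript can show and hide pages by id.
LAYOUT = <<~HTML
  <html>
    <body>
      <% pages.each do |id, body| %>
      <div class="page" id="<%= id %>">
        <%= body %>
      </div>
      <% end %>
    </body>
  </html>
HTML

# Read one fragment per file, keyed by file name without its extension.
# (The directory and the .html extension are assumptions.)
def load_pages(dir)
  Dir.glob(File.join(dir, '*.html')).sort.map do |path|
    [File.basename(path, '.html'), File.read(path)]
  end.to_h
end

# Place every page into the layout and return the combined HTML.
def build_site(pages, layout = LAYOUT)
  ERB.new(layout).result_with_hash(pages: pages)
end

pages = {
  'about'    => '<h1>About</h1>',
  'projects' => '<h1>Projects</h1>'
}
puts build_site(pages)
```

With a Markdown parser in the mix, the only change would be converting each fragment to HTML before handing it to build_site.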

Markdown has several drawbacks - namely that you don’t get to choose id’s or set explicit classes on elements very easily. However, I actually really like being forced into really basic HTML constructs. It means that I pretty much have to write unobtrusive CSS/JS (and btw, let’s be honest: if you’ve ever named a class something like “float-left-above-header”, you’re not being very unobtrusive).


OneOff

12 Apr 2011

I hate giving out my e-mail address to random web applications. It’s true, spam filters are pretty good these days, and the truth is I don’t get a ton of unwanted e-mail. Still, somehow the thought of my e-mail address sitting on a bunch of random servers, waiting to be sold to advertisers makes me feel icky.

Solution

In an attempt to address this problem I’ve been working on OneOff - a program that lets you create e-mail addresses that only receive mail sent from specific e-mail addresses. The mail gets forwarded to your real e-mail account, but the person/application on the other side never sees your real address. When you reply to one of these e-mails, your real address gets re-written to the OneOff address too, so using OneOff doesn’t require any special work after generating an address.

Innards

The actual implementation is still somewhat in flux. However, for now, OneOff uses a technique similar to the PwdHash project:
  1. OneOff takes your e-mail address (john@example.com), and the domain name of the site you're generating a OneOff e-mail for (e.g. reddit.com, ommwriter.com).
  2. It generates a OneOff e-mail address by taking the hash (SHA1) of your e-mail address and the domain name (e.g. "john@example.com|ommwriter.com").
  3. OneOff stores a mapping between the generated hash and your e-mail address and gives you back a OneOff address based on the hash. Something like this: ff40a935b7738c138b28567dc5e34385ac0134d2@oneoff.chelax.com
  4. You (the user) register the OneOff address with whatever service it is you're using (like the OmmWriter download page).
  5. When the service eventually sends you an e-mail (e.g. sending you the download link, confirming you are a human), OneOff gets the e-mail, finds the previously stored mapping in its database, and confirms that the sender's domain and your address match the hash (by taking the hash again). If they do, the e-mail gets forwarded to you, and if not, it gets dropped on the floor.
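The steps above can be sketched in a few lines of Ruby. This is only an illustration of the scheme, not the code OneOff runs: the OneOffRegistry class, its method names and the in-memory hash standing in for the database are all assumptions.

```ruby
require 'digest/sha1'

ONEOFF_HOST = 'oneoff.chelax.com'

# Step 2: hash of "real address|sender domain".
def oneoff_hash(real_address, sender_domain)
  Digest::SHA1.hexdigest("#{real_address}|#{sender_domain}")
end

# Hypothetical in-memory stand-in for the stored mappings.
class OneOffRegistry
  def initialize
    @mappings = {} # generated hash => real e-mail address
  end

  # Steps 1-3: remember the mapping and hand back a OneOff address.
  def generate(real_address, sender_domain)
    hash = oneoff_hash(real_address, sender_domain)
    @mappings[hash] = real_address
    "#{hash}@#{ONEOFF_HOST}"
  end

  # Step 5: return the real address to forward to, or nil to drop the mail.
  def route(oneoff_address, sender_domain)
    hash = oneoff_address.split('@', 2).first
    real_address = @mappings[hash]
    return nil unless real_address

    # Re-take the hash: it only matches for the domain the address was made for.
    oneoff_hash(real_address, sender_domain) == hash ? real_address : nil
  end
end

registry = OneOffRegistry.new
address  = registry.generate('john@example.com', 'ommwriter.com')

puts address
puts registry.route(address, 'ommwriter.com').inspect   # forwarded
puts registry.route(address, 'spammer.example').inspect # dropped
```

Because the sender's domain is part of the hashed input, an address leaked to a different sender fails the check in route and the mail is dropped.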

Next Steps...

Right now the only way to generate OneOff addresses is by going to http://oneoff.chelax.com/ and using the online form. This is less convenient than I’d like, so I’m working on a browser plugin that, again, works very much like PwdHash’s browser plugin. Basically you would type your actual e-mail into the appropriate field on a website with some prefix (PwdHash uses “@@”) or some special keystroke, and it would automagically replace it with a OneOff address.

OneOff currently only re-writes the reply-to, to, and from fields in e-mails. There are a handful of other headers that either expose the original e-mail or at least give the domain it came from etc. I’m working on scrubbing those, but for now I doubt many web apps look at anything but the e-mail you registered with.

Your turn...

Please try OneOff if you think it’s useful and let me know what you think. It’s totally functional now, but as I mentioned the interface sucks. It’s currently hosted on my Dreamhost account, so it’ll probably break if too many people start using it (who has a guess? 3 people at once? hopefully we’ll see). If a significant number (for some definition of significant) of people find it helpful and actually use it I’ll migrate it over to something more robust and scalable ASAP.

A final disclaimer: While the original applications you’d be hiding your address from won’t know your real e-mail address, I technically could see it. I promise not to use it maliciously, or generally look at it (although I might accidentally see it in server logs etc). If anyone has suggestions for how to change the protocol in order to eliminate this concern, please let me know.


Restarting

12 Apr 2011

I’m restarting my blog. This site used to contain most of my travel entries, but those have now moved to http://travel.amitlevy.com/. This blog will be specifically for non-travel posts, mostly my software and research projects (I hope).