Hello, world! A lot of you are on the last bits of your vacation this week. That is awesome. There is likely no better time you can take vacation. Your team has hopefully shipped all deliverables for 2014 Q4. You have likely planned out Q1. You almost certainly have no real bugs in production. Cthulhu willing, you have automatic regression and integration tests so that you can rest assured knowing that The Person Who Does Not Vacation can safely fix anything that does come up.
Welcome back to having fun with Elasticsearch and Python. In the first part of this series, we learned the basics of setting up and running with Elasticsearch, and wrote the very basics we needed to cover basic indexing and searching of Gmail metadata. In the second part, we extended the search and querying to cover the full text of the emails as well. That theoretically got us most of what we wanted, but there’s still work to be done.
I have a tricky relationship with C++. There is a narrow subset of the language that, when properly used, I find to be a strict improvement over C. Specifically, careful use of namespaces, RAII, some pieces of the STL (such as std::string and std::unique_ptr), and very small bit of light templating can actually simplify a lot of common C patterns, while making it a lot harder to shoot yourself in the foot via macros and memory leaks.
In my earlier post on Elasticsearch and Python, we did a huge pile of work: we learned a bit about how to use Elasticsearch, we learned how to use Gmvault to back up all of our Gmail messages with full metadata, we learned how to index the metadata, and we learned how to query the data naïvely. While that’s all well and good, what we really want to do is to index the whole text of each email.
I find it all too easy to forget how fun programming used to be when I was first starting out. It’s not that a lot of my day-to-day isn’t fun and rewarding; if it weren’t, I’d do something else. But it’s a different kind of rewarding: the rewarding feeling you get when you patch a leaky roof or silence a squeaky axle. It’s all too easy to get into a groove where you’re dealing with yet another encoding bug that you can fix with that same library you used the last ten times.
I’m really happy to see that Factor 0.97 is now available. Factor is a modern, concatenative programming language, similar to FORTH or Joy, but actively maintained. It’s got great performance, solid documentation, and rich libraries and tooling, including a robust web framework that powers the Factor website itself. Along with Pharo Smalltalk, Factor is one of two languages and environments I go to when I just want to have fun for a bit, which is how I ended up making my own little contribution to the release in the form of a rewritten and more robust Redis package.
Sometimes, I feel that my career as a coder is demarcated by the tech stacks I used to write software. Partly that’s about the programming language—Smalltalk in college, C# and Python at Fog Creek—but it’s also about all the other tools you use to get the job done. I spent eight years working for Fog Creek, and in that capacity, I had a pretty consistent stack: FogBugz for bugs, customer support, and documentation; Trello for general feature development; Kiln for code review; Mercurial for source control; Vim and Visual Studio for actual coding; and our in-house tool, Mortar, for continuous integration.
For the past week, I have felt a wave of relief that we shipped Kiln Harmony, the first DVCS-agnostic source control system. Kiln Harmony’s translation engine ruled my life for the better part of a year, and, as the technical blog series is revealing, probably took some of my sanity with it. But we’ve received nearly universally positive feedback, and built a product that I myself love to use, so I can’t help but feel the project was an incredible success.
Business of Software has long stood as a unique conference for me: while nearly every tech conference I attend focuses on the technological side of delivering a solution, Business of Software focuses on actually delivering the goods. How do you reach people? How do you know you’ve reached people? How, if you’ve reached people, do you turn that into profit so that you can keep making people’s lives better? These are insanely important questions, and ones that are far too easily glossed over in the debate of what database software to use, or what language has the easiest hires, and that’s a huge part of why so few start-ups actually manage to get anywhere.
I’ve been coding full-time for only a few weeks, and already I’m going somewhat insane by people engaging in what I’d call cargo-cult debugging. Cargo cults were religions that developed when primitive societies, who’d had little exposure to any technology, were suddenly confronted with top-of-the-line modernism in the form of World War II military machines. When the armies disappeared at the conclusion of festivities, they took all of their modern marvels with them.