I See Deno in Your Future

Deno is a re-imagining of Node: still JavaScript for the server and command line, still based on V8, but with a drastically improved build story, simplified (hell, genuinely simple) dependencies, and a vastly improved standard library and web compatibility story. I’ve been using it on-and-off for hobby work for a couple of years now1, and I’ve really enjoyed playing with it.

One especially unique feature of Deno is its security model. By default, Deno scripts aren’t allowed any dangerous access: not the file system, not the network, not environment variables, not even high-resolution timers. Basically, they need to be hermetically sealed scripts, or be explicitly granted permissions by the user to do anything. The upshot is that you can blindly run a script (e.g. the official welcome script, via deno run https://deno.land/std/examples/welcome.ts) safe in the knowledge you can’t hose your computer.

For a while, I’d had an idea that I’d port some of my personal programs such that I could simply deno run them right off my GitHub account, rather than installing them. In practice, that proved a bit tricky: Deno’s APIs for reading local files (e.g. Deno.readFileSync) were different from reading remote files (via fetch), so handling a script running both locally and remotely, if it required external resources, ended up being a bit of a pain and required varying amounts of conditional branching. Not a deal-breaker in a strict sense, but it took enough fun away that I didn’t bother.

But I was happy to discover that Deno 1.16 actually added file:// URL support to fetch. That means that fetch(new URL("./file.txt", import.meta.url)) will work both when run locally and when run remotely. I gave this a shot in the silliest way imaginable, and, well…feel free to enjoy my Deno port of fortune, Dortune. Sure, you can clone and run it locally, but you can also do deno run --allow-net https://raw.githubusercontent.com/bpollack/dortune/main/dortune.ts and enjoy the exact same code working remotely without installing anything.
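To make that concrete, here is a minimal sketch of the pattern, not Dortune's actual code. The helper names and the `fortunes.txt` file name are made up for illustration; the only thing taken from above is that `fetch` accepts a URL resolved against `import.meta.url`.

```typescript
// Resolve a resource that lives next to the script, whatever scheme the
// script itself was loaded from (file:// locally, https:// remotely).
// `resolveResource` is a hypothetical helper name.
function resolveResource(relativePath: string, scriptUrl: string): URL {
  return new URL(relativePath, scriptUrl);
}

// Fetch the resource as text. In Deno 1.16 and later, fetch accepts file://
// URLs, so the same call covers both local and remote runs.
async function loadText(relativePath: string, scriptUrl: string): Promise<string> {
  const response = await fetch(resolveResource(relativePath, scriptUrl));
  if (!response.ok) {
    throw new Error(`Could not load ${relativePath}: ${response.status}`);
  }
  return await response.text();
}
```

In a script you would call something like `loadText("./fortunes.txt", import.meta.url)`. A local run then needs `--allow-read` and a remote run needs `--allow-net`, which is the security model doing its job.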

Granted, this particular example is fairly ridiculous, but I’m honestly quite excited about having a suite of personal utilities I can keep up-to-date transparently and that don’t care where they run.

  1. At least, when I’m not playing with Factor↩︎

The Deprecated *nix API

I realized the other day that, while I do almost all of my development “in *nix”, I don’t actually meaningfully program in what I traditionally have thought of as “*nix” anymore. And, if things like Hacker News, Lobsters, and random dotfiles I come across on GitHub are any indication, then there are many developers like me.

“I work on *nix” can mean a lot of very different things, depending on who you ask. To some, it honestly just means they’re on the command line: being in cmd.exe on Windows, despite utterly different (and not necessarily inferior!) semantics, might qualify. To others, it means a rigid adherence to POSIX, even if GNU’s incompatible variants might rule the day on the most common Linux distros. To others, it truly means working on an actual, honest-to-goodness Unix derivative, such as some of the BSDs—or perhaps a SunOS or Solaris derivative, like OpenIndiana.

To me, historically, it’s meant that I build on top of the tooling that Unix provides. Even if I’m on Windows, I might be developing “in *nix” as long as I’m using sed, awk, shell scripts, and so on, to get what I need to do done. The fact I’m on Windows doesn’t necessarily matter; what matters is the underlying tooling.

But the other day, I realized that I’ve replaced virtually all of the traditional tooling. I don’t use find; I use fd. I don’t use sed; I use sd. du is gone for dust, bash for fish, vim for kakoune, screen for tmux, and so on. Even the venerable grep and awk are replaced by not one, but two tools, and not in a one-for-one: depending on my ultimate goal, ripgrep and angle-grinder replace either or both tools, sometimes in concert, and sometimes alone.

I’m not particularly interested in a discussion on whether these tools are “better”; they work better for me, so I use them. Based on what I see on GitHub, enough other people feel similarly that all of these incompatible variations on a theme must be heavily used.

My concern is that, in that context, I think the meaning of “I write in *nix” is starting to blur a bit. The API for Windows is defined in terms of C (or perhaps C++, if you squint). For Linux, it’s syscalls. For macOS, some combo of C and Objective-C. But for “*nix”, without any clarifying context, I for one think in terms of shell scripts and their utilities. And the problem is that my own naïve scripts, despite being written on a legit *nix variant, simply will not run on a vanilla Linux, macOS, or *BSD installation. They certainly can—I can install fish, and sd, and ripgrep, and whatever else I’m using, very easily—but those tools aren’t available out-of-the-box, any more than, I dunno, the PowerShell 6 for Linux is. (Or MinGW is for Windows, to turn that around.) It amounts to a gradual ad-hoc breakage of the traditional ad-hoc “*nix” API, in favor of my own, custom, bespoke variant.

I think, in many ways, what we’re seeing is a good thing. sed, awk, and the other traditional tools all have (let’s be honest) major failings. There’s a reason that awk, despite recent protestations, was legitimately replaced by Perl. (At least, until people forgot why that happened in the first place.) But I do worry about the API split, and our poor ability to handle it. Microsoft, the paragon of backwards compatibility, has failed repeatedly to actually ensure that compatibility, even when armed with much richer metadata than vague, non-version-pinned plain-text shell-scripts calling ad-hoc, non-standard tooling. If we all go to our own variants of traditional Unix utilities, I worry that none of my scripts will meaningfully run in a decade.

Or maybe they will. Maybe my specific preferred forks of Unix utilities rule the day and all of my scripts will go through unscathed.

Goodbye, Twitter

Let’s zoom back to early April. We’re several weeks into the COVID-19 epidemic. I’m not sleeping well. In fact, a “good night” for me is just a few hours. I realized a couple of days ago that I’m averaging about 30 to 40 hours of sleep per week. The talking heads on Fox have already begun their drumbeat about how we should reopen businesses to save the economy, despite zero economists arguing for that. The Overton window is already starting to shift from trying to avoid deaths to discussing how many hundreds of thousands of deaths are a “reasonable number” of deaths. The CDC itself will ultimately become a part of this narrative, revising their death count below what anyone else thinks so that they can say a “mere” thirty-nine thousand people have died, which they’re in turn doing because it helps Trump’s narrative when the Overton window doesn’t shift fast enough.

But I don’t even know that yet. That happens in late April. It’s still early April. What I do know, even now, is that the deaths are of course going to hit the poorest, the most disenfranchised, the people who cannot do what I’m doing and work from home. The deaths are going to hit people who have to work the grocery stores, the drug stores, the delivery services. I am unsurprised to see the death tolls comically high amongst people of color, and I’m equally unsurprised to see the talking heads “wonder” why. I’d honestly be more surprised if that weren’t the case.

I cannot make my brain stop thinking about this. Ever. So I have night after night of unending insomnia, forever, because I cannot do anything to fix any of this.

But there is one small thing I can “do”: I can scream on Twitter. I have relatively few followers, but I’m an old account, so my screaming “means something.” But that is itself complicated. Alongside the people whom I merely disagree with are the conspiracy theorists who are trying to say the US caused it. And they themselves are right alongside those saying China deliberately unleashed it. And both of them are alongside those saying the entire thing is a scam. None of these arguments make any sense; they don’t survive the first couple of questions, let alone an actual dialog. But that doesn’t matter; Twitter values all of these positions equally, and escalates them equally, as long as The Ratio is high enough.

So it’s bots against bots, all the way down. Even the people who aren’t bots are basically bots, because it’s not about facts; it’s about sides. There is zero room for nuance or discussion. You need enough people who either agree with you—or even disagree with you, as long as they elevate you—and you are going to be elevated in The Feed. It’s about retweets, and likes, and comments, and only the numbers matter; not what you said. If you tweet a middle finger emoji and you get a hundred thousand retweets, you are more valuable than the epidemiologist who tweets out an actual cure. No actual discussion happens. No one learns. No one is persuaded. Everyone is just angry.

Twitter is so caustic that I long ago had to use scripts that block many, many people. I only follow a couple hundred people, but I’ve had to block nearly forty thousand just to make Twitter tolerable, let alone hospitable. But Twitter is how I communicate. I want to stay involved. I want to stay a part of the conversation. And I have convinced myself that staying on Twitter is how I can do that. I cannot, absolutely cannot, put down this particular bullhorn. I must remain a part of the discussion. So I lie awake and don’t sleep and wonder what the debate will be tomorrow.

That was a few weeks ago.

This evening, for whatever reason, I had it. I’m done. There are many great people on Twitter, but it’s no longer worth engaging with them this way. It’s not even worth it when I’m not getting harassed, because Twitter’s algorithms—correctly, for their bottom line—are only sated when I’m angry enough to use Twitter when I’m on the toilet, and that’s a very high bar. So Twitter goes to great lengths to ensure that I’m as angry as possible all the time. And that will only capture me at my worst. No one will be persuaded, and nothing will be gained.

And I realized I’d felt this once before, about Facebook. And I deleted my account then. And I hadn’t deleted my account on Twitter because I “needed” it.

But the truth is…I don’t need Twitter. Twitter needs me.

So I’m done. If you follow me on Twitter at @gecko, please feel free to subscribe to this blog’s RSS feed, or to email me at bpollack@bitquabit.com if you want to contact me. I’m available a multitude of other ways, many in person if you live in the Raleigh/Durham/Chapel Hill area. And if you’re a real person and want to talk, I’d really like to chat with you about what you feel, and why, and about how I feel, and why, and see where we overlap and where we differ, and where we can learn from each other.

Twitter and me, though? We’re done.

When class-based React beats Hooks

As much as I love exploring and using weird tech for personal projects, I’m actually very conservative when it comes to using new tech in production. Yet I was an immediate, strong proponent of React Hooks the second they came out. Before Hooks, React really had two fundamentally different ways to write components: class-based, with arbitrary amounts of state; or pure components, done as simple functions, with zero state. That could be fine, but the absolutely rigid split between the two was a problem: even an almost entirely pure component that had merely one little tiny bit of persistent state—you know, rare stuff like a checkbox—meant you had to use the heavyweight class-based component paradigm. So in most projects, after a while, pretty much everyone just defaulted to class-based components. Why go the lightweight route if you know you’ll have to rewrite it in the end, anyway?

Hooks promised a way out that was deeply enticing: functional components could now be the default, and state could be cleanly added to them as-needed, without rewriting them in a class-based style. From a purist perspective, this was awesome, because JavaScript profoundly does not really want to have classes; and from a maintenance perspective, this meant we could shift functional components—which are much easier to test and debug than components with complex state, and honestly quite common—back to the forefront, without having the threat of a full rewrite dangling over our heads.

I was able to convince my coworkers at Bakpax to adopt Hooks very quickly, and we used them successfully in the new, much richer content model that we launched a month ago. But from the get-go, one hook made me nervous: useReducer. It somehow felt incredibly heavyweight, like Redux was trying to creep into the app. It seemed to me like a tacit admission that Hooks couldn’t handle everything.

The thing is, useReducer is actually awesome: the reducer can easily be stored outside the component and even dependency-injected, giving you a great way to centralize all state transforms in a testable way, while the component itself stays pure. Complex state for complex components became simple, and actually fit into Hooks just fine. After some experimentation, small state in display components could be a useState or two, while complex state in state-only components could be useReducer, and everyone went home happy. I’d been entirely wrong to be afraid of it.
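As a rough illustration of what "stored outside the component" buys you, here is a sketch of a reducer for a hypothetical answer form. None of these names come from the Bakpax codebase; they are invented for the example.

```typescript
// Hypothetical state for a small answer form.
type FormState = { answer: string; submitted: boolean };

type FormAction =
  | { type: "edit"; answer: string }
  | { type: "submit" }
  | { type: "reset" };

const initialFormState: FormState = { answer: "", submitted: false };

// A pure function: it needs no React import, so it can be unit-tested
// directly or swapped out by whoever constructs the component.
function formReducer(state: FormState, action: FormAction): FormState {
  switch (action.type) {
    case "edit":
      // Ignore edits once the form has been submitted.
      return state.submitted ? state : { ...state, answer: action.answer };
    case "submit":
      return { ...state, submitted: true };
    case "reset":
      return initialFormState;
  }
}
```

Inside a component, this would be wired up with `const [state, dispatch] = useReducer(formReducer, initialFormState);`, and every state transition stays in one testable place.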

No, it was useEffect that should’ve frightened me.

A goto for React

If you walk into React Hooks with the expectation that Hooks must fully replace all use cases of class-based components, then you hit a problem. React’s class-based components can respond to life-cycle events—such as being mounted, being unmounted, and getting new props—that are necessary to implement certain behaviors, such as altering global values (e.g., history.pushState, or window.scrollTo), in a reasonable way. React Hooks, out-of-the-box, would seem to forbid that, specifically because they try to get very close to making state-based components look like pure components, where any effects would be entirely local.

For that reason, Hooks also provides an odd-one-out hook, called useEffect. useEffect gets around Hooks limitations by basically giving you a way to execute arbitrary code in your functional component whenever you want: every render, every so many milliseconds, on mounts, on prop updates, whatever. Congratulations: you’re back to full class-based power.

The problem is that just seeing that a component has a useEffect1 gives you no idea what it’s trying to do. Is the effect going to be local, or global? Is it responding to a life-cycle event, such as a component mount or unmount, or is it “merely” escaping Hooks for a brief second to run a network request or the like? This information was a lot easier to quickly reason about in class-based components, even if only by inference: seeing componentWillReceiveProps and componentWillMount get overrides, but componentWillUnmount left alone, gives me a really good idea that the component is just memoizing something, rather than mutating global state.

That’s a lot trickier to quickly infer with useEffect: you really need to check everything listed in its dependency list, see what those values are doing, and track it up recursively, to come up with your own answer of what life-cycle events useEffect is actually handling. And this can be error-prone not only on the read, but also on the write: since you, not React, supply the dependency chain, it’s extremely easy to omit a variable that you actually want to depend on, or to list one you don’t care about. As a result, you get a component that either doesn’t fire enough, or fires way too often. And figuring out why can sometimes be an exercise in frustration: sure, you can put in a breakpoint, but even then, just trying to grok which dependency has actually changed from React’s perspective can be enormously error-prone in a language where both value identity and pointer identity apply in different contexts.
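The identity problem is easy to reproduce outside React. React compares each dependency to its previous value with `Object.is`, so a sketch like the following shows what it considers "changed". `changedDeps` is a hypothetical debugging helper, not a React API.

```typescript
// Return the indices of dependencies that differ between two renders,
// using Object.is, the comparison React applies to each dependency.
function changedDeps(prev: readonly unknown[], next: readonly unknown[]): number[] {
  const changed: number[] = [];
  const length = Math.max(prev.length, next.length);
  for (let i = 0; i < length; i++) {
    if (!Object.is(prev[i], next[i])) changed.push(i);
  }
  return changed;
}

const options = { pageSize: 10 };

// Primitives compare by value; objects compare by reference.
changedDeps([1, "a", options], [1, "a", options]);          // → []
changedDeps([1, "a", options], [1, "a", { pageSize: 10 }]); // → [2]
```

The second call is the classic trap: an object literal rebuilt on every render has the same contents but a new identity, so an effect that depends on it fires every time.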

I suspect that the React team intended useEffect to only serve as the foundation for higher-level Hooks, with things like useMemo or useCallback serving as examples of higher-level Hooks. And those higher-level Hooks will I think be fine, once there’s a standard collection of them, because I’ll know that I can just grep for, I dunno, useHistory to figure out why the pushState has gone wonky. But as things stand today, the anemic collection of useEffect-based hooks in React proper means that reaching for useEffect directly is all too common in real-world React projects I’ve seen—and when useEffect is used in the raw, in a component, in place of explicit life-cycle events? At the end of the day, it just doesn’t feel worth it.

The compromise (for now)

What we’ve ended up doing at Bakpax is pretty straightforward: Hooks are great. Use them when it makes sense. Even complex state can stay in Hooks via useReducer. But the second we genuinely need to start dealing with life-cycle events, we go back to a class-based component. That means, in general, anything that talks to the network, has timers, plays with React Portals, or alters global variables ends up being class-based, but it can in certain places even bring certain animation effects or the like back to the class-based model. We do still have plenty of hooks in new code, but this compromise has resulted in quite a few components either staying class-based, or even migrating to a class-based design, and I feel as if it’s improved readability.

I’m a bit torn on what I really want to see going forward. In theory, simply shipping a lot more example hooks based on useEffect, whether as an official third-party library list or as an official package from the React team, would probably allow us to avoid more of our class-based components. But I also wonder if the problem is really that Hooks simply should not be the only abstraction in React for state. It’s entirely possible that class-based components, with their explicit life-cycle, simply work better than useEffect for certain classes of problems, and that Hooks trying to cover both cases is a misstep.

At any rate, for the moment, class-based components are going to continue to have a place when I write React, and Bakpax allowing both to live side-by-side in our codebase seems like the best path forward for now.

  1. And its sibling, useLayoutEffect↩︎

Falsehoods Programmers Believe About Cats

Inspired by Falsehoods Programmers Believe About Dogs, I thought it would be great to offer you falsehoods programmers believe about mankind’s other best friend. But since I don’t know what that is, here’s instead a version about cats.

  1. Cats would never eat your face.
  2. Cats would never eat your face while you were alive.1
  3. Okay, cats would sometimes eat your face while you’re alive, but my cat absolutely would not.
  4. Okay, fine. At least I will never run out of cat food.
  5. You’re kidding me.
  6. There will be a time when your cat knows enough not to vomit on your computer.
  7. There will be a time when your cat cares enough not to vomit on your computer.
  8. At the very least, if your cat begins to vomit on your computer and you try to move it to another location, your cat will allow you to do so.
  9. When your cat refuses to move, it will at least not manage to claw your arm surprisingly severely while actively vomiting.
  10. Okay, but at least they won’t attempt to chew the power cord while vomiting and clawing your hand, resulting in both of you getting an electric shock.
  11. …how the hell are you even alive?2
  12. Cats enjoy belly rubs.
  13. Some cats enjoy belly rubs.
  14. Cats reliably enjoy being petted.
  15. Cats will reliably tell you when they no longer enjoy being petted.
  16. Cats who trust their owners will leave suddenly when they’re done being petted, but at least never cause you massive blood loss.
  17. Given all of the above, you should never adopt cats.
  18. You are insane.

Happy ten years in your forever home, my two scruffy kitties. Here’s to ten more.

  1. Here, ask Dewey, he knows more about it than I do. ↩︎

  2. Because, while my cat has absolutely eaten through a power cord, this is an exaggeration. The getting scratched while trying to get my cat not to puke on a computer I was actively using happened at a different time from the power cord incident. Although this doesn’t answer the question of how she is alive. ↩︎

The Death of Edge

Edge is dead. Yes, its shell will continue, but its rendering engine is dead, which throws Edge into the also-ran pile of WebKit/Blink wrappers. And no, I’m not thrilled. Ignoring anything else, I think EdgeHTML was a solid rendering engine, and I wish it had survived because I do believe diversity is good for the web. But I’m not nearly as upset as lots of other pundits I’m seeing, and I was trying to figure out why.

I think it’s because the other pundits are lamenting the death of some sort of utopia that never existed, whereas I’m looking at the diversity that actually exists in practice.

The people upset about Edge’s death, in general, are upset because they have this idea that the web is (at least in theory) a utopia, where anyone could write a web browser that conformed to the specs and (again, theoretically) dethrone the dominant engine. They know this hasn’t existed de facto for at least some time–the specs that now exist for the web are so complicated that only Mozilla, with literally hundreds of millions of dollars of donations, can meaningfully compete with Google–but it’s at least theoretically possible. The death of Edge means one less browser engine to push back against Chrome, and one more nail in the coffin of that not-ever-quite-here utopia.

Thing is, that’s the wrong dynamic.

The dynamic isn’t Gecko v. EdgeHTML v. Blink v. WebKit. It’s any engine v. native. That’s it. The rendering engine wars are largely over: while I hope that Gecko survives, and I do use Firefox as my daily driver, that’s largely irrelevant; Gecko has lost by at least as much as Mac OS Classic ever lost. What does matter is that most people access the web via mobile apps now. It’s not about whether you like that, or whether I like that, or whether it’s the ideal situation; that’s irrelevant. The simple fact is, most people use the web through apps, period. In that world, Gecko v. Blink v. WebKit is an implementation detail; what matters is the quality of mobile app you ship.

And in that world, the battle’s not over. Google agrees. You know how I know? Because they’re throwing a tremendous amount of effort at Flutter, which is basically a proprietary version of Electron that doesn’t even do desktop apps.1 That only makes sense if you’re looking past the rendering engine wars–and if you already control effectively all rendering engines, then that fight only matters if you think the rendering engine wars are already passé.

So EdgeHTML’s death is sad, but the counterbalance isn’t Gecko; it’s Cocoa Touch. And on that front, there’s still plenty of diversity. Here’s to the fight.

  1. Yeah, I know there’s an effort to make Flutter work on desktops. I also know that effort isn’t driven by Google, though. ↩︎

Messages, Google Chat, and Signal

Google is about to try, yet again, to compete with iMessages, this time by supporting RCS (the successor to SMS/MMS) in their native texting app. As in their previous attempts, their solution isn’t end-to-end encrypted—because honestly, with their business model, how could it be? And as with Google’s previous attempts to unseat a proprietary Apple technology, I’m sure they’ll tout openness: they’ll say that this is a carrier standard while iMessages isn’t, and attempt to use that to put pressure on Apple to support it—never mind the inferior security and privacy that make the open standard a woefully…erm, substandard choice.

So here’s my suggestion to Apple: you’ve got a good story going on right now that you have the more secure, more privacy-conscious platform. If you want to shut down Google’s iMessages competitors once and for all, while simultaneously advancing your privacy story for your own customers, why not have iMessages use Signal when the recipient doesn’t have an iOS device? Existing Apple users would be unaffected, and could still leverage the full suite of iMessages features they’re used to. Meanwhile, Android customers on WhatsApp or Signal would suddenly have secure communication with their iOS brethren, not only helping protect Android users, but also helping protect your own iOS users. And you’d be doing all of this while simultaneously robbing Google of the kind of deep data harvesting that they find so valuable.

I doubt Apple will actually do this in iOS 12, but it’d be amazingly wonderful to see: a simultaneous business win for them, and a privacy win for both iOS and Android users. I’ll keep my fingers crossed.

Moving and backing up Google Moving Images

For reasons that I’ll save for another blog post, I decided recently to ditch pretty much the entire Apple ecosystem I’d been using for the last decade. That’s meant gradually transitioning from macOS to Ubuntu, and from iOS to Android. Of course, to ditch iOS for Android required a new phone; after some research, I opted for a Google Pixel 2.

The Pixel 2’s been a great phone and has lots of interesting features, but one of the more esoteric features is called Moving Images. These are Google’s take on Apple’s Live Photos: when you take a photo, a very small amount of video is also recorded, yielding a kind of Harry Potter-like effect. In general, I don’t honestly care all that much about the video bits of these, but every once in a while, you capture a really unique moment by happenstance where a Live Photo or Moving Image is really special, and on those occasions, I’m incredibly thankful someone at Apple came up with this idea.

In general, I use Google Photos to manage my photo collection, in part because it hits a sweet spot on my convenience/safety metric: the web application and mobile clients are incredibly easy-to-use for day-to-day work, and keeping a local copy of all your photos is as trivial as clicking a checkbox in Google Drive and then downloading them with the Google Backup & Sync tool (or InSync or rclone on Linux). The ease of getting a local mirror of my Google Photos data is great not just for offline access, but also for both offsite backup (in case I ever lose access to my Google account) and trivial rich editing with The GIMP, Lightroom, darktable, Acorn, or any of the other heavier-duty photo editors when I want to. It’s genuinely been one of the better cloud/local hybrids I’ve used.

I was very happy with this setup until just a few days ago, when I made an annoying discovery: Moving Images are very difficult to back up. In fact, the only way I ultimately managed to get everything automatically backed up was to use a tool not from Google, but from Microsoft.

The lost 110 photographs

I wouldn’t honestly have even noticed there was a problem in the first place except that I realized that Backup & Sync failed for exactly 110 files—on all of my machines. macOS, Windows, whatever, didn’t matter, those 110 files wouldn’t download. I could click “Retry All,” I could reinstall Backup & Sync, I could even utterly remove all the downloaded data and retry from absolute scratch, but those 110 files refused to budge. Google is Google, so there was no way for me to really reach out and get genuine tech support,1 but I did poke through their forums. And promptly felt my heart drop as I found three things very quickly:

  1. I was hardly the only one with this issue.
  2. The Google Drive team would move posts on this topic to the Google Photos forums, and the Google Photos team would move them to the Google Drive forums, because each team generally said it was the other’s problem. As far as I could tell, no matter which forum ultimately ended up being the thread’s home, nothing was resolved (see e.g. this thread, which ended up in the Drive forum).
  3. Many of the affected users mentioned Pixel phones.

This caused me to look at whether there was a pattern to what wasn’t getting downloaded, and I spotted the issue instantly: all 110 files started with MVIMG, the prefix for Moving Images. At that point, I found that there had been topics going back months about Moving Images not syncing properly (e.g. this post from early January). But the good news was that multiple people were saying that newer Moving Images were backing up properly, and it was trivial for me to verify that, indeed, more recent Moving Images I’d taken had downloaded, and some spot-checks showed happy little JPEGs all right where I wanted them to be on my local disk.
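I spotted the pattern by eye, but with a longer failure list the same check is a few lines of code. This is only a sketch: `countByPrefix` and the sample file names are invented for illustration, and only the MVIMG prefix itself comes from the real files.

```typescript
// Count file names by their leading alphabetic prefix (e.g. "MVIMG", "IMG").
function countByPrefix(fileNames: string[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const name of fileNames) {
    const match = /^[A-Za-z]+/.exec(name);
    const prefix = match ? match[0] : "(none)";
    counts.set(prefix, (counts.get(prefix) ?? 0) + 1);
  }
  return counts;
}

// Hypothetical file names standing in for the list of failed downloads.
const failedDownloads = [
  "MVIMG_20180102_101500.jpg",
  "MVIMG_20180103_090000.jpg",
];
const prefixCounts = countByPrefix(failedDownloads);
```

If every failure lands in a single bucket, as mine did, the culprit is probably the file type rather than the network.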

Okay, I thought to myself. That stinks, but it’s just those 110 photos; new ones are downloading just fine. So, worst-case, you download 110 photos by hand. Not the end of the world.

I went to sleep and didn’t think more about it.

The “moving” part of Moving Images is optional

It wasn’t until the next morning that I realized something was wrong. When I’d spot-checked more recent Moving Images to verify they had backed up, I of course didn’t actually check on the actual “Moving” part of the Moving Image; while Moving Images are technically JPEGs, the video is stored in such a way that nothing I’ve got can (currently) see it. That didn’t faze me too much, mind—chances were overwhelmingly high that someone else would reverse-engineer the format, and failing that, the chance the thing was just an MPEG concatenated to, or stored inside, a JPEG was extremely high. That’s well inside the realm of things I’ve reverse-engineered in the past. But it did mean that I hadn’t explicitly verified whether a video stream was present.
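If that guess about the format holds, checking for a video stream could be as simple as the sketch below. To be clear, this rests on an assumption I haven't verified: that the video is an MP4 appended to the JPEG data, so that an `ftyp` box appears somewhere after the image. The function name is made up, and a stray `ftyp` byte sequence inside the image data would produce a false positive.

```typescript
// ASCII bytes for "ftyp", the box type that normally opens an MP4 file.
const FTYP = [0x66, 0x74, 0x79, 0x70];

// Return the byte offset where an appended MP4 would start, or -1 if no
// "ftyp" marker is found. An MP4 box is a 4-byte size followed by a 4-byte
// type, so the box begins 4 bytes before the marker.
// ASSUMPTION: the video is a plain MP4 concatenated after the JPEG.
function findAppendedMp4Offset(bytes: Uint8Array): number {
  for (let i = 4; i + FTYP.length <= bytes.length; i++) {
    if (FTYP.every((b, j) => bytes[i + j] === b)) {
      return i - 4;
    }
  }
  return -1;
}
```

A result of -1 for a file that plays as a Moving Image on the phone would suggest that the video was stripped somewhere along the way, or that the format guess is wrong.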

Over breakfast, a little detail I’d missed finally registered: the files were just too damn small. The Pixel 2 has a 12 megapixel camera. Photos it takes, even with really good compression, really ought to be at least a couple megabytes by themselves; throw in video, and they should be at least 6-10 MB. Yet every file I was looking at was, tops, in the 4 to 5 MB range. That was simply insufficient to store both a high-resolution photo, and a video stream. Something was up.

I picked one of the Moving Images at random. On my Pixel 2, and on the Google Photos website, it showed up as 6.4 MB; my local copy was only 3.4 MB. Another Moving Image showed the same pattern: 7.2 MB on Photos and on my phone, but only 3.7 MB locally. Indeed, a quick sanity check seemed to reveal that all the Moving Images had suffered the same fate. And it wasn’t local to just the official Backup & Sync tool, either: InSync and rclone both showed the exact same behavior, too. Yet downloading the pictures manually from the Google Photos website gave the original, larger image. The only conclusion I could reach: the Google Drive service itself was stripping out the Moving part of the Moving Image.2

API? What API?

My first thought was I’d just write my own backup client. After all, while the Drive integration was nice, all I really wanted was automatic offsite backup. While writing something myself wasn’t quite my first pick, I didn’t anticipate it’d be that hard, and since I could download the full, untrimmed files from the Photos website, I knew the raw files existed; it was just a matter of using the proper Google Photos API.

Except…well, there is no Photos API, as far as I can tell. The Picasa Web Albums API has been deprecated since Picasa sunset in 2016, and Google doesn’t list a Photos API anywhere on its developer portal. In other words, the Drive API seemed to be the only official way to go. But I knew from InSync and rclone that the Drive API was exactly where the problem lay in the first place.

Okay, back to the drawing board.

Backup backup options

The second idea I had was to try another photo synchronization service. The raw data was obviously on the phone; I just needed something that could get it off. My first stop was Dropbox: I’d used it for years previously, I knew they had a nice Linux client, and I still used it actively.

Dropbox completely failed here, on two levels: first, it suffered the same trimming issue Google Photos did, so in a narrow sense, it obviously didn’t solve my problem. No biggie.

But Dropbox also failed because it has become downright slimy when it comes to letting you downgrade your account. When I was in Dropbox, I realized I’d fallen below the storage threshold for a free account, so I decided to cancel my paid membership. Dropbox made this incredibly difficult: first, when you click on “Change Plan,” your only option is to upgrade; there is no way to downgrade. You instead have to scroll to the very bottom of the window and click a tiny “Cancel” link. After that, you then have to choose to cancel three or four more times, being interrupted to be told why leaving’s a bad idea on screens where the default button keeps alternating between the “continuing closing my account” option and the “haha no actually I totally want to keep my account, thank you for asking” choice. It took me a couple of tries before I finally extricated myself. Never again, Dropbox. If you have to play that dirty to keep customers, then I’m definitely not sending any business your way.

My next thought was to see if someone had written a photo uploader for Upspin, but they haven’t, and that’s considerably more time than I’ve got right now, so that was it for that idea. I also thought about using Perkeep, since that does have an Android photo uploader, but my Perkeep installation is behind my firewall, and AT&T’s modem prevents my old OpenIKED-based VPN setup from working, so that route was also out.

The final tool I reached for before giving up was Microsoft OneDrive, and I was pleasantly surprised to find that OneDrive just worked. As far as I can tell, OneDrive uploads the unaltered original files, verbatim; if I copy the raw file off my phone via USB, the hashes match.
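That last check is easy to repeat for any pair of files. A minimal sketch, assuming GNU coreutils' sha256sum is available; the function name and the paths in the usage comment are made up for illustration:

```bash
# same_contents FILE_A FILE_B
# Succeeds (exit status 0) only if both files have the same SHA-256 digest.
same_contents() {
    local a b
    a=$(sha256sum "$1" | cut -d' ' -f1)
    b=$(sha256sum "$2" | cut -d' ' -f1)
    [ "$a" = "$b" ]
}

# Hypothetical usage: one copy pulled over USB, one synced down by OneDrive.
# same_contents ~/usb-copy/MVIMG_0001.jpg ~/OneDrive/MVIMG_0001.jpg && echo "identical"
```

Matching digests only say the two copies are byte-for-byte identical; they say nothing about what is inside the file.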

That said, while I have had very good experiences with OneDrive in the past, simply moving to OneDrive isn’t really an option for me right now: my family all heavily use Google Photos, and we make extensive use of shared albums. Getting everyone moved onto a new service just isn’t feasible, so I was going to have to find a way to make both OneDrive and Google Photos play together somehow.

Time for a short shell script.

The “solution”

I ended up putting together a process that is very gross, but does work: first, I have both Google Drive (via rclone) and OneDrive (via the excellent open-source onedrive client) syncing locally. I create a copy of the Google Photos folder structure in a different location, and then hardlink all of the photos from the Google Drive folder to the copy. Next, I look for any photos in the copy whose names start with MVIMG_. For each photo I find, I look for a corresponding, larger file in the Microsoft OneDrive camera roll, and, if I find one, move that image over to the new folder structure in place of the Google Drive one.

It’s not ideal, and the resulting Ruby script is not exactly the best code I’ve ever written, but it does work.
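For the curious, the gist of that process fits in a few lines of shell. To be clear, this is a sketch and not the Ruby script itself: the function name and directory arguments are hypothetical, it assumes the OneDrive camera roll is a single flat folder, and it copies the OneDrive file rather than moving it so the synced folder is left alone.

```bash
# merge_photos DRIVE_DIR ONEDRIVE_DIR MERGED_DIR
# Hypothetical helper: mirrors DRIVE_DIR into MERGED_DIR with hardlinks, then
# swaps in larger OneDrive copies of any MVIMG_ files.
merge_photos() {
    local drive_dir=$1 onedrive_dir=$2 merged_dir=$3
    local rel photo candidate

    mkdir -p "$merged_dir"

    # Step 1: recreate the Drive folder structure, hardlinking every file.
    # Hardlinks require both directories to be on the same filesystem.
    (cd "$drive_dir" && find . -type f -print0) |
        while IFS= read -r -d '' rel; do
            mkdir -p "$merged_dir/$(dirname "$rel")"
            ln -f "$drive_dir/$rel" "$merged_dir/$rel"
        done

    # Step 2: for each Moving Image, prefer a larger OneDrive file of the same name.
    find "$merged_dir" -type f -name 'MVIMG_*' -print0 |
        while IFS= read -r -d '' photo; do
            candidate="$onedrive_dir/$(basename "$photo")"
            if [ -f "$candidate" ] &&
               [ "$(wc -c < "$candidate")" -gt "$(wc -c < "$photo")" ]; then
                # Remove the hardlink first; writing through it would also
                # overwrite the file in the Drive folder.
                rm -f "$photo"
                cp "$candidate" "$photo"
            fi
        done
}

# Hypothetical usage:
# merge_photos ~/GoogleDrive/Photos ~/OneDrive/CameraRoll ~/merged-photos
```

Removing the link before copying is the one subtle part: a hardlink and its source are the same file, so overwriting in place would change the Drive copy too.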

Moving forward

Currently, I’m in an unhappy place: I’m generally still using Google Photos, but I’ve also got camera shots going to OneDrive, and I have a gross Ruby script that tries to sanitize this mess. Further, I’m not actually fully confident that these larger files do in fact have the video information I need; I’ll need to learn more about the JPEG file format to figure out if my hunch is correct—and if so, to figure out how to extract the data.

Meanwhile, I’m going to hope that Google either just makes an API for doing this, or otherwise, fixes the Drive API to allow fetching the original files. But at least I don’t have to worry about losing any raw data in the meantime.

  1. This is, strictly speaking, in my particular case, a lie; I know enough people at Google that I can usually just play a game of telephone until I find someone who both works on a relevant team and cares enough to help resolve my problem. But a) normal people cannot do this, and b) this actually was not helpful this time around. ↩︎

  2. To be clear here, it’s possible that’s not quite what’s happening; it’s tricky for me to tell, since I haven’t yet reverse-engineered the file, and Google hasn’t (as far as I can tell) documented what they’re doing. But Photos/Drive editing the file between my phone and my machine means regardless that it’s not trustworthy as a backup option. ↩︎

Commit SHAs as dates

I’ve been going through a pile of old bitquabit posts. While many of them hold up over time, the more technical ones frequently don’t: even when I was lucky and happened to get every technical detail right, and every technical recommendation I threw out held up (hint: this basically never happens), they were written for a time that, usually, has passed. Best practices for Mercurial in 2008 are very much not best practices now. But it’s a bit tricky: whether something I wrote is genuinely out-of-date has less to do with how much raw time has passed than with how much churn has happened in the project.

To that end, I was happy to see that some of the blogs I follow have started using Git commit SHAs to date their posts, alongside the calendrical date, serving as a kind of vector clock for the passionate. If you’re writing technical posts for an open-source project, this seems ideal to me: casual observers can go with the calendrical date, and people deeply involved in that arena or project can instead key off what has happened since the commit in question.
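Generating such a stamp is cheap. A rough sketch, assuming a local Git checkout of the project being written about; the function name and the output format are invented here for illustration, not a convention those blogs share:

```bash
# post_stamp REPO_DIR
# Prints today's date plus the abbreviated SHA of the commit currently
# checked out in REPO_DIR (a hypothetical checkout of the project in question).
post_stamp() {
    local repo_dir=$1
    local sha
    sha=$(git -C "$repo_dir" rev-parse --short HEAD)
    printf '%s (as of commit %s)\n' "$(date +%Y-%m-%d)" "$sha"
}

# Hypothetical usage:
# post_stamp ~/src/some-project
```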

I’m not going to retrofit all my old posts, but it’s something I’ll keep in mind going forward.

Automating Hugo Deployments with Bitbucket Pipelines

As I mentioned in a recent post, I manage my blog using a static site generator. While this is great to a point—static site generators can handle effectively infinite traffic, they’re stupidly cheap to run, and I can use whatever editor I feel like—the downside is that I lose tons of features I used to have with dynamic blog engines. For example, while it’s almost true that I can use any editor I want, I don’t have a web-hosted editor like I would in WordPress or MovableType, and I likewise can’t trivially add any sort of dynamic content. Most of what I lose I can live without, but one that is genuinely annoying, and which has even bitten me in the past, is that I can’t publish without being on a computer that has both my SSH keys, and the publishing toolchain installed. Not only is that inconvenient; it means that publishing output can vary depending on which machine I use for a given publishing run.1

There’s a pretty easy fix for that: add continuous deployment. If it’s good enough for real software, it’s good enough for a personal blog. I can set up a single, consistent deployment environment on some server, drive all the deploys through that, and call it a day. The problems here being that a) setting up a continuous integration server is annoying, and b) I am lazy. There are cloud-hosted CI servers, but most of them either are overly complex, or are too expensive for me to justify using for my personal blog.

Enter Bitbucket. I’m already using them, since they’re by far and away the best Mercurial hosting game in town these days, and they recently2 added a new feature called Bitbucket Pipelines that fits all my requirements: cloud-hosted, free, easy-to-use, cheap, and it didn’t cost anything.3

And I’m glad I looked, because getting everything running turned out to be stupidly easy.

Step one: write the Dockerfile

Bitbucket Pipelines wants to base your deployment on a Docker image, so I had to write one. Thankfully, it’s so easy to make Docker images these days that pretty much everyone is making them—even when there is no conceivable reason why they should. So let’s set one up.

To deploy my blog, I need at least four things: Hugo, Pygments, rsync, and SSH. It took me a couple tries to get the Dockerfile just right (mostly because I straight-up forgot rsync and SSH on the first go), but the result is literally five lines, total:

FROM alpine:3.6

RUN apk add --no-cache bash git go libc-dev python py2-pip rsync openssh-client
RUN pip install pygments
RUN go get -u github.com/gohugoio/hugo

About the only thing remotely interesting here is that I’m using Alpine Linux, which I selected because it seemed to be what the cool kids were using these days and it was one of the smallest base Docker images I could find. I’m not honestly sure if bash is needed (I suspect /bin/sh would’ve been just fine), but I originally wrote my deployment script for bash, and I’m too lazy to figure out if I used any bashisms, so let’s just toss that in there anyway. What’s a paltry 34 MB between friends?

Tons of places host Docker images for free these days, and Bitbucket can use any of them; I kept it simple and pushed it to my Docker Hub account.4

Step two: write the build script

I actually already had a build script,5 so all I really had to do was tweak it slightly to be run on something other than my personal machine. The result’s genuinely not interesting, but for completeness, the functional part of it looks like this:

#!/bin/bash

# Normal boilerplate (see e.g. https://sipb.mit.edu/doc/safe-shell/)
set -euo pipefail

# Add $GOPATH to the path so Hugo will be present
export PATH=$(go env GOPATH)/bin:$PATH
hugo --cleanDestinationDir
rsync -av --delete public/ publisher@bitquabit.com:/var/www/blag/

Again, nothing interesting here. We’re at exactly ten lines, and even that only because I added some comments and some blank lines for readability. I called this file build and stored it unceremoniously in the root of my blog repository.
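A cheap guard against the Pygments incident from the first footnote is to confirm that every tool the deploy needs is present before building anything. This is a sketch, not part of the script above: require_tools is a hypothetical helper, and pygmentize is the command-line program that Pygments installs.

```bash
# require_tools TOOL...
# Reports every named tool that is not on PATH and returns non-zero if any
# are missing, so a deploy can bail out before it touches the server.
require_tools() {
    local tool missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing required tool: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Hypothetical usage near the top of the build script:
# require_tools hugo pygmentize rsync ssh
```

Under set -e, a failing require_tools call stops the script before hugo or rsync ever run.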

Step three: test it…if you feel like it

Since we’re going to deploy files to a real server in an automated fashion, the next step is to test everything.

Or not. It’s your server; I’m not gonna tell you what to do.

Myself, I decided to half-ass it a bit. Pipelines just launches your Docker image, copies your project into the container, sets your project to be the current directory, and begins running your script. I can do that:

$ docker run -it --volume=C:/Users/b/src/blag:/blag --entrypoint=/bin/bash bpollack/blag-builder:latest
$ cd /blag
$ ./build

The first line says to run the Docker image we built, interactively (-i) and attached to my terminal (-t), mount the Windows directory C:\Users\b\src\blag at /blag in the container, and launch bash once the container is ready. In the next two lines, I demonstrate my amazing CS skills to change to the appropriate directory and run the script, proving that, even in this advanced day and age, I can still play the part of a computer.

This of course failed at the push step due to SSH keys not being set up (more on that in a second!), but otherwise seemed to work fine, so it’s good enough for me. Onwards!

Step four: create the pipeline

The pipeline spec is really simple: you give it a Docker image (which we just made), a condition of when to run (I’ll just have it run whenever there’s a new changeset, which is the default), and what steps to run when the condition is met (in our case, we need to run one single step, which is the build script we just wrote). So that file, in its entirety, is:

image: bpollack/blag-builder:latest

pipelines:
  default:
    - step:
        script:
          - ./build

Granted: being YAML, this looks like the result of an editor with broken indentation rules. But it’s at least pretty self-explanatory: we give it a Docker image (it defaults to using Docker Hub, which is great, because so did we), we give it one pipeline, called default, and give it the sole job of running a one-line script that calls our real build script, which we wrote together under the previous heading after much struggle. Commit this as a file called bitbucket-pipelines.yml in the root of your repository and push.

Step five: add relevant SSH keys

Congratulations! If you did everything perfectly at this point, Bitbucket will create your pipeline, run the build, and it will fail!…because you don’t allow random people to push stuff to your server over SSH.6 Fair enough. For reasons I’m not honestly entirely clear about, Bitbucket won’t let you specify SSH keys to use for Pipelines until at least one pipeline exists. But now that we’ve got a pipeline—it’s the one that just failed—you’re good.

In your repository, click on the Settings tab, and then, under the Pipelines heading, there’s an entry called SSH Keys. Still with me? Good. These are SSH keys that will be loaded into your Docker container right before your script runs, and which will be used to push code to your server. I recommend following their advice, generating a key with them, and then adding that key to the ~/.ssh/authorized_keys file in the appropriate user account. You’ll also need to tell it what servers you’ll be using these keys with so that Bitbucket will detect if your server gets swapped out and can avoid deploying your precious secrets to some nefarious machine.
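On the server side, adding the key is a one-time append. A sketch, assuming a standard OpenSSH layout; the function name and arguments are hypothetical, and the key in the usage comment is a placeholder for whatever Bitbucket generates for you:

```bash
# add_deploy_key HOME_DIR PUBLIC_KEY
# Appends PUBLIC_KEY to HOME_DIR/.ssh/authorized_keys unless that exact line
# is already there, and sets the permissions sshd normally insists on.
add_deploy_key() {
    local ssh_dir="$1/.ssh" pubkey=$2
    local auth_file="$ssh_dir/authorized_keys"

    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"
    touch "$auth_file"
    chmod 600 "$auth_file"

    # -x matches whole lines, -F treats the key as a literal string.
    grep -qxF -- "$pubkey" "$auth_file" || printf '%s\n' "$pubkey" >> "$auth_file"
}

# Hypothetical usage, run as the deploy user on the server:
# add_deploy_key "$HOME" 'ssh-rsa AAAA...placeholder... pipelines'
```

Making the append idempotent means re-running it after a key rotation never leaves duplicate entries behind.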

(Incidentally, I recommend using those Bitbucket keys only with a heavily locked-down account that’s dedicated purely to handling the deploy, but how to do that is a bit outside the scope of this particular post.)

Step six: you were actually done at step five

That’s it; we’re done. You do need to either re-run the pipeline manually at this point or push a dummy changeset to make sure, but everything should honestly Just Work™.

That’s honestly it; a hair over twenty lines of code got you free continuous delivery. You can get more fancy at this point if you’d like (I’m probably going to make sure the pipeline runs only when certain bookmarks are moved, rather than on every push, for example), but that’s the fundamentals. Three short files, each ten lines or less.

  1. I briefly had what I guess could qualify as an outage when I accidentally ran a deploy on a machine that didn’t have Pygments installed—which promptly deleted every single code snippet on the site. Oops. ↩︎

  2. Relatively speaking; the feature went into beta in March 2016. ↩︎

  3. It’s not free-free, but you get 50 minutes of build time with the free account, and building my blog with Pipelines takes about 16 to 25 seconds, so I figure I’ll be fine for a while. ↩︎

  4. I won’t stop you from using this image, but I really discourage you from doing so; I make zero guarantees I won’t do horrible things to it in the future. ↩︎

  5. Two, actually—one for Windows and one for Unix—but since the Windows Subsystem for Linux has stabilized, all the Windows one does is call the Unix one. ↩︎

  6. I sincerely hope. ↩︎