Adam Rice

My life and the world around me


Dealing with graphics in MS Word

Microsoft Word’s ubiquity is rivaled only by its badness. Since we’re stuck using it–and often using files created by other people in it–we need to find coping mechanisms.

One especially vexing problem in Word is the way it deals with placed graphics. This post isn’t an exhaustive tutorial on how to work with graphics in Word—it lays out one method that will work in most cases, and explains how to make that work.

Let’s say you receive a file that looks something like this, with a placed photo and some text boxes and arrows laid over it to call out features.

[Image: a picture of three adorable cats in a Microsoft Word document – a typical document]

You edit the file, do something seemingly innocuous, and you wind up with something like this:

[Image: a picture of three adorable cats in a messed-up Microsoft Word document – a messed-up document]

Obviously you can’t let the file go out into the world like this, and because you are a good person, you want to leave things better than you found them. So how do you fix this? Or if you’re required to create files like this, how do you prevent this from happening in the first place?


It’s easy to get into trouble with Word any time you try using its page-layout features. If at all possible, it’s best to treat every document as a continuous river of text, rather than isolated blocks. The problem with images is that Word gives you numerous options for treating images as isolated blocks, and exactly one option for treating them as part of that river. When you mix externally created images and graphics that are created in Word, things get complicated. And when these are overlaid on one another, things get even more complicated.

In the image shown above, there’s a photo that was created externally, and three text boxes and arrows that were created within Word. So the first thing to understand is how Word treats these differently: the photo is a picture and the arrows & text boxes are shapes. They have different formatting options available to them. However, interestingly, you can crop a picture in Word using its “picture format” tools, and that turns it into a shape (!).

Most of the trouble you run into with these hybrid images revolves around placement options. Word gives you two sets of parameters for dealing with pictures/shapes in text: positioning and text wrap.

[Image: Microsoft Word’s positioning pane]
[Image: Microsoft Word’s text-wrap pane]

If a visual element’s positioning is “in line with text,” then it behaves like a typed character–it can sit on a line with other characters, it moves around with other elements, etc. And I argue that this should be your goal for most or all visual elements you use in Word. You can set them on their own line, use other techniques to marry them to captions, center them, etc.

With all the other positioning options, the element is anchored to a spot on the page–a certain distance from a corner, for example. If you anchor the element, you use the wrapping options to tell Word how to wrap text around (or over, or under) the element. There may be legitimate reasons to do this, but Word is a rotten tool if that’s what you’re trying to do. I often see files where someone has placed an image with fixed positioning that they really just want inline with text–and then they insert a bunch of carriage returns to put the text down below it. This will break as soon as the text above gets a little longer or shorter.

Also, just for fun, if you set the wrap to “in line with text,” Word automatically does the same for the positioning, and vice-versa. This kind of makes sense, but can be confusing.

To simplify your life, treat each graphic as a standalone block, on its own line, flowing with the text.

This gets complicated when you’re combining a picture with shapes. By default, the picture is placed “inline”. By default, a shape is placed with positioning based on…something—positioning can be relative to the page, margin, paragraph, line, with separate options for horizontal and vertical position. Ain’t nobody got time for that.

So we’re back to inline positioning as the right way.

But with the Orientalist mysticism that you only find in cheesy action movies, Word forces you to do things the wrong way before you can do them the right way. Here’s the trick: we need to manually position the picture and the shapes relative to each other. And Word doesn’t let you manually position elements that have inline positioning–again, it does make sense, but is confusing.

First, make sure that all the visual elements have some kind of positioning that isn’t inline–it doesn’t really matter what.

Second, get all the shapes lined up correctly over the picture that acts as the backdrop. If some of the shapes are getting hidden behind the picture, select the picture and then execute Picture Format > Arrange > Send to Back.

Third, I like to group all the shape elements. This is probably unnecessary. Shift-click on all the elements in turn to select them and then execute Shape Format > Arrange > Group. The image below shows the shape elements grouped together, with a frame around them. You can still separately manipulate the elements in a group–it’s possible to move a grouped element unintentionally; if you need to move the group, you need to grab it by the group’s frame.

[Image: grouped shapes in Word]

Fourth, shift-click to select the grouped shape elements and the background picture, and group those.

Fifth, set the positioning of these grouped elements to “in line with text.” Phew! It’s faster to do than to read.

A lightweight documentation system

Background

Not long after I took on my current role in AAR LLC, I inherited the task of producing the “binders” that the organization prints up every event cycle–basically, operations manuals for the event.

There was a fair amount of overlap between these binders, and I recognized that keeping that overlapping content in sync would become a problem. I studied documentation technologies and techniques, and learned that indeed, this is considered a Bad Thing, and that “single sourcing” is the solution–this would require that the binders be refactored into their constituent chapters, which could be edited individually, and compiled into complete documents on demand.

The standard technology for this is DITA, but that involves a lot of overhead. It would be hard for me to maintain by myself, and impossible to hand off to anyone else. What I’ve come up with instead is still a bit of a work in progress. It has a tech hurdle to overcome–it does involve using the command line–but it should be a lot more approachable.

The following may seem like a lot of work. It’s predicated on the idea that it will pay off by solving a few problems:

  • You are maintaining several big documents that have overlapping content
  • You want to be able to publish those documents in more than one format (web, print, ebook)
  • You want to be able to update your materials easily, quickly, and accurately

The following is Mac-oriented because that’s what I know.

Installation

Install Homebrew

Homebrew is a “package manager” for MacOS. If you’ve never used the command-line shell before, or have never installed shell programs, this is your first step. Think of it as an App Store for shell programs. This makes installing and updating other shell apps much easier.

To install, open the Terminal app and paste in

/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

Note: Don’t paste in shell commands you find on the Internet unless you know what you’re doing or you really trust me. But this is exactly what the nice folks at Homebrew will tell you to do.

Aside: there are a couple of equivalents for Windows: Scoop and Chocolatey. I don’t know anything about them beyond the fact that they exist. As of this writing, neither includes installers for both the following programs, so they may not be much help here.

Install Pandoc

Pandoc is a swiss-army knife tool for converting text documents from one form to another. In the Terminal app, paste in

brew install pandoc

Homebrew will chew on that for a while and finish.

Install GPP

GPP is a “generic preprocessor,” which means that it substitutes one thing for another in your files. In the Terminal app, paste in

brew install gpp

Again, Homebrew will chew on that for a while and finish.

Learning

Learn some shell commands

You’ll at least need to learn the cd and ls commands.

This looks like a pretty good introductory text.
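
If you want to try those two out right away, here’s a minimal sketch (the folder name is just a placeholder):

cd ~/Documents    # change the working directory to your Documents folder
ls                # list the files and folders there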

Learn Markdown

Markdown was created by John Gruber to be a lightweight markup language–a way to write human-readable text that can be converted to HTML, the language of web pages. If you don’t already know the rudiments of HTML, the key thing to remember about it is that it describes the structure of a document, not its appearance. So you don’t say “I want this line to be in 18-pt Helvetica Bold,” you say “I want this line to be a top-level heading.” How it looks can be decided later.

Since then, others have taken that idea and run with it. The author of Pandoc, John MacFarlane, built Pandoc to understand an expanded Markdown syntax that adds a bunch of useful features, such as tables, definition lists, etc. The most basic elements of Markdown are really easy to learn; it has a few less-intuitive expressions, but even those are pretty easy to master, and there are cheat-sheets all over the Internet.
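
To give a flavor of it, here are a few of those basic elements (the text itself is invented):

# A top-level heading

Some body text with *emphasis* and **strong emphasis**.

- A bulleted list item
- Another item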

Markdown is plain text, which means you can write it in anything that lets you produce text, but if you do your writing in MS Word (aside: please don’t), you need to make sure to save as a .txt file, not a .doc or .docx file. There are a number of editors specifically designed for Markdown that will show a pane of rendered text side-by-side with what you’re typing; there’s even a perfectly competent online editor called Dillinger that you can use.

I’ve gotten to the point where I do all my writing in Markdown, except when required to use some other format for my work. There are a lot of interesting writing apps that cater to it, writing is faster, and files are more portable and smaller.

Organization

Refactor files and mark them up

Getting your files set up correctly is going to be more work than any other part of this. You’ll need to identify areas of overlap, break those out into standalone documents, decide on the best version of those (assuming they’re out of sync), and break up the rest of the monolithic documents into logical chunks as well. I refer to the broken-up documents as “component files.”

Give your files long, descriptive names. For redundancy, I also identify the parent documents in braces right in the filename, e.g. radio_channel_assignments_{leads}_{gate}.md. Using underscores instead of spaces makes things a little easier when working in the shell. Using md for the dot-extension lets some programs know that this is a Markdown file, but you could also use txt.

Then you’re going to mark these up in Markdown. If your files already have formatting in MS Word or equivalent, you’re going to lose all that, and you’ll need to make some editorial decisions about how you want to represent the old formatting (remember: structure, not appearance). Again, this will be a fair bit of work, but you’ll only need to do it once, and it will pay off.

Organize all these component files in a single directory. I call mine sources.

Create Variables

This is optional, but if you know that there are certain bits of information that will change regularly, especially bits that appear repeatedly throughout your documents, save yourself the trouble of finding and replacing them. Instead, go through your component files and insert placeholders. Use nomenclature that will be obvious to anyone looking at it, like THE_YEAR or FLAVOR_OF_THE_MONTH. You don’t need to use all caps, but that does make the placeholders more obvious. You cannot use spaces, so use underscores, hyphens, or camelCasing.

Now, create a document called variables.txt. Its contents should be something like this:

#define THE_YEAR 2018
#define FLAVOR_OF_THE_MONTH durian
…

And so on. Each of these lines is a command that GPP will interpret and will substitute the first term with the second. This lets you make all those predictable changes in one place. Save this in your sources directory.
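
To see how this plays out, here’s a hypothetical line from a component file, before and after GPP runs:

Before: The FLAVOR_OF_THE_MONTH tasting is scheduled for THE_YEAR.
After: The durian tasting is scheduled for 2018.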

You can get into stylistic problems if you begin a sentence with a placeholder that gets substituted with an uncapitalized replacement. There may be a good solution, but I haven’t figured it out. You should be able to write around this in your component docs.

Create TOCs

In order to rebuild your original monolithic documents from these pieces, you’ll want to create what I call a TOC (table-of-contents) file for each document. This does not actually insert a table of contents anywhere; it simply defines what the constituent files are and, when you run GPP, tells it to assemble its output file from them.

I like to keep each TOC in a separate directory at the same level as my sources directory (this gives me a separate directory to contain my output files), so my directory tree looks like this:

My Project
     gate
        gate-toc.txt
     leads
        leads-toc.txt
     sources
        variables.txt
        radio_channel_assignments_{leads}_{gate}.md
        …

The contents of each TOC file will look something like this:

#include ../sources/variables.txt
#include ../sources/radio_channel_assignments_{leads}_{gate}.md
…

Because the TOC file is nested in a directory adjacent to the sources directory, you need to “surface” and then “dive down” into the adjacent directory. The leading ../ is what lets you surface, and the sources/ is what lets you dive down into a different directory.

Compilation & conversion

So you’ve got your files refactored and marked up, you’ve got your variables set up, you’ve got your TOCs laid out. Now you want to get back what you had before, so it’s time for the command line.

Open the Terminal app, type cd followed by a space, drag the folder containing the TOC file you want to compile into the Terminal window (this will insert the correct path), and hit “return”. Use the ls command to confirm that the only file you can see is the TOC file you want to compile.
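
In practice, that step looks something like this (the path is a made-up example; yours is whatever you dragged in):

cd /Users/you/My\ Project/gate    # the drag-and-drop fills in this path for you
ls                                # should list only gate-toc.txt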

Now it’s time to run the following command:

gpp source-toc.txt -o source.md

This says “tell GPP to read in the file source-toc.txt, run all the commands in it, and create an output file based on it called source.md”. Make whatever filename substitutions are appropriate. The output file will be in the same directory as the TOC file. This will be a big Markdown file that is assembled from all the component files in the TOC, with all the variable substitutions performed.

Now that you have a Markdown file, the world is your oyster. Some content-management systems can interpret Markdown directly. WordPress requires the Jetpack plugin, but that’s easily installed. So depending on how you’ll be using that document, you may already be partly done.

If you want to convert it to straight HTML, or to an MS Word doc, or anything else, now it’s time to run Pandoc. Again, in the Terminal app, type this in:

pandoc source.md -s -o source.html

This says “tell Pandoc to read in the file source.md and create a standalone (-s) output file called source.html”. Pandoc will create HTML files lacking headers and footers if you leave out the -s. It figures out what kind of output file you want from the dot-extension, and can also produce MS Word files and a host of others. It uses its own template files as the basis for its output files, but you can create your own template files and direct Pandoc to use those instead.
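
A couple of hedged variations (the filenames and template name here are placeholders): Pandoc infers the format from the output extension, and because GPP writes to standard output when you leave off -o, you can pipe the two tools together and skip the intermediate file.

pandoc source.md -s -o source.docx
pandoc source.md -s --template mytemplate.html -o source.html
gpp source-toc.txt | pandoc -s -o source.docx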

I do my print documents in InDesign, and Pandoc can produce “icml” files that InDesign can flow into a template. Getting the styles set up in that template takes some trial and error, but again, once you’ve got it the way you like it, you don’t need to mess with it again.
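
That conversion follows the same pattern; a sketch with a placeholder filename:

pandoc source.md -s -o source.icml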

Shortcomings and prospects

The one thing this approach lacks is any kind of version control. In my case, I create a directory for every year, and make a copy of the source directory and the rest inside the year directory. This doesn’t give me true version control–I rely on Time Machine for that–but it does let me keep major revisions separate. Presumably using Git for my sources would give me more granular version control.
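
If you wanted to experiment with that, a minimal sketch (assuming Git is installed–brew install git will do it) looks like this:

cd sources                             # or wherever your component files live
git init                               # turn the directory into a repository
git add .                              # stage every file in it
git commit -m "2018 binder sources"    # record a snapshot you can return to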

Depending on what your output formats are going to be, placed images can be a bother. I haven’t really solved this one to my satisfaction yet. You may want to use a PDF-formatted image for the print edition and a PNG-formatted image for the web; GPP does let you set conditionals in your documents, but I haven’t played with that yet.
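
I haven’t tested this, but a sketch of that route might look like the following: wrap the two image references in a GPP conditional inside the component file (the PRINT macro name and the filenames are my inventions), then define that macro only when compiling the print edition.

#ifdef PRINT
![Gate map](gate_map.pdf)
#else
![Gate map](gate_map.png)
#endif

gpp -DPRINT=1 source-toc.txt -o source-print.md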

In fact, I haven’t really scratched the surface of everything that I could be doing with GPP and Pandoc, but what I’ve documented here gives me a lot of power. I’ve also recently learned of a different preprocessor simply called PP, which subsumes GPP and can also turn specially formatted text into graphical diagrams using other shell programs, such as Graphviz. I’m interested in exploring that.

The walled gardens of shit

Over a century ago, King Gillette pioneered the razors-and-blades business model. The DMCA led to a new twist on this: companies have been trying to force you to buy their blades in particular by slapping microchips on them–even when those things don’t really have any need of a microchip–because that makes it illegal to reverse engineer them.

This gave us the Keurig coffee machine, which has been successful, but has been deservedly criticized–even by its inventor–for its wastefulness. Keurig attempted to add DRM to their pods, although that backfired.

Catering to the herd mentality of the investor class (“It’ll be like Amazon, but for X!” “It’ll be like Facebook, but for X!” “It’ll be like Uber, but for X!”), this has led to…

The Juicero, a massively over-engineered $400 (marked down from $700) gadget that squeezed $8 DRM-laden bags of fruit pulp into juice. It flopped.

Then the Teforia, a $400 gadget (marked down from $1000) that makes tea from DRM-laden pods that cost $1 each or more. It flopped.

Now this thing, a spice dispenser that uses DRM-laden spice packets that cost about $5 a pop (spices obviously vary in price, and it’s not clear how much comes in one of their packets, but I just bought 4 tbsp of cinnamon for $0.35).

These Keurig imitators represent an intersection of at least two bad trends: the Internet of Shit, where stuff that has no need of ensmartening is gratuitously connected to the Internet–a logical consequence of sticking unnecessary DRM-enabling chips on things, with those chips getting cheaper and more powerful–and the walled gardens of yore, like AOL–which companies like Facebook and Google have been attempting to reconstruct on top of the Internet ever since. So now we’ve got walled gardens of shit, filling up with their own waste products. Happily, the market seems to be rejecting these.

Big-number cheat sheet and BetterTouchTool

BetterTouchTool is one of my favorite Mac utilities. A real sleeper: originally it just let you create new trackpad gestures (or remap existing ones), and that was useful enough on its own, but it’s been beefed up with more and more interesting features. One feature I just discovered is that it can display a floating window with any HTML you want. This is a perfect way to show my Big Number Cheat Sheet, which is handy for checking your work when dealing with, well, big Japanese numbers.

To use this, open up BTT, add a new triggering event (it can be triggered by a key command or text string, trackpad, whatever), and add the action Utility Actions > Show Floating Web View/HTML menu. Give it a name, set it to a width of 500 and a height of 750, and paste the following in directly. Be sure to enable “show window buttons” and/or “close when clicking outside” or the window won’t go away.

<!DOCTYPE html>
<html>
<head>
    <meta charset="utf-8" />
    <title> </title>
    <style> 
        body {
        background-color: #fff;
        font-family: helvetica;
        font-size: 14px;
        line-height: 18px;
        }
        table {
        border-collapse: collapse;
        }
        tr, td, th {
        border: none;
        }
        tr {
        border-bottom: 1px solid #ddd;
        }
        table tr td:nth-child(1), table tr th:nth-child(1) {
        width: 7em;
        padding: 0.5em;
        text-align: right;
        }
        table tr td:nth-child(2), table tr th:nth-child(2) {
        width: 12em;
        padding: 0.5em;
        text-align: left;
        }
        table tr td:nth-child(3), table tr th:nth-child(3) {
        padding: 0.5em;
        text-align: left;
        }
        tr:hover {
        color: #ddd;
        background-color: #333;
        }
    </style>
</head>
<body>
<h1>
    Big number cheatsheet 
</h1>
<table>
    <tr>
        <th> 和 </th>
        <th> English </th>
        <th> Number </th>
    </tr>
    <tr>
        <td> 一万 </td>
        <td> ten thousand </td>
        <td> 10,000 </td>
    </tr>
    <tr>
        <td> 十万 </td>
        <td> one hundred thousand </td>
        <td> 100,000 </td>
    </tr>
    <tr>
        <td> 百万 </td>
        <td> one million </td>
        <td> 1,000,000 </td>
    </tr>
    <tr>
        <td> 千万 </td>
        <td> ten million </td>
        <td> 10,000,000 </td>
    </tr>
    <tr>
        <td> 一億 </td>
        <td> one hundred million </td>
        <td> 100,000,000 </td>
    </tr>
    <tr>
        <td> 十億 </td>
        <td> one billion </td>
        <td> 1,000,000,000 </td>
    </tr>
    <tr>
        <td> 百億 </td>
        <td> ten billion </td>
        <td> 10,000,000,000 </td>
    </tr>
    <tr>
        <td> 千億 </td>
        <td> one hundred billion </td>
        <td> 100,000,000,000 </td>
    </tr>
    <tr>
        <td> 一兆 </td>
        <td> one trillion </td>
        <td> 1,000,000,000,000 </td>
    </tr>
    <tr>
        <td> 十兆 </td>
        <td> ten trillion </td>
        <td> 10,000,000,000,000 </td>
    </tr>
    <tr>
        <td> 百兆 </td>
        <td> one hundred trillion </td>
        <td> 100,000,000,000,000 </td>
    </tr>
    <tr>
        <td> 千兆 </td>
        <td> one quadrillion </td>
        <td> 1,000,000,000,000,000 </td>
    </tr>
    <tr>
        <td> 一京 </td>
        <td> ten quadrillion </td>
        <td> 10,000,000,000,000,000 </td>
    </tr>
</table>
</body>
</html>

Smartphones, image processing, and spectator sports

I’ve done a couple of translations recently that focus on wireless communications, and specifically mention providing plentiful bandwidth to crowds of people at stadiums. Stadiums? That’s weirdly specific. Why stadiums in particular?

My hunch is that this is an oblique reference to the 2020 Tokyo Olympics. OK, I can get that. 50,000 people all using their smartphones in a stadium will place incredible demands on bandwidth.

Smartphones are already astonishingly capable. They shoot HD video. Today’s iPhone has something like 85 times the processing power of the original. I can only wonder what they’ll be like by the time the Tokyo Olympics roll around.

So what would a stadium full of people with advanced smartphones be doing? Probably recording the action on the field. And with all that bandwidth that will supposedly be coming online by then, perhaps they’ll be live-streaming it to a service like Periscope. It’s not hard to imagine that YouTube will have a lot of live-streaming by then as well.

This by itself could pull the rug out from underneath traditional broadcasters. NBC has already paid $1.45 billion for the rights to broadcast the 2020 Olympics in the USA. But I think it could be much, much worse for them.

In addition to more powerful smartphones, we’ve also seen amazing image-processing techniques, including the ability to remove obstructions and reflections from images, to correct for image shakiness, and even to smooth out hyperlapse videos. YouTube will stabilize videos you upload if you ask nicely, and it’s very effective. And that’s all happening already.

So I’m wondering what could be done in five more years with ten or so smartphones distributed around a stadium, recording the action, and streaming video back to a central server. The server could generate a 3D representation of the scene, use the videos to texture-map the 3D structure, and let the viewer put their viewpoint anywhere they wanted. Some additional back-end intelligence could move the viewpoint so that it follows the ball, swings around obstructing players, etc.

So this could be vastly more valuable than NBC’s crap story-inventing coverage. It might be done live or nearly live. It would be done by people using cheap personal technology and public infrastructure. The people feeding video to the server might not even be aware that their video is contributing to the synthesized overall scene (read your terms of service carefully!).

If that happened, the only thing you’d be missing would be the color commentary and the tape delay. Smartphones could kill coverage of sporting events.

Of course, the Olympics and other spectator sports are big businesses and won’t go down without a fight. At the London Olympics, a special squad of “brand police” [had] the power to force pubs to take down signs advertising “watch the games on our TV,” to sticker over the brand-names of products at games venues where those products were made by companies other than the games’ sponsors, to send takedown notices to YouTube and Facebook if attendees at the games have the audacity to post their personal images for their friends to see, and more. What’s more, these rules are not merely civil laws, but criminal ones, so violating the sanctity of an Olympic sponsor could end up with prison time for Londoners. Japan could do much the same in 2020. But if these videos were being served by a company that doesn’t do business in Japan, the show could go on. More extreme measures could be taken to block access to certain IP addresses, deep-packet inspection, etc. Smartphones could use VPNs in return. It could get interesting.

Old-school information management

Applied Secretarial Practice

I recently picked up the book “Applied Secretarial Practice,” published in 1934 by the Gregg Publishing Company (the same Gregg of Gregg shorthand). It’s fascinating in so many ways—even the little ways that language has changed. Many compound words were still separate, e.g. “business man.” The verb “to emphasize” seemingly did not exist, and is always expressed as “to secure emphasis.” And the masculine is used as the generic third-person pronoun rigorously, even when referring to secretaries, who were universally women at that time.

There’s a whole chapter on office equipment, most of which is barely recognizable today, of course. The dial telephone was a fairly recent innovation at that time, and its use is explained in the book.

But what really strikes me is that, out of 44 chapters, 8 are on filing. You wouldn’t think that filing would be such a big deal (well, I wouldn’t have thought it). You would be wrong. What with cross-references, pull cards, rules for alphabetizing (or “alphabeting” in this book) in every ambiguous situation, different methods of collation, transfer filing, and so on, there’s clearly a lot to it.

It got me thinking about how, even though I have pretty rigorous file nomenclature and folder hierarchies on my computer, I’m not organizing my files with anything like the level of meticulous care that secretaries back then practiced as a matter of course. For the most part, if I want to find something on my computer (or anywhere on the Internet), I can search for it.

And that reminded me of a post by Mark Pilgrim from years ago, Million dollar markup (linking to the Wayback Machine post, because the author has completely erased his web presence). His general point was that when you control your own information, you can use “million dollar markup” (essentially metadata and structure) to make that information easier to search or manipulate; a company like Google has “million dollar search” to deal with messy, disorganized data that is outside their control. Back in 1934, people had no choice but to apply million-dollar markup to their information if they wanted to have any hope of finding it. The amount of overhead in making a piece of information retrievable in the future, and retrieving it, is eye-opening.

Consider that to send out a single piece of business correspondence, a secretary would take dictation from her boss, type up the letter (with a carbon), perhaps have her boss sign it, send the original down to the mailroom, and file the copy (along with any correspondence that letter was responding to). It makes me wonder what would have been considered a reasonable level of productivity in 1934. I’ve already sent 17 pieces of e-mail today. And written this blog post. And done no extra work to ensure that any of it will be retrievable in the future, beyond making sure that I have good backups.

Economics of software and website subscriptions

It’s a truism that people won’t pay for online media, except for porn. That’s a little unfair. I’m one of many people who have long had a pro account on flickr, which costs $25/year. Despite flickr’s ups and downs, I’ve always been happy to pay that. It also set the bar for what I think of as a reasonable amount to pay for a digital subscription: I give them $25, they host all the photos that I want to upload, at full resolution. Back when people still used Tribe.net, they offered “gold star” accounts for $6/month, which removed the ads and gave you access to a few minor perks, but mostly it was a way to support the website. The value-for-money equation there wasn’t quite as good as with flickr, in my opinion, but I did have a gold-star account for a while.

Looking around at the online services I use, I see there are a few that are offering some variation on premium accounts. Instapaper offers subscriptions at $12/year, or about half of my flickr benchmark. The value-for-money equation there isn’t great—the only benefit I would get is the ability to search saved articles—but it’s a service I use constantly, and it’s worth supporting. Pinboard (which has a modest fee just to join in the first place) is a bookmarking service that offers an upgraded account for $25/year; here, the benefit is in archiving copies of web pages that you bookmark. I can see how this would be extremely valuable for some people, but it’s not a high priority for me. I use a grocery-list app on my phone called AnyList that offers a premium account for $10/year; again, the free app is good enough for me, and the benefits of subscribing don’t seem all that relevant.

In terms of value for money, none of these feel like great deals to me. Perhaps because the free versions are as good as they are, or perhaps because the premium versions don’t offer that much more, or some combination of the two. But I use and appreciate all these services, and maybe that’s reason enough that I should subscribe.

At the other end of the scale, there’s Adobe, which has created quite a lot of resentment by converting its Creative Suite to a subscription model, for the low, low price of $50/month. This offends designers on a primal level. It’s like carpenters being required to rent their hammers and saws. The thing is that $50/month is a good deal compared to their old packaged product pricing, assuming that you would upgrade roughly every two years. The problem is that the economic incentives are completely upside down.

Once upon a time, Quark XPress was the only game in town for page layout, and then Adobe InDesign came along and ate their lunch. Quark thought they had no competition, and the product stagnated. Now Adobe Creative Cloud is pretty much the only game in town for vector drawing, photo manipulation, and page layout.

With packaged software, the software company needs to offer updates that are meaningful improvements in order to get people to keep buying them. Quark was slow about doing that, which is a big part of the reason that people jumped ship. With the subscription model, Adobe uses the subscription as a ransom: when customers stop subscribing, they lose the ability to even access their existing files. Between the ransom effect and the lack of meaningful competition, Adobe has no short-term incentive to keep improving their product. In the long term, a stagnant product and unhappy customers will eventually encourage new market entrants, but big American companies are not noted for their long-term perspective.

I think that’s the real difference here, both psychologically and economically: I can choose to subscribe to those smaller services, or choose not to. They all have free versions that are pretty good, and if any of them wound up disappearing, they all have alternatives I could move to. With Adobe, there are no alternatives, and once you’re in, the cost of leaving is very high.

Word processors and file formats

I’ve always been interested in file formats from the perspective of long-term access to information. These have been interesting times.

To much gnashing of teeth, Apple recently rolled out an update to its iWork suite—Pages, Numbers, and Keynote, which are its alternatives to the MS Office trinity of Word, Excel, and Powerpoint. The update on the Mac side seems to have been driven by the web and iPad versions. Not only in the features (or lack thereof), but in the new file format, which is completely unrelated to the old one. The new version can import the files from the old one, but it’s definitely an importation process, and complex documents will break in the new apps.

The file format for all the new iWork apps, Pages included, is based on Google’s protocol buffers. The documentation for protocol buffers states:

However, protocol buffers are not always a better solution than XML – for instance, protocol buffers would not be a good way to model a text-based document with markup (e.g. HTML), since you cannot easily interleave structure with text. In addition, XML is human-readable and human-editable; protocol buffers, at least in their native format, are not. XML is also – to some extent – self-describing. A protocol buffer is only meaningful if you have the message definition (the .proto file).

Guess what we have here. Like I said, this has been driven by the iPad and web versions. Apple is assuming that you’re going to want to sync to iCloud, and they chose a file format optimized for that use case, rather than for, say, compatibility or human-readability. My use case is totally different. I’ve had clients demand that I not store their work in the cloud.

What’s interesting is that this bears some philosophical similarities to the Word file format, whose awfulness is the stuff of legend. Awful, but perhaps not awful for the sake of being awful. From Joel Spolsky:

The first thing to understand is that the binary file formats were designed with very different design goals than, say, HTML.

They were designed to be fast on very old computers.

They were designed to use libraries.

They were not designed with interoperability in mind.

New computers are not old, obviously, but running a full-featured word processor in a Javascript interpreter inside your web browser is the next best thing; transferring your data over a wireless network is probably the modern equivalent of a slow hard drive in terms of speed.

There is a perfectly good public file format for documents out there, Rich Text Format or RTF. But curiously, Apple’s RTF parser doesn’t do as good a job with complex documents as its Word parser—if you create a complex document in Word and save it as both .rtf and .doc, Pages or Preview will show the .doc version with better fidelity. Which makes a bit of a joke out of having a “standard” file format. Since I care about file formats and future-proofing, I saved my work in RTF for a while. Until I figured out that it wasn’t as well supported.

What about something more basic than RTF? Plain text is, well, too plain: I need to insert commentary, tables, that sort of thing. Writing HTML by hand is too much of a PITA, although it should have excellent future-proofing.

What about Markdown? I like Markdown a lot. I’m actually typing in it right now. It doesn’t take long before it becomes second nature. Having been messing around with HTML for a long time, I prefer the idea of putting the structure of my document into the text rather than the appearance.

But Markdown by itself isn’t good enough for paying work. It has been extended in various ways to allow for footnotes, commentary, tables, etc. I respect the effort to implement all the features that a well-rounded word processor might support through plain, human-readable text, but at some point it just gets to be too much trouble.

Markdown has two main benefits: it is highly portable and fast to type—actually faster than messing around with formatting features in a word processor. These extensions are still highly portable, but they are slow to type—slower than invoking the equivalent functions in a typical WYSIWYG word processor. The extensions are also more limited: the table markup doesn’t accommodate some of the insane tables that I need to deal with, and doesn’t include any mechanism for specifying column widths. Footnotes don’t let me specify whether they’re footnotes or endnotes (indeed, Markdown is really oriented toward flowed onscreen documents, where the distinction between footnotes and endnotes is meaningless, rather than paged documents).

CriticMarkup, the extension to Markdown that allows commentary, starts looking a little ungainly. There’s a bigger philosophical problem with it though. I could imagine using Markdown internally for my own work and exporting to Word format (that’s easy enough thanks to Pandoc), but in order to use CriticMarkup, I’d need to convince my clients to get on board, and I don’t think that’s going to happen.
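
For reference, here’s roughly what the table and footnote extensions look like in Pandoc’s Markdown (the content is invented):

| Deliverable | Due date |
|-------------|---------:|
| Draft       | June 1   |
| Final       | July 15  |

The schedule is negotiable.[^1]

[^1]: Within reason.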

I can imagine a word processor that used some kind of super-markdown as a file format, let the user type in Markdown when convenient, but added WYSIWYG tools for those parts of a document that are too much trouble to type by hand. But I’m not holding my breath. Maybe I should learn LaTeX.

PDFs, transparencies, and Powerpoint

This post is for anyone who gets frustrated trying to place vector art in a Powerpoint deck.

Gwen had a project to produce a set of ppt templates, using vector art provided by the client. Copying from Illustrator and pasting into Powerpoint, it looked fine, but saving and reopening the file showed that the vector art had been rasterized—badly.

We tried a few variations on this. Saving as PDF and placing the PDF had the same result. Saving as EMF and placing that did keep it as vector artwork, but the graphic wound up being altered in the process.

Other graphics created in Illustrator could be pasted or placed just fine, so there had to be something about this particular graphic. Although it was relatively simple, it included a couple of potential gotchas: it had one element with transparency set on it, and another with a compound path.

It was pretty easy to release the compound path and reconstruct around it—a big O with the center knocked out to expose the background. I’m pretty sure that wasn’t the problem, but it wound up helping anyhow, as I’ll discuss.

Dealing with the transparency was a little more of an issue: a transparent chevron floated over a couple of different solid colors, including the big O. To fix this, I used Pathfinder’s Divide tool to segment that chevron into separate pieces for each color it was floating over, and then set solid colors for each segment rather than relying on changing the opacity. Experimentation showed that the transparency was what triggered the rasterization.

Reproducing this process showed some artifacts in Powerpoint if the compound path was still present, so that wasn’t the problem, but it was a problem. Admittedly, this was only feasible because the image was simple, and the transparent element only covered three solid color fields, with no gradients or pattern fills—it would still be possible with those complications, but it would take a lot longer to approximate the original appearance.

Update: And if I was better at Illustrator, I would have realized the “Flatten Transparency…” command does exactly this, in one step. That would be the way to go.

This experiment was performed using two Macs, with PowerPoint 2011 and both the CS4 and Creative Cloud versions of Illustrator.

Phone report

Gwen and I decided to upgrade to the new iPhone 5, and along with that, I decided to switch carriers to Verizon. We’d previously been with AT&T, and Verizon was the one service that neither one of us had ever tried.

AT&T has notoriously bad service in San Francisco and New York from what I understand, but I had never had any trouble with them in Austin—except when there’s a big event in town that brings an influx of tens of thousands of visitors (and they’ve actually gotten pretty good about dealing with that). They do have lousy service out in the sticks—when I was riding the Southern Tier, I went a couple of days at a time without a signal. Verizon has better coverage in remote areas, including the site where Flipside is held, and now that I’m on the LLC, it will be more important for people to be able to reach me easily out there.

But so far, Verizon in Austin is not so great. I had no signal at all when I was inside Breed & Co on 29th St the other day. And Gwen had no data signal at Central Market on 38th St. And sound quality on voice calls seems to be worse than AT&T’s (this could be the phone itself, but I suspect it’s a voice codec issue). Usually, when I am getting a signal, it’s with LTE data, which is very fast. So there’s that.

And while I always felt that AT&T regarded me as an adversary, Verizon seems to regard me as a mark, which is even more galling than the poorer coverage. Immediately after signing up, I started getting promotional text-message spam from them. Apparently this can be disabled if you do the electronic equivalent of going into a sub-basement and shoving aside a filing cabinet marked “beware of the panther.” We also have those ARPU-enhancing “to leave a callback number…” messages tacked onto our outgoing voicemail greetings; some research showed that there are ways to disable this that vary depending on what state you live in (!), but none of them have worked for me so far. I’ve put in a help request. And every time I log into their website (mostly to put in help requests to deal with other annoying aspects of their service), they pop up some damn promotion that’s irrelevant to me. Like “get another line!”. Out of all the mobile carriers, the only one that I liked dealing with was T-mobile—but they’ve got the poorest coverage in Austin (I had to walk 2 blocks away from Gwen’s old place to get a signal), or anywhere else for that matter. As a friend who worked in the mobile-phone industry for years put it “They all suck.”

No complaints about the phones. I haven’t really tried out some of the new hardware features, like Bluetooth 4.0. The processor is much faster. The screen is noticeably better than on the iPhone 4, in addition to being bigger. People bitch about Apple’s Maps app. In Austin, I haven’t had any trouble with it, and in any case, Maps+ is available to give you that Google Maps feeling (in Iceland, I found that neither Apple Maps nor Google Maps had a level of granularity down to the street address—the best they could do was find the street).
