Fear Lying Upon a Pallet

Almost all of my recent work has been using NoSQL solutions, my favorite of which is CouchDB.  Easily the best feature of Couch is the RESTful JSON API that it uses to provide data.  Because you get your DB queries back directly as JavaScript objects, you don't have to worry about application servers or middle-tier systems for N-tier development.  This is HUGE and makes the whole web development process (and most mobile applications are actually web apps) much cleaner, faster, and more functional for the end user.

Couch does have a couple of weaknesses.  The one that has been giving me the most headaches is the lack of documentation for the query parameters the server accepts when requesting a JSON view (map/reduce).  So here are a number that I have found useful over the last few months.  I will update this list as I find more.

  • key=abc The most commonly passed option to a given CouchDB view.  This provides a way to select a single unique (well, probably unique) key from a given view.  That said, view keys DON'T HAVE TO BE UNIQUE in CouchDB, meaning that if more than one row matches abc, all of those rows will be returned.  (See the curl examples after this list for what these options look like on the wire.)
  • keys=[abc,123,ACC] A JSON-encoded list of keys to use in a given map/reduce query.  Basically the same as above, but without the need for multiple network requests.
  • startkey=abc Used with endkey=abC to provide range selection for a given view.  startkey will accept (as valid input) anything that would be valid as a standard CouchDB view key, even JSON objects.  So think startkey=[a]&endkey=[a,{}] to get a range of all keys matching [a,somethingElse].
  • endkey=abC Counterpart of startkey; see the above reference.  One thing to note: it is generally better to specify an object at the end of a range if you want to inclusively select the range.  So {} is a better end-range value than ZZZZZZZZ is, because objects sort after every string in CouchDB's collation order.
  • limit=100 Selects only the first N results.  This parameter is particularly useful for paginated results (like "showing 1-100 of 399") and reduces network bandwidth for a given request.  Because the view index is built and cached on the server, the response time isn't any faster, but there is less data to download.
  • skip=100 Works with the limit parameter above to return a windowed result set.  For example, you can limit the returned results to the 100 documents from 101 through 200 (think email counts in Gmail) with ?limit=100&skip=100.
  • descending=true Reverses the order of the returned results.  Will also work with limit, skip, startkey, etc… (though note that because the traversal is reversed, startkey and endkey swap roles when descending).
  • group=true The default result for a given map/reduce function (which has been re-reduced) is a total, i.e. a single number.  In my case this is seldom the result I am actually looking for, so this option provides the bridge between the full re-reduce and what is most commonly sought: the grouped result.  When this option is passed, the reduce results are returned grouped by the map keys.  Instead of a single row with {key:null, value:9999} you will get multiple rows with the key being the map key, i.e. [{key:"bob",value:444},{key:"tom",value:555}].  If you create design documents and view them inside Futon, group=true is the default, which can be a little confusing when you actually make a JSON request and find you get a different result.
  • group_level=2 An alternative to the above parameter, the group_level option will actually group the resulting reduce by the number of levels specified IF your key is an array with at least that many elements.  While the example here is two levels, the number can be as many array places as your key has.  This becomes particularly helpful when working with years and dates.  For a detailed example check out this post.  That said, group=true is the functional equivalent of group_level=exact.
  • reduce=false Turns OFF the reduce function for a given map/reduce query.  This is the default if no reduce is defined, but you can override it on views that DO have a reduce function if you only want the results of the map.
  • include_docs=true For map queries (that are not running a reduce) this option will include the original document in each row of the final result set.  This means the structure of your JSON rows will be {id, key, value, doc} instead of the normal {id, key, value}.  This will save you an additional JSON request if you are using the map query as a lookup for a particular document.
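
To make those options concrete, here are a few of them as raw HTTP requests.  This is only a sketch: the database name mydb and the design doc/view names (app, by_name, by_date) are placeholders for whatever your own setup uses.  Also note that key values must be valid JSON on the wire, so strings need quotes:

curl 'http://localhost:5984/mydb/_design/app/_view/by_name?key="abc"'
curl 'http://localhost:5984/mydb/_design/app/_view/by_name?startkey=["a"]&endkey=["a",{}]'
curl 'http://localhost:5984/mydb/_design/app/_view/by_name?limit=100&skip=100'
curl 'http://localhost:5984/mydb/_design/app/_view/by_date?group_level=2'
curl 'http://localhost:5984/mydb/_design/app/_view/by_name?reduce=false&include_docs=true'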

Landed on Us

The new graphical boot splash for Linux is a program called Plymouth.  It provides features like kernel mode-setting graphics, flicker-free boot messages, full boot logging, and… animation. The install is pretty simple; as the root user do the following:

yum -y install plymouth-theme-*
plymouth-set-default-theme --list (to see a list of all installed plymouth themes)
plymouth-set-default-theme nameOfMyTheme -R

Of particular note: the -R flag is different from earlier releases of Plymouth, which required you to run the command plymouth-rebuild-initrd after setting the theme.  Most tutorials online list the old way of rebuilding the initrd, and following them will leave you with an unchanged system.

One of the nice features of Plymouth is that the boot splash starts while the initial RAM disk image is being loaded, before the actual boot process.  This means you get the pretty boot image while you are doing things like entering your hard drive decryption passphrase.
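
As an aside, if you want to preview a theme without a full reboot, something like the following has worked for me as root (treat it as a rough sketch; plymouthd is the standalone splash daemon):

plymouthd
plymouth --show-splash
plymouth quit (run after you have admired the splash for a few seconds)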

that the ripest might fall

Git is simply amazing when it comes to branching and merging.  I probably have half a dozen branches, each with a unique feature I am working on.  Git makes combining and working with branches so easy that it simply seems natural to store test-functionality on different branches, across multiple repositories, and between different developers.

…and HOLY CRAP is it fast!

That said I hit a problem today that I had to hunt down the answer to, so I am posting it here for easy reference in the future.  The basic issue is that whenever you create a new branch that is then pushed to someone else as a remote branch, git automatically (as would be logical) associates those branches together because they share the same name.  This makes future pushes easy and allows other users to continually get updates when you have commits you want to make available.

The problem occurs when you try to PULL from that new remote branch (because the other user or repository has made some changes of their own.)  Git does NOT automatically associate pulls with the remote branch of the same name, even though it associates pushes.  So how do you fix this?  The error message says to modify your config file and look at the git-pull man page, but git's command set is complex enough that this route could quickly cause insanity.  I probably spent an hour looking through documentation.

The answer was, like you didn't know where this was going, Google.  Ultimately you can fix the problem with a fairly simple command that WILL AUTOMATICALLY update your config.

git branch --set-upstream yourBranchName origin/yourBranchName

And that is it! Hopefully that saves someone the time that I lost.
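
One more note: if you know you will want the association ahead of time, I believe you can avoid the whole problem by passing -u (a.k.a. --set-upstream) on the first push, which records the same tracking information in your config:

git push -u origin yourBranchName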

Another fix I ran across comes from an initial push to a remote repository that is empty. Because the original (empty) repository shares no history with my local repository (to which I have already added files), git gets upset. The error generated looks something like this:

No refs in common and none specified; doing nothing.
Perhaps you should specify a branch such as 'master'.
fatal: The remote end hung up unexpectedly

It can be fixed with the following command:

git push origin master

Then all remaining pushes should be fine, as you now have a shared reference to use for future pushes.

Geniuses remove it

I am starting to believe that in the Poettering household, simplicity was considered a cancer that must be tortured and destroyed with extreme vigor.  Systemd is quickly becoming thoroughly ubiquitous in Linux systems everywhere.  While systemd tries to do everything for everybody (it is supposed to eventually replace sysvinit, chkconfig, automount, logging, cron, and a whole host of other things), ultimately its primary intent is to speed up the boot process.  It does this job exceedingly well.  This concern about boot time is a direct response to the speed at which other Unix-based OSes boot; the reference material even explicitly points to Apple.

That said, sysvinit did have one thing going for it… IT WAS SIMPLE.  Heck, just getting a list of available services is a pain in the ass now and generally requires looking up documentation just to remember how to do it.  Simple actions in systemd are annoyingly complex with a cheat sheet that looks like it was written by a Perl regular expression programmer on acid.  I will be the first to admit that systemd-analyze plot is pretty awesome and, considering that systemd was designed by the same guy who created PulseAudio, we should probably be thankful that it isn’t even MORE complex.  But still, something just seems wrong about using an all-for-everything program on an OS that was designed to be simple and efficient.
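
For my own future reference, the incantation for listing services turns out to be the following (roughly the systemd equivalents of chkconfig --list and service --status-all):

systemctl list-unit-files --type=service (every installed service and whether it is enabled)
systemctl list-units --type=service (only the services currently loaded, with their run state)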

to create a space for them

OK, first I have a new favorite quote:

Concentrated power is not rendered harmless by the good intentions of those who create it.
–Milton Friedman

And second… Well, this was a topic I had not expected to be posting about again, but over the last couple of weeks I have found myself spending more and more time building RPM packages for Fedora.  Thankfully the development stack (and documentation) for Fedora is noticeably better than it was for Redhat 9.  So, in my usual fashion, I am listing some of the more useful information I have RECENTLY come across for building RPM packages on Fedora 16.

  • Recommended Method for adding Users & Groups — A Fedora wiki page that discusses the best way to add new users to a system during the rpm install process (see the sketch after this list).  There is no recommendation for REMOVING users during uninstall.  Additionally, rpmlint will scream about unregistered users if you don't provide it with reference users; this bugzilla report discusses how best to alleviate that problem.
  • Packaging Tricks — A stupidly useful Fedora wiki article discussing common issues/fixes for doing package builds.  Some of them are simple look-up problems (like knowing which package groups are available), while some are much more advanced package configuration tips (like converting badly encoded files to UTF-8.)  All are really helpful.
  • Frequently Made Mistakes — In the same vein as the Packaging Tricks, but specifically focused on problematic RPM methodology.  One correction on this page: the location listed for checking SPEC files from other Fedora packages is out of date (Fedora doesn't use CVS anymore); the correct location is their git repository.
  • Creating Sub-packages — A very early-stage draft document on the Fedora Documentation website that discusses how best to create multiple sub-packages from a given SPEC file.  I had been needing good documentation on this process and this seems to be the start of it.
  • RPM Groups — Raw list of valid RPM package groups.
  • How to Make RPM Packages — Exactly what the name implies.  Probably the best starting point for Fedora Linux software packagers.
  • rpm --showrc — This command will list all the current macros defined for the rpm build environment, including your custom local setup.  It is a great place to grep for path information and to verify directory locations for installation.  It has probably been around forever, but I honestly didn't know about it until a couple days ago.
  • rpmdev-setuptree is one of several tools available in the rpmdevtools package (yum install rpmdevtools).  Running this command will set up a local build directory tree in THAT USER'S home directory (as you should NEVER build packages as root using the system-wide build directory). Additionally it will create a stock .rpmmacros config file, though you will still want to define your own %packager and %vendor macros.
  • Package Guidelines – The definitive guide from Redhat on creating Fedora/Redhat rpm files for distribution.
  • RPM Dev Tools – Web listing of some of the new automated packaging tools for RPM-based distributions.  Things like creating your default build environment and checking spec file format.
  • CPAN2RPM – A tool for building rpm files from the Comprehensive Perl Archive Network.  While tools like cpanplus work well for package installation, I prefer the flexibility and consistency of rpm packages, and this is a nice way to be able to use rpm files for CPAN modules.
  • cpanspec – Another tool for building spec files (and therefore rpm packages) from cpan repository information.  Generally I use cpan2rpm to create a basic package and then modify the spec file anyway, so this might be a better option.
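
Finally, since I mentioned it above, here is roughly what the recommended user/group creation from the first link looks like in a SPEC file.  This is a sketch of the pattern, not a drop-in scriptlet: the myservice name, home directory, and description are all made up, so adjust them to suit your package.

Requires(pre): shadow-utils

%pre
getent group myservice >/dev/null || groupadd -r myservice
getent passwd myservice >/dev/null || \
    useradd -r -g myservice -d /var/lib/myservice -s /sbin/nologin \
    -c "myservice service account" myservice
exit 0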