Trader Tech Talk 007: Preserve the code!

In Episode 7, John brings us good programming information on using source code control to preserve, back up, and maintain your source code.  These programming resources can be used for any programming project to help you stay more organized, resolve bugs and defects more quickly, and release more stable versions of your program.

In this episode you will learn:

  • Two excellent FREE source code control packages
  • How to back up your source code online
  • Which tool to use to find what changed recently in your code

Resources

Listener feedback!

Radoslav Horak wrote in to say
“Hi John, some years ago I found on the Internet Rob Mitchel, a World Cup championship futures trading winner, that explores seasonalities and other types of market predictions here: http://eminiforecaster.com/blog/?s=Moon”

Chris Davison sent a tweet about his blog called Building a Trading Robot, and it can be found at http://buildingatradingrobot.wordpress.com/

Timothy Chuma wrote in to tell me about a tool called Squirro which can be found here: http://www.censeocomputing.com/request-a-squirro-demo.html  Squirro does sentiment analysis, and is a tool that does exactly what we talked about in episode 2 of Trader Tech Talk.

Bill Martensson emailed to tell me about FXCM’s Java and C++ API: http://www.dukascopy.com/client/javadoc/

Evan Raine sent over a tweet to say “Great podcast John! A niche that needed filling. Looking forward to more episodes, books, or anything you produce.”

Thank you all for your kind comments!  Keep listening!

Transcript

This episode is about programming best practices.  I have been writing code for a long time, and make use of some common industry standard tools that I don’t see used a whole lot in the trading world.  One of those tools is source code control.

What is source code control?  It’s a tool that allows you to easily save a backup version of your program for retrieval or comparison later.  But there are so many more aspects to source code control.

Would you like to be able to back up your code offsite with a couple of clicks?  Source code control can do that.  Would you like to be able to return to a previous version of your program quickly?  Source code control can do that.  Would you like to be able to branch your program into two different versions to test an idea, and then merge the new idea back into the main program?  Source code control can do that.  Would you like to be able to work with a team member on the same program at the same time, without overwriting each other’s work?  Yep, source code control is for that too.  How about being able to view today’s version of the program and yesterday’s on the same screen, and see what changed?  Source code control again.  And then how about an automatic backup off-site to an online repository?  Once again, source code control is the answer.

The basics of source code control are simple.  Using the tool, you create a repository, which is just a special folder for your source code.  You then begin creating your program in the normal way; or, if you already have source code, you use the tool to add it to the files it is keeping track of.  With older source code control tools, you would “check out” your source code to indicate you were working on it, and then “check in” your code to indicate you were finished with a version of it.  Each time you check out and check back in, you create a revision of your code, even if the change was only in one line.  You can consider each revision a separate backup of your software.  Newer tools only require you to “commit” your code; checking out first is no longer necessary.
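
To make the commit-and-revision idea concrete, here is a minimal Python sketch of a toy repository.  It is only an illustration: the ToyRepository class, the file name strategy.mq4 and the comments are hypothetical, and real tools such as Mercurial and Git store history very differently (and far more efficiently) than keeping a full copy per revision.

```python
from dataclasses import dataclass, field


@dataclass
class Revision:
    number: int
    comment: str
    files: dict  # file name -> full text of the file at this revision


@dataclass
class ToyRepository:
    """Hypothetical in-memory stand-in for a repository (not how Mercurial or Git store data)."""
    revisions: list = field(default_factory=list)

    def commit(self, files, comment):
        # Keep a full copy of the files, so every revision acts as a separate backup.
        rev = Revision(len(self.revisions) + 1, comment, dict(files))
        self.revisions.append(rev)
        return rev.number

    def checkout(self, number):
        # Return a copy of the files exactly as they were at that revision.
        return dict(self.revisions[number - 1].files)

    def log(self):
        # List every revision number with the comment entered at commit time.
        return [(r.number, r.comment) for r in self.revisions]


repo = ToyRepository()
working = {"strategy.mq4": "int lots = 1;\n"}
repo.commit(working, "Day 1: skeleton, does not run yet")

working["strategy.mq4"] = "double lots = 0.1;\n"  # a one-line change
repo.commit(working, "Day 2: fractional lot sizes")

restored = repo.checkout(1)  # recover the Day 1 version
```

Even the one-line change produces its own revision, and the Day 1 version stays recoverable after Day 2 has been committed.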

The tool then allows you to view or recover source code from many versions ago; it allows you to see the difference on the screen between version A and version B; and it allows you to do branches and merges.  A branch is a second version of your code that deviates from the main version; and a merge is “blending” two versions of your code back into one.

Source code control has been around in many forms for many years.  One of the earliest versions I used was RCS which required “check in” and “check out”.  That was in the early 90’s.  Microsoft got into the source code revision business around then too when they acquired an add-on called “Source Safe”.  It allowed programmers and especially teams of programmers to work on code together, release a solid tested revision of the software, and then track changes and updates throughout the life cycle of a software project.

A later tool I used was CVS, which was a new development in that the revision system was “concurrent”; that is, multiple programmers could be working on the same files at the same time, and CVS would figure things out when the code was committed.  As long as programmers weren’t working on the same line of code, their work could be safely merged into a single version.
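
The “same line” rule can be sketched in a few lines of Python.  This is a simplified illustration, not CVS’s actual merge algorithm: the merge_lines function is hypothetical, and it assumes both programmers only edited lines in place, so all three versions have the same number of lines.

```python
def merge_lines(base, mine, theirs):
    """Line-by-line three-way merge; assumes all three versions have the same line count."""
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles edits that keep the line count")
    merged, conflicts = [], []
    for i, (b, m, t) in enumerate(zip(base, mine, theirs)):
        if m == t:        # both versions agree (or neither touched the line)
            merged.append(m)
        elif m == b:      # only the other programmer changed this line
            merged.append(t)
        elif t == b:      # only I changed this line
            merged.append(m)
        else:             # both changed the same line differently: a conflict
            merged.append(b)
            conflicts.append(i + 1)  # report 1-based line numbers
    return merged, conflicts


base   = ["int period = 14;", "double lots = 0.1;", "int slippage = 3;"]
mine   = ["int period = 21;", "double lots = 0.1;", "int slippage = 3;"]
theirs = ["int period = 14;", "double lots = 0.1;", "int slippage = 5;"]

merged, conflicts = merge_lines(base, mine, theirs)
```

Here the two programmers touched different lines, so both changes land in the merged version and the conflict list stays empty.  Had both edited line 1 differently, the sketch would keep the original line and report line 1 as a conflict for a person to resolve.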

Most recently, there are two or three common source code control tools: Mercurial and Git, and also Bazaar (which I have not used).  The innovations that these new tools bring to us are the ability to commit your code to a repository in the cloud (offsite somewhere), as well as the ability to do merges and commits independently; it gets a bit technical, but just know that the new tools are very good and very flexible.

Let me also point out that almost all of the tools I have used for source code control are open source.  People have varying opinions of open source projects, but the source code control projects are well supported and are considered industry standard tools. Open source means they are free (please donate if you can), and have documentation, tutorials and add-ons freely available online.

There are some subtle differences between Mercurial and Git, but essentially they are similar in what they do and how they do it.

Let’s run through an example: You’re writing a Metatrader automated strategy for your client, who is trying to get you to code his manual trading strategy into an automated one.  You first set up the Mercurial repository, and begin writing code.  At the end of day 1, you have something written, but it doesn’t work yet.  You commit day 1’s code along with a comment describing the day’s work.  On day 2, you finish up the simple version of your client’s strategy, and do some testing.  You commit code for day 2 and a comment, and send it to the client.  Your client tests your code, to make sure it works the way he expects.  On days 3, 4 and 5, you work on the code, and commit it at the end of each day with a comment.  Then on day 6, the customer calls you and says that something has changed.  One thing that used to be working no longer is, and he wants it back the way it was.  No problem, you say; you just use a Mercurial command to revert your MT4 code back to what you had on day 5, and continue work from there.

On day 7, your customer asks you to try an idea he’s been thinking of; sure, you say, no problem.  You create a branch for your program, and write some code to test his idea here in this branched version.  For a few days, you work only on this special branch, and by day 10, your customer is happy with this special idea.  Meanwhile you need to get caught up on the main program, so you work on the main branch for a few days, and by day 14, it’s time to put it all together.  You merge the branched version into the main version, combining the original code with the special branch idea.

Now let’s say you need to give some of the work to an assistant programmer; she works on some of the functions you haven’t had time to finish completely; meanwhile, you are working on the main program, doing additional back testing and tweaking of the code as you go.  Both you and your assistant are committing your code with a comment several times a day, and each time, Mercurial is merging your changes into a single version.

Finally, you are ready to deliver the finished product; you invite your client to view the code online, where your committed changes have been synchronizing after each commit.  Your client is able to view the source code and download the compiled version of the program, and, satisfied with the result, pays you for your work.

We could go on and describe how your client might later find a bug, and how you are able to go through the history of the files and read through the comments, and check the “diffs” between the files, until you find where the bug was introduced; then you would code the fix for the bug, and commit the changes, with a comment indicating the bug, the fix, and what else might be affected.
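
As a rough illustration of that search through history, here is a small Python sketch.  Everything in it is hypothetical: the history list, the one-line “source code” strings and the stop_is_set check are made up for the example, and a real check would usually mean compiling and testing each revision.

```python
def first_bad_revision(history, works):
    """Return (number, comment) of the earliest revision where works(source) is False."""
    for number, comment, source in history:  # history is ordered oldest to newest
        if not works(source):
            return number, comment
    return None  # every revision passed the check


# Hypothetical history: (revision number, commit comment, source text)
history = [
    (1, "Day 1: entry rules", "stop = 20"),
    (2, "Day 2: add trailing stop", "stop = 20; trail = 10"),
    (3, "Day 3: tune stop", "stop = 0; trail = 10"),
    (4, "Day 4: add logging", "stop = 0; trail = 10; log = 1"),
]


def stop_is_set(source):
    # Stand-in for a real test of the behavior the client reported as broken.
    return "stop = 20" in source


culprit = first_bad_revision(history, stop_is_set)
```

The commit comment attached to the culprit revision tells you what you were working on when the bug crept in, which is why writing a useful comment on every commit pays off.  Both Mercurial and Git also include a bisect command that automates a faster version of this kind of search.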

You can easily see the advantages of using source code control.  There is very little downside to this; source code files are not altered in any way; the revision control happens in external files that only the tool sees.  The only thing that you need to do is a bit of planning (which, if you’re organized, you should be doing anyway).  And you need to do some reading up on the Mercurial or Git programs.  These tools are not too difficult to get started with, but the huge number of options and configurations can be daunting.  Learning how to branch and merge will take some time and some testing.  But learning such a skill will benefit your programming immensely in the future.

Sure, you might be able to roll your own backup and restore feature, and you might already be commenting your changes, and keeping good track of branches and so on.  But I doubt you have the feature set of true source code control in your shop if you’re not using one of these tools. You will seriously benefit from using a system like Mercurial or Git in your programming.

Does source code control still make sense for the one-programmer shop?  Yes, most definitely.  The idea of frequent backups alone is worth it; and a frequent backup that you can immediately revert back to is even better.  Putting your code in an online repository where your client or a mentor can view it is also very useful.  And, if you have never used a diff tool, you will be amazed at how useful this is.

Let’s talk about that for a minute.  Diff means “difference”; a diff tool shows you the difference between the source file you are working on and, say, the one you last committed.  So, in this current version, let’s say you removed a variable you are no longer using, you changed another variable from an int to a double, and then you added one new function.  When you run the diff tool, it will show you highlighted in red, the line where you removed the unused variable; in green, it will highlight the new function you added, and in yellow, it will show where you changed the int to a double.  You have immediate visual confirmation of what changed. So often when searching for a bug that has just appeared, the most important thing to know is what changed between last commit and this one.  Diff is your tool.
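
You can try the idea without installing anything, using the difflib module from Python’s standard library.  This is only a sketch: the two versions below are made-up MQL4-style lines, and the output is a plain-text “unified” diff rather than the colored display of a graphical diff tool.  Note that a text diff like this reports a changed line as one removed line plus one added line.

```python
import difflib

# Hypothetical file contents: the last committed version and the working copy.
committed = [
    "int unused = 0;",
    "int lots = 1;",
    "void OnTick() {}",
]
working = [
    "double lots = 1;",
    "void OnTick() {}",
    "void CloseAll() {}",
]

diff = list(difflib.unified_diff(committed, working,
                                 fromfile="last commit", tofile="working copy",
                                 lineterm=""))

# Lines starting with "-" were removed and lines starting with "+" were added;
# the "---" and "+++" lines are just the file headers.
removed = [line[1:] for line in diff if line.startswith("-") and not line.startswith("---")]
added = [line[1:] for line in diff if line.startswith("+") and not line.startswith("+++")]

print("\n".join(diff))
```

The removed list holds the dropped variable and the old int declaration, while the added list holds the new double declaration and the new function.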

For the online repository side of Mercurial, there is an independent site called Bitbucket.org.  Bitbucket works with both Mercurial and Git, so you have a choice of which system to use with the repository.  On your own local PC or Mac, you can commit your changes, and then either automatically or manually mirror the changes to Bitbucket.  That gives you the ability to get to your source code from anywhere, which can come in handy, and it also gives you an off-site backup of your code.  You can browse through your latest changes, and see the comments you added for each revision.  And you can invite others to view the code as well.  That means if you are working for a client, you can give the client access to the code as you are working on it.  (Not everyone works that way, but if you do, this tool makes it really easy.)

One final tool that really makes things easy on the PC side is called TortoiseHg.  If you used CVS in the past, you’ll recognize the Tortoise part.  The TortoiseCVS product added a Windows Explorer plug-in for CVS, and the TortoiseHg add-on does the same for Mercurial.  It allows you to right-click in Windows to create a repository, add a file, commit a file, and do pretty much anything you’d otherwise have to do with the command line.  I rarely use the command line part of Mercurial, since my needs for source code control are quite basic.  I use the Windows Explorer plug-in almost all the time.

There are other plug-ins for pretty much any source code IDE that you are using.

Are you convinced?  Are you ready to get started?  In a future post, we’ll go over the whole process of setting up a repository, adding files, and committing changes, both locally and in Bitbucket.  For now, take a look at these tools, and get familiar with them.  They are well worth the effort.
 

 
