Trader Tech Talk 007: Preserve the code!

In Episode 7, John brings us good programming information on using source code control to preserve, back up, and maintain your source code.  These tools can be used on any programming project to help you stay organized, resolve bugs and defects more quickly, and release more stable versions of your program.

In this episode you will learn:

  • Two excellent FREE source code control packages
  • How to back up your source code online
  • Which tool to use to find what changed recently in your code

Resources

 Listener feedback!

Radoslav Horak wrote in to say: “Hi John, some years ago I found Rob Mitchel on the Internet, a World Cup Championship futures trading winner, who explores seasonality and other types of market prediction here: http://eminiforecaster.com/blog/?s=Moon”

Chris Davison sent a tweet about his blog called Building a Trading Robot, and it can be found at http://buildingatradingrobot.wordpress.com/

Timothy Chuma wrote in to tell me about a tool called Squirro, which can be found here: http://www.censeocomputing.com/request-a-squirro-demo.html  Squirro does sentiment analysis, and it is exactly the kind of tool we talked about in episode 2 of Trader Tech Talk.

Bill Martensson emailed to tell me about FXCM’s Java and C++ API: http://www.dukascopy.com/client/javadoc/

Evan Raine sent over a tweet to say “Great podcast John! A niche that needed filling. Looking forward to more episodes, books, or anything you produce.”

Thank you all for your kind comments!  Keep listening!

Transcript

This episode is about programming best practices.  I have been writing code for a long time, and make use of some common industry standard tools that I don’t see used a whole lot in the trading world.  One of those tools is source code control.

What is source code control?  It’s a tool that allows you to easily save a backup version of your program for retrieval or comparison later.  But there are so many more aspects to source code control.

Would you like to be able to back up your code offsite with a couple of clicks?  Source code control can do that.  Would you like to be able to return to a previous version of your program quickly?  Source code control can do that.  Would you like to be able to branch your program into two different versions to test an idea, and then merge the new idea back into the main program?  Source code control can do that.  Would you like to be able to work with a team member on the same program at the same time, without overwriting each other’s work?  Yep, source code control is for that too.  How about being able to view today’s version of the program and yesterday’s on the same screen, and see what changed?  Source code control again.  And then how about an automatic backup off-site to an online repository?  Once again, source code control is the answer.

The basics of source code control are simple.  Using the tool, you create a repository, which is just a special folder for your source code.  You then begin creating your program in the normal way; or, if you already have source code, you use the tool to add it to the files being tracked.  In older source code control systems, you would “check out” your source code to indicate you were working on it, and then “check in” your code to indicate you were finished with a version of it.  Each time you check out and check back in, you create a revision of your code, even if the change was only one line.  You can consider each revision a separate backup of your software.  Newer tools only require you to “commit” your code; the checking out is no longer necessary.
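As a rough sketch, here is what that basic workflow looks like on Mercurial’s command line (the folder and file names are made up for illustration):

    hg init myrobot                  # create a new repository (a folder Mercurial tracks)
    cd myrobot
    hg add MyStrategy.mq4            # start tracking an existing source file
    hg commit -m "Initial version"   # record the first revision, with a comment

Each commit records a complete revision you can view, compare, or return to later.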

The tool then allows you to view or recover source code from many versions ago, to see the differences on screen between version A and version B, and to do branches and merges.  A branch is a second version of your code that deviates from the main version, and a merge is “blending” two versions of your code back into one.

Source code control has been around in many forms for many years.  One of the earliest versions I used was RCS, which required “check in” and “check out”.  That was in the early ’90s.  Microsoft got into the source code revision business around then too, when they acquired a product called “SourceSafe”.  It allowed programmers, and especially teams of programmers, to work on code together, release a solid tested revision of the software, and then track changes and updates throughout the life cycle of a software project.

A later tool I used was CVS, which was a new development in that the revision system was “concurrent”; that is, multiple programmers could work on the same files at the same time, and CVS would sort things out when the code was committed.  As long as the programmers weren’t working on the same lines of code, their work could be safely merged into a single version.

Most recently, there are two or three common source code control tools: Mercurial and Git, and also Bazaar (which I have not used).  These newer tools are distributed, which brings us the ability to commit your code to a repository in the cloud (offsite somewhere), as well as the ability to do merges and commits independently of any central server.  It gets a bit technical, but just know that the new tools are very good and very flexible.

Let me also point out that almost all of the tools I have used for source code control are open source.  People have varying opinions of open source projects, but the source code control projects are well supported and are considered industry standard tools. Open source means they are free (please donate if you can), and have documentation, tutorials and add-ons freely available online.

There are some subtle differences between Mercurial and Git, but they are essentially similar in what they do and how they do it.

Let’s run through an example: You’re writing a Metatrader automated strategy for your client, who wants you to turn his manual trading strategy into an automated one.  You first set up the Mercurial repository and begin writing code.  At the end of day 1, you have something written, but it doesn’t work yet.  You commit day 1’s code along with a comment describing the day’s work.  On day 2, you finish the simple version of your client’s strategy and do some testing.  You commit day 2’s code with a comment, and send it to the client, who tests it to make sure it works the way he expects.  On days 3, 4 and 5, you work on the code, and commit it at the end of each day with a comment.  Then on day 6, the customer calls you and says that something has changed: one thing that used to work no longer does, and he wants it back the way it was.  No problem, you say; you just use a Mercurial command to revert your MT4 code back to what you had on day 5, and continue work from there.
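In Mercurial, that rollback might look something like this (the revision number comes from hg log; 4 here is just a placeholder for day 5’s commit):

    hg log                           # review the history and your daily comments
    hg revert --all -r 4             # restore every file to its day-5 contents
    hg commit -m "Back out day 6 changes at client request"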

On day 7, your customer asks you to try an idea he’s been thinking of; sure, you say, no problem.  You create a branch for your program, and write code to test his idea in this branched version.  For a few days, you work only on this special branch, and by day 10, your customer is happy with the special idea.  Meanwhile, you need to get caught up on the main program, so you work on the main branch for a few days, and by day 14, it’s time to put it all together.  You merge the branched version into the main version, combining the original code with the special-branch idea.
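With Mercurial’s named branches, that back-and-forth might look like this (client-idea is a made-up branch name):

    hg branch client-idea            # start a branch for the experiment
    hg commit -m "Begin work on the client's special idea"
    # ...work and commit on this branch through day 10...
    hg update default                # switch back to the main branch
    # ...work and commit on the main line through day 14...
    hg merge client-idea             # blend the experiment into the main version
    hg commit -m "Merge client-idea branch into the main strategy"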

Now let’s say you need to give some of the work to an assistant programmer.  She works on some of the functions you haven’t had time to finish; meanwhile, you work on the main program, doing additional back testing and tweaking the code as you go.  Both you and your assistant commit your code with a comment several times a day, and each time you pull in each other’s commits, Mercurial merges the changes into a single version.
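Assuming the two of you share a common repository, a typical exchange could look like:

    hg pull                          # fetch your assistant's latest commits
    hg merge                         # combine her changes with yours
    hg commit -m "Merge assistant's work on the helper functions"
    hg push                          # publish the merged result for her to pull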

Finally, you are ready to deliver the finished product; you invite your client to view the code online, where your changes have been synchronized after each commit.  Your client is able to view the source code and download the compiled version of the program, and, satisfied with your work, pays you.

We could go on and describe how your client might later find a bug, and how you are able to go through the history of the files and read through the comments, and check the “diffs” between the files, until you find where the bug was introduced; then you would code the fix for the bug, and commit the changes, with a comment indicating the bug, the fix, and what else might be affected.
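That hunt is mostly a matter of hg log and hg diff (the revision numbers below are placeholders):

    hg log -v                        # read back through the revisions and comments
    hg diff -r 12 -r 13              # show exactly what changed between two suspect revisions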

You can easily see the advantages of using source code control.  There is very little downside to this; your source code files are not altered in any way, and the revision control happens in external files that only the tool sees.  The only thing you need to do is a bit of planning (which, if you’re organized, you should be doing anyway), and some reading up on the Mercurial or Git programs.  These tools are not too difficult to get started with, but the huge number of options and configurations can be daunting.  Learning how to branch and merge will take some time and some testing.  But learning such a skill will benefit your programming immensely in the future.

Sure, you might be able to roll your own backup and restore feature, and you might already be commenting your changes, and keeping good track of branches and so on.  But I doubt you have the feature set of true source code control in your shop if you’re not using one of these tools. You will seriously benefit from using a system like Mercurial or Git in your programming.

Does source code control still make sense for the one-programmer shop?  Yes, most definitely.  The idea of frequent backups alone is worth it, and a frequent backup that you can immediately revert to is even better.  Putting your code in an online repository where your client or a mentor can view it is also very useful.  And, if you have never used a diff tool, you will be amazed at how useful this is.

Let’s talk about that for a minute.  Diff means “difference”; a diff tool shows you the difference between the source file you are working on and, say, the one you last committed.  So, in the current version, let’s say you removed a variable you are no longer using, changed another variable from an int to a double, and added one new function.  When you run the diff tool, it highlights in red the line where you removed the unused variable, in green the new function you added, and in yellow the line where you changed the int to a double.  You have immediate visual confirmation of what changed.  So often, when searching for a bug that has just appeared, the most important thing to know is what changed between the last commit and this one.  Diff is your tool.
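On the command line, the same information comes out as a unified diff: removed lines are prefixed with a minus, added lines with a plus, and a changed line appears as a minus/plus pair (graphical tools add the red, green and yellow coloring).  A made-up fragment:

    --- a/MyStrategy.mq4
    +++ b/MyStrategy.mq4
    @@ -10,2 +10,2 @@
    -int    unusedVar;        // the deleted variable (shown in red)
    -int    lotSize;
    +double lotSize;          // the int-to-double change (a minus/plus pair)
    +double NewExitCheck();   // the added function (shown in green)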

The online repository part of Mercurial is handled by an independent site called Bitbucket.org.  Bitbucket works with both Mercurial and Git, so you have a choice of which system to use with the repository.  On your own local PC or Mac, you commit your changes, and then, either automatically or manually, mirror the changes to Bitbucket.  That gives you the ability to get to your source code from anywhere, which can come in handy, and it also gives you an off-site backup of your code.  You can browse through your latest changes and see the comments you added for each revision.  And you can invite others to view the code as well.  That means if you are working for a client, you can give the client access to the code as you work on it.  (Not everyone works that way, but if you do, this tool makes it really easy.)
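Mirroring to Bitbucket is a single command once you have created a repository there (the URL below is a made-up example):

    hg push https://bitbucket.org/yourname/myrobot

You can also set a default path in the repository’s .hg/hgrc file so that a plain hg push goes to Bitbucket automatically:

    [paths]
    default = https://bitbucket.org/yourname/myrobot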
One final tool that really makes things easy on the PC side is called TortoiseHg.  If you used CVS in the past, you’ll recognize the Tortoise part.  The TortoiseCVS product added a Windows Explorer plug-in for CVS, and TortoiseHg does the same for Mercurial.  It allows you to right-click in Windows to create a repository, add a file, commit a file, and do pretty much anything you’d otherwise have to do from the command line.  I rarely use the command-line part of Mercurial, since my needs for source code control are quite basic; I use the Windows Explorer plug-in almost all the time.
There are other plug-ins for pretty much any source code IDE that you are using.
Are you convinced?  Are you ready to get started?  In a future post, we’ll go over the whole process of setting up a repository, adding files, and committing changes, both locally and in Bitbucket.  For now, take a look at these tools and get familiar with them.  They are well worth the effort.

Is the Market Up, Down or Flat?

When writing a trading system, one of the problems I encounter is determining what kind of market we are in, and what kind of market the software will do well in. Some trading systems do well in smooth trending markets, but do poorly in volatile markets, and other systems need the volatility to do their work.

Author Van Tharp divides market types into several categories of bull, bear, or neutral, with a volatility qualifier for each one.  He uses a 100-day period of the S&P 500 to determine the market type.  Tharp explains his system and the reasons behind it in several of his books.

An article from the March 2013 issue of Futures magazine gives a different approach.  The author, Billy Williams, explains how to use a 20-period moving average and a 40-period moving average in his article “What’s your market type?”  In his method, he uses the weekly charts of the financial instrument he is trading to determine its market type, with these rules (a code sketch follows the list):

  • When the 20-period moving average is above the 40, the market is bullish.
  • When the 40-period moving average is above the 20, the market is bearish.
  • When there’s no clear winner, the market is going sideways.
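Here is a minimal Python sketch of those rules.  The 20- and 40-period averages come straight from the article; the small tolerance band for calling a market sideways is my own assumption, since “no clear winner” needs a numeric definition:

    def sma(closes, period):
        """Simple moving average over the last `period` closing prices."""
        return sum(closes[-period:]) / period

    def market_type(weekly_closes, tolerance=0.001):
        """Classify the market as 'bull', 'bear', or 'sideways' via the 20/40 MA rules."""
        fast = sma(weekly_closes, 20)
        slow = sma(weekly_closes, 40)
        if fast > slow * (1 + tolerance):
            return "bull"        # 20-period MA clearly above the 40
        if fast < slow * (1 - tolerance):
            return "bear"        # 40-period MA clearly above the 20
        return "sideways"        # no clear winner

The function expects at least 40 weekly closes; feed it the weekly chart data of whatever instrument you are trading.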

Other authors I have read use different tools such as pivot points and average true range to determine the general market type, before making trading decisions.  The point is that determining the market type is critical information.  You need to know what kind of market you are in before you do anything else.

Once you determine the general market type you are in, you can make better decisions in your code; if your trading system works really well in trending markets, then you can decide to sit out a volatile period, or perhaps switch strategies until the market returns to a trending state.

Take a look at Van Tharp’s market classification system and Billy Williams’ market type rules, and use these in your code.  Both Tharp’s and Williams’ systems are easily coded, using simple math to make their classifications.

Trader Tech Talk 006: Quant Trading, Rotational Systems, and Filters

In Episode 6, John brings us part 2 of Howard Bandy’s Quantitative Trading Systems book.  Howard Bandy has written a number of books on trading systems; John reviews some of Bandy’s techniques and trading strategies from the book.

In this podcast, you will learn

  • The best 3-bar pattern to watch for in the markets
  • How sector analysis can help your trading systems
  • How to create a rotational system
  • The two things that make a good filter for a trading system
  • How using ETFs can get you better beta

And then stay tuned for an update on the Trade Miner trade we made in Episode 5.  There’s good news on that!

Please give me a review on iTunes if you’re using iTunes!  Here’s the link.

RSS feed: http://blog.tradertechtalk.com/feed/podcast/


More JForex to like

I’ve found a few more features of JForex that I really like:

  • When you run a strategy on the JForex platform, you can choose to run it locally (on your own machine) or remotely.  Running it remotely appears to send it off to a server somewhere else to run.  Does this eliminate the need to run it on a virtual private server?  I’m not sure.  I’m going to test this out and report back.
  • The API.  There is a whole JForex API that lets you develop completely offline using Eclipse or your favorite IDE; you just link in the Java class files from the API.  Do you know what else this means?  I can use a debugger to step through my code!  So much better than the primitive print statements we are using now!
  • There are some interesting methods we can use to make a strategy display in visual mode for back testing.  Sometimes it’s helpful to see exactly what the strategy is doing, and by calling chart routines in the code you can optionally display indicators and other text on the charts while the system trades.

More good things to come…


What is curve fitting and why is it bad?

I’ve received a few questions from readers on the topic of curve fitting, so I thought I’d talk about that a bit here.

What’s curve fitting?  Well, it’s bad.  When we back-test a trading system, we are trying to see if the trading rules we have written in our program successfully make a profit on historical data.  One of the things we do in back testing is optimize parameters; that means we find the best possible value for a parameter that controls something in our program.  But sometimes we find the perfect value for the historical data, and that value doesn’t work in a real trading situation.  Why not?  It is possible we have fit our parameter too closely to the exact historical data, making it not flexible enough to work on new, unseen data.  That’s curve fitting.  And it’s bad because it creates a trading system that seems good, but really isn’t.

One way to look at it is to think of the signal and the noise.  Our trading system is looking for the signal (our rules for entries and exits based on indicators or patterns), and trying to ignore the noise (spikes, gaps, volatility, general market randomness).  If we tune our parameters to listen to everything, both the signal and the noise, then our system won’t work when we trade it in a live account.  In the new market data, the signal is still there, but the noise is different.

How do we prevent curve fitting?  There are a number of things we can do when we are back testing.  One of the best techniques is to split your testing data into two sections: an in-sample set and an out-of-sample set.  You will spend time tuning parameters, optimizing your software, and getting everything perfect on the in-sample data set.  You will run your program dozens of times on this data to get it profitable and working very well.

Then, once you have your perfect parameters, you run your program on the out-of-sample data set and see how it does.  The key here is that you may run the program only one time on the out-of-sample set, and you may not use the out-of-sample results to tweak your program’s parameters.  If you do, that data set becomes part of the in-sample set!
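A sketch of that discipline in Python (optimize() and backtest() stand in for your own routines, and the 80/20 split is an assumption, not a rule):

    def split_data(bars, in_sample_fraction=0.8):
        """Split history: tune on the first part, verify once on the rest."""
        cut = int(len(bars) * in_sample_fraction)
        return bars[:cut], bars[cut:]

    in_sample, out_of_sample = split_data(bars)
    best_params = optimize(in_sample)                     # run as many times as you like
    final_report = backtest(out_of_sample, best_params)   # run ONCE; never tune on this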

If the results on your out-of-sample set look good, then congratulations!  You may move on to the next phase of testing.  But if the results of the out-of-sample test do not look good, you have probably been guilty of curve fitting during your back testing, and you need to go back and find more general parameters.  We’ll go over that in a future blog post.

Let’s assume you were successful in your out-of-sample testing.  What’s the next step?  That would be walk-forward optimization, and it is neat.  The concept is this: if you take January to February of 2011 as your in-sample set, and then use March of 2011 as your out-of-sample set, you’ll find some good parameters to use.  Save them off somewhere.  Now run the back-test again, but use February to March as the in-sample set and April of 2011 as the out-of-sample set.  Find your best parameters and save them off.  Keep moving the “window” of sample sets forward, and keep saving your results.  At the end of all of the back-test runs, compare all the results you have saved.  If they are similar, you have a great set of parameters for your system.  If not, then you have some work to do (more on that in a future post).
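A sketch of that moving window in Python (again, optimize() and backtest() are placeholders for your own routines, and monthly_bars is assumed to be the price history grouped by month):

    results = []
    for start in range(len(monthly_bars) - 2):
        in_sample = monthly_bars[start] + monthly_bars[start + 1]   # e.g. Jan + Feb
        out_sample = monthly_bars[start + 2]                        # e.g. March
        params = optimize(in_sample)
        results.append((params, backtest(out_sample, params)))
    # If the saved parameters and results are similar across windows, the
    # system is robust; if they jump around, it is probably curve fit.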

If this walk-forward test is automated, then this process is somewhat enjoyable.  If you have to manually run all of these tests, then it’s a bit tedious, and you have to keep really accurate notes on each run.  Unfortunately for us, the walk-forward analysis in Metatrader is a manual process, and it’s tedious.  (There is an MT4 add-on that makes it easier, but it’s not perfect.)

The AmiBroker trading platform that I have mentioned in the past does do walk forward analysis, and that is one of its features that makes it attractive to trading system developers.

But, as long as you can keep a spreadsheet of dates, parameters and profitability, you can run the walk-forward analysis manually, and your system will be much better in the long run.

So, curve fitting is bad; we want to make sure our optimized parameters are great, but general enough for new unseen data.

I mentioned a book a few weeks ago called “Trading Systems” by Emilio Tomasini; he spends a lot of time discussing curve fitting, how to avoid it, and how to properly test a trading system.  If you are working on back testing and you haven’t picked up that book yet, you’d do well to get a copy.