The Importance of Good Forex Data

For those of you who play with automated strategies, good Forex data is a must, and I am amazed that it is overlooked by so many.

Good data in your testing can be the difference between a profitable system and a massively unprofitable one, and more importantly between a system that works and one that will blow up.

Accuracy, accuracy, accuracy…

We must spend almost 50% of our time managing and cleaning our data to ensure it is as accurate as possible. So I feel sorry, and somewhat disappointed, when presented with the performance of a system whose data has not been given the same consideration.

Now I shan’t name names (although it is such a major problem that it may be worth checking with us), but there are a lot of providers out there claiming to provide good quality data; some free, some paid for.

We have tried most of them, and what annoys me most is how few people are able to deal with dates and times properly.

Server snafu…

MT4 is a good example. You see, MT4 servers work off the local server time. So if I stick my broker server in Ukraine, the timestamps are based on local time in Ukraine, including adjustments for summer time. Now you can change the time of your server or move your server, but if you do this, all previously recorded data needs to be adjusted…
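To make that concrete, here is a minimal Python sketch of converting naive server-local timestamps to UTC. The function name `server_time_to_utc` and the zone `Europe/Kiev` are my own illustrative assumptions, not anything MT4 provides; you would need to establish where the server really sat for the period in question. It uses the standard-library `zoneinfo` module (Python 3.9+), which applies the summer time rules for you.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def server_time_to_utc(naive_ts: datetime, server_zone: str) -> datetime:
    """Treat a naive timestamp as local time in server_zone and convert it to UTC.

    server_zone is an IANA zone name. Which zone applies is an assumption
    you must verify for each broker and each period of history.
    """
    local = naive_ts.replace(tzinfo=ZoneInfo(server_zone))
    return local.astimezone(timezone.utc)


# The same wall-clock time on the server maps to different UTC hours
# in winter (UTC+2) and in summer (UTC+3).
winter = server_time_to_utc(datetime(2015, 1, 15, 12, 0), "Europe/Kiev")
summer = server_time_to_utc(datetime(2015, 7, 15, 12, 0), "Europe/Kiev")
print(winter.hour, summer.hour)  # 10 9
```

Note that this simple version does nothing special for the ambiguous hour when the clocks go back; a real pipeline would have to decide how to treat those bars.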

…you can see where this is going…

…99% of brokers out there don’t do this, and often claim their data is presented in New York ‘EST’, GMT or UTC when it is actually a mix of time zones. Without proper care and attention this makes obtaining accurate data in FX extremely difficult. After all, if you can’t trust the official providers and brokers, who can you trust?
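One rough way to spot a mixed-zone history is to look at the hour at which each trading week opens, year by year. The sketch below is only a heuristic under my own assumptions: the function name `weekly_open_hours` is hypothetical, and a "week start" is taken to be any bar that follows a gap of more than 24 hours. Bear in mind that even a consistently stamped history can show the open moving by an hour with daylight saving, so treat a change as a prompt to investigate rather than proof.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta


def weekly_open_hours(timestamps):
    """Return {year: Counter({hour: count})} for the first bar of each week.

    A bar counts as a week start when it follows a gap of more than
    24 hours (an assumed threshold; holidays may need special handling).
    """
    ordered = sorted(timestamps)
    by_year = defaultdict(Counter)
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > timedelta(hours=24):
            by_year[cur.year][cur.hour] += 1
    return dict(by_year)
```

If one stretch of history opens at 22:00 and another at 00:00 under the same claimed time zone, the chances are the server was moved or re-clocked and the old bars were never adjusted.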

I am well aware of several free sites that lots of people use which suffer from this problem, including a couple of brokers who, when challenged, admitted they didn’t know what time zones their data was actually in beyond the most recent years, despite being quite comfortable stating the time zones on the data (very dangerous for people using that data).

Well, this is actually a big problem, and honestly it makes data such as ours extremely expensive. To get close to our level of accuracy you would need to fork out over $50,000 to a well-known financial news provider, and even then you would be 5 years shy of the data set we have.

OK, so how did we get it?

Well, a little more sway with some bigger brokers, a lot of cleaning of the data sets we found, and a focus on pulling missing data from different areas whilst cleaning and ensuring accurate timestamps. We then spent several years building systems to check accuracy and timestamps across multiple reference points. This has left us with a rather epic but accurate database, although the reality remains that even we can’t be more than 95% certain of its accuracy (people who claim more than that likely have no idea how the FX market works, or how impossible it is to prove your prices are accurate more than a few years back).
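To give a flavour of what a timestamp check across reference points can look like, here is a simplified Python sketch, not our actual system. It tries each whole-hour shift and keeps the one that makes a suspect series line up best with a trusted reference. The name `best_hour_offset`, the `{datetime: close}` input format and the parameters `max_shift` and `min_overlap` are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def best_hour_offset(series, reference, max_shift=12, min_overlap=10):
    """Find the whole-hour shift that best aligns series with reference.

    Both inputs are dicts mapping a bar timestamp to its close price.
    Returns (shift_hours, mean_abs_difference), where shift_hours is what
    you add to the series timestamps, or None if no shift overlaps enough.
    """
    best = None
    for shift in range(-max_shift, max_shift + 1):
        delta = timedelta(hours=shift)
        diffs = [
            abs(price - reference[ts + delta])
            for ts, price in series.items()
            if ts + delta in reference
        ]
        if len(diffs) < min_overlap:
            continue  # too few matching bars to judge this shift
        score = sum(diffs) / len(diffs)
        if best is None or score < best[1]:
            best = (shift, score)
    return best
```

Running this separately on each year of history, rather than on the whole file at once, is what exposes a data set whose time zone quietly changes part-way through.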

This is the problem with an OTC market: there are actually multiple versions of the truth. During the financial crisis, regardless of the quoted price, what do you think you would actually have been filled at?

The key thing to be careful of when building systems, though, is how good your data is. Spend time checking it against other sources (it’s always a good idea to get three different reference points) and never take it at face value. Ask the probing questions and ensure you can rely on it, otherwise you are likely to find yourself making major decisions without the correct information. If someone says their data is in UTC, never believe them: ask where their servers were based and do some cross-references.
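As a starting point for the three-reference-point check, here is a minimal Python sketch that compares several sources bar by bar and flags any price that strays too far from the cross-source median. The function name `flag_disagreements`, the input layout and the `tolerance` value are assumptions for illustration; a sensible tolerance depends on the pair and the period.

```python
from datetime import datetime
from statistics import median


def flag_disagreements(sources, tolerance):
    """Flag prices that differ from the cross-source median by more than tolerance.

    sources maps a source name to a dict of {timestamp: price}.
    Only timestamps present in every source are compared.
    Returns a list of (timestamp, source_name, price, median_price).
    """
    common = set.intersection(*(set(prices) for prices in sources.values()))
    flagged = []
    for ts in sorted(common):
        mid = median(prices[ts] for prices in sources.values())
        for name, prices in sources.items():
            if abs(prices[ts] - mid) > tolerance:
                flagged.append((ts, name, prices[ts], mid))
    return flagged
```

With three sources the median is simply the middle price, so one bad feed cannot drag the reference value away from the two that agree. Make sure all the sources are on the same time basis before comparing, or every bar will look wrong.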