Supervisory authorities are constantly creating new challenges for archive managers. Corporate governance standards are converging across the globe, and the obligation to retain data is a key element of these requirements. The Sarbanes-Oxley Act (SOA), which was passed in the United States in July 2002 after the recent accounting scandals, further boosted this development. The SOA is intended to protect investors from the fraudulent actions of managers of listed companies. One implication of this Act is that data relating to securities trading is monitored much more carefully, must be retained for defined periods, and must be quickly accessible when needed.
Experts agree that Sarbanes-Oxley will result in new international standards that will have an effect on corporate governance efforts at companies all over the world. The minimum requirements of this law also apply to the foreign branch offices of companies listed on US stock exchanges. Moreover, it is generally assumed that, in the medium term, the major auditing firms will apply their now necessarily more rigid standards to companies that are not directly subject to the provisions of the SOA. In the meantime, similar efforts have been launched at a European level.
Regardless of this fact, Germany maintains strict national requirements for data retention, such as those laid down by the Bundesanstalt für Finanzdienstleistungsaufsicht (BaFin - German Financial Supervisory Authority) and other supervisory bodies.
New requirements were formulated after September 11, 2001 in particular; for example, some companies were required to produce information about trades and transactions from the previous three to four years. Tax authorities also insist that companies be able not only to produce their accounts over long periods of time, but also to substantiate them, which requires storing the source data that makes up the balance sheet. Finally, court rulings have reinforced the rights of investors to obtain information about exchange-traded derivatives transactions executed further back in the past. All of these regulations result in different retention periods for different types of data.
Data necessarily accumulates when past activities are recorded. As retention periods lengthen and data archiving requirements intensify, the volumes involved eventually become seemingly limitless. As long as the stored files and documents were "simply lying around" somewhere, this was not a major problem. Increasingly, however, historical data must be made available rapidly for online access. This refers not only to invoices, contracts, human resources documents, etc., but also to files relating to daily business - e-mails, transaction data from trading systems, communication between participants in securities trading, and much more. These processes create massive amounts of data from which specific information must be selected and made available quickly.
"That is why Citigroup needed a completely new approach to archiving", says Szafran Athey, First Vice President and Head of CitiTech Frankfurt, a unit of Citigroup that provides IT services. "The approaches that were commonly used and adequate in the past, such as paper based archiving, database excerpts, and sequential files on tapes, were no longer adequate in view of the growing volume of data, the lengthening retention periods and the demand for quick retrieval, if nothing else than for reasons of cost."
"Maintaining tapes and storing them securely is more labour intensive, as is retrieving the data: but more importantly if information was required years later it could be difficult to restore data as software and hardware versions will have moved on and backward compatibility can not be guaranteed. For this reason, we had to keep all legacy software versions on hand. Rapid access to historical data in particular was not possible in the past."
"A user can possibly search one or two years' worth of data on tape, given time, but a decade's worth of data is nearly impossible to sift through for specific information." Due to the regulatory requirements, Citigroup needs on request to be able to produce historical data - for example all securities transactions by a particular customer over the past four years, which could amount to thousands of trades - quickly if needed. Trading systems themselves often only store a limited amount of transaction data on line, maybe only a maximum of six weeks? worth of transaction data. But data already archived must also be editable, for instance if Citigroup discovers an error in the transaction (wrong price, counterpart for example) after the trade was archived, however rare, the trade still needs to be somehow amended.
Citigroup was therefore faced with the challenge of quickly locating and editing a growing volume of data, i.e. handling it like current data - but for performance reasons it could not be stored in the operational database. Seamlessly integrating the data warehouse with the transactional database became a necessary step.
Citigroup therefore decided to replace the conventional static archive with a dynamic one. The management chose Sybase IQ as the tool to accomplish this. The most important reason, according to Szafran Athey: "The technology used here enables short response times even when querying very large data volumes, plus the data is stored very efficiently due to high compression rates." A classic relational database is optimised for efficient transactions and stores its data row by row. To speed up queries, selected fields are given a separate key (index) that can then be used to locate the matching rows. Maintaining these indexes creates overhead. Because complete data rows are read each time, response times increase as the data volume grows.
Sybase IQ, on the other hand, is not organised horizontally, but vertically. In order to find a specific field, no data rows are searched, just the appropriate columns. This makes accessing data much faster. Each field forms a type of index that eliminates the overhead produced by separate indexing. At the same time, the database can be compressed substantially more efficiently. Szafran Athey gives some figures to make this difference clear. "Citigroup in Germany now stores trading data for four years; this amounts to 13.2 million deals. Each deal comprises 388 bytes of real data. In the classic SQL database, this data grows to 405 bytes due to the index overhead. In contrast, the Sybase IQ database stores this information in 218 bytes, which represents a compression rate of 43 percent. Our goal is to make ten years' worth of trading data available on line via IQ."
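The figures quoted above can be checked with a few lines of arithmetic. The following Python sketch uses only the numbers given in the quotation; the function names are illustrative. It suggests that the quoted 43 percent corresponds to the saving relative to the 388 bytes of real data, while the saving relative to the 405-byte row in the classic database comes out somewhat higher.

```python
# Figures as quoted in the article.
DEALS = 13_200_000       # four years of trading data
RAW_BYTES = 388          # real data per deal
ROW_STORE_BYTES = 405    # per deal in the classic SQL database, including index overhead
IQ_BYTES = 218           # per deal in Sybase IQ


def total_bytes(bytes_per_deal: int, deals: int = DEALS) -> int:
    """Total storage for all deals at the given per-deal size."""
    return bytes_per_deal * deals


def saving(baseline: int, actual: int) -> float:
    """Fraction of space saved relative to the baseline size."""
    return 1 - actual / baseline


for label, size in [("real data", RAW_BYTES),
                    ("classic SQL database", ROW_STORE_BYTES),
                    ("Sybase IQ", IQ_BYTES)]:
    print(f"{label}: {total_bytes(size) / 1e9:.2f} GB")

print(f"saving vs. real data: {saving(RAW_BYTES, IQ_BYTES):.1%}")
print(f"saving vs. classic SQL database: {saving(ROW_STORE_BYTES, IQ_BYTES):.1%}")
```

At these sizes the four-year archive shrinks from roughly 5.3 GB in the classic database to under 3 GB.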
"The decisive factor is that Sybase IQ has the same application interface as a relational database, even though it stores the data completely differently", says Szafran Athey. "This means that the data can be accessed using regular SQL. Each application can therefore access current, as well as historical data seamlessly. Users can query archived data in real time, upload it back to the transactional database (in this case ASE), if needed (for example, if an error is discovered), edit the data, and archive it again. The fact that the transaction is amended in the transactional database ensures that there is a full audit trail of the amendment. Historical data is not obsolete any longer."
Additional savings result from the fact that end users in the departments can now research historical data themselves, whereas in the past IT specialists were always required to do so. The archiving process is now also fully automated: data is moved from the ASE database to the IQ data warehouse on a daily basis if it meets complex archive criteria based on the age of the transaction, the last time the record was amended, the settlement date, and so on.
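The article does not spell out the actual archive criteria, so the following Python sketch only illustrates the kind of rule described: a trade qualifies for the daily move once it is old enough, has not been amended recently, and has settled. The Trade fields and both thresholds are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Trade:
    trade_id: str
    trade_date: date
    last_amended: date
    settlement_date: date


# Hypothetical thresholds; the real criteria are described only as "complex".
MIN_AGE = timedelta(days=42)      # keep roughly six weeks in the transactional database
QUIET_PERIOD = timedelta(days=7)  # no amendments during the last week


def is_archivable(trade: Trade, today: date) -> bool:
    """True if the trade meets all three illustrative archive criteria."""
    old_enough = today - trade.trade_date >= MIN_AGE
    not_recently_amended = today - trade.last_amended >= QUIET_PERIOD
    settled = trade.settlement_date < today
    return old_enough and not_recently_amended and settled


def select_for_archive(trades, today: date):
    """Return the trades that the daily job would move to the archive."""
    return [t for t in trades if is_archivable(t, today)]


# Invented sample data for a daily run.
sample = [
    Trade("OLD", date(2004, 1, 10), date(2004, 1, 12), date(2004, 1, 13)),
    Trade("NEW", date(2004, 6, 20), date(2004, 6, 20), date(2004, 6, 23)),
]
print([t.trade_id for t in select_for_archive(sample, date(2004, 6, 30))])
```

Keeping the rule in a single predicate makes it easy to add further conditions without touching the job that moves the data.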
The online historical archive was implemented principally in response to external regulatory requirements. "Now that it is in place, the archive also opens up new prospects within the company", Szafran Athey reports. "One example is standard online reports. If they were forgotten or not prepared for other reasons in the past, users could no longer generate these reports at a later date once the data was archived. Today this is no longer a problem. A large number of new analyses have become possible that no one would have thought to produce in the past due to the time and cost involved. In principle, there are almost no limits any longer on the systematic analysis of past activities - regardless of whether this involves statistics, trend analyses for marketing and CRM, or process optimisation - now that historical data can be analysed in real time."
Even dreams such as a "tick database", in which all changes in the price of a security during a particular trading day are recorded, have come within reach. Szafran Athey: "Due to the enormous volume of data that is generated in such a case, we currently only record daily closing prices. However, this type of historical database would theoretically allow us to record all the price updates of a security not just for a day but for years. This could add value to the business and our customers." External requirements are also expected to increase. Supervisory authorities know that they can now ask companies to implement procedures and processes that used to be impossible but are now feasible thanks to technical advances. Online historical archives will become a must.



