Which test data is used by my Test Harness?
Author: Calin Groza

February 6, 2010

I am working on a medium-size Java application with a test harness of 180 tests grouped in 6 packages. Most of the automated tests read one or more input files and create multiple output files, which are compared with “control” files. This approach provides an easy way to add more tests without coding. Over a period of three years this led to the creation of around 350 input files and 600 control files taking 167MB. But as tests change, not all of the test data is still used, which raises the question: How can I find the test data files that are still used in my regression testing suite? The search for an answer led to exploring Java 7, Spring AOP and AspectJ.
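One way to frame the question is as a set difference: all files under the test data directory, minus the files the tests were seen to open. The sketch below shows only that bookkeeping step, and it is not necessarily the approach the full article takes. The class and method names are hypothetical, and how the set of accessed files gets recorded (for example by an aspect around file opens) is left open. It also assumes Java 8 or later for `Files.walk`.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: the name and API are assumptions made for illustration.
class UnusedTestData {

    /**
     * Returns every regular file under dataRoot that does not appear in the
     * set of files the tests were observed to open.
     */
    static Set<Path> findUnused(Path dataRoot, Set<Path> accessed) throws IOException {
        // Normalize both sides the same way so equal files compare equal.
        Set<Path> normalized = accessed.stream()
                .map(p -> p.toAbsolutePath().normalize())
                .collect(Collectors.toSet());
        try (Stream<Path> files = Files.walk(dataRoot)) {
            return files.filter(Files::isRegularFile)
                    .map(p -> p.toAbsolutePath().normalize())
                    .filter(p -> !normalized.contains(p))
                    .collect(Collectors.toCollection(TreeSet::new));
        }
    }
}
```

The result is sorted, so it can be printed directly as a review list before anything is deleted.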

(Continue reading …)

Managing Reference Data Using the Project Paradigm
Author: Calin Groza

January 2, 2010

Having worked on Reference Data Management for a long time, I found the need for a categorization of data from the perspective of change. For example, Telco companies use Business Support Systems (BSS) and Operational Support Systems (OSS) applications to manage terabytes of data stored in hundreds of tables, XML files and properties files. This data is changing all the time: some often, some rarely. Some changes have a large impact, others a limited one; the impact could be on the number of customers affected, products sold, or financials. There are many aspects to consider about data and its change, and each aspect is handled in a specific way.

Depending on how often data changes, we have:

  • Very rarely changing: Customer Type, Order Statuses – these are at the core of the application and, while they are present in the data storage layer, they cannot change without a large impact on the application functionality
  • Rarely changing: Service Type – entities change when the organization adds a new line of business or makes an acquisition
  • Medium changing: Product Type information – new products added, discounts, campaigns
  • Often changing: Customers and Customer-Product information – new customers added to the system; customers change the products/services provided
  • Very often changing: Call detail records (CDRs) – every time the user makes a phone call a record is added. New data is created every day, every second.

Depending on the financial impact of the change:

  • Large impact: changes to the price plans, discounts, campaigns – changes affecting many customers and the organization's overall financial situation
  • Small impact: customer orders or cancels a service – changes affect only one customer
  • No financial impact: customer changes the number of rings before voice mail is activated – no financial impact on the customer or organization.

Depending on the impact on service quality:

  • High risk: changing the service delivery platform for a service; changing the voice mail provider – impacts all customers
  • Low risk: modifying a service for a customer
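The three classifications above can be captured in a small data structure, so that each kind of data carries its change profile with it. This is only a sketch: the class, the enum names, and in particular the `needsControlledRollout` rule are assumptions made for illustration, not something the article prescribes.

```java
// Hypothetical model of the categories listed above; all names are assumptions.
class DataChangeProfile {

    enum ChangeFrequency { VERY_RARE, RARE, MEDIUM, OFTEN, VERY_OFTEN }
    enum FinancialImpact { NONE, SMALL, LARGE }
    enum ServiceRisk { LOW, HIGH }

    final String dataKind;
    final ChangeFrequency frequency;
    final FinancialImpact financialImpact;
    final ServiceRisk serviceRisk;

    DataChangeProfile(String dataKind, ChangeFrequency frequency,
                      FinancialImpact financialImpact, ServiceRisk serviceRisk) {
        this.dataKind = dataKind;
        this.frequency = frequency;
        this.financialImpact = financialImpact;
        this.serviceRisk = serviceRisk;
    }

    /**
     * Illustrative rule only: treat a change as needing a controlled rollout
     * when it has a large financial impact or a high service-quality risk.
     */
    boolean needsControlledRollout() {
        return financialImpact == FinancialImpact.LARGE || serviceRisk == ServiceRisk.HIGH;
    }
}
```

With such a profile, a price plan change (large financial impact) would be flagged, while a customer changing the voice mail ring count would not.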

Over time, the trend has been to make data easier to change, giving more flexibility to organizations and customers. For example, customers once had to call the call center to change their customer information; now they can change their profile on-line. Likewise, a vendor once had to change the application to support a new Service Type; now the organization can make these changes using a configuration tool.

To reduce the risks associated with data changes, organizations use specific methods to deal with change.

(Continue reading …)

Dynamic Data in Java
Author: Calin Groza

December 15, 2009

Java is a statically, strongly typed language. Every variable has a declared type which cannot change during execution. Narrowing conversions from one type to another have to be made explicit with a cast. This helps make application development safer: errors can be caught earlier, during development and compilation, rather than at run-time. But it is also a constraint that makes some applications harder to develop, especially tools and integration components. For example, I have been working for a while now on tools that compute the difference between business entities. This is not a hard task for specific Java classes such as com.intspc.order.PurchaseOrder or com.intspc.crm.Customer. But it is harder to write a generic diff tool for an (almost) arbitrary class. This requires a dynamic data model … within a strongly typed language.
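To give a feel for the generic diff problem, here is a minimal sketch that uses plain Java reflection to compare two objects of the same class field by field. It is not the dynamic data model discussed in this article, only the simplest standard-library baseline. The `Customer` class is a hypothetical stand-in for a business entity, and the sketch looks only at fields declared directly on the class, ignoring inherited fields and nested structures.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch: names are assumptions made for illustration.
class ReflectiveDiff {

    /** Stand-in business entity used only for the demonstration. */
    static class Customer {
        String name;
        String city;
        int activeServices;

        Customer(String name, String city, int activeServices) {
            this.name = name;
            this.city = city;
            this.activeServices = activeServices;
        }
    }

    /**
     * Compares the declared instance fields of two objects of the same class.
     * Returns, for each differing field, a two-element array {left, right}.
     */
    static Map<String, Object[]> diff(Object left, Object right) {
        if (left.getClass() != right.getClass()) {
            throw new IllegalArgumentException("Both objects must be of the same class");
        }
        Map<String, Object[]> changes = new LinkedHashMap<>();
        for (Field field : left.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(field.getModifiers()) || field.isSynthetic()) {
                continue; // skip constants and compiler-generated fields
            }
            try {
                field.setAccessible(true);
                Object a = field.get(left);
                Object b = field.get(right);
                if (!Objects.equals(a, b)) {
                    changes.put(field.getName(), new Object[] {a, b});
                }
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Cannot read field " + field.getName(), e);
            }
        }
        return changes;
    }
}
```

Calling `ReflectiveDiff.diff(before, after)` on two `Customer` instances returns a map keyed by field name. Values are compared with `equals`, so collections and nested entities would need recursive handling, which is where a real dynamic data model earns its keep.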

Major tool vendors, including IBM, BEA (now part of Oracle), Oracle, SAP and others, have been working on a specification for handling dynamic data in Java (there are similar APIs for C, C++ and COBOL). It is called “Service Data Objects for Java Specification.” The latest version is 2.1.0, available at: http://osoa.org/download/attachments/36/Java-SDO-Spec-v2.1.0-FINAL.pdf. It was released in November 2006, and several vendors provide tools based on this framework: IBM, Rogue Wave, BEA. I looked at Apache Tuscany, which is an open-source implementation of the SDO specification. The latest version of Tuscany is 2.0.0-M4.

For a while I have been working on an application that manages reference data, and it has a few packages that handle dynamic data. These were written without knowledge of the SDO specification, and to meet more specific requirements. In this context, reading the specification and seeing the Tuscany implementation was a very interesting experience, because it highlighted design and implementation decisions I had made, sometimes consciously and sometimes without paying a lot of attention. My implementation of the Dynamic Data capabilities is a superset of a subset of the SDO specification. This is a common situation when multiple teams create applications for a specific domain. I will refer to my implementation of Dynamic Data in Java as DData.

(Continue reading …)

Application Deployment in Large Systems
Author: Calin Groza

October 17, 2009

Application design and environment management meet when the components that were designed and developed are deployed on servers. A seemingly simple task gets complicated when tens of interconnected applications are deployed in multiple environments made of hundreds of servers. This article is about how to approach a large deployment systematically. Some ideas are:

  • develop a conceptual model for the application deployment based on UML,
  • identify the application design deliverables and environment management tasks involved in the deployment,
  • use tools to manage the inventory of the applications, servers and their associations.

UML has the Deployment View to describe the link between the logical components and the run-time resources that will execute the code. The basic concepts are Node – the run-time computational resource, and Artifact – a physical entity such as a file. The Artifact is a “manifestation” of a component. In the real world, these concepts are not used by the implementation team; instead they operate with “servers”, “databases”, “Weblogic Servers.” UML2 provides ways to accommodate these higher-level entities. One way is to use the concept of “execution environment.” Databases and J2EE application containers are execution environments and can be represented as Nodes in the deployment diagrams. To show that a database runs on a server I am using associations between nodes qualified with the stereotype <<hostedOn>>. The picture on the right shows how to represent the Weblogic, Database and the Unix Servers in UML.
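The same concepts map naturally onto a small inventory model in code. The following is a sketch under stated assumptions: the class names, the `NodeKind` values and the `artifactsOn` query are hypothetical, chosen only to mirror Node, Artifact and the <<hostedOn>> association described above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical inventory model; all names are assumptions made for illustration.
class DeploymentModel {

    /** Execution environments (database, container) are modeled as nodes too. */
    enum NodeKind { SERVER, DATABASE, APPLICATION_CONTAINER }

    static final class Node {
        final String name;
        final NodeKind kind;
        final Node hostedOn;                 // null for a physical server
        final List<String> artifacts = new ArrayList<>();

        Node(String name, NodeKind kind, Node hostedOn) {
            this.name = name;
            this.kind = kind;
            this.hostedOn = hostedOn;
        }

        /** Follows the hostedOn links until it reaches a node with no host. */
        Node rootHost() {
            Node current = this;
            while (current.hostedOn != null) {
                current = current.hostedOn;
            }
            return current;
        }
    }

    /** Lists all artifacts deployed, directly or indirectly, on the given server. */
    static List<String> artifactsOn(Node server, List<Node> inventory) {
        List<String> result = new ArrayList<>();
        for (Node node : inventory) {
            if (node.rootHost() == server) {
                result.addAll(node.artifacts);
            }
        }
        return result;
    }
}
```

Keeping `hostedOn` as an explicit link makes questions such as "what is affected if this Unix server goes down?" a simple traversal over the inventory.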

(Continue reading …)
