BlogJava 首页 新随笔 联系 聚合 管理
  400 Posts :: 0 Stories :: 296 Comments :: 0 Trackbacks

Data Relations and Relatives

Dino Esposito

July 12, 2001

Download ViewManager.exe.

A large number of applications need to render data that is somehow related to other data. A well-known example that illustrates this is given by the very popular Customers table that needs to link to an equally famous Orders table. What do they have in common? An even more famous CustID field, of course. How do you typically solve the problem of rendering all the orders for a given customer? According to the constraints and the requirements of your application, you might be applying a number of feasible solutions.

In .NET, a new valuable tool can be added to your programmer's toolkit. This tool is the DataRelation object. It basically represents a parent/child relationship set between two tables. The DataRelation per se is not such a big deal. It gains a lot of importance and usefulness, though, when you look at it in light of the support that the DataSet and other ADO.NET objects and controls have for relations.

Typical Approaches

When you need to fetch data from related tables, a popular solution is to you use a plain old INNER JOIN SQL command. It merges the column of the two input tables that you need to work with into a single resultset . The following code creates a final resultset where you find two columns from Customers and three columns from Orders.

SELECT c.Name, c.City, o.Date, o.TotalPrice, oShipAddress 
FROM Customers AS c
ON c.CustID = o.CustID

The INNER JOIN statement involves the database server and ends up returning rows with a certain quantity of duplicated data. When you run the above query, you are aimed to obtain and process all the orders for a certain customer. So, you don't need to repeat the information—address, city, and the like—you want to return about the customer. Nevertheless, this is exactly what you get back in the form of tabular structure from the resultset.

In ADO, the data shaping service lets you create hierarchical and, hence, even irregularly shaped recordsets. By using data shaping, the Customers/Orders relationship would have been expressed in terms of one row with all the customer information, plus an extra field pointing to a child recordset. The child recordset would feature one row for each order associated with the specified customer ID. The structure of the order rows is determined by the fields you want to query from the Orders table.

ADO data shaping requires you to write queries with a special language called the SHAPE language.

  {SELECT Name, City, Custid FROM Employees} 
  {SELECT CustId, Date, TotalPrice, ShipAddress FROM Orders} AS oOrders 
  RELATE CustId TO CustId)

Next, it executes all the necessary queries on the database within a single connection. The results are then shaped in terms of hierarchical recordsets on the way to the client thanks to a special OLE DB service.

INNER JOINs and data shaping have to do more with the way in which you fetch and store the related data. What about retrieving and showing this data in a client application?

What an INNER JOIN statement returns is a tabular structure and the extraction of needed information is completely up to you. With data shaping, the information at least comes in with a layout that lends itself quite well to be displayed the way it should be.

The customer information is distinct from the list of orders. You access it as a normal recordset field with a particular name.

Set rsCustomerOrders = oRS.Fields("oOrders").Value

To access the orders for a given customer, select the corresponding row on the Customers table and then access the field whose name matches the previously set relation.

ADO.NET Sort of Data Shaping

To represent parent/child data relationships in ADO.NET, you are expected to use the DataRelation object. If you're familiar with ADO data shaping, you'll soon recognize, under the hood of a DataRelation object, the SHAPE language code snippet that I just showed above.

In ADO.NET, a DataRelation object is used to establish an in-memory relationship between two DataTable objects. The relationship sets on matching values found in one column the two tables have in common. A column in ADO.NET is represented by the DataColumn object.

Let's see how to code in ADO.NET the Customer/Orders relationship seen earlier.

DataColumn dcCustomerCustID, dcOrdersCustID;
// Fill in the two DataColumn objects
dcCustomerCustID = DataSet1.Tables["Customers"].Columns["CustID"];
dcOrdersCustID = DataSet1.Tables["Orders"].Columns["CustID"];
// Create the relationship between the two columns
DataRelation relCustomerOrders;
relCustomerOrders = new DataRelation("CustomerOrders", 
dcCustomerCustID, dcOrdersCustID); 

A freshly created DataRelation object is rather useless if you don't add it to a DataSet.


The DataSet object contains a Relations data member, which is a DataRelationCollection object where all the relations involving DataSet's tables are kept.

Notice that any relation is created between matching columns in two tables within the same DataSet. For this to happen, the .NET type of the columns must be identical. The .NET type of a column is given by the value returned by its DataType property.

When you have a parent/child relationship set between two tables, deleting or updating a value in the parent table can affect the rows of the child table. The impact on the child rows manifests in one of the following ways:

  • The child rows must be deleted or updated with cascading changes.
  • Values in child columns must be set to NULL values.
  • Values in child columns must be set to default values.

If you don't manage this directly through a ForeignKeyConstraint policy, the operation originates an exception.

So, if you're going to create an in-memory relation for cached, disconnected data that you plan to modify, make sure you first define a ForeignKeyConstraint object on the parent table. This ensures that any change that could affect the related tables is properly managed. You create a constraint like this:

ForeignKeyConstraint fkc; 
DataColumn dcCustomersCustID, dcOrdersCustID;
// Get columns and create the constraint
dcCustomersCustID = DataSet1.Tables["Customers"].Columns["CustID"];
dcCustomersCustID = DataSet1.Tables["Orders"].Columns["CustID"];
fkc = new ForeignKeyConstraint("CustomersFK", 
dcCustomersCustID, dcOrdersCustID);
// Shape up the constraint for delete and update
fkc.DeleteRule = Rule.SetNull;
fkc.UpdateRule = Rule.Cascade;

A ForeignKeyConstraint is created on the parent table using the common column that the parent and child table share. To specify how a child table behaves whenever a row on the parent table is deleted or updated, you use the DeleteRule and UpdateRule fields. In this case, I set all the values on the child row to NULL when the corresponding parent row is deleted. Furthermore, any update simply trickles down from the parent row to the child row.

A DataTable object maintains its collection of ForeignKeyConstraint objects in a ConstraintCollection class that is accessible through the DataTable's Constraints property. As a final note, bear in mind that constraints are not enforced on tables if you set the EnforceConstraints property to false.

// Add the constraint and enforce it  
DataSet1.EnforceConstraints = true;

Upon creation, ADO.NET verifies that the DataRelation object can be effectively created. This basically means that it checks whether or not all the involved columns are really part of the given tables. According to the syntax, in fact, you could pass the DataRelation's constructor a DataColumn object that you create on the fly with the right type and name but not the "right" column.

The DataRelation object and the involved DataTable objects are disjointed and independent objects until the relation is added to the DataSet's Relations collection. When this happens, ADO.NET prohibits any changes on the tables that could invalidate the relation. For example, changes on columns are disallowed, as well as moving the tables from one DataSet to another.

The DataRelation object also features a method called CheckStateForProperty that allows you to verify the validity of the relation before you add it to a DataSet. The controls operated by this method include checking whether parent and child tables belong to different DataSet objects, whether the columns type matches, and makes sure that parent and child columns aren't the same column.

You can call this method even if the DataRelation doesn't yet belong to a DataSet—the same DataSet to which the involved tables belong. CheckStateForProperty doesn't return a Boolean value to mean success or failure. In case of error, you will be notified through a DataException exception.

Accessing Child Records

Given a parent/child data relation, how can you access the child rows associated with a parent row? In other words, assuming that you have the same Customers and Orders tables in the DataSet , how can you get the orders for a given Customers row?

In the code accessing the related data, you first obtain a DataRow object representing the parent row. You can do this in a number of ways, all strictly dependent on the structure of your application.

For example, if you know the primary key value that uniquely identifies that row, you can use the Find method on the RowsCollection object that represents the rows of the table.

DataRow r = DataSet1.Tables["Customers"].Rows.Find(nCustID);

Once you hold the right DataRow object, obtaining the child rows according to a given relation is as easy as calling the method GetChildRows to fill up an array of DataRow objects.

DataRow[] rgCustomerOrders;
rgCustomerOrders = r.GetChildRows(relCustomerOrders);

GetChildRows takes one argument being a reference to a valid DataRelation object set on that DataSet. It returns the child rows as an array of DataRow objects. The following code shows how to dump all the orders of a given customer to the console .

for (int i=0; i < rgCustomerOrders.Length; i++)  {
   DataRow tmp = rgCustomerOrders[i];

To be honest, GetChildRows can be called through a couple of other prototypes. You can certainly specify the relation as a DataRelation object as shown above. However, you could also indicate the relation by name.

rgCustomerOrders = r.GetChildRows("CustomerOrders");

In addition, you can select the version of the various rows that must be returned. You do this through the following signatures:

public DataRow[] GetChildRows(
   DataRelation relation,
   DataRowVersion version
public DataRow[] GetChildRows(
   String relationName,
   DataRowVersion version

You indicate the version of the rows through the values in the DataRowVersion enumeration. Possible values are Default, Original, Current, and Proposed.

Automatic Master/Detail Views

Since the DataRelation object associates rows in one DataTable object with rows in another DataTable object, it lends itself very well to build master/detail views. The GetChildRows method is the key tool to build such views. If you find this behavior quite cool, you'll love the Windows Forms DataGrid control, which does even more.

You set the DataGrid control to show data coming from the source specified in its DataSource property. If DataSource happens to point to a cointainer control, like DataSet or DataViewManager, it will feature one row for each child table prefixed by a + symbol. Click there and you'll see the content of that table.

You can select a specific table by setting the DataMember property with the name of the child table.

theMasterGrid.DataSource = ds
theMasterGrid.DataMember = "Customers"

If you have two datagrid controls on your form and want to realize a master/detail view, you can associate each grid with a different table, and then hook up for the event that fires when a new item is selected in the master table. At this point, you could access the array with related child rows, create a DataTable on the fly, and update the DataSource property of the detail DataGrid. Notice that you must use the SetDataBinding method at run time to reset the DataSource property for a Windows Forms datagrid.

This approach works just fine, but DataGrid controls can better perform this action. You can have the DataGrid automatically refresh the detail view if you use a special syntax when setting the DataMember property.

theChildGrid.DataSource = ds
theChildGrid.DataMember = "Customers.CustomerOrders"

If you concatenate the name of the parent table with the name of an existing relation and put a dot character in the middle of the two, you instruct the DataGrid control to automatically and silently call GetChildRows for the CustomerOrders relation on the currently selected row of the Customers table.

The magic performed by the DataGrid doesn't end here. As long as the two grids have the same data source, the child one will automatically hook for the event that indicates that a new row has been selected in the master grid. At the end of the day, a relation, two Windows Forms datagrid and the following four lines of code, are enough to produce a free auto-refreshing master/detail view.

theMasterGrid.DataSource = ds
theMasterGrid.DataMember = "Customers"
theChildGrid.DataSource = ds
theChildGrid.DataMember = "Customers.CustomerOrders"

A good question to raise at this point is, "How can the child grid know about the parent grid?" Basically, any DataGrid that is assigned the content of a parent/child relationship investigates the running instance of another grid object in the same form with the same content in the DataSource property and with a DataMember that matches the first part of its member expression (Customers in the above example).

Rendering DataRelations in XML

Relations constitute a key information for DataSet objects. DataSets, though, can switch any time from their typical relational representation to a hierarchical one based on XML. When relations are set, internally the DataSet object works in a way that resembles what happens with the ADO data shaping representation. An extra field is spookily added to each row to link to its group of child rows in the child table. What happens of this information when you switch from the traditional representation to XML?

You can do this in two ways. Either you create a new instance of the XmlDataDocument class based on the DataSet

XmlDataDocument xmlDoc = new XmlDataDocument(DataSet1);

Or you can save the whole DataSet to XML using the WriteXml method.

In both cases, the results you get differ quite a bit depending on the value that the Nested property has on the DataRelation object. Nested is a Boolean value that is set to false by default. It controls the way in which the child rows of a relation are rendered in XML. Two DataSet tables are rendered like this:

    <Name>Acme Inc</Name>
    <Name>Foo Corp</Name>

Each record is rendered with a subtree with the table name and as many text nodes as the number of columns. This representation doesn't change if you have relations set as long as Nested remains set to false.

If you set Nested to true, then all the order nodes for any given customer will be rendered as a child subtree.

    <Name>Acme Inc</Name>
    <Name>Foo Corp</Name>

All the orders that correspond to a given customer go under the node of that customer, building up a more reasonable and useful structure.


DataRelation is the ADO.NET object that represents a logical link between two tables through a common column. The DataRelation defines the parent/child relationship, but the tables and the columns remain separate entities. Once a relation has been set, you can easily access the child rows of the detail table either by using methods on the DataRow object or switching to the XML hierarchical representation. The DataRelation object looks like an in-memory INNER JOIN, but without the same redundancy of information.

Dialog Box: Modifications Through the View

What's the role of all those AllowXXX properties on the DataView object? Can I modify rows or not through a DataTable's view?

A DataView is an object that provides a particular representation of the content of a given table. The DataView and the DataTable are independent objects, and the DataView simply holds a link to the parent table. The DataView doesn't cache the table, nor does it makes an internal copy of the data. The DataView is simply an object that contains some information about the way and the order in which the content of the table must be shown.

The core function that you execute on a DataView is the enumeration of the items. This happens explicitly when you loop through its content or implicitly when you assign the DataView to the DataSource property of a data-bound control. When a data-bound control calls its DataBind method, the content of the data source is enumerated and the control's Items collection is properly populated.

When a view is involved, the caller enumerates through the view, and the view in turn enumerates through the parent table and applies sorting expressions and filters.

The AllowEdit, AllowDelete, and AllowNew Boolean properties indicate whether the DataView and the user interface associated with it allow updates, deletions, and insertions. This doesn't affect the way in which the parent table is updated. Those properties apply only to the edit operations carried out through the DataView control or a data-bound control that uses it.


Dino Esposito works for Wintellect,Non-MS link where he takes on ADO.NET and ASP.NET training and consulting. He is the co-founder of VB-2-The-Max,Non-MS link and also contributes the Cutting Edge column to MSDN Magazine. You can reach Dino at dinoe@wintellect.com.

posted on 2006-01-10 22:25 jinfeng_wang 阅读(765) 评论(0)  编辑  收藏 所属分类: .Net