A method for mapping relationships in a database results in a cluster graph. A representative sample of records in each of a plurality of tables in the database is analyzed for nearest neighbor join edges instantiated by the record. Records with corresponding nearest neighbor join edges are grouped into clusters. Cluster pairs which share a join relationship between two tables are identified. A weighting may be applied to cluster pairs based on the number of records for the cluster pair. Meaningful cluster pairs above a weighted threshold may be ordered according to table and displayed as a cluster graph. Analyses of the cluster graph may reveal important characteristics of the database.

1. A method for mapping relationships in a database, the database including a plurality of tables having a table join structure, wherein the table join structure is indicated by table join edges in a schema graph of the database and wherein each of the plurality of tables includes a corresponding set of records, the method comprising:
for each of the plurality of tables, grouping, by a computer system, a sample of the corresponding set of records into clusters, wherein records grouped in a cluster instantiate a common set of table join edges;
identifying cluster pairs, wherein a cluster pair corresponds to two clusters from different tables, wherein the two clusters instantiate a common table join edge;
weighting the cluster pairs according to a number of records that instantiate the common table join edge;
filtering any cluster pairs weighted below a threshold weighting, wherein the filtering includes a process selected from excluding the cluster pairs weighted below the threshold and combining each cluster associated with each cluster pair weighted below the threshold weighting with another cluster;
selecting a source cluster from a first table and a target cluster from a second table, wherein the first table and second table are different tables;
selecting a third table in the database, wherein the third table shares a table join edge with the first table; and
determining a relative frequency, with respect to the first table, with which the second table is reached from the third table.

The method of claim 1, wherein each sample is a random sample of records in a given table, and wherein a sample size of each sample is proportional to a total number of records for the given table.

The method of claim 1, wherein the table join structure is limited to a maximum number of table join edges.

The method of claim 1, further comprising: displaying a cluster graph indicative of the cluster pairs and their corresponding weightings on a display device.

The method of claim 1, further comprising: calculating a probability that a random walk from the source cluster will reach the target cluster, wherein the random walk includes randomly selecting table join edges to advance between neighboring clusters, such that a given cluster is not selected more than once for the random walk.
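To make the claimed steps easier to follow, here is a minimal Python sketch of one possible reading of them. It is not the patent's implementation. The data layout (`samples`, `join_instances`), the function names `build_cluster_graph` and `reach_probability`, and the toy tables are all hypothetical. The sketch assumes that a cluster pair's weight is the number of sampled join instances linking the two clusters, that filtering uses the "exclude" option rather than the "combine" option, and that the random walk picks uniformly among unvisited neighboring clusters and is estimated by simulation.

```python
import random
from collections import Counter, defaultdict


def build_cluster_graph(samples, join_instances, threshold):
    """Cluster sampled records, weight cluster pairs, drop light pairs.

    samples: {table: {record_id: set of join-edge names the record instantiates}}
    join_instances: [(edge_name, (table_a, record_a), (table_b, record_b)), ...]
    Both layouts are assumptions made for this sketch.
    """
    # A cluster is identified by its table plus the shared set of join edges.
    cluster_of = {}
    for table, records in samples.items():
        for record_id, edges in records.items():
            cluster_of[(table, record_id)] = (table, frozenset(edges))

    # Weight each pair of clusters from different tables that share an edge.
    weights = Counter()
    for edge, left, right in join_instances:
        if left not in cluster_of or right not in cluster_of:
            continue  # one side of the join was not sampled
        a, b = cluster_of[left], cluster_of[right]
        if a[0] != b[0] and edge in a[1] and edge in b[1]:
            weights[frozenset((a, b))] += 1

    # Filtering by exclusion: keep only pairs at or above the threshold.
    return {pair: w for pair, w in weights.items() if w >= threshold}


def reach_probability(graph, source, target, trials=10_000, seed=0):
    """Estimate the chance that a self-avoiding random walk reaches target."""
    neighbors = defaultdict(set)
    for pair in graph:
        a, b = tuple(pair)
        neighbors[a].add(b)
        neighbors[b].add(a)

    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        current, visited = source, {source}
        while current != target:
            options = list(neighbors[current] - visited)
            if not options:
                break  # dead end: every neighboring cluster was already visited
            current = rng.choice(options)  # assumed uniform choice
            visited.add(current)
        hits += current == target
    return hits / trials


# Hypothetical toy data: orders join customers, customers join regions.
samples = {
    "orders": {1: {"orders-customers"}, 2: {"orders-customers"}, 3: set()},
    "customers": {
        10: {"orders-customers", "customers-regions"},
        11: {"orders-customers", "customers-regions"},
    },
    "regions": {100: {"customers-regions"}},
}
join_instances = [
    ("orders-customers", ("orders", 1), ("customers", 10)),
    ("orders-customers", ("orders", 2), ("customers", 11)),
    ("customers-regions", ("customers", 10), ("regions", 100)),
    ("customers-regions", ("customers", 11), ("regions", 100)),
]

orders_cluster = ("orders", frozenset({"orders-customers"}))
customers_cluster = ("customers", frozenset({"orders-customers", "customers-regions"}))
regions_cluster = ("regions", frozenset({"customers-regions"}))

graph = build_cluster_graph(samples, join_instances, threshold=2)
probability = reach_probability(graph, orders_cluster, regions_cluster, trials=100)
```

In this toy graph the surviving cluster pairs form a chain from the orders cluster through the customers cluster to the regions cluster, so every self-avoiding walk from the source reaches the target. Raising `threshold` to 3 removes both pairs and leaves the source cluster isolated.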