Java Example: Computing Edit Distance with a Rolling Array

Date: 2021-05-19

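This example shows how to compute edit distance in Java using a rolling array. The details are as follows:

Edit distance, also known as Levenshtein distance, is the minimum number of single-character edits (insertions, deletions, or substitutions) needed to transform one string into another.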
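As a point of reference, this definition is usually computed with dynamic programming over a full (n+1) × (m+1) table. The following is a minimal sketch under that assumption; the class name `FullTableEditDistance` and the method name `distance` are illustrative and do not come from Commons Lang.

```java
/** Illustrative sketch: textbook Levenshtein distance using a full DP table. */
final class FullTableEditDistance {

    /** Returns the minimum number of insertions, deletions and substitutions turning s into t. */
    static int distance(String s, String t) {
        int n = s.length();
        int m = t.length();
        // dp[i][j] = edit distance between the first i chars of s and the first j chars of t
        int[][] dp = new int[n + 1][m + 1];
        for (int i = 0; i <= n; i++) {
            dp[i][0] = i; // delete all i characters
        }
        for (int j = 0; j <= m; j++) {
            dp[0][j] = j; // insert all j characters
        }
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int cost = s.charAt(i - 1) == t.charAt(j - 1) ? 0 : 1;
                dp[i][j] = Math.min(
                        Math.min(dp[i - 1][j] + 1, dp[i][j - 1] + 1),
                        dp[i - 1][j - 1] + cost);
            }
        }
        return dp[n][m];
    }

    public static void main(String[] args) {
        System.out.println(distance("frog", "fog"));
        System.out.println(distance("fly", "ant"));
    }
}
```

Each cell depends only on the previous row and on the cell immediately before it in the current row, so the full table is not strictly needed. That observation is what a rolling array exploits.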

The code below is taken from org.apache.commons.lang.StringUtils.

Usage examples:

StringUtils.getLevenshteinDistance(null, *) = IllegalArgumentException
StringUtils.getLevenshteinDistance(*, null) = IllegalArgumentException
StringUtils.getLevenshteinDistance("","") = 0
StringUtils.getLevenshteinDistance("","a") = 1
StringUtils.getLevenshteinDistance("aaapppp", "") = 7
StringUtils.getLevenshteinDistance("frog", "fog") = 1
StringUtils.getLevenshteinDistance("fly", "ant") = 3
StringUtils.getLevenshteinDistance("elephant", "hippo") = 7
StringUtils.getLevenshteinDistance("hippo", "elephant") = 7
StringUtils.getLevenshteinDistance("hippo", "zzzzzzzz") = 8
StringUtils.getLevenshteinDistance("hello", "hallo") = 1

Java code:

public static int getLevenshteinDistance(String s, String t) {
    if (s == null || t == null) {
        throw new IllegalArgumentException("Strings must not be null");
    }

    int n = s.length(); // length of s
    int m = t.length(); // length of t

    if (n == 0) {
        return m;
    } else if (m == 0) {
        return n;
    }

    if (n > m) {
        // swap the input strings to consume less memory
        String tmp = s;
        s = t;
        t = tmp;
        n = m;
        m = t.length();
    }

    int p[] = new int[n+1]; //'previous' cost array, horizontally
    int d[] = new int[n+1]; // cost array, horizontally
    int _d[]; //placeholder to assist in swapping p and d

    // indexes into strings s and t
    int i; // iterates through s
    int j; // iterates through t

    char t_j; // jth character of t

    int cost; // cost

    for (i = 0; i<=n; i++) {
        p[i] = i;
    }

    for (j = 1; j<=m; j++) {
        t_j = t.charAt(j-1);
        d[0] = j;

        for (i=1; i<=n; i++) {
            cost = s.charAt(i-1)==t_j ? 0 : 1;
            // minimum of cell to the left+1, to the top+1, diagonally left and up +cost
            d[i] = Math.min(Math.min(d[i-1]+1, p[i]+1), p[i-1]+cost);
        }

        // copy current distance counts to 'previous row' distance counts
        _d = p;
        p = d;
        d = _d;
    }

    // our last action in the above loop was to switch d and p, so p now
    // actually has the most recent cost counts
    return p[n];
}
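To see how the two arrays roll over, it may help to record every row the loop produces. The sketch below is an illustration only: the class `RollingArrayTrace` and its `rows` method are hypothetical names, and the null check and the length-based swap of the inputs are omitted for brevity.

```java
import java.util.Arrays;

/** Illustrative sketch: records each cost row produced by the two rolling arrays. */
final class RollingArrayTrace {

    /** Returns snapshots of all rows; rows[j][i] compares the first i chars of s with the first j chars of t. */
    static int[][] rows(String s, String t) {
        int n = s.length();
        int m = t.length();
        int[] p = new int[n + 1]; // previous row
        int[] d = new int[n + 1]; // row currently being filled
        int[][] rows = new int[m + 1][];

        for (int i = 0; i <= n; i++) {
            p[i] = i;
        }
        rows[0] = p.clone();

        for (int j = 1; j <= m; j++) {
            char tj = t.charAt(j - 1);
            d[0] = j;
            for (int i = 1; i <= n; i++) {
                int cost = s.charAt(i - 1) == tj ? 0 : 1;
                d[i] = Math.min(Math.min(d[i - 1] + 1, p[i] + 1), p[i - 1] + cost);
            }
            // Snapshot for display only; the algorithm itself keeps just p and d.
            rows[j] = d.clone();
            // Swap the references instead of copying elements.
            int[] tmp = p;
            p = d;
            d = tmp;
        }
        return rows;
    }

    public static void main(String[] args) {
        for (int[] row : rows("fog", "frog")) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

The last element of the last row is the edit distance. Because only references are swapped, each iteration of the outer loop costs no extra copying.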

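In fact, the memory footprint of the code above can be reduced further: instead of two rolling arrays, a single one-dimensional array is enough, as long as the value about to be overwritten is saved so that it can serve as the diagonal entry for the next cell. The asymptotic space complexity stays O(min(n, m)), but the array storage is roughly halved.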

Java code:

public int minDistance(String s, String t) {
    if (s == null || t == null) {
        throw new IllegalArgumentException("Strings must not be null");
    }

    int n = s.length(); // length of s
    int m = t.length(); // length of t

    if (n == 0) {
        return m;
    } else if (m == 0) {
        return n;
    }

    if (n > m) {
        // swap the input strings to consume less memory
        String tmp = s;
        s = t;
        t = tmp;
        n = m;
        m = t.length();
    }

    int d[] = new int[n+1]; // cost array, horizontally

    // indexes into strings s and t
    int i; // iterates through s
    int j; // iterates through t

    char t_j; // jth character of t

    int cost; // cost

    for (i = 0; i<=n; i++) {
        d[i] = i;
    }

    for (j = 1; j<=m; j++) {
        t_j = t.charAt(j-1);
        int pre = d[0];
        d[0] = j;

        for (i=1; i<=n; i++) {
            int temp = d[i];
            cost = s.charAt(i-1)==t_j ? 0 : 1;
            // minimum of cell to the left+1, to the top+1, diagonally left and up +cost
            d[i] = Math.min(Math.min(d[i-1]+1, d[i]+1), pre+cost);
            pre = temp;
        }
    }

    return d[n];
}
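One way to gain confidence in the single-array version is to run it against the sample values from the usage examples listed earlier. The sketch below does that with a compact re-implementation of the same idea; the class name `SingleArrayCheck` and the variable names are illustrative assumptions, not part of the original code.

```java
/** Illustrative sketch: checks a single-array variant against the article's sample values. */
final class SingleArrayCheck {

    static int distance(String s, String t) {
        if (s == null || t == null) {
            throw new IllegalArgumentException("Strings must not be null");
        }
        if (s.length() > t.length()) {
            // Keep the array as short as possible by making s the shorter string.
            String tmp = s;
            s = t;
            t = tmp;
        }
        int n = s.length();
        int m = t.length();
        int[] d = new int[n + 1];
        for (int i = 0; i <= n; i++) {
            d[i] = i;
        }
        for (int j = 1; j <= m; j++) {
            int pre = d[0]; // value diagonally up-left of d[1]
            d[0] = j;
            for (int i = 1; i <= n; i++) {
                int top = d[i]; // value from the previous row, about to be overwritten
                int cost = s.charAt(i - 1) == t.charAt(j - 1) ? 0 : 1;
                d[i] = Math.min(Math.min(d[i - 1] + 1, top + 1), pre + cost);
                pre = top; // becomes the diagonal for the next column
            }
        }
        return d[n];
    }

    public static void main(String[] args) {
        String[][] pairs = {
            {"", ""}, {"", "a"}, {"aaapppp", ""}, {"frog", "fog"}, {"fly", "ant"},
            {"elephant", "hippo"}, {"hippo", "elephant"}, {"hippo", "zzzzzzzz"}, {"hello", "hallo"}
        };
        int[] expected = {0, 1, 7, 1, 3, 7, 7, 8, 1};
        for (int k = 0; k < pairs.length; k++) {
            int actual = distance(pairs[k][0], pairs[k][1]);
            String status = actual == expected[k] ? "OK" : "MISMATCH";
            System.out.println(pairs[k][0] + " / " + pairs[k][1] + " = " + actual + " " + status);
        }
    }
}
```

The key detail is the order of operations inside the inner loop: the old `d[i]` must be read before it is overwritten, and only afterwards stored in `pre` for the next column.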


I hope this article is helpful for your Java programming.
