Java method does not work on a large data set
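I am trying to find the most central character in a data set of Marvel characters and every book they appear in. The code below works on a small test file that we created ourselves to test the method more quickly, but when I run it on the Marvel file it breaks right from the start. I put print statements throughout the code to find where it stops working. I thought the problem might be related to iterating over so many characters, but it fails from the very beginning.

In the first while() loop I add startVertex to group, and I put a System.out.println(group) statement right after adding startVertex. When I run the test, the print statement outputs "[]" (which I am fairly sure means group did not get anything from startVertex), and then the code gets stuck in an infinite loop. For a small set of characters/books the code runs perfectly well. Any suggestions on how to get it working on the larger file?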
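Edit: here are links to the files. The large file has to be in raw form because GitHub cannot open it. Both files have exactly the same format, and both are parsed correctly from TSV into a multigraph.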
Small file: https://github.com/EECE-210/2013-L1A1/blob/master/mp5/testTSVfile.tsv
    /**
     * First find the largest connected set of characters and then
     * find the most central character of all characters in this set.
     *
     * @param none
     * @return the name of the character most central to the graph
     */
    public String findMostCentral() {
        Set<String> vertexSet = new LinkedHashSet<String>();
        vertexSet = vertexMap.keySet();
        Iterator<String> iterator = vertexSet.iterator();
        List<String> group = new ArrayList<String>();
        List<String> largestGroup = new ArrayList<String>();
        List<String> Path = new ArrayList<String>();
        Map<String, Integer> longestPathMap = new HashMap<String, Integer>();

        /*
         * This first while loop sets the starting vertex (i.e. the character that will be
         * checked against every other character to identify whether there is a path between
         * them). We add the character to a group list to later identify the largest group of
         * connected characters.
         */
        while (iterator.hasNext()) {
            String startVertex = iterator.next();
            group.add(startVertex);

            /*
             * This inner for loop sets the destination/end vertex (i.e. the character that is
             * the destination when compared to the starting character) to see if there is a
             * path between the two characters. If there is, we add the end vertex to the group
             * with the starting vertex.
             */
            for (String key : vertexSet) {
                String endVertex = key;
                if (findShortestPath(startVertex, endVertex) != null)
                    group.add(endVertex);
            }

            /*
             * If the group of connected characters is larger than the largest group, the
             * largest group is cleared and replaced with the new largest group.
             * After the group is copied to largest group, clear group.
             */
            if (group.size() > largestGroup.size()) {
                largestGroup.clear();
                for (int i = 0; i < group.size(); i++) {
                    largestGroup.add(group.get(i));
                }
            }
            group.clear();
        }

        /*
         * Iterate through the largest group to find the longest path each character has
         * to any other character.
         */
        for (String LG : largestGroup) {
            String startingVertex = LG;
            int longestPath = 0;
            for (String LG2 : largestGroup) {
                String endingVertex = LG2;
                Path = findShortestPath(startingVertex, endingVertex);
                /*
                 * If the path size from startingVertex to endingVertex is longer than any
                 * other path that startingVertex is connected to, set it as the longest path
                 * for that startingVertex.
                 */
                if (Path.size() > longestPath) {
                    longestPath = Path.size();
                }
            }
            // save the starting vertex and its longest path to a map
            longestPathMap.put(startingVertex, longestPath);
        }

        /*
         * Iterates through the longestPathMap and finds the shortest longest path and assigns
         * the character with the shortest longest path to mostCentralCharacter.
         */
        int shortestLongestPath = Integer.MAX_VALUE;
        String mostCentralCharacter = new String();
        for (Map.Entry<String, Integer> entry : longestPathMap.entrySet()) {
            if ((Integer) entry.getValue() < shortestLongestPath) {
                shortestLongestPath = (Integer) entry.getValue();
                mostCentralCharacter = (String) entry.getKey();
            }
        }
        return mostCentralCharacter;
    }
# Answer 1
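Thanks for the quick replies! I found the problem when I printed vertexSet before any of the for-in loops started. The first string in vertexSet is "" (i.e. nothing), so the code stores "" in startVertex, then picks an endVertex, and then gets stuck in an infinite loop trying to find the shortest path between nothing and a character. Thanks for the help!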
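
For anyone who hits the same thing, here is a minimal sketch of one way to guard against the empty vertex name. It is not the code from the assignment: the class `LargestComponentSketch`, the helper `isBlank`, and the `adjacency` map are all hypothetical stand-ins for the real multigraph. It also swaps the pairwise `findShortestPath` calls for a single breadth-first search per connected group, which may matter on a file the size of the Marvel one.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class LargestComponentSketch {

    /** Returns true for names that should never be treated as vertices (null or blank). */
    static boolean isBlank(String name) {
        return name == null || name.trim().isEmpty();
    }

    /**
     * Finds the largest connected group using one breadth-first search per group.
     * "adjacency" is an assumed representation: vertex name -> names of its neighbours.
     */
    static List<String> largestComponent(Map<String, Set<String>> adjacency) {
        Set<String> visited = new HashSet<>();
        List<String> largest = new ArrayList<>();
        for (String start : adjacency.keySet()) {
            // Skip the "" entry (and anything already assigned to a group).
            if (isBlank(start) || visited.contains(start)) {
                continue;
            }
            List<String> component = new ArrayList<>();
            Deque<String> queue = new ArrayDeque<>();
            queue.add(start);
            visited.add(start);
            while (!queue.isEmpty()) {
                String current = queue.remove();
                component.add(current);
                Set<String> neighbours =
                        adjacency.getOrDefault(current, Collections.<String>emptySet());
                for (String neighbour : neighbours) {
                    // visited.add returns false if the neighbour was already seen.
                    if (!isBlank(neighbour) && visited.add(neighbour)) {
                        queue.add(neighbour);
                    }
                }
            }
            if (component.size() > largest.size()) {
                largest = component;
            }
        }
        return largest;
    }

    public static void main(String[] args) {
        // Tiny made-up graph, including the problematic "" key.
        Map<String, Set<String>> graph = new LinkedHashMap<>();
        graph.put("", Collections.<String>emptySet());
        graph.put("A", new HashSet<>(Arrays.asList("B")));
        graph.put("B", new HashSet<>(Arrays.asList("A", "C")));
        graph.put("C", new HashSet<>(Arrays.asList("B")));
        graph.put("D", Collections.<String>emptySet());
        System.out.println(largestComponent(graph));
    }
}
```

Filtering the blank name while parsing the TSV file would probably be cleaner still, since the "" vertex then never enters the graph at all. The check inside the loop is only a fallback for a graph that has already been built.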