[Java] Counting keyword occurrences in the Java source files under a directory
阿新 • Published: 2018-11-06
Problem
The problem can also be abstracted as counting how many times a given string occurs in the body of a file.
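As a point of reference, the abstract version of the problem can be sketched with a plain substring search. This is only an illustration; the class and method names (SubstringCounter, countOccurrences) are hypothetical and do not come from the original program.

```java
// Hypothetical helper, not part of the original program.
class SubstringCounter {
    // Counts non-overlapping occurrences of target in text.
    static int countOccurrences(String text, String target) {
        if (target.isEmpty()) return 0; // avoid an infinite loop on an empty target
        int count = 0;
        int from = 0;
        while ((from = text.indexOf(target, from)) != -1) {
            count++;
            from += target.length(); // continue searching after this match
        }
        return count;
    }

    public static void main(String[] args) {
        // Prints 3: the two int declarations plus the "int" inside "print".
        System.out.println(countOccurrences("int a; int b; print(a);", "int"));
    }
}
```

A raw substring count also matches "int" inside "print", so counting keywords needs whole-word matching rather than a bare substring search.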
Approach
1. Java has 50 keywords in total:
final String[] KEYWORDS = { // the 50 keywords
    "abstract", "assert", "boolean", "break", "byte",
    "case", "catch", "char", "class", "const",
    "continue", "default", "do", "double", "else",
    "enum", "extends", "final", "finally", "float",
    "for", "goto", "if", "implements", "import",
    "instanceof", "int", "interface", "long", "native",
    "new", "package", "private", "protected", "public",
    "return", "strictfp", "short", "static", "super",
    "switch", "synchronized", "this", "throw", "throws",
    "transient", "try", "void", "volatile", "while"
};
2. Declarations and initialization
...
ArrayList<File> fileList; // list of the Java files found
File root; // the given directory
Map<String, Integer> keywords; // HashMap from keyword to occurrence count, e.g. <key,value> = <"int",3>
...
public KeywordsAnalyzer(String pathName) {
    root = new File(pathName);
    fileList = new ArrayList<>();
    keywords = new HashMap<>();
    for (String word : KEYWORDS) {
        keywords.put(word, 0); // start every keyword at 0 (a HashMap does not keep the KEYWORDS order)
    }
}
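If the order of KEYWORDS should survive initialization, one option is a LinkedHashMap, which iterates in insertion order, whereas HashMap makes no ordering promise. The sketch below uses a hypothetical four-word SAMPLE array and a hypothetical zeroed helper purely for illustration.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo class, not part of the original program.
class InitOrderDemo {
    // A short stand-in for the full KEYWORDS array.
    static final String[] SAMPLE = {"while", "abstract", "int", "for"};

    // Puts every sample word into the given map with a count of 0.
    static Map<String, Integer> zeroed(Map<String, Integer> target) {
        for (String word : SAMPLE) {
            target.put(word, 0);
        }
        return target;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order: [while, abstract, int, for]
        System.out.println(zeroed(new LinkedHashMap<>()).keySet());
        // HashMap iteration order is unspecified and may differ.
        System.out.println(zeroed(new HashMap<>()).keySet());
    }
}
```

Because Collections.sort is stable, starting from a LinkedHashMap would also make keywords with equal counts appear in KEYWORDS order in the final output.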
3. Recursively search the directory for all Java files
ArrayList<File> fileList;
File root;
public void searchFiles() {
    searchFiles(root);
}
private void searchFiles(File dir) {
    File[] files = dir.listFiles();
    if (files == null) return; // not a directory, or it could not be read
    for (File file : files) {
        if (file.isDirectory()) {
            searchFiles(file); // recurse without overwriting root
        } else if (file.getName().endsWith(".java")) {
            fileList.add(file);
        }
    }
}
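The same search can also be written with java.nio.file.Files.walk instead of hand-written recursion. This is an alternative sketch rather than the method used in this article; the class and method names (JavaFileFinder, findJavaFiles) are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical alternative to searchFiles(), not part of the original program.
class JavaFileFinder {
    // Walks the tree under root and returns every regular file ending in ".java".
    static List<File> findJavaFiles(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) { // the stream must be closed
            return paths.filter(Files::isRegularFile)
                        .filter(p -> p.getFileName().toString().endsWith(".java"))
                        .map(Path::toFile)
                        .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (File file : findJavaFiles(root)) {
            System.out.println(file);
        }
    }
}
```

Files.walk visits subdirectories on its own, so there is no shared field to update while descending.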
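3. Keyword matching
Read a line from the file, split it into an array of strings, and check each element against the keyword list.
Non-alphanumeric characters have to be neutralized first, for example:
private void fixUp(int k) {
// splitting directly on whitespace misses one int
private
void
fixUp(int // there is an int keyword here
k)
{
// after replacing with the regular expression "\\W", the tokens become
private
void
fixUp
int
k
The code is as follows: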
public void matchKeywords(String line) {
    // replace every non-word character with a space, then split into tokens
    String[] wordList = line.replaceAll("\\W", " ").split(" ");
    for (int i = 0; i < wordList.length; i++) {
        for (int j = 0; j < KEYWORDS.length; j++) {
            if (wordList[i].equals(KEYWORDS[j])) { // compare against each keyword in turn
                int count = (int) keywords.get(KEYWORDS[j]);
                keywords.put(KEYWORDS[j], count + 1);
            }
        }
    }
}
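To see the tokenization at work on the fixUp line above, here is a small self-contained sketch. It is a variation on matchKeywords, not the original method: it splits directly on "\\W+" and looks tokens up in a set, and the names TokenizeDemo, SAMPLE_KEYWORDS and count are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical demo class, not part of the original program.
class TokenizeDemo {
    // Only a handful of keywords, to keep the example short.
    static final Set<String> SAMPLE_KEYWORDS =
            new HashSet<>(Arrays.asList("private", "void", "int"));

    // Splits the line on runs of non-word characters and counts keyword tokens.
    static Map<String, Integer> count(String line) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String token : line.split("\\W+")) {
            if (SAMPLE_KEYWORDS.contains(token)) {
                counts.merge(token, 1, Integer::sum); // insert 1 or add 1
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prints {private=1, void=1, int=1}
        System.out.println(count("private void fixUp(int k) {"));
    }
}
```

Since tokens are compared as whole words, an identifier such as intValue does not count as an int. Keywords that appear inside string literals are still counted, which this sketch does not try to handle.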
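4. Handling comments
There are four different kinds of comments to deal with:
/**
documentation comment
*/
/*
multi-line comment
*/
// single-line comment
int number; /* the first line is treated as code
*
the remaining lines are treated as comments */
Read each line of the file and first decide whether it belongs to a comment. If it does, skip it; otherwise run the keyword matching on it.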
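public void countKeyWords(File file) throws IOException {
    try (BufferedReader input = new BufferedReader(new FileReader(file))) {
        String line;
        while ((line = input.readLine()) != null) {
            line = line.trim();
            if (line.startsWith("//")) continue; // skip single-line comments
            int start = line.indexOf("/*");
            if (start >= 0) { // multi-line, documentation, and trailing comments
                matchKeywords(line.substring(0, start)); // only the code before "/*" counts
                // skip ahead to the line that closes the comment (or to end of file)
                while (line != null && !line.trim().endsWith("*/")) {
                    line = input.readLine();
                }
                continue; // the closing line is part of the comment
            }
            matchKeywords(line); // count keywords on ordinary code lines
        }
    }
}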
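The line-based check above does not cover a // comment that follows code on the same line, or code that follows a closing */. A character-level scan with an "inside a block comment" flag is one way to handle those cases. The following is a minimal sketch under the assumption that comment markers never appear inside string or character literals; CommentStripper and strip are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the original program.
class CommentStripper {
    // Returns the code portion of each line with // and /* ... */ comments removed.
    // Assumption: comment markers never occur inside string or char literals.
    static List<String> strip(List<String> lines) {
        List<String> code = new ArrayList<>();
        boolean inBlock = false; // true while inside an unclosed /* ... */
        for (String line : lines) {
            StringBuilder kept = new StringBuilder();
            int i = 0;
            while (i < line.length()) {
                if (inBlock) {
                    int end = line.indexOf("*/", i);
                    if (end < 0) break;       // comment continues on the next line
                    inBlock = false;
                    kept.append(' ');         // keep the tokens around the comment apart
                    i = end + 2;
                } else if (line.startsWith("//", i)) {
                    break;                    // the rest of the line is a comment
                } else if (line.startsWith("/*", i)) {
                    inBlock = true;
                    i += 2;
                } else {
                    kept.append(line.charAt(i++));
                }
            }
            code.add(kept.toString().trim());
        }
        return code;
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList(
                "int number; /* first line",
                " * still a comment */ int x;",
                "// single-line comment",
                "return x; // trailing comment");
        // Prints [int number;, int x;, , return x;]
        System.out.println(strip(source));
    }
}
```

Each stripped line could then be passed to the keyword matching step as-is.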
Workflow and output
public void keywordsAnalyze() {
    for (File file : fileList) {
        try {
            countKeyWords(file);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
    // sort and print the results
    List<Map.Entry<String, Integer>> list = new ArrayList<Map.Entry<String, Integer>>(keywords.entrySet());
    Collections.sort(list, new Comparator<Map.Entry<String, Integer>>() {
        @Override
        public int compare(Map.Entry<String, Integer> o1, Map.Entry<String, Integer> o2) {
            return o2.getValue().compareTo(o1.getValue());
        }
    });
    int count = 0;
    for (Map.Entry<String, Integer> word : list) {
        count++;
        System.out.print(word.getKey() + ": " + word.getValue() + " ");
        if (count == 5) { // print 5 keywords per line
            count = 0;
            System.out.println();
        }
    }
}
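The results are printed in descending order of occurrence count, which relies on sorting a HashMap by value. For details, see my other article (not yet finished), which uses the same keyword example to walk through two simple approaches: sorting a HashMap by key and sorting it by value.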
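For readers who want a quick picture of the two kinds of sorting, here is a minimal sketch. It is not taken from the article mentioned above; the names SortDemo, sortByValueDesc and sortByKey are hypothetical, and breaking ties by key is an added assumption so that the output is deterministic.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class, not part of the original program.
class SortDemo {
    // By value: copy the entries into a list and sort it, highest count first.
    // Ties are broken by key (an assumption made here for a stable result).
    static List<Map.Entry<String, Integer>> sortByValueDesc(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort(Map.Entry.<String, Integer>comparingByValue(Comparator.reverseOrder())
                .thenComparing(Map.Entry.<String, Integer>comparingByKey()));
        return entries;
    }

    // By key: a TreeMap keeps its keys in natural (alphabetical) order.
    static Map<String, Integer> sortByKey(Map<String, Integer> counts) {
        return new TreeMap<>(counts);
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("int", 3);
        counts.put("for", 1);
        counts.put("if", 3);
        counts.put("class", 2);
        System.out.println(sortByValueDesc(counts)); // [if=3, int=3, class=2, for=1]
        System.out.println(sortByKey(counts));       // {class=2, for=1, if=3, int=3}
    }
}
```

The comparator factories on Map.Entry do the same job as the anonymous Comparator in keywordsAnalyze, just more compactly.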
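Test results
Running the analysis on the test cases gives the following results:
Source code download
Includes the complete code and the test cases.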