Recursive Decompression in Java

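While working on email analysis, the front end uploads the emails as a compressed archive, so the back end has to decompress the archive first and then extract the emails from it.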

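The archive's extension may be rar, 7z, or zip, so each format needs its own decompression path. An archive may also contain other archives, so decompression has to be recursive.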
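The extension check described above can be sketched roughly as follows. This is only an illustration: the enum `ArchiveKind` and its method `of` are hypothetical names that do not appear in the original code, and the extension is parsed with plain string operations so the snippet has no dependencies.

```java
import java.util.Locale;

/** Hypothetical helper: maps a file name to the kind of archive its extension suggests. */
enum ArchiveKind {
    ZIP, SEVEN_ZIP, RAR, NONE;

    static ArchiveKind of(String fileName) {
        int dot = fileName.lastIndexOf('.');
        int sep = Math.max(fileName.lastIndexOf('/'), fileName.lastIndexOf('\\'));
        if (dot <= sep) {
            // No dot at all, or the last dot belongs to a parent directory name.
            return NONE;
        }
        // Compare case-insensitively, so "ZIP" and "zip" are treated the same.
        switch (fileName.substring(dot + 1).toLowerCase(Locale.ROOT)) {
            case "zip":
                return ZIP;
            case "7z":
                return SEVEN_ZIP;
            case "rar":
                return RAR;
            default:
                return NONE;
        }
    }
}
```

A caller could switch on the returned value to pick the matching decompression routine and treat `NONE` as a regular file or directory.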

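For zip files I used the zip4j jar; for rar and 7z files I used the sevenzipjbinding jar. The Maven configuration that imports both jars is shown in the following snippet: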

<dependency>
    <groupId>net.lingala.zip4j</groupId>
    <artifactId>zip4j</artifactId>
    <version>1.3.2</version>
</dependency>
<dependency>
    <groupId>net.sf.sevenzipjbinding</groupId>
    <artifactId>sevenzipjbinding-all-platforms</artifactId>
    <version>9.20-2.00beta</version>
</dependency>

The function that decompresses rar and 7z files is as follows:

public void rar7zdecompress(String path, String destination) {
    // SevenExtract is a helper class that is not listed in this post.
    String filter = null; // no entry filter is passed
    try {
        new SevenExtract(path, destination, false, filter).extract();
    } catch (SevenExtract.ExtractionException e) {
        e.printStackTrace();
    }
}

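The function that decompresses zip files is as follows. File names may be in Traditional Chinese or in some other encoding, so they get extra handling; without it, decompression fails. The archives were also given an initial password at the time, so they have to be decrypted first: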

import java.util.List;
import net.lingala.zip4j.core.ZipFile;
import net.lingala.zip4j.exception.ZipException;
import net.lingala.zip4j.model.FileHeader;
import net.lingala.zip4j.model.UnzipParameters;

public void zipdecompress(String path, String destination) {
    String password = "password";
    try {
        ZipFile zipFile = new ZipFile(path);
        UnzipParameters param = new UnzipParameters();
        // Read entry names as ISO8859-1 so that the raw bytes can be recovered below.
        zipFile.setFileNameCharset("ISO8859-1");
        // zipFile.setFileNameCharset("UTF-8");
        // zipFile.setFileNameCharset("GBK");
        if (!zipFile.isValidZipFile()) {
            throw new ZipException("ZipFile Format Invalid!");
        }
        if (zipFile.isEncrypted()) {
            zipFile.setPassword(password);
        }
        // zipFile.extractAll(destination);
        List<?> headers = zipFile.getFileHeaders();
        for (Object header : headers) {
            FileHeader fh = (FileHeader) header;
            // Recover the raw bytes of the entry name.
            byte[] b = fh.getFileName().getBytes("ISO8859-1");
            String fname = null;
            try {
                fname = new String(b, "UTF-8");
                // If the bytes do not round-trip through UTF-8, fall back to GBK,
                // the most likely charset.
                if (fname.getBytes("UTF-8").length != b.length) {
                    fname = new String(b, "GBK");
                }
            } catch (Throwable e) {
                // Other charsets could be tried here.
                e.printStackTrace();
            }
            // Extract the entry under the re-decoded name.
            zipFile.extractFile(fh, destination, param, fname);
        }
    } catch (ZipException e) {
        // TODO: record the decompression error
        e.printStackTrace();
    } catch (Exception e) {
        e.printStackTrace();
    }
}
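The file-name handling above reads each name as ISO8859-1, recovers the raw bytes, and then guesses between UTF-8 and GBK. As a possible alternative to the length comparison, the sketch below asks a strict UTF-8 decoder whether the bytes are valid and falls back to GBK only when they are not. `ZipNameDecoder`, `decode`, and `fromLatin1` are hypothetical names introduced for illustration, and using GBK as the only fallback is an assumption carried over from the code above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: guesses the charset of a zip entry name from its raw bytes. */
final class ZipNameDecoder {
    private ZipNameDecoder() {}

    /** Decodes the bytes as UTF-8 when they are valid UTF-8, otherwise as GBK. */
    static String decode(byte[] raw) {
        try {
            // REPORT makes the decoder throw instead of inserting replacement characters.
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(raw))
                    .toString();
        } catch (CharacterCodingException e) {
            // Assumption: anything that is not valid UTF-8 is treated as GBK.
            return new String(raw, Charset.forName("GBK"));
        }
    }

    /** Accepts a name that was read as ISO-8859-1 and re-decodes its underlying bytes. */
    static String fromLatin1(String nameReadAsLatin1) {
        return decode(nameReadAsLatin1.getBytes(StandardCharsets.ISO_8859_1));
    }
}
```

ISO-8859-1 maps every byte to exactly one character, which is why reading a name in that charset and converting it back loses no information. Short GBK names can occasionally also be valid UTF-8, so this remains a heuristic.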

Once decompression finishes, the archive file is deleted. A simple delete function:

public void deleteFile(Path path) {
    try {
        Files.delete(path);
    } catch (Exception e) {
        e.printStackTrace();
    }
}

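The recursive decompression function is shown below. It counts how many emails were extracted in total, which is later used to display progress in the front end. It also decompresses files with a custom suffix, which are really just zip files: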

import java.io.File;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import org.apache.commons.io.FilenameUtils;

public long traverseFolder(String path, String destination, String originalFilename) {
    long count = 0;
    String topExt = FilenameUtils.getExtension(originalFilename);
    if (topExt.equalsIgnoreCase("zip")) {
        // The uploaded file itself is a zip archive.
        zipdecompress(path, destination);
        count += traverseFolder(destination, destination, "");
    } else if (topExt.equalsIgnoreCase("7z") || topExt.equalsIgnoreCase("rar")) {
        // The uploaded file itself is a 7z or rar archive.
        rar7zdecompress(path, destination);
        count += traverseFolder(destination, destination, "");
    } else {
        // path is a directory: walk it and expand every archive found inside.
        try (DirectoryStream<Path> directoryStream = Files.newDirectoryStream(Paths.get(path))) {
            for (Path filePath : directoryStream) {
                String file = filePath.toString();
                if (Files.isDirectory(filePath)) {
                    count += traverseFolder(file, file, "");
                    continue;
                }
                String ext = FilenameUtils.getExtension(file);
                // Each nested archive is extracted into a folder named after it.
                String target = destination + File.separator + FilenameUtils.getBaseName(file);
                if (ext.equalsIgnoreCase("zip")) {
                    zipdecompress(file, target);
                    deleteFile(filePath);
                    count += traverseFolder(target, target, "");
                } else if (ext.equalsIgnoreCase("7z") || ext.equalsIgnoreCase("rar")) {
                    rar7zdecompress(file, target);
                    deleteFile(filePath);
                    count += traverseFolder(target, target, "");
                } else if (ext.equalsIgnoreCase("pzt")) {
                    // Custom suffix; the file is really a zip.
                    zipdecompress(file, target);
                    deleteFile(filePath);
                    // Only the JSON file inside is processed, so the count just goes up by 1.
                    count++;
                } else if (Files.isRegularFile(filePath)) {
                    count++;
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
    return count;
}
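To make the recursion easier to try out in isolation, here is a minimal self-contained sketch of the same idea. It is not the code above: it uses the JDK's `java.util.zip` instead of zip4j and sevenzipjbinding, it handles only `.zip` files, and `NestedZipDemo`, `unzip`, `expandAndCount`, `writeZip`, and `demo` are hypothetical names. It also assumes that no directory already has the same name as an archive's base name.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class NestedZipDemo {
    private NestedZipDemo() {}

    /** Extracts a zip into targetDir and rejects entries that would land outside it. */
    static void unzip(Path zip, Path targetDir) throws IOException {
        Path root = targetDir.toAbsolutePath().normalize();
        Files.createDirectories(root);
        try (ZipInputStream in = new ZipInputStream(Files.newInputStream(zip))) {
            for (ZipEntry e; (e = in.getNextEntry()) != null; ) {
                Path out = root.resolve(e.getName()).normalize();
                if (!out.startsWith(root)) {
                    throw new IOException("Entry escapes target directory: " + e.getName());
                }
                if (e.isDirectory()) {
                    Files.createDirectories(out);
                } else {
                    Files.createDirectories(out.getParent());
                    Files.copy(in, out); // copies the current entry only
                }
            }
        }
    }

    /** Expands every .zip under dir into a sibling folder, deletes it, and counts the remaining files. */
    static long expandAndCount(Path dir) throws IOException {
        // Snapshot the listing first, because the loop below adds and removes children.
        List<Path> children = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                children.add(p);
            }
        }
        long count = 0;
        for (Path p : children) {
            String name = p.getFileName().toString();
            if (Files.isDirectory(p)) {
                count += expandAndCount(p);
            } else if (name.toLowerCase(Locale.ROOT).endsWith(".zip")) {
                Path target = dir.resolve(name.substring(0, name.length() - 4));
                unzip(p, target);
                Files.delete(p);
                count += expandAndCount(target); // nested archives are handled here
            } else {
                count++;
            }
        }
        return count;
    }

    /** Writes a zip file containing the given entries (name to content). */
    static void writeZip(Path zip, Map<String, byte[]> entries) throws IOException {
        try (ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(zip))) {
            for (Map.Entry<String, byte[]> e : entries.entrySet()) {
                out.putNextEntry(new ZipEntry(e.getKey()));
                out.write(e.getValue());
                out.closeEntry();
            }
        }
    }

    /** Builds outer.zip (a.eml plus inner.zip holding b.eml and c.eml) in a temp folder and counts the files. */
    static long demo() {
        try {
            Path work = Files.createTempDirectory("nested-zip-demo");
            Path inner = work.resolve("inner.zip");
            Map<String, byte[]> innerEntries = new LinkedHashMap<>();
            innerEntries.put("b.eml", "b".getBytes(StandardCharsets.UTF_8));
            innerEntries.put("c.eml", "c".getBytes(StandardCharsets.UTF_8));
            writeZip(inner, innerEntries);
            Map<String, byte[]> outerEntries = new LinkedHashMap<>();
            outerEntries.put("a.eml", "a".getBytes(StandardCharsets.UTF_8));
            outerEntries.put("inner.zip", Files.readAllBytes(inner));
            writeZip(work.resolve("outer.zip"), outerEntries);
            Files.delete(inner);
            return expandAndCount(work);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

For the sample built in `demo()`, the count is 3: one email from the outer archive and two from the nested one. The directory listing is copied into a list before the loop so that folders created during extraction are not picked up by the same iteration.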